# Programming Assignment 1 Percolation


Write a program to estimate the value of the *percolation threshold* via Monte Carlo simulation.

**Install a Java programming environment.** Install a Java programming environment on your computer by following the step-by-step instructions for your operating system (Mac OS X, Windows, or Linux). After following these instructions, the commands they set up will add algs4.jar to the classpath; algs4.jar contains Java classes for I/O and all of the algorithms in the textbook.

*For Fall 2015, you must use the named package version of algs4.jar. To access a class, you need an import statement, such as the ones below:*

    import edu.princeton.cs.algs4.StdRandom;
    import edu.princeton.cs.algs4.StdStats;
    import edu.princeton.cs.algs4.WeightedQuickUnionUF;

**Percolation.** Given a composite system comprised of randomly distributed insulating and metallic materials: what fraction of the materials need to be metallic so that the composite system is an electrical conductor? Given a porous landscape with water on the surface (or oil below), under what conditions will the water be able to drain through to the bottom (or the oil to gush through to the surface)? Scientists have defined an abstract process known as *percolation* to model such situations.

**The model.** We model a percolation system using an *N*-by-*N* grid of *sites*. Each site is either *open* or *blocked*. A *full* site is an open site that can be connected to an open site in the top row via a chain of neighboring (left, right, up, down) open sites. We say the system *percolates* if there is a full site in the bottom row. In other words, a system percolates if we fill all open sites connected to the top row and that process fills some open site on the bottom row. (For the insulating/metallic materials example, the open sites correspond to metallic materials, so that a system that percolates has a metallic path from top to bottom, with full sites conducting. For the porous substance example, the open sites correspond to empty space through which water might flow, so that a system that percolates lets water fill open sites, flowing from top to bottom.)

**The problem.** In a famous scientific problem, researchers are interested in the following question: if sites are independently set to be open with probability *p* (and therefore blocked with probability 1 − *p*), what is the probability that the system percolates? When *p* equals 0, the system does not percolate; when *p* equals 1, the system percolates. The plots below show the site vacancy probability *p* versus the percolation probability for a 20-by-20 random grid (left) and a 100-by-100 random grid (right).

When *N* is sufficiently large, there is a *threshold* value *p** such that when *p* < *p** a random *N*-by-*N* grid almost never percolates, and when *p* > *p**, a random *N*-by-*N* grid almost always percolates. No mathematical solution for determining the percolation threshold *p** has yet been derived. Your task is to write a computer program to estimate *p**.

**Percolation data type.** To model a percolation system, create a data type with the following API:

    public class Percolation {
        public Percolation(int N)                // create N-by-N grid, with all sites initially blocked
        public void open(int row, int col)       // open the site (row, col) if it is not open already
        public boolean isOpen(int row, int col)  // is the site (row, col) open?
        public boolean isFull(int row, int col)  // is the site (row, col) full?
        public int numberOfOpenSites()           // number of open sites
        public boolean percolates()              // does the system percolate?
        public static void main(String[] args)   // unit testing (required)
    }

*Corner cases.* By convention, the row and column indices are integers between 0 and *N* − 1, where (0, 0) is the upper-left site. Throw an exception if any argument to open(), isOpen(), or isFull() is outside its prescribed range. The constructor should throw an exception if *N* ≤ 0.

*Performance requirements.* The constructor should take time proportional to *N*²; all methods should take constant time plus a constant number of calls to the union-find methods.

**Monte Carlo simulation.** To estimate the percolation threshold, consider the following computational experiment:

- Initialize all sites to be blocked.
- Repeat the following until the system percolates:
  - Choose a site uniformly at random among all blocked sites.
  - Open the site.
- The fraction of sites that are opened when the system percolates provides an estimate of the percolation threshold.

For example, if sites are opened in a 20-by-20 grid according to the snapshots below, then our estimate of the percolation threshold is 204/400 = 0.51 because the system percolates when the 204th site is opened.
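The experiment above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the assignment's required structure: the class name `MonteCarloSketch` and its helpers are hypothetical, `java.util.Random` stands in for the course's random-number library, and a small hand-rolled union-find with virtual top and bottom nodes stands in for WeightedQuickUnionUF.

```java
import java.util.Random;

// Hypothetical sketch: one Monte Carlo experiment on an n-by-n grid.
class MonteCarloSketch {
    private final int n;
    private final boolean[] open;   // open[i] is true once site i has been opened
    private final int[] parent;     // union-find forest; n*n is a virtual top, n*n+1 a virtual bottom

    MonteCarloSketch(int n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        this.n = n;
        open = new boolean[n * n];
        parent = new int[n * n + 2];
        for (int i = 0; i < parent.length; i++) parent[i] = i;
    }

    private int find(int p) {
        while (p != parent[p]) {
            parent[p] = parent[parent[p]];  // path halving keeps trees shallow
            p = parent[p];
        }
        return p;
    }

    private void union(int p, int q) { parent[find(p)] = find(q); }

    private void openSite(int row, int col) {
        int s = row * n + col;
        open[s] = true;
        if (row == 0) union(s, n * n);           // top row touches the virtual top
        if (row == n - 1) union(s, n * n + 1);   // bottom row touches the virtual bottom
        if (row > 0 && open[s - n]) union(s, s - n);
        if (row < n - 1 && open[s + n]) union(s, s + n);
        if (col > 0 && open[s - 1]) union(s, s - 1);
        if (col < n - 1 && open[s + 1]) union(s, s + 1);
    }

    private boolean percolates() { return find(n * n) == find(n * n + 1); }

    // Opens blocked sites uniformly at random until the grid percolates;
    // returns the fraction of sites that are open at that moment.
    static double runExperiment(int n, Random rng) {
        MonteCarloSketch grid = new MonteCarloSketch(n);
        // Shuffle all site indices once; consuming them in order is equivalent
        // to choosing uniformly among the sites that are still blocked.
        int[] order = new int[n * n];
        for (int i = 0; i < order.length; i++) order[i] = i;
        for (int i = order.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        int opened = 0;
        while (!grid.percolates()) {
            int s = order[opened++];
            grid.openSite(s / n, s % n);
        }
        return (double) opened / (n * n);
    }

    public static void main(String[] args) {
        System.out.println(runExperiment(20, new Random()));
    }
}
```

Shuffling once up front avoids repeatedly drawing random sites and rejecting the ones that are already open, which becomes wasteful as the grid fills.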

By repeating this computational experiment *T* times and averaging the results, we obtain a more accurate estimate of the percolation threshold. Let *x_t* be the fraction of open sites in computational experiment *t*. The sample mean μ provides an estimate of the percolation threshold; the sample standard deviation σ measures the sharpness of the threshold.

$$\mu = \frac{x_1 + x_2 + \cdots + x_T}{T}, \qquad \sigma^2 = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_T - \mu)^2}{T - 1}$$

Assuming *T* is sufficiently large (say, at least 30), the following provides a 95% confidence interval for the percolation threshold:

$$\left[\, \mu - \frac{1.96\,\sigma}{\sqrt{T}},\;\; \mu + \frac{1.96\,\sigma}{\sqrt{T}} \,\right]$$

To perform a series of computational experiments, create a data type with the following API.

    public class PercolationStats {
        public PercolationStats(int N, int T)  // perform T independent experiments on an N-by-N grid
        public double mean()                   // sample mean of percolation threshold
        public double stddev()                 // sample standard deviation of percolation threshold
        public double confidenceLow()          // low endpoint of 95% confidence interval
        public double confidenceHigh()         // high endpoint of 95% confidence interval
    }

The constructor should throw an exception if either *N* ≤ 0 or *T* ≤ 0.

The constructor should take two arguments *N* and *T*, and perform *T* independent computational experiments (discussed above) on an *N*-by-*N* grid. Using this experimental data, it should calculate the mean, standard deviation, and the *95% confidence interval* for the percolation threshold. Use StdRandom to generate random numbers; use StdStats to compute the sample mean and standard deviation.
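The arithmetic behind these four values is short enough to sketch directly. The version below is an illustration, not the required implementation (which should delegate to StdStats): the class name `ThresholdStatsSketch` is hypothetical, the variance divides by *T* − 1, and the interval half-width is assumed to be 1.96σ/√*T*, the usual 95% normal approximation.

```java
// Hypothetical helper: summary statistics over T threshold estimates.
class ThresholdStatsSketch {
    // Sample mean of the estimates.
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    // Sample standard deviation (divides by T - 1); NaN with fewer than two values.
    static double stddev(double[] x) {
        if (x.length < 2) return Double.NaN;
        double mu = mean(x);
        double sumSq = 0.0;
        for (double v : x) sumSq += (v - mu) * (v - mu);
        return Math.sqrt(sumSq / (x.length - 1));
    }

    // Low endpoint of the 95% confidence interval.
    static double confidenceLow(double[] x) {
        return mean(x) - 1.96 * stddev(x) / Math.sqrt(x.length);
    }

    // High endpoint of the 95% confidence interval.
    static double confidenceHigh(double[] x) {
        return mean(x) + 1.96 * stddev(x) / Math.sqrt(x.length);
    }

    public static void main(String[] args) {
        double[] x = {0.59, 0.61, 0.58, 0.60};  // made-up estimates, for illustration only
        System.out.println("mean()           = " + mean(x));
        System.out.println("stddev()         = " + stddev(x));
        System.out.println("confidenceLow()  = " + confidenceLow(x));
        System.out.println("confidenceHigh() = " + confidenceHigh(x));
    }
}
```

In a real PercolationStats it is worth computing the mean and standard deviation once in the constructor and caching them, since the *T* experiments are the expensive part.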

    Example values after creating PercolationStats(200, 100)
    mean()           = 0.5929934999999997
    stddev()         = 0.00876990421552567
    confidenceLow()  = 0.5912745987737567
    confidenceHigh() = 0.5947124012262428

    Example values after creating PercolationStats(200, 100)
    mean()           = 0.592877
    stddev()         = 0.009990523717073799
    confidenceLow()  = 0.5909188573514536
    confidenceHigh() = 0.5948351426485464

    Example values after creating PercolationStats(2, 100000)
    mean()           = 0.6669475
    stddev()         = 0.11775205263262094
    confidenceLow()  = 0.666217665216461
    confidenceHigh() = 0.6676773347835391

**Analysis of running time.**

- Implement the Percolation data type using the *quick-find* algorithm. Measure the total running time of PercolationStats for various values of *N* and *T*. How does doubling *N* affect the total running time? How does doubling *T* affect the total running time? Give a formula (using tilde notation) of the total running time on your computer (in seconds) as a single function of both *N* and *T*.
- Now, implement the Percolation data type using the *weighted quick-union* algorithm in WeightedQuickUnionUF. Answer the same questions as in the previous bullet.
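A doubling experiment is one common way to answer the "how does doubling affect the running time" questions. The sketch below is an assumption-laden illustration: `DoublingSketch`, `time`, and `exponent` are hypothetical names, `System.nanoTime` stands in for whatever timer the course provides, and the workload in `main` is a placeholder quadratic loop that you would replace with the construction of a PercolationStats object.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch of a doubling experiment.
class DoublingSketch {
    // Wall-clock seconds taken by one call of work.accept(n).
    static double time(IntConsumer work, int n) {
        long start = System.nanoTime();
        work.accept(n);
        return (System.nanoTime() - start) / 1e9;
    }

    // If doubling the input multiplies the time by `ratio`, the running time
    // grows roughly like n^b with b = log2(ratio).
    static double exponent(double ratio) {
        return Math.log(ratio) / Math.log(2.0);
    }

    public static void main(String[] args) {
        // Placeholder workload (quadratic in size); substitute the code under test.
        IntConsumer work = size -> {
            long sum = 0;
            for (int i = 0; i < size; i++)
                for (int j = 0; j < size; j++) sum += i ^ j;
            if (sum == -1) System.out.println(sum);  // use the result so the loop is not discarded
        };
        double previous = time(work, 250);
        for (int size = 500; size <= 8000; size *= 2) {
            double current = time(work, size);
            System.out.printf("%6d %9.4f s  ratio %6.2f  exponent %5.2f%n",
                    size, current, current / previous, exponent(current / previous));
            previous = current;
        }
    }
}
```

Small inputs give noisy ratios because of JIT warm-up and timer resolution, so the estimates from the largest sizes are the ones to trust.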

**Deliverables.** Submit only Percolation.java (using the weighted quick-union algorithm from WeightedQuickUnionUF) and PercolationStats.java. We will supply algs4.jar. Your submission may not call library functions other than the permitted ones, which include StdRandom, StdStats, and WeightedQuickUnionUF. Also, submit a readme.txt file and answer all questions. You will need to read the COS 226 Collaboration Policy in order to answer the related questions in your readme file.

Copyright © 2008.

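**There are three versions of the solution. The first uses virtual top and bottom sites but suffers from the backwash problem. There are two ways to fix that:**

**1. Use two WQUUF objects, at the cost of extra memory: one for checking whether the system percolates (it includes the virtual top and bottom sites), and the other for checking whether a given site is full (it includes only the virtual top site). Note that whether a site is open must be stored as a boolean; otherwise memory usage exceeds the limit. Remember: choosing the right data structure matters!**

**2. Still use one WQUUF, but without virtual top and bottom sites. Instead, track whether each component is connected to the top and whether it is connected to the bottom; if a site is connected to both, the system percolates.**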
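The first fix (two union-find structures) might look like the sketch below. It is an illustration under assumptions, not the graded solution: `SimpleUF` is a hand-rolled weighted quick-union standing in for WeightedQuickUnionUF so that the example compiles on its own, the class and field names are hypothetical, and the exception types are an assumption since the assignment text above does not name them.

```java
// Minimal weighted quick-union, standing in for WeightedQuickUnionUF.
class SimpleUF {
    private final int[] parent;
    private final int[] size;

    SimpleUF(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
    }

    int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    void union(int p, int q) {
        int a = find(p), b = find(q);
        if (a == b) return;
        if (size[a] < size[b]) { parent[a] = b; size[b] += size[a]; }
        else { parent[b] = a; size[a] += size[b]; }
    }
}

// Hypothetical sketch of the two-structure fix for backwash.
class TwoUFPercolation {
    private final int n, top, bottom;
    private final boolean[] open;  // one boolean per site keeps memory low
    private final SimpleUF full;   // sites + virtual top only: answers isFull()
    private final SimpleUF perc;   // sites + virtual top and bottom: answers percolates()

    TwoUFPercolation(int n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        this.n = n;
        top = n * n;
        bottom = n * n + 1;
        open = new boolean[n * n];
        full = new SimpleUF(n * n + 1);
        perc = new SimpleUF(n * n + 2);
    }

    private void check(int row, int col) {
        if (row < 0 || row >= n || col < 0 || col >= n)
            throw new IndexOutOfBoundsException("site out of range");
    }

    // Connects a and b in both structures.
    private void link(int a, int b) { full.union(a, b); perc.union(a, b); }

    void open(int row, int col) {
        check(row, col);
        int s = row * n + col;
        if (open[s]) return;
        open[s] = true;
        if (row == 0) link(s, top);
        if (row == n - 1) perc.union(s, bottom);  // the virtual bottom exists only in perc
        if (row > 0 && open[s - n]) link(s, s - n);
        if (row < n - 1 && open[s + n]) link(s, s + n);
        if (col > 0 && open[s - 1]) link(s, s - 1);
        if (col < n - 1 && open[s + 1]) link(s, s + 1);
    }

    boolean isOpen(int row, int col) { check(row, col); return open[row * n + col]; }

    boolean isFull(int row, int col) {
        check(row, col);
        return full.find(row * n + col) == full.find(top);
    }

    boolean percolates() { return perc.find(top) == perc.find(bottom); }
}
```

Because `full` never contains the virtual bottom, a bottom-row site that is open but reachable only "through" the bottom cannot appear full, which is exactly the backwash case.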

I found a solution that works really well, that helped me to get bonus points. The general idea is to use only one WQUUF (N*N) and an array (N*N) that contains info about site status.

We create ONE WQUUF object of size N * N and allocate a separate array of size N * N to keep the status of each site: blocked, open, connected to top, connected to bottom. I use bit operations for the status, so each site can have a combined status such as open and connected to top.

The most important operation is open(int i, int j): we need to union the newly opened site (call it S) with its four adjacent neighbor sites where possible. For each open neighbor, we first call find(neighbor) to get the root of its connected component and retrieve the status of that root, and then call union(S, neighbor). We repeat this at most 4 times, then make a 5th call, find(S), to get the root of the connected component that results from opening S. Finally, we update the status of the new root by combining the old status information into it in constant time. I leave the details of how to combine the information to the reader; they are not hard to work out.

For isFull(int i, int j), we need to find the root site of the connected component that contains site (i, j) and check the status of that root.

For the isOpen(int i, int j) we directly return the status.

For percolates(), there is a way to make it constant time even though we do not have virtual top or bottom sites; think about why.

So the most important operation open(int i, int j) will involve 4 union() and 5 find() API calls.
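One possible way to fill in the details left to the reader is sketched below. It is my own reading of the description, not the quoted author's code: the status bits are combined with bitwise OR, percolation is recorded in a boolean field the moment a root carries both the top and bottom bits, and a small inline weighted quick-union stands in for the WQUUF object so the example is self-contained. All names (`FlagPercolation`, `merge`, the bit constants) are hypothetical.

```java
// Hypothetical sketch: one union-find plus a status byte per site, no virtual sites.
class FlagPercolation {
    private static final byte OPEN = 1, TOP = 2, BOTTOM = 4;

    private final int n;
    private final byte[] status;   // OPEN bit per site; TOP/BOTTOM bits are valid on roots
    private final int[] parent, size;
    private boolean percolates;

    FlagPercolation(int n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        this.n = n;
        status = new byte[n * n];
        parent = new int[n * n];
        size = new int[n * n];
        for (int i = 0; i < n * n; i++) { parent[i] = i; size[i] = 1; }
    }

    private int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    private void union(int p, int q) {
        int a = find(p), b = find(q);
        if (a == b) return;
        if (size[a] < size[b]) { parent[a] = b; size[b] += size[a]; }
        else { parent[b] = a; size[a] += size[b]; }
    }

    private void check(int row, int col) {
        if (row < 0 || row >= n || col < 0 || col >= n)
            throw new IndexOutOfBoundsException("site out of range");
    }

    // Reads the neighbor's root status BEFORE the union changes the root, then unions.
    private int merge(int s, int neighbor) {
        if ((status[neighbor] & OPEN) == 0) return 0;
        int flags = status[find(neighbor)];
        union(s, neighbor);
        return flags;
    }

    void open(int row, int col) {
        check(row, col);
        int s = row * n + col;
        if ((status[s] & OPEN) != 0) return;
        status[s] = OPEN;
        int flags = OPEN | (row == 0 ? TOP : 0) | (row == n - 1 ? BOTTOM : 0);
        if (row > 0) flags |= merge(s, s - n);       // up
        if (row < n - 1) flags |= merge(s, s + n);   // down
        if (col > 0) flags |= merge(s, s - 1);       // left
        if (col < n - 1) flags |= merge(s, s + 1);   // right
        status[find(s)] |= flags;                    // the 5th find: store combined status on the new root
        if ((flags & TOP) != 0 && (flags & BOTTOM) != 0) percolates = true;
    }

    boolean isOpen(int row, int col) {
        check(row, col);
        return (status[row * n + col] & OPEN) != 0;
    }

    boolean isFull(int row, int col) {
        check(row, col);
        return (status[find(row * n + col)] & TOP) != 0;
    }

    boolean percolates() { return percolates; }
}
```

Old roots keep stale TOP/BOTTOM bits after a union, which is harmless because only the current root's bits are ever consulted; a blocked site is its own root with status 0, so isFull() returns false for it without a separate check.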

I am working on this, but I think I understand it. Don't use virtual sites. find(p) returns the same value for every p in the same component. On every open, call find(p) on the neighbors and note the status of each neighbor's root. (I use a byte array where 0 is closed, 1 is open, 2 is connected to top, 3 is connected to bottom, and 4 is both.)

If any neighbor's root is connected to both, or if at least one is connected to top AND at least one is connected to bottom, then set a local flag both to true.

If connected to top is true, set a local flag top to true. If connected to bottom is true, set a local flag bottom to true.

Now, after the unions with the neighbors, find the root of (i, j) and set its grid status to both, top, or bottom.

If you do set it to both then you can also set a class variable percolatesFlag to true for use in the method percolates.

I haven't finished my implementation but it does seem like this will work.

Java code

1. With backwash

2. Using two WQUUF objects

3. Best approach: add flags and use only one WQUUF

Reference:

1. http://tech-wonderland.net/blog/avoid-backwash-in-percolation.html
