Background
We introduce several useful features of the Python language and associated libraries by writing a program to estimate the numerical value of pi by drawing random numbers.Learning Goals
Science: This introduces a powerful method for estimating via Monte Carlo.Computation: This warmup exercise is used as part of an interactive demo in class to illustrate Development Environments, Interpreters and Workflows. In addition, this exercise introduces some issues related to the performance of interpreted vs. compiled code, tools for analyzing such performance, and simple plotting.
Procedure
- Consult the general instructions for working with course modules in the pages on Development Environments, Interpreters and Workflows and Computer Exercises.
- Download the Estimate Pi Hints file. Make a new directory for this exercise (e.g., you might call it EstimatePi), put the hints file there, and cd into that directory.
- Make a copy of the hints file to a new file that you will edit: EstimatingPiHints.py -> EstimatingPi.py
- Open EstimatingPi.py in an editor of your choice. If you don't have an editor of your choice, you can open a text editor within the IPython notebook (more info below).
- In a terminal, start up an IPython notebook via ipython notebook.
- This exercise is focused on estimating the numerical value of pi via Monte Carlo. Random numbers are deposited uniformly within the square bounded by (-1,-1), (-1,1), (1,1), (1,-1). Each such random (x,y) point is tested to see whether it lies in the unit circle, i.e., the largest circle that lies within that square. Since the ratio of the area of the circle to the area of the square is pi/4, and since the random (x,y) points are uniformly distributed, the fraction of (x,y) points lying within the circle should provide an increasingly better estimate of pi/4 as more random numbers are drawn.
- At a minimum, you should implement throwDarts, inUnitCircle, and EstimatePi1, which involves drawing one random number at a time. Run EstimatePi1(n) for successively larger numbers of random numbers n to observe convergence toward the correct numerical value of pi. numpy.random.random() returns a single uniformly distributed random number between 0 and 1.
- If you're feeling more ambitious, next implement EstimatePi2, which involves drawing n random numbers all at once, and testing whether they are inside the circle all at once. This involves acting on arrays of numbers. numpy.random.random(size) returns an array of uniformly distributed numbers (between 0 and 1) of the specified size. Typing numpy.random.random? within ipython will print out the documentation for this function. In doing so, you should see that size (which is an optional argument, since it has a default value equal to None) can be either an int or a tuple of ints. If it is a single int (integer), that integer is interpreted as the length of an array, and a one-dimensional array of random numbers of that length is returned. If size is a tuple of ints (e.g., (100,2)), then an N-dimensional array of random number is returned, where N is the number of elements in the tuple, and the size of the array is given by the integer in the corresponding slot in the tuple; e.g., size=(100,2) results in a two-dimensional array of size 100x2. We can interpret such an object as an array of 100 two-dimensional (x,y) points.
- Not only can be create arrays en masse, but we can also operate on all their elements at once. Use the numpy.linalg.norm function to calculate the length of each (x,y) vector; you will need to figure out which axis to take the norm along, since you want the norm of each (x,y) vector, not the norm of the entire 2-d array. Once you've figured out how to compute the norm of each (x,y) vector, it is easy to test en masse whether each one lies within the unit circle.
- Test EstimatePi2(n) for various values of n to assure yourself that it is converging to the correct estimate for pi. Don't make n too large, however, since you will use up all available memory trying to build an array of random numbers.
- If you're interested in quantifying the performance increase achieved by drawing all n random numbers at once, instead of just one at a time, use the %timeit magic function in ipython. Run %timeit EstimatePi1(10000), and then %timeit EstimatePi2(10000). Then try both again for n=100000.
- If you're interested in pushing further with this exercise, you can implement EstimatePi3, which tries to strike a balance between the numerical efficiencies of drawing and testing more than one random number at a time within a loop versus the need to not use up all available memory just to draw random numbers. When you draw a single random number in a loop, you never create more than one random number at a time, and the interpreter is able to reuse the memory that was allocated to the last random number when you reassign a variable the next time through the loop. This reuse of memory is known as "garbage collection". But the random draws use to estimate pi are all independent of each other, so you don't need to act on an array of random numbers at once: that is just a convenience for numerical efficiency and ease of programming. We can strike a balance by drawing random numbers in blocks (that are sufficiently big to give us a performance speedup, but not so large as to swamp the machine's memory), and then summing up the statistics generated within each block to get a better estimate of pi.
- A nice graphical way to summarize the essence of this Monte Carlo method is to make a scatter plot of all the random (x,y) points you've drawn, but coloring those points that lie within the circle differently than those that lie outside the circle. You could also use the pylab.title function to title the graph with the current estimate of pi from the sampling.