Statistics 220E

Homework 4

Due Wednesday, February 4, 1998


  1. (i) If X is normally distributed with mean 8 and standard deviation 5, find:
    1. Pr(X>5)
    2. Pr(X<10)
    3. Pr(6<X<9)
    4. Pr(2<X<4).

    (ii) Do the same using Stata. The Stata function
    normprob
    gives the same numbers as Table IV in the textbook. Enter the values in a variable (called "x", for example) and then type
    generate y=normprob(x)
    The variable "y" will contain the left-tail probabilities, which you can display with the command list.
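
    A minimal sketch of part (ii), under one assumption worth stating: normprob returns left-tail probabilities of the standard normal, so the raw cut-offs are standardized with the mean 8 and standard deviation 5 before the function is applied. If you enter already standardized values instead, generate y=normprob(x) applies directly. The variable names x, z and y are arbitrary, only two cut-offs are entered, and in later Stata releases the same function is called normal().

    ```stata
    clear
    * enter the cut-off values by hand
    input x
    5
    10
    end
    * standardize: z = (x - mean)/sd
    generate z = (x - 8)/5
    * left-tail probability Pr(Z < z) for a standard normal Z
    generate y = normprob(z)
    list x z y
    ```

    Right-tail and interval probabilities are not printed directly; they have to be formed from the left-tail values in y (one minus a value, or the difference of two values).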
  2. WW 4-40
  3. WW 4-46
  4. WW 5-8
  5. Computer Assignment. This will be your first "Monte Carlo" simulation. Computer simulation is a very powerful tool for researchers in a number of fields, including statistics, physics and economics.


    A. A "fair" coin will land heads or tails with equal probability. With the help of Stata, we will simulate this process by a random variable that takes only the values 1 and 0, each with probability .5. Save your work in the file ab2.log, where ab are your initials (if your name were Dan Nicolae, for example, you'd name the file dn2.log). For the sake of privacy, if you are working on a university PC, copy the file to a disk when you are done and delete it from the computer.
    log using ab2.log, replace
    (the replace option overwrites any previous version of the log file)
    set obs 10
    generate x10 = round(uniform(),1)
    creates a variable containing the results of 10 tosses of a fair coin. Create in a similar manner (and in this order) the variables x50, x100, x1000, x10000 containing the results of 50, 100, 1000 and 10000 coin tosses. Some versions of Stata may not take 10000 observations, in which case replace 10000 with a reasonably large number (such as 4000). Summarize your data in a table whose columns are the sample size and the relative frequency of heads for each of the five variables (either by hand, or using a Stata command). In either case, add by hand a column containing the number of heads in each variable.
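
    The order of the variables matters because set obs can only increase the number of observations: once the data set has 50 rows, a variable created earlier keeps its 10 values and is missing in the remaining rows. A sketch of the first two steps, assuming a fresh session, follows. The seed value is an arbitrary addition so that the run can be repeated, and in recent Stata releases uniform() is written runiform().

    ```stata
    clear
    * arbitrary seed (an addition, not part of the assignment) for reproducibility
    set seed 12345
    set obs 10
    generate x10 = round(uniform(),1)
    * enlarge the data set; x10 is missing in rows 11-50
    set obs 50
    generate x50 = round(uniform(),1)
    * Obs, Mean (relative frequency of heads) and Std. Dev. for each variable
    summarize x10 x50
    * number of heads in x10
    count if x10 == 1
    ```

    Because summarize skips missing values, the Obs column reports 10 for x10 and 50 for x50 even though both variables live in the same data set.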


    B. If you used the Stata command, the table also contains the standard deviation for each set of observations. Show explicitly how these numbers were obtained in each case (or, if you created the table by hand, calculate them).
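
    For reference, summarize reports the sample standard deviation with divisor n - 1. For a 0/1 variable with k heads in n tosses and relative frequency \hat{p} = k/n, the general formula reduces as sketched below. The symbols k, \hat{p} and s are notation introduced here and are not necessarily the textbook's.

    ```latex
    s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\hat{p}\right)^2}
      = \sqrt{\frac{k\,(1-\hat{p})^2 + (n-k)\,\hat{p}^2}{n-1}}
      = \sqrt{\frac{n\,\hat{p}\,(1-\hat{p})}{n-1}}
    ```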


    C. Plot, either by hand or using Stata, the relative frequency of heads versus log10(n). Comment on the behavior of the plot (you may need to look up the Stata commands input and generate, and mathematical functions such as log10()). Does it appear that the equation
    Prob(heads) = lim (number of heads observed / n) as n goes to infinity
    holds in this simulation? This is the "long-run frequency" definition of probability in this context.
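
    One way to set up the plot in Stata is to type the sample sizes and the observed relative frequencies into a new data set. In the sketch below the variable names n, relfreq and logn are arbitrary, and the relfreq entries are made-up placeholders to be replaced by the means from your own table. The scatter command is the syntax of recent Stata releases; older releases use graph relfreq logn instead.

    ```stata
    * clear drops the simulated tosses, so record your table first
    clear
    * relfreq entries are placeholders, NOT simulation results
    input n relfreq
    10 .4
    50 .46
    100 .53
    1000 .49
    10000 .5
    end
    * horizontal axis on the log10 scale
    generate logn = log10(n)
    scatter relfreq logn
    ```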


    D. What is the median of each set of observations? Is the median a useful measure here? Explain.