*Notation:* We shall use p to denote the proportion in the entire
population (this is µ, the mean of the entire population, if you are
scoring yes as 1 and no as 0). We shall use p-hat (this should be a lowercase
p with a caret (^), or circumflex, above it) to denote the proportion in the
sample (this is x-bar, the mean of the sample).

Conversion to proportion from count data entails division by n; hence p-hat is X/n, as noted above, and the standard deviation of p-hat (recall that p-hat is a random variable) is (p(1-p)/n)^.5 (this is ((np(1-p))^.5)/n, the standard deviation of the count X divided by n). The standard deviation of p-hat is denoted sigma-p-hat (a lowercase sigma with p-hat as a subscript); this is the same as sigma-x-bar.
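
As a rough illustration of the formula above, the following Python sketch computes the standard deviation of p-hat both ways. The function names and the values p = .3 and n = 100 are assumptions chosen for illustration; they do not come from the examples in these notes.

```python
import math


def sd_of_count(p, n):
    """Standard deviation of the count X: (np(1-p))^.5."""
    return math.sqrt(n * p * (1 - p))


def sd_of_p_hat(p, n):
    """Standard deviation of the sample proportion p-hat: (p(1-p)/n)^.5."""
    return math.sqrt(p * (1 - p) / n)


# Illustrative values only (assumed, not taken from the notes).
p, n = 0.3, 100
print(sd_of_p_hat(p, n))      # sigma-p-hat computed directly
print(sd_of_count(p, n) / n)  # the same quantity via the count formula
```

The two printed values agree up to floating-point rounding, which is the equivalence stated in the parenthetical remark.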

Examples:

If 612 out of 1100 students are male, what is the 95% confidence interval for the proportion of students who are male?

p-hat = 612/1100 = .5564.

Although sigma-p-hat is defined using p, the proportion in the population, we do not know what p is (we are constructing a confidence interval for p). Therefore we must use p-hat in lieu of p: sigma-p-hat = (.56 × .44/1100)^.5 = .015.

The z-score for a 95% confidence interval is 1.96

Therefore, the 95% confidence interval for p is

(.56 - 1.96 × .015, .56 + 1.96 × .015) = (.53, .59)
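
The calculation above can be sketched in Python as follows. This is only one way to arrange it; the function name is an assumption, and the z-score is taken from the standard library's `statistics.NormalDist` (Python 3.8 or later) rather than from a table.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # p is unknown, so p-hat is used in its place in sigma-p-hat.
    sigma_p_hat = math.sqrt(p_hat * (1 - p_hat) / n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat - z * sigma_p_hat, p_hat + z * sigma_p_hat


low, high = proportion_confidence_interval(612, 1100)
print(f"({low:.2f}, {high:.2f})")  # prints (0.53, 0.59)
```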

Two-tailed test of hypothesis

If you are told that 50% of rabbits are male, but find that 94 out of a
sample of 197 are male, do you question the hypothesis?

p-hat = 94/197 = .4772

sigma-p-hat = (.5 × .5/197)^.5 = .0356. Note that when the hypothesis
specifies a value for p, we should always use it rather than p-hat in
sigma-p-hat.

z = (.477 - .5)/.0356 = -.65, which provides a P-value of .52. Since this is
large, you would not reject the hypothesis.
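
A minimal Python sketch of this two-tailed test is given here, assuming the normal approximation used throughout these notes. The function name is hypothetical, and the P-value comes from `statistics.NormalDist` instead of a z table.

```python
import math
from statistics import NormalDist


def two_tailed_proportion_test(successes, n, p0):
    """z statistic and two-tailed P-value for the hypothesis p = p0."""
    p_hat = successes / n
    # The hypothesized p0, not p-hat, is used in sigma-p-hat.
    sigma_p_hat = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sigma_p_hat
    # Both tails count as evidence against the hypothesis.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


z, p_value = two_tailed_proportion_test(94, 197, 0.5)
print(f"z = {z:.2f}, P-value = {p_value:.2f}")
```

Because the code does not round p-hat before dividing, it gives z of about -.64 rather than -.65; the P-value still rounds to .52.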

One-tailed test of hypothesis

If President Clinton said 80% of Americans approve of his policies, but you
found only 112 out of a sample of 144 approved of his policies, would you
question him?

* He is really claiming that at least 80% approve of his policies; you would
only question him if too few people approved, hence this is a one-tailed
test. *

p-hat = 112/144 = .7778

sigma-p-hat = (.8 × .2/144)^.5 =.0333.

z = (.7778 - .8)/.0333 = -.67; the corresponding one-tailed P-value is about
.25. Since this is large, you would not reject the hypothesis.
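
The one-tailed version can be sketched in the same way. This is an illustrative sketch under the normal approximation; the function name is an assumption, and only the lower tail is counted because only too few approvals would contradict the claim.

```python
import math
from statistics import NormalDist


def lower_tailed_proportion_test(successes, n, p0):
    """z statistic and lower-tail P-value for the claim p >= p0."""
    p_hat = successes / n
    # The claimed p0 is used in sigma-p-hat.
    sigma_p_hat = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sigma_p_hat
    # Only the lower tail counts as evidence against the claim.
    p_value = NormalDist().cdf(z)
    return z, p_value


z, p_value = lower_tailed_proportion_test(112, 144, 0.8)
print(f"z = {z:.2f}, P-value = {p_value:.2f}")
```

With unrounded inputs the function returns z of about -.67 and a P-value of about .25, so the hypothesis is not rejected.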

Remarks

Recall that 35% = .35; do not get confused about the position of the decimal
point.

The value of sigma-p-hat depends on the value of p. However, it is readily verified that the greatest value is obtained at p = .5. Therefore problems of the form "how large must n be to obtain a confidence interval with a radius less than or equal to a given value?" can be solved by using p = .5.
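
A minimal sketch of that sample-size calculation follows, assuming a 95% confidence level and a hypothetical target radius of .03 (neither the radius nor the function name comes from these notes). Setting z × (.5 × .5/n)^.5 less than or equal to the radius and solving for n gives n ≥ (z × .5/radius)^2.

```python
import math
from statistics import NormalDist


def required_sample_size(radius, confidence=0.95):
    """Smallest n whose interval radius is at most `radius` for every p.

    Uses the worst case p = .5, where sigma-p-hat is largest.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # radius >= z * (.5 * .5 / n)^.5  is equivalent to  n >= (z * .5 / radius)^2
    return math.ceil((z * 0.5 / radius) ** 2)


# Hypothetical target: a 95% interval no wider than plus or minus .03.
print(required_sample_size(0.03))  # prints 1068
```

Because p = .5 is the worst case, a sample of this size is large enough whatever the true proportion turns out to be.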

**Competencies:** If you get 527 heads when you flip a coin 1000 times, do you question whether it is fair, and at what significance level?

If 712 out of 1200 persons like the color blue, what is the 95% confidence interval for the proportion of the population that likes the color blue?