Hypothesis Testing with Two Samples

# 52 Comparing Two Independent Population Proportions

When conducting a hypothesis test that compares two independent population proportions, the following characteristics should be present:

1. The two samples are random samples that are independent of each other.
2. The number of successes is at least five, and the number of failures is at least five, for each of the samples.
3. A growing body of literature states that each population should be at least ten, or perhaps even 20, times the size of its sample. This keeps each population from being over-sampled, which could bias the results.
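As a rough illustration of the second condition, the Python sketch below checks whether each sample has at least five successes and at least five failures. The function name and the sample counts are hypothetical and chosen only for illustration.

```python
def meets_success_failure_rule(successes: int, n: int, minimum: int = 5) -> bool:
    """Return True if a sample of size n has at least `minimum`
    successes and at least `minimum` failures."""
    failures = n - successes
    return successes >= minimum and failures >= minimum


# Hypothetical samples, given as (number of successes, sample size).
samples = {"A": (20, 200), "B": (3, 40)}

for label, (x, n) in samples.items():
    print(label, meets_success_failure_rule(x, n))
```

Here sample A passes the check, while sample B fails it because it has only three successes.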

Comparing two proportions, like comparing two means, is common. If two estimated proportions are different, it may be due to a difference in the populations or it may be due to chance in the sampling. A hypothesis test can help determine if a difference in the estimated proportions reflects a difference in the two population proportions.

As with differences in sample means, we construct a sampling distribution for the difference in sample proportions, $P'_A - P'_B$, where $p'_A = \frac{X_A}{n_A}$ and $p'_B = \frac{X_B}{n_B}$ are the sample proportions for the two sets of data in question. $X_A$ and $X_B$ are the numbers of successes in each sample group, and $n_A$ and $n_B$ are the respective sample sizes. Again we turn to the Central Limit Theorem to find the distribution of this difference in sample proportions, and again we find that the sampling distribution is approximately normal, as seen in (Figure).

Generally, the null hypothesis allows for the test of a difference of a particular value, $\delta_0$, just as we did for the case of differences in means.

Most common, however, is the test that the two proportions are the same. That is,

$$H_0: p_A = p_B \qquad H_a: p_A \neq p_B$$

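To conduct the test, we use a pooled proportion, $p_c$.

The pooled proportion is calculated as follows:

$$p_c = \frac{X_A + X_B}{n_A + n_B}$$

The test statistic (z-score) is:

$$Z_c = \frac{(p'_A - p'_B) - \delta_0}{\sqrt{p_c(1 - p_c)\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}}$$

where $\delta_0$ is the hypothesized difference between the two proportions and $p_c$ is the pooled proportion from the formula above.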
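The pooled proportion and the z-score can be sketched in Python using only the standard library. This is a minimal illustration of the formulas described above, not a definitive implementation. The function names and the counts in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x_a, n_a, x_b, n_b, delta0=0.0):
    """z-score for the difference of two sample proportions,
    using the pooled proportion in the standard error."""
    p_a = x_a / n_a                      # sample proportion, group A
    p_b = x_b / n_b                      # sample proportion, group B
    p_c = (x_a + x_b) / (n_a + n_b)      # pooled proportion
    se = sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))
    return ((p_a - p_b) - delta0) / se


def two_tailed_p_value(z):
    """Area in both tails of the standard normal beyond |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: 30 successes out of 100 versus 20 out of 100.
z = two_proportion_z(30, 100, 20, 100)
print(f"z = {z:.3f}, two-tailed p-value = {two_tailed_p_value(z):.4f}")
```

With these made-up counts the pooled proportion is 0.25 and the z-score is about 1.633.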

A bank has recently acquired a new branch and thus has customers in this new territory. They are interested in the default rate in their new territory. They wish to test the hypothesis that the default rate is different from their current customer base. They sample 200 files in area A, their current customers, and find that 20 have defaulted. In area B, the new customers, another sample of 200 files shows 12 have defaulted on their loans. At a 10% level of significance can we say that the default rates are the same or different?

This is a test of proportions. We know this because the underlying random variable is binary, default or not default. Further, we know it is a test of differences in proportions because we have two sample groups, the current customer base and the newly acquired customer base. Let A and B be the subscripts for the two customer groups. Then pA and pB are the two population proportions we wish to test.

Random variable: $P'_A - P'_B$ = difference in the proportions of customers who defaulted in the two groups.

$H_0: p_A = p_B$

$H_a: p_A \neq p_B$

The words “is different” tell you the test is two-tailed.

Distribution for the test: Since this is a test of two binomial population proportions, the distribution is normal:

$P'_A - P'_B$ follows an approximately normal distribution; the observed difference is 0.04.

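Estimated proportion for group A: $p'_A = \frac{X_A}{n_A} = \frac{20}{200} = 0.1$

Estimated proportion for group B: $p'_B = \frac{X_B}{n_B} = \frac{12}{200} = 0.06$

The estimated difference between the two groups is $p'_A - p'_B = 0.1 - 0.06 = 0.04$.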

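The pooled proportion is $p_c = \frac{20 + 12}{200 + 200} = 0.08$, so the calculated test statistic is $Z_c = \frac{0.04}{\sqrt{0.08(0.92)\left(\frac{1}{200} + \frac{1}{200}\right)}} \approx 1.47$. This lies between the critical values of ±1.645 for a two-tailed test at the 10% level, so it is not in the tail of the distribution.

Make a decision: Since the calculated test statistic is not in the tail of the distribution, we cannot reject $H_0$.

Conclusion: At a 10% level of significance, from the sample data, there is not sufficient evidence to conclude that there is a difference between the proportions of customers who defaulted in the two groups.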
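The arithmetic in the bank example can be checked with a short Python sketch. It recomputes the test statistic from the counts given in the problem (20 of 200 and 12 of 200) and compares it with the two-tailed critical value at the 10% level stated in the problem. The variable names are illustrative assumptions, not part of the original example.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the bank example: defaults and files sampled in each area.
x_a, n_a = 20, 200   # area A, current customers
x_b, n_b = 12, 200   # area B, new customers
alpha = 0.10         # significance level stated in the problem

p_a = x_a / n_a
p_b = x_b / n_b
p_c = (x_a + x_b) / (n_a + n_b)                      # pooled proportion
se = sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))     # standard error under H0
z = (p_a - p_b) / se                                 # hypothesized difference is 0

# Two-tailed test: reject H0 only if |z| exceeds the critical value.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical value = ±{z_crit:.3f}, reject H0: {reject_h0}")
```

Comparing the statistic with a critical value and comparing a p-value with α always lead to the same decision.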

Try It

Two types of valves are being tested to determine if there is a difference in pressure tolerances. Fifteen out of a random sample of 100 of Valve A cracked under 4,500 psi. Six out of a random sample of 100 of Valve B cracked under 4,500 psi. Test at a 5% level of significance.

The p-value is 0.0379, so we can reject the null hypothesis. At the 5% significance level, the data support that there is a difference in the pressure tolerances between the two valves.
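The p-value for the valve problem can be reproduced from the counts with the sketch below. It assumes the pooled two-proportion z-test described in this section, and the variable names are illustrative. The result should come out close to the 0.0379 reported above.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the valve problem: valves that cracked and valves tested.
x_a, n_a = 15, 100   # Valve A
x_b, n_b = 6, 100    # Valve B

p_c = (x_a + x_b) / (n_a + n_b)                      # pooled proportion
se = sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))     # standard error under H0
z = (x_a / n_a - x_b / n_b) / se

# "Is there a difference" makes this a two-tailed test.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```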

### References

Data from Educational Resources, December catalog.

Data from Hilton Hotels. Available online at http://www.hilton.com (accessed June 17, 2013).

Data from Hyatt Hotels. Available online at http://hyatt.com (accessed June 17, 2013).

Data from Statistics, United States Department of Health and Human Services.

Data from Whitney Exhibit on loan to San Jose Museum of Art.

Data from the American Cancer Society. Available online at http://www.cancer.org/index (accessed June 17, 2013).

Data from the Chancellor’s Office, California Community Colleges, November 1994.

“State of the States.” Gallup, 2013. Available online at http://www.gallup.com/poll/125066/State-States.aspx?ref=interactive (accessed June 17, 2013).

“West Nile Virus.” Centers for Disease Control and Prevention. Available online at http://www.cdc.gov/ncidod/dvbid/westnile/index.htm (accessed June 17, 2013).

### Chapter Review

Test of two population proportions from independent samples.

• Random variable: difference between the two estimated proportions
• Distribution: normal distribution

### Formula Review

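Pooled Proportion: $p_c = \frac{X_A + X_B}{n_A + n_B}$

Test Statistic (z-score): $Z_c = \frac{(p'_A - p'_B) - (p_A - p_B)}{\sqrt{p_c(1 - p_c)\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}}$

where

$p'_A$ and $p'_B$ are the sample proportions, $p_A$ and $p_B$ are the population proportions (their hypothesized difference $p_A - p_B$ is $\delta_0$),

$p_c$ is the pooled proportion, and $n_A$ and $n_B$ are the sample sizes.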

Use the following information for the next five exercises. Two types of phone operating system are being tested to determine if there is a difference in the proportions of system failures (crashes). Fifteen out of a random sample of 150 phones with OS1 had system failures within the first eight hours of operation. Nine out of another random sample of 150 phones with OS2 had system failures within the first eight hours of operation. OS2 is believed to be more stable (have fewer crashes) than OS1.

Is this a test of means or proportions?

What is the random variable?

$P'_{OS1} - P'_{OS2}$ = difference in the proportions of phones that had system failures within the first eight hours of operation with OS1 and OS2.

State the null and alternative hypotheses.

What can you conclude about the two operating systems?

Use the following information to answer the next twelve exercises. In the recent Census, three percent of the U.S. population reported being of two or more races. However, the percent varies tremendously from state to state. Suppose that two random surveys are conducted. In the first random survey, out of 1,000 North Dakotans, only nine people reported being of two or more races. In the second random survey, out of 500 Nevadans, 17 people reported being of two or more races. Conduct a hypothesis test to determine if the population percents are the same for the two states or if the percent for Nevada is statistically higher than for North Dakota.

Is this a test of means or proportions?

proportions

State the null and alternative hypotheses.

1. H0: _________
2. Ha: _________

Is this a right-tailed, left-tailed, or two-tailed test? How do you know?

right-tailed

What is the random variable of interest for this test?

In words, define the random variable for this test.

The random variable is the difference in proportions (percents) of the populations that are of two or more races in Nevada and North Dakota.

Which distribution (normal or Student’s t) would you use for this hypothesis test?

Explain why you chose the distribution you did for Exercise 10.56.

Our sample sizes are much greater than five each, so we use the normal distribution for two proportions for this hypothesis test.

Calculate the test statistic.

At a pre-conceived α = 0.05, what is your:

1. Decision:
2. Reason for the decision:
3. Conclusion (write out in a complete sentence):
1. Cannot accept the null hypothesis.
2. p-value < alpha
3. At the 5% significance level, there is sufficient evidence to conclude that the proportion (percent) of the population that is of two or more races in Nevada is statistically higher than that in North Dakota.
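For a one-sided alternative such as this one, only one tail of the normal distribution is used. The sketch below applies the pooled two-proportion z-test to the survey counts (17 of 500 Nevadans and 9 of 1,000 North Dakotans), taking the difference as Nevada minus North Dakota so that the test is right-tailed. The variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Survey counts from the exercise: people reporting two or more races.
x_nv, n_nv = 17, 500     # Nevada
x_nd, n_nd = 9, 1000     # North Dakota
alpha = 0.05

p_c = (x_nv + x_nd) / (n_nv + n_nd)                    # pooled proportion
se = sqrt(p_c * (1 - p_c) * (1 / n_nv + 1 / n_nd))     # standard error under H0
z = (x_nv / n_nv - x_nd / n_nd) / se                   # Nevada minus North Dakota

# Right-tailed test: the p-value is the area above z.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.5f}, p-value < alpha: {p_value < alpha}")
```

Because the alternative hypothesis has a direction, the p-value is not doubled as it would be in a two-tailed test.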

Does it appear that the proportion of Nevadans who are two or more races is higher than the proportion of North Dakotans? Why or why not?

### Homework

A recent drug survey showed an increase in the use of drugs and alcohol among local high school seniors as compared to the national percent. Suppose that a survey of 100 local seniors and 100 national seniors is conducted to see if the proportion of drug and alcohol use is higher locally than nationally. Locally, 65 seniors reported using drugs or alcohol within the past month, while 60 national seniors reported using them.

We are interested in whether the proportions of female suicide victims for ages 15 to 24 are the same for the white and black races in the United States. We randomly pick one year, 1992, to compare the races. The number of suicides estimated in the United States in 1992 for white females is 4,930. Five hundred eighty were aged 15 to 24. The estimate for black females is 330. Forty were aged 15 to 24. We will let female suicide victims be our population.

1. $H_0: P_W = P_B$
2. $H_a: P_W \neq P_B$
3. The random variable is the difference in the proportions of white and black suicide victims, aged 15 to 24.
4. normal for two proportions
5. test statistic: –0.1944
6. p-value: 0.8458
7. Check student’s solution.
1. Alpha: 0.05
2. Decision: Cannot reject the null hypothesis.
3. Reason for decision: p-value > alpha
4. Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the proportions of white and black female suicide victims, aged 15 to 24, are different.

Elizabeth Mjelde, an art history professor, was interested in whether the value from the Golden Ratio formula was the same in the Whitney Exhibit for works from 1900 to 1919 as for works from 1920 to 1942. Thirty-seven early works were sampled, averaging 1.74 with a standard deviation of 0.11. Sixty-five of the later works were sampled, averaging 1.746 with a standard deviation of 0.1064. Do you think that there is a significant difference in the Golden Ratio calculation?

A recent year was randomly picked from 1985 to the present. In that year, there were 2,051 Hispanic students at Cabrillo College out of a total of 12,328 students. At Lake Tahoe College, there were 321 Hispanic students out of a total of 2,441 students. In general, do you think that the percent of Hispanic students at the two colleges is basically the same or different?

Subscripts: 1 = Cabrillo College, 2 = Lake Tahoe College

1. The random variable is the difference between the proportions of Hispanic students at Cabrillo College and Lake Tahoe College.
2. normal for two proportions
3. test statistic: 4.29
4. p-value: 0.00002
5. Check student’s solution.
1. Alpha: 0.05
2. Decision: Cannot accept the null hypothesis.
3. Reason for decision: p-value < alpha
4. Conclusion: There is sufficient evidence to conclude that the proportions of Hispanic students at Cabrillo College and Lake Tahoe College are different.

Use the following information to answer the next three exercises. Neuroinvasive West Nile virus is a severe disease that affects a person’s nervous system. It is spread by the Culex species of mosquito. In the United States in 2010 there were 629 reported cases of neuroinvasive West Nile virus out of a total of 1,021 reported cases, and in 2011 there were 486 reported neuroinvasive cases out of a total of 712 reported cases. Is the 2011 proportion of neuroinvasive West Nile virus cases more than the 2010 proportion of neuroinvasive West Nile virus cases? Using a 1% level of significance, conduct an appropriate hypothesis test.

• “2011” subscript: 2011 group.
• “2010” subscript: 2010 group

This is:

1. a test of two proportions
2. a test of two independent means
3. a test of a single mean
4. a test of matched pairs.

An appropriate null hypothesis is:

1. $p_{2011} \le p_{2010}$
2. $p_{2011} \ge p_{2010}$
3. $\mu_{2011} \le \mu_{2010}$
4. $p_{2011} > p_{2010}$

a

Researchers conducted a study to find out if there is a difference in the use of eReaders by different age groups. Randomly selected participants were divided into two age groups. In the 16- to 29-year-old group, 7% of the 628 surveyed use eReaders, while 11% of the 2,309 participants 30 years old and older use eReaders.

Test: two independent sample proportions.

Random variable: $p'_1 - p'_2$

Distribution: normal for two proportions

$H_0: p_1 = p_2$; $H_a: p_1 \neq p_2$

The proportion of eReader users is different for the 16- to 29-year-old users from that of the 30 and older users.

Graph: two-tailed

Adults aged 18 years old and older were randomly selected for a survey on obesity. Adults are considered obese if their body mass index (BMI) is at least 30. The researchers wanted to determine if the proportion of women who are obese in the south is less than the proportion of southern men who are obese. The results are shown in (Figure). Test at the 1% level of significance.

| | Number who are obese | Sample size |
|---|---|---|
| Men | 42,769 | 155,525 |
| Women | 67,169 | 248,775 |

Two computer users were discussing tablet computers. A higher proportion of people ages 16 to 29 use tablets than the proportion of people age 30 and older. (Figure) details the number of tablet owners for each age group. Test at the 1% level of significance.

| | 16–29 year olds | 30 years old and older |
|---|---|---|
| Own a tablet | 69 | 231 |
| Sample size | 628 | 2,309 |

Test: two independent sample proportions

Random variable: $p'_1 - p'_2$

Distribution: normal for two proportions

$H_0: p_1 = p_2$; $H_a: p_1 > p_2$

A higher proportion of tablet owners are aged 16 to 29 years old than are 30 years old and older.

Graph: right-tailed

Do not reject the H0.

Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a higher proportion of tablet owners are aged 16 to 29 years old than are 30 years old and older.

A group of friends debated whether more men use smartphones than women. They consulted a research study of smartphone use among adults. The results of the survey indicate that of the 973 men randomly sampled, 379 use smartphones. For women, 404 of the 1,304 who were randomly sampled use smartphones. Test at the 5% level of significance.

While her husband spent 2½ hours picking out new speakers, a statistician decided to determine whether the percent of men who enjoy shopping for electronic equipment is higher than the percent of women who enjoy shopping for electronic equipment. The population was Saturday afternoon shoppers. Out of 67 men, 24 said they enjoyed the activity. Eight of the 24 women surveyed claimed to enjoy the activity. Interpret the results of the survey.

Subscripts: 1: men; 2: women

1. $P'_1 - P'_2$ is the difference between the proportions of men and women who enjoy shopping for electronic equipment.
2. normal for two proportions
3. test statistic: 0.22
4. p-value: 0.4133
5. Check student’s solution.
1. Alpha: 0.05
2. Decision: Cannot reject the null hypothesis.
3. Reason for Decision: p-value > alpha
4. Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the proportion of men who enjoy shopping for electronic equipment is more than the proportion of women.

We are interested in whether children’s educational computer software costs less, on average, than children’s entertainment software. Thirty-six educational software titles were randomly picked from a catalog. The mean cost was $31.14 with a standard deviation of $4.69. Thirty-five entertainment software titles were randomly picked from the same catalog. The mean cost was $33.86 with a standard deviation of $10.87. Decide whether children’s educational software costs less, on average, than children’s entertainment software.

Joan Nguyen recently claimed that the proportion of college-age males with at least one pierced ear is as high as the proportion of college-age females. She conducted a survey in her classes. Out of 107 males, 20 had at least one pierced ear. Out of 92 females, 47 had at least one pierced ear. Do you believe that the proportion of males has reached the proportion of females?

1. The random variable is the difference between the proportions of men and women that have at least one pierced ear.
2. normal for two proportions
3. test statistic: –4.82
4. p-value: zero
5. Check student’s solution.
1. Alpha: 0.05
2. Decision: Cannot accept the null hypothesis.
3. Reason for Decision: p-value < alpha
4. Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the proportions of males and females with at least one pierced ear are different.

“To Breakfast or Not to Breakfast?” by Richard Ayore

In the American society, birthdays are one of those days that everyone looks forward to. People of different ages and peer groups gather to mark the 18th, 20th, …, birthdays. During this time, one looks back to see what he or she has achieved for the past year and also focuses ahead for more to come.

If, by any chance, I am invited to one of these parties, my experience is always different. Instead of dancing around with my friends while the music is booming, I get carried away by memories of my family back home in Kenya. I remember the good times I had with my brothers and sister while we did our daily routine.

Every morning, I remember we went to the shamba (garden) to weed our crops. I remember one day arguing with my brother as to why he always remained behind just to join us an hour later. In his defense, he said that he preferred waiting for breakfast before he came to weed. He said, “This is why I always work more hours than you guys!”

And so, to prove him wrong or right, we decided to give it a try. One day we went to work as usual without breakfast, and recorded the time we could work before getting tired and stopping. On the next day, we all ate breakfast before going to work. We recorded how long we worked again before getting tired and stopping. Of interest was our mean increase in work time. Though not sure, my brother insisted that it was more than two hours. Using the data in (Figure), solve our problem.

| Work hours with breakfast | Work hours without breakfast |
|---|---|
| 8 | 6 |
| 7 | 5 |
| 9 | 5 |
| 5 | 4 |
| 9 | 7 |
| 8 | 7 |
| 10 | 7 |
| 7 | 5 |
| 6 | 6 |
| 9 | 5 |

1. H0: µd = 0
2. Ha: µd > 0
3. The random variable $\bar{X}_d$ is the mean difference in work times on days when eating breakfast and on days when not eating breakfast.
4. $t_9$
5. test statistic: 4.8963
6. p-value: 0.0004
7. Check student’s solution.
1. Alpha: 0.05
2. Decision: Cannot accept the null hypothesis.
3. Reason for Decision: p-value < alpha
4. Conclusion: At the 5% level of significance, there is sufficient evidence to conclude that the mean difference in work times on days when eating breakfast and on days when not eating breakfast has increased.