# LHS AP STATISTICS


Unit 1-6 Review

- measures of center and spread and their resistance to outliers
- boxplots/stemplots/dotplots/histograms: be able to graph and interpret them
- outliers: be able to recognize them on a graph and determine whether an observation is an outlier
- compare and contrast back-to-back stemplots/boxplots (center, shape, spread, outliers)
- recognize skewed left vs. skewed right
- quantitative vs. categorical
- sketch a normal distribution using the 68-95-99.7 rule
- z-scores
- normal distribution notation: $N(\mu, \sigma)$
- proportions of a density curve
- assess normality using a histogram
- correlation: strong vs. weak, positive vs. negative vs. none; determine the effects of outliers
- calculate the LSRL and use it to make predictions
- read computer output to find the LSRL equation, $r$, and $r^2$
- interpret the slope, correlation coefficient, and coefficient of determination
- identify outliers and influential points of a dotplot
- types of lurking variables
- exponential and power models
- use of residual plots
- SRS, systematic, multistage, and stratified sampling; convenience sampling
- block experimental design
- simulations
- treatments, factors, levels
- placebo effect, response bias
- observational study vs. experiment
- basic principles of design: control, randomization, and replication
- general addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- general multiplication rule: $P(A \cap B) = P(B \mid A)\,P(A)$
- conditional probability
- disjoint (add probabilities) vs. independent (multiply probabilities)
- complement rule
- discrete vs. continuous random variables
- mean/variance/standard deviation of random variables
- combining random variables
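The addition and multiplication rules in the topic list can be checked by brute-force enumeration. The sketch below is illustrative only: the fair six-sided die and the events `A` (roll is even) and `B` (roll is greater than 3) are assumptions chosen for the example and are not taken from the review questions.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # assumed event: roll is even
B = {4, 5, 6}   # assumed event: roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))


def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)


# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
addition_lhs = prob(A | B)
addition_rhs = prob(A) + prob(B) - prob(A & B)

# General multiplication rule: P(A and B) = P(B | A) * P(A)
mult_lhs = prob(A & B)
mult_rhs = cond_prob(B, A) * prob(A)

# A and B are independent only if knowing A leaves P(B) unchanged.
independent = cond_prob(B, A) == prob(B)

print(addition_lhs, addition_rhs)  # both sides of the addition rule
print(mult_lhs, mult_rhs)          # both sides of the multiplication rule
print(independent)
```

Exact fractions are used instead of floats so that the two sides of each rule can be compared with `==` without rounding concerns.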

Review Questions:

1. If you eat at Silly Bob’s Cafe, there’s a 40% chance that your food will be cold (C) and a 30% chance that your food will taste bad (B). Assume that these two events are independent.
   a. What is the probability that both will occur: your food is cold and it tastes bad?
   b. What is the probability that your food is cold or it tastes bad?
2. Members of a group of 50 college students were classified according to their hometown and ethnicity:

| Hometown | Anglo | Hispanic | Asian | African-Amer. |
|---|---|---|---|---|
| Houston | 7 | 4 | 4 | 6 |
| San Antonio | 8 | 7 | 11 | 3 |

Assume random selection in each item. What is the probability that:
   a. a member is Asian?
   b. a member is Hispanic and from San Antonio?
   c. a member is African-American or from Houston?
   d. a member from San Antonio is Anglo?
   e. a member is from San Antonio, given that he/she is Asian?
   f. Are being Hispanic and being from San Antonio independent? Prove numerically.
3. If P(M) = .3, P(N) = .5, and P(M or N) = .55, find P(N|M).

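4. Use the following tree diagram to calculate $P(F \mid G^C)$ and $P(G \mid F)$.

Tree diagram (first-stage branch, then second-stage branches, with branch probabilities):

- $F$ (.2)
  - $G$ (.4)
  - $G^C$ (.6)
- $F^C$ (.8)
  - $G$ (.65)
  - $G^C$ (.35)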

5. The Round Rock Express have a pitcher with an impressive winning record. Opposing teams are attempting to analyze his pitches. The pitcher throws a strike 35% of the time. If the pitcher throws a strike, the next pitch is a fastball 40% of the time. If the pitcher doesn’t throw a strike, the next pitch is a fastball 80% of the time. What is the probability of him throwing a fastball? What is the probability of him throwing a fastball given that the previous pitch was not a strike? (Hint: use a tree diagram.)
6. The best male long jumpers for State College since 1973 have averaged a jump of 263.0 inches with a standard deviation of 14.0 inches. The best female long jumpers have averaged 201.2 inches with a standard deviation of 7.7 inches. This year Joey jumped 275 inches and his sister, Carla, jumped 207 inches. Both are State College students. Assume that male and female jumps are normally distributed. Within their groups, which athlete had the more impressive performance? Explain briefly.
7. For a normally distributed population, fill in the following blanks:
   (a) ____% of the population observations lie within 1.96 standard deviations on either side of the mean.
   (b) ____% of the population observations lie within 1.64 standard deviations on either side of the mean.

8. The length of pregnancies from conception to natural birth among a certain female population is a normally distributed random variable with mean 270 days and standard deviation 10 days.
   (a) What percent of pregnancies last more than 300 days?
   (b) How short must a pregnancy be in order to fall in the shortest 10% of all pregnancies?
9. The test grades for a certain class were entered into a Minitab worksheet, and then “Descriptive Statistics” were requested. The results were:

`MTB > Describe 'Grades'.`

| | N | MEAN | MEDIAN | TRMEAN | STDEV | SEMEAN |
|---|---|---|---|---|---|---|
| Grades | 28 | 74.71 | 76.00 | 75.50 | 12.61 | 2.38 |

| | MIN | MAX | Q1 | Q3 |
|---|---|---|---|---|
| Grades | 35.00 | 94.00 | 68.00 | 84.00 |

You happened to see, on a scrap of paper, that the lowest grades were 35, 57, 59, 60, . . . but you don’t know what the other individual grades are. Nevertheless, a knowledgeable user of statistics can tell a lot about the dataset simply by studying the set of descriptive statistics above.
   (a) Write a brief description of what the results tell you about the distribution of grades. Be sure to address:
      - the general shape of the distribution
      - unusual features, including possible outliers
      - the middle 50% of the data
      - any significance in the difference between the mean and the median
   (b) Construct a modified boxplot for these data.

10. We all “know” that the body temperature of a healthy person is 98.6 °F. In reality, the actual body temperature of individuals varies. Here are boxplots, produced by Minitab, for the body temperatures of 130 individuals (65 males and 65 females).

[Figure: Minitab boxplots of Temps by Gender (1, 2); temperature axis from 96.0 to 100.8, outliers marked with asterisks]

According to Minitab, µ = 98.103 and σ = 0.700 for the male temperatures. If we assume that the males’ temperatures were normally distributed, what percent would have temperatures at 98.7 or above?

Exercises 11 & 12 relate to the following. Bias is present in each of the following sampling designs. In each case, identify the type of bias involved and state whether you think the sampling frequency obtained is lower or higher than the actual population parameter.

11. A political pollster seeks information about the proportion of American adults that oppose gun controls. He asks a SRS of 1000 American adults: "Do you agree or disagree with the following statement: Americans should preserve their constitutional right to keep and bear arms." A total of 910, or 91%, said "agree" (that is, 910 out of the 1000 oppose gun controls).
12. A flour company in Minneapolis wants to know what percentage of local households bake at least twice a week. A company representative calls 500 households during the daytime and finds that 50% of them bake at least twice a week.

Exercises 13–17 relate to the following. At summer camp, one of Carla’s counselors told her that air temperature can be determined from the number of cricket chirps.

13. What is the explanatory variable, and what is the response variable? (Note: this is in the context of this problem, not in the biological sense.)
   EXPLANATORY:
   RESPONSE:

To determine a formula, Carla collected data on temperature and number of chirps per minute on 12 occasions. She entered the data into lists L1 and L2 of her TI-83 and then did STATS, CALC, 2-Var Stats. Here are some of the results: $\bar{x} = 166.8$, $\bar{y} = 78.83$, $s_x = 31.0$, $s_y = 9.11$, $r = 0.461$.

14. Use this information to determine the equation of the LSRL.
15. One of Carla’s data points was recorded on a particularly hot day (93 °F). She counted 249 cricket chirps in one minute. What temperature would Carla’s model predict for this number of cricket chirps? (Round to the nearest degree.)
16. What is the residual for the data point in exercise 15?
17. Suppose that Carla counted 249 chirps on a day when the temperature was 55 °F. If this point were the 13th data point, what effect, if any, would this 13th point have on Carla’s LSRL? Explain briefly.
18. In general, is correlation a resistant measure of association? Explain briefly or give a simple example to illustrate.
19. Is the least-squares regression line resistant? Explain briefly or give a simple example to illustrate.

Exercises 20–21 relate to the following. Suppose the Richmond-Times Dispatch asks a sample of 150 Richmonders their opinions on the quality of life in Richmond.

20. Is this study an experiment? Explain why or why not.
21. Identify the sample and the population in the opinion poll in #20.

Exercises 22–24 relate to the following. Read the brief article about aspirin and alcohol.

Aspirin may enhance impairment by alcohol

Aspirin, a longtime antidote for the side effects of drinking, may actually enhance alcohol’s effect, researchers at the Bronx Veteran’s Affairs Medical Center say. In a report on a study published in the Journal of the American Medical Association, the researchers said they found that aspirin significantly lowered the body’s ability to break down alcohol in the stomach. As a result, five volunteers who had a standard breakfast and two extra-strength aspirin tablets an hour before drinking had blood alcohol levels 30 percent higher than when they drank alcohol alone. Each volunteer consumed the equivalent of a glass and a half of wine. That 30 percent could make the difference between sobriety and impairment, said Dr. Charles S. Lieber, medical director of the Alcohol Research and Treatment Center at the Bronx center, who was co-author of the report with Dr. Risto Roine.

22. Does this article describe an experiment? Explain.
23. Did this study involve a simple random sample (SRS)? Explain.
24. Did this study use a particular design that we have studied? If so, identify the design. Then comment on the validity of the study.

Exercises 25–26 relate to the following. It is believed that 75% of all apartment dwellers in a large city deadbolt their doors in addition to locking them as an added precaution against burglary.

25. Describe (in words, and in detail) how you would simulate a SRS of 20 apartment dwellers.
26. Beginning at line 127 in the random digit table, actually simulate a SRS of 20 apartment dwellers. (Reminder: Show Your Work!) What is the proportion $\hat{p}$ of people in the sample who deadbolt their doors?

Exercises 27–29 relate to the following. You are participating in the design of a medical experiment to investigate whether or not a calcium supplement in the diet will reduce the blood pressure of middle-aged men. Preliminary research suggests that the supplement may have a greater effect on black men than on white men.

27. What sort of experimental design would you choose, and why?
28. Assume that the experimental population consists of 600 white men and 500 black men. Outline in a diagram the design of the experiment. (Be sure to indicate how many subjects are assigned to the various treatment groups.)
29. Use Line 134 of the Random Number Table to select the first 5 whites for the study, and use Line 142 to select the first 5 blacks for the study.

30. What is meant by disjoint (mutually exclusive) events? Give an example of two disjoint events.
31. Define and give an example of two complementary events.

Exercises 32–33 relate to the following. When two dice are rolled, find the probability of getting

32. A sum greater than 9
33. A sum less than 4 or greater than 9

Exercises 34–35 relate to the following. A coin is tossed five times.

34. Find the probability of getting at least one tail.
35. Find the probability of getting 4 tails.

Exercises 36–40 relate to the following. Suppose you are given a standard 6-sided die and told that the die is “loaded” in such a way that while the numbers 1, 3, 4, and 6 are equally likely to turn up, the numbers 2 and 5 are three times as likely to turn up as any of the other numbers.

36. The die is rolled once and the number turning up is observed. Use the information given above to fill in the following table:

| Outcome | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Probability | | | | | | |

37. Let A be the event: the number rolled is a prime number (a number is prime if its only factors are 1 and the number itself; note that 1 is not prime). List the outcomes in A and find P(A).
38. Let B be the event: the number rolled is an even number. List the outcomes in B, and find P(B).
39. Are events A and B disjoint? Explain briefly.
40. Determine if events A and B are independent.

Exercises 41–46 relate to the following. Consolidated Builders has bid on two large construction contracts. The company president believes that the probability of winning the first contract (event A) is 0.6, that the probability of winning a second (event B) is 0.4, and that the probability of winning both jobs is 0.2.

41. What is the probability of the event {A or B} that Consolidated will win at least one of the jobs?
42. Draw a Venn diagram that shows the relation between the events A and B in Exercise 41.

Write each of the following events in terms of $A$, $B$, $A^c$, and $B^c$. Indicate the events on your diagram for #42, and use the information in #41 to calculate the probability of each.

43. Consolidated wins both jobs.
44. Consolidated wins the first job but not the second.
45. Consolidated does not win the first job but does win the second.
46. Consolidated does not win either job.

47. “Normal” body temperature varies by time of day. A series of readings was taken of the body temperature of a subject. The mean reading was found to be 36.5 °C with a standard deviation of 0.3 °C. Find the mean and standard deviation when converted to °F. °F = °C(1.8) + 32.
48. The heights of American men aged 18 to 24 are approximately normally distributed with mean 68 inches and standard deviation 2.5 inches. Half of all young men are shorter than what height?
49. Use the information in the previous problem. About 5% of young men have heights outside what range?
50. Find the area under the standard normal curve corresponding to –0.3 < Z < 1.6.
51. In a statistics course, a linear regression equation was computed to predict the final exam score from the score on the first test. The equation was y = 10 + .9x, where y is the final exam score and x is the score on the first test. Carla scored 95 on the first test. What is the predicted value of her score on the final exam?
52. Refer to the previous problem. On the final exam Carla scored 98. What is the value of her residual?
53. A study of the fuel economy for various automobiles plotted the fuel consumption (in liters of gasoline used per 100 kilometers traveled) vs. speed (in kilometers per hour). A least squares line was fit to the data. Here is the residual plot from this least squares fit. What does the pattern of the residuals tell you about the linear model?
54. What do we call a sample that consists of the entire population?
55. A member of Congress wants to know what his constituents think of proposed legislation on health insurance. His staff reports that 228 letters have been received on the subject, of which 193 oppose the legislation. What is the population in this situation?
56. What methods will improve the accuracy of a sample?
57. You play tennis regularly with a friend, and from past experience, you believe that the outcome of each match is independent. For any given match you have a probability of .6 of winning. What is the probability that you win the next two matches?
58. If P(A) = 0.24 and P(B) = 0.52 and A and B are independent, what is P(A or B)?

Exercises 59–60 relate to the following. Ashley and Lizzette have been playing basketball for neighboring schools since they were freshman phenoms. Now that they are seniors, the media has been calling them the “best two female players in the district.” The career average number of points scored per game by Ashley is 17.3 with a standard deviation of 4.2. Lizzette’s career average number of points has mean 18.9 with standard deviation 7.5.

59. Find the mean of the difference between Ashley’s and Lizzette’s career average number of points scored per game.
60. Suppose the official in your district who is responsible for sports statistics feels that the performances of the girls on the court can be considered independent. With this assumption, find the standard deviation of the difference between their career average number of points scored per game.
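For the "combining random variables" topic, the rules for a difference of two random variables can be sketched as follows. This is a minimal illustration, not a solution key: the function name `difference_mean_sd` and the numbers in the usage lines are hypothetical, and the variance rule shown holds only when the two variables are independent.

```python
import math


def difference_mean_sd(mean_x, sd_x, mean_y, sd_y):
    """Mean and standard deviation of X - Y, assuming X and Y are independent.

    Means subtract, but variances (not standard deviations) add,
    even though the variables are being subtracted.
    """
    mean_diff = mean_x - mean_y
    sd_diff = math.sqrt(sd_x ** 2 + sd_y ** 2)
    return mean_diff, sd_diff


# Hypothetical values chosen for easy arithmetic, not taken from the exercises.
mean_diff, sd_diff = difference_mean_sd(10.0, 3.0, 4.0, 4.0)
print(mean_diff, sd_diff)  # 6.0 5.0
```

A common slip is to subtract the standard deviations; the sketch makes the point that the spread of a difference is larger than either individual spread.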