8 Data Analysis 2
8.4 ZScores and the Normal Curve
Learning Objectives
By the end of this section it is expected that you will be able to:
 Convert a data item to a zscore
 Solve applications using zscore tables
The Normal Curve
When a set of data values is normally distributed, the 689599.7 Rule can be used to determine the percentage of values that lie one, two or three standard deviations from the mean. We will shift gears and explore how to determine where a specific data value lies in relation to all other values. As an example, a student who has written a college entrance exam may want to know where they placed in comparison to all other students. This section will explore how to determine this.
Consider the normal curve which is an idealized representation of a normally distributed population. The normal curve, also called a bellshaped curve, is represented in Figure 1. The area under the curve represents 100% (or 1.00) of the data (or population) and the mean score is 0.
We have seen that the standard deviation plays an important role in the normal distribution.
Refer to Figure 2 for the visual representation of the 68 – 95 – 99.7 Rule. For a normally distributed set of data:
 Approximately 68% (68.26%) of the data items fall within one standard deviation of the mean.
 Approximately 95% (95.44%) of the data items fall within two standard deviations of the mean.
 Approximately 99.7% of the data items fall within three standard deviations of the mean.
Z–Scores
When a data set is normally distributed we can use a standardized score, called the zscore, to determine the number of standard deviations that a data value is from the mean.
Reconsider an example from the previous section. Acertain segment of the economy has a normally distributed salary, with a mean salary of $45,000 and a standard deviation of $4000. Refer to Figure 3.
With this information we are able to determine that a salary of $49,000 lies exactly one standard deviation above the mean since $45,000 + $4000 = $49, 000. In turn, using the 689599.7 Rule we can determine that a salary of $49,000 is higher than 84% of the other salaries for this segment of the economy. The calculation would be 50% + (68%/2) = 84%. This calculation was possible since $49,000 was exactly one standard deviation away from the mean.
Consider a salary which does not lie exactly one, two or three standard deviations from the mean, such as $38,500. The calculation does not appear so straightforward but as it turns out we can use a zscore for situations such as this. A zscore converts a data value and standardizes it so that we are able to determine how many standard deviations a specific data value will lie above or below the mean.
Zscores can be used in situations with a normal distribution. Consider a chemistry class with a set of test scores that is normally distributed. The average score is 76% and one student receives a score of 55%. Converting the 55% to a zscore will provide the student with a sense of where their score lies with respect to the rest of the class. We can also use zscores to determine the percent of the data values that will lie between any two data values. Perhaps we wish to determine the percentage of students whose test scores lie between 70% to 85%. This can also be done using zscores.
We have seen that when calculating standard deviation we must consider whether we are working with the entire population or a sample of the population. We must do the same when calculating a zscore.
Formulas for Finding ZScores
The zscore represents the number of standard deviations a data value is from the mean value.
The formula for z is:
For a population we calculate the zscore using the population mean μ and standard deviation σ. The data value is represented by x.
For a representative sample of the population we calculate the zscore using the sample mean and standard deviation.:
A zscore is similar to a percentile in that it is a measure of position. As a rule, zscores above 2.0 (or below –2.0) are considered “unusual” values. According to the 689599.7 Rule, in a normal population such scores would occur less than 5% of the time. Zscores between 2.0 and 2.0 are considered “ordinary” values and these represent 95% of the values.
EXAMPLE 1
IQ scores are normally distributed. The mean IQ is 100 and the standard deviation is 15.
a) If Frank has an IQ of 127, find his zscore. b) Intrepret the meaning of this zscore. c) Using the 689599.7 Rule, how does Frank’s IQ compare to the rest of the population?
Solution
a) We will use the zscore for a population:
Frank’s zscore is 1.8.
b) For this zscore, the mean of 100 has been “standardized” to a value of 0 and the score of 127 has been standardized to a value of 1.8. This means that Frank’s IQ score is 1.8 (almost 2 standard deviations) higher than the average.
c) Considering the 689599.7 Rule, Frank’s score lies within 2 standard deviations of the mean. His score is certainly better than at least 84% of the population but does not rank in the top 2.5% of the population.
TRY IT 1
Consider a chemistry class and a set of test scores with an average of 76% and a standard deviation of 7%. A student receives a test score of 55%. a) Determine the student’s zscore. b) Intrepret the meaning of this zscore. c) Using the 689599.7 Rule, how does the student’s test score compare to the rest of the class?
Show answer
a) zscore = 3 b) This zscore is exactly 3 standard deviations less than the mean score of 76%. c) This student scored better than only 0.15% of the class (or 99.85% of the class scored higher than this student).
In Example 1 we were able to determine that Frank’s score is better than at least 84% of the population but it does not rank in the top 2.5% of the population. This is a fairly broad conclusion. As it turns out we can be more specific if we use zscore tables.
ZScore Tables
A zscore table allows us to determine, for a normal distribution, the percentage of data values that lie below (to the left) of a specific z–score. This in turn will enable us to determine the percentage of values that lie between or to the right of a given zscore.
We will use two different ztables, one for positive zscores and one for negative zscores. These tables are available online (or refer to the Tables at the end of this chapter).
A portion of a positive zscore table is shown in Figure 4. We use the positive ztable when we have zscores that are greater than 0 or lie to the right of the mean. The number in the ztable represents the area under the bell curve to the left of the zscore. The number is stated as a decimal fraction which can then be converted to a percentage by multiplying by 100. If for example the number in the table is 0.62552, then this would be interpreted as 62.552%.
Consider a data value X that is found to have a standardized zscore of 1.42. A data value with a zscore of 1.42 will lie between 1 and 2 standard deviations above the mean (refer to Fig. 5). We can use a zzcore table to determine the proportion of data vales that are less than (or geater than) this score.
This zscore value of 1.42 is positive so we refer to the positive zscore table in Figure 4. In the table go down the first column until you reach +1.4. The first column provides the zscore values to the nearest tenth (or one decimal place). To incorporate the digit that is in the second decimal place we must move through the body of the table until the heading of the column matches the digit in the hundredths place. In this example we move across the row for +1.4 until we reach the column headed with a 0.02. The corresponding number in the table is 0.92220 so for a zscore of 1.42 we can state that the area to the left of this score is 0.92220. Alternatively, 92.220% of all data values are less than this data value X. We are also able to conclude that 7.8% of the data values lie above this zscore value since 100% – 92.2% = 7.8%
Using only the zscore of 1.42 we were able to conclude that the data value X will lie between 1 and 2 standard deviations above the mean. By using the zscore table we are able to be more specific in stating that approximately 92% of the data values are less than X.
EXAMPLE 2
Consider a zscore of 0.18. a) Determine the area under the curve for a zscore of 0.18. b) Interpret what this zscore tells us.
Solution
a) This value is positive so we refer to the positive zscore table (a partial table is provided in Figure 4). In the table go down the first column until you reach +0.1. Then move across the row until you reach the column headed with a 0.08. This represents a zscore of 0.18. For a zscore of 0.18 the number in the table is 0.57142.
b) For a data value with a zscore of 0.18, approximately 57.1% of the data values will be below this and 42.9% will be above this data value.
TRY IT 2
Consider a zscore of 0.90. a) Determine the area under the curve for a zscore of 0.90. b) Interpret what this zscore tells us.
Show answer
a) area is 0.81594 b) Approximately 81.6% of data values are less than this value (or 18.4% are greater).
If the zscore is negative we use a negative ztable, a portion of which is illustrated in Figure 6. This table provides zscores that are less than 0 or lie to the left of the mean. The number in the ztable represents the area under the bell curve to the left of the zscore.
Consider an observation that has a zscore of 1.17. Refer to the negative zscore table in Figure 6. In the table go down the first column until you reach 1.1. Then move across the row until you reach the column headed with a 0.07. For a zscore of 1.17 the number in the table is 0.12100 or 12.1%. This represents the portion of the data values that lie to the left of the zscore (refer to Figure 7). This means that 12.1% of the data values are less than this data value. Alternatively 87.9% (100% – 12.1%) of the data values are greater than this data observation.
ZScores and the 689599.7 Rule
With a normal distribution half of the data values lie below the mean and half lie above the mean. If we calculate the zscore for a mean of 0 we will find that the zscore will also be 0. From the ztable we can determine that for a zscore of 0 the number is 0.5. This indicates that 50% of the data values lie below the mean and therefore 50% of the data values lie above the mean.
Refer to the normal distribution that is illustrated in Figure 8. For standard deviations of 1, 2 or 3 we can use the 689599.7 Rule to determine areas under the curve. We can also use zscore tables to do this.
EXAMPLE 3
a) A data value X is found to lie 1 standard deviation from the mean. Use the 689599.7 Rule to determine the percentage of data values that are lower than this data value.
b) For a data value X with a zscore of 1, determine the percentage of data values that are lower than X.
Solution
a) Refer to Figure 8. If 68% of the data values lie between 1 and 1 standard deviations, then 100% – 68% = 32% lie on either side of 1 or 1 standard deviations from the mean. Dividing 32% in half we get 16%. Therefore 16% of all data values are lower than the data value X. (Note: There are other approaches to this calculation).
b) Refer to the negative zscore table in Figure 6. For a zscore of 1 the area will be 0.15866 or 15.866%. Rounded to the nearest whole number we determine that 16% of the data values are lower than this data value.
Note: The zscore table will provide more accurate results than the 689599.7 Rule since the values in the 689599.7 Rule are rounded off approximations.
TRY IT 3
a) A data value X is found to lie 3 standard deviations from the mean. Use the 689599.7 Rule to determine the percentage of data values that are greater than this data value.
b) For a data value X with a zscore of 3, determine the percentage of data values that are greater than X.
Show answer
a) Refer to Figure 8. If 99.7% of the data values lie between 3 and 3 standard deviations, then 100% – 99.7% = 0.3% lie on either side of 3 or 3 standard deviations from the mean. Dividing 0.3% in half we get 0.15%. Therefore 0.15% of all data values are greater than the data value X.
b) Refer to a positive zscore table. For a zscore of 3 the area to the left will be 0.99865 or 99.865% so the area to the right (greater) will be 100% – 99.865% = 0.135% Rounded to the nearest 2 decimal places we determine that 0.14% of the data values are greater than this data value. Note that this slightly different than the answer obtained from using the 689599.7 Rule due to rounding.
Areas Between ZScores
We can use zscore tables to determine the area between two zscores. As an example, we can use the table to determine the area between the mean and a zscore of 1. A zscore of 1 lies one standard deviation to the right of the mean (as in Figure 9). Refer to the positive ztable in Figure 4. For a zzcore of 1 the value in the table is 0.84134. This indicates that approximately 84% of the data values lie to the left of this zscore.
From the zscore table we determine that for the zscore of 0 the area to the left is 0.5000. The area between the mean and one standard deviation would be as follows:
0.84134 – 0.500 = 0.3413 or 34.13%
Figure 10 illustrates the area between the mean and one standard deviation.
Note that this is consistent with the 689599.7 Rule. It states that approximately 68% of the data will lie within one standard deviation on either side of the mean. Half this amount, or 34%, will lie between the mean and a standard deviation of one.
The 689599.7 Rule is useful when data values lie exactly 1, 2 or 3 standard deviations from the mean. Zscore tables are useful for data values that have zscores that are not exactly 1, 2 or 3 standard deviations from the mean.
EXAMPLE 4
Given a normal distribution, use the zscore tables to find the area for each of the following zscores (rounded to the nearest tenth of a percent):
a) to the left of z = 1.72
b) to the left of z = 0.45
c) to the right of z = 0.45
d) between 0.45 and the mean
Solution
TRY IT 4
Given a normal distribution, find the area for each of the following zscores (rounded to the nearest tenth of a percent):
a) to the left of z = 0.85
b) to the right of z = 0.85
c) between the mean and z = 0.85
Show answer
a) 0.8023 = 80.2%
b) 0.1977 = 19.8%
c) 0.8023 – 0.5 = 0.3023 = 30.2%
EXAMPLE 5
Given a normal distribution, find the area for each of the following zscores (rounded to the nearest tenth of a percent):
a) between z = 0.65 and z = 0.65
b) between z = 0.65 and z = 2.8
c) between z = 1.44 and z = 2.8
Solution
TRY IT 5
Given a normal distribution, find the area for each of the following zscores (rounded to the nearest tenth of a percent):
a) between z = 0 and z = 0.73
b) between z = 0.73 and z = 1.95
c) between z = 2.12 and z = 0.73
Show answer
a) 0.2673 = 26.7%
b) 0.7417 = 74.17%
c) 0.2157 = 21.6%
Applications Using ZScores
With populations or samples that are normally distributed, zscores can be used to determine how data values compare (are positioned) with respect to other data values.
EXAMPLE 6
Frank has an IQ of 127, or a z score of 1.8. What percent of the population have IQ scores less than 127 and what percent have IQ scores higher than 127?
Solution
Refer back to Example 1 where the 689599.7 Rule was used to analyze Frank’s IQ score. From the Rule we were able to conclude that Frank’s IQ was better than at least 84% of the other scores and that his score did not rank in the top 2.5%. Comparing the two methods, the zscore provides us with a much more accurate analysis.
TRY IT 6
Consider a chemistry class and a set of test scores with an average of 76% and a standard deviation of 7%. A student receives a test score of 55% which yields a zscore of 3 (refer to TRY IT 1). a) Use a zscore table to determine what percent of the class had test scores less than 55% and what percent had test scores greater than 55%. b) How does this compare with the answer to TRY IT 1?
Show answer
a) From the table the area to the left of a zscore of 3 is 0.00135 therefore 0.135% of the class had test scores less than 55% and 99.865% of the class scored higher than 55%. b) Using the Rule in TRY IT1 the results were slightly different due to rounding: 0.15% of the students scored less than 55% and 99.85% scored higher than 55%.
For applications involving populations or samples that are normally distributed, we can calculate zscores if we know the mean and the standard deviation.
EXAMPLE 7
The waitinginline time at a certain grocery store is normally distributed with a mean
of 3.5 minutes and a standard deviation of 1.4 minutes.
a) What percent of the customers wait in line less than one minute?
b) What percent of the customers wait in line more than 5 minutes?
Solution
TRY IT 7
The waitinginline time to be seated at a popular retaurant during primetime hours is normally distributed with a mean of 24 minutes and a standard deviation of 11 minutes.
a) What percent of the customers wait in line less than twenty minutes?
b) What percent of the customers wait in line more than fortyfive minutes?
Show answer
a) 36% of customers wait less than 20 minutes b) 3% of customers wait more than 45 minutes
EXAMPLE 8
All first year psychology students wrote an exam that had 92 questions. Each question was worth 1 point for a total possible 92 points. The marks were normally distributed with a mean of score of 58 points and a standard deviation of 11 points.
a) Determine the zscore for a mark of 45 points. What percent of the scores were less than 45 points?
b) Determine the zscore for a mark of 84 points. What percent of the scores were greater than 84 points?
c) What percent of the scores were between 76 and 88 points?
Solution:
a) z = 1.1818 and area to the left is 0.11900 so 11.9% of the scores would be less than 45 points
b) z = 2.3636 and the area to the left is 0.99086 so the area to the right is 1 0.99534 = 0.00914. Therefore 0.9% (almost 1%) would score higher than 84 points
c) for 76 points z = 1.64 and the area to the left is 0.9495. For 88 points z = 2.73 and the area to the left is 0.99683. The difference is 0.99683 – 0.9495 = 0.04733. This means that 4.7% of the students would score between 76 and 88 points.
TRY IT 8
An entrance exam was given to a cohort of students. The mean score was 1000 points with a standard deviation of 150 points.
a) What percent of the students scored less than 820 points?
b) What percent of the students scored more than 1330 points?
c) What percent of the students scored between 950 and 1100 points?
Show answer
a) 11.5% scored less than 820 points b) 1.4% scored more than 1330 points
c) 37.8% scored between 950 and 1100 points
Key Concepts
 For a population we calculate the zscore for a data value x using the population mean μ and standard deviation σ.
 For a representative sample of the population we calculate the zscore for a data value x using the sample mean and standard deviation. :
 A zscore table allows us to determine, for a normal distribution, the percentage of data values that lie below (to the left) of a specific z–score. This in turn will enable us to determine the percentage of values that lie between or to the right of a given zscore.
 When a zscore is negative (lies to the left of the mean) we use a negative zscore table. When a zscore is positive (lies to the right of the mean) we use a positive zscore table.
Glossary
Zscore
is a standardized score that has been converted from a data value. The zscore indicates how many standard deviations away from the mean a data value lies.
8.4 Exercise Set
 Heights of adult males are normally distributed. The mean height of an adult male is 178 cm with a standard deviation of 10 cm.
 If Matt is 188 cm tall, find his zscore.
 Intrepret the meaning of this zscore.
 Using the 689599.7 Rule, how does Matt’s height compare to the rest of the population?
 Heights of adult males are normally distributed. The mean height of an adult male is 178 cm with a standard deviation of 10 cm.
 If Keegan is 158 cm tall, find his zscore.
 Interpret the meaning of this zscore.
 Using the 689599.7 Rule, how does Keegan’s height compare to the rest of the population?

 A data value X is found to lie 3 standard deviations from the mean. Use the 689599.7 Rule to determine the percentage of data values that are less than this data value.
 For a data value X with a zscore of 3, use a negative zscore table to determine the percentage of data values that are lower than X.

 A data value X is found to lie 2 standard deviations from the mean. Use the 689599.7 Rule to determine the percentage of data values that are lower than this data value.
 For a data value X with a zscore of 2, use a positive zscore table to determine the percentage of data values that are lower than X.
 Find the area under the normal curve for the followed z scores. Give the answers as both a decimal fraction (as a stated in the ztable) and as a percentage (rounded to the nearest tenth).
 less than z = 0.86 _____________________
 greater than z = 0.86 _____________________
 less than z = 1.34 _____________________
 greater than z = 1.34 _____________________
 greater than z = 2.88 _____________________
 Find the area under the normal curve for the following z scores. Give the answers as both a decimal fraction (as stated in the ztable) and as a percentage (rounded to the nearest tenth).
 between z = 0 and z = 0.47 _____________________
 between z = 0.3 and z = 0 _____________________
 between z = 0.3 and z = 0.47 _____________________
 between z = 2.24 and z = 0.55 _____________________
 between z = 1.46 and z = 2.37 _____________________
 between z = 1.5 and z = 1.5 _____________________
 The average resting heartrate for a normally distributed population of men was found to be 62 beats per minute with a standard deviation of 11 beats per minutes.
 What percent of men have resting heartrates under 70 beats per minute?
 What percent of men have resting heartrates over 70 beats per minute?
 What percent of men have resting heartrates between 40 and 80 beats per minute?
 In a group of normally distributed women, the average height is 5 feet 4 inches (64 inches) with a standard deviation of 2.8 inches. (1 foot = 12 inches)
 What percent of women are shorter than 5 feet?
 What percentage of woman are taller than 6 feet ?
 What percent of the women are between 5 feet and 6 feet ?
 A survey of college students enrolled in technology programs indicated that they spend an average of 29 hours a week outside of class time studying for their courses. The data was normally distributed with a standard deviation of 9 hours per week.
 What percent of the students spend more than 40 hours per week studying?
 What percent spend fewer than 10 hours per week studying?
 What percent spend between 20 and 50 hours per week studying?
 The number of toy cars assembled each day by a worker is normally distributed with a mean of 270 cars and a standard deviation of 16 cars.
 What percentage of workers assemble less than 240 cars per day?
 What percentage of workers assemble more than 265 cars per day?
 Workers are given a bonus every time they assemble more than 310 toy cars in one eight hour day. What percent of the workers receive a bonus each day?
 A radar unit measures the speed of passing cars on a toll highway where the speed limit is 120/km/hour. The speed of the cars is normally distributed with a mean speed of 114 km/h and a standard deviation of 9.8 km/h.
 What percent of the cars are travelling at less than 100 km/h?
 What percent of the vehicles are exceeding the speed limit?
 What percent of the vehicles are travelling between 115 km/h and 125 km/h?
 The lengths of cell phone calls in a particular city are normally distributed with a mean time of 8.2 minutes and a standard deviation of 2.6 minutes.
 What percent of phone calls are less than 10 minutes?
 What percent of phone calls are greater than 5 minutes?
 What percent of phone calls are between 7 and 12 minutes?
Answers

 zscore = 1
 This height is one standard deviation greater than the mean
 This height is greater than 84% of the population

 zscore = 2
 This height is two standard deviations less than the mean
 Keegan’s height is greater than approximately 2.5% of the population

 From the Rule approximately 1.5%
 From the table 1.4%

 From the Rule approximately 97.5%
 From the table 97.7%

 0.19489 = 19.5%
 0.80511 = 80.5%
 0.90988 = 91.0%
 0.09012 = 9.0%
 0.99801 = 99.8%

 0.18082 = 18.1%
 0.11791 = 11.8%
 0.29873 = 27.9%
 0.06326 = 6.3%
 0.86638 = 86.6%

 76.7%
 23.3%
 92.7%

 7.6%
 0.2%
 92.2%

 11.1%
 1.7%
 83.1%

 3%
 62.2%
 0.6%

 7.7%
 27.1%
 32.9%

 z = 0.6923 and area to the left is 0.75490 so 75.5% of the calls would be less than 10 minutes.
 z = 1.23 and area to the left is 0.10935 so area to the right is 1 0.10935 = 0.89065 Therefore 89.1% of the calls would be more than 10 minutes.
 7 minutes has z = 0.4615 and the area to the left is 0.32276 and for 12 minutes z = 1.4615 and the area to the left is 0.92785. The difference is 0.92785 – 0.32276 = 0.60509. This means that 60.5% of the calls would be between 7 and 12 minutes.
Attribution
 Figure 6 is from https://www.ztable.net/.
 Some of the content for this chapter is from “Unit 9: Mortgages”, “Unit 10: Interest rates on loans”, and “Review Questions” in Financial Mathematics by Paul Grinder, Velma McKay, Kim Moshenko, and Ada Sarsiat, which is under a CC BY 4.0 Licence.. Adapted by Kim Moshenko. See the Copyright page for more information.