{"id":973,"date":"2019-08-13T19:06:20","date_gmt":"2019-08-13T19:06:20","guid":{"rendered":"https:\/\/opentextbc.ca\/researchmethods\/chapter\/describing-single-variables\/"},"modified":"2019-11-05T17:29:18","modified_gmt":"2019-11-05T17:29:18","slug":"describing-single-variables","status":"publish","type":"chapter","link":"https:\/\/opentextbc.ca\/researchmethods\/chapter\/describing-single-variables\/","title":{"raw":"Describing Single Variables","rendered":"Describing Single Variables"},"content":{"raw":"[latexpage]\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Learning Objectives<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ol>\r\n \t<li>Use frequency tables and histograms to display and interpret the distribution of a variable.<\/li>\r\n \t<li>Compute and interpret the mean, median, and mode of a distribution and identify situations in which the mean, median, or mode is the most appropriate measure of central tendency.<\/li>\r\n \t<li>Compute and interpret the range and standard deviation of a distribution.<\/li>\r\n \t<li>Compute and interpret percentile ranks and\u00a0<i>z<\/i>\u00a0scores.<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<b>[pb_glossary id=\"1263\"]Descriptive\u00a0statistics[\/pb_glossary]<\/b>\u00a0refers to a set of techniques for summarizing and displaying data. Let us assume here that the data are quantitative and consist of scores on one or more variables for each of several study participants. Although in most cases the primary research question will be about one or more statistical relationships between variables, it is also important to describe each variable individually. 
For this reason, we begin by looking at some of the most common techniques for describing single variables.\r\n<h1><b><\/b>The Distribution of a Variable<\/h1>\r\nEvery variable has a\u00a0<b>[pb_glossary id=\"1280\"]distribution[\/pb_glossary]<\/b>, which is the way the scores are distributed across the levels of that variable. For example, in a sample of 100 university students, the distribution of the variable \u201cnumber of siblings\u201d might be such that 10 of them have no siblings, 30 have one sibling, 40 have two siblings, and so on. In the same sample, the distribution of the variable \u201csex\u201d might be such that 44 have a score of \u201cmale\u201d and 56 have a score of \u201cfemale.\u201d\r\n<h1><b><\/b>Frequency Tables<\/h1>\r\nOne way to display the distribution of a variable is in a\u00a0<b>[pb_glossary id=\"1080\"]frequency\u00a0table[\/pb_glossary]<\/b>.\u00a0Table 12.1, for example, is a frequency table showing a hypothetical distribution of scores on the Rosenberg Self-Esteem Scale for a sample of 40 college students. The first column lists the values of the variable\u2014the possible scores on the Rosenberg scale\u2014and the second column lists the frequency of each score. This table shows that there were three students who had self-esteem scores of 24, five who had self-esteem scores of 23, and so on. 
From a frequency table like this, one can quickly see several important aspects of a distribution, including the range of scores (from 15 to 24), the most and least common scores (22 and 17, respectively), and any extreme scores that stand out from the rest.\r\n<table><caption>Table 12.1\u00a0Frequency Table Showing a Hypothetical Distribution of Scores on the Rosenberg Self-Esteem Scale<\/caption>\r\n<tbody>\r\n<tr>\r\n<th scope=\"col\"><b><\/b><b>Self-esteem<\/b><\/th>\r\n<th scope=\"col\"><b><\/b><b>Frequency<\/b><\/th>\r\n<\/tr>\r\n<tr>\r\n<td>24<\/td>\r\n<td>3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>23<\/td>\r\n<td>5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>22<\/td>\r\n<td>10<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>21<\/td>\r\n<td>8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>20<\/td>\r\n<td>5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>19<\/td>\r\n<td>3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>18<\/td>\r\n<td>3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>17<\/td>\r\n<td>0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>16<\/td>\r\n<td>2<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>15<\/td>\r\n<td>1<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThere are a few other points worth noting about frequency tables. First, the levels listed in the first column usually go from the highest at the top to the lowest at the bottom, and they usually do not extend beyond the highest and lowest scores in the data. For example, although scores on the Rosenberg scale can vary from a high of 30 to a low of 0,\u00a0Table 12.1 only includes levels from 24 to 15 because that range includes all the scores in this particular data set. Second, when there are many different scores across a wide range of values, it is often better to create a grouped frequency table, in which the first column lists ranges of values and the second column lists the frequency of scores in each range.\u00a0Table 12.2, for example, is a grouped frequency table showing a hypothetical distribution of simple reaction times for a sample of 20 participants. 
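Building a grouped frequency table in software amounts to assigning each score to an equal-width bin and tallying the bins. The following Python sketch illustrates the idea; the 20 raw reaction times are invented values chosen only so that the resulting bin counts match Table 12.2 (plain hyphens are used in the bin labels):

```python
from collections import Counter

# Twenty hypothetical raw reaction times (ms); invented values whose
# bin counts match the grouped frequency table (Table 12.2)
times = [255, 230, 238, 205, 215, 183, 185, 188, 190, 192,
         194, 196, 198, 200, 165, 170, 172, 178, 150, 158]

def bin_label(t, low=141, width=20):
    """Return the label of the equal-width bin containing t, e.g. '181-200'."""
    start = low + width * ((t - low) // width)
    return f"{start}-{start + width - 1}"

grouped = Counter(bin_label(t) for t in times)

# Print from the highest bin down, as in Table 12.2
for label in sorted(grouped, key=lambda s: int(s.split("-")[0]), reverse=True):
    print(label, grouped[label])
```

The bin width (20 ms) and starting value (141 ms) are choices, not requirements; the only constraint is that the bins be of equal width.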
In a grouped frequency table, the ranges must all be of equal width, and there are usually between five and 15 of them. Finally, frequency tables can also be used for categorical variables, in which case the levels are category labels. The order of the category labels is somewhat arbitrary, but they are often listed from the most frequent at the top to the least frequent at the bottom.\r\n<table><caption>Table 12.2\u00a0A Grouped Frequency Table Showing a Hypothetical Distribution of Reaction Times<\/caption>\r\n<tbody>\r\n<tr>\r\n<th scope=\"col\"><b><\/b><b>Reaction time (ms)<\/b><\/th>\r\n<th scope=\"col\"><b><\/b><b>Frequency<\/b><\/th>\r\n<\/tr>\r\n<tr>\r\n<td>241\u2013260<\/td>\r\n<td>1<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>221\u2013240<\/td>\r\n<td>2<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>201\u2013220<\/td>\r\n<td>2<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>181\u2013200<\/td>\r\n<td>9<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>161\u2013180<\/td>\r\n<td>4<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>141\u2013160<\/td>\r\n<td>2<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h1><b><\/b>Histograms<\/h1>\r\nA\u00a0<b>[pb_glossary id=\"1072\"]histogram[\/pb_glossary]<\/b>\u00a0is a graphical display of a distribution. It presents the same information as a frequency table but in a way that is even quicker and easier to grasp. The histogram in\u00a0Figure 12.1 presents the distribution of self-esteem scores in Table 12.1. The\u00a0<i>x-<\/i>axis of the histogram represents the variable and the\u00a0<i>y-<\/i>axis represents frequency. Above each level of the variable on the\u00a0<i>x-<\/i>axis is a vertical bar that represents the number of individuals with that score. When the variable is quantitative, as in this example, there is usually no gap between the bars. When the variable is categorical, however, there is usually a small gap between them. 
(The gap at 17 in this histogram reflects the fact that there were no scores of 17 in this data set.)\r\n\r\n[caption id=\"attachment_148\" align=\"aligncenter\" width=\"899\"]<a href=\"http:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2015\/09\/12.1.png\"><img src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2019\/08\/12.1.png\" alt=\"Histogram. There are no spaces between the bars, and there is no bar for the score 17.\" class=\"wp-image-148 size-full\" width=\"899\" height=\"533\" \/><\/a> Figure 12.1 Histogram Showing the Distribution of Self-Esteem Scores Presented in Table 12.1[\/caption]\r\n<h1>Distribution Shapes<\/h1>\r\nWhen the distribution of a quantitative variable is displayed in a histogram, it has a shape. The shape of the distribution of self-esteem scores in\u00a0Figure 12.1 is typical. There is a peak somewhere near the middle of the distribution and \u201ctails\u201d that taper in either direction from the peak. The distribution of\u00a0Figure 12.1 is unimodal, meaning it has one distinct peak, but distributions can also be bimodal, meaning they have two distinct peaks.\u00a0Figure 12.2, for example, shows a hypothetical bimodal distribution of scores on the Beck Depression Inventory. Distributions can also have more than two distinct peaks, but these are relatively rare in psychological research.\r\n\r\n[caption id=\"attachment_149\" align=\"aligncenter\" width=\"825\"]<a href=\"http:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2015\/09\/12.2.png\"><img src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2019\/08\/12.2.png\" alt=\"Histogram. 
Long description available.\" class=\"wp-image-149 size-full\" width=\"825\" height=\"475\" \/><\/a> Figure 12.2 Histogram Showing a Hypothetical Bimodal Distribution of Scores on the Beck Depression Inventory <a href=\"#fig12.2\">[Long Description]<\/a>[\/caption]Another characteristic of the shape of a distribution is whether it is symmetrical or skewed. The distribution in the centre of\u00a0Figure 12.3 is <b>[pb_glossary id=\"1122\"]symmetrical[\/pb_glossary]<\/b>. Its left and right halves are mirror images of each other. The distribution on the left is negatively\u00a0<b>[pb_glossary id=\"1140\"]skewed[\/pb_glossary]<\/b>, with its peak shifted toward the upper end of its range and a relatively long negative tail. The distribution on the right is positively skewed, with its peak toward the lower end of its range and a relatively long positive tail.\r\n\r\n[caption id=\"attachment_150\" align=\"aligncenter\" width=\"900\"]<a href=\"http:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2015\/09\/12.3.png\"><img src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2019\/08\/12.3.png\" alt=\"&quot;&quot;\" class=\"wp-image-150 size-full\" width=\"900\" height=\"94\" \/><\/a> Figure 12.3 Histograms Showing Negatively Skewed, Symmetrical, and Positively Skewed Distributions[\/caption]\r\n\r\nAn\u00a0<b>[pb_glossary id=\"1054\"]outlier[\/pb_glossary]<\/b>\u00a0is an extreme score that is much higher or lower than the rest of the scores in the distribution. Sometimes outliers represent truly extreme scores on the variable of interest. For example, on the Beck Depression Inventory, a single clinically depressed person might be an outlier in a sample of otherwise happy and high-functioning peers. However, outliers can also represent errors or misunderstandings on the part of the researcher or participant, equipment malfunctions, or similar problems. 
We will say more about how to interpret outliers and what to do about them later in this chapter.\r\n<h1><b><\/b>Measures of Central Tendency and Variability<\/h1>\r\nIt is also useful to be able to describe the characteristics of a distribution more precisely. Here we look at how to do this in terms of two important characteristics: its central tendency and its variability.\r\n<h2><b><\/b>Central Tendency<\/h2>\r\nThe\u00a0<b>[pb_glossary id=\"1309\"]central\u00a0tendency[\/pb_glossary]<\/b>\u00a0of a distribution is its middle\u2014the point around which the scores in the distribution tend to cluster. (Another term for central tendency is\u00a0<i>average<\/i>.) Looking back at\u00a0Figure 12.1, for example, we can see that the self-esteem scores tend to cluster around the values of 20 to 22. Here we will consider the three most common measures of central tendency: the mean, the median, and the mode.\r\n\r\nThe\u00a0<b>[pb_glossary id=\"1097\"]mean[\/pb_glossary]<\/b>\u00a0of a distribution (symbolized\u00a0<i>M<\/i>) is the sum of the scores divided by the number of scores. As a formula, it looks like this:\r\n<p style=\"text-align: left;\">\\[M=\\Sigma X\\div N\\]<\/p>\r\nIn this formula, the symbol \u03a3 (the Greek letter sigma) is the summation sign and means to sum across the values of the variable\u00a0<i>X<\/i>.\u00a0<i>N<\/i>\u00a0represents the number of scores. The mean is by far the most common measure of central tendency, and there are some good reasons for this. It usually provides a good indication of the central tendency of a distribution, and it is easily understood by most people. In addition, the mean has statistical properties that make it especially useful in doing inferential statistics.\r\n\r\nAn alternative to the mean is the <b>[pb_glossary id=\"1092\"]median[\/pb_glossary]<\/b>. The\u00a0median\u00a0is the middle score in the sense that half the scores in the distribution are less than it and half are greater than it. 
The simplest way to find the median is to organize the scores from lowest to highest and locate the score in the middle. Consider, for example, the following set of seven scores:\r\n<p style=\"text-align: center;\">8 4 12 14 3 2 3<\/p>\r\nTo find the median, simply rearrange the scores from lowest to highest and locate the one in the middle.\r\n<p style=\"text-align: center;\">2 3 3\u00a0<b>4<\/b>\u00a08 12 14<\/p>\r\nIn this case, the median is 4 because there are three scores lower than 4 and three scores higher than 4. When there is an even number of scores, there are two scores in the middle of the distribution, in which case the median is the value halfway between them. For example, if we were to add a score of 15 to the preceding data set, there would be two scores (both 4 and 8) in the middle of the distribution, and the median would be halfway between them (6).\r\n\r\nOne final measure of central tendency is the mode. The\u00a0<b>[pb_glossary id=\"1033\"]mode[\/pb_glossary]<\/b>\u00a0is the most frequent score in a distribution. In the self-esteem distribution presented in Table 12.1 and\u00a0Figure 12.1,\u00a0for example, the mode is 22. More students had that score than any other. The mode is the only measure of central tendency that can also be used for categorical variables.\r\n\r\nIn a distribution that is both unimodal and symmetrical, the mean, median, and mode will be very close to each other at the peak of the distribution. In a bimodal or asymmetrical distribution, the mean, median, and mode can be quite different. In a bimodal distribution, the mean and median will tend to be between the peaks, while the mode will be at the tallest peak. In a skewed distribution, the mean will differ from the median in the direction of the skew (i.e., the direction of the longer tail). For highly skewed distributions, the mean can be pulled so far in the direction of the skew that it is no longer a good measure of the central tendency of that distribution. 
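To make the effect of an extreme score concrete, here is a brief sketch using Python's statistics module and the seven scores from the median example above; the added score of 100 is an invented outlier:

```python
import statistics

scores = [8, 4, 12, 14, 3, 2, 3]  # the seven scores from the median example

print(statistics.mean(scores))    # 46 / 7, about 6.57
print(statistics.median(scores))  # middle of the sorted scores: 4
print(statistics.mode(scores))    # most frequent score: 3

# One extreme score pulls the mean far toward the tail but barely moves the median
skewed = scores + [100]           # 100 is an invented outlier
print(statistics.mean(skewed))    # jumps to 146 / 8 = 18.25
print(statistics.median(skewed))  # only moves to (4 + 8) / 2 = 6.0
```

This is the positive-skew situation described above: the mean is dragged in the direction of the long tail while the median stays near the bulk of the scores.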
Imagine, for example, a set of four simple reaction times of 200, 250, 280, and 250 milliseconds (ms). The mean is 245 ms. But the addition of one more score of 5,000 ms\u2014perhaps because the participant was not paying attention\u2014would raise the mean to 1,196 ms. Not only is this measure of central tendency greater than 80% of the scores in the distribution, but it also does not seem to represent the behaviour of anyone in the distribution very well. This is why researchers often prefer the median for highly skewed distributions (such as distributions of reaction times).\r\n\r\nKeep in mind, though, that you are not required to choose a single measure of central tendency in analyzing your data. Each one provides slightly different information, and all of them can be useful.\r\n<h2><b><\/b>Measures of Variability<\/h2>\r\nThe\u00a0<b>[pb_glossary id=\"1153\"]variability[\/pb_glossary]<\/b>\u00a0of a distribution is the extent to which the scores vary around their central tendency. Consider the two distributions in\u00a0Figure 12.4, both of which have the same central tendency. The mean, median, and mode of each distribution are 10. Notice, however, that the two distributions differ in terms of their variability. The top one has relatively low variability, with all the scores relatively close to the centre. The bottom one has relatively high variability, with the scores spread across a much greater range.\r\n<i><\/i>\r\n\r\n[caption id=\"attachment_151\" align=\"aligncenter\" width=\"750\"]<a href=\"http:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2015\/09\/12.4.png\"><img src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2019\/08\/12.4.png\" alt=\"Two histograms with the same central tendency but different variability. 
Long description available.\" class=\"wp-image-151 size-full\" width=\"750\" height=\"790\" \/><\/a> Figure 12.4 Histograms Showing Hypothetical Distributions With the Same Mean, Median, and Mode (10) but With Low Variability (Top) and High Variability (Bottom) <a href=\"#fig12.4\">[Long Description]<\/a>[\/caption]One simple measure of variability is the\u00a0<b>[pb_glossary id=\"1176\"]range[\/pb_glossary]<\/b>, which is simply the difference between the highest and lowest scores in the distribution. The range of the self-esteem scores in\u00a0Table 12.1, for example, is the difference between the highest score (24) and the lowest score (15). That is, the range is 24 \u2212 15 = 9. Although the range is easy to compute and understand, it can be misleading when there are outliers. Imagine, for example, an exam on which all the students scored between 90 and 100. It has a range of 10. But if there was a single student who scored 20, the range would increase to 80\u2014giving the impression that the scores were quite variable when in fact only one student differed substantially from the rest.\r\n\r\nBy far the most common measure of variability is the standard deviation. The <b>[pb_glossary id=\"1133\"]standard\u00a0deviation[\/pb_glossary]<\/b>\u00a0of a distribution is, roughly speaking, the average distance between the scores and the mean. For example, the standard deviations of the distributions in\u00a0Figure 12.4 are 1.69 for the top distribution and 4.30 for the bottom one. That is, while the scores in the top distribution differ from the mean by about 1.69 units on average, the scores in the bottom distribution differ from the mean by about 4.30 units on average.\r\n\r\nComputing the standard deviation involves a slight complication. Specifically, it involves finding the difference between each score and the mean, squaring each difference, finding the mean of these squared differences, and finally finding the square root of that mean. 
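These steps can be sketched directly in Python; the eight scores here are just a small illustrative data set:

```python
import math

scores = [3, 5, 4, 2, 7, 6, 5, 8]               # a small illustrative data set

m = sum(scores) / len(scores)                   # step 1: the mean (here, 5.0)
squared_diffs = [(x - m) ** 2 for x in scores]  # steps 2-3: each difference from the mean, squared
variance = sum(squared_diffs) / len(scores)     # step 4: mean of the squared differences
sd = math.sqrt(variance)                        # step 5: square root of that mean

print(variance)  # 28 / 8 = 3.5
print(sd)        # about 1.87
```

Python's statistics.pstdev() computes the same value in one call; statistics.stdev() divides by N \u2212 1 instead, a distinction taken up in the box on this page.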
The formula looks like this:\r\n\r\n\\[SD=\\sqrt{\\dfrac{\\Sigma (X-M)^2}{N}}\\]\r\n\r\nThe computations for the standard deviation are illustrated for a small set of data in\u00a0Table 12.3. The first column is a set of eight scores that has a mean of 5. The second column is the difference between each score and the mean. The third column is the square of each of these differences. Notice that although the differences can be negative, the squared differences are always positive\u2014meaning that the standard deviation is always positive. At the bottom of the third column is the mean of the squared differences, which is also called the\u00a0<b>[pb_glossary id=\"1151\"]variance[\/pb_glossary]<\/b>\u00a0(symbolized\u00a0<i>SD<\/i><sup>2<\/sup>). Although the variance is itself a measure of variability, it generally plays a larger role in inferential statistics than in descriptive statistics. Finally, below the variance is the square root of the variance, which is the standard deviation.\r\n<table><caption>Table 12.3\u00a0Computations for the Standard Deviation<\/caption>\r\n<tbody>\r\n<tr>\r\n<th scope=\"col\">\\(X\\)<\/th>\r\n<th scope=\"col\">\\(X-M (M\u00a0= 5)\\)<\/th>\r\n<th scope=\"col\">\\((X - M)^2\\)<\/th>\r\n<\/tr>\r\n<tr>\r\n<td>3<\/td>\r\n<td>\u22122<\/td>\r\n<td>4<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>0<\/td>\r\n<td>0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>4<\/td>\r\n<td>\u22121<\/td>\r\n<td>1<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>2<\/td>\r\n<td>\u22123<\/td>\r\n<td>9<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>7<\/td>\r\n<td>2<\/td>\r\n<td>4<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>6<\/td>\r\n<td>1<\/td>\r\n<td>1<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>0<\/td>\r\n<td>0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>8<\/td>\r\n<td>3<\/td>\r\n<td>9<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n\\(\\begin{array}{rrl}\r\nSD^2&amp;=&amp;28\\div 8 \\\\ \\\\\r\nSD^2&amp;=&amp;3.50 \\\\ \\\\\r\nSD&amp;=&amp;\\sqrt{3.50} \\\\ \\\\\r\nSD&amp;=&amp;1.87\r\n\\end{array}\\)\r\n<div class=\"textbox 
shaded\">\r\n\r\n<em><strong>N\u00a0or\u00a0N \u2212 1<\/strong><\/em>\r\n\r\nIf you have already taken a statistics course, you may have learned to divide the sum of the squared differences by\u00a0<i>N<\/i> \u2212 1 rather than by\u00a0<i>N<\/i>\u00a0when you compute the variance and standard deviation. Why is this?\r\n\r\nBy definition, the standard deviation is the square root of the mean of the squared differences. This implies dividing the sum of squared differences by <i>N<\/i>, as in the formula just presented. Computing the standard deviation this way is appropriate when your goal is simply to describe the variability in a sample. And learning it this way emphasizes that the variance is in fact the <i>mean<\/i>\u00a0of the squared differences\u2014and the standard deviation is the square root of this\u00a0<i>mean<\/i>.\r\n\r\nHowever, most calculators and software packages divide the sum of squared differences by\u00a0<i>N<\/i>\u00a0\u2212 1. This is because the standard deviation of a sample tends to be a bit lower than the standard deviation of the population the sample was selected from. Dividing the sum of squares by\u00a0<i>N<\/i>\u00a0\u2212 1 corrects for this tendency and results in a better estimate of the population standard deviation. Because researchers generally think of their data as representing a sample selected from a larger population\u2014and because they are generally interested in drawing conclusions about the population\u2014it makes sense to routinely apply this correction.\r\n\r\n<\/div>\r\n<h1><b><\/b>Percentile Ranks and\u00a0z\u00a0Scores<\/h1>\r\nIn many situations, it is useful to have a way to describe the location of an individual score within its distribution. One approach is the percentile rank. The <b>[pb_glossary id=\"1045\"]percentile\u00a0rank[\/pb_glossary]<\/b>\u00a0of a score is the percentage of scores in the distribution that are lower than that score. Consider, for example, the distribution in\u00a0Table 12.1. 
For any score in the distribution, we can find its percentile rank by counting the number of scores in the distribution that are lower than that score and converting that number to a percentage of the total number of scores. Notice, for example, that five of the students represented by the data in\u00a0Table 12.1 had self-esteem scores of 23. In this distribution, 32 of the 40 scores (80%) are lower than 23. Thus each of these students has a percentile rank of 80. (It can also be said that they scored \u201cat the 80th percentile.\u201d) Percentile ranks are often used to report the results of standardized tests of ability or achievement. If your percentile rank on a test of verbal ability were 40, for example, this would mean that you scored higher than 40% of the people who took the test.\r\n\r\nAnother approach is the\u00a0<i>z<\/i>\u00a0score. The\u00a0<b>[pb_glossary id=\"1147\"]z score[\/pb_glossary]\u00a0<\/b>for a particular individual is the difference between that individual\u2019s score and the mean of the distribution, divided by the standard deviation of the distribution:\r\n<p style=\"text-align: left;\">\\[z=(X-M)\\div SD\\]<\/p>\r\nA\u00a0<i>z<\/i>\u00a0score indicates how far above or below the mean a raw score is, but it expresses this in terms of the standard deviation. For example, in a distribution of intelligence quotient (IQ) scores with a mean of 100 and a standard deviation of 15, an IQ score of 110 would have a\u00a0<i>z<\/i>\u00a0score of (110 \u2212 100) \u00f7 15 = +0.67. In other words, a score of 110 is 0.67 standard deviations (approximately two thirds of a standard deviation) above the mean. Similarly, a raw score of 85 would have a\u00a0<i>z<\/i>\u00a0score of (85 \u2212 100) \u00f7 15 = \u22121.00. In other words, a score of 85 is one standard deviation below the mean.\r\n\r\nThere are several reasons that\u00a0<i>z<\/i>\u00a0scores are important. 
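Both location measures are straightforward to compute. Here is a minimal Python sketch, using the Table 12.1 scores for the percentile rank and the IQ example for the z score; percentile_rank and z_score are illustrative helper names, not standard library functions:

```python
def percentile_rank(score, scores):
    """Percentage of scores in the distribution that are lower than the given score."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

def z_score(x, m, sd):
    """How many standard deviations x lies above (+) or below (-) the mean m."""
    return (x - m) / sd

# The 40 self-esteem scores implied by the frequencies in Table 12.1
self_esteem = ([24] * 3 + [23] * 5 + [22] * 10 + [21] * 8 + [20] * 5
               + [19] * 3 + [18] * 3 + [16] * 2 + [15] * 1)

print(percentile_rank(23, self_esteem))  # 32 of the 40 scores are below 23: 80.0

# IQ example: mean 100, standard deviation 15
print(z_score(110, 100, 15))             # (110 - 100) / 15, about +0.67
print(z_score(85, 100, 15))              # (85 - 100) / 15 = -1.0
```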
Again, they provide a way of describing where an individual\u2019s score is located within a distribution and are sometimes used to report the results of standardized tests. They also provide one way of defining outliers. For example, outliers are sometimes defined as scores that have\u00a0<i>z<\/i>\u00a0scores less than \u22123.00 or greater than +3.00. In other words, they are defined as scores that are more than three standard deviations from the mean. Finally,\u00a0<i>z<\/i>\u00a0scores play an important role in understanding and computing other statistics, as we will see shortly.\r\n<div class=\"textbox shaded\">\r\n\r\n<b>Online Descriptive Statistics<\/b>\r\n\r\nAlthough many researchers use commercially available software such as SPSS and Excel to analyze their data, there are several free online analysis tools that can also be extremely useful. Many allow you to enter or upload your data and then make one click to conduct several descriptive statistical analyses. Among them are the following.\r\n\r\n<a href=\"http:\/\/onlinestatbook.com\/stat_analysis\/index.html\">Rice Virtual Lab in Statistics<\/a>\r\n\r\n<a href=\"http:\/\/vassarstats.net\/\">VassarStats<\/a>\r\n\r\n<a href=\"http:\/\/www.brightstat.com\/\">Bright Stat<\/a>\r\n\r\nFor a more complete list, see\u00a0<a href=\"https:\/\/statpages.info\/\" target=\"_blank\" rel=\"noopener\">Interactive Statistical Calculator Pages<\/a>.\r\n\r\n<\/div>\r\n<div class=\"textbox textbox--key-takeaways\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Key Takeaways<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li>Every variable has a distribution\u2014a way that the scores are distributed across the levels. The distribution can be described using a frequency table and histogram. 
It can also be described in words in terms of its shape, including whether it is unimodal or bimodal, and whether it is symmetrical or skewed.<\/li>\r\n \t<li>The central tendency, or middle, of a distribution can be described precisely using three statistics\u2014the mean, median, and mode. The mean is the sum of the scores divided by the number of scores, the median is the middle score, and the mode is the most common score.<\/li>\r\n \t<li>The variability, or spread, of a distribution can be described precisely using the range and standard deviation. The range is the difference between the highest and lowest scores, and the standard deviation is roughly the average amount by which the scores differ from the mean.<\/li>\r\n \t<li>The location of a score within its distribution can be described using percentile ranks or\u00a0<i>z<\/i>\u00a0scores. The percentile rank of a score is the percentage of scores below that score, and the\u00a0<i>z<\/i>\u00a0score is the difference between the score and the mean divided by the standard deviation.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Exercises<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ol>\r\n \t<li>Practice: Make a frequency table and histogram for the following data. 
Then write a short description of the shape of the distribution in words.\r\n<ul>\r\n \t<li>11, 8, 9, 12, 9, 10, 12, 13, 11, 13, 12, 6, 10, 17, 13, 11, 12, 12, 14, 14<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li>Practice: For the data in Exercise 1, compute the mean, median, mode, standard deviation, and range.<\/li>\r\n \t<li>Practice: Using the data in Exercises 1 and 2, find\r\n<ol type=\"a\">\r\n \t<li>the percentile ranks for scores of 9 and 14<\/li>\r\n \t<li>the\u00a0<i>z<\/i>\u00a0scores for scores of 8 and 12.<\/li>\r\n<\/ol>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<h1>Long Descriptions<\/h1>\r\n<strong id=\"fig12.2\">Figure 12.2 long description:<\/strong> A histogram showing a bimodal distribution of scores on the Beck Depression Inventory. The horizontal axis is labelled \"Beck Depression Inventory Score,\" and the vertical axis is labelled \"Frequency.\" The data is as such:\r\n<ul>\r\n \t<li>BDI: 0\u20139, Frequency: 3<\/li>\r\n \t<li>BDI: 10\u201319, Frequency: 14<\/li>\r\n \t<li>BDI: 20\u201329, Frequency: 6<\/li>\r\n \t<li>BDI: 30\u201339, Frequency: 2<\/li>\r\n \t<li>BDI: 40\u201349, Frequency: 3<\/li>\r\n \t<li>BDI: 50\u201359, Frequency: 12<\/li>\r\n \t<li>BDI: 60\u201369, Frequency: 4<\/li>\r\n<\/ul>\r\nThe two distinct peaks are the 10\u201319 range and the 50\u201359 range. <a href=\"#attachment_149\">[Return to Figure 12.2]<\/a>\r\n\r\n<strong id=\"fig12.4\">Figure 12.4 long description:<\/strong> Two histograms with the same central tendency but different variability. Each horizontal axis is labelled \"X\" and has values from 1 to 20, and each vertical axis is labelled \"Frequency\" and has values from 1 to 20. Each histogram also has a mean, median, mode, and central tendency of 10.\r\n\r\nIn the first histogram, variability is relatively low. 
The data is as such:\r\n<ul>\r\n \t<li>X: 6, Frequency: 1<\/li>\r\n \t<li>X: 7, Frequency: 5<\/li>\r\n \t<li>X: 8, Frequency: 10<\/li>\r\n \t<li>X: 9, Frequency: 16<\/li>\r\n \t<li>X: 10, Frequency: 18<\/li>\r\n \t<li>X: 11, Frequency: 16<\/li>\r\n \t<li>X: 12, Frequency: 10<\/li>\r\n \t<li>X: 13, Frequency: 5<\/li>\r\n \t<li>X: 14, Frequency: 1<\/li>\r\n<\/ul>\r\nIn the second histogram, variability is relatively high. The data is as such:\r\n<ul>\r\n \t<li>X: 0, Frequency: 1<\/li>\r\n \t<li>X: 1, Frequency: 1<\/li>\r\n \t<li>X: 2, Frequency: 2<\/li>\r\n \t<li>X: 3, Frequency: 2<\/li>\r\n \t<li>X: 4, Frequency: 3<\/li>\r\n \t<li>X: 5, Frequency: 3<\/li>\r\n \t<li>X: 6, Frequency: 5<\/li>\r\n \t<li>X: 7, Frequency: 6<\/li>\r\n \t<li>X: 8, Frequency: 7<\/li>\r\n \t<li>X: 9, Frequency: 7<\/li>\r\n \t<li>X: 10, Frequency: 8<\/li>\r\n \t<li>X: 11, Frequency: 7<\/li>\r\n \t<li>X: 12, Frequency: 7<\/li>\r\n \t<li>X: 13, Frequency: 6<\/li>\r\n \t<li>X: 14, Frequency: 5<\/li>\r\n \t<li>X: 15, Frequency: 3<\/li>\r\n \t<li>X: 16, Frequency: 3<\/li>\r\n \t<li>X: 17, Frequency: 2<\/li>\r\n \t<li>X: 18, Frequency: 2<\/li>\r\n \t<li>X: 19, Frequency: 1<\/li>\r\n \t<li>X: 20, Frequency: 1<\/li>\r\n<\/ul>\r\n<a href=\"#attachment_151\">[Return to Figure 12.4]<\/a>","rendered":"<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Learning Objectives<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ol>\n<li>Use frequency tables and histograms to display and interpret the distribution of a variable.<\/li>\n<li>Compute and interpret the mean, median, and mode of a distribution and identify situations in which the mean, median, or mode is the most appropriate measure of central tendency.<\/li>\n<li>Compute and interpret the range and standard deviation of a distribution.<\/li>\n<li>Compute and interpret percentile ranks and\u00a0<i>z<\/i>\u00a0scores.<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<p><b><a class=\"glossary-term\" 
aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1263\">Descriptive\u00a0statistics<\/a><\/b>\u00a0refers to a set of techniques for summarizing and displaying data. Let us assume here that the data are quantitative and consist of scores on one or more variables for each of several study participants. Although in most cases the primary research question will be about one or more statistical relationships between variables, it is also important to describe each variable individually. For this reason, we begin by looking at some of the most common techniques for describing single variables.<\/p>\n<h1><b><\/b>The Distribution of a Variable<\/h1>\n<p>Every variable has a\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1280\">distribution<\/a><\/b>, which is the way the scores are distributed across the levels of that variable. For example, in a sample of 100 university students, the distribution of the variable \u201cnumber of siblings\u201d might be such that 10 of them have no siblings, 30 have one sibling, 40 have two siblings, and so on. In the same sample, the distribution of the variable \u201csex\u201d might be such that 44 have a score of \u201cmale\u201d and 56 have a score of \u201cfemale.\u201d<\/p>\n<h1><b><\/b>Frequency Tables<\/h1>\n<p>One way to display the distribution of a variable is in a\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1080\">frequency\u00a0table<\/a><\/b>.\u00a0Table 12.1, for example, is a frequency table showing a hypothetical distribution of scores on the Rosenberg Self-Esteem Scale for a sample of 40 college students. The first column lists the values of the variable\u2014the possible scores on the Rosenberg scale\u2014and the second column lists the frequency of each score. 
This table shows that there were three students who had self-esteem scores of 24, five who had self-esteem scores of 23, and so on. From a frequency table like this, one can quickly see several important aspects of a distribution, including the range of scores (from 15 to 24), the most and least common scores (22 and 17, respectively), and any extreme scores that stand out from the rest.<\/p>\n<table>\n<caption>Table 12.1\u00a0Frequency Table Showing a Hypothetical Distribution of Scores on the Rosenberg Self-Esteem Scale<\/caption>\n<tbody>\n<tr>\n<th scope=\"col\"><b><\/b><b>Self-esteem<\/b><\/th>\n<th scope=\"col\"><b><\/b><b>Frequency<\/b><\/th>\n<\/tr>\n<tr>\n<td>24<\/td>\n<td>3<\/td>\n<\/tr>\n<tr>\n<td>23<\/td>\n<td>5<\/td>\n<\/tr>\n<tr>\n<td>22<\/td>\n<td>10<\/td>\n<\/tr>\n<tr>\n<td>21<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>20<\/td>\n<td>5<\/td>\n<\/tr>\n<tr>\n<td>19<\/td>\n<td>3<\/td>\n<\/tr>\n<tr>\n<td>18<\/td>\n<td>3<\/td>\n<\/tr>\n<tr>\n<td>17<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>16<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>15<\/td>\n<td>1<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>There are a few other points worth noting about frequency tables. First, the levels listed in the first column usually go from the highest at the top to the lowest at the bottom, and they usually do not extend beyond the highest and lowest scores in the data. For example, although scores on the Rosenberg scale can vary from a high of 30 to a low of 0,\u00a0Table 12.1 only includes levels from 24 to 15 because that range includes all the scores in this particular data set. Second, when there are many different scores across a wide range of values, it is often better to create a grouped frequency table, in which the first column lists ranges of values and the second column lists the frequency of scores in each range.\u00a0Table 12.2, for example, is a grouped frequency table showing a hypothetical distribution of simple reaction times for a sample of 20 participants. 
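Tallies like these are easy to produce in code. The following is a minimal Python sketch (standard library only) of both kinds of table; the short self-esteem list is illustrative rather than the full Table 12.1 data, and the reaction times are hypothetical values chosen to reproduce the frequencies in Table 12.2:

```python
from collections import Counter

# Illustrative self-esteem scores (0-30 scale), a shortened list
scores = [24, 24, 24, 23, 23, 23, 23, 23, 22, 22, 21, 20, 19]

# Simple frequency table: one row per observed value, highest first
freq = Counter(scores)
for value in sorted(freq, reverse=True):
    print(value, freq[value])

# Grouped frequency table: equal-width ranges (width 20, as in Table 12.2).
# These 20 hypothetical reaction times reproduce the frequencies in Table 12.2.
reaction_times = [152, 159, 167, 171, 174, 178, 183, 185, 188, 190,
                  192, 194, 195, 197, 199, 204, 215, 222, 237, 255]
grouped = Counter((rt - 141) // 20 for rt in reaction_times)
for bin_index in sorted(grouped, reverse=True):
    low = 141 + 20 * bin_index
    print(f"{low}-{low + 19}", grouped[bin_index])
```

The same counting logic works for categorical variables; the only difference is that the "values" are category labels rather than numbers.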
In a grouped frequency table, the ranges must all be of equal width, and there are usually between five and 15 of them. Finally, frequency tables can also be used for categorical variables, in which case the levels are category labels. The order of the category labels is somewhat arbitrary, but they are often listed from the most frequent at the top to the least frequent at the bottom.<\/p>\n<table>\n<caption>Table 12.2\u00a0A Grouped Frequency Table Showing a Hypothetical Distribution of Reaction Times<\/caption>\n<tbody>\n<tr>\n<th scope=\"col\"><b><\/b><b>Reaction time (ms)<\/b><\/th>\n<th scope=\"col\"><b><\/b><b>Frequency<\/b><\/th>\n<\/tr>\n<tr>\n<td>241\u2013260<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>221\u2013240<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>201\u2013220<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>181\u2013200<\/td>\n<td>9<\/td>\n<\/tr>\n<tr>\n<td>161\u2013180<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>141\u2013160<\/td>\n<td>2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h1><b><\/b>Histograms<\/h1>\n<p>A\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1072\">histogram<\/a><\/b>\u00a0is a graphical display of a distribution. It presents the same information as a frequency table but in a way that is even quicker and easier to grasp. The histogram in\u00a0Figure 12.1 presents the distribution of self-esteem scores in Table 12.1. The\u00a0<i>x-<\/i>axis of the histogram represents the variable and the\u00a0<i>y-<\/i>axis represents frequency. Above each level of the variable on the\u00a0<i>x-<\/i>axis is a vertical bar that represents the number of individuals with that score. When the variable is quantitative, as in this example, there is usually no gap between the bars. When the variable is categorical, however, there is usually a small gap between them. 
(The gap at 17 in this histogram reflects the fact that there were no scores of 17 in this data set.)<\/p>\n<figure id=\"attachment_148\" aria-describedby=\"caption-attachment-148\" style=\"width: 899px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2015\/09\/12.1.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2019\/08\/12.1.png\" alt=\"Histogram. There are no spaces between the bars, and there is no bar for the score 17.\" class=\"wp-image-148 size-full\" width=\"899\" height=\"533\" \/><\/a><figcaption id=\"caption-attachment-148\" class=\"wp-caption-text\">Figure 12.1 Histogram Showing the Distribution of Self-Esteem Scores Presented in Table 12.1<\/figcaption><\/figure>\n<h1>Distribution Shapes<\/h1>\n<p>When the distribution of a quantitative variable is displayed in a histogram, it has a shape. The shape of the distribution of self-esteem scores in\u00a0Figure 12.1 is typical. There is a peak somewhere near the middle of the distribution and \u201ctails\u201d that taper in either direction from the peak. The distribution of\u00a0Figure 12.1 is unimodal, meaning it has one distinct peak, but distributions can also be bimodal, meaning they have two distinct peaks.\u00a0Figure 12.2, for example, shows a hypothetical bimodal distribution of scores on the Beck Depression Inventory. Distributions can also have more than two distinct peaks, but these are relatively rare in psychological research.<\/p>\n<figure id=\"attachment_149\" aria-describedby=\"caption-attachment-149\" style=\"width: 825px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2015\/09\/12.2.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2019\/08\/12.2.png\" alt=\"Histogram. 
Long description available.\" class=\"wp-image-149 size-full\" width=\"825\" height=\"475\" \/><\/a><figcaption id=\"caption-attachment-149\" class=\"wp-caption-text\">Figure 12.2 Histogram Showing a Hypothetical Bimodal Distribution of Scores on the Beck Depression Inventory <a href=\"#fig12.2\">[Long Description]<\/a><\/figcaption><\/figure>\n<p>Another characteristic of the shape of a distribution is whether it is symmetrical or skewed. The distribution in the centre of\u00a0Figure 12.3 is <b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1122\">symmetrical<\/a><\/b>. Its left and right halves are mirror images of each other. The distribution on the left is negatively\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1140\">skewed<\/a><\/b>, with its peak shifted toward the upper end of its range and a relatively long negative tail. The distribution on the right is positively skewed, with its peak toward the lower end of its range and a relatively long positive tail.<\/p>\n<figure id=\"attachment_150\" aria-describedby=\"caption-attachment-150\" style=\"width: 900px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2015\/09\/12.3.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2019\/08\/12.3.png\" alt=\"&quot;&quot;\" class=\"wp-image-150 size-full\" width=\"900\" height=\"94\" \/><\/a><figcaption id=\"caption-attachment-150\" class=\"wp-caption-text\">Figure 12.3 Histograms Showing Negatively Skewed, Symmetrical, and Positively Skewed Distributions<\/figcaption><\/figure>\n<p>An\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1054\">outlier<\/a><\/b>\u00a0is an extreme score that is much higher or lower than the rest of the scores in the 
distribution. Sometimes outliers represent truly extreme scores on the variable of interest. For example, on the Beck Depression Inventory, a single clinically depressed person might be an outlier in a sample of otherwise happy and high-functioning peers. However, outliers can also represent errors or misunderstandings on the part of the researcher or participant, equipment malfunctions, or similar problems. We will say more about how to interpret outliers and what to do about them later in this chapter.<\/p>\n<h1><b><\/b>Measures of Central Tendency and Variability<\/h1>\n<p>It is also useful to be able to describe the characteristics of a distribution more precisely. Here we look at how to do this in terms of two important characteristics: its central tendency and its variability.<\/p>\n<h2><b><\/b>Central Tendency<\/h2>\n<p>The\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1309\">central\u00a0tendency<\/a><\/b>\u00a0of a distribution is its middle\u2014the point around which the scores in the distribution tend to cluster. (Another term for central tendency is\u00a0<i>average<\/i>.) Looking back at\u00a0Figure 12.1, for example, we can see that the self-esteem scores tend to cluster around the values of 20 to 22. Here we will consider the three most common measures of central tendency: the mean, the median, and the mode.<\/p>\n<p>The\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1097\">mean<\/a><\/b>\u00a0of a distribution (symbolized\u00a0<i>M<\/i>) is the sum of the scores divided by the number of scores. 
As a formula, it looks like this:<\/p>\n<p style=\"text-align: left;\">\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 12px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/ql-cache\/quicklatex.com-c45fb186799b0902bdc046efa8e66d90_l3.png\" height=\"12\" width=\"109\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#77;&#61;&#92;&#83;&#105;&#103;&#109;&#97;&#32;&#88;&#92;&#100;&#105;&#118;&#32;&#78;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p>In this formula, the symbol \u03a3 (the Greek letter sigma) is the summation sign and means to sum across the values of the variable\u00a0<i>X<\/i>.\u00a0<i>N<\/i>\u00a0represents the number of scores. The mean is by far the most common measure of central tendency, and there are some good reasons for this. It usually provides a good indication of the central tendency of a distribution, and it is easily understood by most people. In addition, the mean has statistical properties that make it especially useful in doing inferential statistics.<\/p>\n<p>An alternative to the mean is the <b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1092\">median<\/a><\/b>. The\u00a0median\u00a0is the middle score in the sense that half the scores in the distribution are less than it and half are greater than it. The simplest way to find the median is to organize the scores from lowest to highest and locate the score in the middle. 
Consider, for example, the following set of seven scores:<\/p>\n<p style=\"text-align: center;\">8 4 12 14 3 2 3<\/p>\n<p>To find the median, simply rearrange the scores from lowest to highest and locate the one in the middle.<\/p>\n<p style=\"text-align: center;\">2 3 3\u00a0<b>4<\/b>\u00a08 12 14<\/p>\n<p>In this case, the median is 4 because there are three scores lower than 4 and three scores higher than 4. When there is an even number of scores, there are two scores in the middle of the distribution, in which case the median is the value halfway between them. For example, if we were to add a score of 15 to the preceding data set, there would be two scores (both 4 and 8) in the middle of the distribution, and the median would be halfway between them (6).<\/p>\n<p>One final measure of central tendency is the mode. The\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1033\">mode<\/a><\/b>\u00a0is the most frequent score in a distribution. In the self-esteem distribution presented in Table 12.1 and\u00a0Figure 12.1,\u00a0for example, the mode is 22. More students had that score than any other. The mode is the only measure of central tendency that can also be used for categorical variables.<\/p>\n<p>In a distribution that is both unimodal and symmetrical, the mean, median, and mode will be very close to each other at the peak of the distribution. In a bimodal or asymmetrical distribution, the mean, median, and mode can be quite different. In a bimodal distribution, the mean and median will tend to be between the peaks, while the mode will be at the tallest peak. In a skewed distribution, the mean will differ from the median in the direction of the skew (i.e., the direction of the longer tail). For highly skewed distributions, the mean can be pulled so far in the direction of the skew that it is no longer a good measure of the central tendency of that distribution. 
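All three measures can be computed directly with Python's standard <code>statistics<\/code> module. Here is a minimal sketch using the seven scores from the median example above:

```python
import statistics

scores = [8, 4, 12, 14, 3, 2, 3]

mean = statistics.mean(scores)      # sum of scores divided by N
median = statistics.median(scores)  # middle score once sorted: 4
mode = statistics.mode(scores)      # most frequent score: 3

print(mean, median, mode)

# With an even number of scores, the median is the value halfway
# between the two middle scores (here, halfway between 4 and 8).
median_even = statistics.median(scores + [15])
print(median_even)
```

Note that `statistics.mode` raises an error in very old Python versions (before 3.8) when there is more than one most-common value; `statistics.multimode` returns all of them.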
Imagine, for example, a set of four simple reaction times of 200, 250, 280, and 250 milliseconds (ms). The mean is 245 ms. But the addition of one more score of 5,000 ms\u2014perhaps because the participant was not paying attention\u2014would raise the mean to 1,196 ms. Not only is this measure of central tendency greater than 80% of the scores in the distribution, but it also does not seem to represent the behaviour of anyone in the distribution very well. This is why researchers often prefer the median for highly skewed distributions (such as distributions of reaction times).<\/p>\n<p>Keep in mind, though, that you are not required to choose a single measure of central tendency in analyzing your data. Each one provides slightly different information, and all of them can be useful.<\/p>\n<h2><b><\/b>Measures of Variability<\/h2>\n<p>The\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1153\">variability<\/a><\/b>\u00a0of a distribution is the extent to which the scores vary around their central tendency. Consider the two distributions in\u00a0Figure 12.4, both of which have the same central tendency. The mean, median, and mode of each distribution are 10. Notice, however, that the two distributions differ in terms of their variability. The top one has relatively low variability, with all the scores relatively close to the centre. The bottom one has relatively high variability, with the scores spread across a much greater range.<br \/>\n<i><\/i><\/p>\n<figure id=\"attachment_151\" aria-describedby=\"caption-attachment-151\" style=\"width: 750px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2015\/09\/12.4.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/uploads\/sites\/37\/2019\/08\/12.4.png\" alt=\"Two histograms with the same central tendency but different variability. 
Long description available.\" class=\"wp-image-151 size-full\" width=\"750\" height=\"790\" \/><\/a><figcaption id=\"caption-attachment-151\" class=\"wp-caption-text\">Figure 12.4 Histograms Showing Hypothetical Distributions With the Same Mean, Median, and Mode (10) but With Low Variability (Top) and High Variability (Bottom) <a href=\"#fig12.4\">[Long Description]<\/a><\/figcaption><\/figure>\n<p>One simple measure of variability is the\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1176\">range<\/a><\/b>, which is simply the difference between the highest and lowest scores in the distribution. The range of the self-esteem scores in\u00a0Table 12.1, for example, is the difference between the highest score (24) and the lowest score (15). That is, the range is 24 \u2212 15 = 9. Although the range is easy to compute and understand, it can be misleading when there are outliers. Imagine, for example, an exam on which all the students scored between 90 and 100. It has a range of 10. But if there was a single student who scored 20, the range would increase to 80\u2014giving the impression that the scores were quite variable when in fact only one student differed substantially from the rest.<\/p>\n<p>By far the most common measure of variability is the standard deviation. The <b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1133\">standard\u00a0deviation<\/a><\/b>\u00a0of a distribution is, roughly speaking, the average distance between the scores and the mean. For example, the standard deviations of the distributions in\u00a0Figure 12.4 are 1.69 for the top distribution and 4.30 for the bottom one. 
That is, while the scores in the top distribution differ from the mean by about 1.69 units on average, the scores in the bottom distribution differ from the mean by about 4.30 units on average.<\/p>\n<p>Computing the standard deviation involves a slight complication. Specifically, it involves finding the difference between each score and the mean, squaring each difference, finding the mean of these squared differences, and finally finding the square root of that mean. The formula looks like this:<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 43px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/ql-cache\/quicklatex.com-5c10b89a2dc7455a40eaf6e74d5d4392_l3.png\" height=\"43\" width=\"165\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#83;&#68;&#61;&#92;&#115;&#113;&#114;&#116;&#123;&#92;&#100;&#102;&#114;&#97;&#99;&#123;&#92;&#83;&#105;&#103;&#109;&#97;&#32;&#40;&#88;&#45;&#77;&#41;&#94;&#50;&#125;&#123;&#78;&#125;&#125;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p>The computations for the standard deviation are illustrated for a small set of data in\u00a0Table 12.3. The first column is a set of eight scores that has a mean of 5. The second column is the difference between each score and the mean. The third column is the square of each of these differences. Notice that although the differences can be negative, the squared differences are always positive\u2014meaning that the standard deviation is always positive. At the bottom of the third column is the mean of the squared differences, which is also called the\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1151\">variance<\/a><\/b>\u00a0(symbolized\u00a0<i>SD<\/i><sup>2<\/sup>). 
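The four steps just listed translate directly into code. Here is a minimal Python sketch using the eight scores of Table 12.3 (standard library only):

```python
import math

scores = [3, 5, 4, 2, 7, 6, 5, 8]  # the eight scores in Table 12.3; M = 5

M = sum(scores) / len(scores)                   # step 1: the mean
squared_diffs = [(x - M) ** 2 for x in scores]  # steps 2-3: difference from the mean, squared
variance = sum(squared_diffs) / len(scores)     # step 4: mean of squared differences (SD^2)
SD = math.sqrt(variance)                        # step 5: square root of the variance

data_range = max(scores) - min(scores)          # for comparison, the range: 8 - 2 = 6

print(M, variance, round(SD, 2))  # 5.0 3.5 1.87
```

Note that this sketch divides by <i>N<\/i>; most software instead divides by <i>N<\/i> \u2212 1 for the reasons discussed in the \u201c<i>N<\/i> or <i>N<\/i> \u2212 1\u201d box below. In Python, these two conventions correspond to <code>statistics.pstdev<\/code> and <code>statistics.stdev<\/code>, respectively.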
Although the variance is itself a measure of variability, it generally plays a larger role in inferential statistics than in descriptive statistics. Finally, below the variance is the square root of the variance, which is the standard deviation.<\/p>\n<table>\n<caption>Table 12.3\u00a0Computations for the Standard Deviation<\/caption>\n<tbody>\n<tr>\n<th scope=\"col\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/ql-cache\/quicklatex.com-d4ee28752517d6062a3ca0314890342d_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#88;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"16\" style=\"vertical-align: 0px;\" \/><\/th>\n<th scope=\"col\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/ql-cache\/quicklatex.com-eeee89348191faf31ad49cd8c8d27187_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#88;&#45;&#77;&#32;&#40;&#77;&#32;&#61;&#32;&#53;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"122\" style=\"vertical-align: -4px;\" \/><\/th>\n<th scope=\"col\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/ql-cache\/quicklatex.com-fd67f1263dc89960d0ab8c284a91c90f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#40;&#88;&#32;&#45;&#32;&#77;&#41;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"77\" style=\"vertical-align: -4px;\" 
\/><\/th>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>\u22122<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>0<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>\u22121<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>\u22123<\/td>\n<td>9<\/td>\n<\/tr>\n<tr>\n<td>7<\/td>\n<td>2<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>1<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>0<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>3<\/td>\n<td>9<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/ql-cache\/quicklatex.com-e1d2457c89cef019834999f2c7c35f57_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#98;&#101;&#103;&#105;&#110;&#123;&#97;&#114;&#114;&#97;&#121;&#125;&#123;&#114;&#114;&#108;&#125; &#83;&#68;&#94;&#50;&#38;&#61;&#38;&#50;&#56;&#92;&#100;&#105;&#118;&#32;&#56;&#32;&#92;&#92;&#32;&#92;&#92; &#83;&#68;&#94;&#50;&#38;&#61;&#38;&#51;&#46;&#53;&#48;&#32;&#92;&#92;&#32;&#92;&#92; &#83;&#68;&#38;&#61;&#38;&#92;&#115;&#113;&#114;&#116;&#123;&#51;&#46;&#53;&#48;&#125;&#32;&#92;&#92;&#32;&#92;&#92; &#83;&#68;&#38;&#61;&#38;&#49;&#46;&#56;&#55; &#92;&#101;&#110;&#100;&#123;&#97;&#114;&#114;&#97;&#121;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"149\" width=\"129\" style=\"vertical-align: -67px;\" \/><\/p>\n<div class=\"textbox shaded\">\n<p><em><strong>N\u00a0or\u00a0N \u2212 1<\/strong><\/em><\/p>\n<p>If you have already taken a statistics course, you may have learned to divide the sum of the squared differences by\u00a0<i>N<\/i> \u2212 1 rather than by\u00a0<i>N<\/i>\u00a0when you compute the variance and standard deviation. Why is this?<\/p>\n<p>By definition, the standard deviation is the square root of the mean of the squared differences. This implies dividing the sum of squared differences by <i>N<\/i>, as in the formula just presented. 
Computing the standard deviation this way is appropriate when your goal is simply to describe the variability in a sample. And learning it this way emphasizes that the variance is in fact the <i>mean<\/i>\u00a0of the squared differences\u2014and the standard deviation is the square root of this\u00a0<i>mean<\/i>.<\/p>\n<p>However, most calculators and software packages divide the sum of squared differences by\u00a0<i>N<\/i>\u00a0\u2212 1. This is because the standard deviation of a sample tends to be a bit lower than the standard deviation of the population the sample was selected from. Dividing the sum of squares by\u00a0<i>N<\/i>\u00a0\u2212 1 corrects for this tendency and results in a better estimate of the population standard deviation. Because researchers generally think of their data as representing a sample selected from a larger population\u2014and because they are generally interested in drawing conclusions about the population\u2014it makes sense to routinely apply this correction.<\/p>\n<\/div>\n<h1><b><\/b>Percentile Ranks and\u00a0z\u00a0Scores<\/h1>\n<p>In many situations, it is useful to have a way to describe the location of an individual score within its distribution. One approach is the percentile rank. The <b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1045\">percentile\u00a0rank<\/a><\/b>\u00a0of a score is the percentage of scores in the distribution that are lower than that score. Consider, for example, the distribution in\u00a0Table 12.1. For any score in the distribution, we can find its percentile rank by counting the number of scores in the distribution that are lower than that score and converting that number to a percentage of the total number of scores. Notice, for example, that five of the students represented by the data in\u00a0Table 12.1 had self-esteem scores of 23. In this distribution, 32 of the 40 scores (80%) are lower than 23. 
Thus each of these students has a percentile rank of 80. (It can also be said that they scored \u201cat the 80th percentile.\u201d) Percentile ranks are often used to report the results of standardized tests of ability or achievement. If your percentile rank on a test of verbal ability were 40, for example, this would mean that you scored higher than 40% of the people who took the test.<\/p>\n<p>Another approach is the\u00a0<i>z<\/i>\u00a0score. The\u00a0<b><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_973_1147\">z score<\/a>\u00a0<\/b>for a particular individual is the difference between that individual\u2019s score and the mean of the distribution, divided by the standard deviation of the distribution:<\/p>\n<p style=\"text-align: left;\">\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 18px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opentextbc.ca\/researchmethods\/wp-content\/ql-cache\/quicklatex.com-f38a0358f6d25e5408bd9477bb8d5265_l3.png\" height=\"18\" width=\"152\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#122;&#61;&#40;&#88;&#45;&#77;&#41;&#92;&#100;&#105;&#118;&#32;&#83;&#68;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p>A\u00a0<i>z<\/i>\u00a0score indicates how far above or below the mean a raw score is, but it expresses this in terms of the standard deviation. For example, in a distribution of intelligence quotient (IQ) scores with a mean of 100 and a standard deviation of 15, an IQ score of 110 would have a\u00a0<i>z<\/i>\u00a0score of (110 \u2212 100) \u00f7 15 = +0.67. In other words, a score of 110 is 0.67 standard deviations (approximately two thirds of a standard deviation) above the mean. Similarly, a raw score of 85 would have a\u00a0<i>z<\/i>\u00a0score of (85 \u2212 100) \u00f7 15 = \u22121.00. 
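Both ways of describing the location of a score are easy to compute. The following is a minimal Python sketch using the self-esteem distribution of Table 12.1 and the IQ example above; <code>percentile_rank<\/code> and <code>z_score<\/code> are illustrative helper functions, not part of any standard library:

```python
def percentile_rank(scores, score):
    """Percentage of scores in the distribution that are lower than `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

def z_score(x, mean, sd):
    """Distance of x from the mean, in standard-deviation units."""
    return (x - mean) / sd

# Table 12.1 rebuilt as a flat list of 40 scores
table_12_1 = {24: 3, 23: 5, 22: 10, 21: 8, 20: 5, 19: 3, 18: 3, 17: 0, 16: 2, 15: 1}
self_esteem = [score for score, freq in table_12_1.items() for _ in range(freq)]

print(percentile_rank(self_esteem, 23))  # 80.0: 32 of the 40 scores are below 23
print(z_score(110, mean=100, sd=15))     # about +0.67: two thirds of an SD above the mean
print(z_score(85, mean=100, sd=15))      # -1.0: one SD below the mean
```

The same helpers also make the outlier rule easy to apply: a score is flagged when its z score falls outside \u00b13.00.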
In other words, a score of 85 is one standard deviation below the mean.<\/p>\n<p>There are several reasons that\u00a0<i>z<\/i>\u00a0scores are important. Again, they provide a way of describing where an individual\u2019s score is located within a distribution and are sometimes used to report the results of standardized tests. They also provide one way of defining outliers. For example, outliers are sometimes defined as scores that have\u00a0<i>z<\/i>\u00a0scores less than \u22123.00 or greater than +3.00. In other words, they are defined as scores that are more than three standard deviations from the mean. Finally,\u00a0<i>z<\/i>\u00a0scores play an important role in understanding and computing other statistics, as we will see shortly.<\/p>\n<div class=\"textbox shaded\">\n<p><b>Online Descriptive Statistics<\/b><\/p>\n<p>Although many researchers use commercially available software such as SPSS and Excel to analyze their data, there are several free online analysis tools that can also be extremely useful. Many allow you to enter or upload your data and then make one click to conduct several descriptive statistical analyses. Among them are the following.<\/p>\n<p><a href=\"http:\/\/onlinestatbook.com\/stat_analysis\/index.html\">Rice Virtual Lab in Statistics<\/a><\/p>\n<p><a href=\"http:\/\/vassarstats.net\/\">VassarStats<\/a><\/p>\n<p><a href=\"http:\/\/www.brightstat.com\/\">Bright Stat<\/a><\/p>\n<p>For a more complete list, see\u00a0<a href=\"https:\/\/statpages.info\/\" target=\"_blank\" rel=\"noopener\">Interactive Statistical Calculator Pages<\/a>.<\/p>\n<\/div>\n<div class=\"textbox textbox--key-takeaways\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Key Takeaways<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li>Every variable has a distribution\u2014a way that the scores are distributed across the levels. The distribution can be described using a frequency table and histogram. 
It can also be described in words in terms of its shape, including whether it is unimodal or bimodal, and whether it is symmetrical or skewed.<\/li>\n<li>The central tendency, or middle, of a distribution can be described precisely using three statistics\u2014the mean, median, and mode. The mean is the sum of the scores divided by the number of scores, the median is the middle score, and the mode is the most common score.<\/li>\n<li>The variability, or spread, of a distribution can be described precisely using the range and standard deviation. The range is the difference between the highest and lowest scores, and the standard deviation is roughly the average amount by which the scores differ from the mean.<\/li>\n<li>The location of a score within its distribution can be described using percentile ranks or\u00a0<i>z<\/i>\u00a0scores. The percentile rank of a score is the percentage of scores below that score, and the\u00a0<i>z<\/i>\u00a0score is the difference between the score and the mean divided by the standard deviation.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Exercises<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ol>\n<li>Practice: Make a frequency table and histogram for the following data. 
Then write a short description of the shape of the distribution in words.\n<ul>\n<li>11, 8, 9, 12, 9, 10, 12, 13, 11, 13, 12, 6, 10, 17, 13, 11, 12, 12, 14, 14<\/li>\n<\/ul>\n<\/li>\n<li>Practice: For the data in Exercise 1, compute the mean, median, mode, standard deviation, and range.<\/li>\n<li>Practice: Using the data in Exercises 1 and 2, find\n<ol type=\"a\">\n<li>the percentile ranks for scores of 9 and 14<\/li>\n<li>the\u00a0<i>z<\/i>\u00a0scores for scores of 8 and 12.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<h1>Long Descriptions<\/h1>\n<p><strong id=\"fig12.2\">Figure 12.2 long description:<\/strong> A histogram showing a bimodal distribution of scores on the Beck Depression Inventory. The horizontal axis is labelled &#8220;Beck Depression Inventory Score,&#8221; and the vertical axis is labelled &#8220;Frequency.&#8221; The data are as follows:<\/p>\n<ul>\n<li>BDI: 0\u20139, Frequency: 3<\/li>\n<li>BDI: 10\u201319, Frequency: 14<\/li>\n<li>BDI: 20\u201329, Frequency: 6<\/li>\n<li>BDI: 30\u201339, Frequency: 2<\/li>\n<li>BDI: 40\u201349, Frequency: 3<\/li>\n<li>BDI: 50\u201359, Frequency: 12<\/li>\n<li>BDI: 60\u201369, Frequency: 4<\/li>\n<\/ul>\n<p>The two distinct peaks are the 10\u201319 range and the 50\u201359 range. <a href=\"#attachment_149\">[Return to Figure 12.2]<\/a><\/p>\n<p><strong id=\"fig12.4\">Figure 12.4 long description:<\/strong> Two histograms with the same central tendency but different variability. Each horizontal axis is labelled &#8220;X&#8221; and has values from 1 to 20, and each vertical axis is labelled &#8220;Frequency&#8221; and has values from 1 to 20. Each histogram has a mean, median, and mode of 10.<\/p>\n<p>In the first histogram, variability is relatively low. 
The data are as follows:<\/p>\n<ul>\n<li>X: 6, Frequency: 1<\/li>\n<li>X: 7, Frequency: 5<\/li>\n<li>X: 8, Frequency: 10<\/li>\n<li>X: 9, Frequency: 16<\/li>\n<li>X: 10, Frequency: 18<\/li>\n<li>X: 11, Frequency: 16<\/li>\n<li>X: 12, Frequency: 10<\/li>\n<li>X: 13, Frequency: 5<\/li>\n<li>X: 14, Frequency: 1<\/li>\n<\/ul>\n<p>In the second histogram, variability is relatively high. The data are as follows:<\/p>\n<ul>\n<li>X: 0, Frequency: 1<\/li>\n<li>X: 1, Frequency: 1<\/li>\n<li>X: 2, Frequency: 2<\/li>\n<li>X: 3, Frequency: 2<\/li>\n<li>X: 4, Frequency: 3<\/li>\n<li>X: 5, Frequency: 3<\/li>\n<li>X: 6, Frequency: 5<\/li>\n<li>X: 7, Frequency: 6<\/li>\n<li>X: 8, Frequency: 7<\/li>\n<li>X: 9, Frequency: 7<\/li>\n<li>X: 10, Frequency: 8<\/li>\n<li>X: 11, Frequency: 7<\/li>\n<li>X: 12, Frequency: 7<\/li>\n<li>X: 13, Frequency: 6<\/li>\n<li>X: 14, Frequency: 5<\/li>\n<li>X: 15, Frequency: 3<\/li>\n<li>X: 16, Frequency: 3<\/li>\n<li>X: 17, Frequency: 2<\/li>\n<li>X: 18, Frequency: 2<\/li>\n<li>X: 19, Frequency: 1<\/li>\n<li>X: 20, Frequency: 1<\/li>\n<\/ul>\n<p><a href=\"#attachment_151\">[Return to Figure 12.4]<\/a><\/p>\n<div class=\"glossary\"><span class=\"screen-reader-text\" id=\"definition\">definition<\/span><template id=\"term_973_1263\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1263\"><div tabindex=\"-1\"><p>A set of techniques for summarizing and displaying data.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1280\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1280\"><div tabindex=\"-1\"><p>The way the scores are dispersed across the levels of the variable.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1080\"><div class=\"glossary__definition\" role=\"dialog\" 
data-id=\"term_973_1080\"><div tabindex=\"-1\"><p>A table in which one column lists the values of a variable (the possible scores) and the other column lists the frequency of each score (how many participants had that score).<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1072\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1072\"><div tabindex=\"-1\"><p>A graphical display of a distribution.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1122\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1122\"><div tabindex=\"-1\"><p>A distribution whose left and right halves are mirror images of each other.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1140\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1140\"><div tabindex=\"-1\"><p>The peak of a distribution is shifted towards either the upper or lower end of its range.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1054\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1054\"><div tabindex=\"-1\"><p>An extreme score that is much higher or lower than the rest of the scores in the distribution.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1309\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1309\"><div tabindex=\"-1\"><p>The point around which the scores in the distribution 
tend to cluster, also called the average.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1097\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1097\"><div tabindex=\"-1\"><p>Symbolized M, the sum of the scores divided by the number of scores.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1092\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1092\"><div tabindex=\"-1\"><p>The middle score in the sense that half the scores in the distribution are less than it and half are greater than it.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1033\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1033\"><div tabindex=\"-1\"><p>The most frequent score in a distribution.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1153\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1153\"><div tabindex=\"-1\"><p>The extent to which the scores vary around their central tendency.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1176\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1176\"><div tabindex=\"-1\"><p>The difference between the highest and lowest scores in the distribution.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template 
id=\"term_973_1133\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1133\"><div tabindex=\"-1\"><p>The average distance between the scores and the mean.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1151\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1151\"><div tabindex=\"-1\"><p>The mean of the squared differences; a measure of variability.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1045\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1045\"><div tabindex=\"-1\"><p>The percentage of scores in the distribution that are lower than a particular score.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_973_1147\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_973_1147\"><div tabindex=\"-1\"><p>The difference between an individual\u2019s score and the mean of the distribution, divided by the standard deviation of the distribution.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close 
definition<\/span><\/button><\/div><\/template><\/div>","protected":false},"author":123,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-973","chapter","type-chapter","status-publish","hentry"],"part":967,"_links":{"self":[{"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/pressbooks\/v2\/chapters\/973","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/wp\/v2\/users\/123"}],"version-history":[{"count":7,"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/pressbooks\/v2\/chapters\/973\/revisions"}],"predecessor-version":[{"id":1493,"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/pressbooks\/v2\/chapters\/973\/revisions\/1493"}],"part":[{"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/pressbooks\/v2\/parts\/967"}],"metadata":[{"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/pressbooks\/v2\/chapters\/973\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/wp\/v2\/media?parent=973"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/pressbooks\/v2\/chapter-type?post=973"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/wp\/v2\/contributor?post=973"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/opentextbc.ca\/researchmethods\/wp-json\/wp\/v2\/license?post=973"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}