If you know the midrange and range, you can calculate the maximum and minimum.
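
The calculation works because the midrange is the average of the two extremes and the range is their difference. A minimal Python sketch follows; the function name `extremes_from_midrange_and_range` is invented for illustration, and data set A is assumed to be {2, 3, 5, 8, 9}, the values that appear in the variance calculation later in these notes.

```python
def extremes_from_midrange_and_range(midrange, data_range):
    """Recover (minimum, maximum) from the midrange and the range.

    midrange = (max + min) / 2 and range = max - min, so
    max = midrange + range / 2 and min = midrange - range / 2.
    """
    half = data_range / 2
    return midrange - half, midrange + half


# Data set A, assumed to be {2, 3, 5, 8, 9}: midrange 5.5, range 7.
minimum, maximum = extremes_from_midrange_and_range(5.5, 7)
print(minimum, maximum)  # 2.0 9.0
```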

The inter-quartile range is defined as Q3-Q1. For data sets A and B the inter-quartile range is 8-3 = 5 and 7-2 = 5, respectively. Note that the inter-quartile range is a single number, not the ordered pair consisting of the quartiles. For the weights of students the inter-quartile range is 175-130 = 45. [Different definitions for the quartiles will produce different inter-quartile ranges.] Since Q3 is the middle of the data above the median and Q1 is the middle of the data below the median, Q3-Q1 = (Q3-Q2)+(Q2-Q1) is roughly twice the typical distance of a datum from the median.
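
The bracketed remark about differing quartile definitions can be seen directly in Python (3.8 or later), whose `statistics.quantiles` function offers two conventions. This is only a sketch: data set A is assumed to be {2, 3, 5, 8, 9}, the helper name `iqr` is invented here, and neither convention is claimed to be the one these notes use, although the "inclusive" one happens to reproduce Q1 = 3 and Q3 = 8.

```python
import statistics

data_a = [2, 3, 5, 8, 9]  # assumed contents of data set A


def iqr(data, method):
    """Inter-quartile range Q3 - Q1 under the given quartile convention."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method=method)
    return q3 - q1


iqr_inclusive = iqr(data_a, "inclusive")  # Q1 = 3, Q3 = 8
iqr_exclusive = iqr(data_a, "exclusive")  # Q1 = 2.5, Q3 = 8.5
print(iqr_inclusive, iqr_exclusive)  # 5.0 6.0
```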

Note that knowing the median and the inter-quartile range does not let you calculate the minimum and maximum.
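
One way to convince yourself of this is a counterexample: two data sets with the same median and the same inter-quartile range but different extremes. In the sketch below, data set A is assumed to be {2, 3, 5, 8, 9}, the second data set is made up for illustration, and the quartiles use Python's "inclusive" convention, which is an assumption rather than the definition used in these notes.

```python
import statistics


def median_and_iqr(data):
    """Return (median, Q3 - Q1) using the 'inclusive' quartile convention."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return median, q3 - q1


data_a = [2, 3, 5, 8, 9]       # assumed contents of data set A
stretched = [0, 3, 5, 8, 100]  # hypothetical: same middle, different extremes

print(median_and_iqr(data_a))     # (5.0, 5.0)
print(median_and_iqr(stretched))  # (5.0, 5.0)
print(min(data_a), max(data_a), min(stretched), max(stretched))  # 2 9 0 100
```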

Rather than the maximum extent, one might want a measure of the average distance of data from the center.
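
The paragraph that follows speaks of avoiding negative summands. A short sketch of why they are a problem: signed deviations from the mean always cancel, so their average is useless as a measure of spread, while taking absolute values removes the signs. Data set A is assumed to be {2, 3, 5, 8, 9}, and the name `mean_abs_dev` is supplied here, not taken from the notes.

```python
import math

data_a = [2, 3, 5, 8, 9]  # assumed contents of data set A
mean = sum(data_a) / len(data_a)  # 5.4

deviations = [x - mean for x in data_a]
# Positive and negative deviations cancel, so their sum is zero
# (up to floating-point rounding) for every data set.
print(math.isclose(sum(deviations), 0.0, abs_tol=1e-12))  # True

# Taking absolute values removes the signs before averaging.
mean_abs_dev = sum(abs(d) for d in deviations) / len(data_a)
print(round(mean_abs_dev, 2))  # 2.48
```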

Another way to avoid negative summands is to square them. The quantity $\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is called the variance and is denoted by $s^2$. [The reason for dividing by n-1 rather than n is that this is the estimate for the variance of a population based on a sample; if we had divided by n we would still have called it the variance, but denoted it by $\sigma^2$, where $\sigma$ is lower-case sigma.] Evaluating this expression for data set A yields ((2-5.4)^2 + (3-5.4)^2 + (5-5.4)^2 + (8-5.4)^2 + (9-5.4)^2)/4 = 9.3. This is not a good measure of the average distance from the mean, because it is in squared units, but its square root, 3.05, is (taking the square root essentially undoes the previous squaring). The square root of the variance is called the standard deviation and is denoted by $s$. For the weights of students the variance is 881.77, and the standard deviation is 29.69.
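
The arithmetic for data set A can be checked with a few lines of Python. This is a sketch, not part of the original notes: the function name `sample_variance` is chosen here, and the standard-library calls `statistics.variance` (divides by n-1) and `statistics.pvariance` (divides by n) are included only for comparison.

```python
import math
import statistics

data_a = [2, 3, 5, 8, 9]  # data set A, as read off the calculation above


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


s_squared = sample_variance(data_a)
s = math.sqrt(s_squared)
print(round(s_squared, 2), round(s, 2))  # 9.3 3.05

# The standard library agrees; pvariance divides by n instead of n - 1.
print(round(statistics.variance(data_a), 2))   # 9.3
print(round(statistics.pvariance(data_a), 2))  # 7.44
```

Dividing by n gives the smaller value 7.44, which illustrates the bracketed remark about n-1 versus n.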

Exercise: In what sense is the standard deviation an average distance? How many (what percentage) of the weights of students lie within one standard deviation unit of the mean? Within two standard deviation units of the mean? Within three? How many standard deviation units is the range?

For the standard deviation, there is no exact result either, but many data
distributions are approximately normal (for our purposes, approximately normal
means symmetric and unimodal, mound shaped). Hence we can use the exact result
for normal distributions as a rule of thumb for most data distributions:

Approximately .68 of the data (approximately 2/3) lies within one standard
deviation unit of the mean.

Approximately .95 of the data (approximately 19/20) lies within two standard
deviation units of the mean.

Approximately .9973 of the data (approximately 399/400) lies within three
standard deviation units of the mean.
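
The normal-distribution values behind this rule of thumb can be computed from the error function, since the fraction of a normal distribution within k standard deviations of the mean is erf(k/sqrt(2)). The sketch below also counts the fraction of an actual data set within k standard deviations, which is one way to approach the exercise above. The function names are invented for illustration, and data set A is assumed to be {2, 3, 5, 8, 9}; with only five values it is not expected to follow the rule closely.

```python
import math
import statistics


def normal_fraction_within(k):
    """Exact fraction of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))


def data_fraction_within(data, k):
    """Fraction of the data within k sample standard deviations of the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return sum(abs(x - mean) <= k * s for x in data) / len(data)


for k in (1, 2, 3):
    print(k, round(normal_fraction_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973

data_a = [2, 3, 5, 8, 9]  # assumed contents of data set A
print(data_fraction_within(data_a, 1), data_fraction_within(data_a, 2))  # 0.6 1.0
```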

[Bounds on how much data lies within one, two, or three standard deviation
units from the mean are available from Tchebychev's theorem, but we shall not
discuss that in this course.]

**Competencies:** For the data set {2 5 9 4 6 7 6 8 8}, calculate the
variance, standard deviation, and range.

**Reflection:** For the above data set, which of the above statistics best
describes the spread of the data?

**Challenge:** Is the variance always greater than the standard deviation?
When will the variance, standard deviation, and range be equal?

May 2003