Similarly, if we had mileage or safety information on all the models of cars, we would be overwhelmed by the information, but if we grouped the cars into a few categories such as manufacturers (Ford, Chrysler, General Motors, etc.) or type of vehicle (subcompact, sedan, station wagon, van, etc.) we might be able to comprehend the essence of the information. Although the usefulness of grouping into categories is clear, it is often difficult to determine the appropriate way to group categories. If you grouped cars by manufacturer, you would miss safety information that depends on the type of vehicle. However, there are sometimes natural groupings dependent on the nature of the data.

*Quantitative* (also further specified as interval and ratio, the
distinction between which is not of interest for our purposes) data is data
where what is being recorded can be identified with the real numbers.
Examples include age, I.Q., weight, and height. Identification with the real
numbers facilitates organizing, comprehending, and communicating this data.
We can always group quantitative data by grouping together numbers which are
close to one another. We will later combine the data using algebraic
operations to describe where the data lies.

**N.B.:** We can count all data, whether categorical or
quantitative; the terms categorical and quantitative refer to the essence of
the individual items which we are counting.

Exercise: What characteristics of people are qualitative? quantitative? What characteristics of cars are qualitative? quantitative?

    10 | 5
    11 | 023
    12 | 055
    13 | 025
    14 | 055
    15 | 5558
    16 | 058
    17 | 0055
    18 | 0555
    19 | 0
    20 |
    21 |
    22 |
    23 | 5

In practice, stem-and-leaf plots are formed before the data has been ordered (or in order to order the data). Thus it is a three-step process: 1) choose the stems based on scanning the data, 2) add the leaves in the order encountered, 3) reorder the leaves on each stem from smallest to largest.
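The three-step process can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the function name `stem_and_leaf` is made up for this sketch, the weights are the values read back from the plot above, and the order in which they are listed is an arbitrary assumption standing in for unordered raw data. The sketch also assumes non-negative whole numbers with the last digit used as the leaf.

```python
def stem_and_leaf(data):
    """Return the lines of a stem-and-leaf plot (stem = tens and up, leaf = last digit)."""
    # Step 1: choose the stems by scanning the data for its smallest and largest values.
    stems = range(min(data) // 10, max(data) // 10 + 1)
    leaves = {stem: [] for stem in stems}

    # Step 2: add the leaves in the order encountered.
    for x in data:
        leaves[x // 10].append(x % 10)

    # Step 3: reorder the leaves on each stem from smallest to largest.
    return [
        f"{stem} | {''.join(str(leaf) for leaf in sorted(leaves[stem]))}".rstrip()
        for stem in stems
    ]


# Values read back from the plot above, in an arbitrary (assumed) order.
weights = [
    155, 132, 185, 105, 170, 125, 145, 190, 113, 168,
    235, 120, 175, 155, 140, 185, 110, 160, 135, 158,
    180, 125, 170, 145, 112, 165, 155, 130, 175, 185,
]

lines = stem_and_leaf(weights)
for line in lines:
    print(line)
```

Stems with no data (20, 21, 22 here) are kept as empty rows, so the gap between 190 and 235 stays visible.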

Sometimes, to enhance the visual presentation of data, stems will be split (e.g., repeat each stem on the left, once for the digits 0-4, once for the digits 5-9). Sometimes data are truncated (rightmost digits dropped) in order to have an informative plot with single-digit leaves.
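A possible sketch of both variations follows. The function `split_stem_and_leaf` and its `drop_digits` parameter are hypothetical names chosen for this example, and the four-digit values in the second call are invented purely to show truncation.

```python
def split_stem_and_leaf(data, drop_digits=0):
    """Stem-and-leaf lines with each stem split into leaves 0-4 and leaves 5-9."""
    # Truncate: drop the rightmost digits (no rounding) so each leaf is a single digit.
    truncated = [x // 10 ** drop_digits for x in data]

    lo, hi = min(truncated) // 10, max(truncated) // 10
    # Two rows per stem: half 0 holds leaves 0-4, half 1 holds leaves 5-9.
    rows = {(stem, half): [] for stem in range(lo, hi + 1) for half in (0, 1)}
    for x in truncated:
        stem, leaf = divmod(x, 10)
        rows[(stem, leaf // 5)].append(leaf)

    return [
        f"{stem} | {''.join(str(leaf) for leaf in sorted(leaves))}".rstrip()
        for (stem, _half), leaves in rows.items()
    ]


# Split stems only: a few of the weights from the 15 and 16 stems.
split_lines = split_stem_and_leaf([155, 155, 158, 160, 165, 168, 155])
print("\n".join(split_lines))

# Truncation: invented four-digit values, dropping the rightmost digit.
truncated_lines = split_stem_and_leaf([1052, 1128, 1255], drop_digits=1)
print("\n".join(truncated_lines))
```

Note that truncation drops digits rather than rounding, so 1128 is plotted as 112, not 113.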

The essence of a histogram is best illustrated by the method of its construction.

- Choose the number of classes; this will be an aesthetic judgement based on the data. Generally you will want between 5 and 20 classes: your goal is to communicate where the data lies. The choice matters, although its effect is subtle: too few classes hide the shape of the data, and too many break it into fragments.
- Choose the class size. Divide the range by the target number of classes above, then round off aesthetically. If you do not end up with the number of classes above, it will not matter since that number was a rough aesthetic guess.
- Choose the class marks (or class boundaries). Again, do this aesthetically. Since classes are all the same size, if you know the class marks (which are at the center of the classes), you know the class boundaries, and vice-versa. It is important that each datum lies in exactly one class.
- Draw the histogram. This requires that you count the number of data which lie in each class, and make the heights (hence areas) of the bars proportional to the number of data in each class. Do not forget to label the histogram, since it will convey no information if it is not labelled.
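The steps above can be sketched in Python using the weights read back from the stem-and-leaf plot earlier in this section. The target of 6 classes, the class size of 25, and the first boundary of 87.5 are assumptions made for this illustration, one reasonable set of "aesthetic" choices among several. The helper name `class_counts` is likewise made up for the sketch.

```python
# Weights read back from the stem-and-leaf plot earlier in this section.
weights = [105, 110, 112, 113, 120, 125, 125, 130, 132, 135,
           140, 145, 145, 155, 155, 155, 158, 160, 165, 168,
           170, 170, 175, 175, 180, 185, 185, 185, 190, 235]

# Step 1: choose a target number of classes (a rough aesthetic guess).
target_classes = 6

# Step 2: class size = range / target number of classes, then rounded by eye.
raw_size = (max(weights) - min(weights)) / target_classes  # 130 / 6, about 21.7
class_size = 25  # assumed rounding of raw_size

# Step 3: choose the class boundaries. Half-unit boundaries guarantee that no
# whole-number weight falls on a boundary, so each datum lies in exactly one class.
first_boundary = 87.5  # assumed; gives class marks 100, 125, 150, ...


def class_counts(data, size, start):
    """Count the data in consecutive classes [start + i*size, start + (i+1)*size)."""
    n_classes = int((max(data) - start) // size) + 1
    counts = [0] * n_classes
    for x in data:
        counts[int((x - start) // size)] += 1
    return counts


counts = class_counts(weights, class_size, first_boundary)

# Step 4: draw and label the histogram (sideways, one * per student).
print("Weights of students in pounds (class mark | number of students)")
for i, count in enumerate(counts):
    class_mark = first_boundary + (i + 0.5) * class_size
    print(f"{class_mark:5.0f} | {'*' * count}")
```

The raw class size is computed only to guide the rounding; ending up with a slightly different number of classes than the target is fine, as noted in the second step.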

[Figure: histogram titled "Weights of Students in Statistics Course". Horizontal axis: weights of students in pounds, labelled 100, 125, 150, 175, 200, 225. Vertical axis: ticks at 5 and 10.]

Note that all the original data can be recovered from a stem-and-leaf plot, but you only know the approximate value of the data when it is presented in a histogram.

**Competencies:** Give examples of categorical (qualitative) data and
quantitative data.

Present the following weights: {132, 180, 200, 150, 165, 144, 194, 125, 160,
130, 140, 140, 160, 170, 150, 155, 135, 165, 120, 185, 141, 210, 105, 115,
125, 162, 215, 235, 170, 200, 125, 125, 225, 170, 140, 135, 185, 230, 269,
130, 220, 198, 285, 140, 173, 180, 210, 148, 115, 205, 130} as a stem-and-leaf
plot.

Present the above weights as a histogram.

**Reflection:** When would you violate the above rules for making a
histogram, and how would you do it?

**Challenge:**

May 2003