Probability & Statistics: Describing Quantitative Data
Describing Quantitative Data: “A picture is worth a thousand words.” To describe quantitative data is to explain how the data is distributed. What is the shape of the data? Where is the center of the data? How spread out is the data? In order to describe the distribution of quantitative data, you will need some vocabulary.
Think Before You Draw: Remember the “Make a picture” rule? Now that we have options for data displays, you need to think carefully about which type of display to make. Before making a stem-and-leaf display, a histogram, or a dotplot, check the Quantitative Data Condition: the data are values of a quantitative variable whose units are known.
Shape, Center, and Spread: When describing a distribution, always address its shape, its center, and its spread.
What is the Shape of the Distribution? 1. Does the histogram have a single, central hump or several separated humps? 2. Is the histogram symmetric? 3. Do any unusual features stick out?
Humps: 1. Does the histogram have a single, central hump or several separated humps? Humps in a histogram are called modes. A histogram with one main peak is unimodal; a histogram with two peaks is bimodal; a histogram with three or more peaks is multimodal.
Humps (cont.): A bimodal histogram has two apparent peaks.
Modes are one way to describe the SHAPE of the graph. The humps of the graph are called MODES.
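As a rough illustration of counting humps, the Python sketch below treats a histogram as a list of bar heights and counts the bars that are taller than both of their neighbors. The function names `count_modes` and `describe_modality` and all of the bar heights are made up for this example. Real histograms are noisy, so small bumps would usually need wider bins or smoothing before a count like this means much.

```python
def count_modes(bin_counts):
    """Count humps: bars taller than both neighbors (a flat top counts once)."""
    # Collapse runs of equal heights so a flat-topped hump is a single peak.
    heights = [h for i, h in enumerate(bin_counts)
               if i == 0 or h != bin_counts[i - 1]]
    if len(heights) < 2:
        return 0  # every bar has the same height: no hump at all
    # Pad with zero-height bars so a peak in the first or last bin is found.
    padded = [0] + heights + [0]
    return sum(1 for left, mid, right in zip(padded, padded[1:], padded[2:])
               if mid > left and mid > right)


def describe_modality(bin_counts):
    """Translate the hump count into the vocabulary used in these slides."""
    names = {0: "no clear mode", 1: "unimodal", 2: "bimodal"}
    return names.get(count_modes(bin_counts), "multimodal")


# Hypothetical bar heights, chosen only to show each case.
print(describe_modality([1, 3, 7, 12, 7, 3, 1]))  # unimodal
print(describe_modality([2, 8, 3, 1, 4, 9, 2]))   # bimodal
print(describe_modality([1, 6, 2, 7, 2, 8, 1]))   # multimodal
print(describe_modality([5, 5, 5, 5]))            # no clear mode
```

Requiring a bar to beat both neighbors is the simplest possible rule; it is a starting point for looking at a picture, not a replacement for it.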
A histogram that doesn’t appear to have any mode and in which all the bars are approximately the same height is called uniform. (Figure: histogram of Proportion of Wins.)
Symmetry: 2. Is the histogram symmetric? If you can fold the histogram along a vertical line through the middle and have the edges match pretty closely, the histogram is symmetric.
Symmetry (cont.): The (usually) thinner ends of a distribution are called the tails. If one tail stretches out farther than the other, the histogram is said to be skewed to the side of the longer tail. In the figure below, the histogram on the left is said to be skewed left, while the histogram on the right is said to be skewed right.
Symmetry is another way to describe SHAPE: does the graph have a tail?
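The "longer tail" rule can be sketched in Python by comparing how far the data stretch on each side of the median. This is only one simple way to put the idea into code; the function name `skew_direction`, the `tolerance` parameter, and the sample values are assumptions made for illustration. Because it looks only at the extremes, a single outlier can change the answer.

```python
from statistics import median


def skew_direction(values, tolerance=0.2):
    """Name the side of the longer tail, measured from the median."""
    mid = median(values)
    left_tail = mid - min(values)    # how far the data reach to the left
    right_tail = max(values) - mid   # how far the data reach to the right
    longer = max(left_tail, right_tail)
    # Treat tails of nearly equal length as symmetric.
    if longer == 0 or abs(right_tail - left_tail) <= tolerance * longer:
        return "roughly symmetric"
    return "skewed right" if right_tail > left_tail else "skewed left"


# Hypothetical data sets, one for each case.
print(skew_direction([1, 2, 2, 3, 3, 3, 4, 4, 5]))         # roughly symmetric
print(skew_direction([1, 1, 2, 2, 3, 4, 6, 9, 15]))        # skewed right
print(skew_direction([1, 7, 10, 12, 13, 14, 14, 15, 15]))  # skewed left
```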
Anything Unusual? 3. Do any unusual features stick out? Sometimes it’s the unusual features that tell us something interesting or exciting about the data. You should always mention any stragglers, or outliers, that stand off away from the body of the distribution. Are there any gaps in the distribution? If so, we might have data from more than one group.
Are there GAPS in your data? The following histogram has outliers: there are three cities in the leftmost bar. These three cities have a noticeably lower number of people per housing unit than the other cities, and a relatively large gap separates them from the rest of the data.
Are there groups or clusters of data? If so, your data may not all be of the same type, may come from different sources, or may contain more than one group. There are two main clusters of data here: one ranging from approximately 7 to 12 and the other ranging from approximately 25 to 37.
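One simple way to locate a gap numerically is to sort the values and find the largest jump between neighbors. The Python sketch below does that and splits the data into two clusters at the jump. The function name `split_at_largest_gap` is hypothetical, and the values are invented to match the approximate cluster ranges described above (about 7 to 12 and about 25 to 37); they are not the data behind the original graph.

```python
def split_at_largest_gap(values):
    """Split sorted data into two clusters at the widest gap."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two values to find a gap")
    # Index i of the largest jump between ordered[i] and ordered[i + 1].
    i = max(range(len(ordered) - 1),
            key=lambda k: ordered[k + 1] - ordered[k])
    gap = ordered[i + 1] - ordered[i]
    return ordered[:i + 1], ordered[i + 1:], gap


# Illustrative values only.
values = [7, 8, 9, 10, 12, 25, 27, 30, 33, 37]
low_cluster, high_cluster, gap = split_at_largest_gap(values)
print(low_cluster)   # [7, 8, 9, 10, 12]
print(high_cluster)  # [25, 27, 30, 33, 37]
print(gap)           # 13
```

Whether a gap is "large" still depends on context: a jump of 13 stands out here because the other jumps are between 1 and 4.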
Where is the CENTER of the distribution? It's easy to identify the center of data that is somewhat uniform or unimodal and symmetric. In the two example graphs, the center of the data is at approximately 0.5 in one and at approximately 0 in the other.
The center of a skewed graph or a multimodal graph is not as easy to determine or use. Certain measures of center can be meaningless or may not be useful for these types of graphs. In a bimodal graph there may be no data in the center, so the center may have no meaning at all. In a skewed graph, the mode, the median, and the mean are each in a different location. Which measure should be used?
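A small Python check can make both problems concrete. The sketch below uses the standard `statistics` module on two invented data sets: one skewed to the right, where the three measures of center disagree, and one with two clusters, where the median falls in a gap that contains no data. All values are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: a long tail toward large values.
skewed = [1, 1, 1, 2, 2, 3, 4, 6, 10]
skewed_mode = mode(skewed)      # most frequent value: 1
skewed_median = median(skewed)  # middle of the ordered values: 2
skewed_mean = mean(skewed)      # 30 / 9, pulled toward the long tail
print(skewed_mode, skewed_median, round(skewed_mean, 2))

# Hypothetical two-cluster data: nothing lies between 12 and 25.
two_groups = [7, 8, 9, 10, 12, 25, 27, 30, 33, 37]
groups_median = median(two_groups)  # average of 12 and 25
nearby = [v for v in two_groups if 13 <= v <= 24]
print(groups_median, nearby)  # the median sits where there is no data
```

In the skewed set the mean is dragged toward the tail, which is one reason the median is often preferred for skewed data.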
Center of a Distribution: Median. The median is the value with exactly half the data values below it and half above it. It is the middle data value (once the data values have been ordered) that divides the histogram into two equal areas. It has the same units as the data.
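The definition translates directly into a few lines of Python: order the values, then take the middle one. The sketch below is a minimal version; the name `median_of` and the sample values are made up. When the number of values is even there is no single middle value, and the usual convention, followed here, is to average the two middle values.

```python
def median_of(values):
    """Return the middle value of the data once it has been ordered."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of no data is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # odd count: one middle value
    # Even count: average the two values that straddle the middle.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_of([3, 1, 2]))     # 2
print(median_of([4, 1, 3, 2]))  # 2.5
```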
Spread: Home on the Range. How SPREAD out is the distribution? Always report a measure of spread along with a measure of center when describing a distribution numerically. The range of the data is the difference between the maximum and minimum values: Range = max – min. A disadvantage of the range is that a single extreme value can make it very large and, thus, not representative of the data overall.
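The sensitivity of the range to one extreme value is easy to see in a short Python sketch. The function name `data_range` and the values are invented for illustration.

```python
def data_range(values):
    """Range = max - min."""
    return max(values) - min(values)


# Hypothetical measurements clustered between 0 and 3.
typical = [0, 1, 1, 2, 2, 2, 3]
print(data_range(typical))  # 3

# Adding one extreme value triples the range,
# even though the other seven values are unchanged.
with_extreme = typical + [9]
print(data_range(with_extreme))  # 9
```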
Example: Below is a histogram of the average wind speed for every day in 1989. Describe the distribution of the data. The distribution is unimodal and skewed to the right. The mode is about 0.8 mph. The range of the data is 9: the data values range from the minimum, which is 0, to the maximum, which is about 9. The high value may be an outlier. There is a large gap between 6.5 and 8.5. The center (median) daily wind speed is about 1.90 mph. The main cluster of data seems to represent wind speeds between 0 and 2.5 mph. Can we say more?
Example: The following graph displays the time it takes for a warehouse to retrieve parts for customer orders. Describe the distribution of the data. What is the modality? • Unimodal, bimodal, multimodal? Where is (are) the mode(s)? Give the number or range of numbers where the mode(s) occur(s). Is the graph somewhat symmetric? • Is the graph skewed? • Is it skewed left or right? Or is the graph uniform?
Example: The following graph displays the time it takes for a warehouse to retrieve parts for customer orders. Describe the distribution of the data. Is there anything unusual about the graph? • Gaps, outliers, clusters? • Where do they occur? Where is the center of the graph? What is the range of the data? What are the maximum and minimum values of the data?
Why is the graph shaped like this? The following graph displays the time it takes for a warehouse to retrieve parts for customer orders. What real-life aspects could account for the shape of this graph? This graph is obviously bimodal. It seems like there are two groups of parts listed. One group of parts takes between 1 and 5 minutes to retrieve and the other group takes 6 to 12 minutes to retrieve. It is possible that the two groups are separated by weight. Maybe the warehouse employees need to use a machine to retrieve the heavier parts, whereas the lighter parts can be retrieved more quickly by hand. There are other possibilities. Maybe some parts are kept in a warehouse that is a bit further away from the customer than the main warehouse.
Try this: The dot plot to the right shows Kentucky Derby winning times, plotting each race as its own dot. a) How many Kentucky Derby winning times were greater than 150 seconds? b) Approximate the best Kentucky Derby winning time. c) Why do you think there is a large gap in the middle of the data? (What real-life situation may have caused this gap?)