Displaying and Describing Categorical Data Descriptive Statistics 1
Visualizing Categorical Data
• Basic options:
  ◦ Pie charts
  ◦ Bar charts
• Raw categorical data comes as a list of values, each of which is one of a set of categories.
• However, as we will see later, we might instead be given only a summary of the data.
Pie Charts
• A pie chart is a circle divided into sectors, one for each category.
• The area (and central angle) of each sector is proportional to the frequency or relative frequency of that category.
• Pie charts are useful for showing the proportion of each category relative to the whole.
• They are not effective if there are too many categories or if some relative frequencies are too small. In that case, you may need to combine categories.
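The proportionality rule above can be sketched in a few lines of Python. This is only an illustration: the slides use Minitab, and the function name `sector_angles` and the counts below are made up for the example.

```python
def sector_angles(counts):
    """Return each category's central angle in degrees.

    Each angle is 360 times the category's relative frequency,
    so the angles always add up to a full circle.
    """
    total = sum(counts.values())
    return {category: 360 * count / total for category, count in counts.items()}


# Made-up counts for illustration only (not data from these slides).
example_counts = {"A": 10, "B": 5, "C": 5}
angles = sector_angles(example_counts)

for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

A category holding half of the observations gets half of the circle, which is why very small categories produce slices that are hard to read.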
Pie Chart Example
• Suppose we took a random sample of students' blood types.
• A pie chart for this example, with the following enhancements:
  ◦ Add in percentages
  ◦ Highlight the A+ group
  ◦ Use a more informative title
  ◦ Create an “Other” category
Bar Chart
• Lists the categories on the horizontal axis.
• Draws a rectangle above each category, with height equal to the category’s frequency or relative frequency.
• Used for categorical data.
• Let’s look at the same data in a bar chart.
Bar Chart Example
• Let’s look at the same blood type data in a bar chart.
• Next, consider a Pareto chart, in which the bars are arranged in decreasing order of frequency.
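As a rough sketch of the Pareto ordering, the following Python snippet tallies a sample and prints text bars from the most to the least frequent category. It is not the Minitab workflow used in these slides, and the blood-type values are invented for the example.

```python
from collections import Counter

# Made-up blood-type sample for illustration only (not the slide's data).
sample = ["O+", "A+", "O+", "B+", "A+", "O+", "AB+", "A+", "O+", "O-"]

counts = Counter(sample)

# most_common() lists categories from highest to lowest count,
# which is the left-to-right bar order of a Pareto chart.
pareto_order = counts.most_common()

for category, count in pareto_order:
    # One '#' per observation stands in for the height of the bar.
    print(f"{category:<4}{'#' * count} ({count})")
```

Sorting the bars this way makes the most common categories easy to spot, at the cost of losing any natural ordering the categories may have had.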
Bar Charts vs. Pie Charts
• Pie charts:
  ◦ Pros:
    ▪ Can easily picture relative frequencies.
  ◦ Cons:
    ▪ Not effective if there are too many categories or if some relative frequencies are too small.
• Bar charts:
  ◦ Pros:
    ▪ Can handle more categories.
    ▪ Easy to compare categories with a Pareto chart.
  ◦ Cons:
    ▪ Relative frequencies can be deceiving.
    ▪ Bars can be arranged in any order you want.
Interpreting Categorical Distributions
• When describing a variable's distribution, we want to talk about the typical value and the variation in the sample.
• For categorical data, that means:
  ◦ Mode
  ◦ Variability
Mode: The most frequently occurring value
• We can have more than one mode. For categorical data, the bars only need to be roughly the same height to count as modes.
• Use the following wording:
  ◦ Unimodal: one distinct mode
  ◦ Bimodal: two modes with the same (or very close) frequency
  ◦ Multimodal: more than two modes with the same (or very close) frequency
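One way to make "roughly the same height" concrete is to treat every category within a small tolerance of the largest count as a mode. The sketch below assumes a 5% tolerance; that threshold, the function names, and the counts are illustrative assumptions, not a rule given in these slides.

```python
def find_modes(counts, tolerance=0.05):
    """Return categories whose count is within `tolerance` of the largest count.

    `tolerance` is a fraction of the largest count (an assumed cutoff for
    "roughly the same height", not a standard definition).
    """
    top = max(counts.values())
    return [category for category, n in counts.items() if n >= top * (1 - tolerance)]


def describe_modality(modes):
    """Map the number of modes to the wording used on the slide."""
    if len(modes) == 1:
        return "unimodal"
    if len(modes) == 2:
        return "bimodal"
    return "multimodal"


# Made-up counts for illustration only.
example_counts = {"Red": 40, "Blue": 39, "Green": 12}
modes = find_modes(example_counts)
print(modes, describe_modality(modes))
```

In practice the judgment is usually made by eye from the chart; the tolerance only mimics that judgment.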
Variability in Categorical Data
• Variability: for categorical data, think of it as the diversity in the data values.
• What to look for:
  ◦ High variation: each value is represented with about the same frequency (many observations in many different categories).
  ◦ Low variation: a small number of values appear a large number of times (many observations fall into a few categories).
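The slides describe variability qualitatively. As a rough numerical aid, one could look at the share of observations that falls in the few most common categories: a share near 1 suggests low variation, while a share close to what equal-sized categories would give suggests high variation. The function `top_k_share`, the choice of k = 2, and the counts below are assumptions made for illustration, not a measure defined in these slides.

```python
def top_k_share(counts, k=2):
    """Return the fraction of observations in the k most common categories."""
    total = sum(counts.values())
    largest = sorted(counts.values(), reverse=True)[:k]
    return sum(largest) / total


# Made-up counts for illustration only.
spread_out = {"A": 25, "B": 25, "C": 25, "D": 25}    # high variation
concentrated = {"A": 90, "B": 6, "C": 2, "D": 2}     # low variation

print(f"spread out:   {top_k_share(spread_out):.2f}")
print(f"concentrated: {top_k_share(concentrated):.2f}")
```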
Pie Chart Example
• Consider the following random sample of the genres of the top movies released last year.
• Create a pie chart.
Genre:
Animation, Comedy, Horror, Drama, Action, Drama, Animation, Drama, Comedy, Horror,
Action, Comedy, Action, Drama, Comedy, Drama, Action, Comedy, Horror, Action,
Drama, Action, Comedy, Drama, Action, Comedy, Action, Drama, Action, Comedy,
Drama, Horror, Action, Sci. Fi, Drama, Horror, Romance, Action, Comedy, Action,
Drama, Comedy, Drama, Action, Comedy, Horror, Drama, Action, Drama, Animation,
Drama, Comedy, Horror, Action, Comedy, Action, Sci. Fi, Romance, Western, Comedy,
Musical, Drama, Action, Animation, Horror, Action, Comedy, Horror, Action
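Before any chart can be drawn, the raw values have to be tallied into counts and relative frequencies, which Minitab does behind the scenes. A minimal Python sketch of that step follows, using the genre values exactly as they are listed in this example; if the list in the original slide differs, the counts will differ too.

```python
from collections import Counter

# Genre values as listed in this example.
raw = """
Animation, Comedy, Horror, Drama, Action, Drama, Animation, Drama, Comedy, Horror,
Action, Comedy, Action, Drama, Comedy, Drama, Action, Comedy, Horror, Action,
Drama, Action, Comedy, Drama, Action, Comedy, Action, Drama, Action, Comedy,
Drama, Horror, Action, Sci. Fi, Drama, Horror, Romance, Action, Comedy, Action,
Drama, Comedy, Drama, Action, Comedy, Horror, Drama, Action, Drama, Animation,
Drama, Comedy, Horror, Action, Comedy, Action, Sci. Fi, Romance, Western, Comedy,
Musical, Drama, Action, Animation, Horror, Action, Comedy, Horror, Action
"""

# Split on commas and drop surrounding whitespace and empty pieces.
genres = [value.strip() for value in raw.split(",") if value.strip()]

counts = Counter(genres)
total = len(genres)

# Print a frequency table: category, count, relative frequency.
for genre, n in counts.most_common():
    print(f"{genre:<10}{n:>3}  {n / total:6.1%}")
```

The relative frequencies printed here are the slice sizes a pie chart of the raw data would show.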
Minitab Instructions (Pie Chart)
• Graph box -> Pie Chart.
• We are given the raw data here, so leave “Chart counts of unique values” selected.
• Choose your variable.
• Click OK.
Minitab Pie Chart Edits
• Depending on your results, make any edits you may want:
• Create an “Other” category
  ◦ From the Pie Chart menu, click the Pie Options button
  ◦ Combine slices of this percent or less
    ▪ I used 4% here
• Add desired labels
  ◦ From the Pie Chart menu, click the Pie Options button
• Explode the category of interest
  ◦ Double-click the category on the graph
    ▪ I chose Animation here
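The “combine slices of this percent or less” option can be sketched outside Minitab as follows. The function name `combine_small_slices` is hypothetical, and the counts are tallied by hand from the genre list as it appears in this document, so treat them as an assumption rather than the instructor's exact figures.

```python
def combine_small_slices(counts, threshold_pct=4.0, other_label="Other"):
    """Merge every category at or below `threshold_pct` percent into one slice."""
    total = sum(counts.values())
    combined = {}
    other = 0
    for category, n in counts.items():
        if 100 * n / total <= threshold_pct:
            other += n          # small slice: fold it into "Other"
        else:
            combined[category] = n
    if other:
        combined[other_label] = other
    return combined


# Counts tallied from the genre list in this document (assumed accurate).
genre_counts = {
    "Action": 19, "Drama": 16, "Comedy": 15, "Horror": 9, "Animation": 4,
    "Sci. Fi": 2, "Romance": 2, "Western": 1, "Musical": 1,
}

grouped = combine_small_slices(genre_counts)
for category, n in grouped.items():
    print(f"{category:<10}{n:>3}")
```

With a 4% cutoff, Animation (about 5.8% of the sample) keeps its own slice, while Sci. Fi, Romance, Western, and Musical are merged into “Other”.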
Another Example from Table
• The following table gives information on college majors at a particular university. Produce appropriate graphs.
• Notice that here we have the data summarized in a table rather than the raw data. How will this affect our results?
Minitab Instructions
• What if we do not have the raw data?
• In this situation, choose Pie Chart but click “Chart values from a table”.
• Choose the labels as your categorical variable.
  ◦ In our case, Major
• Then choose your summary variable, which holds the count or percentage.
  ◦ In our case, Percentage
• Click OK.
Bar Chart Example
• Consider the following random sample of the genres of the top movies released last year.
• Create a bar chart.
Genre:
Animation, Comedy, Horror, Drama, Action, Drama, Animation, Drama, Comedy, Horror,
Action, Comedy, Action, Drama, Comedy, Drama, Action, Comedy, Horror, Action,
Drama, Action, Comedy, Drama, Action, Comedy, Action, Drama, Action, Comedy,
Drama, Horror, Action, Sci. Fi, Drama, Horror, Romance, Action, Comedy, Action,
Drama, Comedy, Drama, Action, Comedy, Horror, Drama, Action, Drama, Animation,
Drama, Comedy, Horror, Action, Comedy, Action, Sci. Fi, Romance, Western, Comedy,
Musical, Drama, Action, Animation, Horror, Action, Comedy, Horror, Action
Minitab Instructions (Bar Chart)
• Graph box -> Bar Chart.
• Here we have raw data, so the bars represent “Counts of unique values”.
• Again, let’s use the “Genre” variable.
• Finally, click OK.
Pareto Chart in Minitab
• Double-click the bars to open the Edit Bars screen.
• Click Chart Options.
• Choose Decreasing Y.