Handout 7: Understanding Relationships Between Two Categorical Variables
STAT 100 – Spring 2016
16
Analyzing Data from an Existing Contingency Table in Excel
In Handout 7, we computed percentages that would be relevant for investigating this question
of interest. Here, our objective is to compute these percentages and to obtain a graphical
summary of the data using Excel.
Note that in terms of how the data were presented to us, there is one big difference between this
example and the Titanic example. In the Titanic example, our Excel file contained the raw data.
In other words, each row in the spreadsheet represented a single subject in the study.
A snippet of the Titanic Data
We were given the raw data for the Titanic example, and we had to create a contingency table
using PivotTables in Excel. We were not provided with raw data for the Immigration Reform
study; instead, the data were given to us in the form of a contingency table. If we were to use
the exact same steps in Excel for the Immigration Reform example, we would need to create an
Excel sheet that was set up so that each row represents the outcomes for a single subject in the
study. For example, we could start with something like this:
This is just a start. There were 1,010 respondents in this study, so we would have to create a
sheet with 1,010 rows. Of these rows, 202 would identify Democrats who said “Legal
residency,” 79 would identify Democrats who said “Stop flow/deport,” etc. Doesn’t sound like
much fun, does it? Instead of going through all of this trouble, let’s take a short-cut. In Excel,
we can enter the data as follows. Essentially, we will give each cell in the contingency table its
own row in Excel. Then, we’ll add a column to our data sheet that tells Excel how many
respondents appeared in each cell.