Let’s give the audience a peek behind our mental curtain. We want to make the logical organization of our data very clear, because what makes sense to us (who have been so steeped in the data we’re dreaming about it) will not be readily obvious to an outside viewer (even someone who cares quite a lot about the results). One way to support the viewer’s efforts to engage with our data is to build a systematic interpretive structure. Here, let’s do it with color.

Say we introduce this first graph on school district reading scores.

[Figure: line graph of 9th through 12th graders' reading scores]

We have now introduced a color-coding scheme for the data, where 9th graders are dark yellow, 10th graders are red, 11th graders are blue, and 12th graders are green. We’ve even reinforced the color scheme by making the data labels at the end of each line the matching color.

Here is the important part: To the extent possible (which is a lot), we should keep the same colors associated with the same grade levels. When we put grade-level data in tables, we use the same colors on the column headers. When we write about each grade level in the report, the heading “Ninth Grade” should be in dark yellow. Consistent use of the color system makes interpretation faster and engagement easier. You with me so far?
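If you build your charts in code rather than in Excel, one way to hold yourself to this rule is to write the color scheme down once and have everything look it up. Here is a minimal sketch in Python. The hex values are my stand-ins for the color names above, and `GRADE_COLORS` and `color_for` are hypothetical names, not part of any library.

```python
# Hypothetical hex values standing in for the color names in the post.
GRADE_COLORS = {
    "9th": "#B8860B",   # dark yellow
    "10th": "#C0392B",  # red
    "11th": "#1F77B4",  # blue
    "12th": "#2CA02C",  # green
}


def color_for(grade: str) -> str:
    """Return the single color assigned to a grade, wherever it appears."""
    try:
        return GRADE_COLORS[grade]
    except KeyError:
        # Fail loudly rather than silently inventing a new color.
        raise ValueError(f"No color assigned to grade {grade!r}") from None


# The line chart, the table headers, and the report headings
# all go through the same lookup, so they cannot drift apart.
line_chart_colors = [color_for(g) for g in ["9th", "10th", "11th", "12th"]]
table_header_colors = {g: color_for(g) for g in ["12th", "9th"]}
```

Because the table headers ask by grade name, a table that happens to list 12th grade first still gets green for 12th grade.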

Eventually we get to the point where a more complex chart, like this scatterplot, doesn’t even need a legend cluttering it up for viewers to understand what each color means.

[Figure: scatterplot of reading scores and days absent, color coded by grade level]

Now, you’re probably like “Duh, mate, Excel automatically assigns the same color to each row in the data table.” That’s true. Excel can do the heavy lifting for you IF the rows in each data table are in the same order every time. In this example it makes sense: there’s a natural order because we have an ordinal scale, 9th through 12th grade. What about purely categorical scales? Well, it means you must stick with the same order of categories in each data table in your spreadsheet.
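To see why row order matters so much, here is a small Python sketch that compares two ways of handing out colors. `colors_by_position` is a simplified stand-in for order-based assignment (it is not Excel's actual logic), and the genre categories, palette, and function names are all made up for illustration.

```python
# Hypothetical default palette, handed out in row order.
PALETTE = ["#B8860B", "#C0392B", "#1F77B4"]

# Hypothetical fixed scheme for a purely categorical scale.
CATEGORY_COLORS = {
    "Fiction": "#B8860B",
    "Nonfiction": "#C0392B",
    "Poetry": "#1F77B4",
}


def colors_by_position(categories):
    """Order-based assignment: first row gets the first color, and so on."""
    return dict(zip(categories, PALETTE))


def colors_by_name(categories):
    """Name-based assignment: each category always gets its own color."""
    return {c: CATEGORY_COLORS[c] for c in categories}


# Same categories, different row order in two data tables.
table_a = ["Fiction", "Nonfiction", "Poetry"]
table_b = ["Poetry", "Fiction", "Nonfiction"]

# Position-based: Fiction changes color between the two tables.
fiction_shifts = (
    colors_by_position(table_a)["Fiction"]
    != colors_by_position(table_b)["Fiction"]
)

# Name-based: every category keeps its color regardless of order.
names_stable = colors_by_name(table_a) == colors_by_name(table_b)
```

When colors are handed out by position, keeping the rows in the same order is the only thing holding the scheme together, which is exactly why the order has to stay fixed in a spreadsheet.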

All that said, it is still a very good idea to use color (or lack of it) to highlight a certain point. Let’s say I’m the reading curriculum coach for the 12th grade in my district and I’ve got to address the 12th grade reading teachers at a staff meeting. I can gray out all other grades (but not remove them, since they’re a helpful point of reference) and keep the color for my grade of interest.
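In code, graying out everything but the grade of interest can be a small transformation on top of the existing color scheme. This is only a sketch: the hex values, the gray, and the `highlight` helper are my own hypothetical choices.

```python
# Hypothetical hex values standing in for the color names in the post.
GRADE_COLORS = {
    "9th": "#B8860B",   # dark yellow
    "10th": "#C0392B",  # red
    "11th": "#1F77B4",  # blue
    "12th": "#2CA02C",  # green
}

GRAY = "#BDBDBD"  # assumed neutral gray for de-emphasized series


def highlight(colors, focus):
    """Keep the focus series in its usual color and gray out the rest.

    No series is dropped, so the other grades remain as a point of reference.
    """
    if focus not in colors:
        raise ValueError(f"Unknown series {focus!r}")
    return {
        name: (color if name == focus else GRAY)
        for name, color in colors.items()
    }


# Colors for the 12th grade staff meeting version of the chart.
staff_meeting_colors = highlight(GRADE_COLORS, "12th")
```

The 12th grade line stays green, so it still matches every other chart and table in the report.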

So build a strong color-coding system whenever possible and repeat it without variation, until you need a bit of variation to make a point.
