Recognizing Bias in Data

In the last few posts, we’ve discussed using data to quantitatively analyze aspects of the world around us. Often we do not look at the data itself (for lack of time, etc.) and instead look at the graphical representations of it that people have created for us. This lets us quickly take in what would otherwise be an overwhelming amount of numbers. Today, we’ll continue that theme in a (perhaps) more down-to-earth manner.

To begin, the Learning Target for today’s class was “I can analyze inherent bias in graphical displays of information.” Wanting an example with several ‘layers’ of difficulty embedded in it, I chose this chart, which comes from one of the Data Crunches on the fantastic financial education resource, NGPF.org. It is important to note that the chart actually dates from 2017, so its 2020 figures are projections.

[Chart screenshot: federal spending by category as a percentage of the total budget, 1962 to 2020]

First, I of course gave students a minute to analyze it, as well as a second minute to discuss with a turn-and-talk partner. Right at the start of this process, I already had one student in my class exclaim, upon reading the learning target, “It’s data – it can’t have a bias! It’s just the numbers, and numbers don’t have a bias!” Excellent starting place for a teacher! 

The natural first topic of conversation, based on the learning target, was the title (although I should mention a few students wanted to talk about the colors used). The questions raised were: Which is a more accurate overview of what the chart shows us – the main title, or the sub-title? Why? Is this an example of how the title of a chart can subtly alter what sorts of information people are looking for in the data? What sort of a political bias – right or left – would one assume this chart came from based on the title? 

Next, we began to analyze the graph’s trends. Normally I would make sure to call attention to the axes, but I saved that until later, wondering if a student would call out the things that I was noticing about them. Obviously the percentage of federal spending used for military and defense has dropped the most out of the categories listed from 1962 until now, and health spending as a percentage of the national budget has increased the most. Where do these topics fit into the national conversation? My students unanimously agreed that military and defense spending was traditionally a talking point of the right, but interestingly pegged health as an issue on the left. It seems to me that health care is being discussed across the board, and there are certainly contrasting views of it, but for a quick example of some of the shifting baselines within health care, look to President Trump’s feelings on the government’s ability to negotiate prices of pharmaceutical drugs. This seems to be at least one issue within the broader conversation of health care on which both sides of the aisle agree something needs to begin to change. For homework, I assigned the February 25th episode of The Journal, “How Big Pharma Lost Its Swagger.”

But back to the point! My class and I were looking for potential bias, and I felt that it was time to draw attention to the axes. What scale was chosen on the x-axis? A four-year scale. Why? Presidential election years! But what the heck happens here? I know off the top of my head that 1964 and (especially) 1968 were BIG election years during the height of the Civil Rights Movement and the Vietnam War, which eventually led to the stepping down of LBJ, the assassination of Bobby Kennedy, and the election (for the first term) of Richard Nixon. So why does the axis have 1962 as the first year? What the heck!? They missed an increment and went up by six years between 1994 and 2020! What does this tell us about the authors of the graph? Honestly, I’m not really sure… However, if that discrepancy causes us to just ask the next question – what is the Tax Foundation and what might their aims be? – then we may be able to research it to learn more.

However, my BIG question for the class was this: Did the dollars on Military and Defense spending decrease between 1962 and 2020? 

“YES!” They all answered. 

“Hold on, listen to the question again. Did the dollar amount spent on Military and Defense decrease from ‘62 to ‘20?” 

A few ceased to answer, sensing that they needed to think more slowly. “YES!” the others said. 

“OK, I’m going to add to my question. Did the dollar amount spent – NOT the percentage of the total budget, but the dollars spent in total on Military and Defense – decrease?” 

Now they were onto it. “What this graph shows us is only the percentage of total spending… but if the total amount of government spending went up over these years as well, the amount spent on Military and Defense might be exactly the same or even more than it was, even while the spending as a percent of the total budget decreased.” Exactly. 
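To put numbers on that reasoning, here is a small Python sketch. The figures are invented for illustration and are not taken from the chart or from any real budget, and the function name `dollars_for_category` is made up as well.

```python
def dollars_for_category(total_budget, share_percent):
    """Dollars spent on one category, given the total budget and its share."""
    return total_budget * share_percent / 100


# HYPOTHETICAL figures, invented for illustration only (billions of dollars).
early_total, early_share = 100, 50   # category is 50% of a small budget
late_total, late_share = 1000, 15    # category is 15% of a much larger budget

early_dollars = dollars_for_category(early_total, early_share)
late_dollars = dollars_for_category(late_total, late_share)

print(f"Early year: {early_share}% of {early_total} = {early_dollars:.0f}")
print(f"Late year:  {late_share}% of {late_total} = {late_dollars:.0f}")
print("Share fell:", late_share < early_share)
print("Dollars rose:", late_dollars > early_dollars)
```

With these made-up values the category's share falls from 50% to 15% while the dollars spent on it triple, which is the situation a percentage-only chart cannot reveal on its own.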

So what does this ‘catch’ tell us about bias? Might we want to know why the authors only included percent of the total budget, and might it be important to analyze the total spending as well? Is there a way to show the graph that displays the total spending, but also still gives the reader a sense of how the proportions stack up against each other? Yes, but what would be the downsides of that graph? Well, it would be easier to see the proportions where the total government spending was highest – presumably closer to the present – and harder to see in the early part of the graph where total spending was lower. Yet, the title seemed to have a right-leaning bias, and conservatives traditionally want to minimize total government spending, so would it not also benefit a conservative to show total spending going up? Or was it simply more important to highlight the decreasing Military and Defense spending in contrast to the spending on Social Programs? These are all questions I want my kids to ask. 
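One way to see the downside of a total-spending graph is to draw a rough one in text. The sketch below is only an illustration: the budgets are invented, not taken from the chart, and the name `stacked_bar` and the 25-billion-per-character scale are arbitrary choices of mine.

```python
# HYPOTHETICAL budgets, invented for illustration only (billions of dollars).
budgets = {
    "early year": {"Military": 50, "Health": 10, "Other": 40},
    "late year": {"Military": 150, "Health": 300, "Other": 550},
}

UNITS_PER_CHAR = 25  # each printed character stands for 25 billion dollars


def stacked_bar(spending, units_per_char=UNITS_PER_CHAR):
    """Build a text bar whose length tracks total dollars.

    Each category is drawn with its first letter, repeated once per
    `units_per_char` dollars (rounded to the nearest whole character).
    """
    return "".join(
        name[0] * round(amount / units_per_char)
        for name, amount in spending.items()
    )


bars = {year: stacked_bar(spending) for year, spending in budgets.items()}
for year, bar in bars.items():
    print(f"{year:>10} | {bar}")
```

In the early-year bar the Health slice is too small to earn even one character, so it disappears, while the late-year bar shows all three categories clearly. A real stacked chart would have the same problem in a softer form: the thin early bars make proportions hard to read.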

My wife is a data analyst for a big software company. She’s excellent at the job, and I enjoy hearing her discuss some of the specifics of the position in informal conversations with people we meet in town or on ski chairlifts. Most people are just interested to hear about what she does, but every once in a while the person we are talking to has a background in data and business. Recently, we met one of those people, and when my wife told him what she does, he said “Oh, yeah! Cool, so you ask your boss what she wants the data to say, and then you make the data say what the company wants it to say, huh?” 

“Yes. Exactly,” she responded. 

I think our kids ought to know that data can have bias.

-mmm
