How to Use Data in Your Writing

“According to statistics – eighty percent of statistics are made up.”

You may have heard that provocative statement. But have you ever wondered where it comes from? Do you know who said it and what data was processed? Most likely, the statement itself is made up as well.

What is statistics?

Statistics is the practice of collecting and analyzing data to answer a question. In writing, it appeals to logos, one of the three main tools of persuasion, and helps convince readers or catch their attention. Yet statistics often appear distorted, simply because the author wants the audience to believe certain things.

A line often attributed to the British politician Benjamin Disraeli says: “There are three kinds of lies: lies, damned lies, and statistics.” Of course, the presented data itself may be false, so take data from reliable sources if you want to write a good-quality text.

But in most cases, the flaw lies in how the data is analyzed. Your task as an author is to:

  • evaluate the validity of data
  • interpret it correctly in your text (especially if you use primary data, the information you collect yourself)
  • make the many numbers presented in a study sound clear and natural, so they benefit your text instead of burdening it

If your whole text is built around a discovery based on data, use the inverted pyramid structure, where the most important information goes in the first paragraph. Use as few numbers as possible in your opening and don’t forget about the active voice. That way, you can easily tell your audience who collected the data.

Look at this example:

“Ninety-five percent of dentists approve ‘N’ toothbrush!”

This is a typical statement you might hear in a commercial. It sounds impressive for the brand, but not to a critical thinker. The audience doesn’t know who those dentists are, how many were surveyed, or what is meant by approval.

This reasoning leads us to the first (and biggest) way statistics go wrong: manipulation of information sources.

Data samples are where most of the lying happens. They can be too small to show a realistic situation, or skewed so that they don’t represent the general picture.

Let’s illustrate this point. Suppose I say that most students at a K-12 school believe they urgently need a new playground with several slides. It may look logical: children want to go down slides during breaks. However, the numbers will differ depending on whether we ask 60 students from the primary grades or 60 students from the high school grades. To be honest with the audience, we must specify: most primary school students would like more slides on the playground.
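To make the playground example concrete, here is a minimal sketch in Python. The counts are invented purely for illustration (the article gives only the sample size of 60), and the names `survey`, `share`, and `pooled` are hypothetical.

```python
# Hypothetical survey counts, invented purely for illustration.
survey = {
    "primary school": {"asked": 60, "want_slides": 54},
    "high school": {"asked": 60, "want_slides": 6},
}

def share(group):
    """Fraction of respondents in a group who want the slides."""
    return group["want_slides"] / group["asked"]

by_group = {name: share(group) for name, group in survey.items()}

# Pooling both groups hides the split between them.
total_asked = sum(g["asked"] for g in survey.values())
total_want = sum(g["want_slides"] for g in survey.values())
pooled = total_want / total_asked

for name, value in by_group.items():
    print(f"{name}: {value:.0%} want slides")
print(f"all respondents: {pooled:.0%} want slides")
```

With these made-up counts, the answer to "do students want slides?" depends entirely on which students were asked, which is why the sample has to be named in the text.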

However, deliberate manipulation isn’t the only danger to statistics. The mathematician Abraham Wald described another famous fallacy about information sources, now called survivorship bias. During World War II, many American bombers were lost in combat. The planes that returned had damage on their wings and tail sections. The first instinct was to add armor to the damaged areas.

Abraham Wald realized that the damage he could observe wasn’t critical, because those bombers had survived it. On the contrary, hits to the areas that looked undamaged on the returning planes were the fatal ones: bombers hit there never made it back to base. If we analyze statistics to prevent something, we must first look at the failures, and only afterwards compare them with the successes.

Survivorship bias by Abraham Wald

So, think carefully about the question you want to answer with statistics. A well-chosen question supports both your opinion and your expertise in the eyes of the reader. Your reader wants to hear a story from you, so your main focus should be on what you have found and why!

There is also the Texas sharpshooter fallacy. This misreading of data occurs when differences are ignored and attention is drawn only to the similarities. A basic example is the superstition that a leap year is an unlucky year. Roughly the same number of good and bad events happen every year, but people usually emphasize only the negative examples from a leap year to prove its misfortune.

No matter how badly you want to support your argument, show the entire picture and draw comparisons so that your audience can judge the case objectively. By the way, when you compare numbers, avoid putting them in parentheses. Try to find a way to present them inside a sentence.

Even when the information sources are sound and the relevant questions are asked, there is still room for manipulation at the analysis stage. It can happen in many ways, but we will look at the most common ones.

By accident or on purpose, a researcher might confuse the mean, the median, and the mode. So be careful with the word ‘average’ in your text, because it can refer to any of the three.

What is average?

Why you should avoid the word ‘average’: confusion between mean, mode, and median
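A short Python sketch shows how far apart the three "averages" can be. The salary figures are hypothetical and chosen only to make the gap visible; the functions come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical monthly salaries in a small company (invented numbers).
salaries = [2000, 2000, 2000, 2500, 3000, 3500, 20000]

print("mean:  ", mean(salaries))    # pulled upward by the single large salary
print("median:", median(salaries))  # the middle value of the sorted list
print("mode:  ", mode(salaries))    # the most frequent value
```

All three numbers could honestly be called the "average salary", so it is safer to name the measure you actually used.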

Pay attention to the percentages presented, because they may not add up to 100 percent.

If we go back to the toothbrush example, the dialogue between a dentist (D) and a statistician (S) could go like this:

  •   S: What toothbrush do you recommend?
  •   D: I recommend the ‘N’ toothbrush. I also recommend ‘X’, and ‘Y’ seems good to me.

No matter how many dentists mentioned the ‘N’ toothbrush, the results are skewed if the other brands they recommended go unmentioned.
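The dialogue can be turned into a small Python sketch. The four responses are hypothetical, and the variable names are illustrative only. It assumes each dentist may recommend several brands at once.

```python
from collections import Counter

# Hypothetical answers: each dentist may recommend several brands.
responses = [
    ["N", "X", "Y"],
    ["N", "X"],
    ["N"],
    ["X", "Y"],
]

# Count how many dentists mentioned each brand.
mentions = Counter(brand for answer in responses for brand in answer)
shares = {brand: count / len(responses) for brand, count in mentions.items()}

for brand, value in sorted(shares.items()):
    print(f"{brand}: recommended by {value:.0%} of dentists")

# A sum above 100% signals that respondents could pick more than one option.
total = sum(shares.values())
print(f"sum of shares: {total:.0%}")
```

In this made-up sample, ‘N’ and ‘X’ are recommended equally often, yet an advertisement could truthfully quote the figure for ‘N’ alone.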

Darrell Huff provides an example of a misleading comparison in his book How to Lie with Statistics. He cites two statistics from the Spanish-American War in 1898: the mortality rate among members of the United States Navy and the mortality rate among the civilian population. During the same period, there were 9 deaths per 1000 people in the Navy and 16 deaths per 1000 among civilians. Recruiters used those numbers to argue that joining the Navy was safer. The misleading point was that the Navy consisted of young and healthy men, while the civilian population included the sick, the elderly, and children, who were more likely to die no matter where they were. Huff’s point is that the two groups are not comparable, so the figures do not show that the Navy was safer, especially since the sailors were risking their lives in a war.
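The following Python sketch shows why the two rates cannot be compared directly. Only the crude rates of 9 and 16 per 1000 come from the text. The split of civilians into two groups, and the rate inside each group, are invented for illustration and chosen so that they average out to 16.

```python
# Crude rates from the text: deaths per 1000 people.
NAVY_RATE = 9
CIVILIAN_RATE = 16

# Hypothetical civilian breakdown (invented for illustration):
# share of the population and deaths per 1000 within each group.
civilian_groups = {
    "young healthy adults": {"share": 0.5, "rate": 6},
    "children, elderly, sick": {"share": 0.5, "rate": 26},
}

# The weighted average of the group rates reproduces the crude rate.
crude = sum(g["share"] * g["rate"] for g in civilian_groups.values())

# The fair comparison is sailors against civilians of similar age and health.
comparable = civilian_groups["young healthy adults"]["rate"]
print(f"crude civilian rate: {crude:.0f} per 1000")
print(f"navy: {NAVY_RATE} per 1000, comparable civilians: {comparable} per 1000")
```

Under these assumed numbers, the Navy looks safer than the civilian population as a whole but riskier than the civilians who resemble sailors, which is the kind of reversal the comparison hides.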

Even the most obvious variable – inflation – can be ignored while analyzing the average income, GDP, or any other monetary statistics. People might earn more money every month, but at the same time they spend more on products, utilities, or entertainment.
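As a minimal sketch of the adjustment, the Python snippet below compares nominal and inflation-adjusted income growth. The income and inflation figures are hypothetical, and `real_growth` is an illustrative helper, not a function from any library.

```python
def real_growth(old_income, new_income, inflation):
    """Income growth after adjusting for inflation (rate given as a fraction)."""
    nominal = new_income / old_income
    return nominal / (1 + inflation) - 1

# Hypothetical figures: a 10% raise during a year of 8% inflation.
nominal_change = 3300 / 3000 - 1
real_change = real_growth(3000, 3300, 0.08)

print(f"nominal growth: {nominal_change:.1%}")
print(f"real growth:    {real_change:.1%}")
```

A writer who reports only the nominal figure tells the reader about the raise but not about what it can buy.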

To make statistics easy to read, avoid packing too many numbers into your text. You can visually separate your data with bullet points or, of course, create visuals.

Visual representation of data helps the reader to imagine what is going on.

However, drawing a graph is still a part of processing data, and yes, there is still an opportunity to manipulate information. So, if you decide to illustrate your article by yourself or use someone else’s graph – here are things to avoid.

If the researchers’ goal is to shock people, the y-axis of the line graph most likely will not start at 0. When the y-axis is cut off while the x-axis still starts from 0, the slope of the line looks more extreme. Note that in some fields such a representation is justified. For example, a stock market graph might start from a non-zero point, because even small changes of 0.1¢ matter there.

Manipulating the scale of a graph

However, bar graphs must always start from 0, with no exceptions. The scale and the fixed marks on the y-axis can also be left out, and then the real trend looks different. When bar graphs are drawn in 3D, such distortion is even easier to achieve.

Manipulating bar charts
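The effect of a cut-off axis on bars can be estimated with simple arithmetic. This Python sketch assumes each bar is drawn from the axis baseline up to its value. The values 50 and 55 and the helper `drawn_ratio` are hypothetical.

```python
def drawn_ratio(small, large, baseline=0.0):
    """How many times taller the larger bar looks when the axis starts at baseline."""
    return (large - baseline) / (small - baseline)

# Hypothetical values: 50 versus 55, a 10% difference.
honest = drawn_ratio(50, 55)         # axis starts at 0
truncated = drawn_ratio(50, 55, 45)  # axis starts at 45

print(f"axis from 0:  larger bar looks {honest:.1f}x taller")
print(f"axis from 45: larger bar looks {truncated:.1f}x taller")
```

The data is identical in both cases. Only the starting point of the axis changes how large the difference appears.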

Another case that might cause misunderstanding combines visual representation with information analysis. During US presidential elections, we often see a map of the USA where every state is colored red or blue, showing whether it voted Republican or Democratic, respectively. The goal of this map is to present the results state by state. In other words, if we see one color prevailing on the map, there is no guarantee that the corresponding candidate will become president. We should keep in mind population density and the number of electoral votes each state holds under the Electoral College system.
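A toy example in Python makes the gap between "states won" and "votes won" visible. The five states and their vote counts are entirely hypothetical and do not represent real election results.

```python
# Hypothetical map: (state, electoral votes, winning color). Not real results.
states = [
    ("A", 3, "red"),
    ("B", 4, "red"),
    ("C", 5, "red"),
    ("D", 30, "blue"),
    ("E", 20, "blue"),
]

states_won = {"red": 0, "blue": 0}
votes_won = {"red": 0, "blue": 0}
for _, votes, color in states:
    states_won[color] += 1       # what the colored map emphasizes
    votes_won[color] += votes    # what decides the outcome

print("states won:     ", states_won)
print("electoral votes:", votes_won)
```

On this invented map, one color covers more states while the other collects far more electoral votes.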

In conclusion:

These main types of statistical manipulation (and there are more that we haven’t covered here) are used daily to confuse and influence audiences. Statistics are an easy tool for propaganda, fake news, and irresponsible advertising. As you can see, manipulation can be both conscious and unconscious.

However, a lot depends on how the data is interpreted and how it is presented in the text. Keep this in mind every time you report numbers and work them into your text. If you have difficulties doing that, book an appointment with Best Edit, and we will help you improve your writing.
