Data Visualization Tips & Tricks: What Not To Do!

The data we deal with has immense complexity built into it (across metrics, methodologies, sources). The business itself is just as complex (no more just run ads on TV and wait for people to walk into our stores). So every analyst has to strive to simplify complexity every day.

Hence, one thing our community has in common is a love for data visualization.

The above-mentioned complexity is also the reason good data visualizers are paid so much more, and great ones can write their own compensation amount!

It is important to be transparent that I come from the school of thought that believes in extreme simplicity in deploying data viz in a business context. I'm not a fan of the much-heralded Minard's Napoleon's March. I much prefer the accessibility, and speed to insight, of Snow's Cholera Outbreak visual.

[If you are a TMAI Premium member: See the deep reflections in TMAI #319: Data Viz Philosophy From The Other John Snow, and TMAI #320: Data Visualization: Inspiration to Think Different. If you can't find it, please email me.]

Having shared with you loads of inspiring data viz examples to apply to your day-to-day work, today I bring rich learnings from examples of what not to do!

And, a bit of fun. :)


This blog post was originally published as edition #425 of my newsletter TMAI Premium. Each week, the newsletter shares strategic frameworks, tactical tips, and knowledge from the bleeding edge of Marketing, Analytics, User Experience, and Leadership. Sign up for TMAI Premium to accelerate your career trajectory.


CBSN Marijuana Pie Chart

Pie charts are an incredibly ineffective medium of communication.

Don't use them – unless you work at CBS and someone has a gun to your head, then definitely, against your better judgement, convert what should be a bar chart into a pie chart!

Pie charts are a poor visualization choice because humans are not great at internalizing the different sections of a pie chart – unless the differences are dramatic.

Technically speaking: Comparison by angle is significantly more difficult than by length.
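
To make the angle-versus-length point concrete, here is a minimal Python sketch that turns a handful of shares into a sorted bar chart made of text. The category names and percentages are made up for illustration, and `text_bar_chart` is a hypothetical helper, not something taken from the CBSN example.

```python
def text_bar_chart(shares, width=40):
    """Return one text line per category, sorted largest first.

    Bar length is proportional to the value, so the reader compares lengths.
    """
    largest = max(shares.values())
    label_width = max(len(label) for label in shares)
    lines = []
    for label, value in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<{label_width}} {bar} {value}%")
    return lines


# Made-up shares, deliberately close together (the hard case for a pie chart).
shares = {"Option A": 27, "Option B": 31, "Option C": 24, "Option D": 18}
for line in text_bar_chart(shares):
    print(line)
```

Even with values this close together, the ranking is readable at a glance because each bar's length carries the number.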

Follow me on LinkedIn, for gems like this one: Eat Pies, Don't Share Them!

People often say:

Well, if I have two numbers, say New and Returning Visitors, I can use a pie chart.

Yes. But, why waste time making a visual?

On the slide, in font size 120, just say 62% and 38%. That'll be way more effective. :)


Six-way Venn Diagram

Long-time readers will know that I adore Venn diagrams. They can be a simple way to communicate relationships.

But, honestly, what's going on above! [Source.]

With sincerity, A LOT OF hard work, smart work, went into the above visualization. And, I further postulate that the person who made it probably gets what's going on in the six-way Venn diagram. Unfortunately, it is very inaccessible to most humans, I postulate, even those of us in the gene sequencing community.

I also do not doubt that after a 40-minute explanation unpacking the data, perhaps the rest of us will get some of the visual. But, then, what's the point of a visual?

Don't make things more complicated than they need to be.

The visualization is not a path to your next job promotion. The visualization is not an ideal mechanism to express cleverness via complexity. The objective is to get to an insight efficiently, with the end goal of influencing a meaningful business action.

Special note for my peer Data Viz Nerds: Many Venn diagrams are actually Euler diagrams.

The difference between the two: Unlike Venn diagrams, which show all possible relations between different sets, the Euler diagram shows only relevant relationships.

If that feels confusing, take one look at the Euler showing relationships between different solar system objects, and you'll get it instantly!


Military Budgets China Russia UK India Saudi Arabia US

If you want to spot a hidden agenda, start by looking at the x, y, z axes.

Do that above.

China looks scary above.

Less so after you realize the highest point in China is well below the lowest U.S. point – ever.

Never manipulate the x, y, z axes. And, if you do, call attention to it!

You've seen exaggerated differences in lines/bars by manipulating the y-axis. You've seen trends made to look worse or better by manipulating the x-axis. Try not to be that person.
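
One rough way to quantify this kind of exaggeration is Edward Tufte's "lie factor": the size of the effect shown in the graphic divided by the size of the effect in the data. The sketch below assumes a simple bar chart and uses made-up numbers; `drawn_height` and `lie_factor` are illustrative names.

```python
def drawn_height(value, axis_start):
    """Height of a bar as actually drawn when the y-axis starts at axis_start."""
    return value - axis_start


def lie_factor(first, second, axis_start=0.0):
    """Effect shown in the graphic divided by the effect in the data.

    1.0 means the picture matches the data; larger means it exaggerates.
    """
    effect_in_data = (second - first) / first
    shown_first = drawn_height(first, axis_start)
    shown_second = drawn_height(second, axis_start)
    effect_in_graphic = (shown_second - shown_first) / shown_first
    return effect_in_graphic / effect_in_data


# Made-up numbers: a metric that goes from 50 to 55 (a 10% increase).
honest = lie_factor(50, 55, axis_start=0)
truncated = lie_factor(50, 55, axis_start=45)
print(f"axis starts at 0:  lie factor {honest:.1f}")
print(f"axis starts at 45: lie factor {truncated:.1f}")
```

With the axis starting at 45, the second bar is drawn twice as tall as the first, so a 10% change looks like a 100% change. That is the kind of thing worth calling attention to on the chart itself.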

Bonus: For things like Military spend, it is extremely difficult to make any chart sans agenda. There are so many caveats, nuances, politics, and human tradeoffs.

Here's another chart in the same spirit: difficult topic, active manipulation.

Gun deaths in Florida

Did you recognize the active manipulation?

It takes a second, which most people who look at the chart won't give it. They'll simply assume things got better.

It surprised me that this disappointing chart came from Reuters.

But, then again… I should not be that surprised. This was their official company Venn:

Thomson Reuters Values

: |

Talk about a visual that does not communicate what they think it is communicating!


Ireland's position in Olympics Medals Table - Irish Times

Rules can't be applied blindly.

Notice that the y-axis flip caused the sad gun deaths graph above to be manipulative. The general rule is: start at zero and then go up. But, notice that the Irish Times followed exactly that rule and made all of Ireland feel sad.

Though, a quick read of the numbers indicates that the Irish had a great Tokyo and an even better Paris. Most people will quickly glance at the graph and feel the exact opposite – things have become worse.
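
A rank is a case where bigger is worse, so the usual start-at-zero-and-go-up axis works against the reader. Here is a small sketch of the alternative, which flips the mapping so that rank 1 lands at the top. The ranks below are invented placeholders, not Ireland's actual medal table positions, and `rank_to_height` is a hypothetical helper.

```python
def rank_to_height(rank, worst_rank, chart_height=10):
    """Map a rank to a vertical position where rank 1 sits at the top.

    Returns 0 for worst_rank and chart_height for rank 1.
    """
    if not 1 <= rank <= worst_rank:
        raise ValueError("rank must be between 1 and worst_rank")
    if worst_rank == 1:
        return chart_height
    return (worst_rank - rank) / (worst_rank - 1) * chart_height


# Placeholder ranks that improve over three Games (a lower number is better).
placeholder_ranks = {"Games 1": 60, "Games 2": 40, "Games 3": 20}
for games, rank in placeholder_ranks.items():
    height = rank_to_height(rank, worst_rank=60)
    print(f"{games}: rank {rank} is drawn at height {height:.1f}")
```

With this mapping an improving rank climbs on the page, which matches what the reader feels when they glance at it.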

Lisa Muth did a quick data check (turns out the Irish Times also did not have accurate data!), and shared this alternative version of the graph…

Ireland's position in Olympics Medals Table - Lisa Muth

Lisa's post is here. It is a lot of fun.

It also illustrates the value of adding context (which she does via US and French rankings).

It also leads to one more rule for me:

Do not show all the data you have, unless *all* the data adds to the story.



[TMAI Premium Members: How good are your data visuals? Use my simple algorithm to identify and improve: TMAI #223: Data Viz Quality Assessment Algorithm.]


Appellate judgeships confirmed history

Lines, bars, dots can seem boring. Hence, the trend to sexify them with different visual artifacts.

Resist the urge.

You'll catch the obvious issue that each gavel seems to have its own scale (!). 24 can't be that much bigger than 15!

I'm also urging you to consider that the gavels themselves, even if rightly sized, get in the way of understanding the insights. The angled gavels are making a bad situation worse.
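
One common way pictorial charts end up exaggerating, which may or may not be what happened with the gavels, is scaling both the height and the width of an icon by the value. The eye reads area, so the difference gets squared. A minimal sketch using the 24 and 15 mentioned above; the function names are illustrative.

```python
import math


def naive_area_ratio(small, large):
    """Apparent (area) ratio when icon height AND width both scale with the value."""
    return (large / small) ** 2


def honest_linear_scale(value, reference):
    """Scale factor for height and width so that the icon's AREA tracks the value."""
    return math.sqrt(value / reference)


true_ratio = 24 / 15
print(f"true ratio:             {true_ratio:.2f}")
print(f"apparent ratio (naive): {naive_area_ratio(15, 24):.2f}")
print(f"linear scale to use:    {honest_linear_scale(24, 15):.2f}")
```

The naive version makes 24 look about two and a half times as big as 15 instead of 1.6 times. A plain bar avoids the question entirely.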

In the same vein, graphical cartograms can be immensely insightful. (Ready or not, get ready to be bombarded with them as the US election approaches.)

But, every good visual technique can be made as appealing as nuclear waste.

Here's an example, that builds using multivariate Chernoff Faces symbology…

Multivariate Chernoff Faces symbology

WTH!

Let me illuminate the full glory of the complexity encoded…

Multivariate Chernoff Faces symbology - definitions

W! T! H!

At the scale at which it is communicating the data, it is extraordinary to imagine encoding such deep detail that nearly no one will be able to see/understand.

It is unproductive.

Note: I would also choose a different color for Republicans. Their official color is red, but in this case, with faces attached, it just conveys lots and lots of anger. That is unfair to them.

Understand your target audience. Understand your data. Understand the altitude. Then, apply massive restraint when it comes to the urge to be "cool" and "clever."


Spurious Correlations

If data visualization had an Original Sin, it would be:

Implying correlation = causation!

Never, ever, never, do that.

I see this all over Marketing data visualizations where, as an example, the plot will have one line for Spend on Marketing and another line for Overall Company Revenue. Criminal behavior.
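
To see how easily two lines that merely trend upward produce an impressive correlation, here is a small sketch with made-up weekly Spend and Revenue numbers. The two series are built independently: each is a straight trend plus its own unrelated wiggle. Comparing week-over-week changes instead of raw levels is one rough sanity check. That check is an assumption of this sketch, not a test of causality, which takes a controlled experiment.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def changes(series):
    """Week-over-week differences, which remove a steady trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Made-up data: both series trend up, but their wiggles are unrelated.
weeks = range(25)
spend_wiggle = [1, -1, 1, -1]
revenue_wiggle = [1, 1, -1, -1]
spend = [10 + week + spend_wiggle[week % 4] for week in weeks]
revenue = [100 + 5 * week + 3 * revenue_wiggle[week % 4] for week in weeks]

levels_r = pearson(spend, revenue)
changes_r = pearson(changes(spend), changes(revenue))
print(f"correlation of levels:  {levels_r:.2f}")
print(f"correlation of changes: {changes_r:.2f}")
```

The levels correlate almost perfectly because both lines go up over time. The changes do not correlate at all, because in this toy data nothing Spend does from week to week shows up in Revenue.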

Correlations abound in Marketing because so much of Marketing is ineffective. It drives nothing close to what the SVP of Growth and Paid Media is claiming. So the quest begins to hunt for anything at all that moved. It does not matter if none of the Marketing was meant to drive Store Traffic. It moved? Plot it! It does not matter if Organic Social Agency posts cost us $2mil and have an Organic Reach of 1.1%. Did Site Visits go up during this time? Plot it!

I understand the protective instinct. If we really show that Email Marketing is a net negative driver of NPS, we will lose budget, jobs will be gone. I offer that, at least in the privacy of your Email Marketing team… Try not to fool yourself. Know the reality. Work to create a new one. Then, when you get found out by the CFO… You will show causality and not correlation. (To something business positive, of course.)

Tyler's got a ton of these Spurious Correlations, in case you want to start to change the culture in your company… While having some fun…

Spurious Correlations - Butter

Slides tend to be the medium most commonly used to share data, sometimes as visuals, with senior leadership or larger business audiences.

This is done so poorly, at such a global scale, that humanity has come to blame the tool (PowerPoint) rather than the user (you). Incredible.

I obsess about improving slides to present data. See the 50 examples deconstructed here: Great Storytelling With Data: Visualize Simply And Focus Obsessively.

Rule number zero to communicate insights via slides: Do. Not. Data. Puke.

Close behind is this one:

Context is Queen.

Adding context is more art than science.

Adding context helps accelerate understanding of the size/scope of the challenge/opportunity. That in turn ensures impactful insights get through. Which is key to ensuring the right set of actions is taken.

Understanding the use of cigarettes (cig alternatives) is important. This slide from the FDA attempts to do that, see if you can spot opportunities for improvement/hidden agendas…

FDA NYTS e-cigarettes usage

It looks fairly ok. Yes?

Charles Gardner (his Twitter) did not think so.

Here is his fixed version. Please take a moment to absorb each module on the slide and what he has done to add context.

FDA NYTS e-cigarettes usage - More accurate

It is helpful to know how big the problem is – adding the 27 mil, 7.7%, and -61% at the top aids a more accurate understanding.

Stark difference between 1/4 vs. 1/53 on the left. The middle and right are both now sized correctly.

Should the FDA try to get the 2.1 mil number even lower? Yes. Absolutely. Let's do that with a bit more context around the size of the problem, and with a little more understanding of this problem in the context of all others when it comes to teens.

Under the guise of adding more context, Analysts end up stuffing their slides with so much, I'm sorry, crap… You can see nothing.

The very best context comes from understanding our data fully, and from the truth pre-identifying the story worth telling. More art, a bit of science.



No article on the topic of data visualization can ever be complete without an homage to the inimitable xkcd.

xkcd - log scales

I do use log scales to try to fit the data. But. I absolutely appreciate the point being made above, and it is made supremely effectively.
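
For anyone who wants to feel what a log scale does to the picture, here is a minimal sketch that computes where a value lands along an axis, as a fraction from 0.0 to 1.0, on a linear scale versus a log scale. The values are made up and `axis_position` is a hypothetical helper.

```python
import math


def axis_position(value, low, high, log_scale=False):
    """Fraction of the way along an axis (0.0 to 1.0) where value is drawn."""
    if log_scale:
        value, low, high = math.log10(value), math.log10(low), math.log10(high)
    return (value - low) / (high - low)


# Made-up values spanning three orders of magnitude.
for value in (1, 10, 100, 1000):
    linear = axis_position(value, 1, 1000)
    logged = axis_position(value, 1, 1000, log_scale=True)
    print(f"{value:>5}: linear {linear:.3f}  log {logged:.3f}")
```

On the linear axis, 10 sits less than 1% of the way along while 1000 sits at the far end. On the log axis, 10 is already a third of the way there. Everything fits, and the sheer size of the gap quietly disappears.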

: )


Time-tested Rules for Impactful Data Visuals.

If you are looking to email your co-workers a summary of the lessons you learned today, to stop abuse via data visualization…

1. Comparison by angle is significantly more difficult than by length.

2. Don't make things more complicated than they need to be.

3. Never manipulate the x, y, z axes. And, if you do, call attention to it!

4. Rules should never be applied blindly.

5. Do not show all the data you have, unless *all* the data adds to the story.

6. Apply massive restraint when it comes to the urge to be "cool" and "clever."

7. Be vigilant: Never imply correlation = causation!

8. Context is Queen – it comes from understanding our data fully, and from the truth pre-identifying the story worth telling.


Bottom Line.

Awesome quote, by Andrew Lang:

“Most people use statistics like a drunk man uses a lamppost; more for support than illumination”

Let us resolve to use data visualization for illumination.

Carpe diem.

Comments

  1. Anastasia Clarkson says

    I think there is a place for pie charts to emphasize a single value (whether there are 2 values or many) because people are so familiar with clock faces (the first slice should start at 12:00).

    When it comes to digital versus analog time displays, each has advantages. The digital display requires less cognitive load to answer the question "What time is it?" but an analog display requires less cognitive load to answer contextual questions like "How long since or until?"

    I find a pie chart helpful when there are a lot of numbers and/or charts nearby. In a dashboard or on a slide, it can act like a little park bench on the cognitive hiking trail: pause here, take this in.
