There has been a lot of heartbreak around the world with the CV-19 pandemic.

This chart, from NPR, illustrates some cause for optimism. It shows the *7-day average new cases per day* across the world.

It is crucial to acknowledge what’s hidden in the aggregated trend above: The impact on individual countries is variable.

A large percentage of humans on the planet remain under threat. We don’t have nearly enough vaccines finding arms. We have to remain vigilant, and commit to getting the entire planet vaccinated.

Recent worries about Covid were increased by the proliferation of virus variants around the world. Variant B.1.1.7 was first identified in the UK. Variant B.1.351 was first identified in South Africa. Variant P.1 in Brazil has 17 unique mutations. The variant identified in India, B.1.617.2, had a particularly devastating impact (see the blue spike above). There are multiple "variants of interest" in the United States, Philippines, Vietnam, and other countries.

A particularly dangerous thing about variants is that they are highly transmissible (evolution, sadly, in action).

Some journalists rush to point out, hey, the death rate remains the same.

I believe this is a mistake. It imprecisely minimizes the danger, and results in some of our fellow humans feeling a false sense of hope. This is possibly due to a lack of mathematical savvy.

As Analysts, you can appreciate that a lay individual might not quite understand the complexity behind infection rates, and the impact on death rates. At the same time, all of us, journalists and Analysts, have to figure out how to communicate this type of insight in a way that everyone can understand.

This reality is similar to what we face in our business environment every single day. We have too much data. It is complicated. There are a lot of things happening below the surface. We somehow have to figure out how to preserve the complexity, but be able to communicate it simply.

Inspired by the work done by **Adam Kucharski**, mathematician, epidemiologist, and **Mona Chalabi**, Data Editor at Guardian US, I want to showcase how we can simplify complexity – in this case using Covid data, but the lessons apply across multiple use cases for Analysts.

**The Problem.**

Let’s simplify the challenge of explaining the problem we face with these variants down to these two scenarios:

Which of these is more dangerous:

**1.** A variant that’s 50% more deadly?

Or

**2.** A variant with 50% increase in transmission?

Take a pause with everything you know about Covid and math.

Ponder the problem. What do you think the answer is?

Get a Post-It. Do some rough computations. Note your assumptions.

Did you choose #1 or #2?

Ready?

**The Answer.**

Have you heard someone say humans are not wired to understand the impact of compounding interest?

Applies here as well.

As Adam explained:

*An increase in something that grows exponentially (transmission, in this case) can have far more effect than the same proportional increase in something that just scales an outcome.*

Or, in English, as I suspect you all already noted above: #2 is the worse scenario.

It is far worse if the new virus variant is 50% more transmissible.

**The Answer in Equations.**

**R** represents the **reproduction number**. Let’s assume R to be 1.1.

This means that every 10 people who are infected will infect another 11 people.

**[**You can assume whatever R, math still works**.]**

**F** represents the **fatality rate**. Let’s assume F to be 0.8%.

This means that 8 out of every 1,000 people who get the virus will die.

**G** represents **generation time**. Let’s assume G to be 6 days.

This means that from the time that someone is exposed, it takes around 6 days for them to infect the next person. So each month, the virus can generate about 5 times. (30/6=5)

One final assumption, let’s assume 1,000 people were infected.

**Scenario Normal**

1,000 x (1.1^5) x (0.8%) = **12.9** fatalities after 1 month.

**[**For extreme clarity: 1.1^5 denotes 1.1 to the power of 5.**]**

**Scenario 50% More Deadly**

1,000 x (1.1^5) x (0.8% x 1.5) = **19.3** fatalities after 1 month.

A sad increase for sure. But. Wait.

**Scenario 50% More Transmissible**

1,000 x ((1.1 x 1.5)^5) x 0.8% = **97.8** fatalities after 1 month.

OMG.

**12.9 to 97.8.**
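For anyone who would rather check the arithmetic than trust it, here is a minimal Python sketch of the three scenarios. The function name `fatalities_after_month` and its parameter names are mine, for illustration only; the inputs are the assumptions stated above (1,000 infected, R = 1.1, F = 0.8%, G = 6 days, a 30-day month).

```python
def fatalities_after_month(initial_infected=1_000, r=1.1, fatality_rate=0.008,
                           generation_days=6, days=30):
    """Expected fatalities using the post's simplified model.

    infections = initial_infected * r ** generations
    fatalities = infections * fatality_rate
    """
    generations = days / generation_days  # 30 / 6 = 5 generations in a month
    return initial_infected * r ** generations * fatality_rate


normal = fatalities_after_month()
more_deadly = fatalities_after_month(fatality_rate=0.008 * 1.5)
more_transmissible = fatalities_after_month(r=1.1 * 1.5)

print(f"Normal:                 {normal:.1f}")
print(f"50% more deadly:        {more_deadly:.1f}")
print(f"50% more transmissible: {more_transmissible:.1f}")
```

The extra deadliness multiplies the result once (x 1.5). The extra transmissibility gets multiplied in at every generation (1.5^5, roughly 7.6x), and that ratio stays the same whatever R you start with.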

This is why all the scientists, and ultimately Boris J as well, got so freaked out about a variant that was 50% to 75% more transmissible.

The math is scary at the higher end of that range.
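To see how quickly things escalate across that range, here is a small sketch under the same assumptions as the formulas above (1,000 infected, R = 1.1, F = 0.8%, 5 generations in a month). The helper name `fatalities` and the particular steps I picked inside the range (50%, 60%, 75%) are my own choices for illustration.

```python
def fatalities(transmission_increase, initial_infected=1_000, r=1.1,
               fatality_rate=0.008, generations=5):
    """Expected fatalities when the variant is `transmission_increase` more transmissible."""
    boosted_r = r * (1 + transmission_increase)
    return initial_infected * boosted_r ** generations * fatality_rate


baseline = fatalities(0.0)
for increase in (0.50, 0.60, 0.75):
    deaths = fatalities(increase)
    print(f"{increase:.0%} more transmissible: {deaths:6.1f} fatalities "
          f"({deaths / baseline:.1f}x the baseline)")
```

At the top of the range, the same 1,000 infections lead to roughly 16 times the baseline fatalities (1.75^5).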

If folks in your circle are less well versed in exponential growth (in their bank account, in loss in retail stores, or pandemics), do please take a moment to illustrate that for them.

You’ll be helping them think smarter.

Analysts typically feel that they are done at this stage. They got the data. They got the formula. They did the math. It is all so clear.

And, they are right. It is clear. But. It is not yet as accessible as it could be.

I urge you to think about accessibility of your work.

**The Answer Visualized.**

The “problem”, if I may use that ugly word, with the formulas above, is that they are a little bit *dry*.

Well. Maybe, it is more accurate to say: They are not quite as accessible.

A lot of people have a natural aversion to math. Even simple formulas like the one above can seem intimidating.

In life, and at work, perhaps the #1 job we have as Analysts is to be able to explain data in a way that’ll be understood.

This is where **Mona** stepped in to help. She built on **Adam’s** excellent insights and explanation, and sketched some pictures that made the analysis potentially accessible to everyone on the planet.

Let’s do the exercise again.

**Scenario Normal.**

You’ll recall from above, the normal Covid scenario was:

1,000 x (1.1^5) x (0.8%) = **12.9** fatalities after 1 month.

Here’s Mona’s simplified visualization to make the formula a ton more accessible to all humans of the planet:

Isn't it more accessible compared to the formula?

Of course it is.

We all have this opportunity in our day-to-day business work (in addition to an opportunity for all journalists who have to make this, literally, life and death data more accessible).

Let's keep going.

**Scenario 50% More Deadly.**

The challenge is to simply visualize this formula:

1,000 x (1.1^5) x (0.8% x 1.5) = **19.3** fatalities after 1 month.

Here’s the visual:

Simple. Effective.

And now to our final scenario.

**Scenario 50% More Transmissible.**

Our formula for the impact of 50% more transmissible:

1,000 x ((1.1 x 1.5)^5) x 0.8% = **97.8** fatalities after 1 month.

Mona’s visual to illustrate the impact:

The simple visualizations, (possibly hand drawn?), make the data exponentially (there’s that word again) accessible.

While the data is heartbreaking, I had a momentary smile from appreciation for Mona and Adam for this timely and accessible lesson for all of us.
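If you can't sketch, even a few lines of code can move numbers toward a picture. This is not Mona's method or a reproduction of her drawings, just a rough text-only sketch of the same three results, with one mark per (rounded) fatality. The `scenarios` dictionary, the `bars` name, and the choice of marker are mine.

```python
# Fatalities after 1 month, taken from the three formulas in this post.
scenarios = {
    "Normal": 12.9,
    "50% more deadly": 19.3,
    "50% more transmissible": 97.8,
}

# One "#" per fatality, rounded to a whole person.
bars = {name: "#" * round(deaths) for name, deaths in scenarios.items()}

label_width = max(len(name) for name in scenarios)
for name, deaths in scenarios.items():
    print(f"{name:<{label_width}} | {bars[name]} {deaths}")
```

Even in plain text, the third row running off to the right makes the point faster than the formula does.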

**The Answer Visualized, Try 2.**

There are many other ways to illustrate this data.

One thought I had was, I wonder if we should have three boxes?

One for 1,000 infected. A second one for the increase in infections (huge in scenario three).

A third one for the unfortunate increase in deaths.

More information to sketch, I’m so wary of clutter in these cases. Something to sketch and see how it comes out.
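The numbers that would go into those three boxes can be pulled out of the same formulas. This is a rough sketch. Reading the second box as "infections in the fifth generation minus the starting 1,000" and the third as "fatalities after a month" is my assumption, as are the names `Boxes` and `three_boxes`.

```python
from dataclasses import dataclass


@dataclass
class Boxes:
    starting_infected: float
    added_infections: float
    fatalities: float


def three_boxes(r, fatality_rate, initial_infected=1_000, generations=5):
    """Split one scenario into the three quantities suggested for the boxes."""
    infections = initial_infected * r ** generations
    return Boxes(
        starting_infected=initial_infected,
        added_infections=infections - initial_infected,
        fatalities=infections * fatality_rate,
    )


scenarios = {
    "Normal": three_boxes(r=1.1, fatality_rate=0.008),
    "50% more deadly": three_boxes(r=1.1, fatality_rate=0.008 * 1.5),
    "50% more transmissible": three_boxes(r=1.1 * 1.5, fatality_rate=0.008),
}

for name, boxes in scenarios.items():
    print(f"{name}: start {boxes.starting_infected:,.0f}, "
          f"added infections {boxes.added_infections:,.0f}, "
          f"fatalities {boxes.fatalities:.1f}")
```

The middle box is identical in the first two scenarios and balloons only in the third, which is exactly what the extra box would need to show.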

My friend Kaiser Fung illustrated an alternative approach in a recent post on his excellent, and I really mean excellent, blog Junk Charts.

For a (poor) visualization used in a video published in Germany, showing the danger posed by new variants, Kaiser whipped up R = 1.0 and R = 1.4.

**[**Note: Being 50% to 70% more transmissible means the reproduction rate goes from 1 to 1.4.**]**

Here’s R = 1.0.

A very different approach to simplifying the complexity in the data, and a very different approach from the ones earlier in this post.

All in all, such an interesting visual.

I like that in a way it captures the haphazardness / randomness of the actual spread.

And here’s what happens when R = 1.4.

It depicts something truly heartbreaking, but does so in a mesmerizing way.

**[**I grew up in India, R1.4 reminds me of a mandala.**]**

More infections obviously mean more deaths (holding the fatality rate constant as above).
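The gap between R = 1.0 and R = 1.4 is easy to reproduce in numbers. This minimal sketch tracks expected infections per generation, starting from a single case. The 10 generations and the function name `infections_by_generation` are my assumptions, not something taken from Kaiser's post, and real spread is random rather than this smooth.

```python
def infections_by_generation(r, generations=10, initial=1):
    """Expected number of new infections in each generation for reproduction number r."""
    return [initial * r ** g for g in range(generations + 1)]


for r in (1.0, 1.4):
    series = infections_by_generation(r)
    print(f"R = {r}: " + ", ".join(f"{x:.1f}" for x in series))
```

At R = 1.0 every generation stays at 1 case. At R = 1.4 the tenth generation alone is roughly 29 cases.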

The super nerd in me loves Kaiser’s version. There is an organic chemistry *virusy* nature of the visual that holds a certain appeal.

My experience would suggest that Mona’s is unquestionably more accessible. I would use something like Kaiser’s for certain audiences.

I wanted to share Adam's formulas, Mona's visuals and Kaiser's *mandalas* to highlight the diversity in the paths we can take on the quest for data accessibility.

**Bottom line.**

In the quest to communicate your insights more clearly, you can pick the path that works optimally for your audience knowing that there are multiple paths for simplifying complexity.

However you choose to do it, I urge you to figure out how to convert your numbers, assumptions, and formulas into a visual story that’ll make your insights more accessible.

Good for you. Good for the audience. Good for your company/planet.

Carpe diem!

As always, it is your turn now.

Please share your critique, reflections, and your lessons from the approaches you have taken in your quest to simplify data’s complexity. Thank you.

What an amazing and timely example of how to move beyond our comfort level to deliver a story.

Even as an analyst with a statistics background I found the three visuals to be the best way to help a large group of people understand underlying facts to help them make better decisions.

Far too often I'm just as guilty as the next person and fall in love with my own analytical sophistication.

Thanks for a great reminder.

Great visuals, I'm already translating the approach for my business dashboards to improve impact.

One potential helpful thing might be to show how many more people are getting infected. For example, the starts are the same but after one month there will be way more infections in the third one.

Does not help the cumulative view, but a bit more insight.

Michael.

Michael: You are right, it would be interesting (especially for us Analysts) even if the cumulative impact won't be there. For most lay people, I suspect Mona's simple input < > output approach suffices in understanding which scenario is worse (and hopefully to drive better behavior like mask-wearing, vaccines, etc.).

Thank you.

Avinash.

Your basic assumption is flawed. It's based on PERCENTAGES. Percentages aren't "real numbers" they simply describe the relationship between two numbers. What are those numbers? Who did the counting? When was it done? How big was the data set?

Seek the source, and find out the raw data behind ANY percentage calculation before you use it to support anything else. Most percentages are unsourced, and 76.3% are made up.

They are lies, damn lies, statistics, and percentages. -Wilfred

Wilfred: I'm sorry, I don't understand your comment. I think you are trying to say something very important but I'm unable to get your point. Would you please care to try another way to explain? I'll be happy to engage.

The percentages used in the computations are based on real-world observations/results. In applying them to the various scenarios, we are using raw numbers, as you encourage us to. You sense my confusion.

Thank you.

-Avinash.

I've had a "math allergy" since early schooling and even I understood the formulas and results. Kudos Avinash.

One thing you left unsaid is that the outcomes are best case scenarios. As infections increase, or fatalities, it places an extraordinary burden on the healthcare system and associated services. That in turn will make the predictive outcomes in your post worse.

Thank you for a great lesson today.

As a visual learner I immensely appreciate the approach you've taken in this blog post. It makes complicated data accessible. I have forwarded this to a ton of people who need to understand this valuable information.

I find the numbers view also helpful in this type of a situation, for those who are doubtful where the "pictures" came from it helps demystify.

The visualization by Kaiser Fung is really interesting.

It is so hard to feel that a change from R=1.0 to R=1.4 would have a revolutionary impact. In fact to lay folks those two numbers are basically the same. Yet, when you visualize the infections after 10 steps the impact is stunning.

For some people – in all countries – no amount of data will be enough, as demonstrated by the "anti-mask" rallies. I believe there is a significant amount of the population that are in the gray zone and for them data can swing them to positive action for themselves and others. This is who we are solving for.

Thanks.