[ NOTE: There is an updated version of this post & visualization: Click Here. ]
We all deal with data and we do our best to crunch it and munch it and scrunch it and beat it to death to find some actionable insights. Then we go up against a “committee” or to a decision maker and present our insights only to cause a lot less action to be taken than we had originally anticipated.
Effective communication of reports / data / insights / tables / graphs / stuff is extremely hard, and it is harder still in the complex ecosystem that is the web where there are sixteen things changing all the time and many numbers look different and data is hard to trust. (All the more reason that we internalise the limitations and “get over it”.)
My thoughts went to all of the above when I first saw: Death and Taxes: A visual look at where your tax dollars go by Jesse Bachman. Here is a smaller version of Jesse’s amazing piece of art:
The visual you see above represents how the US government spends its budget. (Edit: The graph shows the "2004 Federal Discretionary Budget", so it excludes "fixed costs" such as social security and medicare. After "entitlements", the graph shows how our elected representatives spend the portion of our money they are allowed to spend.) Talk about representing something extremely complex that perhaps very few people in the US government have a handle on.
It took Jesse a full year to produce this visual. Jesse’s self-stated goal was: “Most people are unaware of how much of their taxes fund our military, and those aware are often misinformed. Well here it is. Laid out, easy to read and compare. With data straight from the White House.”
The shrunk-down version above does not do the visual justice; please see the full size version. The first post on that page contains a link to the full version, which is 3500×2333 and 1.8 MB. Viewed at the highest resolution your monitor will support, I am positive that you’ll be impressed by all the detail, regardless of your political leanings.
Indeed the true majesty of Jesse’s visual, and a testament to the hard work it took, is that it is really easy to understand with very little explanation. I would even go so far as to say that it requires no explanation at all. Everyone from the simplest of brains (mine) to the most complex of brains (say Dr. Hawking) would find insights right away (he sooner than I of course : )).
Each of us will see this visual and interpret something different. Above you see details of spending on education and NASA. The author raises these questions:
- Why do we spend more on jets than we do on public housing?
- Why is the Endowment for the Arts so small?
- What's with all this foreign military financing?
You will see other things that you might find interesting. Without betraying my political thoughts I can say that for me this was a surprising piece of data in the discussion that is on the deviantart website: “The US Institute of Peace receives 27 million dollars next year. The Defence Department receives 560 … billion.”
So what’s this got to do with the world of Web Insights?
All too often we are tasked to run reports in our favourite Web Analytics tool or present our analysis of our survey or lab usability data. Often we fall back on what Omniture or WebTrends or ClickTracks or HBX can provide; they all make it easy to export into Excel. Sometimes we will take raw data and give it a spin in Excel ourselves.
For me visuals that provide awesome insights come from 1) a deep understanding of the goal / objectives and 2) thinking beyond what standard trend lines or stacked bar graphs can provide. Something non-normal that grabs attention and yet communicates insights (that is, it already contains recommendations and action items and not just data, as Jesse’s great visual does).
I’ll use my blog data to illustrate these principles. The goal for my blog, or perhaps any blog, is to engage its readers.
“Engagement”, or the much maligned “Stickiness”, for me is a combination of readers spending more time on the website (hence consuming content) and page views (this is harder on the blog where the first page has 12 posts on it).
There are many different ways to graph this information (close your eyes and imagine the many possibilities : )).
Inspired by Jesse’s visual I used ClickTracks to segment the data by my top sources of traffic and then created this graph to help me identify the most valuable sources of traffic as defined by three things:
- Amount of traffic from different sources (there is some duplication: a non-US visitor could also come from the Yahoo Group)
- # of pages viewed in each session
- Average number of seconds spent on the blog
Here is the resulting graph (please click on the image below for a larger version of the graph):
This does not take long. It took me around 15 mins from first thinking about what I could take from Jesse’s visual to creating the segments in ClickTracks and then the graph in Excel.
Each circle represents a traffic source, and the size of the circle represents the size of the traffic. Time on Site is on the X axis and Page Views per Visitor on the Y axis.
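For anyone who wants to rebuild this kind of bubble graph outside Excel, here is a minimal sketch in Python of the one step that is easy to get wrong: sizing the circles. It assumes the circle's area, not its radius, should be proportional to traffic. The source names, numbers, and the `bubble_radius` and `bubble_points` helpers are hypothetical and are not the data or method behind the graph above.

```python
import math

# Hypothetical per-source numbers, for illustration only (not the blog's data).
# Each row: (name, visits, avg seconds on site, page views per visitor)
sources = [
    ("Source A", 400, 120.0, 2.5),
    ("Source B", 100, 60.0, 1.5),
]

def bubble_radius(visits, max_visits, max_radius=40.0):
    """Radius chosen so that the circle's AREA is proportional to visits."""
    return max_radius * math.sqrt(visits / max_visits)

def bubble_points(rows, max_radius=40.0):
    """Turn rows into plot points: x = time on site, y = page views, r = radius."""
    max_visits = max(visits for _, visits, _, _ in rows)
    return [
        {
            "name": name,
            "x": seconds,
            "y": pages,
            "r": bubble_radius(visits, max_visits, max_radius),
        }
        for name, visits, seconds, pages in rows
    ]

points = bubble_points(sources)
for p in points:
    print(p["name"], p["x"], p["y"], round(p["r"], 1))
```

Scaling the radius directly would make a source with four times the traffic look sixteen times bigger, so the square root keeps the visual comparison honest.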
While a million miles away from Jesse’s excellence, the graph above is quite effective in my little world, for my little problem, at telling me a few key insights (and as I promised, each insight provides an action):
- Love the people in the Yahoo Web Analytics Group: a good chunk of them come and stay and read. Action: Print t-shirts with the blog’s address and send them out.
- People searching for me by name, while a small group, seem to be the next best source of traffic, and they come from Google. Action: The WordPress plugin that auto-generates a site map and uploads it to Google after each new blog post was a great idea.
- I am surprised that almost all segments fall beyond 50 seconds, higher than I would have anticipated. Action: Content is king baby.
What do you think, either of Jesse’s work or of my much smaller inspired attempt? If you had the same three dimensions for blog success, do you have an idea for visualising them more effectively? I would love to have your feedback. Positive, negative or otherwise.
PS: The post this week comes from one of the most visually appealing places in the world, Hawaii. Aloha!!
I am constantly wowed by clear visualizations myself. I stumbled upon a compelling nutrition data site this weekend. I am truly impressed by their (wisely copyrighted) visualizations of key data points, and the way they combine related points. Could you imagine doing the pyramid one for three k-kpi's? Or any of them really …
Jen: Excellent pointer, very compelling visuals. Sadly it also means that I will never again eat my favourite cheese cake. : )
Perhaps my #1 example of visualization is Charles Joseph Minard's graph of Napoleon's advance into Russia. Here is the Wikipedia entry and here is a bit more background. It is from 1869 and it is a classic.
For buffs of the American Civil War, here is a Minard-inspired Civil War equivalent.
Thanks for taking the time to comment Jen.
The visual by Jesse Bachman is a cool example of how complex hierarchical data can be presented on one page. In fact we can "drill" down on the same page without a call for another page or object. This is how analytical dashboards should be done.
As for the blog success, I would try to add % or counts of comments by segment, shown by shades of blue (like in ClickTracks), if it is possible to capture them (e.g., visits to the "thank you for your comment" page). Well, it would make such a graph less colorful. Another choice would be inspired by Jesse: draw lines to the segments and show visitor/comment counts.
Avinash,
Only yesterday I was led from your blog to Clint Ivy's blog and hence to http://perceptualedge.com/library.htm#. It contains interesting pieces, among others on "design practices for BI dashboard design". Also interesting was Edward Tufte's website.
In context I guess I'd agree with what I read in one of the articles:
"Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away. -Antoine de Saint-Exupery".
As always, great post. I am loving this stuff!!
Avinash,
Excellent Post – and it draws freely, from subjects outside of Web Analytics, but which relate to Analytics.
The charts you show plus those mentioned in comments to this post show me how much "Art" is contained in Web Metrics – Visualizing data in a meaningful way is…..finding what's important and amplifying it – highlighting it – and if that's not Art, I don't know what is.
Having said that – I never know what I'll find when I pull data, say, at IBM or for some of my own SEO clients. I like to think as if we're "detectives" and "channels" to find what's actually meaningful to represent.
When I was brought in to lend meaning to IBM US Homepage Metrics it took a while to figure out what it was that my stakeholders wanted to know. I pulled a bunch of data, but found myself "groping" for a while, not quite sure what would be valuable (much as Cezanne would paint a landscape trying to find the ideal form of a motif, say Mont Sainte-Victoire).
I happened on a relationship between unique visitors to the Homepage and Higher Click Through (Next Clicks) on the Lead Promotion module and started pulling data to show that relationship.
Result: I showed that our new template was more effective in "engaging" visitors (based on the frequency they would click on a link on the landing page). But I would not have known what I would find when I started – I just pull data and ideas "come to me" just as the colors spoke directly to Cezanne. The work itself – Metrics, or Art, dictate the next steps.
I just like knowing that I am not the only person who blogs on vacation.
Robbin
Great stuff as usual.
I still say that if anyone would only consider subscribing to one blog on the topic of web analytics it should be your blog.
I've seen Jesse's visualization before, but was just thinking about it when you posted.
You seem to time your posts pretty well to my thoughts at the moment.
Nice photos taken of beautiful Hawaii on your Sony DSC-H1. :-)
Gotta dig out my Nikon D70 and take it out for a trip. If only it wasn't so heavy with all the lenses, tripods, meters, filters etc. Got a compact Sony too, but it's off for repair (factory error).
Hey, you might be coming up on digg.com, so just be prepared for a slammin' if you get on the main page. Great article, really enjoyed it!
Lookin' out for the site,
jtibble
Avinash,
You are a true American. Thanks to the power of technology to bring light to these kinds of facts, the PEOPLE will be able to know and decide.
The only place I've seen that communicates the nature of our military budget better is http://www.truemajority.org/oreos.
This site was set up by Ben Cohen of Ben & Jerry's ice cream and uses stacks of Oreo cookies to illustrate.
As soon as I saw the little flash animation they did I was shocked and I'm a CPA that has looked into these numbers before.
Thanks to folks like you, this information will get out and then we'll all be able to make better decisions about it.
Thanks,
Good article. But I disagree with your graph. It would be great if one could trust the numbers behind it. The 'time on site' is presented in an authoritative way by all web analytics software but I would advise caution. It is generated from the sequence of entries in your log file that are associated with a visitor (IP address, cookie, user agent, etc. being used to identify him/her). As long as the user stays on your web site the number may be somewhat accurate, but as soon as the user leaves you have no record that the user has gone off to fresher pastures. In other words, the last page that a reader visits on your site can have a high 'time on site' value purely because the analytics software doesn't (and can't) know what the user did next and times out the visit/session after 30 minutes (default value in WebTrends). Further, 'time on site' could also be spent drinking coffee or chatting at the water fountain and need not mean that the user is actually reading an article.
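To make the limitation concrete, here is a minimal sketch in Python of deriving time on site from one visitor's request timestamps. It assumes the simplest convention, where each page's duration is the gap until the next request and the last page is left unknown; real tools differ in how they treat that final page. The function names and the sample timestamps are hypothetical.

```python
from datetime import datetime

def per_page_seconds(hit_times):
    """Seconds spent on each page: the gap until the next request.

    The final page has no following request, so its duration is unknown (None).
    """
    ordered = sorted(hit_times)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
    return gaps + [None] if ordered else []

def time_on_site_seconds(hit_times):
    """Measured time on site: last request minus first request."""
    if not hit_times:
        return 0.0
    ordered = sorted(hit_times)
    return (ordered[-1] - ordered[0]).total_seconds()

# Hypothetical visit: three page requests from one visitor.
hits = [
    datetime(2006, 7, 1, 10, 0, 0),
    datetime(2006, 7, 1, 10, 0, 40),
    datetime(2006, 7, 1, 10, 2, 0),
]
print(per_page_seconds(hits))
print(time_on_site_seconds(hits))
```

A single-page visit comes out as zero seconds under this convention, however long the reader actually stayed, which is one reason to treat the number with caution.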
Apart from this minor criticism I found your article valuable.
It's a nice looking visualization, but it leaves out the perspective of the greater system and that's what makes it really dangerous.
It explicitly ignores the mandatory budget which in the past few years has been around 400 billion more than the discretionary budget visualized. The amount that we spend on Social Security, Medicare and Medicaid has been larger than the entire discretionary budget for at least the past two years and I'm sure for a good time before that.
Also beyond its scope are the state and local budgets, which wouldn't matter here if it weren't for the overlaps in purpose such as transportation and education. Factoring in state and local expenditures, we spend almost twice as much on education as we do on the military.
Visualizations are great for simplifying data and giving a perspective, but the perspective will always have a bias in it whether fair or not. This chart looks like it wants to present a sharp contrast between military spending and other spending without bothering to put it in a more complete view. Rather irresponsible when it calls for political action right in the middle of the graphic.
http://www.whitehouse.gov/omb/budget/fy2004/tables.html
I followed the link to Jesse Bachman's graph (the very large one) and started reading the text. She noted that her graph shows only what congress has "control" over as far as spending and since congress has no control over programs such as social security and medicare, that she left those out for clarity. Well if I am not mistaken here, congress has full control over those aspects of the budget and chooses not to change them as they spiral out of control. It will not be the military spending that will crash the budget in about 10 to 20 years.
So what Jesse has shown us is that you can be deceptive with complex, visually appealing graphs as well as with simple ones.
I'd like to see the federal budget information in a treemap.
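For readers who have not met treemaps, here is a minimal sketch in Python of the simplest layout, a single-level "slice" that gives each item a rectangle whose area is proportional to its value. The `slice_layout` function is hypothetical, and the two figures are the $399B military and $383B non-military totals quoted by another commenter in this thread. A real treemap would recurse into each rectangle for the departments inside it.

```python
def slice_layout(items, x, y, width, height, vertical=True):
    """Split one rectangle into strips whose areas are proportional to values.

    items: list of (label, value) pairs.
    Returns a list of (label, x, y, width, height) rectangles.
    """
    total = sum(value for _, value in items)
    rects = []
    offset = 0.0
    for label, value in items:
        share = value / total
        if vertical:
            # Side-by-side strips: full height, width proportional to value.
            strip = width * share
            rects.append((label, x + offset, y, strip, height))
        else:
            # Stacked strips: full width, height proportional to value.
            strip = height * share
            rects.append((label, x, y + offset, width, strip))
        offset += strip
    return rects

# Totals in billions of dollars, as quoted elsewhere in the comments.
budget = [("Military", 399), ("Non-military", 383)]
rects = slice_layout(budget, 0, 0, 782, 100)
for label, rx, ry, rw, rh in rects:
    print(label, round(rx, 1), round(ry, 1), round(rw, 1), round(rh, 1))
```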
Great article! I'm going to rethink how I present data to my high school students (I'll definitely use Jen's suggestion from comment #1). Also, it would be interesting to see how the Digg referrals fit onto your graph. Do they read the posts or just look at the pictures?
The subtitle of Bachman's chart, "A visual look at where your tax dollars go," is inaccurate and misleading because Social Security and Medicare are indeed taxes, yet neither of them is included in the data.
A more accurate (albeit drier) title for the chart is found in the center circle: "The 2004 Federal Discretionary Budget."
jtibble: Your comment puzzled me, I was not sure what slammin' would be if I got to the main page. Well I woke up this morning (in Hawaii) and found out what it means to be on the home page of digg.com. Indeed got a slammin'. Thanks for looking out for me. : )
Raju Varghese: I think your caution applies to pretty much all the delightful numbers that we get out of our web analytics applications. : ) That said, most javascript tag based solutions do a better job of capturing "time on site" because they combine session cookies with the javascript tags on each page. They will terminate your session if you leave the site or close your browser. There is still a problem if the last page was left open in the browser and the user walked away (in this case the session is terminated after 29 mins of inactivity in almost all applications). So not perfect, but much better than weblog based solutions.
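The inactivity rule is easy to picture in code. Here is a minimal sketch in Python that cuts one visitor's hits into sessions whenever the gap between two consecutive hits exceeds a timeout. The `split_sessions` function and the sample timestamps are hypothetical, and the 30-minute default is an assumption that individual tools may set differently.

```python
from datetime import datetime, timedelta

def split_sessions(hit_times, timeout=timedelta(minutes=30)):
    """Group timestamps into sessions, starting a new one after each long gap."""
    sessions = []
    for moment in sorted(hit_times):
        if sessions and moment - sessions[-1][-1] <= timeout:
            # Within the timeout of the previous hit: same session.
            sessions[-1].append(moment)
        else:
            # First hit, or the visitor was idle too long: new session.
            sessions.append([moment])
    return sessions

# Hypothetical visitor: two hits close together, then a return 50 minutes later.
hits = [
    datetime(2006, 7, 1, 10, 0),
    datetime(2006, 7, 1, 10, 10),
    datetime(2006, 7, 1, 11, 0),
]
sessions = split_sessions(hits)
print([len(session) for session in sessions])
```

A reader who leaves the last page open and walks away is still one session under this rule, and no timeout setting can tell the tool how long that page was actually read.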
dcmet, Patrick: Great point on what the graph displays, just discretionary spending. I should have made that more explicit (have edited the post to make that very clear).
Jesse's graph still is very relevant, with the same level of insights. IMHO if the government had discretion over all of the budget you can rest assured that similar lopsided spending, as shown in the graph, would persist. Would you agree about that? (Again regardless of politics and regardless of what political party is in power.)
The reason that the government spends as much as it does on the military vs. public housing etc. is simply that there won't be any public housing if we get invaded. Granted, the government needs to do a better job of funding what needs to be funded rather than overkill on certain less than useful projects, but clearly we do need the air force, we do need an army and we do need to spend money on things like our space program (not necessarily NASA) so that we do not fall behind the rest of the world.
That was an interesting visual map. Wonder if anything like this exists for Singapore?
But the congress does have discretion over both the mandatory and discretionary budgets. It's just with the discretionary budget they have to opt in every year instead of letting the status quo continue.
If the graph included the mandatory budget as well, it would still show a very lopsided picture, just without all the visual weight on the US military. It would still be an interesting diagram, but it wouldn't have the impact with the author's intended audience.
The question is what people are supposed to get out of just a chart. Your chart of hits has a very easy heuristic. We know hits are good, and the bubble's size is larger for a source that generates more hits, so the bigger the bubble, the better the source.
For the budget graph, the size of the bubble is related to the size of the budget. For the graph to have meaning, there needs to be a heuristic that the audience uses to interpret it. The belief that I expect the author wants his audience to have is they either don't like the military, or don't like how it's being used. That makes for an easy heuristic when the majority of the shown budget is for something they don't favor. I really don't see a point to the chart that's anything other than political. It doesn't try to change anyone's view or give them a new perspective, but instead just tries to motivate action with the otherwise complacent. The budget chart is just preaching to the choir, and, to me, makes it much less useful than your chart on referrers.
Bachman's chart doesn't make this totally clear, but I'm assuming the center bubble represents the total $782B, and the DoD and Congress bubbles to the left and right of it represent the $399B military and $383B non-military budgets.
In which case, this design seems to imply that there's some other entity initially slicing off $399B for the military, and giving the remaining $383B to Congress to allocate. But of course there is no other entity — Congress sets the DoD budget, and everything else on this chart.
Therefore, showing Congress as having only $383B of the total is misleading. To make this chart work, the bubble on the right should be "Non-Military spending"… which of course isn't a department/agency (like every other circle on the chart), just an arbitrary distinction made by Bachman.
For a couple of years I've been working on a way for people to visualize what's in their information landscape: computer files, web pages, notes and scraps of files. I tried software to make concept and mind maps with attached files. This helped when trying to understand the information I have and how it's related. But the drawback I always found was that the 2D field is very limiting in any kind of serious project, both by how much you can see at once, and pretty soon by how much the software will accept. So I went 3D. There's an example of the kind of thing that can result here. I have found that my picture of complex bodies of reference information improves a lot as a result. You can fly around the landscape and zoom in to pick out the detail. Then have it redrawn centered on a new topic and a new way of looking at the information appears.
Avinash, I was typing a comment on this post and of course I got carried away and turned the comment into a full post at http://julien.coquet.free.fr/?p=174
Good titles and descriptions are an essential part of effective communication, and I found myself a bit puzzled by the "death and taxes" phrase in the title. I think a better title for the chart would have been "Guns and Butter: A visual look at where your tax dollars go.", an allusion to the guns and butter economic theory, which would well describe the trade-off between military and non-military spending that the chart's creator is trying to convey. It's just a thought on how this already fascinating visualisation of government spending could be even more effective.
As a non-analytics person, I really appreciate the power of the visualization of data, and how a good visualization is an unsurpassed tool to convey meaning, significantly more so than a wash of numbers.
I'm surprised that no one has yet posted a comment relating to Edward Tufte, who is the grandfather of data visualization IMO. His book, "The Visual Display of Quantitative Information" is a classic text about Avinash's topic. The seminal example he uses is a map created by Charles Joseph Minard. It portrays the losses suffered by Napoleon's army in the Russian campaign of 1812 — I'm posting a link to a copy of it: Napoleon's Campaign in Russia, 1812
It's pretty hard to see in this link, but this map is really a masterpiece. It conveys 6 pieces of information — the size of the army, its location on a 2-dimensional surface, direction of the army's movement, and temperature on various dates during the retreat from Moscow. And it's all presented in an easy-to-read manner — just about anyone can get this just by looking at the visualization without reading the fine print.
I think Avinash makes a great point — in this context, a picture is certainly worth a thousand words.
I just discovered the blog with this post, and I was very impressed. Most of the analytics blogs out there are obsessed with the data and how to collect it, but you've really hit a nerve with me on how to make it *useful*.
I consult on decision analysis and I usually spend a heck of a lot more time working with the client on making data useful (through visualizations and a useful decision-making process) than on the tools. Thanks for the post.
R,
bigskythinker
shameless plugs:
website: http://www.bigskyassociates.com
blog: http://thinking.bigskyassociates.com
If you want to really learn about data visualization read Edward Tufte's books, he is the master at this. For one thing, using circles and pie charts will deceive your audience.
"The Visual Display of Quantitative Information"
http://www.amazon.com/gp/product/0961392142/sr=8-2/qid=1152996504/ref=pd_bbs_2/104-3023454-8206365?ie=UTF8
"Envisioning Information"
http://www.amazon.com/gp/product/0961392118/sr=8-3/qid=1152996504/ref=pd_bbs_3/104-3023454-8206365?ie=UTF8
And more.
Matt,
Totally agree–I just wrote a post on this very point at my site, http://thinking.bigskyassociates.com . I reference Avinash's post here and Tufte.
Great post. It is, as you say, a lovely visual. The only thing I might add is the thought that people should think about what they are going to do based on the report/graph/analysis and see if they can make their systems do that for them without anyone having to even look at the report. For instance, if you want to detect fraud, spend the effort figuring out how to make the system flag a fraudulent transaction, not on how to show fraud data in a graphic.
Great post (of course I am biased to the topics on the sidelines :) From an agency perspective, I find that as much as we have our act together on analysis and drawing insights as an ongoing process, the impact and action from the insight lies really in how it's presented (and the weightage of the person doing so).
I am curious to know how other companies deal with presenting these large chunks of information to top level management (?) – do visual presentations suffice, in general? (Yes, I am fishing for tips from the experts :)
Additionally, what is the general level of client knowledge? How much do you find yourself educating the decision makers of the deliverable/ exercise itself (as against the actual analysis)?
I am sure the answers would vary by individual client type and experience – but this is more of a state of the industry type question and I wonder if anyone out there has noticed a stark difference in level of knowledge of web analytics by industry type…
Thanks for the post, Avinash.
Sulakshana
Sulakshana: In my experience you will find differences within and across industry. I have had the opportunity to speak with multiple companies in various verticals or horizontals and the level of sophistication varies dramatically. I wish we could say "hey lets work with the interactive agencies or financial services companies because they are really sophisticated."
In terms of education, I don't think it matters if you are inside the company, like me, or outside, like yourself. We have to keep fighting the good fight because the core good analysis people are very few. The advantage we might have, on the inside, is that we might have a closer social awareness and could leverage that.
Finally in terms of presenting large data, I don't think it matters if you are inside or outside. Great analysis and visual representation is just that, great. I might even hypothesize that for agencies this is even more important because some of the complexity is "hidden" from the client and will never be understood, hence your message needs to be even more crisp and easily understood.
IMHO.
Thanks for the response.
I didn't realize that your following post (with Dr. Turner) substantiates some part of my question – especially the answer to 10.
Seems like you'd really like the work by Mark Lombardi.
A fairly new blog on data visualization:
http://www.beyeblogs.com/spotfireceo/
A new blog on data visualization and its uses and impact on business intelligence.
and just cool links
http://simplecomplexity.net
May I suggest that you also consider
Parallel Coordinates – This book is about visualization, systematically incorporating the fantastic human pattern recognition into the problem-solving …
http://www.springer.com/mathematics/numerical…/978-0-387-21507-5 –
which is now available. It contains an easy to read chapter (10) on Data Mining. Among others, I received a wonderful compliment from Stephen Hawking who also recommended this "valuable book" to his students.
Alfred Inselberg, Professor
School of Mathematical Sciences
Tel Aviv University