Data Analysis 101: Seven Simple Mistakes That Limit Your Salary

Data analysis is not easy. It takes years to get good at it, and once you get good at it you realize how much more there is to learn. That is part of the joy. You are always learning. You are always growing.

This blog post is a collection of tips I share with my friends who are just starting out. Each tip is a "simple" mistake that is easily avoided. My hope is that you'll skip them if you are aware of them, and move on to making more important, more valuable mistakes. :)

My plan is to wrap each tip with additional observations, context that will be of value even to those who have been at this game for a very long time.

Ready for a can of concentrated compressed energy?

Let's do this.

1. Never Compare Apples to Watermelons.

There are some things that are quite promising about this graph.

I love that the analyst is segmenting the data rather than showing the aggregate trend ("all data in aggregate is essentially crap" – me). I also like that the analyst is showing a six month trend.

But there is something fundamentally wrong about this analysis. Before you jump to my reveal below this graphic, can you guess what's wrong with this data? Try it?

Found the problem?

traffic graph

Four different segments are being compared (yea!), but they are calibrated wrong (boo!).

On the surface this is hard to detect.

The part that is clean is that there is very little overlap between Search Traffic and Referral Traffic. If you use Omniture's Site Catalyst or Google Analytics or whatever, they do a good job of collecting clean data into those two segments. But Mobile is a platform. That traffic (or conversions in this case) is most likely in both Referrals and Search. So it is unclear what to make of that orange stacked bar. Is that good? Is that bad? Additionally it is showing conversions already included in Search and Referral (double counting) and because you have no idea what it is, it is impossible to know what action to take. [The analyst recommended a higher investment in Mobile based on this graph!]

Ditto for Social Media. It is likely that the Social Media conversions are already included in Referrals and, of course, in Mobile. Making that green graph useless. [The analyst recommended a massive increase in investment in Social Media as well. An imprecise conclusion.]

Ensure that you always calibrate the "altitude" of your segments. Always.

So if you want to analyze Mobile performance then you want to compare Mobile and Desktop segments. Very easy to create. For bonus points you can analyze Mobile Search traffic performance with Mobile Non-Search traffic performance. You can analyze Mobile Search performance with Mobile Referring traffic information. Then compare those two to Desktop Search and Desktop Referring traffic. So on and so forth.

Nice clean segments that will help you find nice clean answers (as good or as stinky as they might turn out to be :).

For Social Media you can compare it to Search (with no other changes to that segment, use the Default in GA/SC/WT/YWA), and for Referring Traffic make sure you create a new segment where you take out Referrers such as Facebook.com, Twitter.com, plus.Google.com, Stumbleupon.com etc., etc. So you'll be comparing clean buckets of Social Media, Search and Referring Traffic with no social referrals included.

Nice clean segments that provide you nice clean answers.
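
If you can export visit-level data, the idea of clean, same-altitude segments can be sketched in a few lines of Python. Treat this as an illustration only: the host lists and the field names (referrer, is_mobile) are assumptions made up for the example, not something any particular analytics tool provides in this shape. Each visit lands in exactly one channel, and platform is kept as a separate dimension rather than sitting next to Search as if it were a peer.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical host lists for illustration; extend them to match your own traffic.
SOCIAL_HOSTS = {"facebook.com", "twitter.com", "plus.google.com", "stumbleupon.com"}
SEARCH_HOSTS = {"google.com", "bing.com", "yahoo.com"}


def channel(referrer_url: str) -> str:
    """Assign a visit to exactly one channel based on its referrer."""
    if not referrer_url:
        return "Direct"
    host = urlparse(referrer_url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host in SOCIAL_HOSTS:
        return "Social Media"
    if host in SEARCH_HOSTS:
        return "Search"
    return "Referral (no social)"


def segment(visit: dict) -> tuple:
    """Combine platform and channel so no visit is counted in two segments."""
    platform = "Mobile" if visit["is_mobile"] else "Desktop"
    return (platform, channel(visit["referrer"]))


# Made-up visits, only to show the shape of the input.
visits = [
    {"referrer": "https://www.google.com/search?q=toys", "is_mobile": True},
    {"referrer": "https://twitter.com/someone/status/1", "is_mobile": True},
    {"referrer": "https://example-blog.com/review", "is_mobile": False},
    {"referrer": "", "is_mobile": False},
]

counts = Counter(segment(v) for v in visits)
for seg, n in sorted(counts.items()):
    print(seg, n)
```

Because every visit gets one (platform, channel) pair, the segment counts always add up to the total number of visits. That is a quick sanity check against double counting.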

Always pause and ask yourself: "Are my segments all at the right 'altitude?' Are they individually unpolluted by the other?"

Then go analyze and confidently make recommendations based on what you find.

2. Don't Alarm HiPPOs and Sr. Leaders Unnecessarily.

Creating graphs is easy, and I could fill five blog posts with all the nonsense one can accomplish by playing with the axes. Yes it is a pet peeve of mine.

What do you think is wrong with this commonly available graph?

Look at it carefully? Found it?

sub optimal graphs 1

It artificially inflates the importance of a change in the metric that might not be all that important. In this case for my data it is not statistically significant (more on that later in this post), but there is no way you would know that (or not know that) just from the data in front of you. Yet the scale used for the y-axis implies that something huge has happened.

I am going to go out on a limb…. unless you are performing surgery and the above graph is showing the heart rate or blood pressure, try and avoid being so melodramatic in your data presentation. It causes people to read things into the performance that they should most likely not read.

You don't always have to have the y-axis at zero. But over-dramatizing this 1.5 point difference is a waste of everyone's time. And you know what happened to the boy who cried wolf one too many times right?

Another important thing.

Label your x axis. Please.

What time period does this graph cover? The last x hours? The last y weeks? The last z months? Depending on what you choose the data is completely ignorable or deserving of insane additional analytical love. (Assuming of course that you fix the y-axis first.)

As the analyst you hold a lot of power in your hands when it comes to visualizing data. Use that power with caution, and great responsibility.

3. Calibrate Your Time Series Optimally.

I am positive that many of you, including my friends who are just getting started, will have taken this screen shot out of Google Analytics and included it in a dashboard or presentation of some kind.

Don't scroll down yet.

Look at it carefully…. what's wrong with it?

googleanalyticsdailyanalysis

It is a chart that shows nine months of performance… by day! The "trend" is completely useless.

But because this is the default view in Google Analytics everyone uses it. [Arrrrrhhh!] The uselessness comes from the fact that when you look at individual days over such a long time period you are effectively hiding insights / important changes.

It is impossible to find anything of value above.

Let's switch to looking at the exact same time period but by week.

googleanalyticsweeklyanalysis

Much better right? No more puke of squiggly lines that mean nothing, show nothing. You can kind of sort of see some kind of trend above, especially towards the end of the graph (even this simple thing was essentially hidden before).

Here's the amazing thing… when looking at long time periods you can do better!

The best practice I recommend in Web Analytics 2.0 is that if you are looking at four weeks of data then you can look at the daily trend and still find interesting insights.

If you look at three months of data (one quarter) then you should switch from the day view to week view. The macro trends won't get masked/hidden in the daily noise.

If you look at time periods longer than that then it is optimal to look at the monthly view of the data.
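
For anyone working from an export rather than inside the tool, the day / week / month rule of thumb might look like the Python sketch below. The exact cutoffs (28 and 92 days) are assumptions, since the rule only names four weeks and one quarter, and the daily numbers are made up. The roll-up sums counts, which suits a metric like visits; a rate such as conversion rate would need to be recomputed per bucket instead.

```python
from collections import defaultdict
from datetime import date, timedelta


def pick_granularity(start: date, end: date) -> str:
    """Rule of thumb: daily up to four weeks, weekly up to a quarter, else monthly.
    The 28 and 92 day cutoffs are assumptions for this sketch."""
    days = (end - start).days + 1
    if days <= 28:
        return "day"
    if days <= 92:
        return "week"
    return "month"


def roll_up(daily: dict, granularity: str) -> dict:
    """Sum daily visit counts into day, week (starting Monday) or month buckets."""
    totals = defaultdict(int)
    for day, visits in daily.items():
        if granularity == "week":
            key = day - timedelta(days=day.weekday())
        elif granularity == "month":
            key = day.replace(day=1)
        else:
            key = day
        totals[key] += visits
    return dict(sorted(totals.items()))


# Made-up data: 100 visits a day for nine months.
start, end = date(2023, 1, 1), date(2023, 9, 30)
daily = {start + timedelta(days=i): 100 for i in range((end - start).days + 1)}

granularity = pick_granularity(start, end)
monthly = roll_up(daily, granularity)
print(granularity, len(daily), "daily points ->", len(monthly), "points")
```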

In our case this is what that would look like….

googleanalyticsmonthlyanalysis

Sweet, right?

You can clearly see the dip from Jan to Feb. You can see the nice consistent dip through July. Then something magical happened (What! What! What!) that has traffic rising to record levels.

All of this was nearly impossible to see in the daily graph, and most of it was hard to see in the weekly graph.

Do remember this really important point: When you look at lots of data, nine months in this case, you are usually not looking for tactical bits, you are trying to find big hairy things… calibrate your time series accordingly.

And if you calibrate your segments optimally you can quickly start doing deep dive analysis looking for some answers. What happened post July? What caused the funk between March and July? Why did x or y or z not happen? All the right good questions that otherwise might have been hidden in plain sight.

Simple best practice. Use it.

4. Always, Always, Always Make Your Point Clearly! (Oh, and Colors Matter.)

Every one of you will present decks with 95 slides. Or at least 55. : )

When you are doing that data regurgitation it is important to try to make life for the person at the other end (typically your boss, or worse your boss's boss) as easy as possible.

At some point in the data tsunami you unleash eyes glaze over and life becomes boring.

So try to… ok, what do you think the two colors in the below graph represent? Don't look at the legend.

Bonus, what do you think the data is telling you? Don't scroll, think for just five seconds.

graph colors yes no

My first thought was: how come only 29 percent of the organizations have more than one person! That is bad.

Wait. That did not make sense.

I went back to read the question. Then the graph. Then the legend. Then back to the question. Then the legend.

Problem one is that "red" denotes "good" in this case and "green" represents "bad."

Here's something very, very simple you should understand and slavishly follow: Red is bad and Green is good. Always. Don't try to be cute. People will instinctively think that. We have been patterned that way. So show "good" in green and "bad" in red. It will communicate your point faster.

Problem two, much worse, and perhaps only for me, was that it was harder than it should be to understand what this data is saying. First stacked bar above: "Yes 71 percent of the organizations Yes, more than one person." Too many yesses.

And what is the 29 percent? If the question is how many people are directly responsible for improving conversion rates and 71 percent have more than one person, then 29 percent are those that have less than one person or no one? Or just less than one person? Unclear (and frustrating).

[Third bar above] And if 62 percent of the people said "Yes we have no one responsible for improving conversion rates," then what does the 38 percent in green mean? Is it: "No, No we have someone responsible for conversion rate improvement?"

This graph actually comes from a source I deeply respect, an organization with really great analysts. But I'm afraid I completely failed to grasp the point. Do you understand it?

Sometimes you just want to skip the graph.

I don't understand the data above so I'm going to make some numbers up, but would a table like the one below have worked much better to communicate the point?

conversion rate team size

Why do the graph at all?

Okay so sometimes the application of something humorous might not work (I do always try :). But the rest of the table? Effective?

And if you have data for the last two years then perhaps this table is even more valuable…

conversion rate team size trend

Much, much better with context. I love context dearly. Amazingly so does your boss.

Or perhaps if you want to show it to very senior executives then maybe the numbers themselves are less than useful. You could go with something like this…

conversion rate team size delta

Scroll back up.

Look at the graph.

Now look at the table above.

I'm riding a horse! No, not really. What do you think?

I love graphs as much as all of you. But above all, what I crave is simple and effective communication. I want to make the point as fast as I can so that we can begin the politics and hard work of taking action. That is after all what pays our salary right?

5. Statistical Significance is Your BFF.

Okay I gave this one away with the title. We all (novices and experts) make this mistake all the time.

We create a table like the one below. (Mercifully the segments are calibrated right, hurray!) We create a "heat map" in the table highlighting where the conversion rate is good. We declare Organic to be the winner, Direct is close behind. Then the other two. And we recommend doing more SEO.

What's the problem with that?

online marketing conversion rates

None of this data could be significant – as in, the fact that the numbers seem to be so different might not mean anything. [Looking at July...] It is entirely possible that it is completely immaterial that Direct is 34% and Email is 10%, or that Referral is 7%.

One simple fix (covered in more detail in this post: 4 Not Useful KPI Measurement Techniques) is to share the raw numbers to see if the percentage is meaningful at all. For example all the data in the Direct row could represent conversions out of 10 visits and all the Referral data could be representing conversions from 1,000,000 visits each month.
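
To see why the raw numbers matter, here is a small Python sketch that puts an approximate 95% range around a conversion rate using the Wilson score interval. The counts are hypothetical (3 conversions out of 10 visits versus 300,000 out of 1,000,000) and were picked only because both work out to the same 30%. The Wilson interval is one reasonable choice for this, not the only one.

```python
from math import sqrt


def wilson_interval(conversions: int, visits: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a conversion rate."""
    if visits == 0:
        raise ValueError("need at least one visit")
    p = conversions / visits
    denom = 1 + z * z / visits
    center = (p + z * z / (2 * visits)) / denom
    half = z * sqrt(p * (1 - p) / visits + z * z / (4 * visits * visits)) / denom
    return center - half, center + half


# Hypothetical raw numbers behind two identical-looking percentages.
small = wilson_interval(3, 10)                 # 30% from 10 visits
large = wilson_interval(300_000, 1_000_000)    # 30% from 1,000,000 visits

print("10 visits:        %.1f%% to %.1f%%" % (small[0] * 100, small[1] * 100))
print("1,000,000 visits: %.1f%% to %.1f%%" % (large[0] * 100, large[1] * 100))
```

With 10 visits the plausible range for the true rate spans tens of percentage points. With a million visits it is a fraction of a point. Same percentage in the table, very different level of trust.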

The better, much, much better thing to do would be to compute statistical significance to identify which comparison sets we can be confident are different, and in which cases we simply don't have enough confidence.
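
As a rough idea of what such a computation involves, here is a minimal two-proportion z-test in plain Python. It is a sketch under the usual assumptions (independent visits, samples large enough for the normal approximation), the counts are hypothetical, and it is not a reproduction of any calculator mentioned in this post.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, visits_a, conv_b, visits_b):
    """Two-tailed z-test for the difference between two conversion rates.
    Returns (z, p_value)."""
    p_a = conv_a / visits_a
    p_b = conv_b / visits_b
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    if se == 0:
        raise ValueError("rates are both 0% or both 100%; the test is undefined")
    z = (p_a - p_b) / se
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts, not taken from the tables in this post.
z, p = two_proportion_z_test(conv_a=200, visits_a=10_000, conv_b=150, visits_b=10_000)
confident = p < 0.05
print(f"z = {z:.2f}, p = {p:.4f}, significant at 95%: {confident}")
```

A p-value below 0.05 corresponds to the 95% confidence bar. Anything above it means the two rates could plausibly be the same, and you should collect more data before acting.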

I have something special for you. I've just uploaded a brand spanking new Statistical Significance Calculator to my old post on that topic. It does 1-tail and 2-tail tests and the even more beloved chi-square test. Download it. Adapt it for your use. Ecstasy will follow.

One of the most common complaints of our Sr. Leaders is that we engage in massive data puking (true!) and never help them identify with any degree of certainty if an action we are recommending will produce results. Well, this is our chance. If you check to see if the results you are seeing are statistically significant, you can then make recommendations for action knowing that they will produce the results you want (all other things held constant).

Remember ecstasy awaits!

Update: Bonus: If you use Google Analytics the always wonderful Michael Whitaker has created something delightful (triggered by our discussion in comments below). A Z-Test calculation that you can embed directly into Google Analytics!

Here is a mini-tutorial on how to use this delightful feature:

1. Visit Michael's blog and drag the bookmarklet into your browser's bookmarks bar. Stats calculator for Google Analytics.

z test calculator google analytics 1

2. Go to any report in Google Analytics and switch to a Goal tab or the Ecommerce tab.

google analytics report tabs

3. Click Z-Test bookmarklet in your bookmarks bar.

z test bookmarket button

4. At the bottom of your GA report table you'll see a new button called Z-test.

z test reports button

5. Check the box next to the two dimensions for which you would like to check statistical significance (apply the Z-test).

compare rows google analytics

6. Press the button at the bottom of the table, Z-test, and boom (!) you have your answer. Green is good, red (lower than 95%) means you need to collect more data before you decide.

The conversion rate between our two main PPC keywords is 1.33% and 1.94%. Is that data statistically significant? Should we go ahead and invest more in Calico Critters (if we are using fixed budgets or there is more inventory)? Let's check…

computing statistical significance google analytics 1

Why yes of course we can!

Twitter sends 5,546 Visits and has (on a non-ecommerce website) a Goal Conversion Rate of 5.27%. Facebook sadly only sends a fraction of traffic and has a lower conversion rate 4.71%. Stop spending money/time in Facebook based on this data? Deprioritize it at least? Let's check….

computing statistical significance google analytics 2

No! See how that saved your goat, you were just about to plunk down a million dollars into Twitter! :)

7. Celebrate your new found awesomeness!

This currently works only for Ecommerce Conversion Rate and Goal Conversion Rate key performance indicators.

For computing significance ("are the two conversion rates different enough that you can confidently take action") on Ecommerce Conversion Rates you can use this with no thought. (Ok always apply some thought!) But for using it to compute significance for Goal Conversion Rate you should be a little more careful. Unlike Ecommerce Conversion Rate, it is possible for a person to have more than one unique Goal Conversion during a visit in Google Analytics. So when you apply the Z-test you'll be comparing "rotten apples to rotten apples," i.e. measuring the same way for all dimensions. In the most ideal scenario you would apply the Z-test to each goal by itself. I still believe it is of value to use the Z-test for Goal Conversion Rates, but be aware of the nuances.

One more important caveat. Z-test / statistical computations are most optimally applied to results of controlled experiments and not to observational data because in the latter there could be other, uncontrolled, variables at play. So this is not "pure" in some sense. But (as I mention below in comments) we are better off being aware of this impurity and still using this test because the insight delivered is better than just "eyeballing" the number to figure out when to take action.

Many thanks to Michael for doing this. No more going to excel (at least for GA), we can be a little smarter quicker directly in our web analytics tools. Makes me wonder why web analytics vendors are so enamored with data puking and can't build all this stuff natively to make more of us Analysis Ninjas!

6. There is Such a Thing as Too Little Data!

A variation on the above "simple" mistake.

I know we all get excited about having data, especially if we are new at this. And we get our tables and charts together and we start reporting data and having a lot of fun.

This, dear reader, is very dangerous. You see there is such a thing as too little data.

too little data

You don't want to wait until you've collected millions of rows of data to make any decision, but the table on the left is nearly useless. Recommending doubling down on Facebook (as the Analyst did) this early in your evolution would be a profound mistake.

Things can change so much in just a few days (and they will for you!).

So you can't do anything with data like this?

Pretty close.

But what you can do is look at this report to see if places you've invested time in earning links from are sending you traffic (or not). Look for surprises, places you did not invest money, and see why they linked to you. You can get a tiny bit of understanding of your initial marketing strategy.

Do other useful things.

Look at your search keyword reports. Do you see a few people coming on keywords you SEOed the site for? Better still, go into Webmaster Tools and look to see if your site is well indexed. Look at the keywords for which your site is showing up in Google search results. Are they the ones you were expecting?

Even better… spend time with competitive intelligence tools like Compete / Trends for Websites, Insights for Search, Ad Planner and others to seek clues from your competitors and your industry ecosystem. At this stage you can learn a lot more from their data than your data!

We all tend to read too much into data sometimes. A good analyst knows when there's just not much there and volunteers her/his time on helping run a Task Completion Rate survey or creating new/better Inbound Marketing programs. Go get traffic!

7. Pie Charts Are Evil.

Okay maybe not evil. They are useful on rare occasions. See "Enchanting Analysis: Rule 2: Establish Macro Importance" in this post: Mate Custom Reports With Advanced Segments!

But most of the time they are an active hindrance to communicating anything of value.

Examples of horrible pie charts abound. But let me share this really simple one that I am sure you've seen or perhaps created yourself. :)

Take a moment to breathe it into your brain. What do you think?

pies are evil

The 3D effect does not help. Trust me on that.

This set of charts very cleverly hides any available insights because it makes your executive do these operations for every segment of understanding: Look left, find the interesting slice. Commit the color and number to memory. Go right. Find the color and segment and commit the new number to memory. Now subtract the first number from the second. Decide if the result is good or bad.

Repeat five more times.

Remember to remember only the interesting bits.

When the chart was created did you think you were going to torture your executive today? Would it be surprising then that every atom in this universe thinks "omg, numbers are so haaaarrrrd!"?

Why torture people who are so critical to your financial well being?

Just use a table (as we did in #4 above).

pie to table

Much easier, right?

At the very least, you don't have to dart your eyes from left to right all the time and commit numbers to memory to understand what's happening.

And since you the Ninja-in-making are not being paid to just data puke, why even show things that might not be material?

Just go with this…

pie to table smaller

Would the discussion with your management team be much more focused now? And faster?

Oh and… you've already put so much effort into collecting and analyzing the data. Why not use your intelligence (and the statistical significance calculator) to filter data and just show what's most relevant?
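
If the two pies live in a spreadsheet or an export, the table-with-only-what-matters idea could be sketched like this in Python. The traffic shares are made up, and the fixed 2-point threshold is just a stand-in assumption for whatever materiality test you prefer (a significance check would be the more rigorous filter).

```python
def share_changes(before: dict, after: dict, min_points: float = 2.0):
    """Return (source, before %, after %, change in points) rows, largest
    change first, keeping only changes of at least `min_points`."""
    rows = []
    for source in sorted(set(before) | set(after)):
        b = before.get(source, 0.0)
        a = after.get(source, 0.0)
        delta = a - b
        if abs(delta) >= min_points:
            rows.append((source, b, a, delta))
    return sorted(rows, key=lambda r: abs(r[3]), reverse=True)


# Made-up traffic shares (percent of visits) for two periods.
last_year = {"Search": 40.0, "Direct": 25.0, "Referral": 20.0, "Email": 10.0, "Social": 5.0}
this_year = {"Search": 46.0, "Direct": 22.0, "Referral": 19.0, "Email": 9.5, "Social": 3.5}

print(f"{'Source':<10}{'Before':>8}{'After':>8}{'Change':>8}")
for source, b, a, delta in share_changes(last_year, this_year):
    print(f"{source:<10}{b:>7.1f}%{a:>7.1f}%{delta:>+7.1f}")
```

The subtraction your executive was doing in their head now happens once, in code, and only the rows worth discussing make it onto the slide.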

It is easy to make things hard to understand. Working hard to make them easy to understand is what brings glory. Sustained glory.

So do that.

Okay it is your turn now.

What are the simple mistakes that you've learned to avoid? Would you recommend a different strategy to follow for one of the mistakes above? Got a better picture to submit? The mistake that most sets you off in the field of web analytics? How did you learn not to make these mistakes?

Please share your feedback, pictures, complaints, mistakes via comments.

Thanks!

Comments

  1. 1

    We (I mean us Nerds) all suck at communication and the essence of your post is about communication.

    I was in a seminar where a presenter said on stage to a bunch of executive types (and I quote cos I wrote this down) "The CPAs and the CPCs in our SEM campaigns weren’t performing well enough.
    Additionally the PageRank in our SERPs meant we need to do better SEO. So we figured we should get more backlinks from the ODLPs."

    Communication like that and pie charts are why we suck. We talk about HiPPOs not getting what we do and we take the high moral ground with them (they must be stoopid not to get what we're saying). "Thou Shalt Listen To Your Analyst" when actually the analyst is talking complete gobbledigook. Those dudes are in the job for a reason and it's because usually they communicate to their superiors (or their customers) better than we do.

    It's the same with information presentation.

    The worst crime you pinpoint here though is the first. I call that "Using statistics like a drunk uses a lamp-post, for support not for illumination". Someone else said that by the way. I just use the analogy.

    Basically mixing up data to support a conclusion that could be completely irrelevant is the fastest way to destroy any confidence anyone ever had in the data.

    Good post.

  2. 2
    Olivia says:

    Great post Avinash!

    We actually have weekly emails sent around the office with 'Grim Graph of the week'. The graphs are taken from presentations that we have encountered and working in the media industry, we are sometimes guilty of style over substance. 3D graphs and infographics feature on a regular basis.

  3. 3

    Hi Avinash,

    Great post as per usual.

    My biggest annoyance is with the shoe horning of pie charts in when they really shouldn't be there. Especially in situations where the total doesn't add up to 100% (something SiteCatalyst's interface does frequently) – you should have a segment for 'other' or not use a pie chart in the first place.

    I am also often frustrated with 'top 10s' in situations where there is a long tail. That top one might be the most frequent search term/referrer/visited page, etc, but if the whole top 10 only adds up to 10% of the total, then you're missing 90% of your data with a top 10. Segment the data in a different way if that is going to be the result. You need fewer data points to be able to extract meaning and provide recommendations, otherwise it all gets a bit cluttered.

    Cheers,
    Alec

  4. 5
    Joe Teixeira says:

    Thank you Avinash!

    Over the years, I've gotten much better at data presentation (due to your blog and teachings) because, no matter how much data / how accurate of data you have, if you don't know how to present it appropriately to your audience, you will most likely #fail.

    I'm a big data / analytics guy and I love acronyms, ratios, jargon, and etc… But if I can't translate it within the walls of my organization, I will ultimately fail.

    P.S. I love "Red is bad, green is good". How many times have I used that in the last 5 years? I lost count. It's amazing.

    Thanks!

  5. 6

    Thanks for the post. We really should turn these points into a checklist (among a few other questions) and have it in front of every analyst as reminders any time they prepare an analysis.

    My biggest pet peeve is #5/#6 and seeing conclusions being made without enough data. I was wondering if you think that the new Real Time analytics options in GA might make this problem worse. It reminds me of people looking at stock tickers all the time when they are supposed to be long-term investors. I am a bit concerned that Real time, while wonderful if used properly, may become a focus of those who like jumping to hasty conclusions and forgetting about significance.

    The second biggest complaint I have is how % changes are communicated (or rather usually not communicated), especially for values that are normally calculated as percentages. It makes a huge difference if you talk about reducing bounce rate on a site by 10% and you mean taking the bounce rate from 50% to 40% (10% in nominal or absolute terms) vs. taking it down from 50% to 45% (10% in relative terms). Case studies are notorious for playing with numbers this way and not being clear whatsoever on the number's context. This is a huge case of "Your Mileage May Vary". So for example, in #7 you should label that the change is an absolute change of -3% for Direct instead of a -3% relative change.

    Regarding point #4: Google Analytics had this color problem with Bounce Rate all the way through version 4 in the Flash-based graphs up top. A reduction in bounce rate when doing comparisons in the graph is red!!!

    • 7

      Michael: There are rare occasions where real time data is useful. Rare. :)

      I'm a fan of real time data only if it meets two criteria: 1. (You mention this) Can you collect enough data for the results to be significant? 2. Do you have the capability to take real time action?

      If both are yes then use real time data, else focus on strategic analysis.

      I *love* your point on communicating percentage changes. Thank you.

      -Avinash.

  6. 8
    Kim White says:

    Thanks, Avinash! I knew I didn't like pie charts, but have not been able to figure out why. This clears it up. Now I know what to do (tables).

    Pies are for eating, not for charting.

    • 9
      Ryan says:

      @Kim White, I was exactly the same way–not knowing why I didn't like pie charts. Looking at Avinash's example above, the clear issue is that pie charts are always only good for momentary snapshots. And snapshots are never the best way to present data for decision making. At best, it seems like snapshots can just inform people of things (and who just needs information for the sake of having information, these days?).

      Not to mention, differences among segments are made much more obvious by bar charts. Mentally superimposing pie slices over one another is an exercise in mental exhaustion.

      Just wanted to expand on your thought as a way of giving you a thumbs up.

  7. 10

    Seriously love this post! Thanks, Avinash.

    Sometimes analysts feel pressured into the #6 "too-little-data" mistake by stakeholders who lack a) statistical acumen, (b) patience, (c) courage, or (d) all of the above, to explain to their C-suite executives why they can't declare the campaign or enhancement pushed to production yesterday a smashing success.

    I've found it helps #5 and #6 to translate statistical significance into plain English, something like "the checkout process redesign has reduced the 65% cart abandonment rate to somewhere between 57% and 61%." Presenting this kind of data on a too-early, too-small data sample yields a squishy confidence interval range that even non-statistical stakeholders recognize isn't actionable.

  8. 11
    Ross Nunamaker says:

    Great post as usual.

    We don't do a lot of email, but when we do, product managers always want to know 'how we did', especially when they use a purchased list.

    It is very easy to set a date range, view KPI's (form view, completion, video) and report numbers.

    This would be easy, quick, and wrong. You have to remember to segment on the campaign, because you aren't reporting on total forms submitted, but on those submitted as a result of the campaign (or whatever action you sought).

    If your campaign wasn't set-up correctly, you can look at a 'like' period of time, get some averages, and compare against those to determine how much of a boost your campaign provided.

    I use this latter approach to estimate press release impact.

  9. 12
    Brian Chiou says:

    Great post! Time to catch up on some reading :)

    Your point regarding statistically significant data has saved me lots of time when dealing with HiPPo's. I especially like point #3, particularly in conjunction with multiple metrics.

    The simple mistakes I've learned to avoid are:

    -Not annotating. There's nothing more annoying than going back on historical data and seeing a positive deviation in conversion rate and not knowing why until I have dug through data points instead of me making a simple note as an annotation :)

    -For #3, I add one more table that compares the (median date – end date) compared to (median date – start date). I think I do that to make myself happier by seeing the green percentage!

    -The mistake that most sets me off in the field of web analytics is using non-segmented metrics as key performance indicators. My way of avoiding this is, if I can't explain in a sentence how a specific metric affects the bottom line of the business I'm helping.. I segment and slice until I can.

    • 13

      Brian: I love all three tips! So simple and so wonderful (and sadly so rarely followed by most Analysts).

      All three have this in common: They make it much, much easier for people to understand what you are saying when they are alone with your data (without your magnetic presence to give context).

      We solve for this common use case infrequently and that is sad.

      Avinash.

  10. 14
    Ned Kumar says:

    Hi Avinash, great tips on data analysis for both novices and ‘experts’.

    A corollary I would add to #4 is that always highlight the changes as others might not perceive the changes as you want them to – in other words, stress the obvious. A couple of years ago, there was a pop-art quiz on change blindness in nytimes that I loved (http://goo.gl/t5bHc) as it highlights our inability sometimes to detect alterations that are in plain view.

    Also, one should avoid the “myopic trap” – the pitfalls of looking only at a section of the data. You might have done great analysis but the conclusions you draw from it can be pretty far from the truth when looked at with a more holistic data availability.

    Enjoyed the post

    Best,
    Ned

  11. 15

    Often we make charts without having our audience in mind. The charts make sense to us (maybe), but do they help convey the point? When in doubt, I show my graphs to friends and ask them what they see. When they stumble, I know I need to re-think and re-do the chart.

    Pie-charts. I read an article recently where a neurological study proved that humans do not have the ability to accurately compare pie pieces to determine the differences. Human brain understands bars and lines much better. My pretty 3D pie charts are no longer featured in my reports.

  12. 16

    Thanks Avinash – these are all helpful. I especially agree with #3 – how useless to show day-by-day trends over long time periods.

    Proper analysis for the win!

  13. 17

    I love the thought process behind your Market Motive course. I am beginning to see information like Neo from Matrix.

    Your 1st example Never Compare Apples to Watermelons tells me I should remember that GA has a few categories for Traffic Sources: Search, Referrals, Direct, Campaign. Those are the 4 sources in my GA and anything else that is being reported as a source, like Twitter or Facebook, are already in one of the 4 sources.

    My mistake: I think I included Mobile Search in one of the reports right next to organic search. oopps!

    I find great value in your Market Motive Certification Course.

  14. 18
    Raffy says:

    I'm a novice and thank you for making it understandable for a "limited web analyst" like me.

    All the best!

  15. 19
    Sridhar says:

    Hi Avinash,

    Thanks for the great post and for helping me understand how to show the data to HIPPOs and Sr. leaders in the company!

    I love the one, which you said about analyzing the data by weeks, month and days (according to the range you select)!! That's really great!!

    Sridhar K

  16. 20
    Peter says:

    #2 seems off-base. That change does appear to be statistically significant. Post the actual values and we'll see… Whether it is important of course depends on what is being plotted.

  17. 21
    Igor says:

    Thank you, Avinash for statistical significance calculator.

    Meanwhile in point 7 you described well how pie charts can be wrongly used. To my mind pie charts are really great but not for comparing periods. It is the best way to show shares in something that makes 100%.

    And for the case you showed with the table it would not be bad to use a horizontal bar chart, as Gene Zelazny recommended in his "Say it with charts".

  18. 22
    Adam says:

    One particular point of note for me: not having the chops to just blow it all up and start over.

    Sometimes a particular piece of information is (a) no longer relevant or (b) recently discovered as false. You need to be able to adapt and create a framework around which to make decisions on 'truth.'

    Recently at my place of employment, it was discovered that our primary sales report was rather horrendously flawed, and the entire company's online marketing budget was being misspent on false assumptions about the relative cost-of-marketing.

    Instead of fixing the report, or even better, devising another one to run in parallel for a year before transitioning, it was decided to take no action (in perpetuity) so that we could accurately comp against last year.

    /facepalm

    You need to be nimble and forward looking, not attached to the 'how it's always been done.'

    • 23

      Adam: OMG!

      Let me hasten to add that comp is a very very sensitive topic and we have to think of the psychology of the issue. It is very hard to kill and change things in that context.

      In the long run I hope your management team will accept your wisdom and adopt the right path.

      I'm grateful that you shared this story, I'm sure it was not easy. It highlights a critical issue we have to deal with as Analysts. Thank you.

      Avinash.

      • 24
        Adam says:

        Oh, it is a very sensitive topic – especially when the company in question is a Top 100. The end result though is one must continue to fight the good fight.

        But unless you create value, you'll never get the authority to make those changes. So first and foremost you must bring something to the table, and thankfully in the world of Web Analytics, there is so much untapped potential that there is freedom to shine for someone willing to put in the time and mental energy.

        I guess my feeling is that the beginning analyst can do nothing worse than losing the passion for making real changes and always, always pushing for excellence no matter the obstacles.

  19. 25

    Hi Avinash,

    Just wanted to make a quick follow-up point about the statistical significance section. Direct and Organic conversion rates in your example (as with Web Analytics data in general) are observational data, as opposed to data from a controlled experiment like an AB test. It's OK to make inferences on observational data, but it's not quite as strong as if we had set out to specifically ask the question if Direct converts better than Organic (via an experiment).

    For that reason I actually like to work with confidence levels on observational data. The output would then be something like: I am 82% confident that Organic converts better than Direct, as opposed to rejecting something outright if it doesn't reach a 95% confidence level.

    • 26

      Michael: I completely agree with you. It would be optimal if we ran a controlled experiment because we could control for all the variables and the data set won't be impacted by hidden biases.

      But in the absence of controlled experimentation (sad scenario in almost all cases) I feel that there is still value to be had from applying a test for significance to know when to have confidence in the data. Incrementally better than jumping off a difference between 1.78% and 1.98% conversion rate. :)

      My overall hope in encouraging the use of statistics (or control limits) is to simply increase the chances that when we make recommendations based on data that there is an increased likelihood that the results will be reproduced (95% or higher significance).

      -Avinash.
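A minimal sketch of the kind of check discussed in this exchange, for readers who want to try it. It assumes a pooled two-proportion z-test and reports "confidence" as one minus the one-sided p-value, which is one common simplification and not necessarily what any particular calculator does. The function names and the visit counts are hypothetical; only the 1.78% and 1.98% rates come from the reply above.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def confidence_b_beats_a(conv_a: int, visits_a: int, conv_b: int, visits_b: int) -> float:
    """One-sided confidence (1 - p) that segment B's conversion rate exceeds A's.

    Uses a pooled two-proportion z-test (an assumption of this sketch).
    """
    rate_a = conv_a / visits_a
    rate_b = conv_b / visits_b
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_b - rate_a) / std_err
    return normal_cdf(z)


# Hypothetical volumes: the same 1.78% vs 1.98% rates at two sample sizes.
small_sample = confidence_b_beats_a(89, 5_000, 99, 5_000)
large_sample = confidence_b_beats_a(1_780, 100_000, 1_980, 100_000)

print(f"5,000 visits per segment:   {small_sample:.0%} confident")
print(f"100,000 visits per segment: {large_sample:.0%} confident")
```

With these made-up volumes the same gap in conversion rate falls short of 95% at the smaller sample and clears it at the larger one, which is the reason to check before acting on a difference like 1.78% vs 1.98%.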

  20. 27

    Avinash,

    Longtime reader, first time posting. Thank you for keeping this blog active for so long and with such a wealth of knowledge. OR was definitely a huge factor in my getting started as an analyst!

    I work for a marketing and brand management agency and almost daily have to make suggestions/set goals and KPI targets for campaigns and clients with datasets often skirting statistical significance or not, and more often not having any directly relevant data at all.

    Do you have any recommendations on how to approach a situation where the HiPPO needs you to outline what the success case for a proposed client would be, including exact or relative ROI? The best thing I've thought to do so far is to use internal case studies as pseudo-past performance metrics for our own efforts.

    Thanks again for maintaining this incredible resource and community!

    • 28

      Matthew: I'm not sure how you/your boss is defining ROI but there are several paths you can take…

      1. Internal benchmarking. Use any data you already have for past performance to forecast impact of your improvements. This is not the greatest thing you can do, but it might be the easiest.

      2. Use your competitor's data. There are many links to CI tools in section 6 of the post. Use them to understand their traffic patterns, industry conversion rates, keyword / referral strategy, social impact etc etc. Almost all this data is available for free. Then check what is it that your boss is going to invest in on behalf of your client. Propose impact.

      3. Test. If your boss is asking for a year success case for a client see if you can take a month, try your highest impact strategies, see what happens. You'll likely reach diminishing margins of return at some point, but actually testing your strategy in a real world environment will be a valuable reality check.

      One last tip… whenever people ask you for stuff like this refuse to do anything without getting their help in completing a Digital Marketing & Measurement Model. I can't think of any other way to understand what is important and bring tremendous clarity to your effort.

      Avinash.

      • 29

        Avinash,

        Thank you for the reply – I'm sorry about the type-O's in my first comment, I'm still getting used to the tablet.

        I am very familiar with the CI tools you mentioned, but the majority of problems come from my firm's current clientele – both active and prospective. The CI Tools listed provide great information for an international brand or even a national US brand – but we are currently in regional US brands or are proposing the initial branding and marketing for a market entry or large-scale expansion… Fancy talk for start ups and local chains. Both areas are largely unpopulated with data, it's hard for me to look at the successes Oreo has had with a specific campaign and relate that down to a local or regional brand – especially in the social atmosphere. One of my biggest victories within my firm so far has been my proving that the first step in effective social marketing needs to be effectively developing "Social Mass" i.e. follower-base, and that we can actually calculate the minimum "Social Mass" necessary for a campaign to be effective based on the psychology behind the campaign… I used internal benchmarks and case studies for this and it paid off – but as you said, internal benchmarking only goes so far.

        I am quite familiar with measurement models and that post in particular, as well as both of your books have helped to restructure the entire strategies and management division of my agency – but sometimes our sales department can be rather demanding, "How much $$$ ROI will a X campaign provide Y company?" This post, as well as some of the comments speak towards there being an underlying communication problem and that is absolutely correct – I find it quite hard if not impossible to tell said sales manager "I'm not comfortable making projections based on statistically irrelevant observations of a marginally related case." which is usually met with pure dissatisfaction and threats of my being fired.

        Any advice on a better response to said sales guy?

        Thanks again!
        ~Matt

        • 30

          Matthew: This sounds entirely sub optimal but… if you don't have any data then you don't have any data. You move from using data to using faith (and being clear about it).

          I completely appreciate that it is hard to say that to an executive. But if you can't find competitive data, past client data, any industry projections then I'm afraid you are out of luck.

          Perhaps you can test your recommendations first. Spend 1000 on Facebook / Google ads to see what happens. Send out an email blast to see if anyone cares. Etc., etc.

          -Avinash.

  21. 32
    Dan Peskin says:

    I think one of the most important things I try to avoid, at a broader level, is making assumptions. One of the worst situations I have been in is giving presentations to VPs and having made assumptions about certain sets of data that they almost always point out and turn actionable items to nothing. I try to make sure there are no assumptions left on the table with the data sets I present.

    If there is something I don't know, I will always research it and find an answer to some extent to the problem. I always ask myself "what am I assuming here? What did I miss?", because almost always the execs will ask the question associated with what's missing. In the worst case scenario where I cannot resolve the assumptions I made, I state them in the presentation above my data to make it clear to my audience that these are potential issues in the analysis.

  22. 33
    Jon says:

    A couple of points: The issue with 3-D pie charts is really that you cannot tell proportionality of the pieces in the front compared to those in back, due to the artificial perspective the charts throw in. If you want to distort the impression of the audience, put the piece you want to emphasize on the front (close section) of the pie. It will always look bigger than if it is on the back piece. The same data can give two different impressions. If you must use pies (almost never, btw) keep them 2-D.

    Second, add the chart for #2 using a y-axis with zero as its base. People will SEE the difference; you won't have to explain it.

    Finally, about 10% of males suffer from red-green color blindness. Don't use those colors on charts. Try Blue-Orange or Yellow-Purple.

  23. 34

    Great post Avinash. Regarding your 4th example, I'll hold my hand up here (as research director at Econsultancy) and agree this chart (from our 2011 Conversion Rate Optimization Report) could be much clearer. We'll have a think about how we can make this type of information easier to digest, perhaps with use of tables or charts with colours which are more intuitive.

    • 35

      Linus: You are a good, brave, man.

      I had deliberately obscured the graph to ensure its origin could not be traced. My point was not to call any particular org out (all other graphs, except Google Analytics :) were also obscured).

      Thanks for the comment and thanks for taking the feedback in the helpful spirit it was meant.

      Avinash.

  24. 36
    Craig Burgess says:

    Avinash, good stuff as always!

    I must admit/confess I have used pie charts in the past. However…. I did show year over year revenue change for each of the products percentage and $$ amounts, in COLOR no less (green for good, red for bad).

    While it did make it difficult with new products added, it did drastically show how the pie had shifted to more successful products. Still, I should go back and see how I can simplify that because, well, they are a pain in the butt to make and take way too long.

    Thanks again, always picking up good things.

    And I got some of them right without looking! Woo-hoo!

  25. 37
    Morten Bruhn says:

    Dear Avinash,

    Thank you for a terrific blogpost :-)

    I agree with the previous comments that we as data analysts have to translate and make the data understandable, and you are an excellent example of this :-)

    Keep up the good work!

    // Morten

  26. 38
    Kim White says:

    @ Ryan – Thanks! You are right; snapshot. Well said.

  27. 39
    Phil Turpin says:

    Great post, good reading for a novice such as myself.

    However I'm hoping someone can clear something up, regarding #4, for me?

    Green is good Red is bad, yep – got that, good idea – nice and self explanatory (& I understand it's only an example but it's important to me that I understand why this is)… however if there's a 20% drop in people believing that nobody is responsible for improving conversion rates ("god help us") then is that not a good thing? And if it's a good thing then surely green is more appropriate?

    Or did I misunderstand? I'd be really grateful for some input :)

  28. 40
    Jeff Certain says:

    Avinash,

    Fantastic post. This one is going to live for a while because we ALL fight these battles.

    I work in an outstanding organization with a serious problem: we're swimming/choking/drowning in our data. We're getting better, and that's due in part to this blog.

    I do have a couple of questions on the post:

    A common chart practice around here is to use the secondary Y-axis to sabotage any and all insight one may glean from what was a simple bar graph illustrating a trend.

    1) When is it appropriate to add an additional Y-axis to the juggling mix? What seems so simple to the creator usually ends up in confused stares and 5 minutes of explanation from the presenter.

    2) Because of my background in display advertising, I try to employ a 1 or 2 second rule as a benchmark for ad creative. For example, a viewer should understand the ad message/value in 1-2 seconds. When creating presentations, I often wonder what the time benchmark should be for my graphs. 5 seconds? 20 seconds? -I struggle with this and am curious as to how you approach the chart complexity vs. audience recognition issue.

    Best,

    Jeff

    • 41

      Jeff: For #1… if it takes too long to understand what is being shown on the double Y axis then we should not use it. :) Typically if there are two related metrics that provide context to each other then I've used the double Y. An example is to show conversion rate on Y1 and raw number of conversions on Y2. You can immediately see how that is useful. But I have to admit it is exceedingly rare that I use a double Y for the reasons you've mentioned.

      For #2… I go for 15 to 30 seconds. If it takes longer it is just too long. Here's my painful lesson: The most important part of the discussion starts after the data is understood. The politics comes out. The budget battles come out. The blame game comes out. The "how long is it going to take us" comes out. And on and on. Since data that does not drive action is useless, I try to leave as much time as I can for all the drama that follows once the data is understood – making my tables and visuals as simple as I possibly can make them.

      Makes sense?

      Avinash.

      • 42
        Jeff says:

        Thanks for the reply Avinash.

        Yep – perfect sense.

        Pretty basic questions, I know- but thanks for adding your perspective. This was helpful.

        Jeff

  29. 43
    Jerry Okorie says:

    Always great to read your posts Avinash. You've just summarized a question which I always get from top execs – what does that graph/chart tell me? Data analysis is not simply putting raw data into pie charts etc., but making the charts simple yet insightful, and explaining them.

    I agree with the misrepresentation of some key facts using wrong charts, analyzing a long term goal by using longer time range instead of specific/shorter ones. Overall – it surely does limit your salary when you get it wrong on too many occasions.

    Cheers,

    Jerry

  30. 44
    Aaron says:

    Avinash,

    Another great post – thank you!

    Quick question – when showing the change between two periods (as you have done in your "kill the pie chart" example) we typically use a % change in a table layout. You are showing a raw change. E.G. Referring traffic dropped 7% (raw change) versus dropped by 35% (percent change).

    Do you have a preference? Is there a good rule of thumb to use one method over the other? Or is it just personal style or the individual requirements of the particular report?

    Thanks!
    Aaron

    • 45

      Aaron: Wonderful point. What I was showing was "Direct went down three percentage points," not three percent. I can totally see how my presentation of it would be confusing.

      I like your idea of showing just the percentage change, the challenge is ensuring that it is not confused with "referring traffic is now 35%."

      My approach would depend on what is the norm in the existing communication system in the company. If there is comfort with seeing -7% and understanding that that is 7 points decrease then use that. If there is comfort in seeing -35% and understanding that that is a 35% reduction in referring traffic then use that.

      Stay consistent and we will all be fine. :)

      Avinash.
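The distinction in this exchange between a change in percentage points and a percent change can be sketched in a few lines. The function name is hypothetical, and the 20% and 13% shares are assumed values chosen only because they reproduce Aaron's 7 and 35 figures; they are not taken from the post's table.

```python
def share_change(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (change in percentage points, relative percent change) for a share."""
    point_change = after_pct - before_pct
    percent_change = point_change * 100 / before_pct
    return point_change, percent_change


# Assumed example: referring traffic falls from 20% of visits to 13% of visits.
points, percent = share_change(20.0, 13.0)
print(f"{points:+.0f} percentage points, {percent:+.0f}% relative change")
```

Whichever of the two a report shows, labeling the unit explicitly ("points" vs "%") helps avoid the confusion described above.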

  31. 47
    Adam Farrah says:

    Great post, Avinash! Rock on!

    I really love how you're a "Web Analytics guy" and you put out simple posts like this that focus on data presentation and basic analysis skills. Yeah, the earth-shaking, mega-complex Web Analytics stuff is a lot of fun, but posts like this are full of instantly useable insights that yield results RIGHT NOW.

    I'm in 100% agreement about data presentation skills and simplifying tables and charts.

    Great stuff, man!

    Adam

  32. 48
    Josh Braaten says:

    Thanks for the tips, Avinash.

    As you know, I'm a search guy more than I am an analyst. Tips like these help me avoid making bad recommendations based on faulty data, assumptions or approaches.

    I'd be happy to read more of these types of posts in the future!

  33. 49
    eelke says:

    hi Avinash,

    just one detail: shouldn't the -20% in the picture below also be coloured green? As I understand it, it's positive that it went down :)

    http://www.kaushik.net/avinash/wp-content/uploads/2011/10/conversion_rate_team_size-delta.png

    • 50

      Eelke: It was a way of testing if you were paying attention!

      :)

      -Avinash.
      PS: I've changed the image, thank you for pointing it out. I've also made a note to not write my blog posts in the middle of the night!

  34. 51
    Edith Sánchez says:

    Thanks for sharing your wisdom with analysts of the world.

    As a researcher and teacher I really thank you for calling attention to the statistical significance issue. Countless are the times I had to listen to numbers without solid ground, and we all know how numbers can be managed. :(

  35. 52
    Gregory Cox says:

    Simple mistake (especially for my clients) – not learning Advanced Segments – at all.

    I find that if they don't learn Advanced Segments, they just see the data as so much snow. But as soon as they learn Advanced Segments, my/their mind shifts into high gear – and start really engaging with the process of creating metrics.

  36. 53
    Andy Hubbard says:

    Another great post, Avinash!

    I especially liked how you skewered Klout that anonymous graph provider for their ridiculous axis units.

    A few weeks ago, I noticed in a similar orange and grey graph what looked like wild swings in a metric, that actually amounted to a range of less than one, because the axis major unit was 0.5!

    No wonder HiPPOs can't figure out what our graphs mean.

  37. 54
    Andy says:

    Good post.

    My point is that sometimes the conclusions need to be adjusted for the audience. I love in depth analysis but some business owners just want the bottom line. How much, how many, and when.

    For years we produced detailed reports with the conclusions on the final page and then we noticed that our customers would always read the reports from the back pages in, i.e. the conclusions first. LOL

  38. 55
    Rudy Chou says:

    As always very insightful. Data is meaningless without proper analysis and context.

    It's the analyst's job to analyze the data and make it meaningful so that stakeholders can make decisions. Of course, how this data is interpreted and presented is key.

    I strongly recommend that analysts who prepare data read through Edward Tufte's books on the visual display of quantitative data. This ensures that what you're reporting on is truly qualitative.

    -Rudy

  39. 56

    Great post Avinash.

    I appreciate the reminder to make the analysis we provide uber-consumable for our audience. I like the comparison of tables to charts and the beauty of simplicity that can lie in a simple percentage gained or lost table.

    It's something I try to remind myself whenever we're working on another 30-40-50 slide presentation for a client. If I can keep these guiding principles top of mind perhaps we can produce more 30 slide presentations and less 50 slide presentations :)

    Thanks,

    Anthony

  40. 57
    Ryan says:

    "Statistical Significance is Your BFF" lol love it!

    Great post.

  41. 58
    Sarah says:

    Hey Avinash,

    I've been following your blog for a while now, and although sometimes the content goes way over my head (I'm not actually an analyst at all, but I do work with SEO & PPC and need to run reports), I always get something out of your posts. The headline immediately grabbed me for this one, because right now I'm not only running my reports as part of my usual routine…I'm also running reports as part of the salary renegotiating process. Your examples definitely have me second guessing the different slices of data I'd been planning on using as "proof" that I deserve a raise. Here's what I'd planned on using:

    1. Website Traffic for January, compared by years 2009-2011 and split into Adwords, Google Organic and Total

    2. Same as above, for September (aka, the current month for the last time I ran the data)

    3. Bounce rate by month for 2009 vs 2011, 2009 being when I was hired

    4. Return and Unique visitors for 2011, by month and %

    Am I way off base? Is there some other metrics I should be focusing on instead? Anyone, feel free to rip my plan apart…I'm flying by the seat of my pants at the moment. Any advice is greatly appreciated.

    • 59

      Sarah: Everyone loves salary increases! Think of proving your value on three fronts.

      Improvements you made to: Acquisition, Behavior, Outcomes.

      So what can you show in terms of improvements you drove to acquiring traffic for your company? Where were the increases (search in your case)? Did you do so at a lower cost than the past? Was that for specific sources or across the board? Start with the All Traffic Sources report and you'll get almost everything you need there.

      What can you show in terms of improvements to the site itself? Did the changes you drove cause people to stay longer (lower bounce rates!)? Watch more videos? Abandon carts less? Come back more times (repeat visits, which you have in your comments)? Something else you did to improve behavior of people on the site?

      What can you show in terms of bottom-line impact? This outcomes impact is completely missing from your suggestion below, and without it you'll never get as big a raise as you deserve. So compute economic value (see #4 here: http://zqi.me/rocksocial ), or worst case conversion, revenue, lower cpa etc etc. Show that you had an impact on the business (in your case from your SEO, PPC efforts).

      In terms of time period. Mirror your performance review process. So if your salary increases are computed based on Jan 11 – Dec 11 then compare to Jan 10 – Dec 10.

      If your responsibilities increased then make sure you compare the new responsibilities to their past performance and show how you rocked with your new efforts. :)

      All the best!

      Avinash.

  42. 60
    Antonio says:

    Hello, sorry for my English. I read Web Analytics 2.0, which awoke my interest in the discipline, and since then I have tried to put it into practice at the small business where I work, but my HiPPO is very stubborn and has a very outdated way of thinking.

    To give you an idea: he trusts Google PageRank and the Alexa ranking over Google Analytics.

    For example, the average time on site according to Alexa is 19 minutes, while in Google Analytics it is 1:30 minutes.

    Obviously Alexa counts our own visits, and it seems he does not want to see that.

    He also thinks we have many more visitors than we really do, and blames the gap in the Analytics data on pages that are missing the tracking code.

    Any tip or post where you can see a way to make him see reason?

    Could you give me some tip on analytics basic configuration for the results to be the most realistic as possible?

    Congratulations for the post, and thanks in advance.

    Regards

    • 61

      Antonio: First it is important to realize that the value from Alexa's data set has been eclipsed by other sources that are much better. For example http://www.compete.com for the US and http://trends.google.com/websites for the US and international sites. I do realize that sometimes these sources might not have all the data (but then think of the quality of data you are getting).

      Ok with that out of the way…. Competitive intelligence data will never match your Google Analytics (or Omniture or WebTrends or…) data. In one case measurement is being done directly on your site (and hence is much better, in fact the best you can get) and in the other case (Alexa, Compete, Google Trends) it is being inferred from sources other than your website.

      Use CI data to compare yourself to other competitors or industry sites. You'll be comparing rotten apples to rotten apples, but at least it will be clean.

      Use your website analytics data to measure your own performance by itself.

      This post also lays out competitive intelligence data collection methods and tools in much more detail:

      ~ The Definitive Guide To (8) Competitive Intelligence Data Sources!

      Good luck!

      -Avinash.

  43. 62
    Phil Krnjeu says:

    Hey Avinash,

    I hopped on over to your blog from a link from SEO Moz. I wanted to thank you for writing this! I've been growing immensely in my internet marketing and social media knowledge over the last 2 years, basically. I can talk the talk and understand concepts… or so I thought!

    Coming from a Web developer background, I thought I understood analytics. But this post shows where the chink in my armor is! There is so much more to analytics than I thought.

    Besides reading your book, where do you recommend someone start off? From a "one man's shop" perspective, this just seems like another tough (yet doable) mountain to climb.

    Thanks! I look forward to reading your blog more!

  44. 63

    Lovely post.

    It is too easy to spit out data and think it has value. Taking time to think about what you are communicating with the data is so much harder, but more valuable.

    Haven't kept up with this blog for a while, but forgot how delicious it is.

  45. 64

    Once again, thanks. Pardon me for not 2 cents'ing in sooner.

    1) I'm not sure about dearly but fwiw, I love context as well. It's too often the missing component. Yes, it's always the (second) most important factor when trying to turn data into information. This is why – as you know – I don't like the FB Like. Without disLike or something to complete the picture a running count of people who hovered over a link, clicked it, and incremented a counter really doesn't tell us much.

    2) Both data/info producers and data/info consumers have a responsibility to insist on transparency. Similar to context between data points, it's essential to understand the data, its caveats, and its quality (or the lack thereof) that went into each and every graph, chart, recommendation, etc. If the presenter doesn't tell then the receiver must ask. Blindly accepting what's on the screen – TV screen included – is a major no no. Yet, this happens more frequently than lack of context.

    As usual, thanks for taking the time to share your thoughts and insights.

  46. 65
    Ankur Rautela says:

    This article is really very helpful.

    Thnx so much for sharing avinash.

  47. 66
    Theodor says:

    Every time I read a post like this it amazes me how many different takes there are on this profession. It is so versatile and it demands so much to master it. Since we are looking into many different fields within online, you must master not just the knowledge of how to read data but also how to get the most effect out of data.

    I actually wrote an article about this but from the inside view. How I think as a web analyst. My daily thoughts and views being a web analyst and conversion specialist. Post is here: http://bit.ly/seIzxM

    Avinash, keep the posts coming, even someone with experience learns something new each day.

    /T

  48. 67
    Sasha Jones says:

    Analysis is only a starting point and we should focus on the creative aspect, based on development of a Strategy for Unique Brand Communication through Social Media platforms

  49. 68
    Sasha Jones says:

    This approach requires too much effort in collecting and analyzing the data instead of focusing more on creative aspects

  50. 69
    Nick Lim says:

    I like the attention drawn to misleading charts, inevitable.

    My comment here is on the statistical significance topic, which was interestingly not very much commented on. I really think that any statistical analysis should highlight the effect size, the significance and the power of the test. Here's why:

    Something that is very statistically significant (i.e., unlikely to be the case if the null hypothesis is true) may not matter in real life because it does not move the needle. Example: say organic conversion rate is much higher than email, but organic conversions make up only 1% of total conversions.

    Something that is not statistically significant might still really matter in real life. Example: from Jul to Aug, you changed the email format and found a 3% increase in conversions, which may not be statistically significant. But email makes up 90% of all your conversions, so a 3% increase in conversion rate, while not statistically different, contributed to a 30% increase in overall conversions. It moved the needle.

    Anyone interested in this should read "The Cult of Statistical Significance": http://www.deirdremccloskey.com/articles/stats/preface_ziliak.php
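
A minimal sketch of the point above, using only Python's standard library: a pooled two-proportion z-test that reports effect size alongside significance. The function name `two_proportion_z_test` and all visitor and conversion counts are hypothetical, chosen only to mimic a 3% relative lift in email conversions; they are not from the post or the comment. Power is not computed here.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare two conversion rates with a pooled two-proportion z-test.

    Returns (absolute lift, relative lift, z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both rates are equal.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, (p_b - p_a) / p_a, z, p_value


# Hypothetical email numbers: 30,000 visits in each month,
# 900 conversions before the format change and 927 after (a 3% relative lift).
abs_lift, rel_lift, z, p_value = two_proportion_z_test(900, 30_000, 927, 30_000)
print(f"absolute lift: {abs_lift:.4%}")
print(f"relative lift: {rel_lift:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

With these made-up counts the lift is real in absolute terms but the p-value is well above 0.05, which is the situation the comment describes: whether the change matters depends on how much of total conversions the segment carries, not on the p-value alone.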

  51. 70
    alex says:

    I consider using Red/Green together a big no-no. Red-green colorblindness affects roughly 1 in 12 men.

    It is not a risk worth taking when there are millions of possible color combinations you could choose from and only a handful of colors that would cause problems.

  52. 71

    This was very helpful!

    I've only recently stumbled on data analysis as a career. Finally, something where I can put my mathematical and programming skills to good use! Knowing these common mistakes will definitely help me avoid problems as I learn the ropes.

    Thank you for taking the time to help us newbs out :)

  53. 72
    Payam says:

    Hi,

    Really nice refresher and reminders. Thank you!

Trackbacks

  1. [...]
    It is a process that is done by talented individuals who understand how the data is captured, suggest improvements for better ways of capturing the data and then use the data for analysis and recommendations. It’s a continuous improvement cycle.
    Nowhere is this more obvious than in Avinash’s most recent post on common mistakes that people make in analysis. Web analysis is hard – you have to be technical enough to work out how the data is collected to understand what it means. You have to be good enough with your communication to give your recommendations to management. You also have to be skilled enough in your analysis of data to be able to put together strong pieces.
    [...]

  2. [...]
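    Example: to exaggerate an effect, many charts start the vertical axis at a value close to where the data begins rather than at 0, or compress the scale, so that the chart looks steeper and leaves a stronger impression.
    Avinash Kaushik also gave an example of this in his article Seven Simple Mistakes That Limit Your Salary: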
    [...]

  3. [...]
    Can you be sure that a conversion rate of 5.56% is better than 4.87% if you only have a few data points? It’s important to think statistically when working with this kind of data.

    I wanted to share a Z-Test calculator that you can use directly in Google Analytics via a simple bookmarklet. It uses the jstat library. You’ll need to use the new version of Google Analytics and have e-commerce tracking enabled as it works on the e-commerce conversion rate.
    [...]

  4. [...]
    Making Data Digestible
    Avinash Kaushik gives agencies and dataheads alike some great advice in this post regarding your technique in presenting data to decision makers and clients. He highlights some common mistakes to avoid that involve time period calibration, use of color, statistical significance and evil pie charts. With each common mistake, Kaushik provides a suggestion on how to improve for greater clarity.
    What this means to agencies:
    [...]

  5. [...]
    Data Analysis 101: Seven Simple Mistakes That Limit Your Salary (Occam’s Razor): This is written about web site marketing, but the principles apply for any kind of data presentation. Kaushik also has a handy Statistical Significance Calculator.
    [...]

  6. [...]
    Our old friend Avinash Kaushik offers an awesome look at seven simple data analysis mistakes that can hurt your business.
    [...]

  7. [...] 7 Simple Google Analytics Mistakes we make – Kaushik (great point about viewing data weekly) [...]

  8. [...]
    The risk increases if the decisions are made by a third party based on the information we give them, because that information may be misinterpreted. The best-known web analyst, Avinash Kaushik, in his post “Data Analysis 101: Seven Simple Mistakes That Limit Your Salary”, has gathered seven common mistakes made when preparing data for presentation. Some of them seem very obvious, others perhaps less so, but all are based on real cases, committed by humans like all of us: visualization tricks, statistical nuances, and quirks of perception are some of the topics Avinash covers in that article.
    [...]

  9. [...]
    Data Analysis 101: Seven Simple Mistakes That Limit Your Salary
    [...]

  10. [...]
    This post over at Occam’s Razor by Avinash Kaushik provides some good insight into making that picture clear and meaningful. If you have to present data to others, and that is basically everyone in business, jump over and have a look; despite the heading, this is not about salaries.
    Data Analysis 101: Seven Simple Mistakes That Limit Your Salary
    [...]

  11. [...]
    Certainly, it seems to me that the argumentative gurus focus mostly on charts. But I share what I think is Avinash Kaushik’s way of seeing “data visualization”. Avinash, btw, doesn’t fit in that argumentative guru category. He’s pretty mellow. See View of Lake Tahoe from An Airplane. He’s even given the pie chart an OK on occasion, although, in general, he agrees with the guru community that Pie Charts are Evil for relative size presentation. I too have agreed that pie charts aren’t the best choice, Google Analytics Pie Chart Dashboards, but I also note that pie charts often win on the pretty side of presentation and for really BIG relative differences, can be really effective. When A Pie Chart Is The Best Chart
    [...]
