eMetrics DC '07 Reflections: Accuracy, Precision & Predictive Analytics

There is always something delightful to report back from each eMetrics summit, even if it is late in getting to you!

Typically though it tends to be stuff from new folks who present and bring new perspectives.

This time it was different. There were a ton of presenters and many new faces and voices. But both presentations that I found delightful, with real solid actionable nuggets, were from old people.

Ok before Jim and Neil crucify me let me hasten to add that by old people I simply mean my colleagues who have presented at many an eMetrics summit in the past (this was my seventh consecutive presentation at eMetrics, so I am "old" as well!!).

Both veterans presented something new and interesting that, IMHO, overshadowed other insights, for me. Three cheers for non-recycled content!!

For people who speak at many conferences, this is a gift. It takes an extra effort to come up with good new content, but it is appreciated as a gift (from the knowledge Gods! :)).

It was nice to see friends, blog / book readers, vendors, analysts and everyone else at the summit. Thanks to everyone for making the time. Let's get on with my favorite insights……

Who: Jim Novo, The Drilling Down Project
What: Web Analytics Meets Business Intelligence
Why: Accuracy, Precision and the Actionable Data Pyramid

Others have talked about this but I have never seen someone explain it so clearly, the difference between accuracy and precision……

[Image: accuracy vs precision]

Get it?

In a perfect world neither of the above is desirable. But each brings an interesting set of challenges, and one of them is preferable.

Which one is it?

Precision.

Jim recommends precision because it is predictable, and the insights gained from it can be actioned with significantly more confidence. Think about it….

If you don't know where the shot will land every time you fire, what can you predict about the next shot you fire?

But if you know where the shot will land every time you fire, even if you don't score a bull's-eye, can you predict what will happen with the next shot you fire?

Of course the choice now is stark.
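
To put rough numbers on the two targets, here is a minimal Python sketch. The two shooters, their bias and spread values, and the function name `fire` are all made up for illustration; none of this comes from Jim's slides.

```python
import random
import statistics

def fire(bias, spread, shots, rng):
    """Simulate horizontal miss distances (0 = bull's-eye) for one shooter."""
    return [rng.gauss(bias, spread) for _ in range(shots)]

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical shooters: these values are assumptions, not measurements.
accurate_not_precise = fire(bias=0.0, spread=5.0, shots=1000, rng=rng)
precise_not_accurate = fire(bias=4.0, spread=0.5, shots=1000, rng=rng)

for name, shots in [("accurate, not precise", accurate_not_precise),
                    ("precise, not accurate", precise_not_accurate)]:
    mean = statistics.mean(shots)
    stdev = statistics.stdev(shots)
    print(f"{name}: average miss {mean:+.2f}, spread {stdev:.2f}")

# The precise shooter is consistently off by a known amount, so that
# offset can be measured once and subtracted from every later shot.
offset = statistics.mean(precise_not_accurate)
corrected = [s - offset for s in precise_not_accurate]
```

The first shooter is right on average but any single shot could land anywhere. The second is consistently off by about the same amount, so the next shot is predictable and the offset can be corrected for.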

Most people (Marketers, Analysts, Decision Makers, Report Writers) focus on Accuracy.

I think it is driven by "business world 1.0", where things were far less complex and the world moved at a glacial pace. The price of perfection was bearable because decisions were made on three variables, and even if it took five months to get the last 4% of accuracy, it was worth it.

Because decisions were big, change was slow, mistakes were expensive, tolerance for risk was low.

Unfortunately "business world 1.0" is dead. At least online. It has been for some time; it will take the Fortune 500 a little while to realize that. Decisions are made on a lot more than just three variables (to get a sense for it just see Web Analytics 2.0). They need to be made much faster (if you don't, your agile competitors will). Risk can be managed (even with your most outrageous ideas you can run a test and control for risk – just split: control 95% and test 5%).
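
One way to get that 95% / 5% split is to hash each visitor's id, so the same visitor always lands in the same group. This is only a sketch: the function name `assign_group`, the experiment label, and the visitor ids are hypothetical, and your testing tool may well do this differently.

```python
import hashlib

def assign_group(visitor_id, test_share=0.05, experiment="outrageous-idea"):
    """Deterministically place a visitor in 'test' or 'control'."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # First 8 hex digits give an integer below 2**32; scale it to [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_share else "control"

# Made-up visitor ids, just to check the split comes out near 5%.
groups = [assign_group(f"visitor-{i}") for i in range(100_000)]
test_fraction = groups.count("test") / len(groups)
print(f"share of visitors in test: {test_fraction:.3f}")
```

Because the assignment depends only on the experiment label and the visitor id, a returning visitor sees the same version every time, and only about one in twenty is exposed to the risky idea.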

Change is all around you and happening faster every single day. For all of us who don't want to get run over, let's determine to go for precision and not accuracy. Please.

Do the best you can with Tags, Cookies, Instrumentation, then jump into the arms of the sexy Ms. Precision.
[I don't know why but Accuracy seems so much more a male thing! Yes I get the irony.]

End of what you all surely agree is a rant.

There was one more delightful slide from Mr. Novo. This was particularly powerful for me.

What data yields insights that can be actioned the most? Here's Jim's answer…..

[Image: actionable data pyramid]

Did I already say I adored this slide? It is adorable.

Jim's points are extremely obvious (the actionability and relevance of insights decrease as you go down the slide). Let me share my learnings.

There is this myth that if only I know who you are then I'll be able to find earth shattering insights that are relevant and actionable. Your age, your income, your marital status, your education level. That is worth a lot less than people realize.

I think partly people don't trust web analytics data because it is anonymous and cookie based. Demographic data seems to be the magic answer. While it is useful in some cases, increasingly for many businesses it is of less relevance and scores lower on the actionability index.

Jim has worked in the online, offline and nonline worlds. His experience is that the actual behavior of your customers is the most useful source of insights that drive action (what they do on your site, what they have done in the past, what made them fork over money to you, what creative / messaging got them to submit the lead…).

Then come inferences based on implied behavior. You are doing this so far and everyone else who did that ultimately ended up sending us a truck full of money. People who have compared cars on Yahoo Auto and are on our site probably want to do this/that. This takes an even balance of art and science.

Then come Psychographics and finally demographic data.

I have a lot less experience than Jim but my humble experience, especially online, has reflected Jim's recommendation above.

Don't let red herrings lead you down paths where the output of your effort leads to a red face.

Jim's blog is: Marketing Productivity.

__________________________________________________

Who: Neil Mason, Applied Insights Foviance
What: Cutting through the NOISE!!
Why: Application of Predictive Analytics

I have had the distinct pleasure of hearing Neil Mason speak at least three times and each time I am impressed with his insights and passion.

In our space Neil is the most "I have it all together and you will listen to me and you will be wow'ed" speaker. If you get a chance don't miss his talk.

Neil's presentation had this wonderful slide that I was quite smitten with.

It outlines something simple (yet non obvious): all of the variables that will determine the size of Neil's audience at a conference……

[Image: application of predictive analytics]

Notice how incredibly well thought out it is! Neil's thought of all the elements, and now he has a magic formula that spits out a number. 250. Packed into a room that holds 150! :)

Neil's slide, for some odd reason, made me think of how hard it is to understand all the reasons why there is an outcome on your website.

Just look at the variables for a "simple" problem that Neil tries to solve above.

Now imagine trying to understand why your website is doing better than last month or worse. I think people desperately underestimate the complexity of mastering this task.

Take for example conversion rate.

Your boss comes into your office and says improve conversion rate by 10%. Not to ten points, that would be huge! By ten percent.
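
The gap between "ten percent" and "ten points" is easy to see with numbers. A quick sketch, assuming a made-up baseline conversion rate of 2% (your own rate will differ):

```python
baseline = 0.020  # hypothetical current conversion rate: 2.0%

up_ten_percent = baseline * 1.10  # relative lift: 10% of the baseline
up_ten_points = baseline + 0.10   # absolute lift: 10 percentage points

print(f"baseline:           {baseline:.1%}")
print(f"improved by 10%:    {up_ten_percent:.1%}")
print(f"improved by 10 pts: {up_ten_points:.1%}")
```

With that baseline, a 10% improvement moves you from 2.0% to 2.2%, while ten points would mean 12.0%, six times your current rate.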

What do you do?

How easy or hard is your task?

Should you run out and spend a ton of money on Affiliates / Email Campaigns / Paid Search Ads? Should you run to identify the demographic profiles of people who visit your website? [That was a trick question, the answer is no! :]

Instead I recommend that you do the "Neil Mason Exercise".

Here is what my humble attempt looks like…….

[Image: application of predictive analytics, conversion]

Before I figure out how to improve conversion rate I am going to sit down and identify all my "levers". That's what you see above.

Conversion rate depends on my acquisition strategy (where am I spending money to acquire traffic), my organic ranking of the "head" keywords, how crappy my checkout process is, distribution of why people come to my site (Primary Purpose), my website "scent" (Tip of the hat to the Eisenbergs) and so on and so forth.

In fact as I was writing these I ended up with way more variables than the seven slots available from Neil's slide. They are all the other green arrows you see above. :)

Lesson #1: This exercise is of tremendous value.

Lesson #2: This is hard.

Lesson #3: You can't improve what you don't understand.

Next time you get a challenge to improve a metric, any metric, go through the exercise above. Then……

Go get the data for each of the variables you have identified and try to identify where the true opportunities are for improvement (classic: here are three areas out of 15 we stink at and now let's do a cost benefit analysis of where we can get the maximum bang for our bucks).
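
Here is one rough way that cost benefit step could look in Python. Everything in it is hypothetical: the `Lever` fields, the four levers, the scores, lifts, costs, and the 0.6 cut-off are placeholders for whatever data you collect for your own variables.

```python
from dataclasses import dataclass

@dataclass
class Lever:
    name: str
    current_score: float  # 0 to 1, how well we do today (made-up scale)
    est_lift: float       # estimated extra orders per month if fixed
    est_cost: float       # estimated cost to fix, in dollars

# Hypothetical levers and numbers, for illustration only.
levers = [
    Lever("checkout process", 0.30, 400, 20_000),
    Lever("head keyword ranking", 0.55, 250, 50_000),
    Lever("site scent", 0.40, 300, 10_000),
    Lever("paid search mix", 0.80, 50, 5_000),
]

# Step 1: keep only the areas "we stink at" (score below a chosen cut-off).
weak = [lv for lv in levers if lv.current_score < 0.6]

# Step 2: rank those by estimated orders gained per dollar spent.
ranked = sorted(weak, key=lambda lv: lv.est_lift / lv.est_cost, reverse=True)

for lv in ranked:
    per_thousand = lv.est_lift / lv.est_cost * 1000
    print(f"{lv.name}: {per_thousand:.1f} orders per $1,000")
```

The ranking is only as good as the estimates you feed it, which is why the hard part is identifying the levers and collecting honest data for each one.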

If you have done a good job of identifying all the variables then I promise at the end of this exercise you'll be surprised at what you need to improve to win. It won't be the obvious areas.

Neil's Blog is: Applied Insights. [Neil: More content please!!! :)]

__________________________________________________

Ok I am done! One summit, three excellent ideas, what more could anyone ask for!

It's your turn now…..

Please share your perspectives, critique, additions, subtractions, bouquets and brickbats via comments. Thank you.

Comments

  1.
    Ankur Mody says:

    Hi Avinash,

    According to me, Acquisition strategy optimization and cart/checkout complexity are the two stand-out points for predictive analysis of Conversion.

    Your humble attempt to guide us in this respect is praiseworthy.

    Keep doing it
    Ankur P Mody

  2.
    Jim Novo says:

    Thanks for the callout Avinash!

    Re: Accuracy vs. Precision, if you are having trouble getting "traction" with management, please consider this: Management is in the business of Forecasting, making Precision more important to them than Accuracy.

    A measurement approach that yields Consistent, Repeatable results rather than "Absolute Accuracy" may be in your best interest.

    Typically, this means not drilling down so deeply. Same data, just not drilled down to the level where noise becomes very loud and signal weak!

  3.
    Manju Muthukumaresan says:

    Thanks for sharing the ideas… I have to agree with the checkout complexity as one of the key factors.

    The feeder from Applied Insights blog is broken. See link below. Avinash, could you help address this?

    http://www.applied-insights.co.uk/news/category/blog/feed/

  4.
    Ned Kumar says:

    Avinash, thanks for a great post – I really enjoyed reading it.

    Hats off to Jim – I am in complete agreement with his thoughts on accuracy vs precision. As you point out, it is a changing world and going for the "deterministic" measurement approach can be fatal in the marketplace (IMHO the go-to-market speed is more critical than accuracy). I was sitting and chewing on this and remembered something from my high-school physics – Heisenberg's Uncertainty Principle. To have some fun, I thought I will have twist on it to reflect Jim's theory :-)) — The more time you spend on increasing the "ACCURACY" (zero variance) of a metric, the less time you will have to make it "PRECISE" (consistent and repeatable) and vice-versa.

    On the predictive analytics: well, coming from the world of traditional analytics and modeling I am for sure biased and will vote for Neil all the time :-). The only comment I want to add is this: as you and Neil point out, identifying the key levers or variables that impact the outcome is the important thing. I have seen folks not spend enough time on this task and then use fancy tools/tech like neural networks or genetic algorithms to get "good" results. This is a waste of time: the technology can deliver an output quality only as good as the input quality.

    Ciao,
    Ned

  5.
    Tim Wilson says:

    Dead on! As always (it's a familiar refrain)! Thanks for the great summary and added color.

    A combination of nitpicking and an extension of the concept with regards to application of predictive analytics. I FULLY realize that I'm about to make a point that is out of the context of the full presentation, so, Neil, if you read this, realize that I *suspect* you might have covered this in the delivery of the presentation. BUT…

    It strikes me as odd that none of the factors driving audience size was directly related to the content of the presentation: Content Relevance or Content Freshness or Topic Hotness. If someone presented "Pictures of My Hawaiian Vacation" at eMetrics…the lack of relevance would dwarf the other factors.

    Now, that's saying that there can be factors that are subjectively measured that need to be taken into account. And, it may be that, given that the topic wouldn't be accepted if it wasn't reasonably current/pertinent, it may be that that's a non-significant variable when it comes to a practical application of the model. Still, it's very much a lever that can be controlled.

    So…the extension of the idea that comes from this nitpicking is that, when sitting down to list out the levers, sometimes it makes sense to list levers that you *cannot* easily quantify or that you cannot necessarily influence (the strength of the economy, for instance). That allows you to consider whether you're building a predictive model that is largely irrelevant or not — either through logic (all eMetrics presentations are relevant) or testing (is there a correlation between the economy and my dependent variable that is so strong and highly weighted that I'm not likely to be able to build a strong/meaningful model without it).

  6.
    Florian Pihs says:

    Thought provoking! Now when I think a bit more about the predictive analysis slides, they start to look very much like a simplified fishbone diagram to me. A very useful tool indeed.

  7.
    Tim Peter says:

    Avinash,

    Great. As always.

    A good friend of mine got me hooked on precision a few years back with the following folksy quote:

    "Apples are apples. It doesn't matter if your apples are rotten as long as you're comparing 'em to other rotten apples."

  8.
    deric Loh says:

    Avinash,

    thanks for the Awesome stuff you shared with us,

    Very true….

    Accuracy vs. Precision…..

    Ready than taking your dollars and cents and just spreading your fire and doesn't really hit the precise point.

    More please…haha

  9.
    Billy Shih says:

    Haha, go ahead and steal it! It took me a good 10 seconds to make it in PowerPoint :)

  10.
    Rick Galan says:

    Great advice! I have been asked by my boss to do basically the exact same thing as your conversion rate example above. I will give Neil's & your method a shot, and let you know how it works out.

  11.
    Konstantin Drapkin says:

    Thanks so much for the insights gleaned from the summit. Goes to show that there's always room to grow-

  12.
    Juan Damia says:

    Hi Avinash, I really love this kind of post you normally write every time you come back from a conference / event.

    There is just one thing I would like to point out. Behavioral information (from your web analytics tools), at least until today, lets you understand what users are doing on your site, but not what they are doing across different periods of time. Why? Because you won't be able to create clusters of users and follow up on those specific users. What you can do is follow the whole group of users, which can, and probably will, change over time. So if behavior changes from one year to the next, saying that your users have changed their visiting behavior won't be correct, or at least you will never know.

    Regarding demographic information: as you say, it is not as important as people normally think, but in my opinion this not-so-important information gains a lot of value when added to the rest of your information, and becomes very useful.

    Finally, in my opinion, using behavioral information without attitudinal information is very risky, since you won't be able to know why people are behaving one way or another. You will only be able to say that people are behaving like this or like that, but not why. This normally ends with managers inferring, for example, that people are not visiting some content on their site because it is not attractive to them, when what could be happening is that the content is not well promoted.

    Thanks for the post, I have really enjoyed it!!!

  13.

    Juan : As always YMMV! :)

    I can imagine that a Woman's lingerie website could benefit from knowing that a website visitor is Female or Male (the latter have no idea what they are doing on a Woman's lingerie site so they can be charged high prices and sold anything!).

    But for most websites my humble experience reflects Jim's slide, behavioral data is a ton more actionable than demographic. And of course in context.

    That last part is important. I am a much bigger fan of the "Google philosophy": What I know about you (behavior) from the immediate past (say the last few days) is a better predictor of what you want now. I might have your history from last year or from when you were born, but that is less of a predictor of what you want now. Maybe some, but the most recent history is most delightful.

    Thanks for your thought provoking comment, please keep 'em coming.

    -Avinash.
    PS: Tim : Your quote brought a smile, it is fantastic. Thank you for sharing, I am going to use it in my presentations and am off to look for pictures of rotten apples! :)

  14.
    Tim Peter says:

    Thanks, Avinash. I'd say I hope you find a good picture, but I shudder to think what you'll see along the way. :-)

    On a separate note, I have to agree with your point about the value of behavioral data. The key is that demographic and attitudinal data has limited utility for purposes of predicting behavior. People do what they do, not what they say they'll do, or what people who look like them do.

    I used to run e-commerce for a series of mass market brands, attracting many tens of millions of unique visitors each year who spent well over $1 billion during my tenure there (a pretty significant sample ;-). We segmented attitudinal data and demographic data across a number of behaviors and found:

    1. The demographic attributes of the folks who purchased looked just like the demographics of the folks who didn't.

    2. The attitudes of those that purchased varied slightly from those that didn't. The main difference was that those who purchased more frequently tended to have a more favorable impression of the brand, though which was cause and which was effect, we were never able to say with confidence.

    Juan, I do think that attitudinal data and demographic data may help you determine which types of sales and marketing channels provide you the greatest benefit (i.e., you might have more success with TV advertising – for example – if your core demographic consists of older folks who like to watch a lot of TV). Otherwise, nothing in my experience tells you what you can expect customers to do better than what those customers have done in the past. Good thing we're starting to get tools that allow us to know what that is and do something with it.

    Keep up the great work, Avinash and good luck on your rotten apple picking!

  15.
    biswarup says:

    Nice topic…good work by Jim…'accuracy' & 'precision' reminded me of the same 'reliability & validity' issue in multivariate data analysis.

  16.
    Hans says:

    I get quite impressed with your blog posts. A lot of knowledge. You are also a good writer.

    One thought though is that I get the feeling you generally talk about looking at something very specific. In some posts maybe it isn't visible that you do that?

    Do also feel that you talk about data for very specific entities such as one brand? Or do you feel the same method can be used on bigger issues and still be the best solution?

Trackbacks

  1. [...] Online testing is a bit different from other marketing data. It uses live traffic to find out what works. Analytics is the same, measuring what’s occurring at the moment. So why is that important? Well you can infer all you want from surveys, usability studies and demographics, but in the end you can’t argue against what real users are doing. Avinash Kaushik, a popular analytics blogger, summed up the juiciest bits of a presentation by Jim Novo at eMetrics. In it, Jim asked, “What data yields insights that can be actioned the most?” The answer: [...]

  2. [...] E-commerce allows for precise measurement. Or mostly precise. Or some kind of precise, dammit. Just don’t tell me it can’t be measured. And don’t confuse precision with accuracy. Yes, unique phone numbers and coupons have any number of flaws. They’re not completely accurate. Bummer. They’re more measurable than not using unique phone numbers or coupons. And for those of you in my past life (you know who you are): my objection to coupons was specific to that implementation, not to the concept in general. If you can’t measure it, you can’t manage it. So don’t bother doing something you’re not bothering to measure. [...]

  3. [...] Actionable web metrics are precise. Don’t confuse precision with accuracy. While Avinash Kaushik explains the difference between precision and accuracy better than I can, I’ll summarize with my favorite quote about this topic: Apples are apples. It doesn’t matter if your apples are rotten as long as you’re comparing ‘em to other rotten apples.” [...]

  4. [...] 1. The obsession with precision in this medium. This medium/channel is one of the most measurable; others such as television or the press are tremendously less measurable, and yet it seems that here we are throwing away the channel's possibilities simply because we cannot take it to the maximum degree of precision… what matters in this case is the trend, not the precision. I recommend this read from Avinash. [...]

  5. [...]
    This Weblink, again from the highly respected and insightful Avinash Kaushik, highlights a major eMetrics Web analytics conference from two years ago, which was linked to a related update from a just completed similar conference in San Jose this year — and specifically that — I add especially in 2009, in these tough times for so many — data accuracy rather than exact precision is most valued, and specifically in what customers have done in the near past that predicts immediate future behavior (and spending which puts money in your pocket) rather than demographic profiles of so-called like types, or even what customers say they are going to do or spend (which often turns out not to be the case).
    [...]

  6. [...]
    As you see while striving for accuracy can reduce the absolute error, it increases the randomness of the error value as well. While it is just the opposite when striving for precision. The decreased randomness also renders itself more favorably towards predictive ability of the system.

    Hence, analytics systems should really be chasing precision and not accuracy. Knowing that a system always has some definite error is surely better than knowing that a system has a small but relatively indefinite error.

    All said, the pursuit for an accurate AND precise system continues.

    Some more on the topic by Avinash Kaushik.
    [...]

  7. [...]
    The challenge is that all analytics programs have some set of accuracy issues and, due to different methodologies in use, it’s rare that you’ll get two different analytics tools to line up. Avinash Kaushik has looked at the difference between accuracy and precision in measurement before and as some brilliant commenter noted, “Apples are apples. It doesn’t matter if your apples are rotten as long as you’re comparing ‘em to other rotten apples.”
    [...]

  8. […]
    Other sources for this piece:
    Web Data Quality: A 6 Step Process To Evolve Your Mental Model
    eMetrics DC ’07 Reflections: Accuracy, Precision & Predictive Analytics
    […]
