08 Mar 2010 02:18 am

ThreeData, data everywhere yet nary an insight in sight.

Is that your web analytics existence?

Don't feel too bad, you share that plight with most citizens of the Web Analytics universe.

The problem? The absolutely astonishing ease with which you can get access to data!

Not to mention the near limitless potential of that data to be added, subtracted, multiplied, and divided to satiate every weird need in the world.

You see just because you can do something does not mean you should do it.

And yet we do.

Like good little Reporting Squirrels we collect and stack metrics as if preparing for an imminent ice age. Rather than being a blessing that stack becomes a burden because we live in times of bright lovely spring and nothing succeeds like being agile and nimble about what we collect, what we give up, and what we deliberately choose to ignore.

The key to true glory is making the right choices.

In this case its making right choices about the web metrics we knight and sent to the battle to come back with insights for our beloved corporation to monetize.

A very simple test can allow you to figure out if the metric you are dutifully reporting (or absolutely in love with) is gold or mud.

It is called the Three Layers of So What test. It was a part of my first book, Web Analytics: An Hour A Day.

What's this lovely test?

Simple really (occam's razor!):

Ask every web metric you report the question "so what" three times.

Each question provides an answer that in turn raises another question (a "so what" again). If at the third "so what" you don't get a recommendation for an action you should take, you have the wrong metric. Kill it.

the three test

This brutal recommendation is to force you to confront this reality: If you can't take action, some action (any action!), based on your analysis, why are you reporting data?

The purpose of the "so what" test is to undo the clutter in your life and allow you to focus on only the metrics that will help you take action. All other metrics, those that fall into the nice to know or the highly recommended or the I don't know why I am reporting this but it sounds important camp need to be sent to the farm to live our the rest of their lives!

Ready to rock it?

Let's check out how you would conduct the "so what" test with a couple of examples.

Key Performance Indicator: Percent of Repeat Visitors.

You run a report and notice a trend for this metric.

Here is how the "so what" test will work:

"The trend of repeat visitors for our website is up month to month."

So what?

"This is fantastic because it shows that we are a more sticky website now."

(At this point a true Analysis Ninjas would inquire how that conclusion was arrived at and ask for a definition of sticky, but I digress.)

So what?

"We should do more of xyz to leverage this trend." (Or yxz or zxy – a specific action based on analysis of what caused the trend to go up.)

So what?

If your answer to that last "so what" is: "I don't know… isn't that a good thing… the trend is going up… hmm… I am not sure there is anything we can do… but it is going up right?"

At this point you should cue the sound of money walking out the door.

Bottom-line: This might not be the best KPI for you.

Let me hasten to point out that there are no universal truths in the world (though some religions continue to insist!).

Perhaps when you put your % of Repeat Visitors KPI to the "so what" test you have a glorious action you can take that improves profitability. Rock on! More power to you!

Many Exit Signs

Key Performance Indicator: Top Exit Pages on the Website.

[Before we go on please know that top exit pages is a different measurement than top pages that bounce.]

You have been reporting the top exit pages of your website each month, and to glean more insights you show trends for the last six months.

"These are the top exit pages on our website for the last month."

So what? They don't seem to have changed in six months.

"We should focus on these pages because they are major leakage points in our website."

So what? We have looked at this report for six months and tried to make fixes, and even after that the pages listed here have not dropped off the report.

"If we can stop visitors from leaving the website, we can keep them on our web site."

So what? Doesn't everyone have to exit on some page?

The "so what" test in this case highlights that although this metric seems to be a really good one on paper, in reality it provides no insight that you can use to drive action.

Because of the macro dynamics of this website, the content consumption pattern of visitors does not seem to change over time (this happens when a website does not have a high content turnover – like say a rapidly updating news site), and we should move on to other actionable metrics.

Here the "so what" test not only helps you focus your precious energy on the right metric, it also helps you logically walk through measurement to action.

Conversion Rate Efficiency

Key Performance Indicator: Conversion Rate for Top Search Keywords.

In working closely with your search agency, or in-house team, you have produced a spreadsheet that shows the conversion rate for the top search keywords for your website.

"The conversion rate for our top 20 keywords has increased in the last three months by a statistically significant amount."

So what?

"Our pay-per-click (PPC) campaign is having a positive outcome, and we should reallocate funds to these nine keywords that show the most promise."

Okay.

That's it.

No more "so what?"

With just one question, we have a recommendation for action. This indicates that this is a great KPI and we should continue to use it for tracking.

Notice the characteristics of this good KPI:

#1: Although it uses one of the most standard metrics in the universe, conversion rate, it is applied in a very focused way – just the top search keywords. (You can do the top 10 or top 20 or as many "head keywords" as it makes sense in your case, just be aware this does not scale to the "mid" or "tail".)

#2: It is pretty clear from the first answer to "so what?" that for this KPI the analyst has segmented the data between organic and PPC. This is the other little secret: no KPI works at an aggregated level to by itself give us insights. Segmentation does that.

task completion rate

Key Performance Indicator: Task Completion Rate.

You are using a on-exit website survey tool like 4Q to measure my most beloved metric in whole wide world and the universe: task completion rate. (You'll see in a moment why. :)

Here's the conversation…

"Our task completion rate is down five points this month to 58%."

So what?

"Having indexed our performance against that of last quarter, each one percent drop causes a loss of $80,000 in revenue."

So what? I mean in the name of thor, what do we do!

"I have drilled down to the Primary Purpose report and most of the fall is from Visitors who were there to purchase on our website, the most likely cause is the call to action on our landing pages and a reported slowness in response when people add to cart."

Good man. Here's a bonus and let's go fix this problem.

Nice right?

Notice in this case you have a inkling to the top super absolutely unknown secret of the web analytics world: If you tie important metrics to revenue that tends get you action and a god like status.

Keep that in mind.

So that's the story of the "so what" test. A simple yet effective way of identifying the metrics that matter.

This strategy is effective with all that we do, but it is particularly effective when it comes to the normal data puke we call the "management dashboard". Apply the "so what" test and you'll make it into a Management Dashboard.

Closing Summary:

Remember, we don't want to have metrics because they are nice to have, and there are tons of those.

We want to have metrics that answer business questions and allow us to take action—do more of something or less of something or at least funnel ideas that we can test and then take action.

The "so what" test is one mechanism for identifying metrics that you should focus on or metrics that you should ditch because although they might work for others, for you they don't pass the "so what" test.

And killing metrics is not such a bad thing. After all this is the process that has been proven to work time and time again:

web analytics metrics lifecycle process

More here: Web Metrics Demystified.

Ok now it's your turn.

Do you have a test you apply to your web metrics? What are your strategies that have rescued you during times of duress? What do you like about the "so what" test? What don't you like about it? Do you have a metric that magnificently aced the "so what" test?

Please share your comments, feedback and life lessons via comments.

Thanks.

PS:
Couple other related posts you might find interesting:

Social Bookmarks:

  • Twitter
  • del.icio.us
  • Reddit
  • Google Bookmarks
  • StumbleUpon
  • Sphinn
  • Digg
  • Facebook
  • FriendFeed
  • LinkedIn
  • PDF
22 Feb 2010 02:19 am

Many SplendorCompetitive intelligence, the "what else", is one of the core tenets of Web Analytics 2.0.

The reason is simple: The ecosystem within which you function on the web contains mind blowing data you can use to become better.

Your traffic grew by 6% last year, what was your competitor's growth rate? 15%. Feel better? : ) When should you start doing paid search advertising for tours to Italy for 2011? In May 2010 (!). What is your "share of search" in the netbook segment compared to your biggest competitor? 9 points higher, now you deserve a bonus! How many visitors to your site go visit your competitor's site right after coming to yours? 39%, good god! Where to do display advertisements to ensure you get in front of men considering proposing to their girlfriends (or boyfriends)? Go beyond targeting men between the age of 28 and 34, use search behavior and be really smart.

I am just scratching the surface of what's possible.

It is simply magnificent what you can do with freely available data on the web about your direct competitors, your industry segment and indeed how people behave on search engines and other websites.

The secret to making optimal use of CI data lies in one single realization: You must ensure you understand how the data you are analyzing is collected.

Not all sources of CI data are created equal. It is key that before you use the data that comScore or Nielsen or Google or HitWise or Compete or your brother-in-law shove into your face that you understand where the data comes from.

Once you understand that you choose: 1. The best source possible that is 2. The right answer for the question you are asking (which implies you have to be flexible!).

Here are the sources of competitive intelligence data.

#1: Toolbar Data.

Toolbars are add-on's that provide additional functionality to web browsers, such as easier access to news, search features, and security protections. They are available from all the major search engines such as Google, MSN, Yahoo! as well as from thousands of other sources.

yahoo toolbar

These toolbars also collect limited information about the browsing behavior of the customers who use them, including the pages visited, the search terms used, perhaps even time spent on each page, and so forth. Typically, data collected is anonymous and not personally identifiable information (PII).

After the toolbars collect the data, your CI tool then scrubs and massages the data before presenting it to you for analysis. For example, with Alexa, you can report on traffic statistics (such as rank and page views), upstream (where your traffic comes from) and downstream (where people go after visiting your site) statistics, and key-words driving traffic to a site.

Millions of people use widely deployed toolbars, mostly from the search engines, which makes these toolbars one of the largest sources of CI data available. That very large sample size makes toolbar data a very effective source of CI data, especially for macro website traffic analysis such as number of visits, average duration, and referrers.

Search engine toolbars are a ton more popular which is the key reason that other toolbars data sources, such as Alexa, are not useful any more (the data is simply not good enough).

Toolbar Data Bottom-line: Toolbar data is typically not available by itself. It is usually a key component in tools that use a mix of sources to provide insights.

#2: Panel Data.

Panel data is another well-established method of collecting data. to gather panel data, a company may recruit participants to be in a panel, and each panel member installs a piece of monitoring software. The software collects all the panel’s browsing behavior and reports it to the company running the panel. Additionally the person is also required to self report demographic, salary, household members, hobbies, education level and other such detailed information.

panel data

Varying degrees of data are collected from a panel. At one end of the spectrum, the data collected is simply the websites visited, and at the other end, the monitoring software records the credit cards, names, addresses, and any other personal information typed into the browser.

Panel data is also collected when people unknowingly opt into sending their
data. Common examples are a small utility you install on your computer to get the weather or an add-on for your browser to help you auto complete forms. in the unreadable terms of service you accept, you agree to allow your browsing behavior to be recorded and reported.

Panels can have a few thousand members or several hundred thousand.

You need to be cautious about three areas when you use data or analysis based on panel data:

Sample bias: Almost all businesses, universities, and other institutions ban monitoring software because of security and privacy concerns. Therefore, most monitored behavior tends to come from home users. Since usage during business hours forms a huge amount of web consumption, it is important to know that panel data is blind to this information.

Sampling bias: People are enticed to install monitoring software in exchange for sweep-stakes entries, downloadable screensavers and games, or a very small sum of money (such as $3 per month). This inclination causes a bias in the data because of the type of people who participate in the panel. This is not itself a deal breaker, but consider whose behavior you want to analyze vs. who might be in the sample.

Web 2.0 challenge: Monitoring software (overt or covert) was built when the Web was static and page-based. The advent of rich experiences such as video, Ajax, and Flash means no page views, which makes it difficult for monitoring software to capture data accurately. Some monitoring software companies have tried to adapt by asking companies to embed special beacons in their website experiences, but as you can imagine, this is easier said than done (select few want to do it, then do they do it well and how do you compare companies that did or did not beaconify?).

The panel methodology is based on the traditional television data capture model. In a world that is massively fragmented, panels face a huge challenge in collecting accurate and complete (or even representative) data. A rule of thumb I have developed is if a site gets more than 5 million unique visitors a month, then there is a sufficient signal from panel-based data.

Panel Data Bottom-line: For companies such as comScore and Nielsen panel data has been a primary source for competitive reporting they provide their clients. But because of the methodology's inherent limitations, recently panel data is augmented by other sources of data before it is provided for analysis (including for a subset of data you'll get from comScore and Nielsen – please check & clarify before you use the data).

#3: ISP (Network) Data.

We all get our internet access from Internet Service Providers (ISP's), and as we surf the Web, our requests go through the servers of these ISP's to be stored in server log files.

network cables

The data collected by the ISP consists of elements that get passed around in URLs, such as sites, page names, keywords searched, and so on. The ISP servers can also capture information such as browser types and operating systems.

The size of these isps translates into a huge sample size.

For example, Hitwise which chiefly relies on isp data, has a sample size of 10 million people in the United states and 25 million worldwide. Such a large sample size reduces sample bias (surprise!). There are also geographically focused competitive intelligence solutions, like Netsuus in Spain, that provide analysis from excellent locally sourced data.

The other benefit of ISP data is that the sampling bias is also reduced; since you and I don’t have to agree to be monitored, our ISP simply collects this anonymous data and then sells it to third-party sources for analysis.

ISP's typically don’t publicize that they sell the data, and companies that purchase that data don’t share this information either. So, there is a chance of some bias. Ask for the sample size when you choose your ISP-based CI tool, and go for the biggest you can find.

ISP Data Bottom-line: The largest samples of CI data currently available comes from ISP data (in tools like Hitwise and Compete). Though both tools (and other smart ones like them) increasingly use a small sample of panel data and even some small amount of purchased toolbar data.

#4: Search Engine Data.

Our queries to search engines, such as Bing, Google, Yahoo!, and Baidu, are logged by those search engines, along with basic connectivity information such as IP address and browser version. In the past, analysts had to rely on external companies to provide search behavior data, but increasingly search engines are providing tools to directly mine their data.

search engine

You can use search engine data with a greater degree of confidence, because it comes directly from the search engine (doh!). Remember, though, that the data is specific to that search engine—and because each search engine has distinct user base, it is not wise to apply lessons from one to another.

With that mild warning here are some amazing tools. . . .

In Google AdWords, you can use Keyword Tool, the Search-based Keyword Tool and Insights for Search.

Similar tools are available from Microsoft: Entity Association, Keyword Group Detection, Keyword Forecast, and Search Funnels (all at Microsoft adCenter Labs).

Search Engine Data Bottom-line:Search engine data tends to be the primary, and typically the best source you can find, for search data analysis. If you are analyzing data for your SEO or PPC campaigns and you find the search engine providing the data then you should instantly embrace it and immediately propose marriage!

#5: Benchmarks from Web Analytics Vendors.

Web analytics vendors have lots of customers, which means they have lots of data. Many vendors now aggregate this real customer data and present it in the form of benchmarks that you can use to index your own performance.

Benchmarking data is currently available from Fireclick, Coremetrics, and Google Analytics. Often, as is the case with Google Analytics, customers have to explicitly opt in their data into this benchmarking service. But this is not always true for all vendors, please check with yours.

web analytics benchmarking report

Both Fireclick and Coremetrics provide benchmarks related to conversion rates, cart abandonment, time on site, and so forth. Google Analytics provides benchmarks for visits, bounce rates, page views, time on site, and % new visits.

In all three cases, you can compare your performance to specific vertical markets (for example, retail, apparel, software, and so on), which is much more meaningful.

The cool benefit of this method is that websites directly report very accurate data, even if the web analytics vendor makes that data anonymous. The downside is that your competitors might not all use the same tool as you; therefore, you are comparing your actual performance against the actual performance of a subset of your competitors.

With data from vendors, you must be careful about sample size, that is, how
many customers the web analytics vendor has. If your web analytics vendor has just 1,000 customers and it is producing benchmarks in 15 industry categories, it might be a hit or a miss in terms of how valuable / representative the benchmarks are.

WA Vendor Data Bottom-line: Data from web analytics vendors comes from their clients, so it is real data. The client data is anonymous, so you can’t do a direct comparison between you and your arch enemy; rather, you’ll compare yourself to your industry segment (which is perfectly ok).

#6: Self-reported Data.

It is common knowledge that some methods of data collection, such as panel-
based, do not collect data with the necessary degree of accuracy. A site’s own analytics tool may report 10 million visits, and the panel data may report 6 million. To overcome this issue, some vendors, such as Quantcast and Google’s Ad Planner, allow websites to report their own data through their tools.

self reported competitive data

For sites that rely on advertising, the data used by advertisers must be as
accurate as possible; hence, the sites have an incentive to share data directly. If your competitors publish their own data through vendors such as Google’s Ad planner or Quantcast, then that is probably the cleanest and best source of data for you.

One thing to be cautious about when you work with self-reported data. Check the definitions of various metrics. For example, if you see a metric called Cookies, find out exactly what that metric means before you use the data.

Self-Reported Data Bottom-line: Because of its inherent nature, self-reported data tends to augment other sources of data provided by tools such as Ad Planner or Quantcast. It also tends to be the cleanest source of data available.

#7: Hybrid Data (/All Your Base Are Belong To Us).

Competitive intelligence vendors are observing you from the outside. Any single source, toolbar or panels or isp or tags or spyware etc, will have its own bias / limitation.

Some, smart, vendors now use multiple sources of data to augment the data set they started their life with.

all your base are belong to us

The first method is to append the data. This is what's happening in the case of Quantcast and Google Ad Planner, in both cases they have their own source of data to which your self reported data is added. The resulting reports are "awesomely good".

The second method is to put many different sources (say toolbar, panel, isp) into a blender, churn at high speed, throw in a pinch of math and a dash or correction algorithms, and – boom! – you have one "awesomely good" number. A good example of this is Compete or the DoubleClick Ad Planner.

Google's Trends for Websites is another example of a tool that uses hybrid data for its reporting (see answer #2 here).

The benefit of using hybrid methodology is that the vendor can plug in any gaps
that might exist between different sources.

The challenge is that it is much harder to peel back the onion and understand some of the nuances and biases in the data (sometimes mildly frustrating to analysis ninja's such as myself!).

Hence, the best-practice recommendation is to forget about the absolute numbers and focus on comparing trends; the longer the time period, the better.

Hybrid Data Bottom-line: As the name implies, hybrid data contains data from many different sources and is increasingly the most commonly used methodology. It will probably be the category that will grow the most because frankly in context others look rather sub optimal.

[Update:] #8: External Voice of Customer Data.

This is one often overlooked source of competitive intelligence (/benchmarking) data. There are several ways to use Voice of Customer data.

For starters various companies such as iPerceptions (like CoreMetrics, Google Analytics, Fireclick above for clickstream) publish Customer Satisfaction & Task Completion Rate (my most beloved metric!) numbers for various industries.

If I am in the internet retail game I can use these benchmarks to compare my performance:

iperceptions retail ecommerce task completion report

Or I can dig deeper and compare my performance by the Primary Purpose segments:

iperceptions customer satisfaction by purpose of visit

Both of the above are from the iPerceptions Q4 2009 Ecommerce Benchmark report. You'll find other reports in the Resource Center.

With other sources like the ACSI (American Customer Satisfaction Index) you can get a big deeper. Just choose an industry from their site, I choose Internet Travel, and bam!

american customer satisfaction index internet travel

If I am Travelocity I am wondering what in the name of Jebus did the other guys do last year to have all those gains in Satisfaction where I got a zero.

And really what are those guys at Priceline eating! 5.6 points improvement just last year, 15 points over the last 9! Sure they started with a smaller number but still.

What can I, sad Travelocity, learn from them? From others?

In both cases above the intelligence came from a third party doing the research and giving you data for free.

But you can also commission studies, from the two companies above or one of thousands like them on the web.

Or you can do it yourself.

loop11 usability process

I just met a Top Web Company the other day and I spent $300 on 20 remote usability participants to go to the website of Top Web Company and their main competitor. I gave the usability participants the exact same tasks to do on both sites.

The scores were most illuminating (and embarrassing for Top Web Company). It allowed me to (without working at either company) collect competitive intelligence about how each were delivering against: 1. Task Completion Rate and 2. Customer Satisfaction.

You could collect your own competitive intel using UserTesting.Com, User Zoom, Loop11, or one of many other tools.

Or for even "cheaper" (and a bit less impactful) insights you can use something delightful like www.fivesecondtest.com. Upload how your pages, upload those of your competitors (or complete strangers not in your industry) and learn from real users which designs they prefer and what works best.

Hybrid Data Bottom-line: I love customers, I love Task Completion Rate as a powerful metric, I love VOC. Above are three simple ways in which you can collect competitive intelligence using Voice of Customer data and drive action perhaps even faster than the first seven methods!

[PS: Nothing, absolutely nothing, works better to win against HiPPOs than using competitors and customers!]

The Optimal Competitive Analysis Process.

A lot of data is available about your industry or your competitors that you can
use to your benefit.

Here is the process I recommend for CI data analysis:

1. Ensure that you understand exactly how the data is collected.

2. Understand both the sample size and sampling bias of the data reported to you. Really spend time on this.

3. If steps 1 and 2 pass the sniff test, use the data.

Don’t skip the steps, and glory will surely be yours.

See the additional posts linked to below for types of analysis you can do once you choose the right tool [OR you can also start on page 221 of my new book Web Analytics 2.0!].

Ok now your turn.

Do you use a source of competitive intelligence data not covered in this blog post? Which of the above 7 is your favorite? Was there something surprising you learned in this post? What is one thing you would add to the critical analysis above?

Please share your tips, best practices, critique in comments.

Thanks.

PS:
Couple other related posts you might find interesting:

Social Bookmarks:

  • Twitter
  • del.icio.us
  • Reddit
  • Google Bookmarks
  • StumbleUpon
  • Sphinn
  • Digg
  • Facebook
  • FriendFeed
  • LinkedIn
  • PDF

Next Page »