End of Dumb Tables in Web Analytics Tools! Hello: Weighted Sort

Magical Arthur C. Clarke said:

"Any sufficiently advanced technology is indistinguishable from magic."

That quote comes to mind when I think of a new feature in Google Analytics that carries the unassuming name of Weighted Sort. It is an advanced implementation of technology (mathematical algorithms in this case) and when used it very much feels like magic!

In this blog post I want to share with you why I am so incredibly excited about this feature, how it works and how going forward you will reject every tool that does not come built in with this feature (ok so maybe that's a stretch, but I promise you this is so cool that at least for a few minutes you'll think other tools are lame by comparison!).

Let's take a couple of steps back, get some context before we dive in.

The Problem.

We have a very long tail of data in web analytics. Tens of thousands of rows of keywords in the Search Report (even for this small blog!). Hundreds and hundreds of referring urls and campaigns and page names and so on and so forth.

Yet because we are humans we tend to look at just the top ten or twenty rows to try and find insights. The problem? The top ten of anything rarely changes (except in rare circumstances like a sale or on a pure content – think news – site).

Hence I have persistently evangelized the need for true Analysis Ninjas to move beyond the top ten rows of data to find insights.

How? Advanced table filters, tag clouds and keyword trees are a good start.

But we need more.

One more problem though.

As if the massive data we have is not enough of a problem, we also rely on Averages, Percentages, Ratios and Compound/Calculated Metrics in a profoundly suboptimal way, as a drunken man uses lamp-posts – for support rather than for illumination.

Take a percentage, for example Bounce Rates. The top ten won't change.

bounce_rate_normal_table_view

Hmmm. What to do? What to do?

You know what? I'll try to find the keywords with the highest bounce rates and fix them! After all I don't want to have all those visitors say: "I came. I puked. I left!"

Ok analytics tool: Sort descending!

bounce_rates_descending

Arrrrrh! Useless!

See all those single visits? Would improving these bounce rates have a huge impact?

Ok maybe I should learn from keywords with low bounce rates so I can perhaps take the lessons from my awesomeness and apply them to others. Tool: Sort ascending!

bounce_rates_ascending

Arrrrrh! Again! Useless.

What could I possibly improve by focusing on these keywords with so few visits? Nothing.
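To make the problem concrete, here is a minimal Python sketch. The keyword rows are invented for illustration (they are not taken from the reports above); the only point is that sorting on the raw percentage pushes one- and two-visit rows to the top in both directions.

```python
# Hypothetical keyword rows: (keyword, visits, bounces). Invented data.
rows = [
    ("brand name", 5200, 2080),
    ("analytics tutorial", 1400, 1190),
    ("weighted sort", 900, 270),
    ("obscure typo query", 1, 1),
    ("one-off long phrase", 1, 0),
    ("rare question", 2, 2),
]


def bounce_rate(row):
    """Raw bounce rate for a row, as a fraction between 0 and 1."""
    _, visits, bounces = row
    return bounces / visits


# Plain descending and ascending sorts on the raw percentage.
descending = sorted(rows, key=bounce_rate, reverse=True)
ascending = sorted(rows, key=bounce_rate)

print("Highest bounce rates:")
for keyword, visits, bounces in descending[:2]:
    print(f"  {keyword:22s} visits={visits:5d} bounce={bounces / visits:.0%}")

print("Lowest bounce rates:")
for keyword, visits, bounces in ascending[:1]:
    print(f"  {keyword:22s} visits={visits:5d} bounce={bounces / visits:.0%}")
```

In both directions the first rows are keywords with one or two visits, which is exactly the "Useless!" result shown in the screenshots.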

So to recap:

  1. We tend to only understand the top ten rows of data, because that's what is easily visible.
  2. Gold exists beyond the top ten rows.
  3. Using percentages and averages suboptimally makes it impossible to find the Gold!

Yet gold I must find if I want to improve the outcomes for my web business (for profit or, as in the above example, non-profit).

The Solution!

The Google Analytics team has built an innovative and mathematically intelligent new feature called Weighted Sort to solve precisely this problem.

Now when you sort the data off a percentage or a ratio, like in the above case, you'll see this on top of the table.

weighted_sort_option_google_analytics

When you press this unassuming checkbox something magical happens. Google Analytics brings back for me the rows of data I should analyze further to have the highest possible impact on my business.

It looks like this. . .

search_keywords_weighted_sort_google_analytics

Sweetness!!

Notice that the Visits for these keywords are sorted in an "odd" manner, as are the bounce rates.

That is the magic.

Now you don't have to go through wild gyrations (or worse guesses) to figure out the best places to focus your attention on. You can skip combing through the, in this case, 5,777 rows of data. The algorithm will do that for you!

The "magic button" will sort your data from:

"focus here because something very important is going on here and if you focus here chances are your improvements in reducing bounce rates will have a very high ROI for your business"

to

"rows/keywords where your efforts might not quite yield big ROI improvements"

Translation: Sort by "interestingness".  What are the most interesting keywords with high bounce rates? [Where things are going "wrong".]

You can reverse sort the table, keeping the Weighted Sort checkbox on, and you'll find the most interesting keywords with the lowest bounce rates [where things are going right].

No more using silly ascending and descending sorting. No more worrying about if you are focused on the right places. Less worrying if you are prioritizing things right.

Save time. Do less data puking. Be happier!

Awesome right? Go try it on your own Google Analytics data!

How Does Weighted Sort, aka The Magic, Actually Work?

Good question. It is also the reason for the Arthur C. Clarke quote.

Of course there is no magic, it is all the beauty of some wonderful math and ingenuity.

But it is complicated.

Let me try to explain it as best I can using some visualizations and formulas.

What powers weighted sort?

This simple hypothesis:

The reported value of a metric (bounce rate, conversion rate, time on site etc) for dimension values with few visits is an imprecise estimate of its true value.

English: If the dimension you are looking at is referring urls and if only five visits this month originated from Bing then a conversion rate of 80% (or a conversion rate of 20%) is not reflective of the "true" conversion rate.

There are too many unknown variables, or irreplicable events, that could have contributed to that number (80 or 20) making it incredibly difficult to make any decisions based on just 5 visits.

You saw this problem when I sorted descending or ascending for the metric bounce rate above.

So how do you address this problem?

The fearless developers were given this amazing goal:

Compute the "expected true value" for each row on the table.

It is a difficult problem to solve. But since the actual values are not very useful, applying some logic and mathematical intelligence to figure out what the true value is can brilliantly help identify "interesting" data (aka where to focus).

Google Analytics computes the expected true value (in our case above "expected true bounce rate") and then sorts the data using the expected true value (ETV) giving you the most interesting data to look at.

The expected true value (ETV) is not shown in the UI (as it would simply be distracting).

How exactly do you compute the "expected true value"?

That is a good question.

Think of a scale. On one end there is a dimensional value (keywords, countries, referring urls, product names etc) with zero visits and "a lot" of visits at the other end.

scale

Let's assume we are analyzing the dimension countries and the metric bounce rate.

Remember our hypothesis above? The reported value of a metric is not reflective of the true value when it comes to small samples (visits in our case).

So if there was one visit from South Africa its actual bounce rate reported in the tool is not a precise reflection of what the true value might be. But if there were A Lot of visits from South Africa then the actual value is reflective of the true value.

Put another way. . . I request you to pay attention. . . .

For values to the very far left of the scale we set the expected true value (ETV) equal to the site average. A very safe bet.

For values at the very far right of the scale (i.e. "a lot" of visits) it is quite likely that the ETV will be equal to the actual value. Makes sense right?

All other points between the left and the right will have ETV's that will be a blend of the site average and actual values.

Hence when computing ETV. . .

Those closer to the left (fewer visits) will have a higher blend of site average compared to actual values.

Those closer to the right (many many visits) will have a higher blend of actual value compared to site average.

Here's an image that explains this very critical concept clearly. . .

computing_estimated_true_value

Crystal clear on how ETV's are computed?

The quest is to figure out the estimated true value (ETV) for any metric for a given dimensional value (keyword, referrer, campaign, display ad, social media strategy).

NOTE: Numbered values (0.01, 0.99, 0.5 etc) are for illustrative purposes, just to explain how weighted sort works. Actual values used in your report are intelligently and automatically computed in context of your data.
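For readers who prefer code, here is a rough Python sketch of that blend. The real weighting inside Google Analytics is not disclosed, so the weight used below, visits / (visits + k), and the constant k are assumptions for illustration only. They were picked because they behave like the scale above: zero visits gives the site average, and "a lot" of visits gives (almost) the actual value.

```python
def expected_true_value(actual, site_average, visits, k=100.0):
    """Blend a row's actual value with the site average.

    ASSUMPTION: weight_actual = visits / (visits + k) stands in for the
    undisclosed weighting; k is a hypothetical tuning constant.
    """
    weight_actual = visits / (visits + k)   # 0 at zero visits, near 1 for many
    weight_average = 1.0 - weight_actual    # the remainder goes to the site average
    return weight_actual * actual + weight_average * site_average


site_average = 63.49  # site-wide bounce rate in percent, as used in this post

# Far left of the scale: no visits, so the ETV is the site average.
left = expected_true_value(actual=100.0, site_average=site_average, visits=0)

# One visit that bounced: the ETV barely moves away from the site average.
one_visit = expected_true_value(actual=100.0, site_average=site_average, visits=1)

# Far right of the scale: the ETV is essentially the actual value.
right = expected_true_value(actual=80.0, site_average=site_average, visits=1_000_000)

print(left, one_visit, right)
```

Any weight that grows from 0 to 1 as visits increase would show the same behaviour; this particular form is simply the shortest one to write down.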

Can you give me a specific example of ETV computation?

Sure.

Let's say you are a multi billion dollar multi country multi people corporation with multiple products and services.

The next step in your world domination plan is to figure out how best to move beyond your current list of country domination (United States, Brazil, UK, India, Spain).

What do you do?

You'll look at where your traffic comes from and look at bounce rates, to figure out how you can retain even more people who land on your website. You are confident that if you just retain them beyond one page, engage them beyond your 200 mb flash intro, then you know you'll suck them into your business. Then world domination is but 15 minutes away!

So you log into your web analytics tool and you'll probably see a report like this in Google Analytics, or Omniture / Adobe or CoreMetrics / Unica / IBM or WebTrends or. . .

conversions bounces by country

And you let out a little sarcastic: Just great.

The report has confirmed what you already knew from staring at the same top ten rows. You very quickly went nowhere.

But you are in luck, you are using Google Analytics! (At least in my imagination. :)

You click on the Bounce Rate column to sort and then check the Weighted Sort checkbox and. . . bam!

Something useful. . . .

bounce-rates-weighted-for-country-visits Sorted by "interestingness"!

You are now looking at an intelligently sorted list of countries where if you focus on improving your bounce rates (i.e. lower them) you'll have the best bang for your recession hit buck!

Segment the traffic from Argentina, Peru, Spain, Colombia, Chile and Denmark and you are on your way to the aforementioned world domination.

But how did Argentina rank #1 (4k visits), Peru #2 (1.5k visits), Spain #3 (8.8k visits)?

Analytics used the, again, aforementioned formula to compute the estimated true value (ETV), by leveraging Average Bounce Rate (64%) and Actual Bounce Rate for each country (last column above) and assigning contextual weights based on Visits from each country.

Let us see how the ranking worked by reverse engineering it. Here is what happened:

    Argentina Bounce Rate ETV = (0.01 * avg BR) + (0.99 * actual BR)

    Argentina Bounce Rate ETV = (0.01 * 63.49) + (0.99 * 79.53) = 79.37

    Peru Bounce Rate ETV = (0.1 * 63.49) + (0.9 * 80.24) = 78.57

    Spain Bounce Rate ETV = (0.001 * 63.49) + (0.999 * 77.76) = 77.75

Now you can see how the countries, even though their visits are very different, were sorted #1, #2, #3: by interestingness, by computing the ETV for each.

Where did the numbers in red (the weights 0.01, 0.99 and so on) come from? You were not paying attention!!
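If you want to check the arithmetic yourself, here is a small Python sketch that redoes the three calculations. It uses only the illustrative weights and bounce rates quoted above; the function and variable names are made up for this example, and the weights are the blog's teaching numbers, not what Google Analytics actually uses.

```python
SITE_AVERAGE = 63.49  # average bounce rate (percent) from the example above

# (country, illustrative weight on the site average, actual bounce rate)
countries = [
    ("Argentina", 0.01, 79.53),
    ("Peru", 0.1, 80.24),
    ("Spain", 0.001, 77.76),
]


def blended_etv(weight_average, actual):
    """ETV = weight on site average * site average + remaining weight * actual."""
    return weight_average * SITE_AVERAGE + (1.0 - weight_average) * actual


etvs = {name: blended_etv(weight, actual) for name, weight, actual in countries}

# Rank from highest ETV to lowest, as the weighted sort does for bounce rate.
ranking = sorted(etvs, key=etvs.get, reverse=True)

for position, name in enumerate(ranking, start=1):
    print(f"#{position} {name}: ETV = {etvs[name]:.2f}")
```

Notice that Peru has the highest actual bounce rate of the three, yet it ranks second: its illustrative weight pulls it further toward the site average than Argentina's does.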

Ok.

Remember the scale? (If not see picture with scale above.)

The numbers in red are:

    1. just for illustrative purposes in this blog post

    2. a function of where Visits by a country would fit, closer to the Zero (Peru) or closer to A LOT (Spain), hence the name weighted sort

    3. always computed uniquely for your website data based on an intelligent mathematical formulation (which is patent pending and I can't reveal to you!)

You now understand how weighted sort works! Yea!

What if you wanted to discover which are the most interesting countries to focus on, where bounce rates are already low, and deepen your world domination?

Reverse sort the table. . .

reverse_sort_best_countries_bounce_rates

Happy birthday.
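Why does the reverse sort also avoid the one-visit rows? Here is a hedged Python sketch. The rows are invented, and the weight visits / (visits + K) with K = 100 is an assumption standing in for the undisclosed formula. Under any blend of this kind, rows with very few visits are pulled toward the site average, so they land in the middle of the table and leave both ends to rows with real evidence behind them.

```python
SITE_AVERAGE = 63.49  # site-wide bounce rate in percent
K = 100.0             # ASSUMPTION: hypothetical constant controlling the blend

# Hypothetical rows: (label, visits, actual bounce rate in percent)
rows = [
    ("High volume, high bounce", 4000, 79.5),
    ("High volume, low bounce", 6000, 41.0),
    ("One visit, 100% bounce", 1, 100.0),
    ("One visit, 0% bounce", 1, 0.0),
    ("Mid volume, near average", 800, 64.0),
]


def etv(row):
    """Blend the actual rate with the site average, weighted by visits."""
    _, visits, actual = row
    weight_actual = visits / (visits + K)
    return weight_actual * actual + (1.0 - weight_actual) * SITE_AVERAGE


worst_first = sorted(rows, key=etv, reverse=True)  # weighted sort, descending
best_first = sorted(rows, key=etv)                 # same sort, reversed

print("Where things are going wrong:", worst_first[0][0])
print("Where things are going right:", best_first[0][0])
```

With these made-up rows, the top of each list is a high-volume row, and both one-visit rows sit in the middle in either direction.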

Examples Of Weighted Sort Analysis You Should Try.

I wanted to close this post by highlighting other places you can use weighted sort and some other types of analysis you could do.

Focus your efforts for attracting New Visitors to your site.

Weighted Sort also works with % of New Visits. So let's say you are a newspaper and up against the "newspapers" of Fox. To survive you must find new countries (or Cities or Regions) from which to attract lots more new visitors.

Well just sort by % of New Visits and you'll have the answer. . . .

percentage_new_visits

Now you know where to focus.

[Remember that for a newspaper Repeat Visits are also great! :)]

How about looking at the most interesting countries from where the % of New Visits is already high? Just reverse sort the above column.

You might then want to segment that data to go see if over time Visitor Loyalty for those countries is also increasing, or these are just fly-by-night visitors.

Valuable analysis right?

Understand audience preferences, improve $$, for a non-ecommerce site!

I don't have ads or promotions on this website. But like any good Analysis Ninja I have identified my goals (I have six) and then identified values for each goal. The values define revenue that does not come to me directly, on this site, but rather comes to me in other ways as a result of the work I do on the blog (multi channel impact baby!).

The benefit of Goals and Goal Values is that it helps me do "financial analysis" for all the traffic I get (you all!). That means I can focus on what works for you and what works for me.

The metric I use is $ Index. It is the average value a given page or a set of pages add to the overall pie.

The analysis I want to do is to understand what pages / content I should focus on to create the highest possible impact.

I am not going to look at the normal table found in Google Analytics or Site Catalyst or Yahoo! Web Analytics.

I am going to look at the table with Weighted Sort turned on to identify the rows with "interestingness". . .

customer-interest-content_$index-value

Who would have thunk that my public speaking engagements page was of so much interest and creating so much value for me, with just 469 page views! Certainly not me.

Some of the other rows of data were also unexpected (I need to do more videos, podcasts!) and others were just plain gratifying (I love killing useless metrics, and so do you!).

But there was also heartbreak.

When I reverse sort the data, to find which blog posts / topics are not generating enough $ Index (value), I was sad to see this was the #1 post. . .

heartbreaking-low-value-blogpost

Seven Skills to Look for in a Web Analytics Manager

I was really sad because I was a manager and a senior manager and a director of web research and analytics teams. The above post distills my little wisdom.

More people should read this post (and similar by others) because day in and day out I see wrong people leading analytics teams causing problems for the company and sucking the life out of the Analysis Ninjas. And I hate that.

See why my heart is broken with that low $0.11 value?

But at least I know!

Money, Money, Honey Bunny!

Can't close a blog post without an example of conversion rates right?

Traffic comes to your website from many sources. We typically tend to look at silos and rarely compare across acquisition channels.

Hence I recommend that you look at one of my favorite reports: All Traffic Sources.

Let's suppose you are an Analysis Ninja called Nico Weber. Now at a glance you can compare direct traffic with referral traffic with paid search with organic search with campaigns with. . . everything! Make it your new favorite.

When you report to your Sr. Leader now you can look across ALL traffic channels and tell her/him which ones are most interesting for the company. . .

google-analytics-referrals-conversion-rate

Did anything in web analytics look more delightful? [Maybe the Intelligence Reports. :)]

The above table helps you prioritize where your most interesting sources of traffic are, not by conversion rate only but rather by using an intelligent mathematical algorithm that weights against Visits while computing the estimated true value of the conversion rate.

Oh and don't forget to reverse sort and find the "loser" traffic source prioritized by interestingness!

That's weighted sort.

It's a simple feature, a great addition to the portfolio of techniques that Analysis Ninjas will use to find insights faster and focus on what's important.

It is my fondest hope that web analytics vendors like Adobe, IBM, Yahoo! will take a step back from this constant quest to collect ever more data and just puke it out. I hope they'll take mercy on the Reporting Squirrels and Analysis Ninjas of the world and spend 10% of their vendor resources on making tools smarter, a bit more intelligent. We deserve at least that much.

I hope the Google Analytics team also continues to do so.

Ok your turn now.

What do you think of this small feature in Analytics? Do you understand how it works? Do you use it in your job already? What do you think the team at Google did right with this feature? What could they have done better? Are there other techniques you use to move from Data to Insights faster?

Please share your feedback, tips, critique, words of praise, and all else via comments.

Thanks.

Comments

  1. 1

    I like the new sorting feature for getting those keywords/sources that are converting well. But in my daily work it's also about volume. That's why I like the advanced filters a bit more: "give me all sources with a higher bouncerate than 80% and at least 200 visits". That way you can find the high potentials regarding traffic. Within that selection I could use the weighted sort to give me the best performing sources.

  2. 2

    Thank you Avinash for explaining well how Weighted Sort works.

    I also agree with Scholten, that while this feature is helpful, Advanced Filters are much more helpful in general.

    Separately, I find that keyword reports are not very useful in our case because it does very little to distinguish real long tail terms. It seems to do a good job of identifying decent medium tail terms, but not cleaning the mess that is long tail (over 95+% of our sites' keywords).

    However, I really like using Weighted Sort on the Content Drilldown reports so I can get better category level visibility in my content and prioritize better.

    Hopefully, they'll add weighted sort to e-commerce metrics as well. I wanted to use some weighted sort in Custom Reports sorting on Avg Value and I got no love from GA.

  3. 3

    Weighted sort is a welcome feature in Google Analytics for all of us who have been using column sorting and desiring a way to have actual, useful meaning behind it.

    But like anyone with experience knows in web analytics, it's not just about one feature or one filter. Combining Weighted Sort with a secondary dimension and an advanced filter can really help narrow your focus (in a good way) to seek out those golden nuggets of valuable information that we all strive for!

    And – thank you for highlighting $index in your second to last example!

  4. 4

    I've found the weighted sort feature especially helpful when trying to figure out the holes in AdWords (keyword) spend.

    It's amazing what Google is trying to do across the board to help reporting squirrels around the world. It makes me wonder if Google read your post about statistical significance and implemented their new experiment settings functionality in Google AdWords.

  5. 5

    Thanks for explaining this great new feature, Avinash. I've been pimping out my GA with extensions lately and it's been a little tricky to figure out what's new and what's a part of a new add-on.

    Weighted sort is awesome! I would typically find myself weeding out bounce rates associated with keyword terms (or what-have-you) with only several visits. Now I can use weighted sort instead!

    It seems like one should be cautious, as always, with how one interprets weighted sort. Should the grain of salt taken with the data grow larger as the weighted metric moves closer to the site average?

  6. 6

    I'm very glad you decided to do a blog post on this feature because I was very excited when it was released and was very interested to get your take on it.

    This feature makes the job of an Analysis Ninja so much easier. The logic behind the algorithm just makes sense, so much so that one might wonder what took so long, but beggars can't be choosers :)

    Now if only other Web Analytics solutions would follow suit and implement similar intuitive features that make analyzing web analytics data a whole lot easier, Omniture :)

    Keep the great innovation coming!

  7. 7

    Oh, oh. I almost forgot to mention the $ Index reference, but reading Joe Texiera's comments reminded me. I've been using $ Index for some time now to really focus in on those pages that are contributing to conversions on client sites. I am very glad you mentioned it and it got me wondering if the $ Index could have been used to drive the Weighted Sort algorithm instead of the Estimated True Value. Just a thought.

    Great post!

  8. 8

    Avinash,

    Awesome post, great addition to GA. Any chance of using it with the advanced segments in the future?

    I love how you use it with the posts, to know which posts have the best performance and which need more love.

    Kudos.

  9. 9

    Have I told you lately that I love you? ;)

    I was so excited to hear about the new feature, and I'm even more excited to see it put into such plain English. Kudos to the GA team not only for the work but for the willingness to be so transparent about it. I've been wanting to use a similar method for PPC analysis, and I think you've given me just enough math to make that possible.

  10. 10

    Great feature and great article!
    IMHO, probably the most important step forward since advanced segmentation within GA

    I can't wait to analyze my stats…

  11. 11
    Abdullah Yousoff says

    I am typically a freshie at Web Analytics, only 2 years of work experience in this field. I must admit that I have been a lost analysis soul before I came across your blog.

    Truly inspiring to see how you use multiple data points with this new feature to turn data into valuable insights. Thanks again for highlighting the new feature, as your explanations in plain English make web analytics so much easier to understand.

    Hats Off!!!! :)

  12. 12

    Thanks for writing this blog post, Avinash!

    This is certainly a new favorite feature. It was one of these things that you wish it existed without knowing what it should look like. Great job, Google team!

  13. 13

    We've been having to do this offline for quite some time. Your description of their approach sounds familiar to what we use, involving Bayesian shrinkage estimators. Glad to see Google Analytics has automated this – it will definitely cut down on the amount of time we spend processing data for analysis. Another outstanding addition to this tool!

  14. 14

    Being honest Avinash, although it's a review of feature of Google Analytics, I will have to come here again and again to understand the post.

    You simply rock! I feel like a kid whenever I reach here.

    :)

  15. 15

    Will be interesting to see what kind of case studies arise out of this new feature.

  16. 16

    Andre (/Michael): I suspect in everyone's life it is about volume. :) But how do you focus on: 1. Where are we getting less volume where we should get a lot and 2. Where is it that we are getting lots of volume but we can improve a lot more.

    You can filter using Advanced Filters, but that still leaves you with thousands of rows of data where you have to apply judgment and pick and choose and figure out how to separate the wheat from the chaff.

    Weighted Sort was built exactly with that goal in mind: To make it even more efficient for you to find "interestingness" in thousands of rows of data – and without you making guesses.

    Then you apply advanced filters.

    If you first do advanced filters and then weighted sort you might filter out some low volume sites that present the most interesting opportunities.

    If you do weighted sort and then advanced filter you won't have that problem.

    I want to stress that I am not saying advanced filters are not great or that you don't need them. Weighted sort is just one more tool in your arsenal.

    Josh: Remember it is not the largeness of the data set that will cause one thing or the other to happen in Weighted Sort. It is the individual metric values for the particular dimension you are looking at (keyword, or urls, or campaigns or whatever). Overall size of data (large or small) is context for "position on the scale" but it is the individual values and Estimated True Value computation that is more important.

    The purpose of this blog post is absolutely the sentiment you mention: Make sure you understand how ETVs and Weighted Sort work so that, as you mention, everyone will use it judiciously.

    Anthony: All web analytics vendors, Google Analytics included, need to pause every once in a while from increasingly puking ever more data out to thinking: "What have we done to make our tool more intelligent?"

    The answer is not: "We've collected even more facebook twitter flash ajax underwear offline data warehouse fingerprint sprops custom variables things".

    The answer is: "We've built five smart features that use math, statistics and artificial intelligence".

    Not enough vendors do the latter. It is my fervent hope that that changes over time.

    -Avinash.

  17. 17

    Hey! I would love some flash ajax underwears…

    Any idea where i can get some of these? ;)))

  18. 18
    Chris Erickson says

    Avinash, I agree 100% that Weighted Sort is a wonderful, fantastic tool! I've already been using it for clients and showing it off to colleagues. (It works wonders for SEO and SEM people, for obvious reasons.)

    However, they still have some rough edges to work out.

    For example, looking at the little arms around weighted sort, it would seem like you can apply weighted sort to the "Pages / Visit" metric or "Average Time on Site" metric, but you can't.

    Worse yet, I can do weighted sort on "overall" conversion rate for a conversion group, but not for an individual conversion/goal (!)

    Similarly, while being able to weighted sort bounce rates on keywords (and referrers, etc.) is awesome, I'd love to be able to do so for the "Top Landing Pages" report as well!

    In short, I'd sum up my requests as follows:

    1) Make it clear which metrics and reports Weighted Sort can and can't be used for, and

    2) Make Weighted Sort work for all the metrics and reports it seems like it should work for, but doesn't!

    I've already submitted this feedback directly to Google, but I think all of us ninjas & squirrels out here would really appreciate if you'd help lend your voice to suggesting they add these improvements to this awesome new feature :)

  19. 19

    Hi Avinash,

    As always thanks for the wonderful post. I just ordered both your books. I cannot wait to get started.

    Cheers

  20. 20

    Is the weighted data a beta feature? I do not see it showing up in my Google Analytics reports. Does it need to be turned on?

  21. 21

    Chris: Consider the feature to be in Beta, all feedback from the lovely GA users will help influence what happens next. Please keep the feedback coming, I know the GA team members read this blog. : )

    Steve: You have to click on the column title (like Bounce Rate, Conversion Rate, % New Visits) and then you'll see the check mark pop up on top of the table (like in the image in the post titled OMG) and then you click on that check mark and you are in business.

    Everyone has access to the weighted sort feature, though it is not yet on every single metric in every single report.

    -Avinash.

  22. 22

    Hey Avinash,

    Great description of the inner workings – I was very curious to see how it was done. And some fine examples that you have given for people to look at (of course each person will find some insight in something slightly different to the next one).

    Things like this are why I love Analytics. A few years ago I played around with Microsoft's Gatineau. They had a great visualisation tool that would help give you impressions of your data based on size of squares, with colour depths depicting another scale (eg bounce rate), allowing you to focus on areas. It's a shame that they didn't invest and then discontinued it.

    This, the weighted search, motion charts, etc – all great ways of turning numbers into visualisations that you can make actionable.

    Alec

  23. 23

    Avinash…thank you for explaining this – and I can think of one job where this will be a handy feature. The client is thinking of taking their online courses overseas – this will help determine the direction of languages to target.

    BTW I don't know how you keep up with all of these cool things, and so glad to have found your webinars, blogs and sound advice.

    Thanks!

  24. 24

    Avinash,

    Related to keywords and bounce rates – It all sounds nice in theory – find a bunch of keywords with a (relatively) high bounce rate, "fix" the website and the bounce rate goes down, but do you have any before/after examples of this?

    A lot of people talk about this, but I rarely see it put into place in "the real world" – especially in the SMB world.

    Sorry – love this stuff and I'm sure I sound like a negative nancy.

  25. 25

    Alec: I concur with you that the demise of Gatineau was sad. Especially since I had even written a blog post with my wish list!

    Blast from the past:

    ~ Microsoft Gatineau: My Wish-list for the Web Analytics Application

    Where is Ian Thomas when you need him! :)

    I think the visualization you were referring to was the Treemap Visualization. Very cool.

    If you use Google Analytics you can do that visualization using the GA API And the App Engine:

    ~ Google Analytics API on App Engine Treemap Visualization

    Nancy: I read a lot.

    Oh and it is a lot of fun to do this, never feels like work!

    Ben: There are a ton of case studies of people focusing on the metrics outlined in the examples and improving their sites. Bounce Rates, Conversion Rates, % New Visits etc etc.

    If you / your company / your client have failed to improve your websites / campaigns by using those metrics then Weighted Sort won't help.

    If you have succeeded in the past (by fixing landing pages, discovering new valuable referrers, loving pages that create more revenue for you) then Weighted Sort will help you find dimensions of value faster.

    To me that's it. No more, no less. :)

    -Avinash.

  26. 26

    Avinash,

    I completely agree that:

    1) this new feature is magnificent and

    2) there should be more fireworks going off in celebration!

    Slicing, dicing and sorting data into something actionable can be like prospecting for gold. Sometimes you hit the mother lode, a lot of times you come up empty.

    This feature is super helpful in guiding us to better insights and payoffs. Great work GA team.

  27. 27

    Weighted sort certainly makes analyzing large volumes of data easier, but what I wish Google would add to Analytics is the ability to sort data by percentage change or change in absolute value when comparing two time periods.

    Can you suggest any third-party tools (other than excel ;) that make this kind of data sorting easier? Does GA already do this and I'm just missing it?

    Thanks!

  28. 28

    Noah: I am personally, just speaking for myself and no one else, not a fan of percentage change. It rarely highlights really interesting stuff (hence something much smarter like weighted sort – and there are many other smarter methods as well).

    I do understand the desire to use percentage change. At the moment that does not exist in GA.

    But there are a ton of third party tools that do this very effectively with Google Analytics, there is a very long list here:

    http://www.google.com/analytics/apps/results?category=Reporting%20Tools

    Check out the other categories on that page as well.

    Finally, please also check out the recently awesomized GA Custom Alerts feature. It allows you to set percent change alerts for pretty much every dimension and metric that you want to. You can even choose the comparative time periods with ease (ex: compare to yesterday or compare to same day last week etc etc).
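
    For readers who export their numbers, here is a minimal sketch of the kind of sort Noah asks about: ranking rows by absolute change between two periods, with percentage change alongside. It assumes the data is already in two plain dictionaries of visits per keyword; the function name `rank_by_change`, the `min_visits` parameter and the sample figures are all made up for illustration and are not part of Google Analytics or its API.

```python
def rank_by_change(previous, current, min_visits=0):
    """Rank dimension values by absolute change between two periods.

    previous, current: dicts mapping a dimension value (e.g. a keyword)
    to its visits in each period. Returns a list of tuples:
    (key, before, after, absolute_change, percent_change).
    percent_change is None when the earlier period had zero visits.
    """
    rows = []
    for key in set(previous) | set(current):
        before = previous.get(key, 0)
        after = current.get(key, 0)
        # Optionally skip rows that are tiny in both periods.
        if max(before, after) < min_visits:
            continue
        abs_change = after - before
        pct_change = (abs_change / before * 100) if before else None
        rows.append((key, before, after, abs_change, pct_change))
    # Largest absolute movers first; ties broken alphabetically.
    rows.sort(key=lambda r: (-abs(r[3]), r[0]))
    return rows


# Illustrative (invented) numbers for two periods.
previous = {"web analytics": 1000, "bounce rate": 200, "kpi": 10}
current = {"web analytics": 1100, "bounce rate": 100, "kpi": 40,
           "weighted sort": 50}

ranked = rank_by_change(previous, current)
for key, before, after, abs_change, pct_change in ranked:
    pct = "n/a" if pct_change is None else f"{pct_change:+.0f}%"
    print(f"{key:15} {before:6} -> {after:6}  {abs_change:+6}  {pct}")
```

    Sorting on absolute change rather than percentage change keeps a 10-to-40 visit keyword from outranking rows that moved by a hundred visits, which is the weakness of raw percentage change mentioned above.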

    Avinash.

  29. 29

    Hi Avinash,

    Swung by to catch up on the newest stuff, and I have been impressed by the extent that Weighted Sort will take out the grunt of constantly setting up inline filters to eliminate very small volumes or mad bounce rates, etc.

    I am interested to know whether this functionality will be combined with the Intelligence features. Ideally, you would want to be alerted to changes in variables which are likely to feature in the most common / popular Weighted Sorts.

    Just thinking out loud here, suggesting combination of features that may assist the community in focusing on results, not reports. Anything Google can do to speed up the 'diagnosis' phase of our work, means the quicker we can get on to taking action.

    Keep on keeping on

    Dan

  30. 30

    Dan: Google Analytics Intelligence and Weighted Sort both sit under the same umbrella: "What can we do to make our users' lives easier by finding them valuable things to focus on algorithmically?"

    But under that umbrella each feature solves a different problem.

    In case of Intelligence: How to proactively look at your data to highlight precious anomalies in your data and surface them.

    In case of Weighted Sort: Based on what metric and dimension is *important to you* how to enable the discovery of optimal dimensions to focus on.

    In that sense they will remain "separate".

    But you'll be glad to know that there are more awesome things on the horizon for Intelligence that literally have me giddy with excitement, I am having a hard time not talking about it. Things that will, as you put it, speed up the diagnosis phase of an Analysis Ninja's work.

    Stay tuned.

    Avinash.

  31. 31

    Great post (should I expect any less?)…

    Definitely a great feature, and one that Adobe ought to replicate (hint, hint)

    Question for you, Avinash:

    What are your thoughts on using two analytics platforms at a major eCommerce organization (say Adobe and GA)? I certainly see benefits of going beyond the Adobe world, but also headaches and internal confusion (why do these two sources have different numbers…)

  32. 32

    Ben: I believe in monogamy. The complexity of polygamy almost never delivers on the on-paper promise (and I can't even tell you how much less love exists in that relationship!).

    I am of course talking about web analytics tools. :)

    I have covered this in a lot more detail in this post:

    ~ 10 Fundamental Web Analytics Truths

    Please check out item #1.

    I know that it is very tempting to believe two can exist and succeed where one might not have, especially if you are planning to be so disciplined that the "big" tool will be for "extra special people who get it" and the "small" tool for those who don't. It never works out.

    Implementation, culture, paralysis are primary culprits.

    I encourage you to stay with Adobe / Omniture and identify what the real reasons are for your organization not being massively data driven. I don't think the problem is Omniture because it is a good suite (even if it is a bit complicated – but this is the reason your company has you! :)).

    All the best.

    Avinash.

  33. 33

    Avinash,
    Great stuff and again a post with value to your readers.

    The 'weighted sort' also counteracts what I call the "uncertainty principle" in analytics: with too few data points, the value & impact are in question; with too many data points, the question arises whether the results are masking hidden insights because of skewness and the average effect.

    I would also second your comment about using filters second after you figure out what is 'interesting'. This is similar to when folks work with random samples. Before one does any sampling, they should first study their data to ensure their sample contains most if not all the important features of their original dataset. Just taking a sample or filtering first can as you said eliminate valuable data points from your view.

    (The only 'nit picky' :-) feedback I have is on the term ETV. I wonder if some folks would equate this with "the most likely value an attribute might take for that observation". For the high incidence attributes, this would be the case but for the terms with low incidence this is definitely not so. The "true" value (for low-incidence terms of today) given enough occurrences might be something quite different from the ETV calculated today. Of course, as we get more and more data points the ETV will morph itself into the true value).

    Best,
    Ned

  34. 34

    Do you have something against the word "an"?

  35. 35
    Alex (Handy Backup team) says

    Thank you for a great post, Avinash!

    I've spent almost half an hour trying to explain to my colleagues what Weighted Sort meant… to no avail. But then I sent them a link to your explanation (the countries example, and others), and it worked just great :)

    However, I can't understand why not all reports allow applying the new feature to Bounce Rate. For example, I always wanted to study keyword performance of my website's homepage, but when I select "/" in Content, and then "Entrance Keywords" to analyze, the "Weighted Sort" option doesn't show up (still I can see it for Index and Exit rates). What am I getting wrong?

    Regards,
    Alex.

    • 36

      Hey, I can't use that tool either. I see it but it is grayed out; I can't select it.

      Does anyone know why? (14/09/12).

      • 37

        Omar: The Google Analytics team has removed the options from some of the reports, but Weighted Sort is still available in many of the reports.

        Here is how you use the feature.

          1. Go to the report you are interested in, let's say All Traffic Sources.

          2. Click on the column that has a % distribution, let's say % New Visits.

          3. Go to Sort Type and from the drop down choose Weighted.

          4. Review your results.

        If you choose a column for which Weighted Sort is not available then it will be grayed out.

        Avinash.

  36. 38

    @ B Weiss

    It's been possible to do weighted sort in Omniture for years – just add a calculated metric. For weighted bounce rate as described above, you'd use this:

    ([Single Access]/[Entry Pages])*([Page Views]/[Total Page Views])

    Then just apply that metric to the appropriate report in SiteCat.
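
    To make the arithmetic of this calculated metric concrete, here is a minimal sketch of it outside of SiteCatalyst. It implements only the formula as written in this comment (bounce rate multiplied by the page's share of page views); it is not the computation Google Analytics uses for Weighted Sort. The function name and the sample numbers are made up for illustration.

```python
def volume_weighted_bounce(single_access, entries, page_views, total_page_views):
    """Bounce rate scaled by the page's share of all page views.

    Mirrors ([Single Access]/[Entry Pages]) * ([Page Views]/[Total Page Views]).
    Returns 0.0 when a denominator is zero so empty rows sort last.
    """
    if entries == 0 or total_page_views == 0:
        return 0.0
    bounce_rate = single_access / entries
    share_of_views = page_views / total_page_views
    return bounce_rate * share_of_views


# Invented rows: (page, single access, entries, page views).
pages = [
    ("/home", 400, 1000, 5000),
    ("/rare-landing-page", 9, 10, 20),
    ("/pricing", 300, 400, 1500),
]
total_views = sum(row[3] for row in pages)

scored = sorted(
    ((page, volume_weighted_bounce(sa, en, pv, total_views))
     for page, sa, en, pv in pages),
    key=lambda item: item[1],
    reverse=True,
)
for page, score in scored:
    print(f"{page:22} {score:.4f}")
```

    With these numbers the 90% bounce rate of the rarely viewed page drops to the bottom because its share of page views is tiny, while the high-volume pages rise to the top.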

    Andy

  37. 39

    Andy: I encourage you to read the post again about how Weighted Sort works.

    It is not the formula you describe in your comment, and it is not possible to do what GA is doing with Weighted Sort in Omniture's Site Catalyst.

    Note what I am not saying: I am not saying Weighted Sort is so unique that it can't be done in any other tool.

    A cursory review of advanced statistical formulas will show there are many ways to compute ETV and then use that intelligence to intelligently sort the data.

    The team at Adobe (or WebTrends or IBM / CoreMetrics or IBM / Unica) can implement those formulas in a jiffy to accomplish the same intelligent goal that ETV and Weighted Sort accomplish in Google Analytics.

    Only the slightest amount of will is required.

    Alternatively you can also study those formulas, extract data out using Omniture's (paid) API or WebTrends's (free) API and apply them yourself.

    But if you want to accomplish that then it is not the formula you describe in your comment.

    Avinash.

  38. 40

    Awesome post!

    Helps push a different point that we have been trying to champion internally for some time. We have been pushing the concept of evaluating the trade-off between traffic and conversion. Would be nice to have both but that is not always the case.

  39. 41
    Andrew Blank says

    My friend, how can I reject every tool that does not come built in with weighted sort if it is patent pending?

    Seriously cool post.

    Google is on to something with the intelligence aspect. I've wanted something like this for years. It really moves web analytics tools away from being just web metrics.

  40. 42

    Aseem: I think you have hit the nail on the head regarding one of the key goals behind doing things like Intelligence and Weighted Sort.

    You are right to push the concept of evaluating trade-offs and I am sure your organization is better for it! :)

    Andrew: While Google's unique computation of ETV is patent pending, I have described everything that is happening except a very small (ok but important) last part.

    There are many ways in statistics to accomplish the same process. I encourage you to explore them and use them. There are too many methods out there and too much value to gain from them!

    Thank you so much for adding your comment, and support.

    -Avinash.

  41. 43

    I really like the weighted sort. Thanks for this great explanation of the feature.

  42. 44

    @Andy

    Thanks for the tip, but Avinash has already taken the words from my mouth… weighted calc metrics in Omniture are something entirely different; but I agree with you that they can be useful. It's something I've already come to know and use, but the GA implementation of weighted sort, as Avinash points out, is a more intelligent system and much more effective (imo, at least).

  43. 45

    Great post and cool feature.

    In your experience do you think the feature performs better or worse on data with different types of distribution?

    For instance, is it better with nice normally distributed data or with a long-tail (Pareto) distribution (amongst many other distribution types)?

    Are the results more or less "interesting" in either one and if so what's the most appropriate use of the feature?

  44. 46

    Donald: If you look through the explanation (above) of how Weighted Sort works you'll note that the distribution of data (normal or not normal) does not impact its performance.

    It is a function of overall Visits and Bounces (or Conversion or % New Visits) and the values of the Dimension (keywords, urls, campaigns, whatever).

    As such, it does what it does quite well across different types of distributions.

    Avinash.

  45. 47

    Avinash –

    Very interesting concept! Sounds like you made another breakthrough that will help our fellow Web Analytics ninjas.

    I wonder what kind of exact calculation is done inside the black box :) Anyway I think it would be really great if there's a functionality that can fine-tune weight parameters. Am I hoping too much? :)

    BTW I recently got a new job at JPMorgan Chase seeking a new challenge. Still living in the Buckeye city!

    Jonghee

  46. 48

    la la LaaaaAAAAA! Love it! It's hard enough having a website with what I think are a lot of pages; which one should I work on today? Gosh… is this page worth my time? Thank you weighted sort for literally saving my day! GLEE!
    Thank you for the explanation!

  47. 49

    WOW - this is incredibly insightful, and a brilliant way to gauge where you should focus your efforts!

    Kind of stumbled across this when logging into GA and saw the analytics blog post and followed through to here.

    Will track back to you on my blog :)

  48. 50
    Not yet a ninja says

    Avinash, this is a great post. I was told about this new functionality earlier this week and struggled with it. Your post really helped me to understand it and I am now using it – thanks.

    Only feedback for the Google folks would be that it doesn't seem I can use this on all reports (e.g. top content) – why not? (or am I doing something wrong!)

    Anyway, great advice and guidance as always. Thanks

  49. 51

    You are awesome. Thank you so much for such a detailed (yet fun and easy to understand) post!

  50. 52

    … the traffic from Argentina, Peru, Spain, Columbia, Chile and Denmark ….

    It's Colombia, not Columbia!!

    Anyway, great post Avinash… I have learned a lot today on your blog.

  51. 53

    Hi Avinash,

    Great feature and nice explanation of the functionality.

    Is there a possibility of adding other magical sort options for Pages/visit, Time on Page and custom variables?

    I plan to use custom variables to track survey responses and would like to:

    – Prioritize engagement rates
    – Identify user tasks which are most difficult for visitors to complete

  52. 54

    Brilliant as usual, Avinash. I just like how this rewires the synapses and allows me to look at and think about data differently – it can create an Aha! moment.

    I used Weighted Sort to pull up a report based on $ Index, and found a relatively high-traffic landing page on a client site that needed some work. A few changes, and voila, a 411% increase in click-through rate to the product Buy page. Good stuff.

  53. 55

    This really helps, thank you.

    Basically, GA is cleverly exploiting the principle of regression toward the mean. I like it.
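
    To illustrate the regression-toward-the-mean idea, here is a minimal sketch of one generic way to pull low-volume rows toward the site average before sorting. Google's actual ETV computation is patent pending and not disclosed, so this is not it: the shrinkage formula, the `prior_visits` parameter, the function names and the sample numbers are all assumptions made up for this example.

```python
def shrunk_rate(bounces, visits, overall_rate, prior_visits=100):
    """Blend a row's observed bounce rate with the overall rate.

    Acts as if every row had `prior_visits` extra visits that bounced at
    the overall rate. Rows with few visits land near the overall rate;
    rows with many visits stay close to their own observed rate.
    `prior_visits` is a hypothetical tuning knob, not a GA setting.
    """
    return (bounces + prior_visits * overall_rate) / (visits + prior_visits)


def sort_by_shrunk_rate(rows, prior_visits=100):
    """rows: list of (name, bounces, visits). Highest shrunk rate first."""
    total_bounces = sum(bounces for _, bounces, _ in rows)
    total_visits = sum(visits for _, _, visits in rows)
    overall_rate = total_bounces / total_visits
    scored = [
        (name, shrunk_rate(bounces, visits, overall_rate, prior_visits))
        for name, bounces, visits in rows
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented rows: a tiny keyword at 100% bounce, a big one at 70%,
# and a big one at roughly 30%.
keywords = [
    ("tiny keyword", 2, 2),
    ("big keyword", 700, 1000),
    ("average keyword", 298, 998),
]

ordered = sort_by_shrunk_rate(keywords)
for name, rate in ordered:
    print(f"{name:16} {rate:.1%}")
```

    A plain sort on bounce rate would put the two-visit keyword first at 100%. After shrinking, the keyword with 1,000 visits and a 70% bounce rate comes first, which is the kind of reordering a weighted sort is meant to surface.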

  54. 56

    Hi Avinash,

    Thanks for a great explanation about this new feature provided by GA. I am guessing we can use this feature to define a threshold or benchmark within the metrics. It will save a lot of the work that we do in Excel and this way we could do better analysis.

    There are lots of changes happening among web analytics vendors, and they are offering a lot more new features to meet today's business needs.

    Regards,
    Prashant

  55. 57

    Prashant: Weighted Sort works automatically, with no work or input from your end.

    If you want to define thresholds then consider using the inline filters that are already in Google Analytics (look for the box with a magnifying glass with the word Advanced next to it) or use Custom Alerts to set up notifications when things fall inside or outside areas you are interested in.

    Here's a blog post with ideas: Identify The Known Unknowns: Leverage Analytics Custom Alerts

    -Avinash.

  56. 58

    I'm working on optimizing a PPC campaign for a client, not a lot of data, but still, the weighted sort really comes in handy, so I googled it to find out what really goes on in the background and that's how I ended up here.

    My favorite part of your post: the math of using average bounce rate and country bounce rates to figure out how the weighted sort works. Who'd ever say I'd enjoy math after all the math I did back in my engineering days?

    What are your thoughts on limiting data by eliminating the lowest and highest figures in a chart? Do you think eliminating the obviously extreme values from an analysis gives a clearer, easier-to-reach conclusion, or would we be losing valuable insights from what dramatically works (or doesn't work)?

  57. 59

    I Googled for an answer with no luck/answer for this question: what happened to this great weighted-sort feature in the new analytics version? I'm tired of always switching to the old interface just for using this feature.

    • 60
      Felipe Veiga says

      This option is on the first row of any table, in the "sort by" button. Nevertheless, it is always grayed out for me and I can't switch to the old interface anymore.
