Smarter Data Analysis of Google's https (not provided) change: 5 Steps

It is astonishingly common that we are asked to analyze the impossible. In perhaps a career-limiting move I'm going to try to do that today (and for a controversial topic to boot!).

In this post about an important Google change, I want you to focus less on the data and focus more on the methodology. And – so important – I want you to help me with your ideas of how we can do this impossible analysis better, in the complete absence of data :). So please share your ideas via comments and let's together make a smarter ecosystem.

[Update: As of late 2013 secure search now results in almost all of our keywords not being provided. For the latest, please see this post: Search: Not Provided: What Remains, Keyword Data Options, the Future.]

On board? Let's go….

In an effort to make search more secure, on Oct. 18th Google announced that users logged into their Google accounts using www.google.com would be redirected to https://www.google.com. The search queries by these users would hence be encrypted and not available to website owners via web analytics tools such as Omniture, WebTrends, Open Stats, Google Analytics etc.

Having all the search queries in our keyword reports was our normal state; not having them feels different. As the change ramped up and more user queries came to be represented, in Google Analytics at least, under the moniker "(not provided)", we all got worried. From our perspective it would be immensely preferable to be able to analyze all the keywords individually. Sadly we don't have that now.

The wonderful thing is that in addition to passionate commentary on Twittersphere / industry blogs / gurus, we also have access to data for our own websites. We can, and should, look beyond simplistic "it is this high or that low" to see if we can understand something (anything!) deeper.

Most analytics vendors, including Google Analytics, reacted immediately to help us quantify the impact of this change in multiple ways. As you can imagine my reaction was to unleash a flurry of custom reports, apply smart advanced segments, compare data pre and post change, and go down a bunch of holes.

From that experience here are five steps I recommend you follow to gain a smarter understanding of this change…

#1: Establish macro context.

On Oct. 20th, on my Google+ page, I shared a custom report for Google Analytics that makes it extremely simple for you to look at this data: Visits, Unique Visitors, Bounce Rates, and Goal Completions for (not provided).

You can download that report into your GA account by clicking on this link after you are logged into GA: Google httpS Change Impact.

Here's what the data for this blog looks like for one month:

[Screenshot: (not provided) custom report]

Like me, the first thing you should do is compute the high-level impact of the change. From Oct. 31 (when the trend started to spike and subsequently stabilized) to Nov. 15…

Total site visits: 57,672
Search engine visits: 27,534
Google visits: 26,548
(not provided) – i.e. keyword unknown – visits: 4,651

User search queries not available: 4651 / 26548 = 18%

Please note that this number will vary dramatically depending on the type of website you have, audience attributes, geographic location and a number of other factors.

Now you know what the number is for your site, and you can keep the custom report handy to continue to watch what happens over time. Remember to divide the number by total Google traffic. I see people using total search traffic or total site traffic or… other imprecise metrics.
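The step-1 arithmetic can be sketched in a few lines, using the figures quoted in this post. The variable names are illustrative, not from any tool's API; the point is the choice of denominator.

```python
# Step-1 macro context, using the figures quoted in this post.
total_site_visits = 57672
search_engine_visits = 27534
google_visits = 26548
not_provided_visits = 4651

# The correct denominator is Google organic traffic. Dividing by total
# search traffic or total site traffic understates the impact.
share_of_google = not_provided_visits / google_visits
share_of_search = not_provided_visits / search_engine_visits
share_of_site = not_provided_visits / total_site_visits

print(f"(not provided) share of Google visits: {share_of_google:.0%}")
```

Run it against your own numbers; only the first ratio is the one worth reporting.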

All numbers in aggregate are at best marginally useful, and that rule applies to this one too.

We want to know more. Who are these people? Are they people I should care about? Not care about? And what kind of search queries are these? Brand? Non-brand? What else?

Sadly we can't answer all of those questions, but we can make a small clump of informed judgments based on data we do have. It just needs a pinch of passion, some smarts and a lot of effort.

Let's deep dive into some very cold and choppy waters…

#2: Understand the performance profile of the (not provided) traffic.

One of the things I hate about standard reports in all web analytics tools is that they scatter necessary data across tabs, multiple reports, or outright hide it. #aaarrrrrh

So I always use custom reports. In most web analytics tools it takes as little as 20 seconds to create one. I created one for this particular purpose. It provides me the end-to-end view of search keyword performance in one place.

Here is what it looks like:

[Screenshot: keyword analysis custom report]

You can download it into your Google Analytics account by clicking here: Keyword Performance Analysis Report

Two quick things to note.

1. Never ever never never never create a custom report without three critical elements: Acquisition, Behavior, Outcomes. Without the end-to-end view you'll make bad decisions.

2. It is a bit odd that my first dimension is Source (essentially All Traffic) for a keyword report. Before I dive into search data, I always like to set context in my mind for how important this (or any other) traffic is. It is rare that we look at the big picture before we go for the weeds; I personally find that suboptimal.

In this case, if you drill down into anything other than a search engine, that second drill-down won't make sense, but that is okay. Small sacrifice to be smart, right? :)

So how does (not provided) look? Here's my end to end view:

[Screenshot: keyword performance data]

The numbers in red were added to the report by me. I wanted to know what percentage of the total Visits and Goal Completions (not provided) was. [On that last point, if you have an ecommerce website you can use Orders or an appropriate proxy instead of Goal Completions.]

Bottom-line: 18% of the Visits and 22% of the Conversions.

Big numbers! But with a quick scan of the report, I think I already see that there is something delightful going on here. Stick with me. I think we have a surprise coming.

The custom report has eight metrics (two more than I normally use) simply to try to tease out some nuance of the performance as we look across keywords.

One hypothesis I had was that (not provided) might be mostly returning visitors. The overall search avg % New Visits is 67.96%, for (not provided) it is 65.06%. Very similar to the "average site visitor." But notice that all Brand Terms above (avinash, kaushik, occam's razor) have very low % New Visits. So it is possible that (not provided), contrary to my hypothesis, are mostly new people.

Overall bounce rate is 70.2% (not unusual for a blog/pure content site), and (not provided) is 66%. Again, scanning across the top ten terms you can see higher rates for non-brand searchers (people looking for specific, perhaps quick, answers) when compared to brand terms.

Content consumption, Pages/Visit, seems to be a bit on the higher side compared to the average (1.76). But like the other metrics above, there is a pattern between brand and non-brand (with brand higher on this metric).

I really, really care about Goal 2, hence that conversion rate is in the report. The average is 2.21%, (not provided) is around 2.37%. There's not much conversion going on with the broad non-brand terms (you can't get lower than 0% :).

Goal Completions is very interesting. (not provided) is a huge bucket of goal completions (and it is easy to understand why so many SEOs and Marketers and Lovers are in a tizzy!). The thing to note here is the numbers in red (% of each bucket compared to total Goal Completions, 4,816). See how quickly things fall off the cliff. Note the difference between brand and non-brand.

Finally, my absolute favorite: Per Visit Goal Value. There is no obvious monetization on this blog, but I have 8 distinct goals and I have goal values assigned to each for the long term impact each adds. (How's that for focusing on customer lifetime value? :)). $1.27 for (not provided), compared to overall of $1.01, and the number does not come close to the other brand terms.

We still don't know what keywords are contained in the (not provided) bucket.

But what we do know is that for this site (not provided) visitors fit this bill: they seem to be new people with behavior that is quite distinct from the "head" brand terms and closer to the non-brand terms.

In the past I've lovingly termed non-brand long tail visitors as "impression virgins." The hint at the end of this step is that I've got myself a lot of impression virgins in (not provided)!

Let's go and see if we can validate that theory.

#3: Deep dive: Match up performance profile to Brand & Non-brand visits.

Based on the clues above, I'm going to try to understand whether the performance profile for (not provided) is closer to brand or to non-brand searchers.

I create this simple segment in GA… should take you five seconds to do it for your own business…

[Screenshot: brand keywords segment]

Apply it to my custom report and boom!

[Screenshot: brand traffic performance]

[sidebar] A quick thing to note is the ratio of Unique Visitors to Visits. In the context of % New Visits that makes sense. But just make a note of it. [/sidebar]

How does this compare, purely from a performance of the key performance indicators perspective, with (not provided) for the same period?

[Screenshot: (not provided) keyword performance]

Quite a stark difference as you look across metrics like % New Visits, Bounce Rate, Pages/Visit, Conversion Rate and Per Visit Goal Value.

So how does the performance of (not provided) compare to that of non-branded keywords? Not a difficult question to answer.

Back into GA to create a segment like the one above, except change "Include" to "Exclude", and I have my non-branded traffic segment.

Here's what those numbers look like in the aforementioned custom report:

[Screenshot: non-brand keyword performance]

When you do this with your data you'll have a similar image, and you'll compare it to your (not provided) segment performance and your brand segment performance. In the comparison above it is clear that these three buckets are distinct, and that the performance of (not provided) is much closer to non-brand than to brand. Even though the (not provided) segment is small (4.6k) compared to non-brand (21.9k), it is worth thinking about the impact on averages when comparing these metrics.
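If you'd rather do this bucketing offline, say on a keyword report export, a tiny script produces the same three buckets the Include/Exclude segments create in GA. The brand pattern and the sample rows below are invented for illustration.

```python
import re

# Hypothetical brand / non-brand / (not provided) bucketing. The brand
# regex and the sample rows are invented; adapt both to your own site.
BRAND = re.compile(r"avinash|kaushik|occam", re.IGNORECASE)

def bucket(keyword):
    if keyword == "(not provided)":
        return "(not provided)"
    return "brand" if BRAND.search(keyword) else "non-brand"

# (keyword, visits) rows, as a keyword report export might look
rows = [
    ("occam's razor", 3200),
    ("avinash kaushik", 2100),
    ("(not provided)", 4651),
    ("web analytics custom reports", 480),
    ("bounce rate benchmark", 310),
]

totals = {}
for keyword, visits in rows:
    totals[bucket(keyword)] = totals.get(bucket(keyword), 0) + visits

print(totals)
```

The same per-bucket aggregation works for any metric you can export alongside the keyword (goal completions, bounces, and so on).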

There are two likely scenarios in terms of what you'll find…

In your case (not provided) segment might match overall Google traffic or one of the above segments. In which case you continue business as usual with the assumption of an even distribution.

It is possible that (not provided) segment does not match overall Google traffic, or one of the above segments, in your case. In this case you understand a bit better how to treat it in your thinking (more keywords connected to your brand or non-brand segments). At the moment you can't take action based on this information (how do you react to visitors whose keyword you don't know at all?). But when presenting to your senior executives you can give them a bit more context.

It does not eliminate all the questions, but it does help me go from "I have no idea who all these people/keywords are" to "Okay looks like it might be my non-brand possibly long tail traffic."

Something of value, right?
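One crude way to make the "whose profile does (not provided) resemble?" question explicit is a normalized distance across the key metrics. The brand and non-brand values below are invented except where this post quotes them; this is a sketch of the comparison, not a statistical test.

```python
# Compare a segment's metric profile to brand and non-brand profiles.
# Values are illustrative (only the (not provided) % New Visits and
# conversion rate come from this post's data).
metrics = ["pct_new_visits", "bounce_rate", "conv_rate"]

brand = {"pct_new_visits": 35.0, "bounce_rate": 45.0, "conv_rate": 6.0}
non_brand = {"pct_new_visits": 70.0, "bounce_rate": 72.0, "conv_rate": 1.5}
not_provided = {"pct_new_visits": 65.1, "bounce_rate": 66.0, "conv_rate": 2.4}

def distance(a, b):
    # sum of relative differences across the chosen metrics
    return sum(abs(a[m] - b[m]) / b[m] for m in metrics)

closer = (
    "brand"
    if distance(not_provided, brand) < distance(not_provided, non_brand)
    else "non-brand"
)
print(f"(not provided) profile is closer to: {closer}")
```

With your own segment exports plugged in, this gives a quick, repeatable answer to the scenario question above.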

All of the above is still kind of at an aggregate level. But we all have a lot of keyword level historical data. At some point we should have enough post change data that we can throw it all into a delightful regression model to fine tune our understanding at a keyword level.

At the moment we just know a little bit more than "here's my total (not provided)."

#4: Tentative conclusions. Why this seems so scary, but might not be (at least for now).

Most, but not all, of my branded traffic is my "head" traffic, i.e. traffic that results from a few keywords used by lots of visitors. After all your brand is unique to you and, for any type of website, drives loads of search traffic to you because you rank high in SERPs for those brand queries.

Most of my non-brand traffic is my "tail" traffic, i.e. traffic that results from a lot of keywords used by a few people each. For example, you'll notice at the very start of this post that during this time period I had 27k search visits. Of this, my "tail" traffic comprised 21,921 visits. These delightful folks used 10,498 distinct non-branded key phrases to find my website.

10,498 distinct search queries drove 21,921 visits!
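That tail density is easy to quantify from the two numbers above:

```python
# Tail density, using the figures from this post: each non-branded
# phrase drove only a couple of visits on average.
tail_visits = 21921
tail_phrases = 10498

visits_per_phrase = tail_visits / tail_phrases
print(f"~{visits_per_phrase:.1f} visits per non-branded phrase")
```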

Remember the two scenarios I'd mentioned above? Let's look at one of them (performance closer to non-brand traffic) and understand what is happening a little more visually. What is happening when (not provided) shows up as the #1 entry in your search keyword reports?

In my case, closer to scenario #2, the performance of (not provided) as shown by the metrics above looks more like that of the visitors who came via those 10,498 non-branded search key phrases.

Here's what's happening when (not provided) shows up #1 for me (clear in the screenshot in part #2 above), as explained by my head-tail illustration:

[Illustration: long tail slivers]

The gray slivers above represent the traffic that, after Google's change, became (not provided).

In the past only a small part, if any, of this traffic would, for me, ever show up in the top ten or twenty keywords in the report (head traffic). Because much of it was in the long tail I never noticed it (it is hard to look at all 10,498 keywords individually! :).

But after the change by Google, these tiny, in the past invisible, slivers combined look like one scary beast. I've painfully combined every pixel of gray sliver above:

[Illustration: long tail (not provided) slivers combined]

OMG! I've lost a huge chunk of something that was a very important part of my traffic!!

Not really. It just looks scarier than it really is because tiny shavings of your other keywords (now used by logged-in users in https sessions on google.com) appear in one big piece. Individual slivers don't look that scary. But combined they look like Darth Vader himself. :)

Let me hasten to add that this does not mean that these "slivers" from user search queries are not important. Or that just because they are mostly non-branded traffic we should ignore them (I argue 100% contrary to that here: Monetize The Long Tail of Search ). Or that you should not worry and that the sun is shining, there is no US debt problem, we have universal health care and Ashton and Demi are still together.

No. Not at all.

But the sky is not falling either.

We can use the actual data we have to keep a very close eye on this traffic and its performance. We can use advanced segmentation and custom reports to understand where this big scary block of traffic used to be. Is it (to repeat the scenarios we outlined at the end of part 3 above) closer to the average performance, and hence possibly evenly distributed, or closer to non-brand and less evenly distributed?

We sadly still won't know what actual long tail or non-brand keywords or overall keywords they represent or how much of a particular keyword/phrase they used to be. But my POV is that we'll be in a better place.

You can be, if the data in your case justifies this, just a little less worried.

#5: Additional awesomeness: Landing page keyword referral analysis.

One final idea I had was to wonder if the (not provided) traffic enters the website at a disproportionate rate on some landing pages when compared to all other traffic from Google. If that is the case we could do pre/post analysis on referring keywords to those landing pages and get additional clues.

It is not very hard to go check out that theory.

First, create an advanced segment for the (not provided) traffic:

[Screenshot: (not provided) traffic segment]

Then go and apply it to your standard Landing Pages report in Google Analytics (or SiteCatalyst or WebTrends or Yahoo! Web Analytics):

[Screenshot: top landing pages report]

The analysis from here on is not very difficult (though in the new version of GA it is harder as the UI designers got rid of the % delta for comparative segments – what a shame). Just use our bff MS Excel.

For example 14% of the (not provided) traffic enters on the home page.

I was able to find a small clump of pages where the (not provided) traffic, at least currently, entered the site at a higher rate than overall Google traffic. I can see the referring keywords to those pages prior to the change and after the https change and attempt to identify which keywords might be contributing traffic to (not provided).
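The landing-page comparison itself is simple enough to sketch. The page paths and visit counts below are invented; the idea is to flag pages where (not provided) enters at a noticeably higher rate than Google traffic overall (the 1.5x threshold is an arbitrary choice).

```python
# Hypothetical step-5 comparison: share of (not provided) entries per
# landing page vs share of all Google entries. Data below is invented.
not_provided = {"/": 651, "/analytics-tips": 420, "/about": 80}
all_google = {"/": 4000, "/analytics-tips": 1500, "/about": 900}

np_total = sum(not_provided.values())
g_total = sum(all_google.values())

flagged = []
for page in not_provided:
    np_share = not_provided[page] / np_total
    g_share = all_google[page] / g_total
    if np_share > 1.5 * g_share:  # arbitrary "disproportionate" threshold
        flagged.append(page)
    print(f"{page}: (not provided) {np_share:.0%} vs Google {g_share:.0%}")

print("Disproportionate entry pages:", flagged)
```

The flagged pages are the ones whose pre-change referring keywords are worth digging into.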

For me this analysis provided a better idea about some long tail non-brand keywords. But it was not as much as I would have liked to learn. Partly that is a function of the fact that those keywords are used by a handful of people and, what makes it worse, they are quite transient – they are rarely used again.

But since everyone's site and visitor behavior would be different I did want to share this idea with you. It is not a hard bit of analysis to do, and you can let the data tell you something (or not).

That's it.

A simple five step process to go from reacting based on an aggregate number in your keyword reports to a much more nuanced (if imperfect) understanding based on your own data.

Caveats:

Before we go, a few important reminders that are spread throughout the post above but bear repeating….

* Perhaps the most important one is that your business might be nothing like my business. For example, you could have a lot more volatility in your search behavior (e.g.: your top ten search keywords look dramatically different every week/day), which would make my comparative analysis in part two moot.

Use the steps above, but use your own data to arrive at your own conclusions.

* I'm comparing two weeks of data here, because that is all we have so far. I plan to revisit this analysis again in two more weeks, and then periodically to reaffirm my conclusions above or to burn them and start anew.

* We actually don't have any idea what keywords / key phrases comprise (not provided). We just have a better understanding of how that traffic performs.

* It is important to point out that Webmaster Tools and the AdWords Keyword Tool still have a lot of keyword-specific data related to your website. They don't have any (not provided) – mostly because their view is from Google and not from your website. Please use those two tools – both free – to understand keywords that cause your website to show up in Google SERPs, and queries that subsequently get clicks. Not exactly revealing 100% of what (not provided) search queries might be, but something.

Anything else I should have here that I've forgotten?

I would love to know how you would go about doing this impossible analysis. What other path would you take in your web analytics tool? What segment, report, metric, walk-on-water effort would you undertake? Regarding my five-step effort above… what flawed assumptions am I making? What would you change in terms of the approach/conclusions in any of the steps?

Was this nuanced understanding of what might be happening better than where you started?

Please share your alternative ideas (please!), critique of the above analysis, ideas for world peace via comments.

Thank you.

P.S: A request. This blog focuses on digital marketing and web analytics, it is not a policy blog. If you are up for it I would love for your comments to focus on the former and not the latter. If for no other reason than that my skills don't extend to the policy part and I would not be able to share anything of value with you.

I appreciate your consideration.

Comments

  1.

    That is an extremely interesting analysis Avinash, it sheds a lot of light into the issue.

    The only thing that I don't quite share with you is the "Why this seems so scary, but might not be" section. What really scares me is that 18% (or any other %) is just the beginning. Clearly, Google's objective is to grow that number to ~100%, and that would be very problematic for us. And, although 100% is not reasonable, the number could easily get very high with the huge adoption of Google+ ;-)

    •

      Daniel: Website owners (including you and me) have found keyword referral data to be almost mission critical. I do hope that it won't go away.

      I'm glad you found the analysis to be of value, it really is an attempt to try and dredge up some understanding in a place where we have little data to go on.

      -Avinash.

  2.

    I like your approach to this matter, but what I do miss is the action. Conclusions are good, actions are better. I want to see bounce rates for keywords to see if the landing pages contain what my visitors were expecting. I want to be able to spot new keywords for upcoming markets, etc. And my ability to do this kind of analysis to take action will decrease over time.

    Analyzing the big blob of unknown data will hardly earn me any money, and therefore it's a waste of time, in my humble opinion ;)

    •

      Andre: On the contrary, if you are able to find some insights from, as you rightly say, the "big blob", then you might actually be able to charge more rather than less. Most people will see the "big blob", throw up their hands and move on.

      Not you though. You are going to find something of value even if you have to compare individual keyword by individual keyword, pre and post change, to ensure you at least quantify for your clients how worried they should be and what action, even if limited, they can take.

      Right?

      :)

      -Avinash.

  3.

    Avinash

    Great post explaining what you are after and making it easy for us to follow – illustrations, screenshots.

    Nevertheless, maybe André, Daniel and I have to remember: measuring is great, but you have to use the right metrics for your business context. And while impressions/pageviews can matter, we have to see how much referral analysis relates to our key drivers (i.e. we need to define and agree on the latter before we measure).

    Thanks so much for sharing.

  4.
    Nelson Yuen says:

    Numbers #1 & #3 were it for me. It's been pretty easy to weed down which KWs were what by isolating traffic volumes and trending data historically.

    For many analysts and marketers, the initial "holy crap batman" reaction was because the "Unknown" was one of the top performing KWs or terms that generated the most amount of traffic.

    So in summary, I don't really share the same concern previous commenting colleagues have expressed. Correlation isn't necessarily causation. I suspect your top performing "Unknowns" were most likely brand terms or terms end users were familiar with searching on Google to specifically reach your site. I'm sure many people still type "Occam's Razor" or "Kaushik" to get to this blog, despite being avid fans of the AK brand.

    I reserve the right to be wrong. But from what I've seen in the ecomm space… IDK

  5.
    Ander Jáuregui says:

    Interesting and very illustrative analysis Avinash. You can also compare the results of your demonstration with a specific keyword research and, boom! It will not be precise, but it will get us closer to the light… we don't have to think only in numbers.

    You are proof that web analytics is something more, and into that something more we have to go deeper.

    •

      Ander: Excellent suggestion.

      For now I just don't have enough data to do keyword level analysis. But at least for my brand terms I have a lot of history (like everyone else) and at some point I should have enough pre post data to run some kind of massive regression to see if I can fine tune my understanding at a keyword level.

      For the long tail I think we might be in a stickier situation, simply because the numbers are small and the search queries unique and, to make matters worse, transient.

      Either way more analysis to do in the future once we have more data.

      Avinash.

  6.
    Benoit Arson says:

    Hello Avinash,

    I think the analysis you suggest may mislead your readers:

    The performance profile of "Not provided" keywords looks like the non-brand terms one so "Not provided" keywords are non-brand terms.

    You make an assumption: the behavior of people is sufficient to know which kind of terms they have searched.

    Take an other example:
    Black people are good basketball players.
    A group of persons plays basketball very well.
    All persons of this group are black people.

    The fact that you are a good basketball player is not sufficient to conclude that you are black.

    Before concluding anything about the affiliation of "Not provided" keywords by only watching their performance profile, you need to agree with this assumption.

    Then I suggest not creating a segment for the (not provided) traffic as you do in #5.
    A segment needs to be distinct from other segments and homogeneous within itself. And "Not provided" keywords are a melting pot of keywords: non-branded and branded keywords, "apples" and "oranges" keywords. The segment is not totally distinct from any other Search Traffic segment and it isn't homogeneous at all. As you say, we haven't any idea of the composition of this segment. I think we cannot compare this segment with the Brand Search Traffic segment as there are branded keywords within the "not provided" keywords segment.

    My opinion is that we have to not analyze the "Not provided" keywords and accept that we have lost precision and accuracy in the analysis of keywords coming from Google.

    •

      I totally agree with Benoit on this – with all due respect Avinash, an analyst cannot come to any kind of valid business conclusion with unavailable data – other than knowing how much data is not there… To make another analogy, a survey with a margin of error of 4% can't possibly lead an analyst to guess how this 4% would have answered – he/she only knows this is the "unknown"…

      •

        Benoit: Let me emphasize that I agree with you 100% that my analysis is based on certain hypotheses. I started with some. Then drilled into data. Evolved those hypotheses if data did not support them and formed new ones. But it is a risky exercise. I appreciate your stress on caution and not jumping to earth shattering conclusions. Thank you.

        If I were to stick with your analogy, then perhaps what I'm trying to do is not compare the performance of good African American basketball players with the general population of African Americans. Rather, I'm attempting to look at all the people who are playing basketball already (in my stadium!) and then using segmentation. Slightly less selection bias than might be implied.

        I concur that it is less than ideal. But my initial feeling was that, based on the 110,000 visits to this site, it was worth doing and learning something from.

        I'm afraid I personally refuse to accept that this group of visits is not worth analyzing because we've lost precision. Perhaps not yet. Perhaps at some point the "pieces of search queries that combine to make up (not provided)" will be, as you point out, comprised of a truly unrecognizable mutant. At that point we give up completely! : )

        You are right to advise caution. I'll be extra cautious. Thank you again Benoit!

        Avinash.

        •
          Andrew Blank says:

          I understand what you are trying to do and applaud it. All good analysts scrape for any pieces of data to get to the truth. Also you seem to be interested in calming the angry mobs with facts. Again a noble aim. I think if this didn't affect people's analysis in such a fundamental way it wouldn't be such a big deal.

          I even can see the pragmatic aspect of your saying in an earlier post that we will be going more toward sampled data and we had better get used to it (unless the uproar changes Google's mind, I'm surprised there isn't a formal petition).

          The part that I cannot get past is that looking at (not provided) is essentially examining an aggregate group (in the search keyword sense). Did you not say "All numbers in aggregate are at best marginally useful"? It's not just precision we've lost. Some of the basics of segmentation are simply not there. They might look/act like branded or non-branded now, but over time this group is meant to be more and more aggregate and include less and less defined action.

          Strangely it implements the opposite of what Google wants in making ads less relevant.

          •

            Andrew: I think I phrased it even more strongly: "All data in aggregate is crap." A bit outrageous but true.

            In this case we have lost visibility to some data. It is less than optimal. We can keep an eye on the special custom report to see how that evolves over time. The core purpose of this post is to suggest some ideas (and others have added more) about analyses that can add some productive value.

            To your last line… it won't impact ads (paid search ads) as your bidding process, the quality score type ad ranking structures, resulting visibility in reporting (ad groups, matched query type, keywords etc) will stay the same as before.

            -Avinash.
            PS: My expertise does not lie on the policy side of this issue, I'm a Marketer / Analyst / Businessperson. So I won't opine on the policy issues.

  7.

    I concur with Benoit – this is lost data that can't be compared against other segments with any hope of truly insightful analysis. I'm focusing on the keywords I do see.

    And I'm going to slightly take you to task for this, Avinash:

    "This blog focuses on digital marketing and web analytics, it is not a policy blog. If you are up for it I would love for your comments to focus on the former and not the latter. If for no other reason than that my skills don't extend to the policy part and I would not be able to share anything of value with you."

    Call me a cynic, but your blog post strikes me as a PR piece that basically says "look, losing keyword data isn't so bad, let me show you why and how."

    Yes, it is bad, and no amount of reporting will get around it, especially as the "not provided" segment only grows due to Google actively pushing users to log in and stay logged in.

    This growing blob of unknown visitors has me pondering recommending to all of my clients a move to an analytics platform other than Google Analytics. If Google is taking data away from web analysts, perhaps we should limit what we share with Google.

  8.

    Avinash,

    I am involved in creating strategies for companies to monetize their websites better. In my analysis I use analytics data to complete reports on SEO, website optimization and usability, competitive analysis, etc. Mainly, my goal is to figure out what is worth the effort and what to focus on that promises the best return. Your 5 step process will be helpful to make some sense of (not provided) data. Thank you.

    Limited keyword data "broke" many of my systems and processes because I plug in specific keyword data into my calculations. Even though keywords are just a part of the analysis, they are extremely important part because Google is basing their search results on the search query. So, I will need to figure out different ways to gather the data. My thinking now is that I will have to use more PPC tests for keywords than I have before.

    •

      Lyena: Thanks for sharing the impact on your processes. That is quite sub optimal.

      Love the idea of using PPC to test for keywords. I was doing that in the past for especially investment heavy optimization efforts. You are right that we can glean something in this case as well.

      If my conclusions are kind of, sort of, directionally correct, then I'm afraid we might lose visibility into the unknown unknowns (not knowing what we don't know), and that we can't test using PPC or anything else. We might have to collectively come up with new optimization strategies.

      Given how smart the SEO world is I'm confident we will.

      Avinash.

    • 24

      The cynical side of me thinks the folks at Google will be doing cartwheels knowing more people will turn to PPC to get some higher quality keyword data.

      Unfortunately, at present, my non-cynical side doesn't have much to say for itself.

      I guess I'll have to just try to make the best of a sub-optimal situation, and will try applying some of these points.

  9. 25

    Trying to find a segment that matches (not provided) traffic might generate some insights… at least for now.

    But after all, it still feels like digging in the dirt. Maybe on a higher level ;) The share will continue to grow, and we had better get used to looking at Google (organic) exactly the same way we analyze other "referring websites".

  10. 27
    Sean Riordan says:

    Great article with very useful Custom Segments and recommendations on how to analyze this new pocket of traffic (maybe not new, but newly labeled).

    While I see the benefit of finding better/more effective ways of analyzing the (not provided) visits, I worry about the future. In my mind, these percentages of visits, hidden keywords, conversions, and returning visitors will only increase with time. Likely more than Google initially predicted when it said, "Keep in mind that the change will affect only a minority of your traffic."

    Google is building up a large base of users with Google+. I am guessing a majority already had Google accounts, but new registrations are a guarantee with a brand new social network. It also means previous Google Docs, Gmail, Reader, etc. users will stay signed in so they can see updates from those they follow. More people will be signed in and secure when completing search queries through Google. Add in the fact that Chrome is also generating a huge user base, with people tying their Google services together across multiple computers.

    These all add up to a growing number of (not provided) visitors and a gap in the keywords being collected. While keyword metrics aren't the only thing to look at for a website, they are certainly important enough to be worried about.

  11. 28
    Elixa says:

    Absolutely love your approach. Web data analysis is problem solving. If we don't know the value of x, we do the math and figure it out. User intent in analyzing queries is the data that is most valuable. Increased personalization in search results (such as within the customized results being delivered via Google Plus) changes the way we figure out user intent.

    I believe the data delivered by the +1 is going to become the means of quantifying the impact of "not provided" traffic. In the short term, we should focus on that data (found in Webmaster Tools) and on the information from the "view ripples" on shared Google Plus posts. Common interests of users within the ripples would be great data I'd love to be able to gather and analyze for keyword research.

    I think there is a tendency to over-complicate the math when the variables change. I favor the approach of analyzing segments according to their nature. How can one begin to analyze Google Plus traffic without understanding the nature of the platform? Social interaction and personalized search results means I cannot analyze this traffic by the same methods I use to analyze "public" search engine traffic. The nature of Google Plus strikes me as a combination of Stumbleupon-type of viral traffic activity and personalized ranking of results based on user preference.

    I want to see my "not provided" traffic grow, because it means those users prefer my websites and find them relevant. I have little use for website statistics that do not deliver a means to quantify user intent. So far "not provided" is about a new perspective and new opportunities, as Avinash suggests, in my humble opinion. And some new math problems to solve ;).

  12. 29
    Jake Winter says:

    I have to say this entire analysis is quite flawed. The 18% isn't more like the non-branded segment. It is representative of an audience that is mostly non-branded or long tail. This is supported by the fact that nearly every measure of "not provided" is close to the average. If you want to characterize this audience, look at the differences between the average and the sample.

    If you want to be honest with us, just say we lost a portion of the sample and our margin of error has increased on these particular reports.

    • 30

      Perhaps we should start lumping "not provided" in with direct.

      • 31

        But it's not "direct." It's "not provided." We still need to consider it Organic traffic.

        • 32

          True, but it's a new type of "unknown"

          • 33
            Jake Winter says:

            I wouldn't go that far. We do know they came from Google. That's something. I think we just want to monitor that segment and make sure that it stays in line with the other 80-85% of organic search. If its behavior starts looking different, then I'd be alarmed. As long as the (not provided) segment looks representative, we just accept the higher (but, in the world of statistics, still desirable) margin of error that an 80-85% sample represents.

  13. 34
    Vanessa Fox says:

    One of the bigger issues I'm dealing with is trends by topic over time. I use regular expressions to create categories of queries and then chart that progress over time. So, for instance, if i'm looking at a cooking site, I might segment all of the cake-related queries so that I can see if I'm getting more or less traffic from that category (which could include a thousand distinct queries) and what the behavior of those searchers is vs. say, the ones doing pie-related searches.

    With this change, of course, there's often a growing traffic decline for these segments (as the not provided segment's traffic goes up). I always tell people to look at traffic and not rankings, but understandably, as people see these traffic drops, they want to look at rankings to see if they've lost rankings or if the lost traffic is simply now part of not provided.

    Surfacing actual issues that need to be fixed (lowered rankings, negative change in SERP display impacting CTR, etc.) becomes obscured, because it's easy to assume that all lost traffic is due to that traffic now being part of not provided.

    I suppose one could calculate the percentage of not provided over time and compare that to the traffic loss for other query segments, and then only investigate for a potential problem if the percentage is vastly different, but as you point out, it may not be the case that the distribution is even across all queries, so I'm not sure about this method.

    My understanding is that Google is slowly rolling this out over time, so I think this may be an issue for some time as the traffic continues to shift.

    Any ideas on how to think through this?

    • 35

      I think that assuming a more or less even distribution makes the most sense, even if that may not actually be the case. If you can determine that a particular long-tail keyword segment's traffic (cake-related, with the 1000+ unique queries) decreased in a way that correlates relatively strongly with the increase in 'not provided', then I suppose you could adjust your attribution of cake keywords to the 'not provided' segment a bit. And I do agree with Avinash that this analysis should be done (I like the landing page segmentation, personally). But as the quickly growing comment section is stressing, use caution. Bottom line: I think that in most situations we're going to find that (not provided) is exactly what it says it is – obfuscated, 'not provided' data – and that both SEOs and web analysts are going to find their hands quite tied as Google continues to roll this out.

      • 36

        Vanessa: For now, if there is enough history, we could probably construct some regression model to understand a little better whether there is an egregious impact on "cakes" or "brownies" or "pastries." But even then it might just be an educated guess.

        I believe, as I mention in other comment replies, that for "head terms" where we have a lot of data we could make a good educated guess, but for the long tail of search queries (and in businesses like cooking sites and web analytics blogs there is a very, very, very long tail :)) we might just be out of luck. It is simply a function of how much data we have for individual search queries.

        Sadly in this case we've just lost the data and some of the kind of wonderful analysis you are doing would be harder to do.

        As Elixa says above, time for new ideas / processes. :)

        -Avinash.

        • 37

          @Vanessa,

          I think that you'll have to rely on the landing page analysis rather than focusing on the keywords in the case you mentioned. Regardless of (not provided) or other keywords changing, you'll see the fluctuations in actual visits to the pages in the category in question. From there you'll need to look at your pages and hypothesize whether or not you are driving traffic to those pages from each page's keyword-focused terms (cross-check with non-logged-in data or the data you have for the same pages from Bing).

          It is a complete shame that data quality is heading in the wrong direction (once they switch Android traffic over to https "not provided" will jump again even if Google+ growth stagnates).

  14. 38
    Adam Audette says:

    Interesting piece, Avinash, and especially insightful comments here. I just wanted to share this idea for capturing a bit of data from paid search that could be 'backfilled' to the organic segment:

    If you're running SEO and PPC campaigns in tandem for your site/clients, it may be instructive to look at the landing pages for the (not provided) segment. Then, pull paid search data on these same URLs, and run a search query report to find the actual queries that fired the paid ads. What you'll end up with is the raw search term visitors used to reach the same URL via paid.

    Of course, we'll need to keep in mind that this isn't apples to apples and probably quite different behavior, but at least it's *something*. And something is better than nothing.

    A couple of caveats: not only do you need to be running paid ads on the same URLs that you're focusing on for SEO, but you also need to have a fairly high organic rank to make this meaningful. Cannibalization and other factors that occur between organic and paid listings also apply.
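    Adam's backfill might be sketched roughly like this in Python (the pages, queries, and click counts below are hypothetical; in practice the inputs would come from a landing-page report for the (not provided) segment and a paid search query report on the same URLs):

```python
# Hypothetical sketch: use a paid search query report to suggest likely
# queries behind (not provided) organic visits to the same landing pages.

# Landing pages that received (not provided) organic visits.
not_provided_pages = {"/cake-recipes/", "/pie-recipes/"}

# Paid search query report rows: (landing page, query, clicks).
ppc_queries = [
    ("/cake-recipes/", "chocolate cake recipe", 120),
    ("/cake-recipes/", "easy birthday cake", 45),
    ("/pie-recipes/", "apple pie from scratch", 80),
    ("/contact/", "bakery near me", 30),
]

def backfill_queries(pages, ppc_rows):
    """For each (not provided) landing page, list the paid queries that
    fired ads pointing at that same URL, sorted by click volume."""
    result = {}
    for page, query, clicks in ppc_rows:
        if page in pages:
            result.setdefault(page, []).append((query, clicks))
    for candidates in result.values():
        candidates.sort(key=lambda qc: qc[1], reverse=True)
    return result

likely = backfill_queries(not_provided_pages, ppc_queries)
```

    The output is only a ranked list of candidate queries per landing page, not the actual organic queries, per the caveats above.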

    • 39

      Adam: A great suggestion, thank you.

      I believe this will be especially helpful on our head terms (brand or non-brand). It is more likely that we will have both PPC and SEO going at the same time for those, and a good bunch of traffic to do a good-enough analysis.

      I'm going to try this option with some clients!

      -Avinash.

  15. 41

    While I definitely appreciate the process of using advanced segments and custom reports you outlined to try to describe the (not provided) keyword segment, like Benoit and Stephane I too am wary of drawing conclusions when the data is simply not available. Maybe the (not provided) blindfold is not 100% opaque, but it is quite far from being even translucent, even after slicing and dicing the heck out of it. In other words, while I would advocate doing this type of analysis on the (not provided) segment, I would also stress the "take with a grain (kg) of salt" aspect of things.

    In light of the lack of data, I tend to use an even distribution model for (not provided): namely, I consider each of my Google organic keyword sources as really receiving an additional X percent of visits. This works OK for head keywords, though it breaks down a bit when analyzing the long tail.
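    As a minimal illustration, this even distribution model might look like the following in Python (all keyword names and visit counts below are made up): each known keyword absorbs a share of the (not provided) pool proportional to its share of known visits.

```python
# Hypothetical sketch of the even-distribution model: spread the
# (not provided) visit pool across known organic keywords in proportion
# to each keyword's share of known visits.

known = {"avinash kaushik": 500, "web analytics": 300, "bounce rate": 200}
not_provided_visits = 180

def redistribute(known_keywords, hidden_visits):
    """Return visit counts adjusted by a proportional share of the
    hidden (not provided) pool."""
    total = sum(known_keywords.values())
    return {
        kw: visits + hidden_visits * visits / total
        for kw, visits in known_keywords.items()
    }

adjusted = redistribute(known, not_provided_visits)
# "avinash kaushik" has 50% of known visits, so it absorbs 50% of the pool.
```

    As noted above, this is only defensible for head keywords; for the long tail the proportional assumption gets shaky.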

    @Dave Culbertson, I don't see how moving away from Google Analytics will help anything…

    • 42

      Yehoshua,

      I think my point is that the Google empire is built on consumers and businesses freely giving it data. If they're going to stop sharing extremely valuable keyword data with us, what is our incentive for continuing to give all of our web analytics data freely to them? Quid pro quo.

      • 43

        I hear you. Though I doubt that they would really care even if there were a revolution in that regard. Also, theoretically we can change the data sharing settings in Google Analytics. That is, of course, if you trust that Google actually honors those settings. :-)

  16. 45
    Kevin says:

    This is an interesting way to noodle this, Avinash.

    I think I'm going to start overlaying two parts of Google Analytics to study this. Taking a look at the new SEO reports (since I connected Webmaster Tools to GA) and then overlaying that data on Keyword data.

    I think if we filter out the Image property and look at only the Web property, we'll see whether we've lost impression share from the 31st on. If we haven't, but have lost a significant number of occurrences of specific keywords, we should be able to make a decent guess as to what (not provided) is replacing.

    I'm sure we could get even more complex, and look at the landing pages as you have as well.

  17. 52
    AJ Kohn says:

    I also don't see (not provided) as a problem and wrote about it today as well:

    http://www.blindfiveyearold.com/not-provided-keyword-in-google-analytics

    I approached it from a slightly different perspective since I believe that the keywords and query intent for those in (not provided) are NOT materially different from the rest of the population.

    With that in mind I sought to apply the distribution of keywords outside of (not provided) to the (not provided) pool. This becomes much easier when you cluster keywords – not by brand and non-brand but by the keyword phrase itself. (Thank you Google Refine.)

    Unfortunately I'm unable to share much of that in my post due to client privacy, but I find that by doing so the results are fairly accurate and consistent across client installations.

    If the mix of traffic you see is primarily 'new' and 'branded' then seeing that modeled in the aggregate for (not provided) makes sense but doesn't, to me, indicate that all of that traffic meets that profile. It's probably just weighted as it is within your normal keyword universe.

    This could change and might differ by site but I'm not seeing it right now.

    • 53

      AJ: I concur with you that the situation, at least for now, is less dire than it might seem. Hopefully people see that in both our posts.

      I also like your suggestion of an enhancement where the words stay anonymous but a count is provided of how many there are, and of the traffic contributed by each. I don't know if that will happen, but it is an interesting suggestion.

      I wanted to share two thoughts.

      1. Across the five or so websites I analyze actively (including my blog), the assumption that the keyword distribution and intent are NOT materially different does not seem to hold. The distribution is unclear, but the performance is very different (the basis of my post).

      But that is ok. Each site/client will be different and we will have to continue to watch this over time and refine conclusions.

      2. I shared this in my comment to Vanessa and others… I am as yet unsure whether we'll be able to do a decent (accurate enough) distribution analysis. For head terms it might be possible (once we have enough data and can set up some simple models), but for long-tail terms this is going to be very hard to do (almost impossible).

      This is just the start of the journey. Your post was delightful, thank you for sharing it. I'm really glad that we have a small bunch of posts moving beyond the "OMG the number is so high" to "Let's see how we can evolve."

      Thank you.

      Avinash.

      • 54
        AJ Kohn says:

        Thanks Avinash. I do hope suggestions around number of keywords within (not provided) or even an anonymized keyword drill-down are, at a minimum, considered by Google.

        The latter would provide a far better way to determine and 'match' keywords from the (not provided) to our normal keyword universe. So, if I see that 40% of my (not provided) traffic comes from one keyword with a certain engagement profile I could look for a similar keyword with that volume and profile outside of the (not provided) traffic pool.

        Interesting that you've seen that the traffic is materially different in your installations. Once I clustered the keywords appropriately, I saw some variance, but not what I'd consider significant at this stage.

        I too am glad that we're talking about solutions and work-arounds. Your steps definitely make me want to dig deeper into the profile of the traffic so I have more due diligence ahead of me. That's what makes this industry so interesting, it changes from day to day and from site to site.

        One last question I have is what exactly we do with this information? If the (not provided) traffic IS materially different, how do we translate that into action? Can I dynamically serve different information based on identifying a visit as (not provided)? Could I retarget this group specifically to gain more insight?

        I'm thinking out loud here, but I'm interested in making sure these insights can be leveraged.

        It's sort of like the dilemma around many new medical tests: they provide a bit more information but often don't allow you to take any additional action.

        • 55

          AJ: To your question about what to do with this information… I have to admit I went into this dark tunnel just wanting to know what was in there. Everyone seems to indicate that there were monsters inside. :)

          On a serious note, the hope was to primarily eliminate something that has been distracting SEOs and Marketers for the last month. Now that we have a way to figure out just a little more (and not as much as we want) I would foresee two actions.

          If the performance profile for this segment looks just like your average search traffic, then perhaps the distribution is more even, and we can just go about our business knowing we don't have to worry much – no matter if (not provided) is #1 in our reports.

          If the performance profile looks like brand or non-brand, or head or tail, then we have a bigger worry. We understand what it is, but it is unclear what we can do about it. Adam suggested an SEO-PPC strategy that is interesting. We could dive deeper into landing-page-segmented analysis to make educated guesses at where the impact is. Many people have suggested keyword-level regression analysis. This would allow us to ensure optimal effort on SEO and landing pages.

          Those are some actions that spring to mind.

          -Avinash.

          • 56
            AJ Kohn says:

            You're absolutely right about banishing monsters and limiting distractions. That's a positive outcome in and of itself.

            I too like Adam's idea of running PPC alongside SEO. And regression analysis could also be interesting, though I'm not sure that's an action everyone could take.

            Don't get me wrong, I love this post and the exploration. I'm just mulling over how to integrate it, particularly given all the other demands on our time.

  18. 57
    Courtney says:

    Does anyone know if this new https change at Google affects Insights for Search? We've been tracking our most popular (by far) branded keyword as a gauge of public sentiment, and as of mid-October things kind of fell off compared to last year. I ran all of Avinash's recommended reports and found the opposite: our 'not provided' traffic is branded (meaning it could include a lot of this key term).

  19. 59
    Josh Braaten says:

    Thank you for your insights, Avinash. Most of my comments are on policy, so I'll honor your request and take those thoughts elsewhere.

    I appreciate your efforts to help us make lemonade out of this unfortunate decision with suggestions for analyses such as the one in this post.

  20. 60
    Damon says:

    One idea for improving the quality of the segments that would work with a lot of sites is to divide the (not provided) by landing page.

    Branded queries mostly land on the home page along with a few short, highly-competitive terms while long-tail queries typically land on deeper pages.

    A (not provided) segment that also excludes visits landing on the home page would give you a much cleaner segment covering long-tail (not provided) queries. Or you could take it a step further and look at organic search traffic segmented by landing page with a list of referring keywords.

    The more I think about it, the more I like the idea of switching from looking at referral data primarily segmented by keyword to referral data segmented by landing page and then by keyword.

    - It makes spotting any unusual patterns in missing keyword data easier,

    - It lets me easily see when search engines are sending visitors to the wrong page for a particular query, and

    - It lets me look at buckets of related keywords, giving me a better sample size on small sites.
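    As a rough sketch of Damon's landing-page split (the visit rows below are hypothetical; a real version would read a keyword-by-landing-page export from your analytics tool):

```python
# Hypothetical sketch of the landing-page split: separate (not provided)
# visits landing on the home page (likely branded) from those landing on
# deeper pages (likely long tail).

visits = [
    # (referring keyword, landing page)
    ("(not provided)", "/"),
    ("(not provided)", "/cake-recipes/"),
    ("chocolate cake", "/cake-recipes/"),
    ("(not provided)", "/about/"),
]

def split_not_provided(rows):
    """Bucket (not provided) visits into home-page vs deep-page landings."""
    home, deep = [], []
    for keyword, page in rows:
        if keyword != "(not provided)":
            continue
        (home if page == "/" else deep).append(page)
    return home, deep

home_landings, deep_landings = split_not_provided(visits)
```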

  21. 61
    Joshua Kotowski says:

    Thanks for the post Avinash.

    I agree with your statement that it is not as scary once you add this type of context… at least 'for now'. Yes, it's not perfect, but it is definitely something worthwhile to do, without a shadow of a doubt.

    Imagine that if we dive deeper and deeper into this and view entrance pages, geography (as you include as a dimension), etc. etc. we get more to 'profile compare'. The point is that we have to do something and not be content with dealing with (not provided) as is… right?

    As this continues, and more and more (not provided) acquisitions, behavior, and outcomes end up in our history, that changes the tune a bit. Occupy Google?? Just kidding. Let's not do that. For now an educated guess is better than no guess at all.

    Good comments and insights here. Great passion!

  22. 62
    Brett says:

    This is an excellent article, Avinash! I love how you stick to the facts without getting all emotional.

    I did the above analysis on one of our sites, but unfortunately couldn't come to any conclusions. Our (not provided) keywords seem to be a mixture of branded and non-branded keywords.

    We originally thought this change by Google would make a huge impact on our ability to make decisions with regard to our site. However, I really think the major impact is on long-tail keywords. Our head terms can be measured through keyword tracking tools, and thus only our CTR is hard to measure after the change. But losing the long-tail phrases because of the change undermines our ability to create content around subjects our users are searching for.

    Thanks for the analysis here.

  23. 63

    For those who have wondered, there is a petition requesting that Google reverse this decision:

    keywordtransparency.com/

    It's 100% legit, started by Danny Sullivan of Search Engine Land.

  24. 66

    Thanks for this enlightening post and the intelligent, creative discussion in the comments. Certainly analysis with PPC and WMT is going to get more important so thanks to the contributors who shared their ideas on this.

    One other thing I am going to do is compare the profile of (not provided) traffic with that of traffic from social media sources. After all, if people are signed in to Google+ they are in more of a social mindset and sharing their recommendations. This could be why people are coming to the home page so often and are frequently new to the site: they have been introduced by the people they are networking with.

    If there was any correlation though, and we made improvements to the site to get better results from social traffic, wouldn't the end result be more social sharing and a growing proportion of (not provided) results? I have signed the petition above too….

    • 67

      Liz: I'm glad you found the post to be of value.

      It is a very interesting idea to think of doing some kind of correlation with social traffic, I've not done that so far. Will look into it.

      Please know that people can be using Google search when they are signed into Gmail or YouTube or other Google properties. They might not even have a Google+ account. So in our effort we'll keep that in mind.

      -Avinash.

  25. 69
    Andy Dutson says:

    Thanks for another great post, Avinash. As someone who mainly deals with UK-based websites, I am still awaiting the full impact of this. On another note, and quite frankly even more frightening, is the impact of the EU-wide cookie directive, which is already in force; the grace period ends early next year. This will dramatically reduce the possibility of tracking visitor actions at all, never mind what keyword they used to find your site.

    The ICO's own analytics reported a 90% drop in recorded traffic due to cookies not being accepted by visitors to the site: http://econsultancy.com/uk/blog/7692-ico-follows-ico-rules-cookie-usage-drops-by-90-percent. The data was obtained through a Freedom of Information request and is documented at http://chinwag.com/blogs/sam-michel/cookiepocalypse-implementing-new-law-drops-use-90

    As someone who is looking to develop my skills in web analytics, this, together with the topic of your post, is very disheartening. Sorry to be negative, but the facts are there for all to see, and the industry seems to be burying its head in the sand over the issue.

    • 70

      Andy: Indeed we live in a complex world with numerous challenges: for Marketers trying to do hyper-relevant marketing, for Site Owners creating hyper-relevant experiences, and for Analysts whose job it is to make organizations data driven.

      But I remain an optimist.

      In Digital Analytics (site, mobile, apps, etc.) we still have exponentially more data than any other marketing channel on the planet. For example, TV advertising is something like 200 billion dollars (Billion!), and its effectiveness is measured by observing the behavior of just a few thousand people (fewer than the number that visit my blog every day). We, Digital Analysts, will continue to have a good solid source of data to do good, solid, inventive analysis beyond "here are the page views."

      If you are thinking of a career in digital analytics then let me assure you that if you are good at it, your career will be recession proof (even if 50% of our data sources go kaput!).

      -Avinash.

  26. 71
    Erica says:

    Although I agree with the general "aaah, we are all going to die" feeling, I appreciate your effort to try to get something out of the "big blob". It doesn't change the fact that this is a big hindrance to analytics-driven SEO, in my opinion, but it is good that you pointed out a few things, such as the possibility of filtering by landing page and brand/non-brand – not to mention GWT data.

    Of course, the bigger the amount of data, the easier it is to come to wrong conclusions :(

    I still don't understand whether this (not provided) analysis is a useful tool or a waste of time; I did, however, very much enjoy the custom segments =)

  27. 72
    John W. Furst says:

    Thank you for putting this into a different perspective and context. This is heavyweight information. I will look at my situation through your glasses.

    Yours
    John

  28. 73
    Alan Perkins says:

    Hi Avinash

    Your slivers in #4 are interesting.

    If many of the (not provided) keywords are long-tail terms represented by small slivers, it makes me think that we should use a filter to convert all (not provided) keywords back into unique keywords that give us some more data, and reports like the good old days of pre-October 18. E.g. instead of just having "(not provided)" in our reports, we would have something like "kw[insert unique keyword id here]".

    Taking this a step further, perhaps instead of a simple unique id the filter could create some combination of:

    * the landing page (so we could see which pages were driving the most not provided data)
    * the date/time
    * other data, e.g. __utma, new/returning, number of visits, etc.

    i.e. the who, where and when of that visitor. Even if we don't have the keyword, this data could allow us to deduce most if not all of what we really needed to know.

    This would give us a whole bunch of single-visit keywords rather than an amorphous mass of (not provided). Analysing those single-visit keywords, we could perhaps better figure out the kinds of keywords being used and group them into sensible segments, which could be much more refined than brand/non-brand. In your case, Avinash, rather than 4651 "(not provided)" head keywords, you'd have a long tail of 4651 different pieces of info that you could analyse alongside genuine keyword data to detect trends and patterns between the two.
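    To make the idea concrete, here is a rough sketch of what such a filter's output could look like (an illustration in Python, not an actual Google Analytics filter definition; the label format and function name are invented):

```python
# Hypothetical sketch of the filter output: rewrite "(not provided)" into a
# synthetic per-visit label built from landing page, timestamp, and visitor
# type, so each hidden visit remains distinguishable in reports.
import hashlib

def synthetic_keyword(keyword, landing_page, timestamp, visitor_type):
    """Pass real keywords through unchanged; replace (not provided) with a
    deterministic, report-friendly synthetic label."""
    if keyword != "(not provided)":
        return keyword
    # Short stable id so the label stays readable in reports.
    uid = hashlib.md5(f"{landing_page}|{timestamp}".encode()).hexdigest()[:8]
    return f"(np) {landing_page} | {visitor_type} | {uid}"

label = synthetic_keyword("(not provided)", "/avinash/", "2011-11-28T10:15", "new")
# label is stable for the same visit details and distinct across visits.
```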

    • 74
      Kevin says:

      Ugh. One thing I didn't really think about when I was responding a few days ago, or when I was reading this post, is… new content. I posted a quick blog entry today that was a little off-topic compared to the rest of my content, and started seeing encrypted Google traffic right away. And I'm left wondering why/how.

      For new posts, I have nothing to go back and dissect/compare/analyze. It's really just a bit of a vacuum. :)

      • 75

        Alan: This is a great suggestion. I've not implemented anything like this yet but if you do implement it would you please share your filters / results with me? If you find the analysis to be of value I'll be happy to add it to the blog post.

        Kevin: Sadly that is true.

        Hopefully in just a few days you'll start to see a better balance of logged in and not logged in visitors and you'll get a feel for what queries might be in not provided.

        With a lack of history (new sites / pages) the challenge is compounded in terms of our ability to do analysis.

        Avinash.

  29. 76
    gaby says:

    Very informative post Avinash, and there seems to be a creative solution in nearly every reply. That's what I like most about this kind of work — merging art and science to come up with creative solutions to business problems.

    I don't have quite the experience as some of the others that have replied, but I can't help notice that the conversation is focused on off-site search metrics. Wouldn't on-site search metrics for the (not provided) segment help fill that gap a bit?

    I suppose I may have oversimplified the situation, but if the question is "Google wants me to provide quality content to rank, but how can I do that if I don't know what search terms are landing the (not provided) segment of users on my site?" then it seems to me that looking at that segment's on-site search behavior would give you the closest answer.

    Unfortunately, I don't have access to any sites that have on-site search implemented, so I'm not able to test this idea. I would however be interested in any thoughts on including on-site search metrics to some of the other recommendations made here.

    • 77

      Gaby: Interesting suggestion!

      The challenge with internal site search data is that on most sites only around 10% of Visitors use site search; hence the percentage of Visitors who come via organic search and then search internally will be smaller, and of those, the ones who originally came via (not provided) will be smaller still.

      But with that small caveat out of the way, this is a very easy analysis to do; it takes all of five seconds. In Google Analytics (or your favourite web analytics tool), go to the standard built-in internal site search report. Click on the Search Terms report. Then at the top of the table choose the Secondary Dimension called Keyword and… boom!

      [Screenshot: Internal Site Search Data for Not Provided]

      On the left is the term searched on your/my website. Next to it is the search query that was used to come to your/my website from Google. Interesting!

      The above screenshot shows all external Google search queries; you can add an Advanced Filter to the report (click on the magnifying glass icon) to only Include Keyword Containing not provided.

      Thanks for the idea.

      Avinash.

  30. 78
    Michael says:

    Very Helpful!!!!! You are the best Avinash!

    Thanks for all your helpful insight over the years. You are truly a master.

  31. 79
    Alex Thomas says:

    Avinash, if it wasn't blocked off by security settings – a great way to get the referrer information would be to open the previous URL via the Javascript window.history object in an iFrame. Then you could simply pull in the referrer data from the iFrame's src.

    Naturally this opens up huge security/ privacy issues for users which is why we can't do it. – And also why iFrame history is an extension of the main window's history these days!

    Of course send me a message if you ever do find a way around it ;)

  32. 80
    Jeff Baker says:

    Another problem: Not Set is now our number one Mobile Device in our metrics. In a one-month view, the Not Set mobile device is higher by a thousand than the next device (the iPhone). How can we use this metric with such a high number of Not Set showing up?

    • 81

      Jeff: (not set) takes the place of values that do not exist for a particular dimension. In the report you mention, it means that for that many Visitors the device info was not recognized by Google Analytics.

      If you would like to try a third-party service to get mobile data, or check an alternative set of device reports, you can try PercentMobile. They have a free version as well.

      (If you do a View Source you'll see that I use Percent Mobile. But I'm not affiliated with the tool in any way. I'm just a fan of their reports.)

      -Avinash.

  33. 82
    Hugh Gage says:

    Would it be worth analysing the (not provided) segment by landing page?…. This would surely give some indication of what these terms were. Apologies if somebody has already mentioned this above.

  34. 83
    Edith Sánchez says:

    We all agree this change may be affecting our data but may I suggest…..

    Tiny things for those who freak out:

    1-If the change doesn't seem to affect the numbers at a significant level – obviously or statistically – don't freak out

    2-Remember seasonality, if it exists on the site you analyse

    3-Which leads to proper sampling

    4-Remember there are statistical tests you can use on the same object of analysis over time (the numbers that look different)

    5-Also, are you sure, totally sure, there aren't other decisions in what you analyse that may be changing the numbers?

    When uncertainty comes, begin with the obvious
    or
    Don't start looking in places you never go

  35. 84

    I am reading the book Web Analytics: An Hour a Day in Portuguese and it is great, laying out all the concepts before the practical approach.

    Congratulations!

  36. 85
    Kevin Hill says:

    Our logged in percentage and thus (not provided) percent is around 35%.

    I see this number only getting bigger. This is really a big, big problem. What is interesting is that we still get keyword metrics and data for adwords ads, but not for organic. Hmm….

  37. 86
    Michael T says:

    One other thing that I do is compare the unknowns to the other search engines. The difference is probably the unknowns.

    • 87

      Michael: That is another interesting strategy. The thing to be careful about there is that people from different search engines typically behave very differently.

      For example, except for a few brand terms, the traffic patterns from different search engines have, for me, nothing in common – even for simple things. I get a lot of traffic from Bing for the term web analysis, but only got two visits from Google for the same term!

      -Avinash.

  38. 88
    Vinod says:

    Excellent article Avinash!

  39. 89
    Scott Shemtov says:

    Avinash,

    Interesting analysis. I am using a different approach, I'd like to hear what you think. It's either flawed, or super simple.

    What about comparing 3 Advanced Segments for time periods before and after the (not provided) bomb dropped on 10/18/2011.

    Segment 1: Organic Keywords without branded terms
    Segment 2: Organic Keywords but only brand terms
    Segment 3: All Visits

    Now view these segments under the Traffic Sources > Search Engines report, then click Non-Paid to filter out any PPC.

    Measure the ecomm revenues you see for 3 time periods, for each segment:
    Date Range 1: 10/18/2012 – 12/12/2012
    Date Range 2: 8/23/2011 – 10/17/2011
    Date Range 3: 6/28/2011 – 8/22/2011.

    For one of my client websites, in Date Ranges 2 and 3, I'm seeing that ~73% of revenue came from Segment 2 and ~26% of revenue from Segment 1. In Date Range 1 (after not provided rolled out), I'm seeing 66% and 32%, respectively.

    So, going forward, I will be assuming that about ~73% of my (not provided) revenue is related to branded terms. 6% of total organic revenue shifted from Segment 2 to Segment 1, since (not provided) shows up in Segment 1.

    Is my assumption reasonable?

    Many thanks

    • 90
      Scott Shemtov says:

      Date Range 1 should be 10/18/2011 – 12/12/2011, wrong year.

    • 91

      Scott: The approach seems logical, segmentation is such a powerful tool.

      Two quick thoughts…

      I don't know who your client is, but in these types of analyses I am careful about confusing correlation with causation. It would be optimal if you kept that concern in the back of your head as well.

      The second thing to consider is that businesses evolve very quickly. This means any insights we glean from such analysis may have limited staying power: as time passes and business strategy and marketing change, the insights will be of limited value.

      Both points above of course also apply to all the methods I've used in this post. :)

      -Avinash.

      • 92
        Scott Shemtov says:

        Thank you for the quick reply. Great caveats you suggested. If the business begins doing much more "branding" campaigns, that would certainly increase the % of branded search visits, such as a Super Bowl commercial or other non-web advertising.

  40. 93
    saumil says:

    Hey Avinash,

    Thanks for your insightful analysis. This new encrypted search has made things difficult for webmasters, especially new ones whose websites launched after October 18th, as they will never know what the keywords behind "not provided" were. Another thing I have heard is that if you switch to a Google Analytics Premium account they will show you those keywords. Is this true? Kindly share your thoughts on Google Premium.

    Thanks again for sharing these tips with us.

    • 94

      Saumil: This is 100% untrue.

      Regardless of the version of Google Analytics you use, and regardless of which web analytics tool in the market you use, this change impacts every tool exactly the same way. The change is not in web analytics, the change is in the Google search engine.

      Avinash.

  41. 95
    Jeff Bronson says:

    Thanks for this tip Avinash.

    I'll need to apply to a couple test Analytics accounts and ensure everything is reporting smoothly. It is a shame Google has made this decision though.

  42. 96
    Matt says:

    Google is changing its approach daily; small merchants like us are in trouble because of such strange Google policies.

  43. 97
    S.J. says:

    Hi Avinash,

    I must say I do have an appreciation for your methodology in ferreting out patterns in the data.

    However, I also think that you are spending twice the time to achieve less than half the answer. Unfortunately, in the end all we are left with is a series of clever assumptions based on trends that may or may not be indicative of the answers we are seeking.

    • 98

      SJ: I'll take it!

      In this situation unfortunately we have nothing. Literally "not provided." To be able to ferret out anything, even if a little bit, to me is a minor victory for us. If we can, per your comment, get half the answer I'll probably go into the street and start dancing. :)

      On a serious note… there is a point where we'll reach diminishing marginal returns. That point is probably earlier rather than later for small businesses, maybe even medium sized businesses. But for large businesses I continue to believe there is value in such analysis.

      Of course, like you, I wish we did not have this problem.

      -Avinash.

  44. 99
    Mohammed says:

    Hi Avinash, I started using the 'custom reports' for analytics… really liking it

  45. 100
    Laura Book says:

    I agree, keyword information is mission critical for website analysis and it is just a matter of time before we become blinded to all keywords coming in.

    My not provided traffic is now over 10% of my total visits and it just keeps climbing. UGH!

  46. 101

    That's great, and I must say we don't worry about (not provided) keywords, because after this announcement my Goals increased around 120% as people searched while logged into their accounts – though that could also be down to my social media efforts. BTW, nice article and great insights.

  47. 102
    Keith Hauser says:

    I've been watching the percentage of (not provided) creeping upwards every month since Oct. 2011 at my site. It has now reached 25%.

    Organic search is getting harder and harder to analyze. I wonder where we will be in a year from now?

  48. 103
    Alan says:

    Hi Avinash, great post and well worth the read.

    I have one question about not provided. One of my clients' monthly reports shows more than one not provided: for a full month it shows 1 keyword phrase, then the rest are not set, and then not provided(1), not provided(2), not provided(3), all the way up to not provided(7).

    Please help me out here,

    Thanks

    Alan

    • 104

      Alan: These are two different things.

      (not set) will show up in a tool when no value for that dimension is captured. For example, you sent out an email campaign and forgot to fill in the UTM parameters – instead of having all three in the link you only have two. This will show as (not set). This problem is solvable: you just make sure the code is implemented properly, AdWords and Analytics are linked properly, and all that good stuff.
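
      As a hypothetical illustration of the tagging mistake described above (the domain and parameter values are made up), a campaign link needs all three core UTM parameters; drop one and that dimension falls back to (not set):

```
# all three core parameters present – the Campaign dimension reports "oct_newsletter":
https://www.example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=oct_newsletter

# utm_campaign missing – the Campaign dimension shows (not set):
https://www.example.com/?utm_source=newsletter&utm_medium=email
```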

      (not provided) will show up for organic searches that are done on google.com using secure search.

      I'm not sure why you will see not provided(1), not provided(2) etc. That is not expected behavior. Please work with a GACP who can help you figure out what might be up. Here's a list: http://www.bit.ly/gaac

      -Avinash.

      • 105
        Alan Perkins says:

        Hi Alan (different Alan)

        I take it you have inherited this account, or at least somebody else has admin access to it. Try checking the filters. There's a good chance a predecessor has set up filters to augment (not provided) with the 1-7 numbers you are seeing.

        Alan Perkins

  49. 106

    What about doing the following calculation:

    You have the overall Google organic search visits number; let it be OGSV.

    Then you have the visits number per keyword; let it be VPK.

    Usually you will now find the top keyword (or one of the top) to be “not provided”; let its visits be NPKV.

    If you subtract the not provided visits from all the Google organic visits (OGSV – NPKV) you get the number of provided visits; let it be PKV.

    Now assume the ratio of visits for a given keyword (one that has more than a few visits) is the same within the not provided data as in the provided data – why would it not be? You can then calculate the ratio of visits per keyword by dividing its provided visits number by the overall provided visits number (PKV). The result is a fraction representing the % of visits a keyword had.

    Now, to get a close estimate of the real number of visits a keyword had, you multiply the above fraction by the total number of not provided visits (NPKV); this gives the number of visits the keyword had among the “not provided” keywords (it usually won’t be a whole number, as you’re using ratios here). This number is then added to the recorded provided visits for the keyword to give a very close estimate of the real number of visits a keyword had.

    This calculation is to be done per dataset for a given chosen period of time, as the numbers are relative to each other only, not to the site itself.

    So, for example, a site has 3,000 overall Google organic visits in a given month; of those, 1,000 are “not provided”, meaning that 2,000 are provided. The keyword “test” shows 50 visits in the analytics data from Google organic searches. This means that 50/2000 = 0.025 of the provided visits were from this keyword. To calculate how many of the not provided visits were from this keyword, we multiply 0.025 by the number of not provided visits (1,000), which gives 0.025*1000 = 25. We add this to the number of provided visits for the keyword (50) and get the estimated number of visits after controlling for the not provided figure in that dataset: in this case 50 + 25 = 75.

    Does this method make sense to anybody? It does to me.
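
    The proportional-allocation idea above can be sketched in a few lines. This is only an illustration of the comment's reasoning, under its (big) assumption that (not provided) visits follow the same keyword distribution as the provided ones; the function name and example numbers are illustrative, not from any real report:

```python
def estimate_true_visits(keyword_visits, not_provided_visits):
    """Redistribute (not provided) visits across keywords in proportion
    to each keyword's share of the provided visits."""
    provided_total = sum(keyword_visits.values())  # PKV in the comment's notation
    return {
        kw: visits + (visits / provided_total) * not_provided_visits
        for kw, visits in keyword_visits.items()
    }

# Worked example from the comment: 2,000 provided visits, 1,000 (not provided);
# "test" has 50 recorded visits -> share 50/2000 = 0.025 -> 50 + 0.025*1000 = 75.
estimates = estimate_true_visits({"test": 50, "everything else": 1950}, 1000)
```

    Note that the estimates sum back to the overall organic visits (3,000 here), which is a quick sanity check on the redistribution.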

    • 107

      @Amir

      An evenly weighted distribution is "OK," but it doesn't really model keyword volume from (not provided) so well. For example, on my own website (not provided) accounts for 65% of my Google Organic traffic. However, when I look at the not provided distribution by landing page, my top 10 landing pages all have not provided percentages higher than 75%. Why the difference? Those landing pages are all blog posts of mine and it seems that more of my blog readers are logged into Google than the other visitors to my site.

      I recommend looking at your own distribution of not provided traffic by landing page and see if the weighted model works for your particular site or not.

      Yehoshua
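
      One way to act on that recommendation is to apply the same proportionality assumption per landing page rather than site-wide, so pages with unusual (not provided) shares are handled locally. A minimal sketch, with all page names and numbers hypothetical:

```python
def per_page_estimates(pages):
    """pages maps landing page -> (dict of provided keyword visits, (not provided) visits).
    Redistributes each page's (not provided) visits using that page's own keyword mix."""
    out = {}
    for page, (keywords, np_visits) in pages.items():
        provided = sum(keywords.values())
        out[page] = {kw: v + (v / provided) * np_visits
                     for kw, v in keywords.items()}
    return out

# Hypothetical blog post where 50% of its traffic is (not provided):
est = per_page_estimates({"/blog/post": ({"web analytics": 30, "avinash": 10}, 40)})
```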

      • 108

        Reply to myself (after re-reading comment section):

        @Yehoshua. HA! It appears that you straight up advocated for an even distribution approach a year ago in this very comment section, but now aren't quite so gung-ho about it. I guess you may have learned something over time as (not provided) has gotten a bit out of control.

  50. 109
    Jirawat Worasubhakorn says:

    This is the greatest data analysis method I have ever read.

    Thank you very much!

  51. 110
    BSAPK says:

    My site has 80% searches marked as (not provided).

    Thanks for this article, it helps a lot

  52. 111
    David says:

    Thank you so much for this.

    The ability to see the landing pages of the "not provided" is extremely valuable.

    Is there anyway to see the "not set"?

    • 112

      David: If you see (not set), that is typically a configuration problem somewhere (like the link between your AdWords account and GA), or some data getting dropped due to incorrectly implemented analytics code.

      (not set) is not because anyone is withholding anything. If you don't know why it is, please consider working with a consultant to help you sort it out. Here's a list: http://www.bit.ly/gaac

      -Avinash.

  53. 113

    Is it possible to get secure search data out of the (not provided) keyword? If it is possible, could you please guide me or redirect me to the relevant source?

    Thank you!

  54. 115
    Nick says:

    Avinash, truly impressed by the quality of this post. So much information I may have to read it twice!

    I'll be forwarding this article to my friends who work with analytics !

Trackbacks

  1. [...]
    Avinash Kaushik discusses five steps to smarter data analysis in light of Google’s search encryption change at his site, Occam’s Razor.
    [...]

  2. [...] Applying this filter will not, of course, bring back the good old detailed reports, but it will give you more information than we have access to now. I also recommend reading Avinash Kaushik's article on analyzing "(not provided)" queries here. [...]

  3. [...]
    Google's decision to hide keyword data from searches done by logged-in (HTTPS) searches certainly has created a lot of controversy. Avinash Kaushik offers up some ways to analyze this data on his Occam's Razor blog: Smarter Data Analysis of Google's https (not provided) change: 5 Steps
    [...]

  4. [...]
    Derek Edmond of KoMarketing Associates wrote a guide on the SSL changes for business-to-business marketers, reminding webmasters that there are still ways to get the keyword data – even if you can't get the same granular results. Avinash Kaushik also has five tips for (not provided) data analysis.
    [...]

  5. [...]
    Many site owners have begun running additional or custom reports to try and combat the problem. To learn how you too can combine or customize your Analytics reports to gain as much intel on your web visitors as possible, despite Google’s attempt to keep us in the dark, check out these articles. How to Steal Some Not Provided Data Back from Google and Smarter Data Analysis of Google’s https (Not Provided) Change: 5 Steps.
    [...]

  6. [...]
    A full discussion on it, and some ways to analyze its impact can be found on Avinash Kaushik’s blog. So read that first, and report back here when you’re done. The post below is the results of a thought process I came up with after reading his post, so it’s important to get his background. He’s also a lot smarter than me, so that’s a second good reason to read his first.
    [...]

  7. [...] Avinash Kaushik/Occam's Razor: Smarter Data Analysis of Google's https (not provided) change: 5 Steps [...]

  8. [...]
    In other Googlenews, the digital marketing world is still crying in its Starbucks cup over the big G’s move to shroud referring keywords for logged in Google account users. But gracious gurus share their tips for working around this likely permanent change through advanced segmentation and filters. Thank you!
    [...]

  9. [...]
    Smarter Data analysis of google's https (not provided) changes
    [...]

  10. [...]
    Avinash Kaushik, the analytics wizard and Digital Marketing Evangelist for Google, took a few hits when Google announced the “not provided” policy in Google Analytics. He responded with generous grace by writing a thorough guide to Smarter Data Analysis of Google’s https Change. Using clear, step-by-step instructions he shows how to:
    [...]

  11. [...]
    An example here is when Google announced the Secure Socket Layer protection on their search results. By adding that little s to the http, Google effectively took away one of SEO's important campaign tracking metrics. More and more (not provided) instances are showing up in analytics programs and, unless you know a workaround, are essentially useless. This change on its own had a ton of people up in arms. To be sure, more changes are coming.

    See also: Smarter Data Analysis of Google's https (not provided) change: 5 Steps
    [...]

  12. [...]
    In other words, this is filtered site data that is not available for you to view. If your site is in the 50% range, what you can view has become so restricted as to almost become meaningless. (update: check out this post from Avinash for an interesting work around) Is it fair that Google allows paying Adwords customers to see that data? No, but that is a choice Google has made.
    [...]

  13. [...]
    Let's take a specific example: the (not provided) keyword issue. If I look for (not provided) keywords in any interaction, I get 19 different results. By the way, I set up some custom groupings to illustrate the point more easily and not reveal the actual keywords.
    [...]

  14. [...]
    There have been some interesting discussions of late trying to answer that question. Avinash threw out a few ideas for capturing (not provided) data in an excellent piece. One of the key takeaways from his approach, is to make assumptions based on known branded and non-branded query sets, and apply this distribution to the (not provided) segment.
    [...]

  15. [...]
    The web analytics guru Avinash Kaushik, as always, has his own view of how to solve the problem. In his article “Smarter Data Analysis of Google’s https (not provided) change: 5 Steps” he proposes a methodology for analyzing traffic labeled “(not provided)”. The methods include: analyzing the behavior of this traffic and classifying it as “more branded” or “more non-branded”; analyzing its landing pages and comparing them against historical data, etc. Once you have read this interesting post you will understand that, even if the method does not resolve all our doubts, it is better than having no idea how to analyze the keywords behind roughly 20% of your traffic.
    [...]

  16. [...]
    No, of course not. We can still learn a lot from this censored segment, as Avinash Kaushik demonstrates in his most recent article. For those who don't know him, he is one of the most important web analytics gurus and an authority in the niche. In his article, using Google Analytics custom reports, he manages to show that the “not provided” traffic on his website comes mainly from searches that are not related to his personal brand, but originate entirely from organic rankings.
    [...]

  17. [...]
    Smarter Data Analysis of Google's https (not provided) change: 5 Steps. Occam's Razor
    [...]

  18. [...]
    Looking at sites abroad: for example, Avinash Kaushik noted on his blog that this keyword accounts for as much as 18% of all traffic coming from Google's organic results (clicking the link will take you to his article on the topic). Keep in mind, though, that in the West (especially the USA) users make much heavier use of Google products, and so are logged into them while surfing the web.
    [...]

  19. [...] Furthermore, for a (not provided) entry with a large number of visits (or goal conversions) we can go a step further and infer what the keyword actually is. Look up its landing page (content->page), cross-reference that page's known-keyword (with rankings) report, and then match on Google ranking, time on page, bounce rate, and goal conversion rate – you can almost always deduce what the not provided keyword was.

    In addition, not provided traffic can be segmented further by time on page, new vs. returning visitors, and so on; Avinash's blog covers this in detail. That is it for this post – if you have anything to add, find anything unclear, or think the methods above are unreliable, pound away at the keyboard in the box below…
    [...]

  20. [...]
    Avinash Kaushik is one of the foremost experts in web analytics, and it seems he has found a way to work around this problem. In his article “Smarter Data Analysis of Google’s https (not provided) change: 5 Steps” he discusses the steps to follow to analyze the traffic data “vetoed” by Google (it is estimated that between 10% and 25% of keywords are hidden).
    [...]

  21. [...]
    If you want a more general, and accurate, look at the effects of (not provided) on your data, Avinash wrote a great how-to article. It’s not keyword data, but it’s great insight.
    [...]

  22. [...]
    However, we send major props to Avinash Kaushik for his post about five steps to Smarter Data Analysis of Google’s https (not provided) change. I have not seen anything close to adapting to Google’s decision and improvising a solution.

    What Avinash proposes is not easy to do. Quite frankly, I don’t know any clients that would be willing to pay for this analysis today to improve SEO. However, while “not provided” is only at 10–18 percent today, Google wants that to be as close to 100 percent as possible.
    [...]

  23. [...]
    In each case above, with three very different target audiences, [not provided] made up a substantial percentage of the overall search traffic to these sites — and the numbers would be higher if I only compared it to overall Google traffic.
    There are ways to use analytics data to help get a general idea of who these [not provided] visitors are. Google’s own Avinash Kaushik has some ideas and examples in this excellent article.
    [...]

  24. [...]
    The excellent and inspiring Avinash Kaushik, guru and prophet of web analytics, shares a few perspectives here [in English].
    Adapting the spirit of his article to our environment and to the sites whose Analytics tracking and optimization we handle, I offer below a possible approach to follow and a first idea of the real impact for content sites monetized with Google AdSense.
    [...]

  25. [...]
    Avinash Kaushik does some very smart data analysis by first quantifying "not provided" data, then segmenting and attempting to understand user behavior to gain insight. Using landing page keyword referral analysis, he was able to take a before/after snapshot at the URL level to determine what keywords are sending an abnormally small amount of traffic. I can see this being especially useful when looking at year-over-year comparisons.
    [...]

  26. [...]
    By default, they will now be directed to https://www.google.com. This means that your analytics will not capture any referral data from logged in searches. Obviously, marketers are upset as we will get even less data to work with going forward. While Google estimated this would affect approximately 10% of all searches, the reality for most websites is that approximately 20% of our organic traffic is now affected. Fortunately there is a workaround.
    [...]

  27. [...]
    I recommend reading Avinash Kaushik’s blog post on the topic. He helps you get beyond your missing data and find other ways to come up with actionable insights in the presence of (not provided).
    [...]

  28. [...]
    What are your percentages and what metrics or segments are you using to get your (not provided) information?
    A few colleagues in our field have come up with some interesting points: that it’s not totally the end of the world. So let’s move forward and discover what we can do with this information to make the best of it. Check out Avinash’s post
    [...]

  29. [...]
    Whilst all others were losing their heads at the news the peerless analytics Guru Avinash Kaushik provided his followers with a step by step guide to “analyse the impossible”, proving that a loss of data quality doesn’t always equal a loss of analytical insight.
    [...]

  30. [...]
    In response to all of the concern about the "(not provided)" data set, Google's resident Analytics expert Avinash Kaushik provided a few custom reports that you can import into your analytics account to analyze it—sort of—and concludes that it's reasonable to assume that "(not provided)" is a cross-section of your website's overall traffic. Kaushik believes that this just stands to reason, given how secure search has been implemented across all Google Accounts.
    [...]

  31. [...]
    Smarter Data Analysis of Google's https (not provided) change: 5 Steps
    [...]

  32. [...]
    B2B marketers need to set appropriate benchmarks now and ongoing, specifically calculating trends in branded, non-branded, and not provided keyword information. Here are some resources and recommendations for establishing these benchmarks. Smarter Data Analysis of Google’s https (not provided) change: 5 Steps
    [...]

  33. [...]
    It makes things much more manageable if we lose one or two keywords here and there instead of losing a whole keyword or two that is driving the most traffic to your site. This isn’t ideal, but you can still make grounded, logical marketing decisions that will pay off for your future. A great article to further reference can be found here by Avinash Kaushik
    [...]

  34. [...]
    De todos modos voy a dejaros algunos enlaces que me han gustado mucho y que os pueden ayudar a evaluar qué está pasando con ya las famosas visitas (not provided). Están en inglés, pero valen la pena.
    http://www.kaushik.net/avinash/google-secure-search-keyword-data-analysis/
    [...]

  35. [...]
    Once you have drawn up your “damage report,” it is worth looking at the quality of the (not provided) traffic. Avinash Kaushik, the web analytics guru, suggests starting by checking a few key statistics and comparing them against the results for the segments that matter to us. If you want to put together a similar report quickly, use our template – it takes literally a few seconds. This way we can draw (very approximate) conclusions about the nature of the visits marked as (not provided), e.g.:
    [...]

  36. [...]
    Remember one of Avinash Kaushik’s suggestions? The well-known analyst recommended checking which landing pages the (not provided) traffic arrives on, which makes it fairly easy to estimate the topics of the “lost” keywords.
    [...]

  37. [...]
    In other words, this is filtered site data that is not available for you to view. If your site is in the 50% range, what you can view has become so restricted as to almost become meaningless. (update: check out this post from Avinash for an interesting work around) Is it fair that Google allows paying Adwords customers to see that data? No, but that is a choice Google has made. Another highlight to consider is that “Google can’t tell you where the algo will be in 6 months.” There are so many thousands of tests/evaluations being conducted that drive the almost 600 changes made this year alone.
    [...]

  38. [...]
    A couple of folks have come up with some really clever solutions to making the data more meaningful. Analytics guru Avinash Kaushik has suggested a systematic approach to gaining some understanding into the profile of the users who are being caught in the (not provided) net, while smarty-pants Dan Barker has a method for understanding landing pages and thereby making some assumptions on the driving keywords based upon the targeting of those pages. This does also require some deeper manual analysis but you remain in a far richer data position than without it and is highly recommended (remember you can’t apply these sort of Analytics data filters retrospectively).
    [...]

  39. [...] Smarter Data Analysis of (not provided) post by Avinash [...]

  40. [...]
    Smarter Data Analysis of Google's https (not provided) change: 5 Steps by Avinash Kaushik
    [...]

  41. [...]
    But wait! I found a trick that looks quite decent, in a post called How to steal some ‘not provided’ data back from Google. Taking its cue from the (not provided) analysis methods Avinash Kaushik proposed, that post offers a trick that swaps (not provided) out for the URL of the landing page (the first page a visitor from outside arrives at on my site).
    [...]

  42. [...]
    Cyrus Shepard of Above the Fold also blogged about the (not provided) issue. His 7 tips are generally more digestible, however in #3 he links to some insanely complex analysis and custom reports from Google’s own Avinash Kaushik, if you’ve got the stomach for it!
    [...]

  43. [...]
    Set up traffic analysis and social media monitoring tools: Google has decided to deprive us of the keywords hidden behind secure-connection searches, and this affects almost 30% of organic traffic (even if Avinash Kaushik has given us a few tips). So I consider the search query reports available via Google Webmaster Tools, or in Google Analytics (when linked with GWT), very relevant for judging the progress of SEO. On the social side, whereas in 2010 I was still asking myself questions about e-reputation measurement tools (engagement, sentiment analysis, Net Promoter Score NPS), I now simplify my life by using the advanced segments for social networks in Google Analytics, and by tracking the shares made through the ShareThis platform after installing its button; occasionally, for one-off needs, I may also use Bit.ly or Goo.gl, both of which offer sufficient visibility into the life of a shared URL, provided it is kept intact. Note that Google has just announced the launch of its social media value section in Google Analytics.
    [...]

  44. [...]
    Avinash Kaushik, the analytics wizard and Digital Marketing Evangelist for Google, took a few hits when Google announced the “not provided” policy in Google Analytics. He responded with generous grace by writing a thorough guide to Smarter Data Analysis of Google’s https Change. Using clear, step-by-step instructions he shows how to:
    [...]

  45. [...]
    Indeed, Avinash Kaushik has perhaps made the strongest argument in favour of utilising historical data to guide our on-going interpretations of “not provided” traffic. If in the past the proportion of brand keywords driving search traffic to a given website was consistently around 30%, why should that suddenly cease to be the case? Finger in the air measurement yes, but a sensible assumption nonetheless.
    [...]

  46. [...]
    Also read: Smarter Data Analysis of Google's https (not provided), by Avinash Kaushik
    [...]

  47. [...]
    Remember one of Avinash Kaushik's suggestions? The well-known analyst recommended checking which landing pages the (not provided) traffic arrives on, which makes it fairly easy to estimate the topics of the "lost" keywords.
    [...]

  48. [...]
    Once you have drawn up your "damage report", it is worth looking at the quality of the (not provided) traffic. Avinash Kaushik, the web analytics guru, suggests starting by checking a few key statistics and comparing them with the results for the segments that matter to us. If you want to put together a similar report quickly, use our template – it takes literally a few seconds.
    [...]

  49. [...]
    Smarter Data Analysis of Google’s Https (not provided) Change – 5 Steps – Avinash Kaushik Note: If you think of data and Analytics as a hobby, and you’re not reading Avinash Kaushik’s blog, you’re missing out. I read his posts 3 or 4 times just to try and glean as many nuggets as I can.
    [...]

  50. [...]
    These keywords are blocked not only in Google Analytics, but for all analytics software – so you can’t get away from it. If you were super keen, you might consider reading Avinash Kaushik’s post about potential ways to get this information out – although note he says it is ‘trying to do the impossible’.
    [...]

  51. [...]
    Step 1 – Monitor Impact. Before you begin trying to analyse what the (not provided) data actually means, set yourself up to monitor how your website is being affected and what on-going impact the change has on your reporting. Avinash Kaushik has a great way of working this out.
    [...]

  52. [...] Smarter Data Analysis of Google's https (not provided) change: 5 Steps – [...]

  53. [...]
    Do a more complex analysis and, from the behaviour of these users, infer who they might be: are they new visitors, do they bounce, how many pages do they view, do they convert or not… And compare that with the branded and non-branded keywords, as well as with the landing pages. The analysis Avinash proposes is very thorough.
    [...]

  54. [...]
    Note that this filter can’t be applied to historical data; you’ll only see it in action for new visits. There’s a similar technique on Avinash Kaushik’s blog that involves creating an advanced segment to examine which landing pages your ‘not provided’ results are leading to.
    [...]

  55. [...]
    This issue is on fire. Industry thought leaders have spoken out against this act. Google has been chided for providing privacy as the reason for withholding the terms, but then continuing to provide the terms to AdWords customers. Brilliant/clever analysts are spending valuable time trying to find a way to deal with this loss of data. For now, I am at peace. I understand and appreciate the fact that Google is keeping this data from us to protect our collective privacy. Big data can get real small when it is segmented with granularity. In a world of pocket dialing and accidental pasting, pocket searching /accidental searching seems worth being protected from… nevermind intentional searching.
    [...]

  56. [...]
    You can isolate these visits and mine your analytics to determine on which landing page the “not provided” visitors entered the website—which gives some insight into which keyword groups may have been used in the search. To use this method, first you need to create a custom advanced segment for “not provided” traffic. Then, in standard reports, pull up the landing page report and apply this advanced segment. Now you can see that X number entered through the homepage, Y number entered through service page, and so on.
    [...]
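    The landing-page method described in this excerpt can also be sketched offline. A minimal sketch, assuming you have exported (keyword, landing page) rows from your analytics tool — the rows below are invented for illustration:

    ```python
    from collections import Counter

    # Hypothetical export: (keyword, landing_page) rows from the organic-search
    # keyword report. Real exports come from your analytics tool; these rows
    # are made up for illustration.
    rows = [
        ("(not provided)", "/"),
        ("(not provided)", "/services"),
        ("avinash kaushik", "/"),
        ("(not provided)", "/services"),
        ("web analytics books", "/books"),
    ]

    # Count how many (not provided) visits entered on each landing page,
    # mirroring the "apply the segment to the landing page report" step.
    not_provided_by_page = Counter(
        page for kw, page in rows if kw == "(not provided)"
    )

    for page, count in not_provided_by_page.most_common():
        print(page, count)
    ```

    The per-page counts are the same numbers the landing page report shows once the advanced segment is applied; the targeting of each page then hints at the hidden keyword groups.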

  57. [...]
    This post by web analytics guru Avinash Kaushik
    http://www.kaushik.net/avinash/google-secure-search-keyword-data-analysis/
    In line with what we saw earlier, the same author sets out
    [...]

  58. [...]
    Smarter Data Analysis of Google’s https (not provided) change: 5 Steps
    Google’s secure search does not provide the search query used to visit a site. Learn five steps to gain deeper insights into this traffic.
    [...]

  59. [...]
    Avinash Kaushik is the author of the analytics books “Web Analytics: An Hour a Day” & “Web Analytics 2.0”. He also writes the leading analytics blog Occam’s Razor. http://www.kaushik.net/avinash/google-secure-search-keyword-data-analysis/
    [...]

  60. [...]
    Monitor Impact – Before you begin trying to analyse what the (not provided) data actually means, set yourself up to monitor how your website is being affected and what on-going impact the change has on your reporting. Avinash Kaushik has a great way of working this out.
    [...]

  61. [...]
    Many blog posts have been written about how to better analyze the (not provided) data. My personal favorite approach involves breaking down the (not provided) referrals by landing page. Say I've got a blog post discussing enterprise app stores. When I go into Google Analytics, I see data like this:
    [...]

  62. [...]
    Avinash Kaushik also proposed an alternative approach on his own blog soon after SSL search went live in the USA. He put forward the idea that it is possible to use behavioural metrics to identify the provenance of the unknown search traffic. As per usual, his post is worth reading and fairly in-depth. It does not offer much if you wish to conduct any kind of detailed analysis of organic traffic.
    [...]

  63. [...]
    For a month or two it will be easy, based on historical data, to identify which keywords might be behind (not provided); going forward, however, more analysis is needed to extract the likely keywords. Help comes from this integration: even if the data is not 100% complete, it is a good starting point for anyone wanting to do advanced analysis. *** By way of conclusion, I recommend a post from Avinash on the same topic and one from Search News Central.
    [...]

  64. [...]
    Note that this filter does not affect historical data; you will only see the change for new visits. A similar technique is described on Avinash Kaushik's blog. It involves creating advanced segments to determine which landing pages the (not provided) results lead to.
    [...]

  65. [...]
    The good news is there are several ways to work around this, like linking your AdWords and Webmaster Tools accounts to Google Analytics. A good place to start is the KISSmetrics blog, where you will find Google's take on this subject as well as tips from Dan Barker's Econsultancy blog and a technique from Avinash Kaushik.
    [...]

  66. [...]
    David Harry built on insights from Google’s resident analytics expert Avinash Kaushik to create this analysis. We refer you to the full post for the brain-spray awesomeness that is his analytics wizardry. The basic idea is to use broader query classification and advanced segments in Google Analytics to bucket users with similar intent, and then use the relative proportion of those visitors to determine the keywords that make up your (not provided) set.
    [...]

  67. […] Smarter Data Analysis of Google's https (not provided) change: 5 Steps (Occam's Razor) […]

  68. […]
    Some of the better posts on this subject include: Data analysis of Google secure search by Avinash Kaushik
    […]

  69. […]
    I came across a Google analytics hack and I cannot wait to see the results in the next 24 hours. Yes, I am writing about a hack that has yet to provide me with results but I am just too excited to keep this to myself. Avinash Kaushik has a fantastic step by step blogpost (even with pictures) to help you understand the hack.
    […]

  70. […]
    Smarter Data Analysis of Google’s https (not provided) change: 5 Steps – Avinash Kaushik
    […]

  71. […]
    In an effort to make search more secure, on Oct. 18th Google announced that users logged into their Google accounts using www.google.com would be redirected to https://www.google.com. The search queries by these users would hence be encrypted and not available to website owners via web analytics tools such as Omniture, WebTrends, Open Stats, Google Analytics etc. http://www.kaushik.net/avinash/google-secure-search-keyword-data-analysis/
    […]

  72. […]
    5 Steps to Smarter Data Analysis of Google's https (not provided) change
    […]

  73. […]
    There’s been lots of chatter on the Twitter-sphere that this marks another nail in the coffin for SEOs. This is becoming a de facto emotional response when Google ups the ante a bit, and needs to be taken with a pinch of salt. The death of SEO? Hardly, it just means that as SEOs we need to dig a little deeper and explore alternative avenues. There were many creative workarounds proposed initially, such as Avinash Kaushik’s Smarter Data Analysis of Google’s https (not provided) change: 5 steps
    […]

  74. […]
    Smarter Data Analysis of Google’s (not provided) (Occam’s Razor)
    […]

  75. […]
    Smarter Data Analysis of Google's https (not provided) change: 5 Steps – Avinash Kaushik, Occam's Razor
    […]

  76. […]
    Smarter Data Analysis of Google's https (not provided) change – via Occam's Razor – Avinash Kaushik's 2011 post dealing with the (not provided) change and establishing a better understanding of the data at hand and in transition.
    […]

  77. […]
    Smarter Data Analysis of Google's https (not provided) change: 5 Steps (Occam's Razor)
    […]

  78. […]
    Avinash Kaushik tips on dealing with Not Provided and gives five point clues on overcoming this situation and here’s one among them- “One final idea I had was to wonder if the (not provided) traffic enters the website at a disproportionate rate on some landing pages when compared to all other traffic from Google. If that is the case we could do pre post analysis on referring keywords to those landing pages and get additional clues”.
    […]
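    The "disproportionate landing page" idea quoted here reduces to comparing each page's share of (not provided) entrances with its share of all other Google entrances. A minimal sketch under that assumption, with invented (keyword, landing page) rows:

    ```python
    from collections import Counter

    # Hypothetical rows of (keyword, landing_page) for Google organic visits;
    # invented for illustration.
    rows = [
        ("(not provided)", "/pricing"),
        ("(not provided)", "/pricing"),
        ("(not provided)", "/"),
        ("brand query", "/"),
        ("brand query", "/"),
        ("generic query", "/pricing"),
    ]

    np_pages = Counter(p for k, p in rows if k == "(not provided)")
    other_pages = Counter(p for k, p in rows if k != "(not provided)")

    np_total = sum(np_pages.values())
    other_total = sum(other_pages.values())

    # For each landing page, compare its share of (not provided) entrances
    # with its share of all other Google entrances; a ratio well above 1
    # flags a page worth a pre/post keyword analysis.
    for page in sorted(set(np_pages) | set(other_pages)):
        np_share = np_pages[page] / np_total
        other_share = other_pages[page] / other_total if other_total else 0.0
        ratio = np_share / other_share if other_share else float("inf")
        print(page, round(ratio, 2))
    ```

    Pages whose ratio stands well above 1 are the ones where a pre/post comparison of referring keywords is most likely to yield additional clues.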

  79. […]
    Smarter Analysis of Google's https (not provided) change, November 21, 2011: Analytics Guru and Digital Marketing Evangelist (Google), Avinash Kaushik discussed 5 ways to dig into GA (not provided) data. Hardcore data stuff.
    […]

  80. […]
    In 2011, Avinash Kaushik, one of the world's top analytics experts and an evangelist for Google, wrote a blog post "Smarter Data Analysis of Google's https (not provided) change: 5 steps" detailing the things marketers can do within their Google Analytics account when Google first turned on the search encryption for logged in users. It is for high level Google Analytics users, but he goes over it step by step, and is well worth the time setting it up.
    […]

  81. […]
    Get better at leveraging analytics and the data at hand – the need for employing data analysts and analytics experts as part of your team just got increasingly important. There are probably few analytics experts better than Avinash Kaushik, who, a few years back when Google originally started with the whole secure search thing, wrote an excellent post on smarter data analysis. While some of the customized reports that Mr. Kaushik describes may no longer work, he does illustrate the idea of looking at other metrics (think page level analysis). The bottom line is that although Google has taken some data away from us, that’s ok, as we still have other data sets that can provide insight for the decisions we need to make at a keyword level, content level, or otherwise.
    […]

  82. […]
    Smarter Data Analysis of Google's https (not provided) change: 5 Steps
    […]

  83. […]
    In Google Analytics it is possible to create segments that let you analyse the volume of (not provided). If this subject interests you, there is an article worth sinking your teeth into from Avinash Kaushik. These analyses can teach you to better understand how your visitors search for your products and services, and can help you better choose the "new" avenues mentioned above. To explore these web analytics methods, I suggest an article linking to 11 other expert publications, including some heavy hitters.
    […]

  84. […]
    Smarter Data Analysis of Google’s https (not provided) change: 5 Steps
    Back in 2011, Avinash Kaushik — general analytics and digital marketing demi-god — saw this day written in the tea leaves, and wrote this article to help marketers and businesses analyze their data in a much deeper, more thorough way than simply scanning a list of high and low keywords. While this is definitely not for novices, it puts forward some great Analytics strategies that almost everyone should see.
    […]

  85. […] Smarter Data Analysis of Google's https (not provided) change: 5 Steps […]


  87. […]
    The brilliant analyst Avinash Kaushik has developed several workarounds in Analytics.
    […]

  88. […] Avinash Kaushik says that, behaviourally, the overall picture of the keywords behind not-provided is close to the suggested keywords you get from AdWords. He therefore recommends that webmasters use a custom report in Google Analytics to work out the brand vs. non-brand split and build an organic search map. Once you have that report you will see the overall picture of your non-brand keywords, and you can then substitute AdWords suggested keywords for (not provided). Of course you won't recover everything, but it's better than knowing nothing at all… isn't it? […]
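    The brand vs. non-brand substitution described in the excerpt above boils down to measuring the brand split on the keywords you can still see and projecting it onto the (not provided) bucket. A minimal sketch, with a made-up brand regex and made-up visit counts:

    ```python
    import re

    # Hypothetical brand terms; substitute your own brand regex.
    BRAND = re.compile(r"\b(acme|acmeshop)\b", re.I)

    # Invented (keyword, visits) rows from the organic keyword report.
    rows = [
        ("acme shoes", 120),
        ("buy running shoes", 50),
        ("trail shoes review", 30),
        ("(not provided)", 300),
    ]

    known = [(k, v) for k, v in rows if k != "(not provided)"]
    brand_visits = sum(v for k, v in known if BRAND.search(k))
    known_visits = sum(v for k, v in known)
    brand_share = brand_visits / known_visits  # 120 / 200 = 0.6

    # Project the visible brand/non-brand split onto the hidden bucket.
    hidden = sum(v for k, v in rows if k == "(not provided)")
    est_brand = hidden * brand_share
    est_nonbrand = hidden - est_brand

    print(brand_share, est_brand, est_nonbrand)
    ```

    The projection assumes the hidden keywords behave like the visible ones – a finger-in-the-air estimate, as noted elsewhere in these comments, but better than no estimate at all.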

  89. […]
    Listen to the Smart Guys: These are only a few of the ways we’ve discovered to deal with “not provided” data; however, there is a wealth of information and truly intelligent SEO types out there that can help you move in an even deeper direction. For a much more technical look at this Analytics change, check out Avinash Kaushik’s article about dealing with the https “not provided” change. It’s a bit heavy, but it’s definitely one of the best resources out there.
    […]

  90. […] Smarter Data Analysis of Google's https (not provided) change: 5 Steps (Occam's Razor) […]

  91. […]
    Meanwhile, if you want to read more on the topic, guru Avinash's post is very interesting: http://www.kaushik.net/avinash/google-secure-search-keyword-data-analysis/
    […]

  92. […]
    If you are an advanced Google Analytics user, you can read the article of Avinash Kaushik, the top Analytics expert.
    […]

  93. […]
    Avinash Kaushik of Google wrote an article on his own site that provided thorough tips on how best to navigate the upcoming murky waters. The article got an overwhelming response and directly engaged the SEO community.
    […]

  94. […]
    Related reading: If you want to learn more about not provided, I recommend reading these articles, which inspired me when writing this post and which are the source of the images I used: - When Keyword (not provided) is 100 Percent of Organic Referrals, What Should Marketers Do?, Rand Fishkin. - Smarter Data Analysis of Google's https (not provided) change: 5 Steps, Avinash Kaushik.
    […]
