If you are using a modern web analytics tool (tag based or log based) it is quite likely that it is using cookies for tracking purposes.
In my conversations it is embarrassingly common to find a lot of FUD and confusion and lack of understanding (or appreciation!) of cookies and the role that they play in any analytics done on the web.
Hence my attempt at this simple, easy-to-understand primer. If you are an Analyst or a Marketer or a Website Owner or a Website User it is critical that you read this short blog post – your data will make so much more sense after you are done.
Why are cookies important?
Cookies, usually anonymously, allow the website owner to measure the number of Visits and the Unique Visitors to the website and hence understand the Customer's website experience and segment visitors that are New to the site from those that are Returning.
That's it.
No more and no less.
Lots of other tracking is possible without the use of cookies; they are not the be all and end all of visitor behavior tracking. Wipe that sweat off your forehead. Go get a cold glass of water to drink.
Let's attack the rest of this complex issue in a few bite sized understandable chunks.
Transient vs. Persistent.
There are two types of cookies that the web analytics software will set when you visit a website. They are commonly called "Transient" and "Persistent" cookies. Some folks refer to them as "session" and "user" cookies respectively.
The job of the transient cookies is to help "sessionize" your experience on a website. Put simply, you are going to make a series of clicks and leave. That's a session. The transient cookie helps group those clicks efficiently.
The transient cookie is "set" when you visit the site, and it disappears when you leave.
The persistent cookie is set the first time you visit the website, and it will remain there for the duration that the website determines. For example, Analytics cookies are typically set to last 18 months but many other tools will use anything from 18 months to 18 years. Persistent cookies are there to help identify a unique browser to your website, inasmuch as they are the closest thing to tracking a "person" / "unique visitor".
The persistent cookie is on your browser until you either delete it, reinstall your browser or do other such things.
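To make the difference concrete, here is a minimal sketch using Python's standard http.cookies module. The cookie names (session_id, visitor_id) and the 18 month lifetime are assumptions picked for illustration; they are not what any particular analytics tool actually sends.

```python
from http.cookies import SimpleCookie
import secrets

# Rough approximation of 18 months, in seconds (assumed lifetime).
EIGHTEEN_MONTHS_SECONDS = 18 * 30 * 24 * 60 * 60


def build_tracking_cookies() -> SimpleCookie:
    """Build one transient (session) cookie and one persistent cookie."""
    cookies = SimpleCookie()

    # Transient cookie: no Expires / Max-Age attribute, so the browser
    # discards it when the browsing session ends.
    cookies["session_id"] = secrets.token_hex(8)
    cookies["session_id"]["path"] = "/"

    # Persistent cookie: Max-Age asks the browser to keep it around.
    # The value is just a random string, not PII.
    cookies["visitor_id"] = secrets.token_hex(16)
    cookies["visitor_id"]["path"] = "/"
    cookies["visitor_id"]["max-age"] = EIGHTEEN_MONTHS_SECONDS
    return cookies


if __name__ == "__main__":
    # Prints two Set-Cookie header lines, one per cookie.
    print(build_tracking_cookies().output())
```

The only thing separating the two cookies is the lifetime attribute: leave it off and the cookie is transient, add it and the cookie persists until it expires or is deleted.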
[
It is important to note that almost always persistent cookies don't contain any PII – Personally Identifiable Information – data. They just have a random string of numbers or letters that only the company who set the cookie can read. For example here is a cookie that Webtrends.com just set on my browser as I visited www.webtrends.com: C8ctADY1LjU3LjI0NS4xMS00MTU3MTQwMTc2LjI5OTQ0NzE5AAAAAAAAAAACAAAAoM0AAINghUgWYIVI.
I should see if I can mess with them by changing that cookie to COREMETRICSWILLALWAYSBEWATCHINGYOURSITE4LOVE+OMNITUREISGREATANDINDEXTOOLSWINS!! :)
]
First Party vs. Third Party.
A "third party" cookie is set by, well, a third party when someone visits your site. So if www.omniture.com is using WebTrends as the web analytics tool of choice then when I visit omniture.com a cookie will be set on my machine under the www.Webtrends.com domain. On omniture.com a Webtrends.com cookie is considered a third party cookie.
[Omniture.com is actually setting cookies using .2o7.net which would make them third party cookies on that domain.]
In the good old days it was easier for the web analytics vendors to use third party cookies and they were rampant. But it was discovered that there were other players using these cookies in sub optimal ways. This led to default internet browser settings that would reject third party cookies, and to many anti spyware and malware programs auto deleting them, and so on. Suffice it to say they have fallen out of favor, and are considered quite sub optimal for tracking "unique visitors".
A "first party" cookie, hence, is set by the web analytics tool using the domain of the website itself. As an example when you visit www.coremetrics.com you'll notice (if you have WASP!) that they are setting cookies using the domain data.coremetrics.com – which makes those cookies first party.
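Here is a small sketch of that first party / third party test, comparing the site the visitor is on with the domain that set the cookie. One loud assumption: the registrable_domain helper below simply keeps the last two labels of the host name, which is wrong for suffixes like .co.uk. Real browsers consult the Public Suffix List instead. Both function names are made up for this example.

```python
def registrable_domain(host: str) -> str:
    """Naive approximation: keep the last two labels of the host name.

    ASSUMPTION: this breaks for multi-part suffixes such as .co.uk;
    it is only good enough to illustrate the idea.
    """
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def cookie_party(page_host: str, cookie_domain: str) -> str:
    """Classify a cookie relative to the site the visitor is looking at."""
    same_site = registrable_domain(page_host) == registrable_domain(cookie_domain)
    return "first party" if same_site else "third party"


if __name__ == "__main__":
    print(cookie_party("www.omniture.com", ".2o7.net"))
    print(cookie_party("www.coremetrics.com", "data.coremetrics.com"))
```

With the examples from this post, a .2o7.net cookie on www.omniture.com comes out as third party, and a data.coremetrics.com cookie on www.coremetrics.com comes out as first party.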
First party cookies are the preferred tool of choice for tracking "unique visitors" because they are deleted / rejected a lot less by any objective measure. This means, for example, they are far superior at tracking repeat visits or new and returning visitor segments etc.
Another reason first party cookies are rejected a lot less is that much of the internet does not work if you don't accept first party cookies. Email providers like hotmail (! :) or gmail.com, ecommerce websites like amazon.com or crutchfield.com, banks, even blogging platforms! They all require you to accept first party cookies.
Almost every single decent web analytics vendor now provides an easy ability for you to use first party cookies. Some like Google Analytics only offer the option of having first party cookies.
If you notice some initial push back from your vendor to use the easier-for-them third party option, do a little push back of your own. Insist on first party. It's good for your health.
Exception for Third Party Cookies.
There are some relevant uses of third party cookies. One of the most common is by ad serving platforms because that is the only way they can track a "unique visitor" across multiple websites. So even if that third party cookie gets blown away and rejected a lot more, they (you) really don't have much of a choice. That's just how the internet protocols work.
Here's an example of how that works.
We saw that omniture.com is using .2o7.net third party cookies. After going to omniture.com I could go to ebay.com and then to nytimes.com. .2o7.net knows that I was at the Omniture site a little while back and then I went to eBay and then NYTimes.
Now as I am reading the latest Maureen Dowd column .2o7.net (if it was an ad serving platform) could serve me an ad for Omniture next to the Maureen Dowd column. Knowing I also went to eBay they could even give me a deal on Omniture in that ad! : )
This is of course just one example to illustrate the use of a third party cookie and why Atlas and DoubleClick and Yahoo and all the others use them (and provide value to their customers).
First party cookies can't be "read" and "carried over" like the above scenario.
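The scenario above can be modelled in a few lines. This is a toy simulation, not how any real ad platform is built: the ThirdPartyTracker class and its handle_request method are hypothetical names, and the "browser" is simply whoever holds on to the cookie ID between calls.

```python
from collections import defaultdict
from typing import Optional
import secrets


class ThirdPartyTracker:
    """Toy model of one server, embedded on many sites, that sets a
    single cookie under its own domain."""

    def __init__(self) -> None:
        # cookie ID -> list of sites on which that browser was seen
        self.history = defaultdict(list)

    def handle_request(self, cookie_id: Optional[str], embedding_site: str) -> str:
        """Record a hit and return the cookie ID the browser should keep."""
        if cookie_id is None:
            # First time this browser is seen: issue a random ID.
            cookie_id = secrets.token_hex(8)
        self.history[cookie_id].append(embedding_site)
        return cookie_id


if __name__ == "__main__":
    tracker = ThirdPartyTracker()
    cookie = tracker.handle_request(None, "omniture.com")
    cookie = tracker.handle_request(cookie, "ebay.com")
    cookie = tracker.handle_request(cookie, "nytimes.com")
    print(tracker.history[cookie])
```

The same cookie comes back to the same server no matter which site embeds it, and that is what lets the history accumulate. A first party cookie is scoped to one site's domain, so there is nothing to carry over.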
Does my choice (1st or 3rd) influence where my data is stored?
No.
The type of web analytics software you use determines that.
If you are using an ASP based solution (say NetInsight or Microsoft AdCenter Analytics or VisiStat) then both your first party and third party cookie data is stored in the data center of your application service provider (vendor).
If you are using an in-house solution (like ClickTracks or Urchin) then your data is stored in your own data center (regardless of what kind of cookie you use).
Cookie Deletion Rates.
It is important to remember cookie rejection is not the same as deletion. With rejection the cookie is never even accepted (worsens tracking). With deletion you collect data for the session (visit) but tracking after that visit worsens.
Everyone wants to know cookie deletion rates ("help my web analytics data is crap!"). There is no "global standard". Sadly I have never seen a study that was objective and not pushing the vested interests of the publisher (be it a company or an "analyst").
It is also extremely difficult for a "third party" to have the kind of access to actual data that would help them develop anything close to an objective "standard".
The biggest determining factors are your customers and their browser settings and software on their computer. And that can vary greatly from site to site.
My own personal experience across a number of ecommerce, support, and other corporate sites (excepting extremely "tech heavy audience" sites) has helped me come up with a "benchmark" of cookie deletion rates of 3% to 5% for first party cookies and 20% to 25% for third party cookies. They all tend to fall in that range.
FWIW.
If you want to know what the number is for you, I recommend putting in the sweat, blood and tears to measure it on your actual site. If it is important to you, it is important that you don't just take someone's word for it and proceed to evaluate your own web analytics data and get your own benchmark. I assure you that you are unique.
Do I have to use cookies?
The current generation of web analytics tools all use cookies to perform the core function of "accurately" computing Visits and Unique Visitors.
If you use cookies those numbers will be better (not perfect, see this post: Data Quality Sucks, Let’s Just Get Over It).
You will get a better understanding of metrics like Visits to Purchase or New and Returning Visitors or even Conversion Rates.
But if your company executives or, more likely, website customers have a preference for you not to use cookies then you don't have to.
You won't be able to measure some of the above Key Performance Indicators, but you can still get good value from the cookie-less data that you do collect. Top Visited Pages, Revenue, Referring Websites (URLs), Search Engine Keywords and on and on and on.
Don't let the fact that you don't use cookies get in the way of being able to use the web analytics data in meaningful ways.
The data won't be perfect but then again perfection is greatly overrated! (Chapter 13, Page 341 of my book.)
[
There are analytics tools that allow you to use alternatives to cookies to compute Visits and Visitors. You can use user_agent_id's, combination of browser_id and operating system etc. See if your Management or Customers are ok with that. If yes, use those. If not, to stress again, the data you collect, anonymously, can still reveal insights of value.
]
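As a rough sketch of what such a cookie-less identifier might look like, the function below hashes a few request attributes into a key. The choice of IP address plus User-Agent string is my assumption for illustration, and fallback_visitor_key is a made-up name; your tool may combine different attributes.

```python
import hashlib


def fallback_visitor_key(ip_address: str, user_agent: str) -> str:
    """Derive an anonymous, cookie-less visitor key from request attributes.

    ASSUMPTION: IP + User-Agent is the chosen combination. It is weaker
    than a cookie: everyone behind one office proxy with the same browser
    collapses into a single "visitor", and a person whose IP address
    changes is counted again as a new one.
    """
    raw = f"{ip_address}|{user_agent}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:16]


if __name__ == "__main__":
    print(fallback_visitor_key("203.0.113.7", "Mozilla/5.0 (Windows NT 5.1)"))
```

The same inputs always give the same key, so visits can be grouped without storing anything on the visitor's browser.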
Is privacy important?
I know that sounds like the most obvious question in the world, with the most obvious answer in the world.
Yes. It is.
The primary function of your website is to be responsive to your customers. It is important to have a clear privacy policy, it is important to be transparent about what you are collecting (especially if you are collecting PII – personally identifiable information), and to educate your users.
Here's my humble privacy policy (you'll always find it in the footer).
Be transparent, there are few things more important than the trust of your customers.
Besides as I have stressed several times, even with what data you can collect (say you just have your raw server web logs and nothing else) it is possible to find insights. Nothing's impossible for an Analysis Ninja!
That's it.
You are now a graduate of Cookies 301. May the force be with you!
I would love to hear your feedback on this delightful and often beguiling topic. What do you think of cookies? What has worked for you? What did not? How have you overcome obstacles? Any tips for the rest of us?
I am sure you have stories you can't wait to share. Please do.
Thanks.
Well that depends on your definition of "does not work" :-)
Excluding work's websites, I have only about 20 or so sites whose cookies are accepted within my browser (be honoured Avinash, you're one of 'em. :-P ).
Yet I spend all day browsing around the 'Net reading and interacting with websites. Working.
Internet works fine for me! Come on in!
Sure there are sites that *need* cookies. Logins, Shopping carts etc. No argument. But you might be amazed at just how much of the 'Net as a whole works very fine without them. Or not. :-)
One extra I'd add for your readers (as I know you know this!). You can configure web servers to log cookies into your web server logs. It is *highly* recommended to do so if you use logs.
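For readers who want to try this, here is a hedged sketch. On Apache, the %{Cookie}i directive in a LogFormat writes the Cookie request header into each log line. The exact format below (the usual combined format with one extra quoted cookie field) and the cookie name visitor_id are assumptions; adapt both to your own server. Note that the simple pattern does not handle escaped quotes inside a field.

```python
import re
from http.cookies import CookieError, SimpleCookie

# Assumed Apache directive producing the lines parsed below:
#   LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\"" combined_cookie
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" '
    r'"(?P<agent>[^"]*)" "(?P<cookie>[^"]*)"$'
)


def visitor_id_from_line(line, cookie_name="visitor_id"):
    """Return the visitor cookie value logged on this line, or None."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None  # line is not in the assumed format
    cookies = SimpleCookie()
    try:
        cookies.load(match.group("cookie"))
    except CookieError:
        return None
    morsel = cookies.get(cookie_name)
    return morsel.value if morsel is not None else None


def count_unique_visitors(lines, cookie_name="visitor_id"):
    """Count distinct visitor cookie values, ignoring cookie-less hits."""
    ids = {visitor_id_from_line(line, cookie_name) for line in lines}
    ids.discard(None)
    return len(ids)
```

Hits logged with "-" in the cookie field (no cookie sent) are skipped rather than counted, so you may also want to track how many of those you see.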
Cheers!
– Steve
Avinash,
It's a very interesting post this one, especially because i find that many topics that you talk about are the same ones that i've found myself covering in many meetings in the past 8 years working in the web analytics space.
A couple of comments that i think are mandatory after reading this.
The first one is about the 1st Vs. 3rd party cookie debate. One very important thing that everybody should always consider is P3P compliance of the cookie. P3P (Platform for Privacy Preferences) was introduced years ago (if i'm not wrong IE 5.0 was just released). It basically defines standard information that should be included in the cookie itself in order to "explain" to the internet users:
– Who is setting the cookie
– the reason why the cookie has been set
– the type of information that the cookie contains
– a link to the privacy page of the vendor setting the cookie
Any cookie can become P3P compliant. If a cookie is P3P compliant the information listed above will appear in the privacy report of the user browser each time the cookie is received / read by the website.
This is very important because since IE 5.0 (and other browsers later) a 3rd party cookie won't be accepted by default UNLESS it is a P3P compliant cookie. If it's P3P compliant it will be accepted unless the user specifies not to accept ANY cookie (maximum protection level).
How does this matter to me?
This means that if the Web Analytics Vendor uses a 3rd party cookie, it should also be P3P compliant. The acceptance of that cookie will be much higher and very close to the 1st party cookie acceptance (please also note that in a few cases a non-P3P compliant 1st party cookie could be blocked anyway!).
Regarding cookie deletion rates:
it's not all about cookie acceptance/rejection or cookie deletion. We should also consider what i usually define as "cookie generation". To understand what i mean here just think about your monthly surfing habits.
Let's first pretend you do not delete the cookies :-)
How many different browsers/devices do you use on a monthly basis? in my own case i usually use 2 PCs (sometimes my girlfriend's Mac too) both with IE and Firefox. Then i use the ipod touch (Safari). That's it. So over a monthly period i'm generating (using, surfing with) at least 7 different cookies. And this if i never delete them. How about you?
A unique visitor should be a real person, but even if we are 100% sure that our users NEVER delete cookies, they will for sure generate more cookies than 1 each.
Is this a problem?
I don't think this IS a problem. This is a scenario that needs to be taken into account, of course, but i don't consider it a problem for the simplest reason: don't Ad-Servers use the same technology (cookies)? Don't we plan an advertising campaign on a site based on Page Impressions and / or unique visitors? This last metric should, in fact, be called Unique Browsers, because that's what cookies measure. Browsers. If I use 3 browsers i'll generate 3 cookies. As simple as that.
I hope this explanation contributes to completing the scenario.
Ciao,
Manuel
Hi Avinash,
I am an ardent fan of your blog. I am getting unusual statistics from Google Analytics. Surprisingly, the "Referring sites" section shows that my website is getting visitors through google.com and even from our own website.
How can google.com and my own website be considered referring sites?
Please crack this million dollar question!
Very relevant!!
Rather than dwell on cookie deletion as something that messes up my numbers I use trending when talking about UVs. There is nothing to say cookie deletion will be greater one month over the other. So if the overall traffic is trending the way I hope it is (up or down depending on the purpose of the page) then whatever is happening is OK. For authentication sites (sites requiring a sign-in) you have an additional check – not fool-proof.
For me, "setting expectations" (with clients / co-workers / bosses) has worked the best. It's important to understand that Analytics packages aren't a replacement for QuickBooks or some other accounting software. They use first or third party cookies to collect visitor data, so your data isn't going to be 100% accurate, not even close (well close, but not too close). So as long as everyone understands this going in, things usually go a lot smoother than if people find out later that their web analytics programs use cookies and that people can block cookies :).
In addition some web analytics providers (such as coremetrics) use the user/persistent cookie to link multiple sessions together. In other words it is used to see that a visitor came in from a banner ad on Monday and then returned via natural search on Tuesday to actually purchase.
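A minimal sketch of that linking step, assuming you already have one row per visit carrying the persistent cookie's ID, a sortable timestamp and the traffic source. The tuple layout and the function name first_and_last_source are made up for this example and do not reflect any vendor's actual data model.

```python
from collections import defaultdict


def first_and_last_source(visits):
    """Group visits by persistent cookie ID and report where each
    visitor first came from and where they most recently came from.

    visits: iterable of (visitor_id, timestamp, source) tuples.
    Returns {visitor_id: (first_source, last_source)}.
    """
    by_visitor = defaultdict(list)
    for visitor_id, timestamp, source in visits:
        by_visitor[visitor_id].append((timestamp, source))

    result = {}
    for visitor_id, rows in by_visitor.items():
        rows.sort(key=lambda row: row[0])  # oldest visit first
        result[visitor_id] = (rows[0][1], rows[-1][1])
    return result


if __name__ == "__main__":
    sample = [
        ("v1", "2008-07-08", "natural search"),
        ("v1", "2008-07-07", "banner ad"),
        ("v2", "2008-07-08", "direct"),
    ]
    print(first_and_last_source(sample))
```

If the visitor deletes the persistent cookie between Monday and Tuesday, the two visits end up under different IDs and the link is lost, which is exactly why deletion rates matter for this kind of analysis.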
Sujith,
The reason you are picking up your own site as a referrer is likely because you have links on your site that are absolute instead of being relative.
If the URL in your hyperlinks read http://www.yoursite.com/movies/movies.aspx …they should instead read /movies/movies.aspx.
Use relative…not absolute links. Google is a referrer because of search traffic being sent your way or maybe links in google email from the mail servers leading traffic to your site.
Joe,
One important thing would be to measure what is the standard error percentage.
Each site audience has its own cookie deletion ratio and cookie generation ratio. This won't change much over time. So you'll know how "wrong" those cookie based figures are.
But what's wrong for someone might be right for someone else. There is no absolute unique right way to measure everything. It's all about using the right measurement / metric for the right purpose.
:-)
Manuel
This is an amazing post Avinash. I have learned a lot from your blog.
I am a follower of your blog in Korea. I am eagerly awaiting the publication of your book in Korean this weekend. Here is one of our Korean bloggers who wrote about it:
http://www.youzin.com/blog/?p=686
Good luck to you.
Hi Avinash,
You mention that, in general, first-party cookies have a deletion rate of 3% to 5%. For what time frame is your estimate?
Cole
Steve: I admit that line was "comment bait" and specifically directed at you. I had missed you not commenting on my recent posts! It is so much fun to argue with you. :)
You are absolutely right on logging the cookies into server logs, that is the reason I opened the post by saying you can use cookies with either tag based or log based solutions.
Manuel: Your comment is brilliant! Thank you, thank you.
You are absolutely right that even if no one deleted their cookies there are still issues with getting "accurate" counts of "persons" on the website. But the data is still very good (not perfect) and quite actionable.
And you are right about "unique browsers", I am sure you notice my cute use of that exact term, browsers, in my post several times. :)
I am deeply appreciative of your comment, thanks for adding to the conversation.
Joe: Here is one of my opening lines in these types of conversations:
"Consider what you can measure if you take out an ad in a magazine. Subscribers to the magazine at best. Maybe newsstand sales. How does that measure effectiveness of the ad? It's a total faith based initiative! Now consider that same ad but on the website of that magazine. Imagine what you can measure: Impressions, clicks, visits, revenue, leads….. Now it's not 100% accurate, but it's not a faith based initiative!"
:)
Cole: In most cases it was going back three months (usually that yields a number closer to the first one). In a bunch of cases we went back six months, and that usually yielded a number closer to the higher end of the range.
Hope this helps.
-Avinash.
Dear Avinash,
This post has deleted most of the *FUD* about cookie issues, especially since Manuel added his additional points too.
I have a question in my mind. Can we capture visit data such as First Referrer & Last Referrer?
I would like to know, for example, whether a new visitor who first came through an organic source came back the next time through a PPC source.
You might be thinking this is funny! :)
But from my research I got some GA code that is supposed to provide the first referrer & last referrer too. Is this possible in GA?
I have not used that code yet.
But if you want to look at it I will post it to you.
Thanks,
Bhagawat Jadhav.
Avinash,
Thank you for the post. One question that I was hoping you could answer is the approximate % of users who block 1st party cookies. I have not found any reliable statistics covering this topic. The closest thing I found was from a post on Diablo Media but who knows from where they got their data.
What are your thoughts? I suppose this could vary greatly across segments and countries.
Michael,
Global average of cookie acceptance rate is higher than 96% (not users, but page impressions).
Source: Nielsen Online.
This means basically that given 100% of page impressions more than 96% will be generated from a user browser accepting cookies (be it a 3rd or 1st party cookie). Please also note that this is the global average, therefore some sites might have a higher rate (e.g. technical users sites), but from my own experience i've never seen a site where the number of non-cookied page impressions was higher than 9-10%.
I hope this helps.
Ciao,
Manuel
This post is very helpful, but what about if your company has several websites, all on different domains, and you want to understand whether visitors are going to just one, or several of your sites? Are there impacts to using 1st vs 3rd party cookies here? I assume if domainA.com has its own 1st party cookie, we need to set up domainB.com with its own 1st party cookie- but am not sure how to track a user who travels across those domains. Or, should I use some sort of custom variable/metric in my web analytics tool to track this separate from the cookies?
Thoughts?
Great Post, Avinash!
And Alice Cooper, thanks for that. It's amazing what you can learn in the span of a cup of coffee!
Just wanted to correct something in the comment thread:
1) The #1 reason that a site in GA is its own referrer is that the configuration is wrong. Either one has subdomains or cross domains and the webmaster has (talk about topical!) not set up the configuration to manage the cookies correctly. The #2 reason is that it used to be set up wrong, it is set up correctly now, and you still have people with the wrong (yup) cookies. There is #3 reason, and it is escaping me….
2) When Google shows up as a referrer, it is not because of search. For search, it will show up under search engines. As a referrer, I see it the most often in customer accounts because visitors are using Google Reader or Google Home or Google Something Else where they have a link to you (usually a feed). It is also possible that you, the website owner, have answered a question on a Google Group and left a link there, and someone clicked it. I believe that gmail always shows up as gmail and never as Google.
… continuation of comment above
OK, I remembered. The #3 reason that your site is your own referrer is untagged pages.
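Whatever the root cause turns out to be, a first diagnostic step is to pull out the hits where your own site shows up as the referrer and look at which pages produce them. A minimal sketch, assuming you can export the referrer URL for each hit; is_self_referral and the host names are hypothetical.

```python
from urllib.parse import urlsplit


def is_self_referral(referrer_url: str, site_hosts) -> bool:
    """Return True when the referrer's host is one of your own hosts.

    site_hosts: every host name (including subdomains) you consider
    part of your site, e.g. ["www.yoursite.com", "shop.yoursite.com"].
    """
    host = urlsplit(referrer_url).hostname  # None for empty or relative values
    if host is None:
        return False
    return host.lower() in {h.lower() for h in site_hosts}


if __name__ == "__main__":
    own = ["www.yoursite.com"]
    print(is_self_referral("http://www.yoursite.com/movies/movies.aspx", own))
    print(is_self_referral("http://www.google.com/search?q=movies", own))
```

Grouping the flagged hits by landing page should hint at which of the reasons above applies, for example a cluster on one subdomain or on a handful of pages that may be untagged.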
Nancy: Ahh…. a tough question. :)
Let's take it in parts.
Each domain for optimal collection of data should set a first party cookie. That is the cleanest way to get numbers for each domain.
It also means that you won't be able to truly calculate the total de-duped number of Unique Visitors across all the domains.
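A toy simulation shows why. It assumes one browser per person and no cookie deletion, and every name in it is made up for illustration. Each (person, domain) pair gets its own unrelated first party cookie ID, so adding up the per-domain Unique Visitors counts a person once for every domain they visit.

```python
import secrets


def simulate_cross_domain_counts(visits):
    """visits: iterable of (person, domain) pairs.

    Returns (true_people, summed_per_domain_unique_visitors).
    ASSUMPTION: one browser per person and no cookie deletion.
    """
    # One independent first party cookie per person per domain.
    cookie_for = {}
    for person, domain in visits:
        cookie_for.setdefault((person, domain), secrets.token_hex(8))

    # What each domain's analytics sees: a set of cookie IDs.
    per_domain = {}
    for (person, domain), cookie_id in cookie_for.items():
        per_domain.setdefault(domain, set()).add(cookie_id)

    true_people = len({person for person, _ in cookie_for})
    summed = sum(len(ids) for ids in per_domain.values())
    return true_people, summed


if __name__ == "__main__":
    sample = [("ann", "domainA.com"), ("ann", "domainB.com"), ("bob", "domainA.com")]
    print(simulate_cross_domain_counts(sample))
```

In the sample, two people produce three "unique visitors" once the domains are added up, because nothing ties Ann's cookie on one domain to her cookie on the other.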
If you want to track multiple domains as one then you can do something like what this article advises:
How do I install the tracking code if my site spans multiple domains?
You can also try a couple of other techniques that are listed in this post (look for the question and answer right under the image of the earthen pots!):
Google Analytics Help: Questions, Answers, Tips, Ideas, Suggestions
Both these apply to GA but of course your vendor might have a solution as well, please check with them.
Net net this is a bit of work, but if it is important enough for you then you should put in the effort.
Robbin: You are a star! Thanks for adding the excellent answers for self referrers and for "the" google showing up as a referrer.
-Avinash.
Robbin – On your point # 2, you are right. Gmail, ymail would show as mail.google.com or mail.yahoo.com. The other referral for Google can also be the blending of top links that Google displays. A search for the word "Dictionary" on Google would provide you the links to various properties on dictionary.com's website
(http://www.google.com/search?source=ig&hl=en&rlz=&=&q=dictionary&btnG=Google+Search)
Avinash – You might have also heard vendors talking about dedicated third party cookies. It is nothing different than your example above for Coremetrics.
As always, another great post.
Thanks,
Rahul
Avinash this is an excellent post. Having been in the industry for ten plus years I can honestly say that I have yet to read such a cogent explanation of cookies.
Most discussion around cookies is either too harsh or too complimentary. Your post is balanced and pragmatic and provides clear guidance.
Simon.
Avinash,
It is interesting the topics of Cookies and FUD (Fear, Uncertainty, and Doubt) have come up on your blog.
I am a privacy advocate.
I realize my traffic data (usage, statistics, analytics, etc.) is out there and able to be collected.
I consider my traffic data to be a private exchange between me and my intended destination (in this case, you).
First-Party and Third-Party both have legitimate uses for web optimization.
They help us build better websites and make the internet stronger.
However, cookies do not exist in a vacuum.
Since the internet was not built with security in mind, there is room for trusted parties (especially Internet Service Providers) to manipulate and analyze traffic that is not intended for their use.
This topic gets sticky when you begin to consider the potential security implications of Third-Party and First-Party cookies.
Some recent news that you and your readers may not have heard of has emerged in the security community and may be contributing to the "FUD" you reference.
Fear of Third-Party cookies is not entirely unfounded, because real-world examples of First and Third-Party cookies being used for secret data gathering (and money making) have been found in the wild.
If you or your readers are interested in learning about potential origins of the FUD, or why some of it may not be FUD, I recommend you all review some of the June and July 2008 episodes of Steve Gibson's Security Now podcast.
He reviews the "Phorm" technology and how certain ISPs have allowed a third party vendor to implement it, secretly, to gather data on the ISPs users in exchange for money.
One particular episode, #153, covers this, but it has been discussed in several podcasts regarding Phorm:
http://www.grc.com/sn/sn-153.txt
So yes, third party cookies themselves are fine, but there are real privacy and security issues that exploit vulnerabilities in the web. Unfortunately they will contribute to a certain level of "cookie rejections".
In my case, I browse the web with Firefox 3 which features 3rd Party cookies off by default. I run NoScript so I may control the execution of JavaScript. I run Flashblock too.
I would recommend, as a best practice or at least a consideration, to avoid the use of 3rd party cookies whenever possible.
James: Let me unequivocally say that the "FUD" that I was, coyly, referencing was not from the sources that you have very kindly identified. While I am quite aware of the sad sad story of "Phorm", I was unaware of Mr. Gibson's podcasts. They certainly make an interesting listen.
I am an Analytics Evangelist and while data is important to me, I am a big believer in both privacy and transparency. I am also a staunch advocate for collecting non-PII data because online you don't need PII to understand and optimize the customer experience.
Every single company that sets cookies or collects any kind of data online owes it to its customers to be 100% transparent (in easy to understand language).
Every individual then should have an option to set their own levels of privacy.
Knowledge + Choice = Trust.
Thanks so much for the thoughtful comment and the links.
-Avinash.
I have been asked the question about how your own site can appear as the referrer a few times before and while I have made guesses (untagged pages generally), I have never felt that I had the real reason. But I was asked the question again today and had an idea which appears more logical and which the data subsequently proved to be accurate.
If a visitor accesses your site but clicks on a navigational link before the first page has fully loaded, the first measurement that is received will be for the second page, with the referrer being the URL for the first page. This is assuming that the page measurement code for your web analytics package is at the bottom of the page as it is for Nedstat.
I can't prove that this is the primary reason for having your own site as a referrer but it makes more sense to me than anything else. It is more likely to be an issue with content heavy pages or for visitors with lower download speeds. Can anyone let me know if they can prove or disprove this theory?
This is a really interesting post, it explains more about cookies than I knew before.
But I think it underestimates the number of people who delete cookies. Except for cookies from sites I need to log on to (bank, discussion forums, Amazon, etc.), all of my cookies are deleted every time I close the browsers I use. So, if I understand you rightly, I'm a unique visitor every time I visit a site even if I visit it every day or several times a day (if I close my browser between visits). This puts a new spin on the stats websites brag about.
Good reference post – always looking for well-written, objective discussions of this subject for people I'm working with.
Thank you for the NetInsight mention. One tiny amendment – NetInsight is available as ASP or in-house using 1st, 3rd or no cookies at all (and there are still plenty of folk that want/have to avoid them)
-bob
Another quality post Avinash.
I am a bit puzzled about the cookie deletion rate. If I recall correctly ComScore did a research study about cookie deletion a year ago and they found a deletion rate of about 30% on first party and third party cookies. Correct me if the numbers are wrong but according to the study a few users inflated the unique cookie count by 2.5 times by deleting very often.
So how do you calculate your 3-5%?
Looking at our site we have looked at daily versus weekly versus monthly cookies to try and see the deletion rate and the 30% seems to be supported.
Also do you have any comment on ComScore study?
Thanks in advance.
Theodor: You might have noticed my own point of view about comScore in my blog post where I mentioned the dearth of objective and sound points of view on cookies. There are a whole bunch of problems with that data, I'll simply refer you to a pretty nice write up from my good friend Ian Thomas from Microsoft:
Cookies are evil! Burn them!
http://www.liesdamnedlies.com/2007/04/cookies_are_evi.html
My own experience is based on analysis on multiple websites over the course of the last few years of two primary sources of site data: javascript tag based and the raw server log data. In each case data gets collected differently, there are common keys (if nothing else then IP and User Agent strings) and both use cookies (a common one set by the website platform). This allows detection of uniqueness using a couple of different algorithms, and not just cookies. At the end, after a bunch of sweat, you have insights into cookie deletion rates.
The data is not based on observing using "monitoring software" what a small panel of users do (even though I realize that the data has good intentions, is adjusted multiple multiple ways to adjust for sample bias and user type bias). It is based on actual data from the website, and all the data.
Let me rush to add that you should not take anyone's number at face value.
I stress repeatedly in the post that if this is important to you then put in the work required to measure it rather than jumping off based on someone else's work (and that includes me! :). Each website and web business is unique, it is possible you are 30% and it is likely that I am 85% and just as likely that pricegrabber is 1%.
-Avinash.
A few comments to add to this debate …
1 – Cookie deletion / blocking
Until last Nov I worked for a UK analytics vendor called RedEye for many years. Back in 2003 RedEye produced a comprehensive report on cookie deletion; go here for more info http://www.redeye.com/bestpractice/white_papers.php. This paper is still highly relevant today, goes into more detail than pretty much any report written since, and is worth a read.
The key to RedEye’s report was being able to look at the behaviour of a set of logged-in users (of transactional sites) and to analyse how many different cookies and IP / user agent strings each visitor had used over varying periods of time.
In summary, cookie deletion and blocking can be rife or negligible, but to what extent largely depends on the nature of the web site and the make-up of its visitors.
For example, consider a site whose best customers access it many times per day, such as a gambling site, versus a site whose best customers only visit once or twice a month, such as a supermarket. The number of cookies per visitor for the gambling site is likely to be much higher for reasons already discussed, such as multiple PCs (I can place a bet from any PC in approx 1 minute), multiple browsers (phone and PC), deletion (do I want people to know that I visited a gambling site?) and so on.
And yes, it’s also possible for each logged in visitor to have multiple logins, which really does make ‘100% accurate’ measurement a pipe-dream, but as Avinash says the data is still fantastic for planning and analysis. Tip – steer clear of any vendor who says that their data is 100% accurate – you’d be surprised at how many claim it is.
2 – 3rd party versus 1st party
It’s not just a question of which type of cookie you set as many companies will use both. For example, a brand can set a 1st party cookie on the web pages which it manages and use the same tracking domain to set a cookie on any white labelled content which is hosted by a third party. In this case, cookies set from the white labelled pages will be 3rd party cookies during the same session.
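A minimal sketch of that distinction, under simplifying assumptions: a cookie counts as first party when its domain matches the host of the page being viewed (or a parent of that host), and third party otherwise. Real browsers also consult a public suffix list, which this ignores, and all the domain names below are hypothetical.

```python
def cookie_party(page_host, cookie_domain):
    """Classify a cookie relative to the page the visitor is viewing."""
    page_host = page_host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().strip(".")
    # Same host, or the cookie domain is a parent of the page host.
    if page_host == cookie_domain or page_host.endswith("." + cookie_domain):
        return "first-party"
    return "third-party"


# The same tracking cookie, seen from two different pages.
on_brand_site = cookie_party("www.brand.example", "brand.example")
on_white_label = cookie_party("shop.partner.example", "brand.example")
```

The cookie itself does not change; only the page it is set from decides which kind it is.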
3 – Cookies are no longer just browser cookies
You can also set cookies through Flash. In principle a Flash Cookie has the same basic objective as a browser cookie but as it is set via an embedded object in Flash it’s browser independent.
They’re also tricky to block and delete.
This is good news because Flash Cookies can very easily and significantly improve the ‘accuracy’ of reported visitor figures due to very low deletion and blocking rates. However, it can also be bad news because their use is not flagged up in IE’s Web Page Privacy Policy or through the usual Firefox plug-ins (WASP etc), so any brand using them needs to be even more conscious of best practice data protection and privacy wording. Go to http://www.e-consultancy.com/news-blog/author_25880/paul-cook.html for a view on this and also http://www.adobe.com/products/flashplayer/security/privacy_policy/faq.html
Avinash –
Thank you for this article. It is amazingly helpful indeed.
We are currently faced with a similar situation as Nancy described above (comment # 20). We have a collection of branded websites where a visitor can easily move from one to the other. Until recently all websites were using the same domain. We used the 1st party delegated solution offered by Coremetrics to track.
Currently, we are redesigning our sites and moving them to different domains, starting with 2 of them. In principle, I would like to use the approach you are suggesting and use 1st party cookies for each domain. The internal concern however is losing paid search mmc parameter information when a user jumps across domains. For example, if a visitor comes via Google paid search to domain1 and then goes to domain2 and purchases, the purchase will be attributed to domain2 and categorized as a referred visit from domain1. However, internal marketing would want to get full credit for the purchase based on mmc.
Can you please offer some advice on what approach we might take to not lose paid search relevant information and to track our metrics accurately?
Thank you,
Olga
@Olga – I see two options, but it's early and I may miss a trick:
1. Use 3rd-party cookies for a single, other, domain. (urgh!)
2. If you can manage the links that a user may use to move from one site/domain to another, then you could pass the cookie value over in the URL – it's not bulletproof, but can work well.
(there was a third thing as I started writing this response, but it's escaped me for now)
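For option 2, a rough sketch of what passing the cookie value over in the URL could look like. The parameter name vid, the visitor ID and the domain names are all hypothetical; a real analytics tool will have its own cross-domain linking mechanism and its own parameter format, so treat this only as an illustration of the idea.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_visitor_id(url, visitor_id, param="vid"):
    """Return url with the visitor ID appended as a query parameter."""
    parts = urlsplit(url)
    # Drop any stale copy of the parameter, keep everything else (e.g. mmc).
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, visitor_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_visitor_id(url, param="vid"):
    """Read the visitor ID back on the receiving domain, or None if absent."""
    return dict(parse_qsl(urlsplit(url).query)).get(param)


# Link from domain1 to domain2, carrying both the campaign code and the ID.
link = add_visitor_id("https://domain2.example/checkout?mmc=google-ppc", "v-12345")
received = read_visitor_id(link)
```

The receiving site would then set its own first party cookie from the value it reads. As noted above it is not bulletproof: it only works for links you control, and the ID travels with the URL if a visitor shares or bookmarks it.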
Avinash, I have a very naive question to ask. You say that cookie deletion rates lie in the range of 3% to 5% for first party cookies and 20% to 25% for third party cookies. My understanding is that cookie deletions are in most cases manually performed by the user. The user just deletes all cookies stored on the local drive. This act doesn't distinguish between 1st and 3rd party cookies. This being the case, how could their deletion percentages be so vastly different?
I am researching Google Analytics for possible use on a web site for survivors of domestic abuse. The site administrator is understandably concerned that cookies set by Analytics in visitors' caches could become potential safety risks for those visitors, whose abusers might detect and be provoked by evidence of a visit to the site. From what I understand in the discussion above, persistent tracking cookies usually show up as a string of numbers, but do they necessarily do so? How can this masking be ensured? Can the Analytics user *choose* what appears in the tracking code? How does one do that?
Ame: This applies regardless of which web analytics tool you use (Google Analytics or Yahoo! Analytics or Omniture or….)….
Cookies are strings of characters and numbers and for the web analytics tools they are anonymous. For example here is the cookie set by Omniture on my computer to track me:
Key: s_vi_qpasgvx7Bwx7Cqwx7Cwfex7Dx60y
Value: [CS]v4|4A301DEC00004139-A0208310000005B|4A301DEC[CE]
Created: 6/10/2009 9:05:15 PM
Expires: 6/9/2014 9:05:15 PM
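To show why such a value carries no personal information, here is a minimal sketch of how an anonymous identifier cookie could be generated. This is not how Omniture or Google Analytics actually build their cookies; the cookie name visitor_id and the one year lifetime are assumptions chosen for illustration.

```python
import secrets
from http.cookies import SimpleCookie


def make_visitor_cookie(name="visitor_id", days=365):
    """Build a persistent cookie whose value is just a random string."""
    cookie = SimpleCookie()
    cookie[name] = secrets.token_hex(16)  # 32 random hex characters
    cookie[name]["max-age"] = days * 24 * 60 * 60
    cookie[name]["path"] = "/"
    return cookie


cookie = make_visitor_cookie()
# The text a server would send after "Set-Cookie:".
header_value = cookie.output(header="").strip()
```

Nothing about the visitor goes into the value; it is only useful for recognizing the same browser on a later visit.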
You can use this free program to check all the cookies set on your computer:
http://www.karenware.com/powertools/ptcookie.asp
Just clear your cookies, then go to any site, open the Karenware viewer, see what cookies that site is setting.
Some websites do collect PII (personally identifiable data). That is normally not stored in cookies, unless they really don't know what they are doing.
Tools like Google Analytics explicitly prohibit collection of any PII data by its users using Google Analytics.
To answer your other questions:
You cannot choose what string shows up in the cookie set by the Analytics tool (you'll quickly realize allowing you to set your own strings defeats the purpose of creating and setting random string cookies).
You don't have to use a hosted solution like Google Analytics or Omniture or WebTrends. You can use a log based solution. You implement it in-house (at your company / non profit). You own the data. It never goes any where outside your company.
You can find logs based solutions from WebTrends (http://www.webtrends.com), Urchin (http://www.google.com/urchin/index.html) or ClickTracks (http://www.clicktracks.com). They all cost some money, both Urchin and ClickTracks are pretty cheap.
You can find free logs based solutions here:
http://www.analog.cx/
http://awstats.sourceforge.net/
http://www.mrunix.net/webalizer/
All logs based solutions, free or paid, require you or someone at your company to be slightly technical to implement them.
Hope this helps.
Avinash.
I am not entirely sure if you answered my question–I am less concerned (here) about whether PII is available to outsiders (including but not limited to the website doing the tracking). I am really asking about whether others in the same household as a user would be able to see where those with whom they share a computer have been going online via cookies. For example, in clearing out my TEMP files, I once found cookies for pornography sites on my computer, which tipped me off about what my ex-husband was doing in his free time. Most people do this cache clearing automatically, but a suspicious (me) or controlling (an abuser) person might not. Our question: is there any chance that an abusive spouse would run across a tracking cookie that includes all or part of the abuse shelter's name? We want the info provided by Analytics; we don't want to endanger the people we're trying to help. Your Omniture-cookie example is relevant and reassuring; however, I cannot tell from what you are saying whether it is *necessarily* the case that the code would be anonymous. Are these cookies *always* randomly generated? Sorry for being so thick-headed and long-winded.
Ame: Let's try this again, this time with additional context from you.
When you open your browser (Firefox, Internet Explorer, Chrome) and visit a website, two types of "tracking" happen.
1. Tracking by the Server (website you went to).
2. Tracking by the Client (the browser on your machine).
In scenario one the website typically collects anonymous data about your visit (pages seen etc.) and also sets an anonymous cookie on your computer. This anonymous cookie can't identify you as a person, but it can identify the site which set the cookie. When I visit the NY Times a cookie gets set, and on my computer it will say nytimes.com (or whatever domain was used to set the cookie).
In scenario two (regardless of what the website is or is not tracking) the websites you visit, the pages you see, the URLs you type, the forms you fill in etc. are all logged by your browser. This information is only stored on your computer.
So:
If you are a website owner and want to keep the data tracked by your website private, see my comment above.
If you are a person using a browser to surf the web and want to keep your behavior private, then your best option is to use a new feature in browsers called Private Browsing.
The way it works is you open, say, Firefox (version 3.5 or later), in the top menu you'll see Tools, click on that, then choose Start Private Browsing. This will start a completely private browsing session on your computer. When you close the browser, the history, cookies, cache and your download history (not the downloads themselves) will all be cleared. There will be no trace left on your computer.
I use Google's Chrome browser, and in it I click the Wrench icon (top right) and choose New Incognito Window. The same thing as above happens: no trace of my browsing is left on my computer after I close the browser.
I hope this helps.
-Avinash.
Thanks, Avinash!
Hi Avinash,
I've been thoroughly enjoying both your blog and web analytics 2.0, thank you for taking the time to write so well on the subject.
There is one subject around cookies that seems to be rarely touched on but which becomes very important when you have a long user-cycle as we do for our online website creator.
Our users are always logged in but have life cycles of many months. With customers using different machines and taking long periods to build their site our data becomes more and more dirty as their cookies are duplicated or deleted.
We really want the cookie to be set based on our user_id rather than simply the machine.
Looking at the GA cookie it seems possible to store the user's original GA cookie then overwrite the userID portion of it with the original should they log in from a different machine. That way we can be sure that GA is correlating events with visitors consistently rather than creating two different visitors for the user who logs on both at home and at work.
Is this a reasonable approach? Can you recommend any articles that talk about how to fix this problem in either GA or Omniture (which we're also using).
Thank you in advance for your help
– Peter
Peter: The optimal solution for you might be to use the new Custom Variables functionality in Google Analytics:
http://code.google.com/apis/analytics/docs/tracking/gaTrackingCustomVariables.html
I think setting the "scope" at a Visitor level might give you what you want (assuming your users will log in or have some other way of telling you they are the same person). Then you can use the standard reports or, like I do, cheat and just get what you want from the GA API and create your own custom reports.
If you need help then I am sure a GAAC would be happy to provide it; a list is at http://bit.ly/gaac
Good luck!
Avinash.
Avinash, thank you for the prompt response. The new custom variables are great and get us 90% of the way there. We're able to do most of what we want to with them, but unless we also store the data on our side we don't get the cross-machine tracking.
Let's say a visitor comes to us in month one and signs up through a landing page heavily promoting a particular feature. We store that they were shown the promotion in a visitor-scoped custom variable
In month three they then purchase the feature but do so from their work machine rather than the one they signed up on.
Because of the machine switch we wouldn't be able to correlate the two events and the efficacy of the early promotion.
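A minimal sketch of what storing the data on our side could look like, assuming every event happens while the user is logged in. It lives entirely in your own system, keyed by your own user_id, and sends nothing to the analytics tool. The class, method and promotion names are hypothetical, and the in-memory dictionary stands in for a real database table.

```python
class PromotionLedger:
    """In-memory stand-in for a table in your own database, keyed by user_id."""

    def __init__(self):
        self._first_promo = {}

    def record_signup(self, user_id, promo):
        # Keep only the first promotion a user signed up through.
        self._first_promo.setdefault(user_id, promo)

    def attribute_purchase(self, user_id, feature):
        # The lookup uses the login, so the machine or browser does not matter.
        return {
            "user_id": user_id,
            "feature": feature,
            "signup_promo": self._first_promo.get(user_id),
        }


ledger = PromotionLedger()
ledger.record_signup("user-42", "feature-x-landing-page")        # month one, home machine
purchase = ledger.attribute_purchase("user-42", "feature-x")     # month three, work machine
```

Because the join happens on the login rather than on a cookie, duplicated or deleted cookies do not break it.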
That said, we haven't tested how much of an edge case this is. I think we need to do that first, and if it does prove to be a problem we will look up a GAAC.
Thank you again for the input.
– Peter
Peter: Ahh…. slight misunderstanding on my part. My apologies.
You can't do that. Not because it is not doable via the solution I had shared but because it would be a violation of Google's terms and conditions.
But if you really want to do this then now you know how you could do this with a different web analytics tool.
-Avinash.
Thank you for your post. I was wondering if you had heard or experienced the following:
The majority of our traffic comes from campaigns, for which we use tracking codes on links within emails/newsletter emails, SEM campaigns, banner ads, etc. Prior to implementing the Omniture friendly cookie, the tracking codes were generating an average of 11.5 page views per visit. Post implementation, the PV/V dropped to 4.5 or less. The number of visits, clicks and visitors did not drop, but the page views per tracking code did. After two months of this, we removed the cookie implementation and the tracking code metric numbers returned to 11.5-12.0 PV/V.
It should be noted that as a whole, the site's page views, visits, visitors did not change drastically from the implementation to the removal. It seems, whatever page view loss there was in the tracking codes, it rolled over into the 'None' line item in the Tracking Code report with the cookie implementation.
Omniture cannot explain this page view issue with the cookie implementation. I was hoping you had some insight and/or heard of a similar situation.
Thank you for your time.
Jan: I have to admit I don't know what "omniture friendly cookies" are. Cookies are cookies. Perhaps you had switched from Omniture's normal third party cookie to a first party cookie, that is a very good thing.
Looking at your comment…. If the cookies caused the problems then something else should have changed as well. For example, if you moved from 1st to 3rd party cookies (a bad thing) your Visits might have gone up, and Unique Visitors would surely have gone up. If you went from 3rd to 1st then Visits might have gone down, and UVs would surely have gone down (as you more accurately track "people").
In neither scenario would total page views have been affected. One thing analytics systems are good at is recording a hit for every page request, regardless of cookies. So if total page views changed as well during the time period (in your case they went down) then this might be a simple problem of the entire site not being tagged properly with the Omniture code (whatever friendly stuff you did).
If total page views did not change but the number of individual sessions (visits) changed then something with the friendly stuff might be terminating the session prior to when it should be. Some javascript implementation specialist digging might help identify the issue. Please work with your Omniture rep.
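A small sketch of that last scenario, using a plain inactivity timeout to split one visitor's page views into sessions. It is not Omniture's actual sessionization logic; the 30 minute default, the shorter 10 minute value and the timestamps are assumptions chosen to show the effect.

```python
def split_sessions(timestamps, timeout_seconds=1800):
    """Group page-view times (in seconds) into sessions by inactivity gap."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= timeout_seconds:
            sessions[-1].append(ts)   # still the same visit
        else:
            sessions.append([ts])     # gap too long: a new visit starts
    return sessions


# Twelve page views from one campaign click, with a 12 minute pause in the middle.
views = [0, 60, 120, 180, 900, 960, 1020, 1080, 1140, 1200, 1260, 1320]

normal = split_sessions(views)                          # one visit, 12 page views
cut_short = split_sessions(views, timeout_seconds=600)  # visit ends too early
pages_in_campaign_visit = len(cut_short[0])
```

With the session cut short, only the first four page views stay with the visit that carried the tracking code. The other eight fall into a new visit with no tracking code, which is the kind of shift into the 'None' line that Jan describes, while total page views stay the same.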
Avinash.
Dear Avinash,
I work for an educational organization that does not collect cookies (per our privacy policy). We use our website for informational purposes and online applications to events. We use Google Analytics to track usage. If we don't collect cookies, then why does GA report results for visitor frequency and recency? I thought that data were collected by tracking cookies.
I am really confused and would like help so that we can report our website usage accurately.
Cat: Almost all web analytics tools in the market today use cookies in a default implementation to track visitor behavior on your websites. That is true whether you are using Google Analytics or SiteCatalyst or WebTrends or Baidu Analytics or anything else.
With regards to Google Analytics, here's an article that lays out how it uses cookies (first party) and what they do:
http://code.google.com/apis/analytics/docs/concepts/gaConceptsCookies.html
These cookies don't collect any Personally Identifiable Information. There is a lot of detail in the above link, including how to use the _setVisitorCookieTimeout(cookieTimeoutMillis) and _setCampaignCookieTimeout(cookieTimeoutMillis) settings to not set a persistent cookie.
If you do not want to use cookies there are other options. One simple option might be to use the log files that are sitting on your web server. You don't have to do any implementation to get them. You can use something like the free WebLog Expert to process them to get data.
Log files don't provide as much data of as many types, but depending on what you need they might be sufficient.
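As a rough illustration of what a log file can give you without any cookies, here is a minimal sketch that counts page views and approximate visitors. It assumes the server writes the common Apache "combined" log format, and it uses IP plus User Agent as a crude stand-in for a visitor. The sample lines are made up.

```python
import re
from collections import Counter

# Matches the Apache common log format, with the optional
# referrer and user agent fields of the combined format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def summarize(lines):
    """Return (page views per path, approximate visitor count)."""
    page_views = Counter()
    visitors = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or match["status"] != "200":
            continue  # skip unparseable lines and unsuccessful requests
        page_views[match["path"]] += 1
        visitors.add((match["ip"], match["agent"]))
    return page_views, len(visitors)


sample_log = [
    '203.0.113.5 - - [10/Jun/2009:21:05:15 +0000] "GET /events HTTP/1.1" 200 5120 "-" "Firefox/3.5"',
    '203.0.113.5 - - [10/Jun/2009:21:06:02 +0000] "GET /apply HTTP/1.1" 200 2048 "http://example.org/events" "Firefox/3.5"',
    '198.51.100.2 - - [10/Jun/2009:21:07:40 +0000] "GET /events HTTP/1.1" 200 5120 "-" "MSIE 7.0"',
    '198.51.100.2 - - [10/Jun/2009:21:07:41 +0000] "GET /missing HTTP/1.1" 404 312 "-" "MSIE 7.0"',
]
page_views, visitor_count = summarize(sample_log)
```

Metrics like frequency and recency are much harder to get this way, which is the trade-off mentioned above.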
-Avinash.
Tracking cookies are like the salesmen on 3rd world country touristic markets.
If they catch you glancing at their merchandise they will chase you down the street trying to sell you that particular item, telling you "You good price pay! Wife like!"
So, I wrote an article about that :) mennobieringa.nl/general/a-word-on-tracking-cookies-and-privacy/
Menno: Third party cookies work the way you mention, but not first party cookies.
So you are right that advertising tools use third party cookies and will "follow you around." But cookies used by Google Analytics and other Web Analytics tools are first party. They don't follow you anywhere.
For more on this please checkout this post:
~ EU Cookie / Privacy Laws: Implications On Data Collection And Analysis
-Avinash.
Suggestion -put a date when you insert an update – for future reference (I'm now in Jan 2013 and trying to figure out when the update regarding GA was inserted)
Question: I heard somebody claim today that they could imbed "some pixels" (I presume they meant cookies) in a visitor's visit, gather the person's information, and do follow-up advertising customized to what the visitor did while on the site.
Is this possible?
Russell: There are many different ways to collect data online. They are often referred to using words like code, javascript, pixel, etc.
Many ways of collecting data for web analytics tools use a 1×1 image pixel, invisible to the naked eye, that the code requests from the server. That request often carries more code and information that can be used for analysis.
So when they say they can "imbed pixels," just think of it as a way to collect data.
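A minimal sketch of the server side of such a pixel, to show where the data actually travels: in the query string of the image request. The URL, the parameter names and the function name are hypothetical, and a real tool would send far more fields.

```python
import base64
from urllib.parse import parse_qs, urlsplit

# A commonly used 1x1 GIF, stored base64-encoded.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def handle_pixel_request(request_url):
    """Return (data to log, image bytes to send back) for one pixel request."""
    params = parse_qs(urlsplit(request_url).query)
    # Keep the first value of each parameter; this is the collected data.
    logged = {key: values[0] for key, values in params.items()}
    return logged, PIXEL_GIF


logged, body = handle_pixel_request(
    "https://stats.example/pixel.gif?page=%2Fpricing&ref=newsletter"
)
```

The image is only the envelope. The browser asks for it, and the request itself (plus any cookie sent along with it) is what gets recorded.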
One of the uses of that code is to set cookies (typically third-party) to do remarketing using a display advertising platform like the Google Display Network. So they are right: they can use that data to make advertising more relevant to your site's visitor when that visitor is on other sites.
Avinash.
Dear Avinash,
I really like reading your blog, all the topics are so insightful.
I have a question on which I have been stuck for so long; I hope you can help me with it.
I am trying to implement eCommerce tracking. I know the basic way, in which you pass values on the order completion page; I have already implemented this method and it is working fine. There is one site, though, which does not generate a unique page, so I am not able to see what data/values are actually being passed with each completed order. The eCommerce value in analytics shows me (not set), with no product information etc., but the transaction is recorded. This CMS doesn't allow creating a unique page, and eCommerce tracking is built into it; I mean it is recording every transaction but not the values. My question is: is there any other way we can track these values, maybe through unique cookies which are generated with each transaction?
I am really stuck and would like some help. Is there any article which you recommend? Is this type of tracking even possible?
Thank you for your time!
Regards,
Isha