EU Cookie / Privacy Laws: Implications On Data Collection And Analysis

Privacy is a very important issue when it comes to digital. The way data is collected online and what happens to it is a much-scrutinized issue (and rightly so).

Digital data collection is also exceedingly complex, perhaps a reflection of the organic nature, and subsequent explosion, of the internet. Even sophisticated users find it difficult to know everything, so one can hardly expect normal digital users to know what's really happening.

For example, people are really shocked when they hear that, even with no web analytics or advertising analytics tool on a site, their behavior on the site gets automatically logged into server web logs. Information like IP address, the page requested, time stamps, browser ids and more is stored. These server logs can then be used to do basic reporting using off-the-shelf software.
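To make that concrete, here is a minimal Python sketch of what a single web log entry holds. It assumes the server writes the widely used "combined" log format; the sample line, the IP address and the parse_log_line name are made up for illustration, and your own server's format may well differ.

```python
import re

# Combined log format (assumed): IP, identity, user, [timestamp],
# "request line", status, bytes, "referrer", "user agent".
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical log entry: no cookie and no analytics tag is involved here.
sample = (
    '203.0.113.7 - - [10/Jun/2012:13:55:36 +0000] '
    '"GET /blog/cookies HTTP/1.1" 200 5120 '
    '"http://example.com/" "Mozilla/5.0 (Windows NT 6.1)"'
)

hit = parse_log_line(sample)
print(hit["ip"], hit["timestamp"], hit["path"], hit["user_agent"])
```

Every request to the site produces a line like this, which is why basic reporting is possible from the logs alone.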

Another example: people don't realize that, depending on the browser you use, being in Private Browsing or Incognito or InPrivate mode does not mean that no data about you is collected by the sites you visit. Being in InPrivate or Incognito mode simply means that no data (cookies, history, etc.) is kept on your computer once you close the browser. If your employer or ISP monitors your web usage, then they can still track you when you are using Private Browsing/InPrivate/Incognito mode. [Be careful! :)]

So, it is complicated. You can understand why there's a lot of confusion and scrutiny.

Two recent flare-ups highlight this scrutiny. The first was around Facebook and Twitter tracking your behavior across the web as you visited sites that use Facebook and Twitter social buttons, or integrate FB's commenting system. The second flare-up is the evolving regulation in the European Union around the use of cookies.

In this post I want to cover the second issue, implications of some of the still-evolving EU cookie regulations. Though Europe is our primary focus, regardless of where you are located you'll learn about web privacy, data collection, optimal tool decisions and how best to plan your data strategy.

Since this is not a blog about legal issues (and I'm not a lawyer!) I will not focus on whether these regulations are good or great or how you can comply with them or what tracking you can and can't do. Please reach out to a local technology lawyer who can help you navigate those issues. Please work hard to ensure you're in compliance with local laws.

What I want to do is cover the implications of whether the use of cookies is permitted or not. My hope is to simply help you internalize the impact of these decisions on reports, which metrics might be impacted and which will be fine, as well as what types of decisions you can still make with confidence and which decisions you might make with a grain of salt.

Web Data Collection Context: Cookies and Tools

Before we go forward let's set some context and try to get on the same page with some terms that we'll use for the rest of this article.

Cookie:

A cookie is a small text file placed on your computer. Cookies contain a small amount of anonymous information that allows a website to know that you have visited in the past, responded to a campaign, or had x items in your cart (which allows the site to retain those items in your cart the next time you come), and other such uses.
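If you have never looked inside one, here is a small Python sketch that reads a cookie the way a program would. The cookie name, value and domain are invented for the example, and describe_cookie is my own helper name; only the SimpleCookie class comes from Python's standard library.

```python
from http.cookies import SimpleCookie


def describe_cookie(set_cookie_value):
    """Return the value, domain and lifetime of each cookie in a Set-Cookie value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return {
        name: {
            "value": morsel.value,
            "domain": morsel["domain"],
            "max_age": morsel["max-age"],
        }
        for name, morsel in jar.items()
    }


# Hypothetical cookie: an anonymous id, the domain it belongs to, and a lifetime in seconds.
example = "visitor_id=abc123; Domain=example.com; Path=/; Max-Age=63072000"
print(describe_cookie(example))
```

Notice that there is no name or email address in there, just an anonymous id plus a few settings that tell the browser where and how long to keep it.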

Cookies are always set on behalf of the website owner. For example, they'll explicitly implement a tracking solution (like SiteCatalyst, comScore), typically via JavaScript tags, or an advertising solution (like DoubleClick, Kenshoo) or social buttons/commenting systems.

First-party Cookies:

These are cookies set on your behalf under your own domain. For example, when you visit this site, Google Analytics (which only uses first-party cookies) will set a small text file on your computer. This small text file (the cookie) can only be read on the browser it was set on, and only on this website. In other words, it only tracks what you do here and nowhere else.

Cookies are never permanent; they can disappear for any number of reasons. But first-party cookies, because of the above behavior, are the most persistent in the sense that they are rejected least often, they are preserved the longest, and they are the least likely to be deleted by "cookie cleaners."

Third-party Cookies:

These cookies are set on your behalf via a different domain. For example, if this website used Facebook's commenting system, then every time you visit this site your behavior would be tracked. But because this cookie (and your anonymous data in the cookie) is not set on this domain (kaushik.net), your behavior across other sites can also be tracked by this cookie. So if after you visit this website you visit Gawker and then CatsThatMakeAnalystsLaugh.com, then that behavior will also be tracked by Facebook.

Data in the cookie is anonymous, but when you go back and visit facebook.com, the privacy policies on facebook.com dictate how that data, along with your behavior on facebook.com, is used.

Third-party cookies are used by tons of providers. Perhaps the most common users are advertising platforms (Yahoo!, DoubleClick, Microsoft and more). Third-party cookies are critical when it comes to behavior targeting / ad re-targeting tools which rely on the ability to tie one person's behavior across multiple websites.

Because of the behavior described in this section, third-party cookies are the least persistent. They are more often rejected due to default browser settings, user choices, cookie cleaners, etc.
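The first-party versus third-party distinction boils down to comparing the domain a cookie belongs to with the site the visitor is actually on. Here is a rough Python sketch of that comparison. The is_first_party name is my own, and the simple suffix check is an assumption for illustration: real browsers apply more careful rules (for example around shared suffixes like .co.uk).

```python
def is_first_party(page_host, cookie_domain):
    """True if the cookie's domain is the visited site or a parent of it."""
    page_host = page_host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().lstrip(".")
    return page_host == cookie_domain or page_host.endswith("." + cookie_domain)


# A visitor is reading a page on www.kaushik.net.
print(is_first_party("www.kaushik.net", "kaushik.net"))   # cookie set under the site's own domain
print(is_first_party("www.kaushik.net", "facebook.com"))  # cookie set under another company's domain
```

The first call returns True and the second returns False. The same facebook.com cookie would also be sent from every other site that embeds the same widget, which is what makes cross-site tracking possible.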

Some web analytics tool providers still use third-party cookies, and often their customers are unaware. This is quite inadvisable. Check what type of cookies your web analytics tool uses. With third-party cookies your data is so sub-optimal that, no matter how much the pain, you should immediately take steps to shift from third-party to first-party cookies.

Web Analytics Tools:

Tools like Google Analytics, CoreMetrics, Baidu TongJi, StatCounter etc., whose primary purpose is to measure user behavior on one website: Yours!

Advertising Analytics Tools:

Tools that you implement along with, say, display advertising or social features on your site. The primary purpose is to provide you a service (ads, comments plugins, social buttons) and track user behavior *across sites.*

This is a lot of perhaps complex information. But it is important, if we get into privacy/data collection discussions, that we understand these basic terms. Without them it is impossible to understand what regulation should take place and what the implications of these regulations (or our privacy controls) are.

Remember: Cookies are an important part of understanding user behavior on a site or multiple sites. But even if you are not using cookies, or digital analytics tools, behavioral data of your users is still being logged by your website's servers in web log files.

European Privacy Regulations: Implications

I'm not a lawyer, nor do I play one on TV :), so I'm not going to opine on the why's and the how's of the law. Please consult with a lawyer in your local legal jurisdiction.

At the moment there is a central EU e-Privacy directive that is in various stages of interpretation and implementation by individual EU member countries. The directive requires obtaining consent prior to tracking. One specific portion of the directive hence applies to the usage of cookies.

Since the interpretation is a bit unique across EU members, a site located in the UK might ask for a different permission, using a different method, and permit tracking of different things than a site in Germany or Holland or Spain.

Broadly speaking, there seem to be four buckets of implementation currently underway in Europe when it comes to the cookie part of the law. Let's look at just one thing: Implication of each implementation (Government requirement) on the data you collect and the web data analysis that you'll be able to do.

1. No change in the law related to cookies.

La vita è bella.

Life continues as normal. Focus on picking the best web metrics, actually do analysis of that data, toil day and night to deliver superior web experiences and improve digital profitability of your company.

Please assign someone in the company to stay in very close touch with government regulations and recommendations. Please ensure that your privacy policy is transparent, up-to-date about the data you collect and the choices your website visitors have for not being tracked.

PS: Oh and if your web analytics tool uses third-party cookies, switch to first-party, and if it can't use first-party then it is time to say sayonara to the tool.

2. No change to first-party cookies. Third-party cookies require opt-in.

A number of EU countries are in this bucket. Implications?

For first-party cookies: The data you are collecting with your web analytics tool about user behavior on your site is just fine. Use this data to make life better for everyone. [If your web analytics tool is using third-party cookies (boo!) then everything below applies to you.]

For third party cookies: When users visit your website, you'll present them with a notice to opt-in to being tracked using third-party cookies. They can choose yes or no. You'll ensure your web serving platform remembers that setting and acts accordingly.

If the user accepts the cookie, there is no impact on your data.

If the user says no to being opted-in: In most cases ads, social commenting systems etc, that use third-party cookies, will continue to work. Ads will show up, comments will be accepted. But there will be an impact on your advertising analytics solutions.

The data you'll be able to collect will be worse than it was before (and remember, it was fragile before). The number of impressions, click-through rates, conversions, view-thrus and other metrics reported by your ad platforms will be less precise than they were before.

If you use behavior targeting/ad re-targeting/remarketing solutions then your effectiveness with these solutions will be reduced. That's simply because retargeting/re-marketing relies on leveraging third-party cookies to observe a person's behavior across websites and then deliver optimally targeted content/ads.

If you use social commenting platforms provided by third parties, the user will see that you no longer remember them.

If the users don't accept third-party cookies, then on some platforms you won't be able to track their behavior away from your site. On one social platform they very cleverly track this behavior: Bonita sees your post on the social platform. Bonita then visits your website, which was in your social post. Bonita then comments on your post. Bonita then also clicks on the social share button and shares your post back on the social platform. This tracking primarily works today because of third party cookies. This might not work going forward so you won't be able to analyze this behavior.

The full impact will depend on the digital advertising analytics tool you are using. Please call your account representative at your tool's vendor. Use the information above to ask specific questions. (For a minority of vendors it is very difficult to get straight answers.) Adjust your data analysis strategy accordingly.
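The rule for this second bucket (first-party cookies untouched, third-party cookies only after opt-in) can be sketched in a few lines of Python. This is only an illustration of the logic your web serving platform would need to remember and apply; the function name, the tag names and the choice to treat "not asked yet" the same as "no" are all my assumptions, not something any regulation or vendor prescribes.

```python
def may_set_cookie(cookie_type, third_party_opt_in):
    """Bucket 2 rule: first-party cookies are unaffected, third-party cookies need an opt-in.

    third_party_opt_in is True (said yes), False (said no) or None (not asked yet).
    """
    if cookie_type == "first-party":
        return True
    if cookie_type == "third-party":
        return third_party_opt_in is True
    raise ValueError(f"unknown cookie type: {cookie_type}")


# Hypothetical tags on a site and the kind of cookie each one relies on.
tags = {
    "web analytics": "first-party",
    "ad retargeting": "third-party",
    "social comments": "third-party",
}

for choice in (True, False, None):
    allowed = [name for name, kind in tags.items() if may_set_cookie(kind, choice)]
    print(choice, allowed)
```

With a "no" (or no answer yet), only the web analytics tag may set its cookie. The ads and comments can still be served, they just can't rely on their third-party cookies.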

3. Both first-party and third-party cookies require asking users for permission/opt-in.

A couple of EU countries are going to be in this bucket. It is not completely clear if a "no" to cookies means that zero tracking can be done (or if only cookies can't be stored). Assuming the opt-in is just for cookies (i.e., data stored on a customer's browser)…

If the person opts-in, nothing changes. Both your web analytics and your advertising analytics tools are going to report the data fine.

If the person rejects the request and does not opt into setting cookies in their browser then one of these two scenarios will occur:

1. Some web analytics tools will still collect the data ("hits" really). They will then try to "intelligently stitch" the session together. Since each "hit" is reported to the tool (via the JavaScript tag), a bunch of data comes through even if the cookie data does not. So these tools will use the anonymous browser id and the IP address and other such elements to do something like:

"All these hits look like they are coming from the same browser, their time stamps are really close to each other and while we don't have cookies these 'hits' start at time x and finish at time x+5 and so they must be one visit. We'll stitch the hits together and report that as one visit."
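For the technically curious, that reasoning can be sketched in Python. This is not how any particular vendor does it: the stitch_visits name, the use of IP address plus browser id as the grouping key, and the 30 minute gap that ends a visit are all assumptions I picked for illustration.

```python
from collections import defaultdict


def stitch_visits(hits, max_gap_seconds=1800):
    """Group cookie-less hits into visits.

    hits: iterable of (ip, user_agent, unix_timestamp) tuples.
    Hits sharing an IP and user agent are treated as one browser; a pause
    longer than max_gap_seconds starts a new visit.
    Returns a list of ((ip, user_agent), [timestamps]) pairs, one per visit.
    """
    by_browser = defaultdict(list)
    for ip, user_agent, timestamp in hits:
        by_browser[(ip, user_agent)].append(timestamp)

    visits = []
    for browser, times in by_browser.items():
        times.sort()
        current = [times[0]]
        for timestamp in times[1:]:
            if timestamp - current[-1] > max_gap_seconds:
                visits.append((browser, current))
                current = [timestamp]
            else:
                current.append(timestamp)
        visits.append((browser, current))
    return visits


# Hypothetical hits: three close together, one much later, one from another browser.
hits = [
    ("203.0.113.7", "UA-1", 0),
    ("203.0.113.7", "UA-1", 60),
    ("203.0.113.7", "UA-1", 300),
    ("203.0.113.7", "UA-1", 10000),
    ("203.0.113.7", "UA-2", 30),
]
visits = stitch_visits(hits)
print(len(visits))
```

You can see why this is an informed guess: two people behind the same office IP with identical browsers would be stitched into one visit.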

All the data is anonymous (as it would have been in case of a normal web analytics tool as well, almost all of whom don't collect personally identifiable information).

Implication on the Visits metric: You get the best case scenario of a Visit. Not perfect, but close enough.

Implication on Unique Visitors: Since the cookies are not stored, every time the user of that browser comes back to the site he will be identified as a New Visitor. So Unique Visitor counts will be imprecise. By how much will depend on how many people exhibit that behavior.

Implication on New and Returning Visitors: See above, these numbers will be imprecise (with New Visitors being overstated). For this reason you can see why metrics like Recency and Loyalty will also be wrong.
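To get a feel for how far off Unique Visitors can be, here is a quick back-of-the-envelope calculation in Python. The numbers (100 people, 20% of them without cookies, three visits each) are purely hypothetical, and in real life you would not know them, which is exactly the problem.

```python
def reported_unique_visitors(people, share_without_cookies, visits_per_person):
    """Unique Visitors a tool would report when some browsers store no cookie.

    People with cookies are counted once; people without cookies look like
    a brand new visitor on every visit.
    """
    without_cookies = round(people * share_without_cookies)
    with_cookies = people - without_cookies
    return with_cookies + without_cookies * visits_per_person


# Hypothetical: 100 real people, 20% reject cookies, each of those visits 3 times.
print(reported_unique_visitors(100, 0.20, 3))  # 80 counted once + 20 counted 3 times = 140
```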

If your web analytics tool is exhibiting the above behavior then there will be little to no impact on data for dimensions like referring websites, keywords etc and metrics like time on page or total page views. There will be some impact on metrics like time on site or total page views (because remember the "visits stitching" is a very informed guess).

Please check with your vendor if they are using this method. To the best of my knowledge, only a couple do.

2. Most web analytics tools will detect an inability to use cookies (after the user says no when presented with a choice in the opt-in) and they will not collect any data for that browser/visit. The motivation is to present you data that is clean (or as clean as it would normally have been) and that you can confidently analyze.

Google Analytics falls into this bucket.

So the behavior of people who opt out of cookies won't be measured or represented in the data. Hence your data will represent less than the total.

If first-party cookies are not accepted then the number of people not accepting will remain an unknown unknown. Remember, they are opting out of being tracked (and that includes tracking the fact that they don't want to be tracked).

Also remember, as I'd mentioned in the opening, your web server is still likely collecting all the hits (requests). Web servers don't, by default, set cookies and hence don't have that information in the web logs. But information like IP address, browser user agent id, time stamps, page urls and much more are recorded in web logs. These logs can be parsed using freely available web log parsing solutions.

While reports from your web log parser won't give the type of robust reporting you can get from a SiteCatalyst or Yahoo! Web Analytics, you can still report a lot of user behavior using these web logs. Make sure that your privacy policy clearly states this to your users, and please consult with a local law expert for guidance.

4. [Nuance on the cookie issue:] Cookies are fine, IP addresses can't be collected or only partly collected.

One European country, and perhaps more in the future, is in this bucket. The government has said that IP addresses should be considered as personally identifiable information (PII) and not collected by web analytics tools.

If this is you, then the report that will be most impacted is the Geography/Location report. It will show imprecise data. This will be regardless of the type of cookies you are using.
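If "only partly collected" means dropping the last part of the address (that is my assumption here; your tool and your local rules may define it differently), the effect looks like this small Python sketch. The mask_ipv4 name and the sample address are made up, and the sketch only handles IPv4.

```python
import ipaddress


def mask_ipv4(address):
    """Zero out the last octet of an IPv4 address, keeping only the /24 network."""
    network = ipaddress.ip_network(f"{address}/24", strict=False)
    return str(network.network_address)


print(mask_ipv4("203.0.113.77"))  # 203.0.113.0
```

Up to 256 different addresses collapse into one value, which is why the city-level rows in a Geography/Location report get less precise.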

In Google Analytics this report is located in: Audience > Demographics > Location.

If your web analytics tool allows you to report out IP Addresses (Google Analytics does not) and match it back to companies, etc., then you won't be able to do that precisely as well.

All other data should be fine.

Those are the four key scenarios that we are dealing with at the moment. I hope you understand better the implication on your web analytics and digital advertising analytics tools.

Quick Repeat Summary of Cookies.

The current crop of web analytics tools rely on cookies to more accurately identify a unique browser.

They typically don't track a person. If you use three browsers on your computer then you appear as three unique visitors to a web analytics tool. (Solutions like Google Analytics explicitly prohibit you from collecting personally identifiable information.)

The rare exception might occur if, say, you log into all three browsers using a unique login id you'd created with the company. In that case the company has a choice to use that login to match back all three anonymous cookies to one person. But this happens in the company backend (say a CRM system or a Data Warehouse) and not in the web analytics tool.
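Here is a small Python sketch of that backend matching, just to show the difference between counting browsers and counting people. The cookie ids, the login id and the count_browsers_and_people name are all hypothetical; a real CRM or data warehouse join would be more involved.

```python
def count_browsers_and_people(cookie_ids, cookie_to_login):
    """Return (unique browsers, unique people) for a set of anonymous cookie ids.

    cookie_to_login maps a cookie id to a login id when that browser has
    logged in. Browsers that never logged in are counted as separate people.
    """
    browsers = set(cookie_ids)
    people = {
        ("login", cookie_to_login[c]) if c in cookie_to_login else ("cookie", c)
        for c in browsers
    }
    return len(browsers), len(people)


# Hypothetical: one person logs in from three browsers; a fourth browser never logs in.
seen = ["ck-chrome", "ck-firefox", "ck-safari", "ck-other"]
logins = {
    "ck-chrome": "member-42",
    "ck-firefox": "member-42",
    "ck-safari": "member-42",
}
print(count_browsers_and_people(seen, logins))  # (4, 2)
```

The web analytics tool on its own only ever sees the first number.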

There was a story recently in the New York Times about Target doing that to create unique people profiles. Again, that would happen in the company's backend systems and not the web analytics tools.

Closing Context: Don't "freak out" about the missing data.

If you think back to how we've measured the effectiveness of marketing in the past (or even today for TV, Radio, Magazines, Newspapers, Billboards, etc.), you'll realize that what we call measurement is essentially a glorified faith-based initiative.

If you think back to how we've measured user experience (using observational studies, lab tests, follow-me-homes, etc.), you'll realize that we used observation of 10 or 100 or 1000 people to extrapolate to what millions of our actual users do.

Now if you consider the data you are collecting with your digital analytics solutions, you'll most definitely marvel at how much data we have. A lot. 80%? 90%? 60%? It is a ton more than from any other channel.

So you can cry about all the data you won't have or don't have. Or you can be happy that you still have 5,000 times more than you have on any other channel on the planet and analyze that data and use the insights. If in the past when you had 100% of the data you might have made a big massive huge data-driven decision, now you might just make a big massive data-driven decision.

Still better than the faith that powers offline advertising, right?

Don't let the quest for perfection stop you from making a decision today based on good enough.

Ok, it's your turn now.

If you live in Europe, how are you adapting to the implementation of the cookie directive in your country? Have you noticed an impact on your advertising analytics or web analytics solution? Is your company paralyzed, or still using the "good enough" data (I know it can be hard for Sr. Managers)? If you live in the rest of the world, do you understand cookies better now? If you are a technology expert, what's missing from the article above? Anything you would change?

Please share your feedback, critique, kudos and delightful perspectives via comments.

Thanks.

PS: A couple of bonus items for you…

#1: This article outlines everything, in simple English, you ever wanted to know about how GA uses cookies. The cookie names, exactly what they do, when they expire and more: Cookies & Google Analytics

#2: My privacy policy states in simple English what I track and how you can opt out of every analytics solution used on this website, and more: Occam's Razor Privacy Policy

#3: Two of my favorite articles about cookies: Why The Guardian uses cookies. The New York Times' cookie page.

Enjoy!

Comments

  1. 1
    Adam Wilsher says:

    Hi Avinash.

    Here in the UK there has been an awful lot of hand-wringing about the analytics implications and impact of this Law, but the ICO (governing body in the UK) has at last come up with a workable solution that recognises 'Implied Consent'.

    So now, websites can have a small pop-up that points visitors to a Privacy Policy that includes a cookie section, but no action has to be taken by the visitor if they do not want to, and merely clicking onto the next page implies that you accept the cookies (see http://www.bbc.co.uk for a good example. Clear your cookies first if you are a regular visitor, and I do not know how the box displays if you are from out-with the EU)

    This will negate the previous interpretation, whereby visitors actually had to tick a box to say "Yay – Give me the Cookies!" (or similar) and which was resulting in a 90 – 95% reduction in GA reporting stats – a nightmare for any web-traffic monitoring service.

    So, the UK has come up with a pragmatic and workable solution that treads a fine line between Privacy and Operations. A lot better than the other options available!

    Adam

  2. 2

    If there is one thing I want after this article it is that someone makes http://www.catsthatmakeanalystslaugh.com/ a thing :)

    Seriously though, in the UK we had an odd situation where the Government organisation implemented the ruling in a hurry, stated there would be a one year window in order to get it right and then spent the next year issuing conflicting advice on how to do it.

    In the end we have a situation where Analytics cookies can be given on the basis of what I've been calling 'implied, informed consent' whereby the user has the opt out option much more readily available to them and the information about how the site uses cookies much clearer and in plain English, not lawyer speak. This I think is what the law was designed to be about – getting the users more aware of what is going on rather than just blindly opting in or out.

    The trouble is that it will probably be abused and the Government body (the ICO) will probably be fairly powerless to do anything about it. Plus they have far more pressing matters to deal with anyway – the companies who deliberately sell, 'lose' or accidentally lose personal data including credit card information. Really they should be given more power and personnel to deal with these companies rather than a broader remit on something that historically only a minority of users care about.

    Cheers,
    Alec

    • 3

      Alec: Thanks for sharing your perspective with the situation in the UK, it does sound like it has been a bit of a pudding.

      Hopefully the ICO will read your comment on this blog. :) Providing an easily accessible, and simple, opt-out option supported with a plain English privacy policy sounds like a sensible thing to do (for all websites around the world).

      On a spectrum of threats to privacy to people on the web, cookies are perhaps lower on the level of threat. You point out some of the higher level threats. But cookies have managed to become the bogeyman.

      -Avinash.

  3. 4
    Matthias Bettag says:

    Hi Avinash,

    You're brave to tackle this complicated topic in a blog post ;-) I totally agree with your "don't freak out" recommendations.
    Just my two cents, probably three:

    – Some countries have not yet set the EU directive into national legislation (e.g. Germany). As an EU directive is something like a "make this a law"-call to all member countries, it has to be interpreted by them, which may result in up to 27 different national laws. As we see already now.

    However, the German Data Protection Officers say that the directive is already a law. Guess what? (not only) the industry is now waiting for a lawsuit to see how this is actually handled. Confusing? At least not boring..

    See an EU wide overview here: http://www.ffw.com/pdf/Cookies%20tracking%20table%20-%20April%202012%20-%2019672675_1.pdf (PDF)

    IMPORTANT: This is not a recommendation for doing nothing, just one of several reasons why things are so complicated.

    – The EU commission has decided three months ago to reform the existing EU directive. This will not replace existing national laws. However, I expect an EU regulation which is (other than an EU directive) automatically a law in all member countries without individual interpretations by each country. To be expected in 2014. (PS: high time to discuss this now http://www.digitalanalyticsassociation.org/?page=sig_european)

    – Be careful when focusing only on cookies! The privacy directive (or regulation) is about user recognition, regardless by which technology. And it includes any further data processing.

    – PII is not necessarily only about having a full name and postal address. Also pseudonymous data must be treated in a way that decoding it to personal data is not directly possible (e.g. data-storing in a different place than the PII data). Connecting different data sources to each other is only allowed when both data sources are pseudonymous (pseudonimized? forgive my wording..) AND the data-owner is the same legal entity (it is insufficient to be in the same holding). Also, this data correlation requires user opt-in upfront.

    – As far as I experience most companies are not ready for this, even if they try their very best to be compliant. Global players and international audiences on the same digital channel are not easy to align towards one national law, worse for several. Tools (mostly) do not offer out-of-the-box functionalities to have a waterproof privacy compliant implementation. Data processing "behind" webanalytics is a different thing and not covered when a user only agrees on placing a cookie.

    Generally speaking, companies are requested to:

    – Explain to its users what they do (and don't) with user data. And why. (not easy)
    – Doing this in a transparent way. (very hard, systems are complex and the user audience includes my mum..)
    – Let a user control his/her data (incl the "right to be forgotten" -however this can be managed technical wise- and allowing changes of former opt-in/opt-out decisions).

    It is necessary to get opt-in / user consent before collecting and working with user data. Sometimes this data will be anonymous and/or pseudonymous and then implicit consent might be given without a dedicated opt-in checkbox (or whatever).

    The industry will probably have to offer the user something in return in order to use its data. Getting user opt-in might be a new way of doing website optimization (or whatever digital channel it is).

    • 5

      Matthias: I agree with you, this is far from boring! :)

      I also agree that any final cohesive central European privacy directive will cover more than cookies (let's hope so!). Let us also hope, and pray, that they'll clearly outline what PII really is, and what can be done in terms of connecting that data up to create "user profiles." And should that be limited to behavior on one site, or across the internet.

      Narrowing down the scope to just website analytics (collecting of data on one site by the site's owner), I hope that the law will require: 1. Transparent and simple presentation of text that explains what's collected and 2. Simple strong choice (browser plugin?) to respect the user's choice.

      This is doable today.

      When we stretch the scope beyond single-website data analysis it gets complicated, and you've wonderfully captured it in your "generally speaking" section. Thank you!

      Avinash.

  4. 6

    It's been an interesting few weeks in the UK watching how the interpretation of the cookie laws has been rolled out. As to be expected, major public companies have conformed very quickly, a good example is http://www.bt.com which presents you with an obvious pop up box on first visit.

    In our own directory market, we've watched with interest how the market leaders have reacted, and again, the PLC big boys have done it to the letter of the law. Then you follow the scale down across the 2nd and 3rd tier directories, and it becomes more obscure, from the less obvious down to the blatant "we're doing nothing".

    So as a small business, which course do you steer? We decided on the information route and let the end user decide, with an obvious Cookie Policy in the header, and explaining how to be rid of our cookies, if a visitor felt they could no longer continue.

    But in the current climate, is this really the priority we should be dealing with? With continual Google updates and the ongoing pressure to ensure a squeaky clean marketing strategy, and the day to day pressure on running a business in these challenging times, I think any effort to implement the cookie laws will be well received. But with government departments under ever greater budget squeezes, I can't see how they will be enforced, and as with the things like the TPS (Tel Privacy) under the Privacy and Electronic Regulations, I suspect it will simply become another bit of toothless legislation.

    • 7

      Steve: I concur with you that policing the implementation (and as Michael points out what happens after the implementation) is going to be hard. Of course we first need clarity on what we are supposed to be doing in the first place.

      If I was running a small business I would take my cues from the big boys (let's say BT in this case) and immediately follow them. They have very expensive lawyers looking at this, let's take advantage of that. :)

      Again as a small business owner I would spend less time worrying about cookies (such a minor problem in the grand scheme of things) and more of it analyzing the data and taking fast decisions about the business and winning! That is one thing big boys are very slow at!!

      Avinash.

  5. 8

    Great article Avinash!

    Two concerns of mine:

    1: In today's world the non-trackable audience is mixed within the trackable audience. For instance, unique visitors per article (or product) is a mix of both trackable and non trackable people, causing the total number to be skewed (and useless).

    I agree that it is better to have 'some' analytics, but unless we can differentiate between them, it's not really useful. This is my biggest grief about the cookie law. It warps the data, but we don't know by how much. If 20% aren't trackable but each person visits three times, then out of every 100 people our analytics will report 140 unique visitors, but I don't know that it is 100 people in the first place, so I have an unknown unknown. As an analyst that scares me.

    Also, if we could differentiate between the trackable and non-trackable audience, what if we find that the trackable audience has a conversion rate of 3%, but the non-trackable audience has a conversion rate of 22%? Now we have a disconnect between the result in the data 'we know' and what happens by the people we have no idea about. So we have a known unknown (we know that our unknown audience has a 22% conversion rate) that causes an unknown unknown (we don't know who or what those 22% really are) ;)

    The problem is that we only get a sample size in which we do not know if it is representative of the whole.

    This is one of the reasons why I don't track my subscribers using cookies. As you know, I use a unique URL.

    2: There is a funny thing about the actual implementation of the cookie law on different sites. It doesn't connect with the browser itself – that is, if people choose to not accept cookies, they set it on the site instead of telling the browser to block them.

    That means the cookie is technically still possible to set… and GA (and others) have no idea that people have chosen not to be tracked.

    On the BBC, for instance, they show you a page where you can define what cookies you want to allow (or disallow). They then 'record' your choices in… a cookie :)
    http://www.bbc.co.uk/privacy/cookies/managing/cookie-settings.html

    Then if you delete the cookies in your browser, BBC suddenly no longer knows that you previously defined not to be tracked, and promptly reset your choice to 'allow tracking'.

    It's just a mess :)

    • 9

      Thomas: You are smart to pick up on the unknown unknown problem. As an Analyst I would *love* to at least know how many are in that bucket, so I can at least program my models accordingly. But I can see how users might say: "hey what part of don't track me don't you understand!" :)

      Over the long term the right path for this whole thing is to provide website visitors an incentive to log-in. For example you can track me insanely on Baekdal Plus because I'm always logged in (because I trust you). And if a decent number of people are logged in then their data is more than sufficient for you to make smart business decisions.

      On the second point, the technical implementation is different, but there is a call the site owner can send to Google Analytics saying that the person has said no, and then GA will not track them. It will not even record that they said no. That visitor simply goes away, so you get nothing for those people in GA.

      Perhaps the web analytics tools the BBC is using don't provide them options beyond using cookies to know that you said "don't track me." But it is possible to do it without using cookies. For example the Google Analytics Opt-out is a browser plugin. So even if you blow your browser cookies away Google Analytics will remember that you said "don't track me." It won't track you.

      I don't like this whole issue, in that we are not actually solving the real problem for users. But when the users are supposed to be given a choice, let us as an industry, including the BBC, resolve to give them a real non-fragile choice.

      -Avinash.

      • 10

        I did know about the GA opt-out plugin (and also that Adblock filters out GA as well, much to my annoyance ;)) But I didn't know GA had a variable to 'ignore if told'. Interesting.

        Of course, this whole thing is quickly becoming a non-issue. We won't be able to track people with cookies anyway once mobile takes over.

        On the iPad and iPhone, for instance, clicking on a link in the Twitter, Facebook or G+ app opens up the site in an embedded web browser within that app. And that browser does not have access to any previously stored cookies (it is essentially a new private browsing session).

        Meaning that if you click on a link in Twitter to this site, you are identified as a new previously unknown visitor. Then if you 10 seconds later go to the FB app, click on another link also to your site, you show up again as a *new* previously unknown visitor.
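A small simulation makes the effect concrete. It assumes each in-app browser keeps its own empty cookie jar, which is a simplification of the behaviour described above; the cookie name and ID format are invented.

```python
import itertools

VISITOR_COOKIE = "visitor_id"      # hypothetical cookie name
_next_id = itertools.count(1)      # stand-in for a random visitor ID generator


def visit(cookie_jar: dict) -> str:
    """Return the visitor ID for this jar, minting a new one if none is stored."""
    if VISITOR_COOKIE not in cookie_jar:
        cookie_jar[VISITOR_COOKIE] = f"visitor-{next(_next_id)}"
    return cookie_jar[VISITOR_COOKIE]


# One person, two apps, two separate cookie jars.
twitter_webview = {}
facebook_webview = {}

ids_seen = {visit(twitter_webview), visit(facebook_webview)}
print(len(ids_seen), "unique visitors reported for 1 real person")
```

The site sees two unrelated first-time visitors, which is the same inflation effect as the untrackable audience discussed earlier in the thread.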

        With the mobile trends looming, this will be a much bigger problem than the EU cookie law.

        We need a new solution, and as you say, creating a stronger relationship with the reader (through logins or other returning mechanisms) is probably the way forward :)

  6. 11

    Hi Avinash,

    Some weeks ago I read an article in a Dutch marketing magazine regarding "cookie-less" analytics. The essence was that, based on the IP address, you can guess where the visitor is located. If you know that, then you know a lot about the visitor (in the Netherlands there's a lot of information available at post-code level). However, as you indicated, IP addresses might not be collected in more countries in the future.

    My question: will those kinds of methods (via IP addresses) be the roadmap for the future? Or, more generally, what will be the direction of developments? Because people are creative and will find new solutions for tracking web visitors.

    Especially for Dutch and Flemish readers, here's a guide how to deal with cookies (http://bit.ly/Ag5KTi).

    -Sander

    • 12

      Sander: It is always dangerous to predict the future, but in this case it is not that much of a danger.

      Relying on IP addresses to track a person holds little value today, and it will become even worse over time. That's because even today IP addresses are dynamic for such a large percentage of the population. In many companies it is not unusual for the public IP of the entire company to appear as one and the same.

      So IP addresses are decent at guessing a person's location. But they are an extremely poor choice for tracking individual user behavior (i.e. a replacement for cookies).
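Both failure modes can be shown with made-up data. The names and addresses below are invented (the IPs come from the ranges reserved for documentation), and the grouping function is only a sketch of what "one IP equals one visitor" would imply.

```python
from collections import defaultdict


def people_per_ip(hits):
    """Group the people seen behind each IP address."""
    groups = defaultdict(set)
    for person, ip in hits:
        groups[ip].add(person)
    return dict(groups)


# Hypothetical hits: three colleagues share one office IP,
# and one home user gets a different dynamic IP on each visit.
hits = [
    ("alice", "203.0.113.10"),
    ("bob", "203.0.113.10"),
    ("carol", "203.0.113.10"),
    ("dave", "198.51.100.7"),
    ("dave", "198.51.100.23"),
]

by_ip = people_per_ip(hits)
real_people = {person for person, _ip in hits}
print(len(by_ip), "visitors by IP vs", len(real_people), "real people")
```

Here three people collapse into one "visitor" and one person splits into two, so even when the totals land close to reality the individual behaviour attached to each IP is wrong.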

      -Avinash.

  7. 13
    Mary Kay Lofurno says:

    Hi Avinash,

    Very useful. I appreciate it.

    You mention the data on server logs. As you know and stated in your books, the industry has swung to JavaScript-focused solutions. That is not to say packet sniffing or log-based programs are not used in tandem; it just depends on the solution.

    So what about in regards to applications that are server side call based [just an example - SiteCore's Digital Marketing System it sits on its own server like the old grand-daddy web trends]? What do you say then in regards to implications of the scenarios you discuss above?

    • 14

      Mary Kay: I'm unfortunately not aware of how SiteCore works. But if it works like the old/current WebTrends web log parsing solution…

      Weblogs don't collect enough data (certainly not enough about a person because of their reliance on old technologies), don't provide flexibility in terms of the types of data that need to get collected (so less rich tracking), and don't have the kind of up-to-date segmentation and analysis tools.

      So they are not an optimal choice for robust web data analysis. Using javascript driven solutions (including the suite from WebTrends) is still a better choice.

      -Avinash.

  8. 15
    Megan says:

    Avinash,

    You work for Google (Analytics), which is a third party web analytics tool, and yet you say

    "With third-party cookies your data is so sub-optimal that no matter how much the pain, please immediately take steps to shift from third-party to first-party cookies."

    Should everyone on Google Analytics delete their account? Why doesn't Google offer a hosted solution like Urchin anymore?

    -Megan

    • 16

      Megan, (correct me if I'm wrong Avinash) as far as I understand it, Google Analytics is not using third-party cookies.

      It is essentially an analytics platform that is hosted for you in the cloud – just like Urchin used to be.

    • 17

      Megan: This post is complex and perhaps that's why a couple of wires got crossed, per your comment.

      There are two types of web analytics tools broadly speaking (for a definition of web analytics tools please see the post): in-house hosted and cloud based. Urchin, as you mention, was an in-house hosted solution. WebTrends is another famous one. Adobe Site Catalyst, IBM CoreMetrics, Google Analytics are cloud based solutions.

      Where you host your web analytics tool has nothing to do with cookies.

      Unless you have built your own, and some people do, all web analytics solutions you use are third party solutions. I.E. they are sold to you by someone else.

      Now let's bring these two threads together…. You can decide if you want to use first-party or third-party cookies with your hosted or cloud based web analytics tool.

      Google Analytics only supports first-party cookies. Hence the implications outlined in this article for first-party cookies apply to it.

      There are other solutions, hosted or cloud based, that use third-party cookies. I'm encouraging you to switch those to using first-party cookies.

      I hope this clarification is helpful.

      Avinash.

  9. 18
    Paul Carysforth says:

    Good Article and agreed brave to tackle this!

    We've been working with a lot of our clients in the UK primarily to make sense of this (as much as it's possible!)

    In terms of approaches then Dave Chaffey did an interesting audit http://goo.gl/VNyQO which categorised the types of approaches UK companies have taken since May 26th.

    Based on this and our own research it basically falls into two camps.

    1) A detailed cookie policy page detailing cookies used but no ability to opt-out easily – the site doesn't give you the capability quickly and easily

    2) The site offers the ability to opt out of different categories of cookies – strictly necessary, functionality, performance and advertising. Barclays, BBC and BT are good examples. These sites have usually displayed some kind of initial message when visitors hit the site for the first time.

    These categorisations in point 2 now appear to be the standard which is great as it at least provides some consistency. The question is then what you argue analytics falls under! Is it strictly necessary? You could argue that analytics is 'strictly necessary' to make your business competitive online! I know we would agree with that!

    There is a smattering of companies that fall into a third type, and that is those that have gone with the all-or-nothing route like the ICO did. Network Rail is one and Natwest within their cookie policy area is the other (very brave!). Not sure how you run a site with a 0 or 1 route to cookie acceptance, as it was acknowledged quite early that this was just not a practical approach.

    The implied consent route has really covered off the biggest worry of the no cookie setting until the user has agreed. We can pre-set and certainly for strictly necessary providing the user is informed.

    There are though some big questions about how mobile fits in (most companies forgot about that!) and how you operate your cookie policy for a site that has visitors from multiple countries. Not easily at all.

    From a consumer point of view though the question is does anyone care?! From what we've seen literally only a handful are clicking on cookie policy pages. They're just not interested or not interested in the current approach. I do worry that the solutions created are very inward looking. Has anyone asked whether they know or care about the difference between 'functionality' or 'performance' cookies honestly!?

    It's quite interesting how the Guardian has approached the informed route – click on the 'discover how guardian uses cookies' on this page http://www.guardian.co.uk/info/cookies. Not sure whether this is the answer, but interesting nonetheless.

    On a final point regarding data collection my personal view is that persistent third party cookies are actually the real issue here and analytics has just fallen under the microscope. There just isn't enough regulation in my view. The TED talk from Gary Kovacs on this point was a real eye opener and one that's worth 6 minutes of anyone's time! http://goo.gl/Y6Va0.

    Cheers
    Paul

  10. 19
    Josh Braaten says:

    Thanks for explaining all the implications, Avinash. This privacy legislation always seems to go too far, and it makes me sad. When I think of the bounds of privacy laws, I try to think of the brick-and-mortar equivalent of a particular law to see if it makes sense or not.

    As far as the cookies go, if I were a shop keeper and you were my customer, I would recognize you when you entered my store, I would be aware of your preferences, and I would even be apt to recommend things to you because I know you as my customer. These behaviors would win me the adoration and continued business of the majority of consumers because it would be a better experience.

    With the EU cookie legislation, having to ask permission to get to know your customers seems like it extends beyond the norms of business customs. I hope this level of extreme legislation doesn't make it to the U.S. I know you didn't want to turn your post into an op/ed, but I'd love to hear your take on whether this was a good idea or if it went too far. I'll take my comments off the air :)

  11. 20

    @Josh Braaten

    That's my favourite sound bite on the topic to date.

    • 21
      Josh Braaten says:

      @Jono why thank you! I'm glad someone appreciates what I hope comes across as common sense and matter-of-fact. If only legislation followed suit…

  12. 22
    Rachit says:

    Hi Avinash,

    :) This is a great topic. Web analytics and cookies have a strong relationship (and will have, at least for the foreseeable future). I'm surprised not to see two terms in the post or in the comments so far: P3P and cross domains.

    If one moves to a first-party solution (which I think is probably inevitable in the future), how would one measure users across domains reliably? GA has what I'd call somewhat of a solution, described here

    https://developers.google.com/analytics/devguides/collection/gajs/gaTrackingSite.

    The problem I see with this approach, is, if the user doesn't click on a link of a page on siteA.com and go to siteB.com, it'll be very difficult to link the connection of the user between these two domains (as no _ut* values on siteB).
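The limitation can be sketched as follows. This is not Google Analytics code: the `vid` parameter, the helper names and the ID format are all hypothetical, and the sketch only illustrates the general idea of carrying a first-party visitor ID across domains inside the link that was clicked.

```python
import itertools
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

LINK_PARAM = "vid"                 # hypothetical query parameter carrying the ID
_next_id = itertools.count(1)      # stand-in for a random visitor ID generator


def decorate_link(url: str, visitor_id: str) -> str:
    """Append the visitor ID to an outbound link to the other domain."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    query[LINK_PARAM] = [visitor_id]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


def land_on(url: str, cookie_jar: dict) -> str:
    """Adopt the ID from the URL if present, else reuse or mint a first-party ID."""
    params = parse_qs(urlparse(url).query)
    if LINK_PARAM in params:
        cookie_jar["visitor_id"] = params[LINK_PARAM][0]
    elif "visitor_id" not in cookie_jar:
        cookie_jar["visitor_id"] = f"v{next(_next_id)}"
    return cookie_jar["visitor_id"]


# First-party cookies are per domain, so each site has its own jar.
site_a_jar = {}
id_on_a = land_on("https://siteA.com/", site_a_jar)

# Case 1: the visitor clicks a decorated link from siteA to siteB.
clicked_jar = {}
id_after_click = land_on(decorate_link("https://siteB.com/pricing", id_on_a), clicked_jar)

# Case 2: the same visitor types siteB.com directly, so no ID travels along.
typed_jar = {}
id_after_typing = land_on("https://siteB.com/pricing", typed_jar)

print(id_on_a == id_after_click, id_on_a == id_after_typing)
```

Only the click path keeps the two visits connected; the direct visit produces a fresh ID on siteB, which is the gap the comment points out.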

    Thoughts?

    Thanks!

    • 23
      Colby says:

      Rachit: I am glad you asked the question. I was seeing the same issue as I read through it. I would really like to see your thoughts on this one Avinash. As always, excellent post…even though it may cause more questions than answers hehe.

    • 24

      Rachit: If you want to track behavior across multiple websites that you own (so in a web analytics context and not an advertising analytics context) then you pick the best of the sub optimal options.

      Google Analytics, SiteCatalyst, CoreMetrics: everyone works to expand your tracking across multiple domains you own. It comes with some cons, as you mention, and you get to decide how bad that is for you.

      If you switch to using third party cookies for this purpose, consider two things:

      1. How much drop in quality will you suffer on your primary website tracking (this is what this blog post warns a lot about) and

      2. How much additional data you'll get by having the con (people have to click from one site to another) in order to track them with first party cookie tracking

      Often the value of switching to third party cookies in this case is simply not there, even after accounting for the minor loss of only being able to track people who click from one of your sites to another. But in your case that might be different, and now you know how to decide. :)

      Avinash.

  13. 26

    Hi Avinash, one question – one comment:

    Q: What if you are an international member organization, based in the EU, with a site targeting an audience from the US, Africa, Asia and the EU? Which interpretation of the cookie directive should be followed? The strictest, the median or the most liberal approach?

    C: Are you sure GA does not capture IP-addresses? For some clients we had to add this line: _gaq.push (['_gat._anonymizeIp']); in order to make the tracking compliant with German law (even for sites without .de so also for .com sites with content in German).

    Failing to do so can lead to the request to shut down the existing GA profile and open a new one, thus losing all your valuable historic data.
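For readers wondering what IP anonymization amounts to, here is a rough sketch of the masking idea. It assumes the commonly documented behaviour of zeroing the last octet of an IPv4 address and the last 80 bits of an IPv6 address; it is an illustration only, not Google's implementation, and the function name is made up.

```python
import ipaddress


def anonymize_ip(ip: str) -> str:
    """Zero the host part of an address before it is stored.

    Assumption: keep the first 24 bits of IPv4 and the first 48 bits of IPv6.
    """
    address = ipaddress.ip_address(ip)
    prefix = 24 if address.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


# Documentation-range addresses, used purely as examples.
print(anonymize_ip("203.0.113.87"))
print(anonymize_ip("2001:db8:abcd:12:34:56:78:9a"))
```

The masked address is still good enough for coarse geography, but it no longer points to a single connection.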

    • 27

      Jeroen: With regards to your first question… you can most likely never get into trouble with the most conservative path in this type of a scenario. But as always please check with your local lawyer for optimal advice.

      Google Analytics complies with all local legal jurisdictions, including those that place restrictions on collecting the complete IP address or just the first five or just… as the case may be. If you want to learn more about what applies to your jurisdiction please work with a GACP (GA consulting partner).

      Avinash.

      • 28
        Jeroen says:

        Thanks for your reply.

        Nobody wants to go for the strictest scenario as demoed here cookiedemosite.eu – only Lithuania or Latvia (not sure) have chosen this scenario.

  14. 29
    Charlie Wang says:

    Avinash, great explanation on the impacts of the new regs!

    I am from the China market, which is on the opposite end of the spectrum. Privacy and consumer protection are nowhere near as mature as in the EU or the U.S. As a result, some local companies with the right relationships can get some off-the-wall data straight from the ISPs, which in the States would probably cause public outrage :)

  15. 30
    Martin Tallett says:

    Hi Avinash,

    Thanks for this perspective on the UK's new cookies laws. I am a contractor working in and around Birmingham UK and I would really like to be able to explain to users of my website what the cookies are all about (and comply with the law ;)) I work in IT and I have no way of knowing exactly what information GA is collecting on my behalf or where it is being stored.

    I have a very simple website for my business which uses Google Analytics – martintallett.com

    What I am looking for is a cookies statement that is a bit like this one from the UK Government
    http://www.direct.gov.uk/en/SiteInformation/Cookies/DG_WP201870
    or this one from Barclays, who are a well known bank in the UK
    group.barclays.com/cookies_policy.html

    They both have a list of cookie names to start with and to comply with UK and EU Data Protection legislation I am looking for …

    – a list of cookies that are used by Google Analytics when one of my customers uses my public website
    – what information is collected by Google Analytics in each cookie
    – why it is collected
    – where it is stored ie in what country, especially if this is outside the UK or even the EU

    Can anyone help here ?

  16. 31
    amitaimk says:

    Hey Avinash,

    Thanks indeed for sharing. It was pretty informative.

    Regards
    Amitaimk

Trackbacks

  1. [...]
    EU Cookie / Privacy Laws: Implications On Data Collection And Analysis, kaushik.net
    [...]

  2. [...]
    EU Cookie / Privacy Laws: Implications On Data Collection And Analysis, http://www.kaushik.net
    [...]

  3. [...]
    Avinash Kaushik discusses Europe’s recent “cookie /privacy” laws and the implications for digital data collection and analysis at Occam’s Razor.
    [...]

  4. [...]
    The EU Cookie law was brought up right at the end and we didn’t have enough time to go into it in any depth. It’s quite a complex and fuzzy issue at the moment and, let’s face it, largely unresolved. Here are some links to material that might help businesses with their thinking around cookie law compliance. These are the guidelines. Some good examples of how companies are approaching the EU Cookie Law. Implications for data collection and analysis
    [...]

  5. [...]
    Finally, no discussion of Google Analytics would be complete without a few words from our old friend Avinash Kaushik. And his look at the recent EU cookie privacy laws and their impact on Google Analytics is well worth a read if you do any business in Europe. Great overview with practical insights (not that we expect any less from the Divine Mr. K).
    [...]

  6. [...]
    Whether you are already doing remarketing or are looking to start, you need to ensure that you are compliant with the cookie law. Avinash Kaushik wrote a brilliant post on the EU Cookie Law / Privacy Laws and the implications on data collection and analysis.
    [...]

  7. [...]
    Upon finishing this course, students could develop an understanding of the theory and principles of information retrieval and gain a deeper understanding of how search engines work. They could also explore the variety of Web search services and analyze the publicly available Web data for academic or business intelligence. The students will also gain an appreciation for the Web’s many complex social and legal.
    [...]

  8. [...]
    The way remarketing works is based on cookies. You only need to visit a website that uses the remarketing method and a cookie will immediately be saved on your computer. However, the use of cookies is closely tied to the collection of personal information, so before starting a remarketing campaign I would suggest taking a closer look at the "Cookie law". It is well described and commented on in A. Kaushik's blog, in the article EU Cookie / Privacy Laws: Implications On Data Collection And Analysis.
    [...]

  9. [...] EU Cookie / Privacy Laws: Implications On Data Collection And Analysis: http://www.kaushik.net/avinash/eu-cookie-privacy-law-data-analysis-collection/  [...]
