It seems that every good web analyst / practitioner / director / vp’s wish list of a perfect web analytics tool starts with a desire to get “real-time” data.
The thinking is that with the fast pace of the web, where everything changes all the time, real-time data is mandatory if you want to take advantage of all that the web has to offer, given its ability to cough up so much data.
This customer desire seems to be so pervasive that every little and big web analytics vendor prominently advertises how real-time their data is. Someone says I can do every five hours, the next guy says I see that and I raise you three hours and the next gal says you guys are sissies because I can give you real real-time and give you live traffic streams.
But is getting real-time data really relevant? Do you really need it? What’s the cost of getting real-time data?
This spoils the surprise but the answer to the first and second question is mostly no. Since most of us think it is yes it makes the answer to the third question a little bit scary (and rather sub optimal for most businesses in terms of impact).
So what is the typical impact of getting data near real-time (roughly defined as faster than every couple hours)? Here are the five that we have generally observed:
1. Much more reporting, much less analysis: We already live and swim in a world of too much data, so much that we really have a hard time finding any actionable insights from what we have even after we hack at it for hours and days. Real-time data usually worsens that by giving you even more data faster and you are left to find the proverbial needle in the haystack (a non trivial task as you can imagine).
2. Detrimental impact on resource allocation: It is a common theme in the industry that we don’t apply the 10/90 rule. One reason is that it is hard to find the right people with the right skills for the job. But a secondary reason is that in the world of Web Analytics we have a lot of very complex data that can never reconcile to anything else. Now imagine what happens with real-time data.
Our finite resources now have to make sense of all this, pardon me, mess but with less time on hand and provide insights. Almost always because of real-time data there is a negative impact on the resources and bandwidth allocation because there is organization and management pressure to justify return on investment (remember real-time data is not really free, you pay to have access to data that fast).
3. Choice of sub optimal web analytics solutions: This one is really common. Web Analytics tools are chosen based on complicated all encompassing RFP (request for proposal) processes (here’s an alternative suggestion for selecting a tool). Everyone wants everything so these things are usually a joy to behold. : ) Usually top of that list is “need data in real-time” (remember the mindset, who would not want data real-time; it’s like asking someone “when did you stop hitting your spouse”, it’s a lose-lose).
The impact is that the committee that is narrowing from 200 tools to 2 will reject any vendor that is not “real-time” because that is a deal breaker. Most often this means that lots of tools that might have met other important criteria (say advanced segmentation or integration with other sources) get kicked out. In the end you might end up choosing a tool that is real-time (and expensive) yet in a few months when we are smart enough to dig deeper we’ll find limitations.
Let me relate a personal story. Everyone wanted real-time data and that is what the big three vendors were selling as well (including data almost real time streamed over to pagers and smart phones). Yet we chose the tool we have because of all the features it brings to us; it can’t do real-time and we don’t care. It is much cheaper to boot (software, hardware). Another story, this time from a friend’s company, is that their team wanted real time PPC / SEM (pay per click / search engine marketing) data and they simply decided to take it all outside the company and created a data / decision making silo that did not have an end to end view and optimized for that silo (usually this is a sub optimal scenario).
4. Increased complexity in systems and processes: Most practitioners don’t realize that real-time is not just buying a powerful web analytics tool. There are other collateral requirements.
1) If you have an in-house solution then real-time means having to buy increasingly powerful machines (usually multi-cpu and loads of memory) that can capture data and process it fast enough to make it available in real-time for you to use it. [Remember that you can’t actually use raw data logs (web logs or javascript tag based).]
2) In order to pull real-time off we will also have to implement increasingly complex processes inside and outside the company.
In your company for example you’ll have to have faster processing schedules implemented, allocate some resource (maybe 0.25 person) to watch and make sure everything happens as expected, and finally implement reports that run on the processed data and deliver it to humans.
From an outside perspective you’ll have to put processes in place that will pull data from outside sources (say adwords or affiliates). This adds more steps and complexity into your systems / processes, complexity that is often ignored and not considered by marketing folks but it is complexity that inserts a non-trivial cost into the ecosystem.
5. False sense of confidence: There is not much to say here except that sometimes you’ll observe a false sense of confidence that all is well with the world because we have real-time data streaming into our blackberries. Of course this is not every organization. But it exists more than we might prefer. This false sense of confidence means that we are less likely to look at what the real cost is of getting the data and what is the downside.
In summary the impact of real-time data is that you will pay more for your web analytics tools than might be optimal, you’ll fuel a culture that will do more reporting than analysis and you will end up adding complexity to your systems and supporting processes which in turn will add lots of hidden costs.
Did you realize this? What is the true “cost” of your real time data? Do you disagree with the five impacts outlined above?
[Update: For more context on real-time, in light of all the recent developments with big data (!), please see this post: A Big Data Imperative: Driving Big Action. Valuable video, plus rule #4.]
This obviously does not mean that you should never want real-time data. Here is a simple check-list to use to judge if your organization is ready for real-time data and increase the odds that you will get enough bang for the, end-to-end, increased bucks you’ll spend:
1. “Statistical significance”: You get enough visitors to your website that you can make statistically significant decisions using real-time data. You not only have to get enough overall traffic to the site but you also have to get enough data in the segments for which you want to make real-time decisions.
For example if you want to make real-time decisions about marketing promotions or adwords campaigns then do you get enough traffic and outcomes (orders / leads) to make a statistically significant decision? If you get 13 visitors and 2 outcomes from two different campaigns every four hours then you probably can’t make a confident decision comparing that to anything else.
Statistical significance is not just about raw numbers; you don’t need a million visitors a day to get significance. But you do need enough visitors exhibiting the behavior you are looking for, and for them to do it often enough every hour for you to separate signal from all the noise.
2. Good analytical capabilities: You can not only capture data real time but you have dedicated analysts who can analyze the data very quickly to find nuggets of valuable insights by looking not just at one piece of data but end to end. For example they would not only notice that we got lots of clicks on this new creative from Google / Yahoo PPC campaigns but this traffic is also placing more orders for the right products than other sources of traffic.
Along with analytical capabilities you also need people who have optimal business acumen (maybe super optimal). Numbers, no matter how fast they come at you and in how much quantity, by themselves won’t help you make good decisions. For that you need people with good business acumen (as defined by people who understand your business really well, have a great grasp of your web ecosystem, have lots of common sense).
As a wise person :) said reporting is not analysis!
3. Diversified & Empowered decision making structure: Does your company have a decision making structure where a “front line” analyst can make decisions and authorize / execute changes based on data? Do you require VP approval before web pages go on or off? Do you need a HiPPO to sign off on promotions / campaigns changes?
For action to be taken from real time data decisions need to be made fast. Usually it will be your Analyst or Marketing Manager observing these statistically significant differences. Often these kind folks don’t actually have the authority to stop or green signal anything based on data. That happens via a company labyrinth that needs to be navigated.
If a front line analyst can make and execute decisions, and the answer to the two approval questions above is No, then you are all set, you are empowered and ready.
4. Awesome website / structural operational execution capabilities: Your company has a web operations team that can execute on a dime. They are able to push out the right creative, remove non performing promotions, change the adwords strategy, update landing pages, change email blasts that are already in the queue, send different instructions to your ad / search / affiliate marketing agencies who can also make changes very quickly.
Essentially if it takes you two days to execute changes to your website / campaigns / agencies then the value of real time might be really questionable.
Four extremely simple rules / requirements. If your organization’s capabilities meet all of the above requirements then you are well set to gain an advantage from getting your data real-time. But if even one of the above requirements is not met then it is perhaps more the case that you want to know (real-time) because you want to know and not to take action. That knowing can be extremely expensive (people, process, $$$) and distracting.
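To make the first requirement (statistical significance) concrete, here is a minimal sketch of one way to check whether two campaigns really differ. It uses a two-sided Fisher exact test, which is my choice for illustration and not something the tools discussed here necessarily do. The function name and the order counts are hypothetical; only the 13 visitors and 2 outcomes echo the example in the checklist.

```python
from math import comb


def fisher_exact_two_sided(orders_a, visitors_a, orders_b, visitors_b):
    """Two-sided Fisher exact test p-value for orders vs. no orders in two campaigns."""
    total_visitors = visitors_a + visitors_b
    total_orders = orders_a + orders_b
    denominator = comb(total_visitors, total_orders)

    def probability(x):
        # Hypergeometric probability that campaign A receives exactly x of all orders.
        return comb(visitors_a, x) * comb(visitors_b, total_orders - x) / denominator

    lowest = max(0, total_orders - visitors_b)
    highest = min(visitors_a, total_orders)
    observed = probability(orders_a)
    # Add up every possible split that is at least as unlikely as the observed one.
    p_value = sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # small tolerance for floating point ties
    )
    return min(1.0, p_value)


# Hypothetical four-hour window: 13 visitors per campaign, 2 orders vs. 1 order.
small_window = fisher_exact_two_sided(2, 13, 1, 13)
# Hypothetical longer window with the same kind of gap but far more traffic.
large_window = fisher_exact_two_sided(200, 1300, 100, 1300)
print(f"small window p-value: {small_window:.3f}")
print(f"large window p-value: {large_window:.2e}")
```

With the made-up small window the p-value comes out at 1.0, so the data cannot separate the two campaigns at all. The larger window falls far below the usual 0.05 threshold, which is the "enough visitors doing it often enough" condition in practice.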
Real-time is perfect in one scenario: when there are micro decisions that an automated system can make based on rules that humans can input. In this scenario some, but not all, of the above issues become less critical. Data helps technology to react real-time to create unique customer experiences. More on this in a future post.
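As a rough sketch of what "rules that humans can input" might look like, the snippet below picks an experience for each visit from an ordered list of rules. Every name in it (`Visit`, `Rule`, `choose_experience`, the experience labels) is hypothetical and invented for illustration; it is not how any particular vendor's product works.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Visit:
    source: str         # e.g. "ppc", "email", "direct" (hypothetical labels)
    is_returning: bool  # True if this visitor has been seen before


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Visit], bool]  # condition written by a human
    experience: str                   # what the site should show on a match


# Rules are checked in order; the first match wins.
RULES = [
    Rule("returning visitor", lambda v: v.is_returning, "welcome_back_offer"),
    Rule("paid search click", lambda v: v.source == "ppc", "campaign_landing_page"),
]


def choose_experience(visit: Visit, rules: Sequence[Rule] = RULES,
                      default: str = "default_home_page") -> str:
    """Return the experience of the first matching rule, or the default."""
    for rule in rules:
        if rule.matches(visit):
            return rule.experience
    return default


print(choose_experience(Visit(source="ppc", is_returning=False)))
print(choose_experience(Visit(source="direct", is_returning=True)))
print(choose_experience(Visit(source="direct", is_returning=False)))
```

The point of the design is that people decide the rules ahead of time and the system applies them per visit, so no human has to read a report before anything happens.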
What do you think? Are there other requirements for an organization ready to leverage data real time? Is this post off base? Something missing from the analysis outlined above? I welcome your feedback and critique.
I think much of this goes back to your original post about choosing your provider. A company ought to know *why* they are asking for the things they say they want in an RFP. They need to use the right tool for the right job. As you implied, there are uses for real-time data, but it is not the be-all/end-all that some companies seem to think it is. For better or worse, it is my understanding that real-time still has a huge wow factor in the sales cycle, and that steering the decision makers to see the stuff that "matters" is sometimes tough.
I have to tell you I started reading this with a feeling of dread, but I think you laid out a very good method for deciding whether real-time data is useful for one's business.
I can only think of a few instances where real time data is worth it…
1. You had a big release the night before and need to make sure that the site is performing okay.
2. Sales, new accounts, etc… are way down on the day and you cannot find a compelling external reason why.
3. You need to make an immediate evaluation of a marketing program that day so that you can decide to keep, kill, or overhaul that same buy tomorrow.
If you have built your reporting infrastructure correctly, you have some secondary way of checking all three of those reasons. Even if you don't, those situations don't happen often enough for most companies that it would justify real time reporting as an expensive add on.
Well said (written?) Avinash. I think this was a great topic for a post. I've often thought that there is far too much emphasis on real-time data when I don't have the time to analyze and act on it, and my website development team has a backlog of work and usually takes a week or so to implement changes requested. I've yet to find myself in an organization that can take advantage of real time data even though items 1 and 3 (statistically significant amount of data and empowered decision-making structure) have not usually been barriers.
Avinash, right as usual … there are few companies that have their act together enough to utilize real time data (I have never seen one in real life, but in theory they are out there). I have dismissed real time data since the late 90's in favor of a scalable architecture. Thanks for a cogent writeup that I will use as an education for the analytics-disconnected people who usually write the RFPs.
Avinash, great post and I agree with the concept; however, I think you left off one final reason for real-time: if you've developed systems that react to data effectively. Amazon, for instance, is likely using near real-time data to determine what others viewed/purchased for its recommendation displays (they don't have to, I guess, but useful for new products). When I've written about meme marketing, I only considered CGM as the primary methodology. However, AGM (analytics generated media, or better, APM, analytics preferenced media) should work best of all. Admittedly, it's just a different POV on items 3 and 4 above, but definitely one worth considering. In either case, the key consideration is how actionable the data is. If you can't action it immediately, why do you need real-time?
Tim: You are absolutely right. I had obliquely referred to a similar point that you made in your comment. Here's the excerpt from the post:
We have some simple ways in which we react to the customer as they browse our website or as they come back on a returning visit or show them this if their OS is a or b and on and on.
There are some fairly sophisticated systems in the market now that will do AGM/APM. But none of them are Web Analytics vendors. So if that is a space we want to explore, and we should, then we should not rely on anything real-time from our current Web Analytics solutions because the key piece, allowing near instantaneous reaction, is missing from all of them.
Maybe our vendors are working to fix this as I write this, would make a great extension of their platform!! :)
Thanks so much for your insightful comment.
-Avinash.
Avinash,
I completely agree that real-time is an overrated attribute in a lot of situations, and your points on statistical significance and capacity to react are well made.
However, I think there's also an awkward human factor that can't be ignored here.
We've seen a lot of situations where people execute changes to their websites/online marketing and are champing at the bit to see the impact. The fact that they can now see the impact of what they did generates the enthusiasm and is the social factor that moves them from passive reporting to actually caring about the numbers.
Sure, if they sat down the next day and looked at the right information they'd have a more objective, statistically significant and structured review of the impact. But by then the enthusiasm is lost, there's something more interesting going on elsewhere, and the feedback is at best diluted.
Equally, when key performance indicators do suddenly move out of trend, I'm constantly amazed by the short term memory loss some companies exhibit in terms of all the things they have done at the same time that might have had an impact.
So whilst real-time (or "fast-turnaround-time") is of limited value in a mature analysis feedback cycle, it can still be a key motivator for getting teams engaged with results and building a desire to connect development/marketing actions with analysis of their impact.
Maybe that's the real reason why it appears in so many RFPs…
One of the challenges as a vendor/consultancy we experience is persuading marketing teams that they need to look at less data, less often, but think about it more. Not as sexy as a graph that moves every second for some reason!
Andrew: You do make a great point. Enthusiasm is often under-rated, here is Mr. Godin saying that better (in a slightly different context) than I ever will:
http://tinyurl.com/yep2eo
We have had a lot of success getting people to bet small amounts on our a/b – MVT tests just to get everyone in the game. And they really get into it and they ask for data constantly and want to see if they "won"!! :)
In context of Web Analytics the difference is getting data right away this very instant for some change we launch or waiting around 15 hours. Every single vendor provides us "yesterday's" data. So at any given time your web analytics data is at most 8 to 16 hours "old" (data gets processed at midnight and let's say you come to work at 8 am and leave after eight long hours of work).
My point of view, in the post, is: it should be ok to wait for eight hours to get the "latest" data and do so cheaper, with less impact on your systems and people and processes, and increase the chances that it will be a bit more meaningful.
I want to stress that real-time data is not completely unwanted and unloved. See Tim Peter's comment above for one excellent application.
Thanks so much for adding your valuable perspective to the discussion.
-Avinash.
I've read through this thread a few times and have been hesitant to respond. My hesitancy stems from a definition of the "real-time" paradigm. I completely agree with you that, if the goal is to simply analyze (no intention to minimize "analysis" with that phrasing), then there is no point and is possibly harm in using a real-time system.
However, if the point is to respond to visitors in real-time (the requirement being to engage in an active "conversation", a true interaction) then a real-time system is a necessity. I also recognize that my use of "real-time" may not be relevant to what is posted here, so I'll go back to my lurking at this point. – Joseph
Tim has it right. Close the feedback loop automatically and you realize the value of real-time. Your argument isn't necessarily wrong (yes, you always need a cost/benefit analysis to see whether real-time is needed/worthwhile for a given app), but it is beside the point.
This is not a web-analytics example but illustrative: every modern control system (autopilots, cruise control etc.) is a real-time feedback system. The more complex ones are real-time analytics-feedback loops (for example, guidance systems that use Kalman filters).
The medium is the message. Ask not what real-time can do to your current analytics-based business model. Ask what you can do to your business model to take advantage of the possibilities of real-time. Real-time is primarily a creative/synthetic thinking challenge, not a problem-solving challenge. I am surprised you didn't discuss Twitter and real-time search. Or maybe not surprised :).
Seriously though, real-time+automatic feedback creates a whole new tech paradigm. We just need to get our creative juices flowing around real-time thinking as opposed to batch thinking. Gladwell has an interesting feature in the New Yorker about this…
Beyond basic feedback control, you can think in terms of real-time learning algorithms (eg. for Netflix recommendations or Amazon recommendations). Once you get rich sensor-web capabilities online via smartphones etc., and even actuators, the whole web will need a definite real-time overlay. I am a control engineer who started working on web systems, so perhaps I am overstating the case (a good read is Feedback Control of Computing Systems by Joe Hellerstein of IBM).
Venkat
Real time data from an analysis point of view is hardly relevant except for several situations which other commenters here have pointed out. However, real time data is critical if you have the capacity to act on it in real time and the ability to offer a unique experience to the user that enables him/her to more quickly and easily do what they came to the site to do (and what you want them to do).
In short…
– real time for analysis, not relevant.
– real time for personalization and optimization, relevant.
My work is more in advocacy and blogging with an occasional foray into e-commerce systems. There are no actionable decisions going to get made in real-time: I'm often lucky if my clients will sit down with the reports I put together every few months (I've been getting a lot of one-line "looks great! thanks! Sent from an iPhone" responses lately!)
But it seems like this is not unlike the live-streaming switch we're seeing with Facebook and Twitter. I have found that the focus on immediacy has the tendency to draw my attention from the more important long-term work.
Typical scenario: I'll make a video or will write something. It will go up to Youtube, a blog, out to Twitter, up to my Facebook personal account and Fan Page. Within a few days my followers have left comments and retweeted. I'm in that warm glow of instantaneous feedback from people I know. But the audience I care about is actually not the people who follow me, but the people who I wish would follow me. They're not as likely to see the post immediately. They'll be coming in one at a time over the next few months or even years. My goal is to convert them into regular followers. You only see the mass of these stumblers in the aggregate when looking at data over time.
I find when finance is in charge of analytics 'real time' data always seems to be a requirement. Reality is using that data may not give you enough of a sample size and it may mean you're not letting your page/design live long enough.
Venkat: At the end of my post I do mention briefly (it was more fleshed out in the book :)) that if you can eliminate humans from the decision making process then real time data can be useful. As in doing behavior targeting or like Google does (and I am sure Yahoo and MSFT do) when it uses your past behavior or your current searches to optimize your results.
With regards to twitter and real time search (i.e., not a consumer looking for real time stuff but rather you as a company using real time search data to do something) I think you are falling in the trap again. What would you do if you had that data? Is it just that you want to know what is happening, or do you have the capacity to do something (either using automated systems or manual systems that can react very fast on your interpretation of the data)?
In every case actionability makes the decision for me. In 99% of the cases there is a yearning for real time data but absolutely no capacity to react to it. Then I would much rather the finite resources be used to do strategic non-real time analysis.
Tony: You are right, and it is because they are used to getting a streaming ticker of information to make decisions, often in real time (think currency arbitrage).
Web analytics is ten million miles away from that kind of a need or that kind of an action.
Knowing is not enough. The priority should be on knowing what you can action in a timely manner.
-Avinash.
I was led to this post from Web Analytics 2.0. So who says books aren't interactive?
Apart from the obvious advantages of providing real-time data to other technologies, there's one "group" of users which have not been mentioned.
And that's web sites that are focused on dissemination of news, such as newspaper front pages (online) and portals.
Disclosure: I work with a product that provides exactly that, real-time click data to editors.
Workflow involves pushing out a large number of stories in an optimal way and responding to real-time data to decide the optimal positioning and life-time of each story.
Of course, some of the requirements outlined by Avinash still hold, like you need a certain amount of traffic for the data to be useful, and that data must be available in a readable form to the person actually executing the changes needed.
I am considering writing a blog post on real-time usage in this segment, if there's any interest in it.
Agree. Real time data is the decider. It's similar to what you spoke about the 'data quality trap'. We already have enough to act on as it is and it's good to ask what incremental value such a paid tool adds.
I was talking to an IT consulting firm's marketing manager the other day and he was talking about 'going for something solid such as omniture instead of free tools like GA'. I asked him to elaborate and it was clear his conclusion wasn't thought out based on specific needs or comparison of features – it's the common 'paid is better than free' heuristic.
I was more curious about another comment he made.
He said he has hired temp interns for the analysis and reporting. In your experience is that common? Under what conditions would such a choice be a wise one? B2B companies rely heavily on their site content to help prospects decide to sign up for their services and Web analytics typically will not be a one time project. That's why I was surprised.
Krishnan: You are right, Google Analytics is right for some people and Omniture is right for others and Yahoo! Web Analytics is better for some other folks. It is important to understand what the needs for an organization are and then pick the vendor that meets those needs.
Here are two posts that might help in that process:
~ How to Choose a Web Analytics Tool: A Radical Alternative
~ Web Analytics Tool Selection: 10 Questions to ask Vendors
With regards to resources… it is highly unusual for any company that is serious about web analytics to leave it to Interns or Temps (not that there is anything wrong with the Interns or Temps). I have a simple rule for investment in web analytics:
~ The 10 / 90 Rule for Magnificent Web Analytics Success
To the extent possible any company (for profit, non profit, b2b, b2c, b2anythingelse) should over invest in smart brains because at the end of the day the smart brains will make a difference and not the tools.
Avinash.
Google has Real time analytics now! wohoo :)
But yeah, it helps you monitor but decision taking based on real time data is still questionable.
It can surely have a big impact, when you have some new campaigns, mostly in B2C environment!