It seems that every good web analyst / practitioner / director / vp’s wish list of a perfect web analytics tool starts with a desire to get “real-time” data.
The thought is that with the fast pace of the web, and everything changing all the time, real-time data is mandatory if you want to take advantage of all the data the web coughs up.
This customer desire seems to be so pervasive that every little and big web analytics vendor prominently advertises how real-time their data is. Someone says “I can do every five hours,” the next guy says “I see that and I raise you three hours,” and the next gal says “you guys are sissies because I can give you real real-time and give you live traffic streams.”
But is getting real-time data really relevant? Do you really need it? What’s the cost of getting real-time data?
This spoils the surprise, but the answer to the first and second questions is mostly no. Since most of us think it is yes, the answer to the third question is a little bit scary (and rather suboptimal for most businesses in terms of impact).
So what is the typical impact of getting data near real-time (roughly defined as faster than every couple hours)? Here are the five that we have generally observed:
1. Much more reporting, much less analysis: We already live and swim in a world of too much data, so much that we have a hard time finding any actionable insights in what we have, even after we hack at it for hours and days. Real-time data usually makes that worse by giving you even more data, faster, and you are left to find the proverbial needle in the haystack (a non-trivial task, as you can imagine).
2. Detrimental impact on resource allocation: It is a common theme in the industry that we don’t apply the 10/90 rule (for every $10 you spend on tools, spend $90 on the people who analyze the data). One reason is that it is hard to find the right people with the right skills for the job. But a secondary reason is that in the world of Web Analytics we have a lot of very complex data that can never reconcile to anything else. Now imagine what happens with real-time data.
Our finite resources now have to make sense of all this (pardon me) mess and provide insights, with less time on hand. Real-time data almost always has a negative impact on resource and bandwidth allocation, because there is organizational and management pressure to justify the return on investment (remember, real-time data is not really free; you pay to have access to data that fast).
3. Choice of suboptimal web analytics solutions: This one is really common. Web Analytics tools are chosen based on complicated, all-encompassing RFP (request for proposal) processes (here’s an alternative suggestion for selecting a tool). Everyone wants everything, so these things are usually a joy to behold. : ) Usually top of that list is “need data in real-time” (remember the mindset: who would not want data in real-time? It’s like asking someone “when did you stop hitting your spouse?”; it’s a lose-lose).
The impact is that the committee that is narrowing from 200 tools to 2 will reject any vendor that is not “real-time” because that is a deal breaker. Most often this means that lots of tools that might have met other important criteria (say advanced segmentation or integration with other sources) get kicked out. In the end you might choose a tool that is real-time (and expensive), yet in a few months, when you are smart enough to dig deeper, you’ll find its limitations.
Let me relate a personal story. Everyone wanted real-time data, and that is what the big three vendors were selling as well (including data streamed almost real-time to pagers and smart phones). Yet we chose the tool we have because of all the features it brings to us; it can’t do real-time and we don’t care. It is much cheaper to boot (software, hardware). Another story, this time from a friend’s company: their team wanted real-time PPC / SEM (pay per click / search engine marketing) data, so they simply took it all outside the company and created a data / decision-making silo that did not have an end-to-end view and optimized for that silo (usually a suboptimal scenario).
4. Increased complexity in systems and processes: Most practitioners don’t realize that real-time is not just buying a powerful web analytics tool. There are other collateral requirements.
2) In order to pull real-time off we will also have to implement increasingly complex processes inside and outside the company.
In your company, for example, you’ll have to implement faster processing schedules, allocate some resource (maybe 0.25 person) to watch and make sure everything happens as expected, and finally implement reports that run to present all the data to humans.
From an outside perspective you’ll have to put processes in place that will pull data from outside sources (say AdWords or affiliates). This adds more steps and complexity to your systems / processes, complexity that is often ignored by marketing folks but that inserts a non-trivial cost into the ecosystem.
5. False sense of confidence: There is not much to say here except that sometimes you’ll observe a false sense of confidence that all is well with the world because we have real-time data streaming into our BlackBerries. Of course this is not every organization. But it exists more than we might prefer. This false sense of confidence means that we are less likely to look at what the real cost of getting the data is, and what the downside is.
In summary the impact of real-time data is that you will pay more for your web analytics tools than might be optimal, you’ll fuel a culture that will do more reporting than analysis and you will end up adding complexity to your systems and supporting processes which in turn will add lots of hidden costs.
Did you realize this? What is the true “cost” of your real time data? Do you disagree with the five impacts outlined above?
[Update: For more context on real-time, in light of all the recent developments with big data (!), please see this post: A Big Data Imperative: Driving Big Action. Valuable video, plus rule #4.]
This obviously does not mean that you should never want real-time data. Here is a simple check-list to use to judge if your organization is ready for real-time data and increase the odds that you will get enough bang for the, end-to-end, increased bucks you’ll spend:
1. “Statistical significance”: You get enough visitors to your website that you can make statistically significant decisions using real-time data. You not only have to get enough overall traffic to the site, but you also have to get enough data in the segments for which you want to make real-time decisions.
For example, if you want to make real-time decisions about marketing promotions or AdWords campaigns, do you get enough traffic and outcomes (orders / leads) to make a statistically significant decision? If you get 13 visitors and 2 outcomes from two different campaigns every four hours then you probably can’t make a confident decision comparing that to anything else.
Statistical significance is not just about raw numbers; you don’t need a million visitors a day to get significance. But you do need enough visitors exhibiting the behavior you are looking for, and you need them to do it often enough every hour for you to separate signal from all the noise.
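To make that concrete, here is a minimal sketch of the kind of check you might run before acting on a real-time difference between two campaigns. It uses Fisher's exact test, which is one reasonable choice for small counts and not necessarily what your tool does. The function name, the second campaign's numbers and the conventional 0.05 threshold are all assumptions made up for illustration.

```python
from math import comb


def fisher_exact_two_sided(conv_a, n_a, conv_b, n_b):
    """Two-sided Fisher exact p-value for conversions in two campaigns.

    conv_a / n_a: outcomes and visitors for campaign A (likewise for B).
    """
    total_conv = conv_a + conv_b
    denom = comb(n_a + n_b, total_conv)

    def prob(k):
        # Hypergeometric probability that A gets exactly k of all the outcomes.
        return comb(n_a, k) * comb(n_b, total_conv - k) / denom

    observed = prob(conv_a)
    k_min = max(0, total_conv - n_b)
    k_max = min(n_a, total_conv)
    # Add up every possible split that is no more likely than the observed one.
    # The small tolerance guards against floating point ties.
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical four-hour window: 13 visitors each, 2 orders vs 4 orders.
p_small = fisher_exact_two_sided(2, 13, 4, 13)

# Same conversion rates, but with 100 times the traffic (also hypothetical).
p_large = fisher_exact_two_sided(200, 1300, 400, 1300)

ALPHA = 0.05  # conventional threshold, an assumption rather than a rule
print("small window significant?", p_small < ALPHA)
print("large window significant?", p_large < ALPHA)
```

With 13 visitors per campaign, one campaign converting at twice the rate of the other is still indistinguishable from noise. The identical rates at 100 times the traffic clear the threshold easily. Same story, different amount of data.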
2. Good analytical capabilities: You can not only capture data in real-time but you also have dedicated analysts who can analyze the data very quickly to find nuggets of valuable insight by looking not just at one piece of data but end to end. For example, they would notice not only that we got lots of clicks on this new creative from Google / Yahoo PPC campaigns but also that this traffic is placing more orders for the right products than other sources of traffic.
Along with analytical capabilities you also need people who have optimal business acumen (maybe super optimal). Numbers by themselves, no matter how fast they come at you and in what quantity, won’t help you make good decisions. For that you need people with good business acumen (defined as people who understand your business really well, have a great grasp of your web ecosystem, and have lots of common sense).
As a wise person :) said reporting is not analysis!
3. Diversified & Empowered decision making structure: Does your company have a decision making structure where a “front line” analyst can make decisions and authorize / execute changes based on data? Do you require VP approval before web pages go on or off? Do you need a HiPPO to sign off on promotion / campaign changes?
For action to be taken from real-time data, decisions need to be made fast. Usually it will be your Analyst or Marketing Manager observing these statistically significant differences. Often these kind folks don’t actually have the authority to stop or green-light anything based on data. That happens via a company labyrinth that needs to be navigated.
If your front line analyst can act on the data without VP or HiPPO sign-off then you are all set, you are empowered and ready.
4. Awesome website / structural operational execution capabilities: Your company has a web operations team that can execute on a dime. They are able to push out the right creative, remove non-performing promotions, change the AdWords strategy, update landing pages, change email blasts that are already in the queue, and send different instructions to your ad / search / affiliate marketing agencies, who can also make changes very quickly.
Essentially, if it takes you two days to execute changes to your website / campaigns / agencies then the value of real-time might be really questionable.
Four extremely simple rules / requirements. If your organization’s capabilities meet all of the above requirements then you are well set to gain an advantage from getting your data in real-time. But if even one of the above requirements is not met then it is perhaps more the case that you want to know (real-time) because you want to know, and not to take action. That knowing can be extremely expensive (people, process, $$$) and distracting.
Real-time is perfect in one scenario: when there are micro decisions that an automated system can make based on rules that humans can input. In this scenario some, but not all, of the above issues become less critical. Data helps technology react in real-time to create unique customer experiences. More on this in a future post.
What do you think? Are there other requirements for an organization to be ready to leverage data in real-time? Is this post off base? Is something missing from the analysis outlined above? I welcome your feedback and critique.