Experimentation and Testing: A Primer

This post is a primer on the delightful world of testing and experimentation (A/B, Multivariate, and a new term from me: Experience Testing). There is a lot on the web about A/B or Multivariate testing, but my hope in this post is to give you some rationale for why testing matters, and then a point of view on each methodology along with some tips.

I covered Experimentation and Testing during my recent speech at the Emetrics Summit and here is the text from that slide:

    Experiment or Go Home:

    • Customers yell about their problems (when they call or see us), they bitch, they rarely provide solutions
    • Our bosses always think they represent site users and they want to do site design (which all of us promptly implement!!)
    • The most phenomenal site experience today is stale tomorrow
    • 80% of the time you/we are wrong about what a customer wants / expects from our site experience

That last one is hard to swallow because, after all, we are quite full of ourselves. But the reality is that we are not our site's customers; we are too close to the company, its products and its websites. Experimentation and testing help us figure out where we are wrong, quickly and repeatedly, and if you think about it that is a great thing for our customers, and for our employers.

Experimentation and testing in the long run will replace most traditional ways of collecting qualitative data on our site experiences, such as Lab Usability. Usability (in a lab, in a home, or remotely) is great, but if our customers like to surf our websites in their underwear, then would it not be great if we could do usability on them while they are in their underwear?

It is important to realize that experimentation and testing might sound big and complex, but it is not. We are lucky to live at a time when there are options available that allow us to get as deep and as broad as we want to be, and the cost is not prohibitive. Three types of testing are most prevalent; the first two are the most common.

A/B Testing: This is usually an all-encompassing category that seems to represent all kinds of testing, but A/B testing essentially means testing more than one version of a web page. Each version of the page is usually uniquely created and stands alone. The goal is to try, for example, three versions of the home page or product page or support FAQ page and see which version works better. Almost always in A/B testing you are measuring one outcome (click-throughs to the next page, conversion, etc). If you do nothing else, you should do A/B testing.

How to do A/B Testing: You can simply have your designers/developers create versions of the page and, depending on the complexity of your web platform, put the pages up and measure. If you can't test them at the same time, put them up one week after the other and try to control for external factors as best you can.
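
The split itself can be sketched in a few lines (the function name and hashing choice are mine, not from any vendor): hash the visitor's ID so a returning visitor always lands in the same group and does not pollute both versions.

```python
import hashlib

def assign_version(visitor_id, versions=("A", "B", "C")):
    """Deterministically bucket a visitor so repeat visits get the same page."""
    digest = hashlib.md5(visitor_id.encode("utf-8")).hexdigest()
    return versions[int(digest, 16) % len(versions)]

print(assign_version("visitor-123"))  # always the same version for this visitor
```

Random assignment spread evenly across versions, plus stickiness per visitor, is what makes the eventual comparison fair.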

    Pros of doing A/B Testing:

    • This is perhaps the cheapest way of doing testing since you will use your existing resources and tools
    • If you don't do any testing this is a great way to just get going and energize your organization and really have some fun
    • My tip: the first few times you do this, have people place bets (where legal) and pick winners; you'll be surprised


    Cons of doing A/B Testing:

    • It is difficult to control all the external factors (campaigns, search traffic, press releases, seasonality) and so you won't be 100% confident of the results (put 70% confidence in the results and make decisions)
    • It limits the kinds of things you can test to fairly simple stuff, and it is usually hard to discern correlations between the elements you are testing
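
On the confidence point: a rough way to put a number on it is a two-proportion z-test on the conversion counts of the two versions. A minimal sketch (the function name and the example numbers are hypothetical):

```python
from math import erf, sqrt

def ab_significance(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test: how sure are we that B really differs from A?"""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p = ab_significance(100, 5000, 130, 5000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Even a test like this cannot account for uncontrolled external factors (campaigns, seasonality), which is exactly why decisions should be made with a healthy margin rather than treating the numbers as gospel.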

Multivariate Testing: Currently the cool kid on the block; lots of hype, lots of buzz. In A/B testing above you had to create three pages. Now imagine "modularizing" your page (breaking it up into chunks) and being able to have just one page but dynamically change which modules show up on the page, where they show up, and to which traffic. Then imagine being able to feed that into a complex mathematical engine that will tell you not only which version of the page worked but the correlations as well.

For example, for my blog I can create "modules" / "containers" for the core page content, the top header, and each element of the right navigation (pages, categories, links, search, etc). In a multivariate test I could move each piece around and see which combination worked best.
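
To see why an engine is needed, count the combinations. Even this small, hypothetical set of module variants multiplies out quickly:

```python
from itertools import product

# Hypothetical variants for three page modules.
modules = {
    "header": ["original", "compact"],
    "content": ["full-width", "two-column"],
    "right_nav": ["categories-first", "search-first", "links-first"],
}

combinations = list(product(*modules.values()))
print(len(combinations))  # 2 * 2 * 3 = 12 distinct page versions
```

With plain A/B testing you would have to hand-build all 12 pages; the multivariate engine assembles them on the fly and attributes the outcome back to each module.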

    Pros of doing Multivariate Testing:

    • Doing Multivariate turbocharges your ability to do a lot very quickly for a couple of reasons
      • There are free tools like the Google Website Optimizer, or paid tools like Offermatica, Optimost, and SiteSpect, that can help you get going very quickly by hosting all the functionality (content, test attributes, analytics, statistics) remotely (think ASP model).
      • You don't have to rely on your IT/Development team. All they have to do is put a few lines of javascript on the page and they are done. This is an awesome benefit because most of the time that is a huge hurdle.
    • It can be a continuous learning methodology

    Cons of doing Multivariate Testing:

    • The old computer adage applies, be careful of GIGO (garbage in, garbage out). You still need a clean pool of ideas that are sourced from known customer pain points or strategic business objectives. It is easy to optimize crap quickly.
    • Website experiences for most sites are complex multi-page affairs. For an e-commerce website it is typical for the path from entry to a successful purchase to be around 12 to 18 pages; for a support site even more (as we thrash around to find an answer!). With Multivariate you are only optimizing one page, and no matter how optimized it is, it cannot play an outsized role in the final outcome, just the first step or two.

Most definitely do Multivariate but be aware of its limitations (and yes the vendors will tell you that they can change all kinds of things throughout the site experience, take it with a grain of salt and take time to understand what exactly that means).

Experience Testing: A new term that I have coined to represent the kind of testing where you have the ability to change the entire site experience of the visitor using the capabilities of your site platform (say ATG, Blue Martini, etc). You can change not just things on one page, or say the left navigation, or a piece of text on each page; you can change everything about the entire experience on your website.

For example, let's say you sell computer hardware on your website. With this methodology you can create one experience where your site is segmented by Windows and Macintosh versions of products, another where the site is segmented by Current customers and New customers, and another where the site is purple with white font, no left navigation, and smiling babies instead of product box shots.

With experience testing you don't actually have to create three or four websites; rather, using your site platform you can easily create two or three persistent experiences on your website and see which one your customers react to best. Since whichever analytics tool you use collects data for all three, the analysis is the same as what you do currently.

    Pros of doing Experience Testing:

    • This is Nirvana. You have an ability to test on your customers in their native environment (think underwear) and collect data that is most closely reflective of their true thoughts.
    • If your qualitative methods are integrated you can literally read their thoughts about each experience.
    • You will get five to ten times more powerful results than with any other methodology

    Cons of doing Experience Testing:

    • You need to have a website platform that supports experience testing, (for example ATG supports this)
    • It takes longer than the other two methodologies
    • It most definitely takes more brain power

Experience testing is very aspirational, but companies are getting into it, and sooner rather than later the current crop of vendors will start to expand into that space as well.

Agree? Disagree? Counter claims? Please share your feedback via comments.

Comments

  1.

    Pretty good post; I commented on it at Webmetricsguru.com. Wondering how Analytics will need to adapt to Personalization.

    We're coming to the long tail of search where eventually everyone will see different search results for the same query (based on who they are and where they're located).

    In Web Analytics, if a site shows a different page (one created for the customer on the fly) based on who they are – how will the Analytics represent those variations?

  2.

    Marshall: Thanks for your comments….

    In Web Analytics, if a site shows a different page (one created for the customer on the fly) based on who they are – how will the Analytics represent those variations?

    Many different analytics tools can now handle this quite easily. The challenge is that we should have the foresight to know what we want to track. Typically you can set either cookie values or url parameters that an analytics tool will automatically pick up, and then you can analyze.

    For example you can come to http://www.kaushik.net/avinash and based on the keyword you came on (or campaign) you could see:
    http://www.kaushik.net/avinash?key=marshall
    http://www.kaushik.net/avinash?key=marshall-is-great

    Now you might see different content on the fly depending on what "key" you came in on, and my analytics tool can measure every KPI imaginable now that it has captured the key value.
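
The parameter capture described here is a one-liner in most environments; a sketch in Python (the function name is mine):

```python
from urllib.parse import parse_qs, urlparse

def extract_key(url):
    """Pull out the tracking 'key' parameter an analytics tool would record."""
    params = parse_qs(urlparse(url).query)
    return params.get("key", [None])[0]

print(extract_key("http://www.kaushik.net/avinash?key=marshall-is-great"))
```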

  3.

    Very fine post, Avinash. Thank you for the thorough explanations.

    As you've asked for some feedback, here goes:

    Experimentation and testing in the long run will replace most traditional ways of collecting qualitative data on our site experiences such as Lab Usability. Usability (in a lab or in a home or remotely) is great but if our customers like to surf our websites in their underwear then would it not be great if we could do usability on them when they are in their underwear?

    Both lab testing (i.e. for usability) and transparent web testing (A/B, multivariate, etc.) have their places. Lab testing has many benefits, among which is the ability to reveal UI/usability issues that are difficult to quantify with web analytics. Likewise, A/B and multivariate testing are very strong quantitative tools that reveal user preferences that simply cannot be measured in a testing lab.

    For example, asking a user in a lab test to decide which promotion is more appealing is unreliable for many reasons (the Hawthorne Effect, for one). But ask users "in the wild" by presenting offers in an A/B test where they don't know they're being tested, and let them vote with their wallet. This is the crux of what marketing experiments reveal that qualitative usability research cannot.

    It is difficult to control all the external factors (campaigns, search traffic, press releases, seasonality) and so you won’t be 100% confident of the results (put 70% confidence in the results and make decisions).

    There's nothing inherently more or less difficult to control with an A/B test vs. multivariate test. The key thing is to randomly assign visitors to the A group vs. the B group.

    In A/B above you had to create three pages. Now imagine “modularizing” your page (break it up into chunks) and being able to just have one page but change dynamically what modules show up on the page, where they show up and to which traffic.

    Another way to look at this: if A/B testing focuses on one site element (i.e. a product image), then multivariate testing focuses on multiple elements (product image plus headline). And a quick terminology note… in the realm of experimental design, these are commonly called multifactor tests. So when you see "multivariate" or "multivariable", or "multifactor" in the context of web testing vendors, it's useful to note that they essentially all describe the same process.

    You don’t have to rely on your IT/Development team. All they have to do is put a few lines of javascript on the page and they are done. This is a awesome benefit because most of the times that is a huge hurdle.

    This has nothing to do with a test being multivariate, and everything to do with the innovations provided by the vendors you've mentioned. More to the point: the hardest thing about A/B or multivariate testing is switching content – showing version A1B1 of a page to one user, while simultaneously showing version A1B2 of that same page to another user. The more factors you're simultaneously testing (varying), the more crucial the content switching becomes.

    You don’t have to rely on your IT/Development team. All they have to do is put a few lines of javascript on the page and they are done.

    Well, yes and no. In some organizations, instrumenting pages with javascript each time you want to test a new area requires IT/development involvement. [Note that I'll freely admit my bias here: my company's product is the only one that truly takes IT out of the equation because it doesn't require javascript tagging.]

    Finally, regarding Experience Testing, what you are articulating sounds a lot like personalization with an experimental component thrown in. Besides ATG Dynamo, Microsoft Commerce Server and IBM WebLogic also support simplistic forms of this.

    Thanks again Avinash, and welcome to blog-o-sphere! :)

    Dave @ SiteSpect

  4.

    Dave,

    Thank you for vastly enriching the value of my original post by adding your comments. I have learned more, as I am sure have other readers.

    I have personally and actively observed the Hawthorne Effect and hence I am biased towards testing and specifically Experience Testing (which is not so much personalization as putting randomly assigned people into two or more "controlled and different" experiences on the website and seeing which experience performs better against preset goals).

    It is not perfect, but we can measure not only conversion and revenue type stuff but also qualitative measures like Task Completion and Satisfaction for these experiences. In my mind it is a great tradeoff between bringing someone in and giving them $250 to run through a site vs. doing it without the participant knowing.

    By no means is Lab Usability over, not by any stretch of the imagination, but as you can see I am giddy at the possibilities of Experience Testing and the learnings that can come from that.

  5.

    Sorry, just noticed my mistake… in my very last paragraph I should have said "IBM WebSphere", not WebLogic (sorry BEA).

    Dave

  6.
    Mike Samec says:

    Don't forget that with A/B testing you can test specific elements of a page rather than an entire page. For example, test two variations of a headline: send a % of people to version A and divert a % of traffic to version B. The rest of your traffic is your control group.
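
That percentage split can be sketched in a few lines (the variation names and weights here are hypothetical):

```python
import random

SHARES = {"headline_a": 0.2, "headline_b": 0.2, "control": 0.6}

def pick_variation(shares=SHARES):
    """Divert a fixed share of incoming traffic to each headline variation."""
    r = random.random()
    cumulative = 0.0
    for variation, share in shares.items():
        cumulative += share
        if r < cumulative:
            return variation
    return "control"  # guard against floating-point rounding

counts = {name: 0 for name in SHARES}
for _ in range(10000):
    counts[pick_variation()] += 1
print(counts)  # roughly 2000 / 2000 / 6000
```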

  7.

    Avinash,

    Wonderful post. Thank you for your insights. The tools for analysis and observation have improved greatly over the last several years. What is available now does not compare to the rotators, Javascript trackers and numerous spreadsheets that I was using years ago.

    In doing this conversion optimization work, we found that the segmentation that you speak of is a subset of "noise reduction" in terms of test design.

    As the tester observes and refines the "conversion conversation", the natural extension is to create a user-defined channel that serves the visitor's concerns and needs. When that match is closely correlated, the user's actions become more predictable.

    Your term "Experience Testing" is a perfect articulation of what I termed the "conversion conversation".

    I am enjoying your Blog and thinking.

    David

  8.

    Avinash,

    You should talk about the limitations of Javascript and how only an installed solution like the one that Memetrics offers can test more than a few attributes on a simple page. Why not test Paid Search, Direct Mail, or anything with a dependent variable?

  9.
    Shashank says:

    I am facing problems setting up experiments on Google Website Optimizer. :(

    Even a very small conversion rate optimization test is proving difficult to execute.

    Previously I created around 5-6 experiments for our website using Website Optimizer, but for the last week I have not been getting the required combinations when setting up experiments.

    I add the scripts to the test and conversion pages and also create the variations, but at the last step, when I preview, I don't find the desired combinations.

    Please help; I was setting up experiments successfully until a week ago, but now, doing the same things I always did, I am not getting the desired combinations.

    Thanks in advance.

    Regards,

    Shashank

  10.
    Matt Gershoff says:

    Good Blog,

    What is the difference between A/B testing and what you are calling Experience testing? It sounds like for the experience test you are creating a mega-variable, let's call it 'site', which is a bundle of attributes (pages, images, etc.). I don't see how the actual testing is different from A/B.
    For the multivariate testing we are looking at attributes that sit in several dimensions (think of the corner points in a hypercube). What we are concerned about is that there will be interaction effects over the variables we want to test – so that we need to know all of the values of each variable when determining the results.
    One unmentioned ‘con’ of this approach is that it is very costly from a data perspective – we need lots of observations to ‘fill up’ that hypercube to make robust estimates. One way around this is to work with fractional factorial design testing – where we make assumptions about how many variables we need to include at any one time – this has the effect of collapsing or aggregating some of the corners of our cube so that we need fewer estimates.

    Thanks again

    Matt Gershoff
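
Matt's fractional factorial point is worth a small illustration. In the classic 2^(k-1) half-fraction you enumerate all settings of k-1 two-level factors and derive the last factor from their product (the defining relation), halving the number of page versions you need to serve. A sketch, not taken from any testing tool:

```python
from itertools import product

def half_fraction(k):
    """2^(k-1) half-fraction design: the last factor is the product of the others."""
    runs = []
    for levels in product([-1, 1], repeat=k - 1):
        last = 1
        for level in levels:
            last *= level
        runs.append(levels + (last,))
    return runs

design = half_fraction(3)
print(len(design))  # 4 runs instead of the full factorial's 8
```

The price, as Matt notes, is an assumption: in this design main effects are aliased with two-factor interactions, so you trade observations for the ability to separate every interaction.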

  11.

    Hi Avinash.

    Was looking for a quick and dirty description on multivariate testing. This was exactly what I was looking for :) Thank You.

    Vivek

  12.
    apageor2 says:

    Avinash,

    I came across this web site through another link and have been reading for the past 20 minutes through the information and advice you offer on building web sites and pages. I have been working solo for the past 10 years while gaining my degrees; not an easy thing to do, but possible.

    As a Unix programmer, I had to follow a certain protocol and test code before it was permitted to go live. At the time I was writing code in Unix or MUMPS; however, it was still a necessary matter. I still follow the same habits with my web pages before sending them to my clients.

    The company which I was working for used multivariate I believe. Now that I am working for myself, I am following experience testing. Great post! Best wishes for 2009!

    Sue

  13.
    Edu Barredo says:

    Hello Avinash, I consider myself a reader of your blog but I'm always discovering your "old" amazing posts.

    Can you give more info about the "Experience Testing" concept or include some related links?

    Thanks Avinash, and keep up with your great work!

  14.
    Jay L. says:

    At our agency we're doing some really interesting A/B/N and multivariate testing of landing pages, ecommerce flows, and banner creative, among other digital media.

    One of the ongoing debates when we execute a split test is whether or not to keep content elements the same. Meaning, vary up the design concepts dramatically but use the exact same headlines, products, imagery, etc. to isolate the design variable.

    Is this a necessary practice, or can a winning design be deemed the winner regardless of whether the elements are consistent? My belief is that the greatest change promotes the greatest opportunity for results.

  15.

    Jay: I think you might want to use Multivariate tests for the goal you have set for yourself, rather than A/B tests.

    In multivariate tests you are able to change anything about a page (headlines, product images, layout, etc).

    As the test runs the math and regressions that the tool will do (say Google's Website Optimizer or Test & Target or Sitespect) will compute the impact on the outcome of each element you are trying.

    In the end not only will you know which page works the best, it will tell you the contribution of each element (and, this is cool, it does happen that sometimes a great headline that by itself scored well might not even be on the winning combination!).

    You can of course do this with A/B/N tests, it will just take you far too long.

    Avinash.
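
The per-element contribution mentioned above can be estimated, at its simplest, by comparing average conversion with an element on versus off. A toy simulation with made-up effect sizes (nothing here comes from a real tool or dataset):

```python
import random

random.seed(0)

# Simulate a test of two page elements, each coded 0/1: headline and image.
rows = []
for _ in range(20000):
    headline = random.randint(0, 1)
    image = random.randint(0, 1)
    p = 0.05 + 0.01 * headline + 0.03 * image  # invented effect sizes
    rows.append((headline, image, random.random() < p))

def element_lift(rows, index):
    """Average conversion with the element on, minus with it off."""
    on = [converted for *elements, converted in rows if elements[index]]
    off = [converted for *elements, converted in rows if not elements[index]]
    return sum(on) / len(on) - sum(off) / len(off)

print(element_lift(rows, 0))  # headline lift, close to 0.01
print(element_lift(rows, 1))  # image lift, close to 0.03
```

Real multivariate engines do this with regressions so that interactions and shared traffic are handled properly, but the idea is the same: attribute the outcome back to each element.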

  16.
    muirskate says:

    Great article!

    I'm a UI designer for a website and A/B split testing has given me such great insight into how our customers think online. You may be right saying that 80% of the time what I think when designing the layout is based on my own experience and what I want people to think or what I expect people to think.

    For example, here's a simple situation where it proved my thinking faulty. When someone is viewing a longboard deck they have the option of buying the deck or buying a complete board with all the components (trucks, wheels). When designing the button I thought "the button should have a call to action like Build a Complete Skateboard". We tested this thinking with a simple button that says "Complete Longboard". In the end the button without the call to action got more conversions.

    I was surprised, and thought that I could learn a formula from this, but in reality I can't pretend to figure people out. So for sure A/B testing is going to take websites to a new level for a lot of people, and it's free (Google's product).

    Maybe one day Experience Testing will be an option, but as you said, that requires a lot more.

  17.

    It's been a long time since I even paid attention to A/B split testing. I am glad your course has reminded me of something so valuable which I have overlooked for a couple years now. This will let the customer make the decision. I use this method in Google Adwords by making 2 ads and let them compete against each other. Once I have a winner, I delete the loser and make another ad to compete with the winner. In the end A/B split testing will let the customer decide which ad or webpage is better.

  18.
    Savitha A says:

    Avinash,

    I read through the blog and comments. I have one question, hope you can clarify it.

    I'm running a campaign for my product to incentivize people to renew their license. I have two flows for the same campaign/offer. However, flow A has one design theme (with a black background) and flow B has another design theme (with a white background). I am also keeping the content and layout different in both flows.

    Ultimately, I want to know which of the two design themes and flows are working better in terms of renewal rates and why?

    Should I be doing experience testing or multivariate testing in this case?

    Please help based on your past experience.

    Thanks,
    Savitha

    •

      Savitha: The scenario you are describing is a standard multivariate experiment.

      If it is too complicated to tease out the causality factors, you can start with an A/B test, which will just tell you which one works. Then over time you can work with other tests to tease out causality for individual changes.

      -Avinash.

Trackbacks

  1. Experimentation and Testing: A Primer by Avinash Kaushik…

    Avinash Kaushik wrote a very good post on the different types of testing you can do to pull metrics from a website.   Here's my take -  Avinash thinks 80% of the time we are wrong about what a customer wants……

  2. [...] Before we begin, you might want to refer to this very enlightening article on Experimentation and Testing. [...]

  3. Web Analytics…

    Phoenix.edu Currently Tracking Weekly visitor counts (including users passing through to eCampus) Daily phoenix.edu lead counts Daily total web lead counts Daily share (phoenix…….

  4. Web Design and the Scientific Method…

    Some of the Science Fair experiments effectively demonstrated outcomes that were non-intuitive; in the same way, the analysis of user behavior can highlight for us the sometimes unexpected and non-intuitive impact of design choices.

    A good scientist assumes nothing and tests everything! Should a good web designer do any less?

    —————-

    Update: One of my favorite blogs on this topic, Occam's Razor by Avinash Kaushik, has a wonderful post on the topic of web analytics and testing: Experimentation and Testing: A Primer . I highly recommend reading this post if you're interested in this area.

  5. [...] So, what are some good testing resources? How about Avinash's blog. He's got a great post entitled Experimentation and Testing: A Primer. Start there and then head over to FutureNow. They've got some great books about how to actually test. You can find two great starter guides at their online store. If you're already familiar with the testing process then check out the Website Optimizer help section and start reading. [...]

  6. Jim Sterne: Ask And Ye Shall Receive…

    ….

    I believe this. It's in my bones. Sure, customers are not going to invent new, breakthrough stuff. They don't know they need an iPod until everybody else has one. But what about the 99.999% of the rest of it? They do know how they like to buy. They do know how they like to shop. They know how they like to compare products and how they like to return products.

    Avinash Kaushik is one of the most insightful and intelligent web analysts I've ever met. On his excellent blog, Occam's Razor, Avinash said it best: "80% of the time you/we are wrong about what a customer wants / expects from our site experience."

    Avinash describes his work at Intuit as dealing with website experience, behavior and outcomes. Outcomes are the goals the company sets – selling software. Behavior is all about the clicks. But, says Avinash, if he only had one of the three to work with, it would have to be the customers' direct feedback and customer satisfaction.

    This is from his post Overview & Importance of Qualitative Metrics:

    ….

  7. [...] A news article on Ajaxian led me to this post about this article (PDF) from Microsoft. It talks about A/B testing and how important it is to success on the web. The article even cites the legendary Avinash Kaushik whom I had the pleasure of working with at Intuit. The best line in the article is: "The fewer the facts, the stronger the opinion" — Arnold Glasow. Avinash is credited with the term HiPO, which stands for Highest Paid Opinion. [...]

  8. Metrics and Reporting…

    What do we want to answer? What are the Media Group's reporting needs?……

  9. [...]
    Using Web Analytics To Further Identify Site Conversion Improvements

    Although I'm a big fan of using web analytic data to uncover a vast range of improvement possibilities with site content and conversions, for this post the most useful thing I can do is direct you to the following pages on the excellent blog of Avinash Kaushik dealing with all things analytics:

    * Experimentation and Testing: A Primer
    * Excellent Analytics Tip#5: Conversion Rate Basics & Best Practices
    * Excellent Analytics Tip #8: Measure the Real Conversion Rate & “Opportunity Pie”
    [...]

  10. [...] The new division's going to be headed up by our own Russ Glass, and we're actively hiring for his team (which will be located in San Francisco and Waltham – yes, yes, I know, beautiful, bucolic, slightly post-industrial Waltham.  Try and contain yourselves.)  Sooo, if you're interested, we're immediately looking for a vice president of advertising, a senior product manager for our ad products, a product manager to focus on multivariate testing (which is actually pretty cool stuff – check this out if you're curious).  We'll be looking to add some advertising sales reps fairly soon (no job description yet, but if you're in the field you probably get it), and Ad Ops Manager, etc, etc…. [...]

  11. [...] LinksIan Ayers highly persuasive Supercrunchers is a book that attests to the power of split testing, and I thoroughly recommend it. Avinash Kaushik is a prominent analytics author and blogger, and has a good primer on experimental testing. [...]

  12. [...]
    3. Landing Page Quality:

    Your referrals are relevant and the visitor expectations are matching with what you offer. The quality of your landing page has to be questioned then.

    There are several methods to study landing page quality; A/B testing and multivariate testing are examples. You can also get support from heatmap tools such as Crazy Egg, or the site overlay option in Google Analytics, for the same study.
    [...]

  13. [...] At PBworks, we take our data seriously.  So it should be no surprise to learn that we use A/B testing techniques to aid our product and website development decisions.  Having a web-based product means that we can quickly learn what our customers like and what they don't like and make changes accordingly.  If you're not familiar with A/B testing, Avinash Kaushik has a great primer. [...]

  14. [...] Concluding, both types of testing have their own advantages and disadvantages. Each can be a perfect technique, depending on the needs of the website. They should always go hand-in-hand, using one to test completely different designs and the other to optimize the current design. The important thing is to understand that testing is not a one-time effort. It is an ongoing exercise that should be part of the mindset of an organization. As Avinash Kaushik once wrote in his blog, Experiment or go home! [...]

  15. [...]
    However, analytical marketers look at this investment differently. Their first step would be to create a test, budget and monitor the results of the advertising. They would identify the cost per new user and then compare it against other marketing channels.

    Does online advertising work as well or better than other marketing channels? If it does, then it’s worth increasing the investment. Otherwise allocate the resources elsewhere.
    [...]

  16. [...]
    Avinash Kaushik – Experimentation and Testing: A Primer
    [...]

  17. [...]
    Came across a nice post about experiment and test in UX.
    Experimentation and Testing: A Primer
    [...]

  18. [...]
    Avinash Kaushik, Googler and author of Web Analytics: An Hour a Day, wrote in his Experimentation and Testing primer that “80% of the time you/we are wrong about what a customer wants.”
    [...]
