<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Excellent Analytics Tip#1: Statistical Significance</title>
	<atom:link href="http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html/feed" rel="self" type="application/rss+xml" />
	<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html</link>
	<description>Pluralitas non est ponenda sine neccesitate.</description>
	<pubDate>Mon, 08 Sep 2008 12:15:49 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.1</generator>
		<item>
		<title>By: Andrew Blank</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-466401</link>
		<dc:creator>Andrew Blank</dc:creator>
		<pubDate>Wed, 16 Jul 2008 15:45:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-466401</guid>
		<description>The link for more advanced stats - http://www.mwrms.com/wwwRMS/DirectMarketing/MarketingCalc2.asp does not seem to have anything to do with stats anymore.  Unfortunately I couldn't find a live replacement.</description>
		<content:encoded><![CDATA[<p>The link for more advanced stats - <a href="http://www.mwrms.com/wwwRMS/DirectMarketing/MarketingCalc2.asp" rel="nofollow">http://www.mwrms.com/wwwRMS/DirectMarketing/MarketingCalc2.asp</a> does not seem to have anything to do with stats anymore.  Unfortunately I couldn&#8217;t find a live replacement.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Barbara</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-449676</link>
		<dc:creator>Barbara</dc:creator>
		<pubDate>Sun, 13 Apr 2008 14:15:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-449676</guid>
		<description>Hi Avinash,

Thanks a million for your great posts.  I just got into web analytics and have found both this blog and your book extremely helpful.  You mentioned utilizing the statcalc.xls spreadsheet to measure significance between pageviews.  Can you kindly advise on how to do this?  Your response is eagerly awaited. :)</description>
		<content:encoded><![CDATA[<p>Hi Avinash,</p>
<p>Thanks a million for your great posts.  I just got into web analytics and have found both this blog and your book extremely helpful.  You mentioned utilizing the statcalc.xls spreadsheet to measure significance between pageviews.  Can you kindly advise on how to do this?  Your response is eagerly awaited. :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tyranny of numbers &#8212; checking for statistical significance &#124; Ubermarketer</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-441633</link>
		<dc:creator>Tyranny of numbers &#8212; checking for statistical significance &#124; Ubermarketer</dc:creator>
		<pubDate>Sun, 30 Mar 2008 19:21:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-441633</guid>
		<description>[...] RKG on finding statistical significance in two Adwords tests  Interesting commentary on the value (or lack-of-value) in copy testing in PPC Avinash Kaushik on separating signal from noise with statistical significance [...]</description>
		<content:encoded><![CDATA[<p>[...] RKG on finding statistical significance in two Adwords tests  Interesting commentary on the value (or lack-of-value) in copy testing in PPC Avinash Kaushik on separating signal from noise with statistical significance [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: michael choe</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-440990</link>
		<dc:creator>michael choe</dc:creator>
		<pubDate>Thu, 27 Mar 2008 22:05:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-440990</guid>
		<description>all -

i share similar concerns as jbuser... 

for example, 1.74 standard deviations is not 95% significance.  1.96 standard deviations is 95% confidence.

also, for computing standard deviation (s) of 2 or more proportions (in this case, conversion rate), i think it's a good practice to assume the largest margin of error.  in this case, margin of error = 2 * sqrt(0.5^2/n), where n is the size of your smallest sample.  this is what pollsters such as zogby do when communicating poll results about hillary/obama, etc.</description>
		<content:encoded><![CDATA[<p>all -</p>
<p>i share similar concerns as jbuser&#8230; </p>
<p>for example, 1.74 standard deviations is not 95% significance.  1.96 standard deviations is 95% confidence.</p>
<p>also, for computing standard deviation (s) of 2 or more proportions (in this case, conversion rate), i think it&#8217;s a good practice to assume the largest margin of error.  in this case, margin of error = 2 * sqrt(0.5^2/n), where n is the size of your smallest sample.  this is what pollsters such as zogby do when communicating poll results about hillary/obama, etc.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jbuser</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-420910</link>
		<dc:creator>Jbuser</dc:creator>
		<pubDate>Wed, 20 Feb 2008 21:56:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-420910</guid>
		<description>Avinash,

As a stats guy, I am a little concerned with the assumptions behind the model.  I downloaded it and the first thing I noticed was that it makes some pretty large assumptions with confidence levels (anything with a z-score (I am assuming) of 1.65 and 2.33 = 95%).  I understand that this is probably there to make things "easy" but I think it can be misleading.  Also of note was that IF there is a z-score assumption (which I am unsure of), there are some other assumptions underneath the covers (which I couldn't get to), and z-scores are only for known pop. means and std. dev.  Do you know what Brian is using?  Is it possible to get this information?  

Finally, one concern with the "plug and chug" nature of the spreadsheet is what you always must be wary of, and that is making statistical significance a badge of honor.  All it tells you is given the values you have, is there a difference between the two.  What you must do, more than anything else, is make sure your testing methods are solid BEFORE the test.  Otherwise, you are going to be putting in values after values and getting significance or not and making some very important decisions when the whole test could be wrong.  Practical vs. Statistical.</description>
		<content:encoded><![CDATA[<p>Avinash,</p>
<p>As a stats guy, I am a little concerned with the assumptions behind the model.  I downloaded it and the first thing I noticed was that it makes some pretty large assumptions with confidence levels (anything with a z-score (I am assuming) of 1.65 and 2.33 = 95%).  I understand that this is probably there to make things &#8220;easy&#8221; but I think it can be misleading.  Also of note was that IF there is a z-score assumption (which I am unsure of), there are some other assumptions underneath the covers (which I couldn&#8217;t get to), and z-scores are only for known pop. means and std. dev.  Do you know what Brian is using?  Is it possible to get this information?  </p>
<p>Finally, one concern with the &#8220;plug and chug&#8221; nature of the spreadsheet is what you always must be wary of, and that is making statistical significance a badge of honor.  All it tells you is given the values you have, is there a difference between the two.  What you must do, more than anything else, is make sure your testing methods are solid BEFORE the test.  Otherwise, you are going to be putting in values after values and getting significance or not and making some very important decisions when the whole test could be wrong.  Practical vs. Statistical.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Philip</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-419536</link>
		<dc:creator>Philip</dc:creator>
		<pubDate>Mon, 18 Feb 2008 18:38:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-419536</guid>
		<description>The Analytical Group link for a free spreadsheet download appears to have changed.   It's now &lt;a href='http://www.analyticalgroup.com/sigtest.html' rel="nofollow"&gt;http://www.analyticalgroup.com/sigtest.html&lt;/a&gt; (with html instead of htm). 

Thanks for all your great articles, Avinash!

&lt;b&gt;&lt;font color=blue&gt;Note :&lt;/font&gt;&lt;/b&gt; Thanks very much for the correction Philip! -Avinash.</description>
		<content:encoded><![CDATA[<p>The Analytical Group link for a free spreadsheet download appears to have changed.   It&#8217;s now <a href='http://www.analyticalgroup.com/sigtest.html' rel="nofollow">http://www.analyticalgroup.com/sigtest.html</a> (with html instead of htm). </p>
<p>Thanks for all your great articles, Avinash!</p>
<p><b><font color=blue>Note :</font></b> Thanks very much for the correction Philip! -Avinash.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Web analytics en statistische significantie - Onetomarket Blog</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-407450</link>
		<dc:creator>Web analytics en statistische significantie - Onetomarket Blog</dc:creator>
		<pubDate>Mon, 28 Jan 2008 16:18:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-407450</guid>
		<description>[...] For this you can use statistics, in particular statistical significance. You hear it now and then in advertisements, or you see it mentioned alongside studies in the newspaper. In web analytics you also run into the term, for example in the book “Web Analytics: An Hour a Day” by Avinash Kaushik. The piece on statistical significance is also on his blog. What does statistically significant actually mean, and how do the tools he mentions work? [...]</description>
		<content:encoded><![CDATA[<p>[...] For this you can use statistics, in particular statistical significance. You hear it now and then in advertisements, or you see it mentioned alongside studies in the newspaper. In web analytics you also run into the term, for example in the book “Web Analytics: An Hour a Day” by Avinash Kaushik. The piece on statistical significance is also on his blog. What does statistically significant actually mean, and how do the tools he mentions work? [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: pabitra chatterjee</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-330474</link>
		<dc:creator>pabitra chatterjee</dc:creator>
		<pubDate>Mon, 05 Nov 2007 06:11:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-330474</guid>
		<description>this is to continue where mr hakim aly left. you may find this little piece at my blog, http://directindia.blogspot.com/2007/10/no-beta-yes-risk.html, interesting.

i've also given links for templates in the piece. 

pac</description>
		<content:encoded><![CDATA[<p>this is to continue where mr hakim aly left. you may find this little piece at my blog, <a href="http://directindia.blogspot.com/2007/10/no-beta-yes-risk.html" rel="nofollow">http://directindia.blogspot.com/2007/10/no-beta-yes-risk.html</a>, interesting.</p>
<p>i&#8217;ve also given links for templates in the piece. </p>
<p>pac</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Curtis</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-202510</link>
		<dc:creator>Curtis</dc:creator>
		<pubDate>Fri, 17 Aug 2007 06:44:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-202510</guid>
		<description>Thanks for the great insight. Do you have more details on exactly how the statistical significance is calculated?  I'm curious how you derived the std deviations in the scenarios above with just the sample size and order counts.</description>
		<content:encoded><![CDATA[<p>Thanks for the great insight. Do you have more details on exactly how the statistical significance is calculated?  I&#8217;m curious how you derived the std deviations in the scenarios above with just the sample size and order counts.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hakim Aly</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-88382</link>
		<dc:creator>Hakim Aly</dc:creator>
		<pubDate>Tue, 24 Apr 2007 04:35:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-88382</guid>
		<description>Regarding the question of 1-tail vs. 2-tail test, the former is appropriate when one wants to determine whether one RR% is statistically HIGHER (or LOWER) than another. The latter is appropriate if one wants to know if a RR% is DIFFERENT FROM another.

I would suggest that in most marketing situations,  we are more interested in the former (higher than) than the latter (different from).

In a few cases, we may want to know whether a proposed course of action may harm the response rate, in which case a 2-tail test would be appropriate.</description>
		<content:encoded><![CDATA[<p>Regarding the question of 1-tail vs. 2-tail test, the former is appropriate when one wants to determine whether one RR% is statistically HIGHER (or LOWER) than another. The latter is appropriate if one wants to know if a RR% is DIFFERENT FROM another.</p>
<p>I would suggest that in most marketing situations,  we are more interested in the former (higher than) than the latter (different from).</p>
<p>In a few cases, we may want to know whether a proposed course of action may harm the response rate, in which case a 2-tail test would be appropriate.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hakim Aly</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-88376</link>
		<dc:creator>Hakim Aly</dc:creator>
		<pubDate>Tue, 24 Apr 2007 04:27:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-88376</guid>
		<description>Although a 95% CL seems to be common (other than in a medical/pharmaceutical context), in a marketing context a lower CL may be quite appropriate. As you know, the choice of significance level (or its complement, Confidence Level) depends on the cost of being wrong.

A 5% significance level (95% CL) means accepting a 5% probability of declaring a difference when none actually exists. This is Type I error, i.e., concluding that one RR% is higher than another (statistically significant) when in fact it is not. Acting on this wrong conclusion may result in incurring costs that do not yield revenue or profit to offset the costs.

Type II error is when one does not reject the null hypothesis when in fact it is false. In this
case, the cost associated with the decision to not roll out a marketing tactic is the foregone revenue/profit that would otherwise have been generated.

In many situations, the cost of Type II error exceeds the cost of Type I error. Clearly, a trade-off is involved, but a lower CL of 90% or even 80% may not be out of line.

Ultimately, each business needs to decide for itself what an appropriate CL is for purposes of assessing test results.

Would be interested in your thoughts. Hakim</description>
		<content:encoded><![CDATA[<p>Although a 95% CL seems to be common (other than in a medical/pharmaceutical context), in a marketing context a lower CL may be quite appropriate. As you know, the choice of significance level (or its complement, Confidence Level) depends on the cost of being wrong.</p>
<p>A 5% significance level (95% CL) means accepting a 5% probability of declaring a difference when none actually exists. This is Type I error, i.e., concluding that one RR% is higher than another (statistically significant) when in fact it is not. Acting on this wrong conclusion may result in incurring costs that do not yield revenue or profit to offset the costs.</p>
<p>Type II error is when one does not reject the null hypothesis when in fact it is false. In this<br />
case, the cost associated with the decision to not roll out a marketing tactic is the foregone revenue/profit that would otherwise have been generated.</p>
<p>In many situations, the cost of Type II error exceeds the cost of Type I error. Clearly, a trade-off is involved, but a lower CL of 90% or even 80% may not be out of line.</p>
<p>Ultimately, each business needs to decide for itself what an appropriate CL is for purposes of assessing test results.</p>
<p>Would be interested in your thoughts. Hakim</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vicky Brock</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-114</link>
		<dc:creator>Vicky Brock</dc:creator>
		<pubDate>Thu, 01 Jun 2006 15:51:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-114</guid>
		<description>Hi Avinash,

I so much agree on the importance of taking into account statistical significance - an essential part of the "so what" factor! 

 This is a neat chi square tool to test for statistical significance: 

http://www.georgetown.edu/faculty/ballc/webtools/web_chi.html

I do love your blog, bye, Vicky</description>
		<content:encoded><![CDATA[<p>Hi Avinash,</p>
<p>I so much agree on the importance of taking into account statistical significance - an essential part of the &#8220;so what&#8221; factor! </p>
<p> This is a neat chi square tool to test for statistical significance: </p>
<p><a href="http://www.georgetown.edu/faculty/ballc/webtools/web_chi.html" rel="nofollow">http://www.georgetown.edu/faculty/ballc/webtools/web_chi.html</a></p>
<p>I do love your blog, bye, Vicky</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Avinash Kaushik</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-102</link>
		<dc:creator>Avinash Kaushik</dc:creator>
		<pubDate>Wed, 31 May 2006 07:03:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-102</guid>
		<description>Kerry: The example used was quite a simple one to show that we can accomplish much by applying statistics to our standard KPIs with very little stress.
&lt;blockquote&gt;
Wouldn’t it have been more precise to use a two tailed test? If not, why?
&lt;/blockquote&gt;
You are right, one can get quite sophisticated and get ever better results. The emphasis of the article was how to detect statistical significance in a simple case. I hope to blog more about how we can apply advanced methodologies in testing (to build on my experimentation and testing post).

Thanks for taking the time to post a comment.</description>
		<content:encoded><![CDATA[<p>Kerry: The example used was quite a simple one to show that we can accomplish much by applying statistics to our standard KPIs with very little stress.</p>
<blockquote><p>
Wouldn’t it have been more precise to use a two tailed test? If not, why?
</p></blockquote>
<p>You are right, one can get quite sophisticated and get ever better results. The emphasis of the article was how to detect statistical significance in a simple case. I hope to blog more about how we can apply advanced methodologies in testing (to build on my experimentation and testing post).</p>
<p>Thanks for taking the time to post a comment.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kerry Kim</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-97</link>
		<dc:creator>Kerry Kim</dc:creator>
		<pubDate>Wed, 31 May 2006 00:37:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-97</guid>
		<description>Hi Avinash, my thanks also to you for sharing.  Any additional insights you might have about the key drivers of adoption you've experienced would be greatly appreciated.

Regarding statistical significance, it appears that the reference in your post used a one tailed z test for testing whether there is a significant difference between two sample proportions.  Wouldn't it have been more precise to use a two tailed test?  If not, why?</description>
		<content:encoded><![CDATA[<p>Hi Avinash, my thanks also to you for sharing.  Any additional insights you might have about the key drivers of adoption you&#8217;ve experienced would be greatly appreciated.</p>
<p>Regarding statistical significance, it appears that the reference in your post used a one tailed z test for testing whether there is a significant difference between two sample proportions.  Wouldn&#8217;t it have been more precise to use a two tailed test?  If not, why?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Avinash Kaushik</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-35</link>
		<dc:creator>Avinash Kaushik</dc:creator>
		<pubDate>Mon, 22 May 2006 06:06:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-35</guid>
		<description>Aurélie: Thanks for the thoughtful comment. I am sorry to have spoilt your entire weekend with my posts; after all, there are so many more beautiful things in life. :)

I completely agree with the care around communicating anything that is not of significance; there is always a danger that in spite of your warning they will jump into the lake.
&lt;blockquote&gt;Another pavlovian reaction was to consider that any number of responses under 200 should not be taken into account as it holds high probability of not being representative.&lt;/blockquote&gt;
(For our readers here is something on &lt;a rel="nofollow" href="http://www.nwlink.com/~donclark/hrd/history/pavlov.html"&gt;pavlovian reaction&lt;/a&gt;.)

In the world of Multivariate we can detect a strong signal even with small samples. We use something like &lt;a rel="nofollow" href="http://www.mwrms.com/wwwRMS/DirectMarketing/MarketingCalc2.asp"&gt;This Page&lt;/a&gt; to calculate the sample set.
&lt;blockquote&gt;“Is a visitor engaging into A but not engaging into B, converted easier into a lead, than someone engaged into C and B?”&lt;/blockquote&gt;
Correlations are important, very, and of course my simple little spreadsheet won't account for that. Especially for complex web interactions it is important to understand how lower level conversion events might influence higher level (ultimate) goals.</description>
		<content:encoded><![CDATA[<p>Aurélie: Thanks for the thoughtful comment. I am sorry to have spoilt your entire weekend with my posts; after all, there are so many more beautiful things in life. :)</p>
<p>I completely agree with the care around communicating anything that is not of significance; there is always a danger that in spite of your warning they will jump into the lake.</p>
<blockquote><p>Another pavlovian reaction was to consider that any number of responses under 200 should not be taken into account as it holds high probability of not being representative.</p></blockquote>
<p>(For our readers here is something on <a rel="nofollow" href="http://www.nwlink.com/~donclark/hrd/history/pavlov.html">pavlovian reaction</a>.)</p>
<p>In the world of Multivariate we can detect a strong signal even with small samples. We use something like <a rel="nofollow" href="http://www.mwrms.com/wwwRMS/DirectMarketing/MarketingCalc2.asp">This Page</a> to calculate the sample set.</p>
<blockquote><p>“Is a visitor engaging into A but not engaging into B, converted easier into a lead, than someone engaged into C and B?”</p></blockquote>
<p>Correlations are important, very, and of course my simple little spreadsheet won&#8217;t account for that. Especially for complex web interactions it is important to understand how lower level conversion events might influence higher level (ultimate) goals.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aurélie</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-34</link>
		<dc:creator>Aurélie</dc:creator>
		<pubDate>Sun, 21 May 2006 20:37:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-34</guid>
		<description>Hi Avinash,

Good to find you blogging, sharing thoughts and experiences. It's quite some interesting stuff and I hope you enjoy the experience.

I read your different posts on Saturday morning and your thoughts stayed with me for the entire week-end. Thank you.

Yes, statistical significance. I totally join you in the idea and would only add that tests that do not render truly significant results should not be communicated upon. I remember in my first job having warned of the non significance of a test only to find it had heavily influenced a commercial strategy. I vowed never again!

Another pavlovian reaction was to consider that any number of responses under 200 should not be taken into account as it holds high probability of not being representative. I usually follow this first rule and adapt the variables in order to remain loyal to the statistical representativeness of a sample. Quite pavlovian, I agree.

And the last thing is that I'll bear statistical significance in mind but would like to suggest another possible subject: correlation between conversion rates.

Siegert suggested this formulation for a client yesterday:
“Is a visitor engaging into A but not engaging into B, converted easier into a lead, than someone engaged into C and B?” 
In other words, you've got kind of low level conversion events that influence or not higher goals.
I'm having difficulty formulating this, sorry.
Hope it made sense, keep up the good work, cheers from expensive Brussels ;-)
Aurélie</description>
		<content:encoded><![CDATA[<p>Hi Avinash,</p>
<p>Good to find you blogging, sharing thoughts and experiences. It&#8217;s quite some interesting stuff and I hope you enjoy the experience.</p>
<p>I read your different posts on Saturday morning and your thoughts stayed with me for the entire week-end. Thank you.</p>
<p>Yes, statistical significance. I totally join you in the idea and would only add that tests that do not render truly significant results should not be communicated upon. I remember in my first job having warned of the non significance of a test only to find it had heavily influenced a commercial strategy. I vowed never again!</p>
<p>Another pavlovian reaction was to consider that any number of responses under 200 should not be taken into account as it holds high probability of not being representative. I usually follow this first rule and adapt the variables in order to remain loyal to the statistical representativeness of a sample. Quite pavlovian, I agree.</p>
<p>And the last thing is that I&#8217;ll bear statistical significance in mind but would like to suggest another possible subject: correlation between conversion rates.</p>
<p>Siegert suggested this formulation for a client yesterday:<br />
“Is a visitor engaging into A but not engaging into B, converted easier into a lead, than someone engaged into C and B?”<br />
In other words, you&#8217;ve got kind of low level conversion events that influence or not higher goals.<br />
I&#8217;m having difficulty formulating this, sorry.<br />
Hope it made sense, keep up the good work, cheers from expensive Brussels ;-)<br />
Aurélie</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jaimie Scott</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-21</link>
		<dc:creator>Jaimie Scott</dc:creator>
		<pubDate>Thu, 18 May 2006 16:34:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-21</guid>
		<description>Hi Avinash,

I too am very happy to see your blog.  I found it through Clint Ivy's blog and I am enjoying reading your posts very much.  I find them to be quite informative.

You say above:
"You can easily adapt the spreadsheet, as we have, to compute statistical difference between absolute numbers (say you want to know if the difference Page Views Per Visitor or Average Time on Site between segment One and Two is Significant)"

It's not obvious to me how to do this.  Can you elaborate?

Thanks.</description>
		<content:encoded><![CDATA[<p>Hi Avinash,</p>
<p>I too am very happy to see your blog.  I found it through Clint Ivy&#8217;s blog and I am enjoying reading your posts very much.  I find them to be quite informative.</p>
<p>You say above:<br />
&#8220;You can easily adapt the spreadsheet, as we have, to compute statistical difference between absolute numbers (say you want to know if the difference Page Views Per Visitor or Average Time on Site between segment One and Two is Significant)&#8221;</p>
<p>It&#8217;s not obvious to me how to do this.  Can you elaborate?</p>
<p>Thanks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Avinash Kaushik</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-20</link>
		<dc:creator>Avinash Kaushik</dc:creator>
		<pubDate>Thu, 18 May 2006 05:45:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-20</guid>
		<description>Jeff: Glad to see your post...

&lt;blockquote&gt;
I’m wondering now based on this article, how this model plays in with UX decision when applying A/B tests. We often mind minimal differences and often base our decision the winner between the two.
&lt;/blockquote&gt;

If we are doing a/b testing (assuming the Success Goal is clearly articulated and measurable and that it is not "impact on brand") then it would be a sin not to use the spreadsheet in the post above to separate &lt;b&gt;Signal&lt;/b&gt; from &lt;b&gt;Noise&lt;/b&gt;. Simply looking at Conversion Rate (or similar metric) difference is very dangerous because of exactly what you say, how much is enough to be confident.

The great news is that most current a/b testing solutions (at least the ones that do "page testing") already include statistical computations to help us make better decisions. 

If you don't see at least 90% statistical confidence, take the results with a grain of salt.</description>
		<content:encoded><![CDATA[<p>Jeff: Glad to see your post&#8230;</p>
<blockquote><p>
I’m wondering now based on this article, how this model plays in with UX decision when applying A/B tests. We often mind minimal differences and often base our decision the winner between the two.
</p></blockquote>
<p>If we are doing A/B testing (assuming the Success Goal is clearly articulated and measurable, and that it is not &#8220;impact on brand&#8221;), then it would be a sin not to use the spreadsheet in the post above to separate <b>Signal</b> from <b>Noise</b>. Simply looking at the difference in Conversion Rate (or a similar metric) is very dangerous because of exactly what you say: how much of a difference is enough to be confident?</p>
<p>The great news is that most current A/B testing solutions (at least the ones that do &#8220;page testing&#8221;) already include statistical computations to help us make better decisions.</p>
<p>If you don&#8217;t see at least 90% statistical confidence, take the results with a grain of salt.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Avinash Kaushik</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-19</link>
		<dc:creator>Avinash Kaushik</dc:creator>
		<pubDate>Thu, 18 May 2006 05:32:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-19</guid>
		<description>Mark McLaren: Thanks for your kind words about the post, I am glad you found it helpful.

&lt;blockquote&gt;
What else do we need to know about the groups involved in the test?

Are they essentially the same group or are they two completely different groups? (I’m assuming you would want to send offers to as many people as possible; hence, they are the same group - less 100 people in the second case.)
&lt;/blockquote&gt;

In the specific example I used, and in the spreadsheet, you usually control for one thing, but you can have as many groups as you want. For example, you can send one offer to people who live in CA, NY, FL, OR and OH, plug the results into the spreadsheet against a control, and know which works best.

Alternatively you could try 5, 6, 10 whatever number of different offers to a bunch of folks and see which one converts best.

The problem arises when you want to test different offers on different groups (or many different pieces of content in different locations on the same page). Now you are in the world of multivariate testing and need to apply advanced statistics (think &lt;a href="http://en.wikipedia.org/wiki/Genichi_Taguchi" rel="nofollow"&gt;Taguchi&lt;/a&gt;).

Multivariate testing is awesomely powerful and yields great results, but it is beyond my humble spreadsheet.

&lt;blockquote&gt;
Do you need a random sample in order to apply principles of standard deviation? 5,000+ is a good size group from which to draw conclusions,
&lt;/blockquote&gt;

The beauty of using statistics is that the standard deviation in your data and the level of Statistical Significance you require (my suggestion is 95% or higher) will drive how big a sample you need. There is no fixed number (like 5k).

Hope this is the kind of information you were looking for.</description>
		<content:encoded><![CDATA[<p>Mark McLaren: Thanks for your kind words about the post, I am glad you found it helpful.</p>
<blockquote><p>
What else do we need to know about the groups involved in the test?</p>
<p>Are they essentially the same group or are they two completely different groups? (I’m assuming you would want to send offers to as many people as possible; hence, they are the same group - less 100 people in the second case.)
</p></blockquote>
<p>In the specific example I used, and in the spreadsheet, you usually control for one thing, but you can have as many groups as you want. For example, you can send one offer to people who live in CA, NY, FL, OR and OH, plug the results into the spreadsheet against a control, and know which works best.</p>
<p>Alternatively you could try 5, 6, 10 whatever number of different offers to a bunch of folks and see which one converts best.</p>
<p>The problem arises when you want to test different offers on different groups (or many different pieces of content in different locations on the same page). Now you are in the world of multivariate testing and need to apply advanced statistics (think <a href="http://en.wikipedia.org/wiki/Genichi_Taguchi" rel="nofollow">Taguchi</a>).</p>
<p>Multivariate testing is awesomely powerful and yields great results, but it is beyond my humble spreadsheet.</p>
<blockquote><p>
Do you need a random sample in order to apply principles of standard deviation? 5,000+ is a good size group from which to draw conclusions,
</p></blockquote>
<p>The beauty of using statistics is that the standard deviation in your data and the level of Statistical Significance you require (my suggestion is 95% or higher) will drive how big a sample you need. There is no fixed number (like 5k).</p>
<p>Hope this is the kind of information you were looking for.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: June Li</title>
		<link>http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-18</link>
		<dc:creator>June Li</dc:creator>
		<pubDate>Wed, 17 May 2006 17:40:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.kaushik.net/avinash/2006/05/excellent-analytics-tip1-statistical-significance.html#comment-18</guid>
		<description>Hi Avinash,
I had the pleasure of hearing you speak at the 2005 eMetrics. I'm very happy that you've decided to blog.  I too found your blog through Robbin Steif's &lt;a rel="nofollow" title="LunaMetrics Blog" href="http://lunametrics.blogspot.com/"&gt;LunaMetrics Blog&lt;/a&gt;.

It's excellent that you are giving us real examples of how statistics can be used, and providing tool references.  I look forward to additional case studies and discussions.

Will you also be posting about monitoring and managing outside influences?  Sometimes the Noise dampens or deflects the Signal.

Thanks,
Web: &lt;a rel="nofollow" title="ClickInsight.ca" href="http://www.clickinsight.ca/"&gt;www.clickinsight.ca&lt;/a&gt;
Blog: &lt;a rel="nofollow" title="Share the Genie's Power :: ClickInsight Blog" href="http://clickinsight.blogspot.com/"&gt;clickinsight.blogspot.com &lt;/a&gt;</description>
		<content:encoded><![CDATA[<p>Hi Avinash,<br />
I had the pleasure of hearing you speak at the 2005 eMetrics. I&#8217;m very happy that you&#8217;ve decided to blog.  I too found your blog through Robbin Steif&#8217;s <a rel="nofollow" title="LunaMetrics Blog" href="http://lunametrics.blogspot.com/">LunaMetrics Blog</a>.</p>
<p>It&#8217;s excellent that you are giving us real examples of how statistics can be used, and providing tool references.  I look forward to additional case studies and discussions.</p>
<p>Will you also be posting about monitoring and managing outside influences?  Sometimes the Noise dampens or deflects the Signal.</p>
<p>Thanks,<br />
Web: <a rel="nofollow" title="ClickInsight.ca" href="http://www.clickinsight.ca/">http://www.clickinsight.ca</a><br />
Blog: <a rel="nofollow" title="Share the Genie's Power :: ClickInsight Blog" href="http://clickinsight.blogspot.com/">clickinsight.blogspot.com </a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
