<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" > <channel><title>Comments on: Web Analytics Data Sampling 411</title> <atom:link href="http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html/feed" rel="self" type="application/rss+xml" /><link>http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html</link> <description>Pluralitas non est ponenda sine neccesitate.</description> <lastBuildDate>Tue, 16 Mar 2010 15:01:56 +0000</lastBuildDate> <generator>http://wordpress.org/?v=2.9.2</generator> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <item><title>By: Conciliando datos de diferentes herramientas &#171; Blog2puntocero</title><link>http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html/comment-page-1#comment-480385</link> <dc:creator>Conciliando datos de diferentes herramientas &#171; Blog2puntocero</dc:creator> <pubDate>Tue, 06 Jan 2009 16:19:42 +0000</pubDate> <guid isPermaLink="false">http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html#comment-480385</guid> <description>[...] Real-time reporting on large volumes of traffic is the main justification for using this technique, which, to be more precise, comes in several variants that can give it greater prominence (I recommend this post for a deeper look at the topic). The obstacle course suggests that the ideal situation is to rely on a single tool. A few weeks ago Avinash published a very interesting checklist on this subject on his blog. 
[...]</description> <content:encoded><![CDATA[<p>[...]<br /> Real-time reporting on large volumes of traffic is the main justification for using this technique, which, to be more precise, comes in several variants that can give it greater prominence (I recommend this post for a deeper look at the topic).</p><p>The obstacle course suggests that the ideal situation is to rely on a single tool. A few weeks ago Avinash published a very interesting checklist on this subject on his blog.<br /> [...]</p> ]]></content:encoded> </item> <item><title>By: Avinash Kaushik</title><link>http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html/comment-page-1#comment-162607</link> <dc:creator>Avinash Kaushik</dc:creator> <pubDate>Tue, 10 Jul 2007 05:11:20 +0000</pubDate> <guid isPermaLink="false">http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html#comment-162607</guid> <description>&lt;b&gt;&lt;font color=blue&gt;c0t0s0d0 : &lt;/font&gt;&lt;/b&gt; How delightfully clever!!! I don&#039;t have your real email address but please consider this as my thanks for coming back and helping clear up the mystery. That was kind of you. -Avinash.</description> <content:encoded><![CDATA[<p><b><font color=blue>c0t0s0d0 : </font></b> How delightfully clever!!!</p><p>I don&#039;t have your real email address but please consider this as my thanks for coming back and helping clear up the mystery. That was kind of you.</p><p>-Avinash.</p> ]]></content:encoded> </item> <item><title>By: Steve</title><link>http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html/comment-page-1#comment-162604</link> <dc:creator>Steve</dc:creator> <pubDate>Tue, 10 Jul 2007 05:02:31 +0000</pubDate> <guid isPermaLink="false">http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html#comment-162604</guid> <description>Well it&#039;s a little unfair to Dylan... 
:-) &lt;blockquote&gt; c0t0s0d0 = cylinder zero, track zero, slice zero, disk zero &lt;/blockquote&gt; Urm. No actually. :-) For starters it should be: c0t0d0s0. c == controller, t == target, d == drive, s == slice. This is a &quot;partition&quot; identifier, not a position-on-disk identifier. That would be pointless at this level. Target is the SCSI ID. When you use the format command, you drop the slice to work with disks and hence set up your slices (or partitions). e.g.: &lt;code&gt; # format ... 0. c0t0d0 /pci@1f,4000/scsi@3/sd@0,0 1. c0t1d0  sol_8 /pci@1f,4000/scsi@3/sd@1,0 2. c0t2d0 /pci@1f,4000/scsi@3/sd@2,0 ... &lt;/code&gt; Three very different 9G SCSI drives on the one backplane. You might even decode some of that from the /pci... address too. :-) Speaking as someone who still is a Solaris sysadmin and has been for about 15 years. :-) Cheers!</description> <content:encoded><![CDATA[<p>Well it&#039;s a little unfair to Dylan&#8230; :-)</p><blockquote><p> c0t0s0d0 = cylinder zero, track zero, slice zero, disk zero</p></blockquote><p>Urm. No actually. :-)</p><p>For starters it should be: c0t0d0s0.<br /> c == controller<br /> t == target<br /> d == drive<br /> s == slice</p><p>This is a &#034;partition&#034; identifier, not a position-on-disk identifier. That would be pointless at this level.</p><p>Target is the SCSI ID.<br /> When you use the format command, you drop the slice to work with disks and hence set up your slices (or partitions).</p><p>e.g.:<br /> <code><br /> # format<br /> ...<br /> 0. c0t0d0<br /> /pci@1f,4000/scsi@3/sd@0,0<br /> 1. c0t1d0  sol_8<br /> /pci@1f,4000/scsi@3/sd@1,0<br /> 2. c0t2d0<br /> /pci@1f,4000/scsi@3/sd@2,0<br /> ...<br /> </code></p><p>Three very different 9G SCSI drives on the one backplane. You might even decode some of that from the /pci&#8230; address too. :-)</p><p>Speaking as someone who still is a Solaris sysadmin and has been for about 15 years. 
:-)</p><p>Cheers!</p> ]]></content:encoded> </item> <item><title>By: c0t0s0d0</title><link>http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html/comment-page-1#comment-162467</link> <dc:creator>c0t0s0d0</dc:creator> <pubDate>Tue, 10 Jul 2007 02:07:09 +0000</pubDate> <guid isPermaLink="false">http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html#comment-162467</guid> <description>&lt;blockquote&gt;PS: Dylan Lewis loves riddles, I should ask him to see if he can solve what c0t0s0d0 stands for!! :)&lt;/blockquote&gt; c0t0s0d0 = cylinder zero, track zero, slice zero, disk zero (from my Solaris admin days :-) </description> <content:encoded><![CDATA[<blockquote><p>PS: Dylan Lewis loves riddles, I should ask him to see if he can solve what c0t0s0d0 stands for!! :)</p></blockquote><p>c0t0s0d0 = cylinder zero, track zero, slice zero, disk zero (from my Solaris admin days :-)</p> ]]></content:encoded> </item> <item><title>By: max</title><link>http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html/comment-page-1#comment-155394</link> <dc:creator>max</dc:creator> <pubDate>Tue, 03 Jul 2007 12:17:39 +0000</pubDate> <guid isPermaLink="false">http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html#comment-155394</guid> <description>Very insightful session, and clear that progress has been and continues to be made in this area.  That said, there remains a HUGE gap in the dialog.  When we talk about how to draw meaningful conclusions from statistically relevant samples of the data, we can&#039;t forget about the underlying data quality to begin with.  If, for example, the data itself has an inherent error rate which renders it statistically impaired, then no amount of sampling, however precise or sophisticated, will be able to restore statistical significance.  A lot can and routinely does go wrong at the source which undermines all efforts flowing from it.  
For example, tags can be missing entirely.  If they&#039;re not missing, they can be inoperable (e.g., due to syntax errors and the like).  If they&#039;re operating properly, they can be generating inaccurate variables.  If the variables are accurate, the beacon parameters can be wrong.  These are not hypotheticals.  The incidence of such errors is shocking (error rates average well over 25% for relatively sophisticated practitioners - CODE SUPER BRIGHT RED!).  And by the way, this is &quot;systemic&quot; error and not &quot;random&quot; error, so the effects of such poor data quality cannot be made up through quantity.  The implications are profound.  Refining sampling methods is absolutely necessary for many reasons, but in truth, doing so without concurrently addressing the underlying data quality problems is a whole lot like painting over dirt.</description> <content:encoded><![CDATA[<p>Very insightful session, and clear that progress has been and continues to be made in this area.  That said, there remains a HUGE gap in the dialog.  When we talk about how to draw meaningful conclusions from statistically relevant samples of the data, we can&#039;t forget about the underlying data quality to begin with.  If, for example, the data itself has an inherent error rate which renders it statistically impaired, then no amount of sampling, however precise or sophisticated, will be able to restore statistical significance.  A lot can and routinely does go wrong at the source which undermines all efforts flowing from it.  For example, tags can be missing entirely.  If they&#039;re not missing, they can be inoperable (e.g., due to syntax errors and the like).  If they&#039;re operating properly, they can be generating inaccurate variables.  If the variables are accurate, the beacon parameters can be wrong.  These are not hypotheticals.  
The incidence of such errors is shocking (error rates average well over 25% for relatively sophisticated practitioners &#8211; CODE SUPER BRIGHT RED!).  And by the way, this is &#034;systemic&#034; error and not &#034;random&#034; error, so the effects of such poor data quality cannot be made up through quantity.  The implications are profound.  Refining sampling methods is absolutely necessary for many reasons, but in truth, doing so without concurrently addressing the underlying data quality problems is a whole lot like painting over dirt.</p> ]]></content:encoded> </item> <item><title>By: anonymous</title><link>http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html/comment-page-1#comment-154374</link> <dc:creator>anonymous</dc:creator> <pubDate>Mon, 02 Jul 2007 15:24:22 +0000</pubDate> <guid isPermaLink="false">http://www.kaushik.net/avinash/2007/06/web-analytics-data-sampling-411.html#comment-154374</guid> <description>I think you may have forgotten one.  I work with a vendor who can apply a percentage governor to the JS lib file and cookie a sample of the UVs hitting the site.  If the % is set to 10 then 1/10 of the UVs are tracked. This is a bit different than what you explain or how I understood your explanation, so I might have missed something. It seems closest to the ORANGE method, but I&#039;m not sure since you imply it&#039;s page-based vs. UV-based.  The JS code still sends data for every page but for only a % of the UV traffic based on a cookie identifier. You can even create new metrics based on the sample to get up to almost normal levels by creating a multiplier on UVs or PVs.</description> <content:encoded><![CDATA[<p>I think you may have forgotten one.  I work with a vendor who can apply a percentage governor to the JS lib file and cookie a sample of the UVs hitting the site.  If the % is set to 10 then 1/10 of the UVs are tracked. 
This is a bit different than what you explain or how I understood your explanation, so I might have missed something.</p><p>It seems closest to the ORANGE method, but I&#039;m not sure since you imply it&#039;s page-based vs. UV-based.  The JS code still sends data for every page but for only a % of the UV traffic based on a cookie identifier. You can even create new metrics based on the sample to get up to almost normal levels by creating a multiplier on UVs or PVs.</p> ]]></content:encoded> </item> </channel> </rss>