{"id":5832,"date":"2012-12-06T22:55:53","date_gmt":"2012-12-06T21:55:53","guid":{"rendered":"http:\/\/idiolect.org.uk\/notes\/?p=5832"},"modified":"2012-12-07T11:05:58","modified_gmt":"2012-12-07T10:05:58","slug":"bootstrap-corrected","status":"publish","type":"post","link":"https:\/\/idiolect.org.uk\/notes\/2012\/12\/06\/bootstrap-corrected\/","title":{"rendered":"Bootstrap: corrected"},"content":{"rendered":"<p>So, previously on this blog (<a href=\"http:\/\/idiolect.org.uk\/notes\/?p=5821\">here<\/a>, and <a href=\"http:\/\/idiolect.org.uk\/notes\/?p=5800\">here<\/a>) I was playing around with the bootstrap as a way of testing if two samples are drawn from a different underlying distribution, by simulating samples with known differences and throwing different tests at the samples. The problem was that I was using the wrong bootstrap test. <a href=\"http:\/\/maths.dept.shef.ac.uk\/pas\/staff_info.php?id=289\">Tim<\/a> was kind enough to look at what I&#8217;d done and point out that I should have concatenated my two sets of numbers and the pulled two samples from that set, calculated the mean difference and then used that statistic to constructed a probability distribution function against which I could compare my measured statistic (ie the difference of means) to perform a hypothesis test (viz. &#8216;what are the chances that I could have got this difference of means if the two distributions are not different?&#8217;). 
For people who prefer to think in code, the corrected bootstrap is at the end of this post.<\/p>\n<p>Using the correct bootstrap method, this is what you get:<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"5833\" data-permalink=\"https:\/\/idiolect.org.uk\/notes\/2012\/12\/06\/bootstrap-corrected\/mc_newboot\/\" data-orig-file=\"https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot.png?fit=1200%2C901&amp;ssl=1\" data-orig-size=\"1200,901\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;}\" data-image-title=\"mc_newboot\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot.png?fit=300%2C225&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot.png?fit=580%2C435&amp;ssl=1\" tabindex=\"0\" role=\"button\" src=\"https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot-300x225.png?resize=300%2C225\" alt=\"\" title=\"mc_newboot\" width=\"300\" height=\"225\" class=\"aligncenter size-medium wp-image-5833\" srcset=\"https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot.png?resize=300%2C225&amp;ssl=1 300w, https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot.png?resize=1024%2C768&amp;ssl=1 1024w, 
https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot.png?resize=399%2C300&amp;ssl=1 399w, https:\/\/i0.wp.com\/idiolect.org.uk\/notes\/wp-content\/uploads\/2012\/12\/mc_newboot.png?w=1200&amp;ssl=1 1200w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>So what you can see is that, basically, the bootstrap is little improvement over the t-test. Perhaps a marginal amount. As <a href=\"http:\/\/masi.cscs.lsa.umich.edu\/~crshalizi\/weblog\/\">Cosma<\/a> pointed out, the ex-gaussian \/ reaction time distributions I&#8217;m using look pretty normal at lower sample sizes, so it isn&#8217;t too surprising that the t-test is robust. Using the median rather than the mean damages the sensitivity of the bootstrap (contra my previous, erroneous, results). My intuition is that the mean, as a statistic, is influenced by the whole distribution in a way the median isn&#8217;t, so it is a better summary statistic (statisticians, you can tell me if this makes sense). 
The mean test is far more sensitive, but, <a href=\"http:\/\/idiolect.org.uk\/notes\/?p=5821\">as discussed previously<\/a>, this is because it has an unacceptably high false alarm rate which is insufficiently penalised by d-prime.<\/p>\n<p>Update: Cosma&#8217;s notes on the bootstrap are <a href=\"http:\/\/www.stat.cmu.edu\/~cshalizi\/uADA\/12\/lectures\/ch05.pdf\">here<\/a> and recommended if you want the fundamentals and are already degree-level comfortable with statistical theory.<\/p>\n<p><b>Corrected bootstrap function:<\/b><\/p>\n<pre>\r\nfunction H=bootstrap(s1,s2,samples,alpha,method)\r\n% Bootstrap test of the null hypothesis that s1 and s2 are drawn from\r\n% the same distribution. Returns H=1 if the null is rejected at level alpha.\r\n\r\n% observed test statistic: difference of means (method==1) or of medians\r\nif method==1\r\n    difference=mean(s2)-mean(s1);\r\nelse\r\n    difference=median(s2)-median(s1);\r\nend\r\n\r\nsstar=[s1 s2];            % pool the two samples (the null hypothesis)\r\na=zeros(1,samples);       % preallocate the bootstrap statistics\r\n\r\nfor i=1:samples\r\n    \r\n    % resample each group, with replacement, from the pooled data\r\n    boot1=sstar(ceil(rand(1,length(s1))*length(sstar)));\r\n    boot2=sstar(ceil(rand(1,length(s2))*length(sstar)));\r\n    \r\n    if method==1\r\n        a(i)=mean(boot2)-mean(boot1);\r\n    else\r\n        a(i)=median(boot2)-median(boot1);\r\n    end\r\n    \r\nend\r\n\r\n% two-tailed test: reject if the observed difference falls outside the\r\n% central 1-alpha interval of the bootstrap null distribution\r\nCI=prctile(a,[100*alpha\/2,100*(1-alpha\/2)]);\r\n\r\nH = CI(1)>difference | CI(2)<difference;\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>So, previously on this blog (here, and here) I was playing around with the bootstrap as a way of testing if two samples are drawn from a different underlying distribution, by simulating samples with known differences and throwing different tests at the samples. The problem was that I was using the wrong bootstrap test. 
Tim [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false},"version":2}},"categories":[5,9],"tags":[],"class_list":["post-5832","post","type-post","status-publish","format-standard","hentry","category-psychology","category-science"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p5KQtW-1w4","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/posts\/5832"}],"collection":[{"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/comments?post=5832"}],"version-history":[{"count":17,"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/posts\/5832\/revisions"}],"predecessor-version":[{"id":5848,"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/posts\/5832\/revisions\/5848"}],"wp:attachment":[{"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/media?parent=5832"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/idiolect.org.uk\/notes\/wp-json\/wp\/v2\/categories?post=5832"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/
idiolect.org.uk\/notes\/wp-json\/wp\/v2\/tags?post=5832"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
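The corrected bootstrap in the post is MATLAB; for readers who work in Python, here is a minimal sketch of the same pooled-resampling test using only the standard library. The function name `bootstrap_test` and its keyword arguments are my own choices, not from the post; the logic follows the MATLAB: pool both samples, repeatedly draw two resamples with replacement, and reject if the observed difference of means (or medians) falls outside the central 1-alpha interval of the bootstrap null distribution.

```python
import random
from statistics import mean, median

def bootstrap_test(s1, s2, n_boot=1000, alpha=0.05, use_median=False, seed=None):
    """Pooled bootstrap test of the null hypothesis that s1 and s2 are
    drawn from the same distribution.  Returns True if the null is
    rejected at level alpha (two-tailed)."""
    rng = random.Random(seed)
    stat = median if use_median else mean
    observed = stat(s2) - stat(s1)          # observed test statistic
    pooled = list(s1) + list(s2)            # pool: the null hypothesis
    # bootstrap distribution of the statistic under the null
    diffs = sorted(
        stat(rng.choices(pooled, k=len(s2))) - stat(rng.choices(pooled, k=len(s1)))
        for _ in range(n_boot)
    )
    # empirical central 1-alpha interval of the bootstrap distribution
    lo = diffs[int(n_boot * alpha / 2)]
    hi = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    return observed < lo or observed > hi
```

With clearly separated samples (e.g. Gaussians with means 0 and 3) the test rejects; with two copies of the same sample the observed difference is zero and it does not.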