{"id":479,"date":"2009-02-09T20:53:25","date_gmt":"2009-02-10T01:53:25","guid":{"rendered":"http:\/\/itp.indiamos.com\/blog\/?p=479"},"modified":"2009-10-22T22:07:59","modified_gmt":"2009-10-23T03:07:59","slug":"lies-damn-lies-and-statistics","status":"publish","type":"post","link":"https:\/\/itp.indiamos.com\/blog\/2009\/02\/09\/lies-damn-lies-and-statistics\/","title":{"rendered":"Lies, damn lies, and statistics"},"content":{"rendered":"<p><a href='http:\/\/itp.indiamos.com\/blog\/wp-content\/uploads\/2009\/02\/poster.pdf'><img loading=\"lazy\" style=\"border:1pt solid gray;\" src=\"https:\/\/i2.wp.com\/itp.indiamos.com\/blog\/wp-content\/uploads\/2009\/02\/poster-400x600.png?resize=200%2C300\" alt=\"Mainstreaming Information project proposal poster\" title=\"Mainstreaming Information project proposal poster\" width=\"200\" height=\"300\" class=\"alignnone size-medium wp-image-480\" srcset=\"https:\/\/i0.wp.com\/itp.indiamos.com\/blog\/wp-content\/uploads\/poster.png?w=400&amp;ssl=1 400w, https:\/\/i0.wp.com\/itp.indiamos.com\/blog\/wp-content\/uploads\/poster.png?w=682&amp;ssl=1 682w, https:\/\/i0.wp.com\/itp.indiamos.com\/blog\/wp-content\/uploads\/poster.png?w=1896&amp;ssl=1 1896w, https:\/\/i0.wp.com\/itp.indiamos.com\/blog\/wp-content\/uploads\/poster.png?w=948&amp;ssl=1 948w, https:\/\/i0.wp.com\/itp.indiamos.com\/blog\/wp-content\/uploads\/poster.png?w=1422&amp;ssl=1 1422w\" sizes=\"(max-width: 200px) 100vw, 200px\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Today we presented ideas for our semester-long projects in <a href=\"http:\/\/www.christianmarcschmidt.com\/NYU2009\/docu.html\">Mainstreaming Information<\/a>. 
The assignment, which apparently I was not the only person to be confused by, is over at <a href=\"http:\/\/www.christianmarcschmidt.com\/NYU2009\/components\/090202_semester_project.pdf\">Christian&#8217;s site (PDF, 36 KB)<\/a>.<br \/>\n<!--more--><br \/>\nLast week we had to bring in some &#8220;jaw-dropping statistics&#8221; to start considering working with, and because I&#8217;ve decided that I&#8217;ll get more out of the rest of my time at ITP if I keep my schoolwork linked to&#8212;duh&#8212;stuff I&#8217;m actually interested in, I selected a couple of tidbits from Dan Poynter&#8217;s mass of <a href=\"http:\/\/BookStatistics.com\/\">book industry statistics<\/a>\u2014<\/p>\n<blockquote><p>1993\u20132003: The number of titles published increased 58% while fiction readers declines 14%,<br \/>\n\u2014Malcolm Jones in <cite>Newsweek<\/cite>. Sources: NEA and RR Bowker.<\/p>\n<p>2004. 56.6% of adult Americans said they read at least one book, fiction or non-fiction, between August 2001 and August 2002 compared to 60.9% ten years prior.<\/p>\n<p>2002. 57% of the US population read a book. See report.<br \/>\n<a href=\"http:\/\/www.nea.gov\/pub\/readingatrisk.pdf\">http:\/\/www.nea.gov\/pub\/readingatrisk.pdf<\/a><\/p>\n<p>Most readers do not get past page 18 in a book they have purchased.<\/p><\/blockquote>\n<p>\u2014and John Kremer&#8217;s <a href=\"http:\/\/bookmarket.com\/statistics6.htm\">Recent Statistics Related to<br \/>\nBook Publishing and Marketing<\/a>\u2014<\/p>\n<blockquote><p>In a survey of 4,000 adults in the United Kingdom, 55% said \u201cthey buy books for decoration, and have no intention of actually reading them.\u201d (Teletext) This is another important reason why your books should be well-designed. They should look good on a buyer&#8217;s coffee table, bookshelf, bedside stand, etc.<\/p><\/blockquote>\n<p>These served the purpose at hand, but they&#8217;re all just isolated data points. 
So over the weekend I spent several hours digging around for more information, but for none of these could I find enough reliable numbers to support a semester-long project. I was also looking for any compelling information about e-book sales versus print or audio books, and this morning I spent a while rummaging around on <a href=\"http:\/\/www.teleread.org\/\">TeleRead<\/a>. They had all sorts of statistics, none of which quite fit my needs, though they did give me a few more ideas about stuff I&#8217;d <em>like<\/em> to have statistics about. So around 11 a.m., with 3.5 hours left until class, I <a href=\"http:\/\/twitter.com\/indiamos\/statuses\/1192187395\">lazytweeted it<\/a>, as a last resort. And I immediately got a bunch of responses from my nice, nice friends! Erin pointed me to the completely bitchen <a href=\"http:\/\/labs.timesonline.co.uk\/bookscraper\/\">Book Scraper<\/a>, from the London <cite>Times<\/cite>&#8217;s R&#038;D labs, and reminded me that the <cite>New York Times<\/cite> has an <a href=\"http:\/\/developer.nytimes.com\/docs\"><acronym title=\"Application programming interface\">API<\/acronym> for its best-seller lists<\/a>.<\/p>\n<p>In the end, I decided I&#8217;d better scale down from the macro to the micro view, so that I could use data I might actually <em>get<\/em>: vocabulary statistics scraped (using my mad new <a href=\"\/category\/a2z\/\">Programming from A to Z<\/a> skillz) from <a href=\"http:\/\/www.gutenberg.org\/\">Project Gutenberg<\/a> e-books, compared with those from recent <em>Times<\/em> best sellers. 
And then I went and found a Jaw-Dropping Statistic (which, not coincidentally, <a href=\"http:\/\/www.straightdope.com\/columns\/read\/2724\/does-the-average-american-student-have-less-vocabulary-today-than-in-days-gone-by\">is bullshit<\/a>; favorite line in the Straight Dope article: &#8220;At times it&#8217;s been attributed to Gallup polls or even entomologists.&#8221;) that went with the data I was planning to gather. Kind of bass-ackwards, but the result is the poster-style project proposal above, which was deemed Not Entirely Stupid during the classroom critique, despite its having been printed way too large, in fifteen 8.5 &times; 11-inch tiles, and glue-sticked-together in class using the second-worst glue stick in the universe (the worst being the one I had brought from home, which, it turned out, had dried&nbsp;up).<\/p>\n<p>Now, of course, I&#8217;m not even sure I can get files of contemporary best sellers to scrape, because of stupid !@%# <acronym title=\"digital rights management\">DRM<\/acronym>, so I&#8217;m kind of hoping that the Data Fairy will come to my aid. But my project is at least <em>theoretically<\/em> possible. Developing&nbsp;.&nbsp;.&nbsp;.<\/p>\n<p><em>Bonus: Find the typo in the poster!<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today we presented ideas for our semester-long projects in Mainstreaming Information. 
The assignment, which apparently I was not the only person to be confused by, is over at Christian&#8217;s site (PDF, 36 KB).<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false},"categories":[26,46,4,36],"tags":[],"jetpack_featured_media_url":"","jetpack_publicize_connections":[],"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3qY10-7J","_links":{"self":[{"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/posts\/479"}],"collection":[{"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/comments?post=479"}],"version-history":[{"count":19,"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/posts\/479\/revisions"}],"predecessor-version":[{"id":727,"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/posts\/479\/revisions\/727"}],"wp:attachment":[{"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/media?parent=479"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/categories?post=479"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/itp.indiamos.com\/blog\/wp-json\/wp\/v2\/tags?post=479"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}