I've been working on a book for a few years now. I'm always keeping an eye out for good examples and problems I can use as exercises. Often, this means finding a brief reference, then digging into the scientific literature to flesh out the material from its original source. Sometimes everything checks out, and I get a good new example. Other times, the process breaks down.
The other day, I found "What does randomness look like", an article from 2012 picked up by WIRED. This article mentioned two things I hadn't seen before. One was a reference to glow worms making non-random patterns on the roofs of caves.
This seemed like a really cool example to show students, so I started digging for the original material. The author credits Steven Pinker's book "The Better Angels of our Nature" as the source, but cites no peer-reviewed scientific article or data. When I got the book, sure enough, I found the example on page 205, and Pinker gave better references. The example actually comes from a story by Stephen Jay Gould, from the book "Bully for Brontosaurus". Gould has long been an icon of science writing in my family, so that discovery made me feel warm and fuzzy. The story goes that Gould visited caves in New Zealand and used evolutionary theory to deduce that the glow worm populations would evolve to be more equally spaced than random. It is a good story, but one that you'd want data to support. And there is the catch: the picture credited to glow worms is not real data. It's a simulation created by Ed Purcell for Gould!
So, there's the problem. As it seems to stand at the moment, Gould's hypothesis was just that -- a hypothesis. There might be confounding factors like the shape of the cave roof. We really need a careful analysis of real glow worm data to determine if he's right or not.
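As a rough sketch of what such an analysis might look like, one standard starting point is the Clark-Evans nearest-neighbour ratio: the average distance from each point to its nearest neighbour, divided by the value expected if the points were scattered completely at random at the same density. A ratio near 1 is consistent with randomness, and a ratio well above 1 suggests more even spacing. The function name, the flat rectangular region, and the two synthetic point sets below are my own assumptions for illustration; none of this is real glow worm data, and the sketch ignores edge effects and the shape of the cave roof.

```python
import math
import random


def clark_evans_ratio(points, area):
    """Observed mean nearest-neighbour distance divided by the value
    expected under complete spatial randomness (no edge correction)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point (brute force).
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        total += nearest
    observed = total / n
    # For a random scatter with density n / area, the expected mean
    # nearest-neighbour distance is 1 / (2 * sqrt(density)).
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


# Synthetic illustration only: these are NOT glow worm measurements.
rng = random.Random(0)
random_points = [(rng.random(), rng.random()) for _ in range(400)]
grid_points = [((i + 0.5) / 20, (j + 0.5) / 20)
               for i in range(20) for j in range(20)]

print(clark_evans_ratio(random_points, 1.0))  # close to 1 for a random scatter
print(clark_evans_ratio(grid_points, 1.0))    # well above 1 for even spacing
```

With real coordinates you would also want an edge correction and a significance test, and some way of handling an uneven cave roof, before reading much into the ratio.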
I have not checked the text of Gould's article yet to see if he provides greater detail and references. I have not found any independent support for the idea yet either, although I did find this article, which appears to have real glow worm location data available for analysis. Unfortunately, it's an old scanned version, and the image quality might not be good enough to tease out the answer to the question of interest.
This blog post also has another issue with its storytelling, as I read it. In contrasting the work of Poisson with that of his contemporary Quetelet, it says Poisson "argued that Quetelet was missing a model for his data." This seems a problematic statement. I have not had time to read Poisson in his original language or in English translation, but in Stigler's discussion of these works (I've found Stigler very authoritative in other contexts), Poisson introduces and promptly abandons his distribution, and never so concisely identifies Quetelet's shortcomings as missing a "model". The concept of a model doesn't really seem to have leaked into statistics by this point in time.