Monday, April 07, 2014

Numbers don't lie, but....


The go-to for proof of anything from life on Mars to Schroedinger's Cat is the underlying theory of statistical analysis.  I did this number stuff in college in the dark ages when a computer arriving in the psych lab was a huge source of excitement even though it was nothing more than a key punch machine.  If you're old enough to remember those, 'nuff said.  If not, the deal was you used a typewriter-style keyboard to type the formula (of which there were many) into the machine which punched holes in a card.  Then each piece of data was entered in the same way on a separate card and the whole mess was stuck into the top of the next machine.  In a few hours, another card would come out the side.  That was the solution to the problem.  That got stuck back into the machine for a printout of the answer.

Now you need only ask Google, because someone else already asked another computer to do that math and put it online, you lucky duck!

That early computer was not much in the way of technology, but it was a heck of a lot better than doing it longhand on paper as I had been doing for the first three years of college.  We psych majors sweated bullets for our shot at the Sensory Deprivation Room.

Now, keeping that scenario in mind, think about how many rats I had to put through the maze or how many humans I had to have hold their hand on a spring while they did an anagram puzzle to get even a semblance of reliable data.  The answer is usually fewer than 50.  Yet that data was important, and, in the spring-thing case, led to a scholarly paper on learning theory--not by me--and dozens of other experiments.

[thanks XKCD]
Here's the thing about statistics:  In most cases, the sample is limited.  It's not every person over the age of 12, everyone who has ever driven a Chevy, or every horse in the world.  It's whatever size group of those the researcher was able to accumulate before starting the experiment.  In the case of the spring devices (which I had a custodian build, much to his delight at being part of the process), I was out to prove that stress in the form of physical discomfort would lower scores on tests.  My subjects were two junior high classes at a nearby school.  That's about 50 kids.  Period.  Based on how 50 kids handled (or didn't--there's always a control group that doesn't get the medicine or has the springs loosened) the effort of an unfamiliar test while holding a tightly-wound spring down with full pressure, a theory was born.

The researcher then takes the data and sorts it.  The first thing that happens is that the outliers--the scores completely out of whack with the average that suggest cheating or some unexpected variable like a stuck spring--are dumped.  So now there were maybe 45 in the sample.  Based on that, I was able to report that the physical stress made no difference.
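
For the curious, that dumping step can be sketched in a few lines of Python.  This is an illustration only: the scores are made up, the function name drop_outliers is invented for the example, and the cutoff (anything more than two standard deviations from the average) is an assumed rule of thumb, not necessarily the one applied to the spring data.

```python
import statistics


def drop_outliers(scores, max_sd=2.0):
    """Split scores into (kept, dropped) using an assumed rule:
    drop anything more than max_sd standard deviations from the mean."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation
    kept = [s for s in scores if abs(s - mean) <= max_sd * sd]
    dropped = [s for s in scores if abs(s - mean) > max_sd * sd]
    return kept, dropped


# Made-up test scores; the 48 stands in for a stuck spring or a cheater.
scores = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20, 48]

kept, dropped = drop_outliers(scores)
print("dropped:", dropped)
print("average with outlier:", round(statistics.mean(scores), 1))
print("average without outlier:", round(statistics.mean(kept), 1))
```

One odd score is enough to pull the average of this little group from 16.4 up past 19, which is why researchers are tempted to toss it.  Whether tossing it is fair is a judgment call, not a law of nature.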


But, was that accurate?  It was for that group.  Period.  For those students in that school in Worcester, Mass, during that week, and with that device I created, it was.  The professor who borrowed my springs and my theory took it to the next level and I don't know what he found.  I graduated and couldn't have cared less.  But you can see where this is headed.

Recently statistics have been all the rage.  Just last week someone said to me that 73% of Americans share a particular position on an idea.  Funny, but no one asked me.  No one asked anyone I know.  So how does a think tank come up with the "majority" position?  By picking a baseline group of likely humans and extrapolating--guessing--from there how the rest of the universe would respond.

The study done last year on how our public school kids are faring on the SAT used a sample of 6,000 students.  Six thousand out of many millions.  No one mentioned where they were located, what sort of schools they attended, how wealthy or poor their families might have been.  They were just 6,000 students chosen for reasons only the researchers know, much like my classroom kids, who were chosen because they were within walking distance of the college.
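
The textbook arithmetic behind a "margin of error" is short enough to sketch in Python.  The formula below is the standard one for a yes/no question at roughly 95% confidence, and it assumes something the headlines rarely confirm: that every subject was picked at random.  The function name margin_of_error and the choice of sample sizes (my 50 kids, the study's 6,000) are only for illustration.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p measured on a
    simple random sample of size n.  p=0.5 is the worst case."""
    return z * math.sqrt(p * (1 - p) / n)


for n in (50, 6000):
    points = 100 * margin_of_error(n)
    print(f"sample of {n}: plus or minus about {points:.1f} percentage points")
```

On paper, 6,000 looks far steadier than 50: roughly 1 point either way instead of roughly 14.  But the formula knows nothing about where those students came from.  If they weren't chosen at random, the tidy number doesn't apply.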

Just this week, results of a study on the efficacy of joint supplements in horses were reported in The Horse.  The conclusion was that they don't work.  The sample included 24 saddle horses (out of some 6 million in this country alone) and one chemical compound (out of at least a dozen available at any given time).

There's a huge difference in the type of data reflected in statistics.  Seventeen years ago, the NJDA came around and counted the number of horses we each had.  They showed up in person and took a head count.  How accurate might that statistic be?  Well, I know for a fact that they missed the guy around the corner who has six sheep and a horse.  And the biggest boarding farms in the area tend to house upwards of 100 horses at any given time, and that population is in constant flux as owners move their animals, or animals die, or new ones are born.

When you read that 180 people were in attendance at an event, and that last year's event only drew 120, that's a stat you can rely on.  It's simple enough for whoever sold the tickets or took the entry fees to count how many there were.  I could tell you on any given day exactly how many students were in my English class.  The number only changed by virtue of absence or transfer, and the students were easily accounted for.

But when you see a headline that states that there are X number of something world-wide, it's probably wrong.  It's probably very wrong by the regular Joe's standards.  "Statistically significant" doesn't mean important or certain.  It means the result the researcher got landed far enough from what chance alone would be expected to produce, usually measured in "standard deviations" from the expected average (mean), that it's unlikely to be a fluke.  We can make it even more complicated and questionable by throwing in the three M's--mean (average), mode (the answer that shows up most often), and median (the middle of the pack numerically speaking)--and making our statistics reflect something completely different.
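
To see how far apart the three M's can land, here is a small Python sketch.  The numbers are invented: nine imaginary farms on one road, eight small ones and one big boarding barn.  Nothing here comes from a real count.

```python
import statistics

# Made-up data: horses kept on each of nine hypothetical farms.
horses_per_farm = [1, 1, 1, 2, 2, 3, 4, 6, 100]

mean = statistics.mean(horses_per_farm)      # total divided by number of farms
median = statistics.median(horses_per_farm)  # the middle value when sorted
mode = statistics.mode(horses_per_farm)      # the value that shows up most often

print(f"mean:   {mean:.1f}")
print(f"median: {median}")
print(f"mode:   {mode}")
```

Same road, same horses, three different "typical" farms: about 13 horses by the mean, 2 by the median, and 1 by the mode.  Whoever writes the headline gets to pick.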

Let's also throw in the biggest ringer, which is that dealing with live subjects is inherently flawed.  I can do that spring experiment today, and then follow the subjects for ten years to see if they developed PTSD from my tinkering.  But I can't go back in time.  I can't account for other experiences they might have or prior craziness.  The same goes for any animal.  The researcher can't undo what's been done to see if not doing it would make a difference.  I can give my horse a supplement and decide that he got better (or didn't) because of it, but I can't go back and un-give it to see if that makes a difference.  I can't un-teach the student, un-ride the horse, or un-shoot the gun to see what would have happened if.

So take your statistics with a handful of salt.  They give us something to think about, but the thinking piece is up to us.

Start with this.
