The one where Holly complains about statistical packages and software in general

Big stresses are not what will, in the end, drive me to insanity. It will be the accumulation of little things that finally pushes me into the ravine upon whose edge I live so perilously.

These little things, things that should be easy but somehow aren’t, most commonly present themselves in my life as Stupid Programming Issues in the statistical packages I am sometimes called upon to use for my work. Most of these packages (STATA, SPSS, SAS) are incredibly expensive, and they demand both the privilege of access and a climb up the learning curve required to use them. (Understanding what those numbers actually mean is a serious problem within the sciences themselves… poorly trained social, behavioral, and medical scientists, lack of good theory and critical thinking in higher education programs… I could go on, but this rant is about software.)

There is one free package, Epi Info, made by the CDC. It is what statistical software should be in public health: free, able to run on older systems, fairly easy to learn, and with access to standard comparisons of health/nutritional indicators. It is commonly used in the field for the reasons listed above. But it is not particularly powerful in analysis. Although it is not hard to manipulate for basic information once you’re set up in it, getting to the point where your data is in the system and correct is difficult; the software is not user friendly. Paul keeps telling me that we should write a grant to fund him making a new package, something that would offer more statistical power but be a bit more intuitive in its interface. This does not sound like a good idea. If we did this and people found out, I fear our doorstep would be darkened daily by strung-out graduate students seeking vengeance.

And right now, I’m beating my head against the wall because the damn thing won’t read my run files. Don’t get me started on how many times this thing has crashed. It is insisting on click-by-click dummy entry of everything and just making my life really suck. It would take all day to recode half of these variables without a run file. I am ranting only because I decided to chuck this version and download an updated package, and I need to vent while I wait. I could do these things in STATA so much faster, and have considered changing the data and studying it elsewhere and then loading it back into Epi Info, but I think I need the practice here. I’m trying to write a lab assignment and need to be able to test that these things work before unleashing Master’s students on it (they are already nervous and unsure about the whole “lab” thing).

What is really firing me up is that these are REALLY SIMPLE things I’m struggling with. User error is always a factor, although the same command that works one second is full of syntax errors the next. Wa…? Unfortunately, the “HELP” aspects of the software are miserable.

Which brings me to a point: if I am having problems here, in my cushy home, with all the resources I need to figure out a solution around me, how can we rely on this in the field? Shouldn’t a profession like public health, with such important implications in surveillance and survey data, have truly excellent software that is free to use and widely available? And if Epi Info is our current solution to this issue, then for heaven’s sake… why isn’t it available for Mac OS (at the least) or Ubuntu?

Is anyone out there a big data geek who knows Epi Info and is ready for some questions???