“There are three kinds of lies: lies, damned lies and statistics.”
- Mark Twain’s Own Autobiography
There’s been quite a hullabaloo over a recent WSJ article about the Best College Majors for a Career. Actually, it’s less an article and more a badass compilation of data of the kind that analysts like yours truly drool over. It ranks quite a few majors on several criteria, such as unemployment, earnings, and popularity. Among the findings surprising to no one:
- Hard sciences, mathematics, engineering, and business have the lowest unemployment rates
- Psychology and the arts are fine if you want a 1 in 5 chance of not working
- School counseling has a quartile that makes less than I did in high school working in the mall
- Petroleum engineers start off making more than I do now.
This is fascinating for many reasons, but I thought I’d focus on some of the pitfalls of this type of data. The blog Code and Culture has already taken a stab at this with a couple of useful scatter-plot charts.
The basic finding there is that many of the majors with a small sample size are over-represented at both the high and low ends of the unemployment rankings. That is, they are outliers.
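To see why small samples land at both extremes, here is a rough sketch in Python. It assumes every major has the same true unemployment rate and differs only in how many people were sampled. The 5% rate, the sample sizes of 50 and 5,000, and the function names are all made up for illustration; none of them come from the WSJ data.

```python
import math
import random


def standard_error(rate, sample_size):
    """Standard error of an observed proportion under simple random sampling."""
    return math.sqrt(rate * (1 - rate) / sample_size)


def simulate_observed_rates(true_rate, sample_size, trials, seed=0):
    """Draw `trials` samples and return the unemployment rate seen in each."""
    rng = random.Random(seed)
    rates = []
    for _ in range(trials):
        unemployed = sum(rng.random() < true_rate for _ in range(sample_size))
        rates.append(unemployed / sample_size)
    return rates


TRUE_RATE = 0.05  # hypothetical: identical for every simulated major

small = simulate_observed_rates(TRUE_RATE, sample_size=50, trials=200)
large = simulate_observed_rates(TRUE_RATE, sample_size=5000, trials=200)

# Gap between the best-looking and worst-looking simulated major.
spread_small = max(small) - min(small)
spread_large = max(large) - min(large)

print(f"n=50:   observed rates span {spread_small:.3f}")
print(f"n=5000: observed rates span {spread_large:.3f}")
```

Even though nothing differs between the simulated majors except sample size, the small-sample group spreads far more widely, so it supplies most of the apparent "best" and "worst" entries in a ranking.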
An additional complaint of mine is that this ignores critical components of our modern higher education system such as impacted majors, that is, majors that more people wish to pursue than the programs can accommodate. I may wish to give up my career as a financial analyst and become a petroleum engineer, but first I need to find a program that will let me in (on top of getting admitted to the school itself through the rigorous application process).
These data are also compiled from only a single point in time (the 2010 census). Much more interesting would be a trended set of data (or a median over, say, five years). This would smooth out many outliers, given all of the upheaval we’ve seen since 2008. How about a geographic perspective? I’m sure New York and California are heavily represented, but what about other states? Construction Management may be having a tough time in California, Nevada, and Florida but could be thriving in Colorado, Utah, and Idaho.
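As a quick illustration of the multi-year median idea, here is a minimal Python sketch. The yearly rates below are invented for a single hypothetical major (they are not census figures), and `multi_year_median` is just a name I picked for the helper.

```python
from statistics import median

# Hypothetical unemployment rates (%) for one major, by year. Not real data.
rates_by_year = {2006: 4.1, 2007: 4.3, 2008: 5.0, 2009: 9.8, 2010: 7.2}


def multi_year_median(rates, end_year, window=5):
    """Median rate over the `window` years ending at `end_year`."""
    years = [y for y in range(end_year - window + 1, end_year + 1) if y in rates]
    if not years:
        raise ValueError("no data in the requested window")
    return median(rates[y] for y in years)


single_year = rates_by_year[2010]
smoothed = multi_year_median(rates_by_year, end_year=2010)

print(f"2010 only:        {single_year}%")
print(f"5-year median:    {smoothed}%")
```

With these made-up numbers, the single-year snapshot reads 7.2% while the five-year median reads 5.0%, so one rough stretch no longer defines the major.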
Finally, all of these data points are based on respondent data. Just as people at high school reunions are more likely to be the successful ones (people who feel they don’t have anything to brag about may be less likely to attend), the people responding to the census with their major and earnings data are probably more likely to be people who consider themselves prosperous. A similar complaint is made about the cost of weddings.
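The respondent problem can be put into numbers with a toy calculation. The Python sketch below assumes three income groups of equal size and assumes that higher earners are more likely to answer. The incomes and response probabilities are invented purely to show the mechanism, not estimated from anything.

```python
# Hypothetical incomes for three equally sized groups, and the assumed
# probability that someone in each group reports their earnings.
incomes = [30_000, 50_000, 90_000]
response_prob = [0.2, 0.5, 0.8]

# True average across everyone.
population_mean = sum(incomes) / len(incomes)

# Expected average among those who actually respond: each income is
# weighted by how likely that group is to show up in the data.
respondent_mean = (
    sum(p * x for p, x in zip(response_prob, incomes)) / sum(response_prob)
)

print(f"Everyone:    ${population_mean:,.0f}")
print(f"Respondents: ${respondent_mean:,.0f}")
```

Under these assumptions the respondents' average comes out roughly $12,000 above the true average, even though nobody misreported anything.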
