April 08, 2008

Striking out

Russell Roberts

Sometimes I get depressed about the quality of statistical work in economics. Then I read something from another social science. Here is a recent study where psychologists find that having the initial "K" increases your chance of striking out when playing professional baseball. Why? Well, it's obvious, isn't it? The letter "K" is used when keeping score in baseball to represent striking out. So it's obvious now, isn't it? Still don't get it? Neither do I. But hey, it's in the data. Between 1913 and 2006, players with first or last initial "K" struck out 18.8% of the time, compared to 17.2% for the fortunate players unhandicapped by their initials. Here is the "explanation" of the authors:

Despite a universal desire to avoid striking out, K-initialed players strike out more often. For those players, we argue that the explicitly negative performance outcome may feel implicitly positive. Even Karl “Koley” Kolseth would find a strikeout aversive, but on the whole, he might find it a little less aversive than players who do not share his initials, and avoid it less enthusiastically.

But why? Why would having the initial "K" make striking out more pleasant? I just don't get it. The authors go on to "test" their theory by looking at grades of a sample of MBA students:

The MBA students in our sample are well aware of a direct connection between academic performance and successful job placement. Nevertheless, despite the pervasive desire to achieve high grades, students with an unconsciously-driven fondness for C’s and D’s were slightly less successful at achieving their conscious goal.

That is, Charles Darwin received poorer grades than Alan Alda. But it turns out that Alan Alda didn't do better than the non-ABCD initialed:

Interestingly, A- or B-initialed students did not perform better than students whose initials were grade-irrelevant. There are two possible explanations for this. First, students with grade-irrelevant initials may already be maximally motivated to succeed. Second, because performance is determined by motivation and ability, any increased motivation to succeed that arises from having initials that match positive performance outcomes may not necessarily translate into increased performance.

There is, of course, a third explanation: there is no real relationship and the authors have been fooled by randomness. Yes, their results are statistically significant. But how many relationships did they explore before finding the ones that were statistically significant? And how many relationships are there to explore? To really test the theory, you'd have to look at baseball players with the initial "E" and see if they commit more errors than others. You'd have to look at guards in the NBA to see if those with the initial "A" have more assists. Centers whose initials include an "R" should be better rebounders. You'd have to look and see whether students with the initials IC were more likely to take an "incomplete" in a class.
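
The worry about how many relationships were explored can be put in rough numbers. The sketch below is not taken from the study. It assumes a hypothetical setup of 26 independent tests (one per initial) with no real effect anywhere, and asks how often at least one of them clears the usual 5% bar by chance alone.

```python
import random


def familywise_error_rate(alpha, n_tests):
    """Chance that at least one of n_tests independent null tests
    comes out 'significant' at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def simulated_rate(n_tests=26, n_studies=2000, alpha=0.05, seed=0):
    """Monte Carlo check of the same quantity.

    Assumption: when there is no real effect, each test's p-value is
    uniform on [0, 1], and the tests are independent of each other.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = [rng.random() for _ in range(n_tests)]
        if min(p_values) < alpha:  # at least one spurious "finding"
            hits += 1
    return hits / n_studies
```

Under these assumptions, `familywise_error_rate(0.05, 26)` is 1 - 0.95^26, or roughly 74%. Real tests on overlapping data are not independent, so this is an illustration of the problem rather than an estimate for this paper.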

I guess Rabbi Jonathan Sacks, the Chief Rabbi of England, should have been a football player. Or maybe he just gets fired more often than the average Briton because it doesn't bother him as much as someone with a different last name.

Did Kafka know baseball scoring? Does this explain why he found success in life so difficult? Is this why he named a character "K"?

Do players whose initials are a backwards "K" strike out looking more than the average?

Posted by Russell Roberts in Data, Sports | Permalink

Comments

These guys need to be introduced to the word "spurious" and forced to pay back the federal grant they undoubtedly received to engage in this nonsense.

Posted by: Methinks | Apr 8, 2008 11:42:26 AM

Articles like this make me chuckle. Thanks, Russ!

Posted by: marysienka | Apr 8, 2008 12:04:44 PM

Were pitchers over-represented in this population? And if so, would a K named pitcher strike more batters out?

The sad part is I'm sure these guys take themselves way too seriously.

"These findings provide striking evidence that unconscious wants can insidiously undermine conscious pursuits."

I can't decide whether to laugh or cry.

Posted by: Stretch | Apr 8, 2008 12:06:53 PM

This is an obvious example of data mining. Although it is obvious in this example, there are many similar examples that are not obvious to others. Just pick up books or articles on picking stocks, and one can find all sorts of examples of data mining.

Posted by: PaulD | Apr 8, 2008 12:11:34 PM

Noticed that the authors are from schools of management. Wondering if they are in training to be pointy-haired bosses or being paid to turn out pointy-haired bosses.

Posted by: Randy | Apr 8, 2008 12:14:51 PM

My last name starts with a "K" and I never liked striking out. And I was generally more patient at the plate and took more walks than my teammates, but my initials do include "P" or "W".

Of course, I'm a pretty small sample size.

Posted by: Avatar300 | Apr 8, 2008 12:35:21 PM

Oops, "do include" should be "do not include".

Posted by: Avatar300 | Apr 8, 2008 12:36:25 PM

I wonder what happens when you throw Kevin Mitchell out. But seriously, what percentage of players had K initials? The smaller the percentage, the less statistically significant the difference.

Posted by: Brad Hutchings | Apr 8, 2008 12:54:19 PM
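
The point about group size can be illustrated with a textbook two-proportion z-test. This is a sketch, not the study's method: the function name and the counts in the usage note are made up, chosen only to match the 18.8% and 17.2% rates quoted in the post, and it treats every plate appearance as an independent trial.

```python
from math import sqrt


def two_proportion_z(k1, n1, k2, n2):
    """z-statistic for the difference between two observed rates.

    k1, n1: strikeouts and plate appearances in group 1 (e.g. "K" players)
    k2, n2: the same for group 2 (everyone else)
    """
    p1 = k1 / n1
    p2 = k2 / n2
    pooled = (k1 + k2) / (n1 + n2)  # common rate under the null
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se
```

With hypothetical counts, `two_proportion_z(188, 1000, 17200, 100000)` and `two_proportion_z(1880, 10000, 17200, 100000)` compare the same two rates, but the first uses a "K" group one tenth the size and yields a much smaller z-statistic.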

Don't be put off by your small sample size, Avatar!

These great researchers certainly wouldn't let a thing like that stand in their way!

Posted by: Methinks | Apr 8, 2008 12:55:49 PM

"Centers whose initials include an "R" should be better rebounders."

Dennis Rodman... although he was a forward. Coincidentally, his middle name is Keith.

Posted by: Brad Hutchings | Apr 8, 2008 12:56:59 PM

This was discussed on various blogs related to baseball and statistics about six months ago. For some interesting insights, see this post: http://www.hardballtimes.com/main/blog_article/ridiculous-science

They found that yes indeed, batters with an initial K struck out slightly more often than average. But there were eight other initials that were even worse.

Posted by: John S. | Apr 8, 2008 1:05:20 PM

And he teaches at Yale? Really? Somebody needs to point him to Andrew Gelman's papers and others in that literature (Bonferroni bounds, Hal White's data snooping tests....)

Posted by: Jack | Apr 8, 2008 1:07:27 PM

Av300, I'll second the point about not being discouraged by your small sample size. Just put yourself in the data 1,000 times and you'll have a big sample.

Posted by: dave smith | Apr 8, 2008 1:21:11 PM

Truly a study devoid of merit and one that wasted resources. I suppose now Dave Kingman will go on the ESPN lecture tour and claim that his massive strikeout ratio wasn't really his fault...he was inherently doomed from birth.

Posted by: tw | Apr 8, 2008 1:26:29 PM

My question is, how in the heck did anyone even think to ask such an asinine question? Are psychologists really that short on topics about which they can write? Honestly, who even consciously or unconsciously automatically associates initials with a scoring metric?

Posted by: Matt C. | Apr 8, 2008 1:40:24 PM

I'm wondering. How do they explain variations from the mean for other letters?

Posted by: Marcus | Apr 8, 2008 1:48:44 PM

I'm reminded of the old joke about the economist who drowned in a river with an average depth of six inches.

Economists (yes even here) torque around numbers with the best of them, and even engage in selectivity designed to mislead.

Posted by: save_the_rustbelt | Apr 8, 2008 1:51:41 PM

My question is, how in the heck did anyone even think to ask such an asinine question?

Two words:

Grant money

Posted by: Methinks | Apr 8, 2008 1:52:26 PM

Am I an asshole for suggesting that Latin American baseball players are less likely to have a K in their name than descendants of Eastern European families, and perhaps it is a cultural phenomenon that the former group are less likely to strike out than the latter? (I have two K's in my last name, and that's only one out away from retiring the side.)

Posted by: mpkomara | Apr 8, 2008 2:14:48 PM

mpk...HA...i love it; perfect point.

Posted by: shawn | Apr 8, 2008 2:17:53 PM

After some initial bewilderment, my first thought was that people from different ethnic groups and cultures were more likely to have certain initials than the rest of the population.

Did they control for race and culture at all? I'm assuming they at least controlled for gender?

Posted by: Grant | Apr 8, 2008 3:18:19 PM

I wonder if the predicted values from their regression went anywhere near the mean of the actual data.....


....sarcasm, of course, as this is a property of all regressions.

Posted by: dave smith | Apr 8, 2008 3:25:27 PM

It just occurred to me how odd it is that Russell Roberts, or R.R., which is to say, R-squared, isn't a bigger fan of regression.

Posted by: noahpoah | Apr 8, 2008 6:53:50 PM

Russ’ kurtosis precludes that.

Posted by: Mesa Econoguy | Apr 8, 2008 7:14:25 PM

The progressive thing to do for the sake of equity would be to allow players with K's in their names to have more strikes before being out.

Posted by: Justin Ross | Apr 8, 2008 7:19:37 PM

Unfortunately, that was from a business school not a school of social science. It could easily have come from a medical school. I stopped paying attention to reports of medical findings because most of them are innumerate as well.

My question is how do referees let this through? Now THAT's scary.

Several flaws in the report are pretty obvious. First, "Kingman," a player from the 1970s, alone accounts for nearly a third of the deviation from the null hypothesis. A cursory review of a similar (but not identical) data set suggests they used a binomial distribution, with all batters in the same letter category having the same mean strikeout rate. Since there is significant variation among individual players, the null is almost certain to be false once you partition your data set into smaller subsets. Kingman easily skews "K". In fact, I found huge deviations for every single letter. Their model was wrong: every subset was skewed because there are not enough individual players for the differences to wash out.

For grades, there was no difference between A&B, nor between C&D. That should have been the end of it. Besides, the absolute difference between AB and CD was about 0.02 with a mean around 3.4. That's one letter grade in 50 for a poor CD? That's a tiny effect even before being swamped by different standards in different classes and schools. Besides, with an average around 3.4, just how many grades of C, let alone D, could have been in that data set? I'd have expected A-B to be the much bigger contributor, but it's not there.

Posted by: Bill | Apr 8, 2008 10:15:05 PM
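
The binomial objection in the comment above can be checked with a small simulation. This is only a sketch under invented assumptions: players get their own true strikeout rates from a Beta(9.4, 46) distribution (mean about 0.17, a hypothetical spread), and they are assigned to 26 groups at random, so the group label has no effect at all. Each group is then scored against a naive binomial model that pretends every plate appearance shares one common rate.

```python
import random
from statistics import pstdev


def naive_z_scores(n_groups=26, players_per_group=30, pa_per_player=500, seed=1):
    """z-score of each group's strikeout rate under a pooled binomial model.

    All sizes and the Beta parameters are hypothetical, for illustration.
    """
    rng = random.Random(seed)
    group_strikeouts = []
    for _ in range(n_groups):
        total = 0
        for _ in range(players_per_group):
            p = rng.betavariate(9.4, 46.0)  # this player's true rate
            total += sum(rng.random() < p for _ in range(pa_per_player))
        group_strikeouts.append(total)

    n = players_per_group * pa_per_player  # plate appearances per group
    pooled = sum(group_strikeouts) / (n * n_groups)
    se = (pooled * (1 - pooled) / n) ** 0.5  # naive binomial standard error
    return [(k / n - pooled) / se for k in group_strikeouts]


def z_spread(**kwargs):
    """Standard deviation of the naive z-scores; near 1 if the model fit."""
    return pstdev(naive_z_scores(**kwargs))
```

If the binomial model were adequate, `z_spread()` would come out near 1. With player-to-player variation of this size it comes out well above 1, so many randomly formed groups look "significant" even though the grouping means nothing.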

It's funny these guys are both from management schools. I'm a development econ student currently taking an international finance class in the business school, and I'm doing a regression on foreign exchange rates for a project we're working on.

The instructions for the regression analysis are ridiculous. They ignore autocorrelation and multicollinearity effects, and when I brought this up to my groupmates and the professor, it was clear that none of them knew anything about stats past how to run a regression in Excel. Meanwhile, I've only got two methods classes under my belt compared to my instructor's PhD.

It makes me think I could make a fortune in the finance world as one of the only competent statisticians.

Posted by: Mike | Apr 8, 2008 10:32:02 PM

Would this work in reverse? Would pitchers with a K be better hurlers? Worked for Kevin Brown. Not so much for Knolan Ryan. :)

Posted by: Christopher W. | Apr 9, 2008 5:43:55 AM

Skepticism is always good, but one should examine the evidence at least before concluding that it's bunk!

I learned about these findings years ago. This paper is a new study that came out last year, replicating the results of the first one. This result has been replicated time and time again from different data sets, so it deserves some attention.

It's possible there's a different explanation, but the fact that people with the letter K in their initials strike out more often has been shown many many times. Just as the result that people with C's and D's in their initials get more C's and D's has been shown many different times.

Posted by: brian | Apr 9, 2008 7:44:04 AM

"The progressive thing to do for the sake of equity would be to allow players with K's in their names to have more strikes before being out.

Posted by: Justin Ross"

Assuming you say this in jest, am I to understand that you oppose handicaps in golf because they are progressive?

Posted by: brian | Apr 9, 2008 7:45:56 AM

Are you sure this wasn't an April Fools study?

Posted by: liberty | Apr 9, 2008 12:52:11 PM

Did they make some sort of Bonferroni adjustment for multiple comparisons?

If they think they have discovered a "significant" correlation, they should test the hypothesis prospectively.

Assuming this wasn't an April Fools study.

Posted by: Paris | Apr 9, 2008 5:07:18 PM
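
For reference, the Bonferroni adjustment mentioned above is simple to state: with m tests, each one is held to a threshold of alpha / m instead of alpha. A minimal sketch follows; the function name and the p-values in the usage note are made up for illustration and are not from the paper.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return, for each p-value, whether it survives a Bonferroni
    correction for the total number of tests performed."""
    m = len(p_values)
    threshold = alpha / m  # stricter per-test cutoff
    return [p <= threshold for p in p_values]
```

With 26 hypothetical tests, one per initial, the per-test cutoff is 0.05 / 26, a little under 0.002. A result with p = 0.01 would pass the usual 5% test but fail here, as in `bonferroni_significant([0.01] + [0.5] * 25)`.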

A fine example of junk science. This paper would be an F in a methods class. An example of spurious correlation. I guess if there is a consensus then it is correct.

Posted by: bee | Apr 9, 2008 8:36:23 PM

Wow. I am amazed at the effort that we as a society go to, to prove a point by a point (percentage point, that is). I once had a boss who asked me to prepare a report to prove the worth of his proposal to top management by using statistics gathered from manufacturing records. I told him, "Tell me what you want to see and I will make it happen." The figures were all true, just presented in a different manner.

Posted by: Joe Jaegers | Apr 17, 2008 8:53:18 PM

This is just silly, did you know that all the players with the letter "Q" to start their name drive blue 1997 Toyota pick up trucks..lol

Posted by: HC Blogger | Sep 11, 2008 10:43:04 AM

Great post, always an interesting read. Keep updating this blog, it's always worth the time.

Posted by: Josh Neumann | Sep 11, 2008 4:45:18 PM

LOL - nice I think we should all judge our decisions on this system..

Posted by: Sports Blogger | Sep 15, 2008 11:37:59 AM

I agree, this is the way we should all run our lives, sports, personal or otherwise.

Posted by: Donald | Sep 18, 2008 1:10:36 PM

Dang... I just don't think I can get enough sports. Shhh, my wife is coming. lol Hey, thanks for the post and for satisfying my need to "feed on sports" info. Kenney

Posted by: Athletic College Recruiting | Oct 17, 2008 11:53:33 PM

Yes, you are right, this was a very good post. Thanks for it.

Posted by: Chelsea Football club | Oct 30, 2008 1:04:30 AM

The comments to this entry are closed.