Tuesday, October 16, 2007

Pythagorean Limits

This will shock you, I know, but I love mathematical magic tricks.

You know the kind I'm talking about. You'll get an email that tells you to pick a number and then add 30 and square it and add pi, and by the end of the email, it's telling you your birthday. I love that stuff.

And in some ways, that's what baseball's pythagorean theorem is. (If you visit this site, you likely know what I'm talking about, and if you don't, this is a nice primer.) You square a few numbers and presto-change-o, you have a win-loss record. It's magical. And we all know from history what happens to anything magical.

That's right. It becomes misused. Because those who know the tricks like the attention or the power, and those who don't can become pretty gullible. That's probably overstating what is happening now, and it is surely twisting the motives. But baseball's pythagorean theorem has some limits that have become convenient for its practitioners to ignore. I'd like to go through a couple of those over the next couple of days, starting with:

1. It ain't gospel.

You will undoubtedly see, at some point in next year's season previews, someone pointing to a team's pythagorean record as proof that the team will be worse or better next year. For instance, the Braves underperformed their pythagorean record last year, so they're really a better team than you might have thought. This conclusion is justified because (as is often stated) a team's pythagorean record is a better indicator of its true ability than its actual record.

And it is, but just barely. Here you'll find GameDay's study that looks at teams over the last ten years. The spreadsheet compares each team's record to:
1) its record the previous year and
2) its pythagorean record the previous year.
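
If you'd like to run that kind of comparison yourself, here's a rough Python sketch. It is not GameDay's spreadsheet, just one way the check could be done. It assumes the usual form of the formula (runs scored squared, divided by runs scored squared plus runs allowed squared), and it assumes "record" means winning percentage. The function names and the row layout are made up for illustration, and the numbers at the bottom are invented placeholders, not real teams.

```python
from math import sqrt


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    Assumes the classic exponent of 2 ("you square a few numbers").
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def compare_predictors(rows):
    """Correlate this year's winning percentage with two of last year's numbers.

    Each row is a hypothetical tuple:
        (prev_win_pct, prev_runs_scored, prev_runs_allowed, this_win_pct)
    Returns (correlation using last year's actual record,
             correlation using last year's pythagorean record).
    """
    prev_actual = [row[0] for row in rows]
    prev_pythag = [pythagorean_win_pct(row[1], row[2]) for row in rows]
    this_year = [row[3] for row in rows]
    return pearson(prev_actual, this_year), pearson(prev_pythag, this_year)


# Invented placeholder rows, NOT real team data.
toy_rows = [
    (0.550, 800, 720, 0.530),
    (0.480, 700, 740, 0.500),
    (0.600, 850, 700, 0.570),
    (0.430, 680, 790, 0.460),
]

actual_r, pythag_r = compare_predictors(toy_rows)
print(f"previous actual record:      r = {actual_r:.3f}")
print(f"previous pythagorean record: r = {pythag_r:.3f}")
```

Swap the toy rows for real team seasons and you get the two correlations side by side, which is the whole comparison.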

The results are consistent with every study I've read about this. The pythagorean record has a slightly higher correlation with the next year's record than the actual record does, but not by much. In our study, the comparison was .582 to .555. That's a difference that is technically known in the statistical community as "pretty much a wash".

Or, in other words, flip a coin as to which one you want to pay more attention to.

Which doesn't mean that you should ignore a team's pythagorean record. Just understand that the team also has a real record, and it's nearly as valid an indicator of how good that team is. So don't treat it as gospel. Or even a minor miracle.

It's just a neat magic trick.