Monday, July 20, 2009

Joe Mauer will Absolutely, Positively, No-Freaking-Way NEVER Hit .400! (Ever.)

Mea culpa. I didn’t mean it. It’s impossible. It can’t be done. I'm a true disbeliever.

Since I wrote that we shouldn’t automatically dismiss a ballplayer hitting .400, Joe Mauer has gone 7-38, dropping his average from .389 to .358. That must stop. So, in an effort at the ultimate reverse jinx, let’s take a look at some additional coverage and find out how there is absolutely, positively, no-freakin-way that Joe Mauer will ever hit .400. (Ever.)

One interesting, but frustrating, story was "Checking the Numbers: MauerQuest" by Eric Seidman on Friday. Unfortunately, you can only see the first part of it unless you’re a subscriber, so I’ll attempt to summarize the methodology:

1) Treat Joe Mauer’s remaining at-bats as essentially random events.
2) The chance of a hit happening during those events is based on Mauer’s "true talent level," which he determines to be that of a .318 hitter.
3) Estimate how many at-bats he’ll have the rest of the season, which he determines to be about 420 at-bats.
4) Use a cool Excel function (BINOMDIST for you fellow geeks out there) to determine the chance that enough hits happen to reach a .400 batting average at the end of the season.
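
For fellow geeks without Excel handy, the four steps above can be sketched in a few lines of Python. This is only my reading of the approach as summarized here, not Seidman's actual spreadsheet. The function name chance_of_reaching is made up for illustration, and the hit and at-bat totals passed in at the bottom are placeholders, not Mauer's real numbers.

```python
from math import ceil, comb


def chance_of_reaching(target_avg, hits_so_far, ab_so_far, remaining_ab, true_avg):
    """Chance of finishing at or above target_avg, treating each remaining
    at-bat as an independent trial that is a hit with probability true_avg."""
    total_ab = ab_so_far + remaining_ab
    # Fewest additional hits that lift the final average to the target.
    # The tiny offset guards against floating-point noise in the product.
    hits_needed = max(0, ceil(target_avg * total_ab - 1e-9) - hits_so_far)
    if hits_needed > remaining_ab:
        return 0.0
    # Upper tail of the binomial distribution, the same quantity as
    # 1 - BINOMDIST(hits_needed - 1, remaining_ab, true_avg, TRUE) in Excel.
    return sum(
        comb(remaining_ab, k) * true_avg**k * (1 - true_avg) ** (remaining_ab - k)
        for k in range(hits_needed, remaining_ab + 1)
    )


# Placeholder inputs for illustration only (NOT Mauer's real totals).
p = chance_of_reaching(0.400, hits_so_far=90, ab_so_far=250,
                       remaining_ab=420, true_avg=0.318)
print(f"{p:.4%}")
```

Swap in real season totals and a different true_avg to see how quickly the answer moves.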

There are a lot of assumptions there, and so the exercise ends up being useful in a look-how-I-can-model-the-un-model-able kinda way, meaning it’s for fun only. It might not be used that way, but I’m pretty sure that’s what it’s meant for. And he concludes that Mauer has a .0267% chance.

So what can you take away from it?
1. BINOMDIST. I love this function. I’m going to use it silly.
2. The method itself strikes me as mostly useless, because it becomes so dependent on a player’s "true talent level." I don’t know what that term really means. Even if we knew exactly what Mauer’s career average was ultimately going to be, how can we assume that was his talent level in the season when he turned 26? If he’s at a different plateau this season, it can make all the difference. Using all the same numbers Seidman uses, look at what happens when you assume that Mauer is really a .360 hitter or better this year:

The bottom axis is Mauer’s true talent level in terms of batting average. The vertical axis is his chance of hitting .400. As the assumed true talent level increases, so do his chances of hitting .400, and the climb is steep. For instance, if he’s really a .380 hitter this year, his chances are 7.8%.
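
The sensitivity to the true talent assumption is easy to see in a quick sweep. The sketch below uses a placeholder scenario (420 at-bats left, 178 more hits needed), which is my own assumption and not the set of numbers from Seidman's article, so the percentages it prints will not match the ones quoted here. The helper name tail is also just for illustration. The point is only the shape: each bump in assumed talent multiplies the chance.

```python
from math import comb


def tail(n, k_min, p):
    """P(at least k_min hits in n at-bats) for a hitter whose true average is p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Placeholder scenario (an assumption, not real data):
# 420 at-bats left, 178 more hits needed to finish at .400.
REMAINING_AB = 420
HITS_NEEDED = 178

talents = [0.318, 0.340, 0.360, 0.380, 0.400]
chances = [tail(REMAINING_AB, HITS_NEEDED, t) for t in talents]

for t, c in zip(talents, chances):
    print(f"true talent {t:.3f} -> chance of .400: {c:.4%}")
```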

3. I wondered what this little exercise would look like if I evaluated Tony Gwynn’s chances of hitting .394 in 419 at-bats, like he did in 1994. I took his true talent level to be a .338 hitter, since that’s what he was over the course of his career.

The answer was .96%, or less than 1%. So Tony was pretty lucky that year, except for the whole "my-baseball-season-was-canceled-due-to-a-labor-strike-while-I’m-just-three-hits-shy-of-hitting-.400-and-kissing-destiny-full-on-the-mouth" thing. (Or maybe his true talent level in 1994 was considerably higher than his career average. I'm just saying.)
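
The Gwynn check can be redone without Excel. This sketch takes the 419 at-bats, the .394 average and the .338 career mark from above, and assumes that .394 over 419 at-bats means 165 hits. The variable names are mine. It adds up the binomial chance of 165 or more hits, and the result should land close to the figure above.

```python
from math import comb

AT_BATS = 419         # Gwynn's 1994 at-bats
TARGET_AVG = 0.394    # his 1994 batting average
TRUE_TALENT = 0.338   # career average, standing in for "true talent level"

# Hit total implied by the target average (assumed to be the nearest whole number).
hits_needed = round(TARGET_AVG * AT_BATS)

# Chance that a true .338 hitter collects at least that many hits in 419 at-bats.
gwynn_chance = sum(
    comb(AT_BATS, k) * TRUE_TALENT**k * (1 - TRUE_TALENT) ** (AT_BATS - k)
    for k in range(hits_needed, AT_BATS + 1)
)

print(f"hits needed: {hits_needed}, chance: {gwynn_chance:.2%}")
```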

4. Seidman concludes his results paragraph with one of the greatest movie quotes of all time: "So you’re saying there’s a chance." Beautiful. We should probably all keep that in mind when referring to this .400 thing. Even if it’s possible, it’s unlikely. Of course, that’s what would make it so great.

That was kind of fun, but the post that blew me away was at a rarely updated Twins blog called Away Games, authored by someone nicknamed Chiasmus. I’m going to summarize it a little, but I hate to do so, because I would really rather you just click through. It’s comprehensive, using statistical methods and data that made my jaw drop, but the author manages to hide all that stuff and write the results in a simple, fun, self-deprecating style. This is exactly the sort of treasure that the internet is supposed to produce.

The author cites a study from 1986 by Stephen Jay Gould that argues that the reason it is harder to hit .400 is that baseball has improved to the point where batting averages are becoming less variable. It’s harder to have outliers of any kind, be it low or high. That’s why players don’t hit .400 any more.

Except that Chiasmus carries it a step further, continuing Gould’s study past 1986 into the steroid era. That’s where he finds some surprising results. First, batting averages have gone up since then, which means it’s a little easier to hit .400. But he also finds that batting averages have become slightly more variable since then, meaning an outlier has an easier time stretching from the herd.

He then uses the method above on a grander scale, determining the chances that MLB produces a .400 hitter in any given year. And that also has some surprising results, and I’m not going to share them, because I really want you to click over. Honestly, how many links do I need to give you people?

What do his findings mean for Mauer? It means he’s too young, should’ve come around about 10 years earlier, and that MLB has almost no chance of fielding a .400 hitter this year.

So there you have it. Mauer, and all of MLB, are doomed. It’s impossible. It can’t be done.

(But feel free to go 3 for 4 tonight, Joe.)


Anonymous said...

Thanks for the lead.

Chiasmus said...

Wow, thanks for all the kind words! I am definitely going to have to start posting more often.

Anonymous said...

I think the biggest problem facing new era hitters is specialization. Ted Williams usually faced the same pitcher through all 9. I wonder, if someone did an analysis of hits per inning and looked at the distribution. Did Ted get more of his hits after the 6th inning?

It is amazing watching games and seeing a manager bring a pitcher in for 1 inning. Starters barely last 5 innings as hitters are instructed to take pitches.

I guess it is all relative, but it is fun. GO TWINS!
