Besides cringing, I wonder if the stats really prove anything about the future. For instance, we might say that we expect Delmon Young to be a .280-.290 hitter this year because he hit .284 last year. Or that Michael Cuddyer will have a -25 UZR because it was -26.5 last year. Both seem like reasonable expectations. But are those stats truly predictive? Let's find out:
I pulled every player who qualified for a batting title in any season from 2006 through 2009, and if they qualified in two consecutive years, I matched up their stats across a bunch of basic and advanced hitting statistics. Then I ran a statistical test that shows how well one year's number predicts the next, and scored each statistic from 1 to 100. A score of 100 means you can perfectly predict the following year from the previous year. A score of 1 means the following year is essentially random relative to the previous year.
I also did this for the UZR and UZR/150 fielding metrics, but for those I used a cutoff of 850 innings played at a position. Why 850? Because it produced about the same number of players as qualified for the batting title each year.
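For anyone who wants to try something similar, here is a minimal sketch of the matching-and-scoring step in Python. It is not necessarily the exact test used here: it assumes the score is the squared Pearson correlation between year one and year two, scaled to 100, and the names `pair_consecutive_seasons`, `pearson_r`, and `predictiveness_score` are made up for illustration. It also assumes the rows have already been filtered down to qualifying seasons.

```python
from math import sqrt


def pair_consecutive_seasons(rows, stat):
    """Match each qualifying season with the same player's next season.

    rows: list of dicts with 'player', 'year', and the stat of interest.
    Returns a list of (this_year_value, next_year_value) tuples.
    """
    by_key = {(r["player"], r["year"]): r[stat] for r in rows}
    pairs = []
    for (player, year), value in by_key.items():
        following = by_key.get((player, year + 1))
        if following is not None:  # player also qualified the next year
            pairs.append((value, following))
    return pairs


def pearson_r(pairs):
    """Plain Pearson correlation between the two columns of pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


def predictiveness_score(pairs):
    """Assumed scoring: 100 * r squared (share of variance carried over)."""
    return 100 * pearson_r(pairs) ** 2
```

You would call `pair_consecutive_seasons(rows, "AVG")` once per statistic and sort the statistics by `predictiveness_score`. A player who qualified in 2006 and 2008 but not 2007 contributes no pair, which matches the "two consecutive years" rule above.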
Here are the results:
There are some surprises for me in there:
- I'm surprised that batting average is fairly low. It scores about the same as the relationship between FIP and ERA did when we studied that a few months ago. That's also about the same as the correlation between Opening Day payroll and the number of wins a team has.
- I'm surprised that OBP, SLG and OPS are lower than something like HR and RBI. Note to self: when someone says "He's a 20 HR, 85 RBI guy," don't start talking about his OPS.
- I'm surprised that stats like BB% and K% are so far up at the top of that list. Those are the stats we point to when we talk about players improving as they develop plate discipline. It sure doesn't look like they vary very much from year to year.
- I'm surprised that BABIP is so low. I've always heard it is fairly consistent and can be counted on to rebound. This doesn't support that at all.
- I'm pleasantly surprised that UZR and UZR/150 aren't at the very bottom of the list. I still have some concerns about their limitations and feel they are often misused, but at least they're somewhat consistent.