Wednesday, June 02, 2010

Maybe Bert is Right (Part 3)

(This is the third part of a 3-part series that I’ll be running on the TwinsCentric blog and at TwinsGeek.com. In Part 1, we discovered that an original metric for evaluating pitcher abuse, Pitcher Abuse Points (PAP), had been declared bunk by its creator, Rany Jazayerli. However, he and Keith Woolner presented a replacement metric called PAP3 that also started tabulating a pitcher’s abuse points at the 100-pitch mark.

In Part 2, we found out that PAP3 was actually developed to show short-term performance effects of high pitch counts, not long-term injury effects. What’s more, the “significant” effect it showed was a 1% decrease in pitcher performance after a 130-pitch outing versus a 90-pitch outing, or about one extra run per year.

In Part 3, we’ll look at the second half of the Analyzing PAP essay, where Jazayerli and Woolner see how effective PAP3 is at predicting pitcher injuries.)

In the second half of the study, Jazayerli and Woolner are very careful about what they are trying to prove. They do NOT try to show that PAP3 is a good predictor of pitching injuries. Instead, they try to show that it is a better indicator than just counting pitches. Of course, to those who believe that counting pitches is already bunk, that’s not proving much.

They do this by looking at the same time period and finding 73 pitchers with arm injuries that cost them 30 days or more of playing time. They then find comparable healthy pitchers who threw about the same number of pitches over their careers, and compare the PAP3 scores of the two groups.

They found that the injured pitchers were three times more likely to have had above-average PAP3 scores for their career. There are also two more studies from this same data set that suggest that PAP3 is better than just straight pitch counts. But again, it isn’t showing any kind of cause-effect or even a correlation between injuries and pitch counts. It’s just showing that their way of counting pitches correlates better than just straight pitch counts.

Interestingly, they specifically study the baseline of 100 pitches where PAP3 starts counting, and conclude it is essentially a line drawn in the sand. They tested baselines of 90, 100 and 110 pitches and didn’t find that any one was significantly better than the others. They end up using PAP3 for injuries because it is convenient:

“Therefore, any reasonable metric that gives extra weight to high-pitch-count outings should yield a risk factor that in time is in the same ballpark as PAP3 (pardon the pun). Because we have a preferred metric for short-term impact that does acceptably well [for] long-term injury risk too, we will stick with simplicity and use a single metric for both purposes.”

So let’s review. The purpose of the study was to examine long-term injury risk as it related to pitch counts. Instead, a formula was devised to predict short-term performance impact. While the formula starts creating very large and spooky totals at 100 pitches, the results show almost zero impact on a pitcher’s performance until 130 pitches and nothing significant until 140 pitches.
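To see how quickly those totals pile up, here is a minimal Python sketch. It assumes the commonly cited form of PAP3, in which a start earns the cube of the number of pitches thrown past 100; that formula is not spelled out in this post, and the function name `pap3` and its `baseline` parameter are hypothetical labels used only for illustration.

```python
def pap3(pitches: int, baseline: int = 100) -> int:
    """Abuse points for a single start.

    Assumed form: the cube of the pitches thrown beyond the baseline,
    and zero for any start at or below the baseline.
    """
    return max(0, pitches - baseline) ** 3


# Print the points for a range of pitch counts using the 100-pitch baseline.
for count in (90, 100, 110, 120, 130, 140):
    print(f"{count} pitches -> {pap3(count)} points")
```

Under that assumption, a 110-pitch start scores 1,000 points, a 130-pitch start 27,000 and a 140-pitch start 64,000, which is why the totals look so alarming even where the measured performance effect is small. Passing `baseline=90` or `baseline=110` shifts where the counting starts, which is the comparison the authors ran before deciding to stay with 100.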

As for long-term injury risk, all that was proved was that any metric that applies extra weight to pitches thrown late in a game is more accurate at demonstrating increased injury risk than just counting pitches throughout a season. But, of course, if you think pitch count is bunk, this is just a little above bunk. What’s more, since no baseline number or formula was any more accurate than any other, the same formula was used because it was convenient, not because it was more accurate.

Conclusion
The 100-pitch-count “limit” was ingrained in people’s noggins before this study. The original PAP metric had sabermetrics lovers everywhere analyzing and criticizing managers for fragging their players’ arms, present company included. The revised metric supposedly lent it even more credence.

It didn’t. The study of pitch counts appears to have resulted in a dead end, as Bill James wrote a couple of years later. The evidence shows that pitchers’ arms are no more affected in the short or long term by 125 pitches than they are by 100, and barely affected up to 139 pitches. And pitch counts like that almost never happen in major league baseball these days.

I’m not advocating pushing the limit to 180 pitches. But maybe Bert is right: we can trust the opposing batters to indicate to a pitcher when he is done, and rely a lot less on arbitrary pitch counts.

3 comments:

Jack Ungerleider said...

So after three articles we arrive at... not much. (Which I think is your point.)

Here's another way of looking at it. Bert always says that a pitcher wants to get out of an inning throwing around 15 pitches. That would be 90 pitches in 6 innings and 105 pitches for 7 innings. Since the structure of most pitching staffs these days is molded around Starter, Setup, Closer (whether you think that's a good model or not), 105-110 pitches (about 7 innings) is a good number.

Unknown said...

This was one of my favorite series ever on a blog. Thanks John.

I never heard or researched much about the PAP3 metrics. As Jack pointed out above, it didn't really go anywhere. Still, it is nice to see scientific research done about pitch counts. It really exposes how mishandled starting pitchers are today.

I guess in ten years when Texas is dominating hitters and winning 94 games a season with all their starting pitchers averaging better than 105 pitches a game, minor leaguers everywhere will start to be pushed. I've always loved the state of Texas. I may be falling in love with their MLB organization now as well.

Anonymous said...

Interesting stuff and it has ramifications to me outside of MLB. As the father of a pitcher who routinely throws 100+ for his Townball team it's good information.