Wednesday, June 02, 2010

Maybe Bert is Right (Part 3)

(This is the third part of a 3-part series that I’ll be running on the TwinsCentric blog. In Part 1, we discovered that the original metric for evaluating pitcher abuse, Pitcher Abuse Points (PAP), had been declared bunk by its creator, Rany Jazayerli. However, he and Keith Woolner presented another metric, PAP3, to replace it, and it also starts tabulating a pitcher’s abuse points at the 100-pitch mark.

In Part 2, we found out that PAP3 was actually developed to show short-term performance effects of high pitch counts, not long-term injury effects. What’s more, the “significant” effect it showed was a 1% decrease in pitcher performance after a 130-pitch outing versus a 90-pitch outing, or about one extra run per year.

In Part 3, we’ll look at the second half of the Analyzing PAP essay, where Jazayerli and Woolner see how effective PAP3 is at predicting pitcher injuries.)

In the second half of the study, Jazayerli and Woolner are very careful about what they are trying to prove. They do NOT try to show that PAP3 is a good predictor of pitching injuries. Instead, they try to show that it is a better indicator than just counting pitches. Of course, to those who believe that counting pitches is already bunk, that’s not proving much.

They do this by looking at the same time period and finding 73 pitchers with arm injuries that cost them 30 days or more of playing time. They then find comparable healthy pitchers who threw about the same number of pitches over their careers, and compare the PAP3 scores for the two groups.

They found that the injured pitchers were three times more likely to have had above-average PAP3 scores for their career. There are also two more studies from this same data set that suggest that PAP3 is better than just straight pitch counts. But again, it isn’t showing any kind of cause-effect or even a correlation between injuries and pitch counts. It’s just showing that their way of counting pitches correlates better than just straight pitch counts.

Interestingly, they specifically study the baseline of 100 pitches where PAP3 starts counting, and conclude it is essentially a line drawn in the sand. They tested 90, 100, and 110 and didn’t find that any one baseline was significantly better than the others. They end up using PAP3 for injuries because it is convenient:

“Therefore, any reasonable metric that gives extra weight to high-pitch-count outings should yield a risk factor that in time is in the same ballpark as PAP3 (pardon the pun). Because we have a preferred metric for short-term impact that does acceptably well for long-term injury risk too, we will stick with simplicity and use a single metric for both purposes.”

So let’s review. The purpose of the study was to examine long-term injury risk as it related to pitch counts. Instead, a formula was devised to predict short-term performance impact. While the formula starts creating very large and spooky totals at 100 pitches, the results show almost zero impact on a pitcher’s performance until 130 pitches and nothing significant until 140 pitches.

As for long-term injury risk, all that was proved was that any metric that applies extra weight to pitches thrown late in a game is more accurate at demonstrating increased injury risk than just counting pitches throughout a season. But, of course, if you think pitch count is bunk, this is just a little above bunk. What’s more, since no baseline number or formula was any more accurate than any other, the same formula was used because it was convenient, not because it was more accurate.

The 100-pitch “limit” was ingrained in people’s noggins before this study. The original PAP metric had sabermetrics lovers everywhere, present company included, analyzing and criticizing managers for fragging their players' arms. The revised metric supposedly lent it even more credence.

It didn’t. The study of pitch counts appears to have resulted in a dead end, as Bill James wrote a couple of years later. The evidence shows that pitchers’ arms are no more affected in the short or long term by 125 pitches than they are by 100, and barely affected up to 139 pitches. And pitch counts like that almost never happen in major league baseball these days.

I’m not advocating pushing the limit to 180 pitches. But maybe Bert is right: we can trust the opposing batters to indicate to a pitcher when he is done, and rely a lot less on arbitrary pitch counts.

Monday, May 31, 2010

What is Going on with the DL?

So, why was Cuddy at second base?

The Twins have been slow to place banged-up players on the 15-day disabled list (DL) this year, and last night that resulted in Michael Cuddyer playing second base. Cuddyer hasn't played second base consistently since 2005, when he played 11 games there. He started last night because the Twins have two injured second basemen on the roster and neither of them was on the DL.

Starting second baseman Orlando Hudson was injured Sunday night, and the Twins are still trying to evaluate if there is something wrong with his wrist. After negative x-rays, and a CT scan that turned up nothing, they say he should be back in a couple of days. We'll see.

But backup Alexi Casilla underwent an MRI on his elbow back on 5/21 or 5/22. It found a bone spur and loose bodies, but nothing that prevented him from filling in for Hudson back on 5/22. He also played on 5/27, going one for four. However, last night he was only able to pinch run, which is why Cuddyer ended up starting at second base. Now he's going to be placed on the DL, ten days after his MRI?

This isn't the first time the Twins have kept a guy on the roster while he is treating an injury. It's happened over and over. Instead of calling up a healthy backup from Rochester, the Twins have played short-handed this year, and the player coming back has struggled. Let's review...

Nick Punto
Situation: Most recently, Punto was basically out of games from 5/22 to 5/28 with an injured finger. I say "basically" because he was used as a pinch runner a couple of times during that stretch. When he returned, he could only bat left-handed, which limited the spots in which he could be used.

Results: In the 9 days in which Punto would have been on the 15-day DL, he has garnered two hits.

JJ Hardy
Situation: Hardy jammed his wrist sliding into third base on 5/4. Initially, we were told he would miss a day or two. A week later he ended up seeing a specialist and was finally put on the DL on 5/11.

Results: The Twins played short-handed for a week - and still ended up putting Hardy on the DL. He didn't return until 5/25 and admitted this weekend that his wrist still isn't 100%. He is 4-for-22 since his return, with one double.

Joe Mauer
Situation: Mauer hurt his heel in a game 4/30. He was out until 5/8, when he pinch hit. He started as a designated hitter on 5/9. And he finally started again at catcher on 5/11, eleven days after the injury. During that time the Twins used multiple roster moves to cover for not placing him on the DL.

Results: Mauer went 8 for 19 in the games in which he would have been on the DL. What can you say - it's Mauer. He hits.

Nick Punto (again)
Situation: On 4/16 Punto was held out of a game because he had a sore groin. Later the problem was diagnosed as a hip flexor injury, and he was finally added to the DL on 4/23, a week later.

Results: For a week, the Twins played without their starting third baseman and without making a roster move. Punto returned 4/30.

Pat Neshek
Situation: On 4/15 it was reported that Neshek had a sore flexor tendon in his finger. He was held out of games until 4/24. A week later the Twins decided to send him to AAA, at which point he asked to be put on the DL, at which point the finger-pointing started. Eventually he was given an MRI and a new diagnosis was given: he had a problem with his palm pulley tendon.

Results: In the 15 days he would have been on the DL, the Twins played short-handed for nine days, and then Neshek gave up two runs on eighteen pitches. He was eventually put on the DL anyway, and it looks like the missed diagnosis resulted in treatment that aggravated the actual injury.

What the hell is going on out there? For four guys (Neshek, Punto's hip, Hardy and Casilla), it looks like the Twins just had no idea how serious the injury was, or even what it was. In Mauer's case, they felt like they would rather play short-handed than not have him for a few days (and Mauer made that look almost prudent). And with Punto's finger, he was seemingly rushed back.

I had thought that the Twins were trying to be careful about roster moves. That doesn't seem to be the case. It looks more and more like the problem is that either players are trying to downplay injuries or the medical staff is having trouble evaluating them. I can only think of a couple of games that it has impacted, but it must be driving manager Ron Gardenhire crazy. This is an area that needs some extra attention, and needs it fast.

Maybe Bert is Right (Part 2)

(This is the second part of a 3-part series that I’ll be running on the TwinsCentric blog. You can find Part 1 here. Part 3 runs on June 3rd.)

In Part 1, we discovered that the original metric for evaluating pitcher abuse, Pitcher Abuse Points (PAP), had been declared bunk by its creator, Rany Jazayerli. However, he and Keith Woolner presented another metric called PAP3 to replace it. It also starts tabulating a pitcher’s abuse points at the 100-pitch mark. The evidence that it has any correlation to pitcher abuse is supposed to be in the Analyzing PAP essay, which is divided into two parts. We’ll look at the first part of that essay today.

Analyzing PAP Essay

While the initial intent of PAP was to study whether a pitcher is at risk for injury or permanent reduction in effectiveness, Woolner and Jazayerli tried to get at that indirectly. They broke their study into two parts. First, they studied whether there is any short-term reduction in effectiveness for pitchers after a long outing. Then, in the second part, they studied whether those high pitch counts can also predict injury.

In part one, they looked at starts for pitchers over a ten-year period (from 1988 through 1998) and compared a pitcher’s performance in the 21 days before and the 21 days after each start. If a pitcher has a high pitch count, do the 21 days after the start reflect a decrease in performance compared to the 21 days before?
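For the curious, here is a rough sketch of what that kind of before-and-after comparison might look like. It is not the essay's actual code: the function name, the use of runs allowed per inning as the performance measure, and the five starts are all made up for illustration.

```python
from datetime import date, timedelta

def window_change(starts, focus_day, days=21):
    """Percent change in runs allowed per inning, after vs. before focus_day.

    starts: list of (date, innings, runs) tuples for one pitcher.
    Assumes at least one start with innings pitched on each side of focus_day.
    """
    window = timedelta(days=days)

    def rate(rows):
        innings = sum(ip for _, ip, _ in rows)
        runs = sum(r for _, _, r in rows)
        return runs / innings

    # The start being studied is excluded from both windows.
    before = [s for s in starts if focus_day - window <= s[0] < focus_day]
    after = [s for s in starts if focus_day < s[0] <= focus_day + window]
    return 100 * (rate(after) - rate(before)) / rate(before)

# Hypothetical starts (date, innings, runs); not real data.
starts = [
    (date(1998, 6, 1), 7.0, 3),
    (date(1998, 6, 6), 6.0, 3),
    (date(1998, 6, 11), 8.0, 2),   # the high-pitch-count start being studied
    (date(1998, 6, 16), 6.0, 3),
    (date(1998, 6, 21), 7.0, 4),
]
change = window_change(starts, date(1998, 6, 11))
print(f"Runs allowed per inning changed by {change:.1f}%")
```

A positive number means the pitcher allowed more runs per inning after the start than before it. The real study would average this kind of comparison over thousands of starts, grouped by pitch count.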

After looking at some initial results, they implemented one more filter. They only analyzed “high-endurance” starting pitchers, or pitchers whose average pitch count is above that of the league. They did this essentially so they could study the better pitchers in the league, and the ones most likely to be pushed. It also provided data that makes a little more sense.

The essay starts with a surprising result: they find a very slight decrease in performance across the board - about 1% – no matter how many pitches a pitcher throws. That is true up through 129 pitches; at 130 pitches, future performance slopes down 2% and at 140 pitches future performance dives about 5%. Those results weren’t terribly in sync with what PAP would’ve predicted, so they tried some other formulas and came up with the PAP3 curve instead.

To summarize part one of the essay, they found that a high pitch count can have a slight impact on a “high-endurance” pitcher’s short-term performance. That impact is about 2% if a pitcher throws upwards of 130 pitches. In what is otherwise a very candid and objective study, I’m a little disappointed by the attempt to frame this as significant:

“Assuming a fairly abusive usage pattern across a staff, a team’s starting rotation could suffer a season-wide decline of about 2%. Considering the effect on both the innings pitched (putting more strain on the bullpen) and extra runs allowed by the starting pitchers, this might amount to perhaps 20-25 runs over the course of a season, worth about 2 to 2.5 games in the standings. It’s comparable to the difference in value between Tim Hudson and Kevin Tapani or Todd Ritchie in 2000. That’s a trade worth making.”

Um, hold it. So if I allow all my pitchers to throw 130+ pitches in 162 games, I’ll decrease my staff’s effectiveness by 2%? And if I allow them to throw just 90, they’ll only decrease by 1%? And we think that’s significant, do we?

Just so we’re clear on what 1% is, one of the metrics that was used was runs against. So Carl Pavano (who has a significant injury history) gave up 119 runs last year while consistently throwing between 90 and 103 pitches. But if his teams would’ve allowed him to throw 130 pitches, he would’ve given up – one more run? Again, I’m supposed to think that’s a significant finding?
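If you want to check my math, here is the back-of-the-envelope version. The function name is mine, and it assumes the 1% and 2% declines from the essay can be applied directly to a season total of runs allowed, which is a simplification.

```python
def extra_runs(runs_allowed, pct_decline_low, pct_decline_high):
    """Extra runs implied by moving from the lower decline to the higher one."""
    return runs_allowed * (pct_decline_high - pct_decline_low) / 100

# Pavano's 119 runs allowed; 1% decline at about 90 pitches vs. 2% at 130.
pavano_extra = extra_runs(119, 1, 2)
print(f"{pavano_extra:.2f} extra runs")
```

That comes out to a little over one run across a full season.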

And, of course, this doesn’t measure what all this was supposed to measure – whether it’s actually dangerous to the pitcher. That comes next, in Part 3….