Thursday, July 28, 2011

Denard Span vs. Drew Storen

A Defense Of The Value Of Relievers

You can blame this extended column on John Dyer-Bennet, my Calc II professor back in 1985. He’s the guy that instilled in me a very high standard for what is “intuitively obvious.”

Yesterday, rumors heated up nationally about the Washington Nationals’ interest in swapping their closer Drew Storen for Twins center fielder Denard Span. The leading indicator of fan reaction, Twitter, nearly self-combusted. I’d estimate that 90-95% of the reactions varied from “this is a terrible idea” to “the Twins need to get more than Storen.”

There is no doubt that some of that is a knee-jerk reaction to last year’s Matt Capps-Wilson Ramos trade. There are too many similarities to ignore: the Twins acquire the Nationals closer at midseason for a young, cost-controlled, up-the-middle defensive player. And as someone who ripped the hell out of that trade the day it was made, I can sympathize.

But Span is not Ramos, and Storen REALLY isn’t Capps. Comparing Storen to Capps because they’ve both been Nationals closers is akin to comparing gold to lead because they’re both metallic elements. Storen isn’t an average reliever who happened to be plugged into the closer role for the Nationals without throwing up all over himself for three months. He’s the real deal. I’ll let some other blogs (or the comments section) give the statistical breakdown, but if a deal goes down, rest assured that the Twins are getting a good ’un here. For lack of a better comp, think Joe Nathan, with a year less service time.

But the comparison that counts isn’t Storen-Capps, it’s Storen-Span. So let’s compare them.

If you look at the “other stuff” that we pay so much attention to – things like salary, age, service time, contract options, health – there is no doubt that Storen comes out ahead. He’s younger. He has less service time. His money isn’t guaranteed so there is less financial risk. He’s under team control longer. He’ll be cheaper for the next four years (and Span will be a FA by then). I suppose one could argue about the health risks inherent with a pitcher vs. a position player, but Span’s concussion history would seem to balance that out.

On all those fronts, Storen gets the checkmark. To me, that is intuitively obvious.

It also doesn’t appear that Span is any better at his role than Storen is at his. Without a lot of analysis, Span would appear to be better than about 2/3 of the center fielders in the majors. But there is no question that Storen is quite a bit better than 2/3 of the relief pitchers in the majors. That figure might be as high as 90%.

(Dyer-Bennet would hate that last paragraph. But this story is already gonna go extremely long. I’ll take the demerits and move on. If someone wants to challenge it or do the analysis, you can get the extra credit.)

Instead, what is intuitively obvious to everyone – save me, apparently – is that an everyday position player is much more valuable to a team than a relief pitcher. It is so obvious that several Twitter users were flummoxed that I would even ask why they believe that. I was accused of playing dumb or trolling.

But gratefully, some did reply, and I’d like to examine the arguments.

Everyday Players Play And Do A Lot More
We track a lot of statistics for baseball, and relievers usually have the fewest of those statistics. It’s reasonable to suggest this shows a higher level of value for position players.

But comparing the overall value of those stats becomes problematic. First, there is the problem that hitters and pitchers have different statistics: how does an RBI double compare to a scoreless eighth inning? Then there is the problem of context – how did those hits or outs impact the game?

Indeed, measuring the value of players in a single game is problematic, let alone for a full season. For instance, in last night’s 7-1 win, who was more valuable: Brian Duensing (6 2/3 IP, 1 run) or Joe Mauer (2-4, 3R, 2RBI)? You might have your opinion, I might have mine. There is no intuitively obvious answer. How would one measure such a thing?

One way would be to try and measure each player’s impact on a game. You know that Mauer’s single in the fourth inning helped the Twins and impacted the game. You know that Duensing’s scoreless sixth inning impacted the game. But you don’t know exactly how much each impacted the game.

But what if I told you that historically (counting thousands of MLB games), teams that were in the same position as the Twins were when Mauer came to the plate had won 62% of their games. But after that hit, teams in roughly that same position had won 73% of their games. It would be fair to give Mauer credit for that 11%, wouldn’t it?

And what if I said that when Duensing took the hill in the middle of the sixth to protect a 3-run lead, historically teams had won 83% of their games? But that teams who still had a 3-run lead at the end of the sixth had won 89% of their games? Wouldn’t it be reasonable to suggest that Duensing and the Twins defense should get credit for driving that game 6% closer to a win?

And if you’re trying to determine the impact of a player on a season, isn’t it reasonable to add up all those percentages – both positive and negative – and see how a player impacted his team?

This is the theory behind Win Probability Added (WPA).
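For those who would rather see the bookkeeping than read about it, here is a minimal sketch of that idea in Python. It only uses the two examples above (Mauer's hit and Duensing's sixth inning); the `Event` class and `season_wpa` function are names I made up for illustration, and a real WPA calculation would pull the before-and-after win expectancies from a historical table rather than typing them in by hand.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One plate appearance or inning, with the team's win expectancy around it."""
    player: str
    we_before: float  # win expectancy before the event (0.0 to 1.0)
    we_after: float   # win expectancy after the event (0.0 to 1.0)

    @property
    def wpa(self) -> float:
        # Credit (or blame) is simply the change in win expectancy.
        return self.we_after - self.we_before


def season_wpa(events):
    """Add up every event's change, positive or negative, per player."""
    totals = {}
    for e in events:
        totals[e.player] = totals.get(e.player, 0.0) + e.wpa
    return totals


# The two examples quoted in the post.
events = [
    Event("Mauer", 0.62, 0.73),     # single in the fourth
    Event("Duensing", 0.83, 0.89),  # scoreless sixth with a 3-run lead
]

totals = season_wpa(events)
for player, wpa in totals.items():
    print(f"{player}: {wpa:+.0%}")
```

A strikeout with the bases loaded or a blown lead would show up the same way, just with `we_after` lower than `we_before`, so it subtracts from the player's total.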

(And this is where I lose a big chunk of the sabermetric stats guys. Because while you might think that they would love this stat, my experience is that most of them dislike it. The most common criticism? They don’t like the results. It’s usually expressed by saying something like “But that says that Phil Dumatrait has been more valuable than Carl Pavano!”

And I gotta say, as someone who championed sabermetric stats closer to their infancy, that reaction makes me want to cry. Bill James talks about how he used to think that once he explained his discoveries to baseball teams, and proved his methods, they would accept them. Instead, they would say something like “That can’t be true – it shows that Darrin Erstad isn’t valuable! He’s a gamer!”

The parallels are obvious. It drives me crazy to think that the high priests who pride themselves on championing baseball research are those most passionate about discrediting stuff like this. I’m not kidding about the wanting to cry thing. I honestly feel a small buzzing below my ears when I hear people say crap like that. For those of you looking for a hot button, you found one.)

Anyway, there are flaws with WPA. One is that it gives credit to the pitcher for the defense behind him, which most traditional sabermetricians suggest is worth about 1/3 of the value. Obviously, that also means fielders don’t get that credit, either. We’ll try to accommodate that a bit.

Another criticism is that even though WPA tries to value hits and scoreless innings in the context of game, it doesn’t take it far enough. The probabilities reflect average teams and not true probabilities of facing teams. For instance, ideally it would assign a higher probability of holding a lead versus the Royals as opposed to the Yankees.

(There are likely other flaws, too. It took time to uncover some flaws in the Pythagorean Formula, Runs Created, UZR, VORP and WAR. We’ll likely find some more in WPA too, provided we continue to actually study it.)

In terms of impacting the game, Denard Span leads all Twins hitters, having added 84% to the team’s probability in the 56 games he played. If you want to see all the Twins, both hitters and pitchers, you can do so .

And Storen? +236%. Even if we give 1/3 of that credit to his defense, and even if we give Span an extra fifty points for the above average defense he has played in center field, Storen has impacted the Nationals a bit more than Span has impacted the Twins this year. He also has had that impact while being a closer on a team that is five games under .500.
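To show my work on that comparison, here is the arithmetic as a short Python sketch. The raw WPA figures are the ones quoted above. The two adjustments are my own rough assumptions from this post, not part of WPA itself: I keep 2/3 of Storen's total (handing 1/3 to his defense), and I read the "extra fifty points" for Span's glove as fifty percentage points of win probability.

```python
# Raw WPA totals quoted in the post, in percentage points.
storen_raw = 236
span_raw = 84

# Assumption: about 1/3 of a pitcher's WPA really belongs to the defense.
PITCHER_SHARE = 2 / 3

# Assumption: Span's above-average defense is worth an extra fifty points.
SPAN_DEFENSE_BONUS = 50

storen_adj = storen_raw * PITCHER_SHARE
span_adj = span_raw + SPAN_DEFENSE_BONUS

print(f"Storen adjusted: {storen_adj:+.0f}%")
print(f"Span adjusted:   {span_adj:+.0f}%")
```

Under those assumptions Storen lands at roughly +157% and Span at +134%. Change either adjustment and the gap moves, which is part of why I call the two close rather than declaring a winner.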

How can that be the case? Because one thing WPA shows is how a manager can leverage the value of a player at critical points. Very good relievers can have very high or very low WPA scores because a manager will consistently put them in the right place at the right (or very wrong) time. If they come through, they save the game and increase the probability of winning significantly. If they blow it, they can lose a ton of those probability points. For instance, for the Twins, Glen Perkins is second on the team with +151%. But Matt Capps is near the bottom at –90%. The swings for relievers can be volatile – which brings us to the next point.

Relievers are too volatile to be valuable.
This is the point that makes the least sense to me. If relievers are more volatile than position players, wouldn’t it mean that the relievers who perform are more valuable? There’s a reason that tech stocks that perform are valued sky-high. It’s because tech stocks are volatile, and those that perform are worth a lot more – even more than regular high-performing stocks.

I think what is really meant here is “I don’t trust Drew Storen, because relievers are volatile and Storen is a reliever.” I can’t make you trust Drew Storen. If it makes you feel better, most of the tweets I saw yesterday concerning the trade from Nationals fans were also rending their garments. Apparently they trust him.

Good center fielders are more rare than good relievers.
There is another definition of value beyond impact: rarity. The more rare a commodity, the more valuable it is. I argued this several times during the offseason when berating the Twins for offering arbitration to Matt Capps.

The problem with comparing Span and Storen on that basis is that they’re both exceedingly rare. One doesn’t find 27-year-old center fielders with a career OBP of .366 on the free agent market, and one doesn’t find 23-year-old fireballers with a sub-one WHIP on the free agent market, either. If we did, my best guess is that Storen would probably get a better contract than Span, but I can understand those that are wary of him being overpaid because of his “closer experience.”

But I’m sure about one thing: they’re close to each other in the rarity department. For this exercise, that’s enough.

An everyday player is harder to replace than a reliever.
Usually, this is demonstrated in one of two ways.

The most common is anecdotal. “The Rays signed Juan Cruz to a minor league deal and look what he did for them this year.” Or, more regionally, “Nobody thought Glen Perkins was going to be any good, and look what he did.” Certainly, there are several success stories throughout each season that are similar.

Of course, there are a lot of disasters, too. There is a reason that at every trade deadline relievers are a hot commodity, and believe it or not, it’s not because every GM of every really good team is too stupid to sign good relievers. It’s because, going back to an earlier point, relievers are really volatile.

When you have a lot of volatile commodities, many are much better than you think they’re going to be and many are much worse. If you only look at the ones that over perform, you feel like an idiot. “Look how that tech stock came through. Why didn’t all the traders pick that? They could’ve had it at a record low price. It was easy. Why are they all such idiots?”

They aren’t idiots – they’re just in the business of picking volatile assets. A bunch of them are going to over perform and look good. A bunch are going to under perform and look dismal. But looking at the good ones and concluding that good tech stocks are easily replaceable is foolish.

The second way is to use a formula like VORP or WARP or something that ends in RP, which stands for “replacement player.” The problem with using those kinds of metrics when evaluating relievers is that it misses the context of what they do. It relies on the number of innings they pitched, and since they don’t pitch many innings, they’re not very valuable. We know that isn’t true because of the importance of the innings that they are put in. They are really, really poor metrics for evaluating relievers.

It isn’t clear to me how to judge what a replaceable player is at each position, at least not in an overnight entry. So instead of looking at a player in relation to a “replacement player” which is supposed to be freely available talent in AAA, let’s look at it in relation to an average MLB player or pitcher.

Certainly, if you use that in relation to WPA, we’re going to get the same result as before. WPA compares both hitters’ and pitchers’ performance to how players have historically affected games or to an average player. Storen has improved his team’s chances +236% over an average pitcher. Span has improved his team’s chances of winning +84% over an average hitter, plus he’s saved about 10 runs over an average center fielder. That’s what we came up with before.

If you prefer to use something like runs, I suppose we could compare Span’s runs created to the median center fielder’s runs created and tack on the defensive runs he saved. Then we could compare Storen’s runs saved to it using Baseball Prospectus’ great little report. But it’s after midnight, and I don’t see that report anywhere on their site right now.

That might show me I’m wrong – that Span’s impact is quite a bit greater compared to an average center fielder than Storen’s is to an average middle reliever. If someone wants to do that and post it somewhere just let me know in the comments below. I’m cooked. Perhaps that is why it is still not obvious that Span (or any effective hitter) is more valuable than Storen (or any effective reliever). Or perhaps it is because it isn’t obvious at all.


DrJubal said...

I don't know what's right either - but I love your analysis and the fact that you wrote this up for us.

Emotionally I think people love DSpan and would be sad to see him go.

Washington is 16 out - so they're obviously not looking to contend this year - do they want their "CF of the future" at a locked down price?

I'm just wondering what's driving THEM.

Anonymous said...

The sabr problem with WPA has nothing to do with not liking the results. The problem is it is not predictive. It's a fun little toy to tell a story about what happened in a game, it's not intended to be used to show that one player is or was more valuable than another.

TT said...

"But what if I told you that historically (counting thousands of MLB games), teams that were in the same position as the Twins were when Mauer came to the plate had won 62% of their games. But after that hit, teams in roughly that same position had won 73% of their games. It would be fair to give Mauer credit for that 11%, wouldn’t it?"

No, it wouldn't. It assumes all the other factors for who wins are random and they aren't. And no matter how large the size of your sample, that does not make it random.

You pull out one, what teams are playing. But there are decisions by both the manager and the players on both sides designed to optimize the outcome. Those decisions impact both how a team got there and what happens afterwards, including the role of players.

Duensing was not randomly pitching the sixth inning. The manager decided to send him out there. The factors that lead to that decision include things like how many innings the bullpen has pitched. And Duensing's pitching deeper into the game affects those decisions for today's game. And, as anyone who has watched a tired bullpen blow late game leads realizes, those factors are not insignificant.

I agree that Span is not Ramos. Span is worth a LOT more than Ramos. I say that as someone who thought Ramos was the Twins top prospect. The simple truth is that lots of players with potential never live up to it. Ramos is now struggling at the plate in Washington. The Twins apparently were concerned about his ability to control his weight. We may look back on that trade as terrible, but it certainly hasn't been so far. The Twins made the playoffs last year and they are still in the race this year at least in part because they have Capps.

"They aren’t idiots – they’re just in the business of picking volatile assets. A bunch of them are going to over perform and look good. A bunch are going to under perform and look dismal. But looking at the good ones and concluding that good tech stocks are easily replaceable is foolish."

You are mistaking "volatility" for risk. And looking at high risk investments and thinking that you are smart enough to determine which ones are which in advance is foolish.

You don't compare investments in a single bluechip stock to an investment in a single speculative stock. You invest in several speculative stocks and assume some will be winners and others losers. In other words, it's a numbers game.

The problem with "replacement players" is that they are purely imaginary. Who will replace Span, who will pitch instead of Storen? In neither case will it be "replacement players".

In the case of Span, he will be replaced by Revere and Repko. In the case of Storen, it's not clear who he replaces. Obviously the worst short guy in the bullpen loses the roster spot. But Storen is going to be pitching late in the game likely taking innings from Capps and Perkins this year. What makes this trade work is that Storen may well be the guy who steps in as closer next season.

In short, this is the kind of deal that may work for the Twins this year. It can help them win now, but it is an investment that should pay off in the future. And they are giving up a guy who, while valuable, they have a ready replacement for in Revere. Revere is not as good as Span and probably never will be. But the Twins have shown they can win with him in the lineup.

BeefMaster said...

Okay, here goes...

How can that be the case? Because one thing WPA shows is how a manager can leverage the value of player at critical points. Very good relievers can have very high or very low WPA scores because a manager will consistently put them in the right place at the right (or very wrong) time.

I was going to attempt a spiel on how the WPA of position players and pitchers are not really comparable, but then you went ahead and did it for me. Relief pitchers (ESPECIALLY closers, who 90% of the time get to end their appearance with their teams at a 100% ExW) get a boost in WPA because they're put in positions with much higher leverage. Ron Gardenhire can't send Denard up every time he needs a clutch hit, but Ron Washington can send Storen in whenever he needs three outs with a three run lead, and boom, there's 7% WPA for Storen. Obviously, he's done a lot more than 3-run 1-inning saves, given his WPA, but I'm still not convinced that starter and reliever WPA are comparable, never mind pitcher and hitter.

Also, Storen's WPA was -90% last year.

If relievers are more volatile than position players, wouldn’t it mean that the relievers who perform are more valuable?

This misunderstands the complaint. I mentioned this in a tweet as well, but I wasn't talking about all relievers as a group - I was pointing out that individual relievers have a tendency to be volatile. Yeah, maybe it's small sample sizes at work, but that's even more reason to be a bit leery of them. Your later comment was pretty much right - I don't trust Drew Storen because he's a reliever. He's pitched less than two seasons in the majors, and his K/9 and BB/9 had a pretty big variance between those two seasons (with what looks to me like a fairly substantial drop in both this year; that's good for BB and bad for K).

It's not just Storen, of course - I feel generally the same about any relief pitcher, especially absent a long track record. Sure, he's been good for two years, but his strikeout rate is good, not great, and he's being greatly helped by a .221 BABIP (which, in his defense, is accompanied by high ground ball and low line drive rates). I'm not saying his performance is a mirage, but I'm not willing to say it isn't, either, and I'm certainly not willing to wager Denard Span on the possibility.

John said...

TT, you can't give a guy who was up til 1 AM posting that much to chew on early in the morning. I'm reduced to short takes:

1) WPA has its limitations, and it might be lying in that it is attempting to put a precise number on something that is not precise. I don't know how to judge that except in the grossest of ways: empirically, on a day-to-day basis, I find it accurately grades those players who impacted a game.

2) Span is not Ramos. I think we can agree that is intuitively obvious. I half wrote a paragraph comparing them and decided as much and erased it.

3) Volatility and risk are not precisely the same, but the analogy still works. The point was that people feel that because they can identify a lot of over performing examples after the fact, that they are replaceable. I don't think that's the case, and it appears you don't either.

4) I've read the study on "replaceable" players. I like it more than I thought I would. I think it's in BP's 2002 Annual if you would care to check out how they judge that statistical level. I agree that it is flawed, and I think the definition that most people embrace is erroneous. But it's been a few months since I checked it out and don't have the energy to dive into this.

5) Finally, I stopped short of saying I would make this trade. I just wanted to see if the players were comparable in value. I think they are.

John said...


You're right in that WPA may not be particularly predictive, though in truth I've never figured out the correlation coefficient for it from one year to the other. It might be stronger than I think. But that shouldn't stop sabrists from using it for something like evaluating a past performance. And you're right that it is a fun little toy, but so is the run expectancy matrix, and sabrists use that (correctly) for a lot more than that.

John said...

Beefmaster, I think the two issues you bring up are related and I almost touched on it last night.

But first, on your first point: It sounds like you're saying that because relievers can be used in high leverage spots, and position players cannot, that we should somehow discount the value of relievers. But this is inherent to the position.

Which brings me to the second point. High performance relievers might be perceived as more volatile precisely because they are used in those positions. The hiccups cost more. The times they shine reward more.

My final point is that those close, high leverage situations exist all the time. Whether you want to trust someone or not, a manager has to find someone to handle those situations.

Now, I assume for most of the past dozen years you trusted Joe Nathan in those situations. That was earned, and it was earned by watching him on a day in and day out basis with the Twins, something you haven't had a chance to do with Storen. It's likely a good sign that his own fan base seems to trust him.

Sky said...

Thanks for taking the time to outline your thoughts in (much) more than 140 characters. Some thoughts:

WPA, for position players, measures offense ONLY. Span is a center fielder and plays center field pretty well. That's probably an extra 1-1.5 wins in his favor over a full season.

WPA is relative to average, when we really should be measuring relative to replacement player. A full time replacement position player would rack up about -2 WPA, while a full-time reliever would rack up just about 0 WPA. That's two more wins in favor of Span over Storen.

Regarding fluctuation of WPA for relievers, it goes beyond regular ERA fluctuation, because the timing of the runs matters. Take Mariano Rivera, the most consistently dominating closer ever. His WPA's range from 1.5 to 5.5. So yes, there is a huge difference in what we observe relievers doing. And there will be large differences going forward. You just can't know which relievers will see extra high-leverage situations and avoid giving up their runs when it really matters. We need to make decisions on what is likely. In 2010, the 10-15th best relievers by WPA had about 2.5 WPA, for example.

Rarity doesn't necessarily equate to value. Guys who can steal 50 bases are exceptionally rare, but you wouldn't give Juan Pierre much money. Guys who can hit the ball 500 feet are exceptionally rare, but you wouldn't give Wily Mo Pena much money.

Moe said...

So, if we do bring in Storen, how does that affect our plan with Nathan? Do we keep him at $12.5m or buy him out for $2m? If we buy him out, then you could almost say the trade would be Span for Storen and another player with the money saved.

Moe said...

As mentioned in Seth's blog post, we'd also save $$ on not bringing Capps back, so that REALLY frees up a bunch of money.

Jack Ungerleider said...

[First off I love the Dyer-Bennet reference. JDB should have also taught you about "rigorous analysis" and this is pretty good. ;-)]

I guess my take on this is I'd rather have an outfield of Revere(LF), Span(CF), and somebody in RF, than most other options. So that's why I don't like this trade.

USAFChief said...

Very intriguing and thoughtful post, John.

One minor quibble: I don't think there's a difference in the number of team controlled years between Span and Storen. Span is signed through 2014, with a team option through 2015. Storen's last arb year will be 2015.

If Storen is successful, it's also very possible he ends up costing more in salary over that time frame than Span, simply because the arb process will likely reward Storen for saves, and Span is already signed to a team friendly contract.

Neither of those are big points, but worth adding to the equation, IMO.

BeefMaster said...

Thanks for responding, John. Back at ya...

But first, on your first point: It sounds like you're saying that because relievers can be used in high leverage spots, and position players cannot, that we should somehow discount the value of relievers. But this is inherent to the position.

This further underscores my point that you probably shouldn't use WPA to compare the value of a relief pitcher and a position player. I suppose you can try doing some tricks to normalize it against leverage index, but even that strikes me as a bit sketchy, and I don't know how far I'd trust those numbers. Relief pitchers, especially closers, are often put in positions to earn more WPA than most position players, and moreover, they can be put into those positions intentionally.

High performance relievers might be perceived as more volatile precisely because they are used in those positions. The hiccups cost more. The times they shine reward more.

I wasn't speaking just in terms of the effects their blowups have on games (although I agree with you). I'm perceiving them as volatile just based on year-to-year performance. Matt Guerrier had an ERA over 5 in 2008, after he'd earned so much trust from the team that he still led the league in appearances. Bobby Thigpen set the saves record in 1990 and was terrible by 1992 (and out of the majors at 30). Brad Lidge has had two years with ERAs over 5. Dennys Reyes had a three-year stretch of ERAs that went 5.15, 0.89, 3.99.

Yeah, I'm cherry picking a few cases that happened to come to mind, and maybe you don't think Storen's likely to blow up unexpectedly. That's fine. But perhaps you can forgive my skepticism - Storen's K/BB is worse than Matt Capps' was when the Twins traded for him last year, and his FIP/xFIP are only about a tenth of a run better. I think it's pretty reasonable to be worried about what the Twins would get from him going forward.

Anonymous said...

This was a horrendous blog post. The reasoning here is embarrassing.

"But Span is not Ramos, "

You're right. Span>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>Ramos.

WPA is a bad stat in general, comparing the relative value of a pitcher and position player is absurd. Your sample size is 2/3 of one season? So drew storen has been good at not blowing high leverage saves for 2/3 of one season and you believe this is an important stat to base player value.

My real beef with WPA is not that it's a terrible stat, even though it is. My beef is how cherry picked it is. You could have looked at any number of more complete value statistics that would tell you that denard is significantly more valuable than drew storen. You could have broken down the players peripherals and seen that drew storens good but not elite k/9, bb/9, gb%, fip, xfip dont quite match his 9000000% wpa. But you didn't do any of that, you just wrapped the total value of these players up in a bad value statistic with an irrelevant sample size.

Terrible job.

Anonymous said...

Ben revere has a .578 and sub 300 obp (those numbers are really terrible). Glen perkins is better than drew storen is right now. Its easy to find good relievers. The difference between good relievers and elite relievers isnt that great. Etc, etc.

This would be an awful trade. The analysis in this blog is bad on a number of levels.

frightwig said...

WPA only tells you the probability of winning after a particular event in the game. But when the game is over, a run or an out in the 9th isn't really any more valuable than a run or an out in the 1st. If you've won 2-1, you needed each of those runs just as much as the other.

WPA is really just as circumstantial as RBI. The rating for an action depends mostly on the inning, the score, the number of outs, how many men on base, and even whether a player is at home or on the road. Which makes it a poor measure of performance, but also an unreliable predictive tool.

In evaluating Storen, I'm much more interested in his FIP (currently 3.42) or xFIP (3.32), BABIP (.221), K/9 (7.33), HR/9 (.89), and fWAR (0.5) or rWAR (1.2). He seems like a decent reliever, but not a dominant one, and his BABIP indicates that he's been lucky this season. A guy like him could help the team, but Bill Smith should be able to find other relievers like him without trading an asset like Span.

Anonymous said...

I think this sums it up from Keith Law's chat -
Kevin (MN)
How dumb would the Twins have to be to trade Span for a reliever?

Klaw (12:24 PM)
It's not the optimal strategy.


John said...


"But when the game is over, a run or an out in the 9th isn't really any more valuable than a run or an out in the 1st. If you've won 2-1, you needed each of those runs just as much as the other."

That statement seems obviously true, but I don't think it is. Certainly, the game is played differently. If the Twins score 2 in the first, and carry that 2-0 lead into the ninth when the Rangers score one, but then hold on to win, that game is played differently than if it's 1-1 and the Twins win it in the top of the ninth. The Rangers are less likely to use their best relief pitchers down two runs but probably would in a tie game. Gardenhire is much more likely to pinch hit in late innings in a tie game, and more likely to use a defensive substitution leading 2-0. What's more, both managers are right to do so because the leverage is different. Players would have different goals in those games, too.

Does that make a difference, or am I splitting nits? I think it is important. And I'm quite sure it is much more important to pitch a scoreless eighth than a scoreless second inning in that scenario. I really think you raise an important point. This is where a lot of the disconnect is on this subject.

Anonymous said...

WPA is a fun statistic, but it doesn't have much predictive value. It can show you who got the game winning hit, but it doesn't tell you who will get the game winning hit tomorrow, let alone next year.

Abundant analysis has shown that clutch hitting is not repeatable. Coming through in important situations should not be used to predict future performance; your total stats should. Same for pitchers. Evaluate him by his ERA+, FIP, whatever you like. But not his WPA.

He's just a pitcher with 105 innings of 3+ ERA. That's it. He'll only contribute to 50 of the team's innings in an entire season, 70 at most.
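
To put the 50-to-70-inning figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes a 162-game season at nine innings per game, which ignores extra innings and the ninth innings a team does not have to pitch, so treat the result as approximate.

```python
GAMES = 162            # assumed regular-season length
INNINGS_PER_GAME = 9   # assumption: every game goes exactly nine innings


def innings_share(reliever_innings, games=GAMES, innings_per_game=INNINGS_PER_GAME):
    """Fraction of a team's total innings thrown by one reliever."""
    return reliever_innings / (games * innings_per_game)


for ip in (50, 70):
    print(f"{ip} IP -> {innings_share(ip):.1%} of team innings")
# 50 IP -> 3.4% of team innings
# 70 IP -> 4.8% of team innings
```

This counts every inning the same; it says nothing about how much those innings matter.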

You could say those innings are more important innings. That may be true, though it's debatable. But if you plugged in a different pitcher, you would not see much of a drop off; WPA would inflate his seeming importance just as much. It's like overpaying for someone based on the save statistic. (Which, incidentally, is exactly what the Twins would be doing -- I'd much rather have Clippard. THAT dude is good.)

Revere for Clippard, sure.

But Span for Storen? That's trading a guy with 4 years of .366 OBP, the only decent leadoff hitter they have, and an above average defender at an elite position, all of which would be very hard to replace, for a promising but unpredictable young relief pitcher whose value is inflated by the save statistic and the WPA statistic.

John Sharkey, Esq. said...

Hmmm. Looks like I dallied a bit, and most people have covered what I think about this stuff. (And I'll go on the record as someone who's (a) pro-trading Span, and (b) pretty in the dark about Storen's virtues specifically.) More generally:

I don't find WPA particularly valuable when thinking about how to construct a team going forward. I think there's a case to be made that it works OK when evaluating who was particularly important in a given game/week/month/whatever, but if I'm trying to figure out who will help me win in the next game/week/month/whatever, WPA just seems useless -- we've got far more useful numbers for that (especially for hitters).

This ties, I think, to the volatility point. It's not necessarily that a center fielder is more valuable than a roughly-analogously skilled (however you measure it) RP (although I suspect, over the course of a season, that's true too; so far this year, Storen has faced ~200 batters, vs. ~250 PA for Span). I wouldn't quite phrase it as "RPs are too volatile to be valuable." Instead, I'd say something like "the difficulty in predicting, for any given year, who the effective RPs will be, means it's unwise (as a general rule) to invest too heavily in RP."

I don't know if I phrased that as well as I could have. If you can tell me which relievers will be ~70th percentile in a given season, I'd be happy to trade/pay for them. But aside from a few elite cases, I don't think it can be done. (Whether Storen is one of those cases, I leave to others.) With someone like Span, I'm fairly confident (at least, as much as one can be with human baseball players) in what I think his production will be the next ~3 years. Any RP but Mariano, though, I have very little confidence in year-to-year. That, in a somewhat strange sense, makes them less attractive investments -- good RPs are valuable, but individual RPs aren't (because it's too hard to tell ahead of time who will be good and who won't be).

I'm repeating myself now, I think. I have some intuitive beef with your point about scarcity; I think of CF vs RP scarcity in a slightly different way. The optimal way to get a good CF is to... find a good CF. The optimal way to get a good RP is probably just: draft tons of hard-throwers, move the ones that can't start to the bullpen, and get a few years out of them. A truly consistently good RP is perhaps even more scarce than a good CF, but to me that just suggests that it's a fool's errand to chase RP. Buy in bulk, cross your fingers.

Anywho, that's all for me for now. In any event, good post.

(word verification: "grule". Hmmm.)

Anonymous said...

The Twins shouldn't trade because Phil Mackey and Patrick Reusse said on their radio show today that everyday players are more valuable than relievers. :)

lvl 5 Charizard said...

Stop blogging.

SoCalTwinsfan said...

"Here's another interesting factoid: Blackburn has 11 quality starts so far, two more than Slowey had all of last year."

This has nothing to do with this post, but I don't have a Twitter account. Here's another factoid: Blackburn has allowed at least six runs five times this season. Slowey has done that seven times in his career. Blackburn just has too little margin for error.