Thursday, January 14, 2010

An Average Way Below Average

Check out this page for a second. Open it up in a new window while you read this. (Right-click, choose "Open in new window")

The link is to FanGraphs.com, a great site which is largely responsible for how heavily we all rely on Ultimate Zone Rating (or UZR) as a defensive metric. FanGraphs.com updates UZR regularly, so the stat has become readily available, which is a huge reason that it is widely used. The other reason is that it seems to be a good stat for judging defensive prowess; far from perfect, but good.

The page I linked to is one of several valuable ones at FanGraphs.com. It shows qualified right-fielders, and I'd like you to sort by the last column, UZR/150 (which is the projected UZR of players over 150 games). UZR is supposed to measure the number of runs a player is better or worse than average at that position. So in this case, you can see that Nelson Cruz is at the top of the list at +13.4 (13.4 runs better than average), while Brad Hawpe is at the bottom of the list at -25.4 (25.4 runs worse than average).

But what struck me about this list is how two-thirds of the players are below average. Of the 19 qualified players (which I think means they played at least 900 innings in right field), only 6 of them are above average, and those six “saved” just 59.8 runs. But 13 are below average, and they cost their teams 140.3 runs.

Overall, the number of runs saved should be almost exactly the same as the number of runs cost. That’s how you determine what average is.

So what’s the explanation?

  • Possibly the right fielders who played in 2009 were worse than in previous years, and it’s the previous years’ benchmarks that are set. I don’t know if I’ve ever heard how far back UZR goes in determining what “average” plays are.

  • Perhaps right-fielders are frequently replaced late in games by better defensive options. Those subs drive up the average, making the hitting right-fielders look worse defensively.

  • There are definitely a handful of above average fielding outfielders that didn’t play quite so much. In fact, it looks like there are a number of outfielders with between 590 and 900 innings who are above average, but didn’t make this list. I wonder how many of them are superior defensive outfielders who can’t hit enough for a regular spot and spend time there for some other reason.

Ultimately, I don't know how important this is. But it is interesting that of the guys who are given the most playing time in right field, so many of them are below average. If anyone cares to study this more, or has some takes on it, I'd love to see it in the comments.

23 comments:

Jim H said...

My take is that there is no way that UZR can mean much based on the way it evaluates fielders. But I have written that before. I think there is a better chance that UZR tells you more about the quality of a pitching staff than it does about fielders. More likely, it really doesn't tell you much of anything.

I think it tries to suggest something about range, which is why most right fielders get such poor ratings. I suspect it is taking uncatchable balls and showing that they could have been caught. I think that is why Cuddyer is rated so poorly. I believe he is at least an above average right fielder. Since many right fielders aren't real fast, that probably accounts for what you are seeing.

Actually, being fast isn't really that important for a right fielder. It is a smaller territory in most parks and usually fewer chances than in either center or left. The smaller sample size also probably affects the results. I would guess that rf's probably show a greater variation from year to year than any other position, according to UZR.

ubelmann said...

"I would guess that rf's probably show a greater variation from year to year than any other position, according to UZR."

Weee! Why bother checking the facts when we can remain ignorant and bash a perfectly useful stat!

If you look at the list of all right fielders, qualified or not, the sum of their UZR is -1.7 runs. If you take the sum of the absolute value of their UZR, it's around 490 runs. The -1.7 runs is probably just a matter of round-off error, as the UZR totals on FanGraphs are reported to the nearest tenth.

Taken as a whole, UZR is telling us that major league right fielders were average fielders with respect to their position. SHOCKING! I'm sure that there will be some way to spin this to make UZR look completely useless, of course...

TT said...

My take is more cynical - surprise huh?

If UZR really measured something tangible it wouldn't claim to quantify something we all know it isn't really possible to accurately measure - like an individual player's contribution to preventing runs.

The same can be said of any stat that claims to measure an individual player's contribution to runs scored or wins or pennants or world series championships. The goal is to obfuscate how the number was derived and its actual meaning.

It's not impossible someone just got the numbers wrong. Who would know?

ubelmann said...

We don't exactly need Sherlock Holmes to solve this mystery. Qualified right fielders only accounted for 50.4% of the innings in right field in 2009. (Although perhaps innings played has become a controversial defensive statistic while I wasn't looking.)

All this tells us is that as a group, the part-time right fielders were better defensively than the full-time right fielders. That shouldn't be terribly surprising. The best hitters are more runs per game above average than the best fielders are, so it's easier to have a higher overall value in a hits-well/fields-poorly package than a hits-poorly/fields-well package, although fielding poorly certainly lowers your overall potential value.

And if we turn our heads to the left, we see that in LF, the qualified players accounted for only 35% of innings at that position. The qualified players were 25 runs below average in the field. Not as bad as the 62 runs below average of the qualified right fielders, but if you account for the playing time difference, you're looking at 36 runs below average in LF in the same amount of time that the right fielders accumulated 62 runs below average. The difference appears to mostly be that RF has a few more outliers on the negative side of the ledger than LF does, in terms of qualified fielders.

(P.S. Fangraphs makes it really easy to sum up this stuff with their export to Excel/CSV feature.)
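
For anyone who wants to check the playing-time adjustment above, here is a minimal sketch in Python. It uses only the figures quoted in this comment (50.4% and 35% of innings, 62 and 25 runs below average) and assumes that the total innings played at each position are roughly equal, so the innings shares can be compared directly. The function name is just an illustration.

```python
# Figures quoted in the comment above (2009, qualified players only).
RF_SHARE = 0.504   # share of all RF innings played by qualified right fielders
LF_SHARE = 0.35    # share of all LF innings played by qualified left fielders
RF_RUNS = -62.0    # qualified RF total, runs relative to average
LF_RUNS = -25.0    # qualified LF total, runs relative to average


def scale_to_share(runs, share, target_share):
    """Pro-rate a run total from one innings share to another.

    Assumes runs accumulate in proportion to innings played.
    """
    return runs * target_share / share


# What the qualified left fielders would total in the RF group's playing time.
lf_scaled = scale_to_share(LF_RUNS, LF_SHARE, RF_SHARE)

print(f"LF scaled to RF playing time: {lf_scaled:.0f} runs")
print(f"RF actual:                    {RF_RUNS:.0f} runs")
```

Under that proportional assumption, the left fielders come out to about 36 runs below average, which is the number used above.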

John said...

Thanks for the tip on the exporting thing. I'm looking forward to using that.

Your main point in the 4th comment - that this means that part-time outfielders are better defensively as a group than full-time outfielders - is a great one. It certainly isn't intuitively obvious that would be the case.

I'll tell you what I'd love to see from this exporting if you have the data fairly handy, Mat. I'd love to see the year-to-year correlation of UZRs by player and position for those players with something like 500 innings of time. If you don't do it, I may have to. I generally like UZR to tell me how a player did the year before, but we constantly use it the other way - to tell us how a player will do the next year. It's not intuitively obvious to me that the stat should be used that way.

sean said...

John: Already done. Even shows the correlation for several BIP levels and compares it to wOBA.

Bryz said...

Is it possible that the instant subtraction of runs ("positional adjustment" or something like that) is to blame? Even before considering how good/bad the right fielder was, he would be penalized simply because he's playing a rather easy position. That might explain why so many fielders are considered to be below average.

Mike C said...

Cuddy is lucky to be called slightly below average in the field. So I think the comment about him being above average is a bit off base. Most games he looked OK in the field but he had several that were terrible. At least he has a good arm to make up for some of it.

As for the stat, I think it's a good general barometer of how good someone is in the field. It's not absolute, IMO. Players with a 7.0 UZR and an 8.0 UZR are probably not going to be noticeably different in fielding ability, but a 3-5 point gap will probably be noticeable when you watch the players play.

Anonymous said...

I may be wrong but runs saved by UZR doesn't include runs saved by the RFer's arm. Hardball Times did an article a few years back on RF arm strength and found Cuddyer's arm saved 14 runs over the season, despite his range, b/c he held runners from taking an extra base and had a lot of OF assists that year. Not sure if that's still true but I would think that would be an important part of the overall equation. Here's the link:
http://www.hardballtimes.com/main/article/best-outfield-arms-of-2007/

Jim H said...

Ubelmann, you are right, I am not about to check the "facts". If I understand UZR (and I probably don't), it takes where a ball is caught or lands in a player's zone and assigns a probability of whether it should have been caught or not on each play during the season. It also factors in how hard the ball was hit based on an observer's observation. It then translates this to runs prevented or allowed.

Since it is almost impossible to factor in positioning, that seems to be ignored. Cutting off balls in the gap and preventing baserunners from advancing is also ignored. I think maybe someone adds in runners thrown out for outfielders but I don't know how it factors in baserunners not trying to advance because of a strong arm.

I am a former high school math teacher. I have played, coached and umpired games. I just don't see how UZR numbers can possibly measure what they say they measure. There are so many places for errors to creep into the system it is difficult to catalog them all.

When I say Cuddyer is probably an above average fielder, that is only based on observation on TV. Whenever UZR numbers don't match someone's perceptions they start looking for reasons to account for that. I say, don't bother. Not even people who see many games like scouts, reporters and even players or managers seem to agree on who the best fielders are.

As I have suggested before, when a player's UZR number varies wildly from one year to the next, it is probably more an indication of better or worse pitching rather than any change in the player's actual defensive play. But maybe not, who knows.

ubelmann said...

"I am a former high school math teacher. I have played, coached and umpired games. I just don't see how UZR numbers can possibly measure what they say they measure. There are so many places for errors to creep into the system it is difficult to catalog them all."

Certainly I would not advocate that UZR (or *any* statistic, measuring offense, defense, or pitching) is somehow an indication of an absolute truth. But I like the term Mike C used for UZR. It's a good general barometer. To me, the only thing worse than getting a somewhat flawed numerical handle on things would be to not try at all.

If offensive statistics didn't exist before 2000, we could poke just as many holes in them as we do defensive statistics. We could say that batting average doesn't take into account whether a hitter faced a nasty Randy Johnson slider or a tasty Livan Hernandez meatball. Batting average doesn't take into account whether or not a hitter got jobbed on a called strike three. It doesn't take into account whether a hitter faced a bunch of tough pitchers for the last month or whether a hitter had bad luck for a month. It doesn't take into account whether a hitter has a butt-ugly swing or a good approach at the plate or a bad approach. We could go on and on if we wanted to.

But because we see batting average evolve over the course of the season in front of our eyes, right below the player's name every time he comes to bat on TV, we've come to accept that while it is not perfect, it gives us a good idea of how likely a hitter is to get a hit. If we're just going to toss out defensive statistics altogether because there could be errors, we might as well toss out all statistics, no matter how long they've been around. No batting champ, no home run title, no significance to 20-game winners, no talk of triple crowns, no talk of perfect game totals, no-hit totals, etc., etc., etc.

From my point of view, defensive statistics are only under such a crazy microscope because they are new. New is not the same as useless, though.

(And Jim, I apologize for the excessive snark last night. You seem like a pretty reasonable guy after your last comment, but I definitely think you are underselling the utility of defensive statistics.)

Anonymous said...

I'd like to know more about how it's decided whether a ball should have been caught. Is it just some intern watching the game on TV?

I get the overall concept, but thinking about the specifics makes me skeptical.

Jim H said...

Sorry for the spelling errors in my last post. I will give you an example of why I am pretty skeptical of fielding stats. I remember when the whole Castro vs. Bartlett thing was going on. One year Castro and Bartlett pretty much shared time at ss. According to (I think UZR but maybe some other stat) they were pretty close defensively. The next year Bartlett was sent to the minors to begin the year and Castro played everyday. Later Bartlett came up and played everyday. Bartlett's defensive stats were much better.

Now the general conclusion was that Bartlett was the much better defensive ss. That actually is probably true. But consider, they didn't really play behind the same pitching staff. All of the pitchers were hit very hard at the beginning of the year. Some were removed from the rotation. About the time Bartlett came up, nearly everyone would agree that the pitching staff, especially the starters, was much better.

So were Bartlett's better stats a reflection of playing behind a better staff? Or would you say the staff was better because of Bartlett? How do you account for their stats being so similar the year before?

I will give you one more specific type example. Let's say that the Twins are using an extreme shift against Thome. Let's also say he hits a soft liner right through the middle of the 3b's "zone". It is a hit. How does UZR handle that? Does the observer change the zone so the 3b doesn't look so bad?

You might say that type of thing evens out over time but how do you know that? Does it keep a player from being penalized on that specific play?

I remember when Ripken won gold gloves over guys like Gagne. The justification was that even though he didn't have the range of other ss, Ripken was so good at positioning that he made up for his lack of range that way. Do you think that can be true? Does UZR really reflect that ability?

Sorry if my posts come off as being too anti new stats, but I think before I am going to embrace things like UZR, one must show me that they really work as advertised.

Jack Ungerleider said...

I usually leave the stat discussions to the statheads. One thing I'd like explained is what "??? Above Average" means. (Where ??? is Arm, Range Runs etc.) How are those numbers calculated? What I'm reading in Jim's last post is that some of this is based on subjective judgment calls.

If that is the case I don't think ubelmann's comparison to batting average is legit. BA is a discrete measurement of a series of events. The event (a plate appearance) has one of two classes of outcomes, those that constitute an "at bat" and those that don't. Within the "at bat" category you have one of two options, the player reaches base safely (a hit) or the player is out (not a hit). The rules dictate what is counted and the event outcome determines the result. Time has taught us that a hit 25% of the time is about average while 40% of the time is almost impossible. If the UZR components break down the same way then it may be a reasonable measure. Time will help us learn how to interpret the measurement.

TT said...

"I may be wrong but runs saved by UZR doesn't include runs saved by the RFers arm."

It appears from the description on the FanGraphs site that UZR does consider an outfielder's arm and it considers Cuddyer's below average.

"It's a good general barometer."

What does that mean? A barometer accurately measures the barometric pressure. You can use that to predict changes in the weather. But those predictions are often wrong.

The problem here is that the measurement itself is inaccurate. It is like having a clock or thermometer that is close most of the time except when it isn't. You can use them to tell you the time or temperature only when the result doesn't really matter.

By its own standard, UZR claims to tell us exactly how many runs a player's defense costs or saves his team. If it doesn't do that, then it is as useless as using a spring as a yardstick.

It seems that UZR, like a lot of similar stats, is gospel when it supports someone's point of view and is only unreliable when it doesn't.

TT said...

Sorry - I misstated this:

"UZR claims to tell us exactly how many runs a player's defense costs or saves his team."

It only claims to be accurate to within 1/10th of a run. Not exact.

TT said...

"We could say that batting average doesn't take into account whether a hitter faced a nasty Randy Johnson slider or a tasty Livan Hernandez meatball. Batting average doesn't take into account whether or not a hitter got jobbed on a called strike three. It doesn't take into account whether a hitter faced a bunch of tough pitchers for the last month or whether a hitter had bad luck for a month. "

We can and do say all of those things. So what? Everyone understands that life isn't always fair and results do not just reflect personal accomplishment. It's quite a different thing to claim you have accurately accounted for which is which.

sean said...

Jack: What's the result of a fielding play? Fielded or not fielded. In the fielded case you have properly fielded or some error occurred.

Some people will have 95% fielded, while others might have 90% fielded. Average out all the plays at a position and then you can calculate what's above or below average.


Anonymous: The catchability of a ball is determined by how often a ball that is hit similarly is caught. A fielder that is perfectly positioned for every play will mess up things a bit, but this is balanced out by the fielders that are never positioned properly. Plus you can use many years' worth of data to get a better average.


TT: Would you judge any offensive stat on two months of data? Ortiz had a .570 OPS after 46 games. He had a .940 OPS after that.

UZR makes no claim to perfectly represent a player's true fielding talent, just an educated assessment of what actually happened.

TT said...

"The catchability of a ball is determined by how often a ball that is hit similarly is caught."

Determining whether a hit is similar is a non-trivial task and highly subjective.

"UZR makes no claim to perfectly represent a player's true fielding talent, just an educated assessment of what actually happened."

It explicitly claims to measure exactly how many runs a player saved or cost his team - to within one tenth of a run. If you mean no one really believes that, I hope you are right. How a player saves his team one tenth of a run during a season is beyond reason.

But then you might ask why you don't believe that but you take on faith that they have accurately determined the similarity of a hit ball. What we are dealing with here is something called "willing suspension of disbelief". It is all theater.

Bryz said...

"I'd like to know more about how it's decided whether a ball should have been caught. Is it just some intern watching the game on TV?"

I'm going to answer this question in regards to the outfield. The infield is the same, except UZR only measures for grounders, not liners or fly balls.

The outfield is split up into x number of zones. Each of these zones is then divided up into the types of balls hit to that zone (for this explanation, I'll just say soft line drives, hard line drives, fly balls, etc.). For each type of batted ball, there is a probability of a catch, based on how often all fielders have caught that kind of ball. So, a fly ball hit to straight-away center field has a value of something like .95, because 95% of all center fielders caught fly balls with a similar trajectory and landing zone. A hard line drive hit to the exact same spot would have a different value, because it would be tougher to catch (let's say .80), so 80% of all center fielders made a catch on balls similarly hit to that spot on the field.

Let's say during a game, your center fielder catches a fly ball like I described above. Then that fielder is credited with 1-.95=.05 runs, because he made a catch that 5% of all center fielders did not make. However, if he missed it, he is penalized .95 runs because he missed a ball that 95% of all center fielders normally caught. You can see how defensive positioning prior to the ball being hit can affect a fielder's UZR. Also, this is why the Baggy in the Dome was blamed for some of Cuddy's low UZR ratings in RF, because it prevented him from catching some balls that other fielders could make in a different ballpark where there isn't a fence blocking their ability to catch a ball hit in a certain zone.
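
Here is a rough sketch in Python of the crediting scheme just described. It is only an illustration of the simplified version in this comment, not the actual UZR calculation: the zone name, the table of catch rates, and the function name are all made up for the example, and the .95 and .80 values are the "let's say" numbers from above.

```python
# Hypothetical league-wide catch rates by (zone, batted-ball type).
# The two values are the illustrative ones from the comment, not real data.
CATCH_RATE = {
    ("straightaway_cf", "fly_ball"): 0.95,
    ("straightaway_cf", "hard_liner"): 0.80,
}


def play_credit(zone, ball_type, caught):
    """Credit for one batted ball under the simplified scheme above.

    A catch earns (1 - catch rate); a miss costs the full catch rate.
    """
    rate = CATCH_RATE[(zone, ball_type)]
    return (1.0 - rate) if caught else -rate


# Three made-up plays for one center fielder.
plays = [
    ("straightaway_cf", "fly_ball", True),     # routine catch: +.05
    ("straightaway_cf", "fly_ball", False),    # missed a routine ball: -.95
    ("straightaway_cf", "hard_liner", True),   # tougher catch: +.20
]

total = sum(play_credit(zone, ball, caught) for zone, ball, caught in plays)
print(f"Total credit over {len(plays)} plays: {total:+.2f}")
```

Notice how one missed routine fly ball wipes out the credit from many routine catches, which is why a handful of bad plays can drag a season total down.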

Additionally, there are positional adjustments (as I mentioned in my previous post) given to fielders. This is because we should not treat an average shortstop as having the same defensive ability as your average first baseman. Therefore, these adjustments are:

Catcher: +12.5 runs
Shortstop: +7.5 runs
Second Base: +2.5 runs
Third Base: +2.5 runs
Center Field: +2.5 runs
Left Field: -7.5 runs
Right Field: -7.5 runs
First Base: -12.5 runs
Designated Hitter: -17.5 runs

I'm not the right person to explain this any further; you may want to go to fangraphs.com or Parker Hageman for that.

sean said...

TT: You are reading far too much into it. First, yes, a player can save a tenth of a run. Second, no one thinks that a +5.0 UZR means that a fielder saved precisely five runs. You are jumping from "it measures things to the tenth of a run" to "this is impossible, therefore completely false". There is some middle ground.

A pitcher can have a 3.87 ERA and thus be expected, roughly, to give up 3.87 runs every nine innings. How is this any different? Sometimes he'll give up four or five runs, sometimes two or three. It averages to tenths and hundredths of runs.

Someone with 150 hits out of 500 ABs would have a .300 average. Someone with 151 hits would have a .302 average. One hit isn't a big deal, right? But there's still the same number of significant digits as UZR. Ditto with ERA. No one cares because everyone knows the difference is negligible. Same thing with UZR. There is no difference between +5.0 and +4.5. This is why everyone uses integer runs for projections and true talent estimations.
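
The batting average arithmetic is easy to check. A tiny Python sketch, using only the hit and at-bat totals above (the function name is just for illustration):

```python
def batting_average(hits, at_bats):
    """Hits divided by at-bats, the usual definition."""
    return hits / at_bats


# One extra hit in 500 at-bats moves the third decimal place by two.
for hits in (150, 151):
    print(f"{hits} hits in 500 AB: {batting_average(hits, 500):.3f}")
```

The stat is reported to three decimals even though a single hit accounts for the entire difference.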

Anonymous said...

Do they use radar guns or map the trajectory using tools? I just have this image of someone leaning back in a chair saying, "eh, we'll call that one a soft liner." I know that's essentially how they score hits and errors, and I'm not trying to be contrarian, I'm genuinely curious.

John said...

Good discussion out here. For those of you wanting more info on UZR, I did a post detailing it a few weeks back at:

http://twinsgeek.blogspot.com/2009/10/theres-stat-for-that-uzr.html

I'm sorry I didn't include that in the original post. Brain lock.