Why I Dislike WAR as the Ultimate Player Valuation Metric
Maybe this isn't the best place to post this, but I was looking at the 2010 WAR leaders on Fangraphs and noticed something that, quite frankly, was astonishing to me. Guess who generated as much WAR as Carlos Gonzalez in 2010?
You may not be as surprised as I was, but it's worth a look nonetheless.
The answer: none other than Giants OF Andres Torres.
Now, I fully recognize that Torres had a very good year. A break-out year for him, and certainly one of their more valuable players while helping fill out the top of the lineup and playing very good defense. However, Andres Torres, really? This caused me to look into the individual metrics listed on the Fangraphs 2010 WAR leader page.
Here's what I found:
| Player | HR | R | RBI | SB | ISO | BA | OBP | SLG | wOBA | wRC+ |
| CarGo | 34 | 111 | 117 | 26 | .262 | .336 | .376 | .598 | .416 | 155 |
| Torres | 16 | 84 | 63 | 26 | .211 | .268 | .343 | .479 | .363 | 128 |
There are many other metrics that could be included; I simply tried to get a range from traditional to SABR types, to satisfy all parties (yeah, right).
Yet, their WAR:
CarGo = 6.0
Torres = 6.0
In fact, in Dollars (in millions, also from Fangraphs):
CarGo = $23.8
Torres = $23.9
I don't know about you, but as good of a year as Torres had in 2010, it doesn't look/feel like he provided nearly as much value to the Giants as Gonzalez did to the Rockies. Why/how is this possible?
Fielding (Fld) metrics (from same Fangraphs page referenced above):
CarGo = -2.6
Torres = 21.2
Similar results can be found with Brett Gardner of the New York Yankees. While he had an above-average season at the plate (123 wRC+), it's his league-high 22.9 in the fielding metric that helps elevate him to a WAR of 5.4.
While I have no doubt Torres is a very good defender, so is Gonzalez. At least, that's what my eyes and his Gold Glove from 2010 would indicate. I recognize Gold Gloves are a bit of a popularity/hitting contest and don't always go to the most deserving defensive player (Tulo/Rollins 2007), but you don't usually suck on defense and win the award. Yet fielding is the only metric in which Torres exceeded CarGo (besides a 9.8% vs. 6.3% BB%). How does this make any sense?
Ultimately, maybe my biggest beef is with the various defensive metrics that are being used out there, as I think that they are significantly lacking when it comes to determining the true value of a player to his team on defense (particularly with the way Coors Field is "corrected" for, but that's a whole different ball of wax). Ultimately, when you're dealing with WAR, as with any calculated value, garbage in = garbage out; and I personally think this case shows that WAR might have more garbage in the input than many recognize. Thoughts?
Comments
Mistitled
as you mention at the end of your post, it is UZR that you seem to have the biggest beef with. I think it is pretty common knowledge that UZR has trouble quantifying the skill of Colorado outfielders. What you are looking at is fWAR (f = fangraphs). Perhaps you should use bWAR, or even sub in your own defensive values (bloomingrockWAR?). WAR is a well-constructed and very useful stat. Most of the problems arise when the stat is used incorrectly.
I understand what you're saying, but I still contest that any system that rates
Andres Torres’ 2010 season as equal to (or slightly greater than) Carlos Gonzalez’s 2010 has a problem. And my guess is that most different WAR values (bWAR vs. fWAR, etc.) suggest a similar conclusion as fWAR does.
To me, that means something’s wrong with the way WAR is calculated.
Don’t get me wrong, I think WAR is helpful, but not the holy grail of player valuation.
As a matter of interest, what do you propose is the correct usage of WAR (whatever variety you prefer)?
by blooming rock on Jan 18, 2011 5:17 PM MST
You're taking the wrong direction when evaluating WAR
Just because it doesn’t confirm what you already felt about CarGo and Torres doesn’t mean it’s faulty.
Logic doesn’t go:
Preconceived Notions → Evaluation of Analytical Methods → Confirmation or Refutation of Analytical Methods
(2010 CarGo > 2010 Torres) → (WAR: 2010 CarGo = 2010 Torres) → (WAR = False)
It should be:
Analytical Methods → Evaluation of Preconceived Notions → Confirmation or Refutation of Preconceived Notions
I agree that CarGo is a better player, but that doesn’t mean he was more valuable than Torres in 2010. Don’t confuse talent with year-to-year value.
Rocktober is not a time of year, it is a religion.
You say PoTayTo, I say PoTahTo. You are right about your logic, so does logic actually make you believe Torres’ value was higher than Cargo’s?
I don’t believe Torres was more valuable than CarGo in 2010.
As far as the most advanced quantitative analysis can determine, the difference in their 2010 value was negligible or indeterminable.
Rocktober is not a time of year, it is a religion.
But this is exactly what the argument is. You say there is negligible difference, yet you say you do not believe Torres was more valuable? Why? It is because if you watch the game, you know Cargo is of higher value AND the better player. I believe your argument is exactly why the credibility of metrics can be questioned.
I'm not sure that I follow you...
I was clarifying that I don’t think Torres was more valuable, but there is a good chance he was roughly equal to CarGo in value.
I do watch the game, and I see CarGo rake and rake. But I also am hesitant to let the “Eye Test” be the end-all be-all of analysis. The “Eye Test” gave Jeter 5 Gold Gloves, and I’m not about to discount a player, such as Torres, who had an exceptional defensive year, simply because I don’t care to put any trust in defensive metrics.
I’ll definitely accept the possibility that other defensive metrics value Torres and CarGo differently, and thus end up with different WAR values at the end. But simply saying “fWAR sucks because it says 2010 Torres = 2010 CarGo” is a baseless assumption.
Rocktober is not a time of year, it is a religion.
At this point, however, you're kind of putting words in my mouth.
After reading through RIR’s comment below, I definitely see that the biggest issue I have is that Fangraphs’ use of UZR in their fWAR overvalues Torres in 2010 as compared to CarGo in 2010. Maybe my issue is simply how the defensive metrics are weighted in comparison to the hitting metrics by Fangraphs. I’m not trying to be difficult, I’m just trying to understand how the two, in 2010, could possibly be VALUED equally.
The bottom line, however poorly I articulated it, is that all things being equal, seeing Torres’ 2010 season valued equally to CarGo’s 2010 season causes me to question fWAR and how much stock can be put into it.
Beyond that, since there are so many different WAR metrics out there, which is the one that people are discussing when they say things such as: “1 WAR is worth approximately $4-5 million dollars in player salary on the open market”?
by blooming rock on Jan 19, 2011 12:47 AM MST
This might serve to confuse you even more, but when they say that specifically,
once you know the difference between the two (or three if you include Baseball Prospectus’ WARP) most commonly referred to WAR systems, you can tell which is which by their replacement level. So if you say "1 WAR is worth approximately $4-5 million dollars in player salary on the open market," you are likely talking about f-WAR, since a higher replacement level means that 1 r-WAR is worth about $5.5 million on the market, and 1 B-Pro WARP about $3.5 million.
It’s a point of contention right now that they’re starting from different baselines, and in the next few years we’ll probably get closer to a universally accepted replacement level. In the meantime, it’s another thing to add to what you have to know before you can be comfortable using it, so I know why most fans will stay away from WAR entirely.
Once you do understand it, and how each system comes by its figures, and their strengths and flaws in evaluating players, you can make mental adjustments, but that doesn’t mean you should disregard the whole thing. Boston and Colorado are wacky to the low side with their OF UZR numbers, and SF seems to be as well in the opposite direction, so your example is similar to why we wouldn’t compare a San Diego pitcher to a Colorado pitcher by using ERA. Torres was likely overrated on defense and Gonzalez underrated, so it’s better to switch to a different system to compare them.
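For anyone who wants to check which dollar scale a WAR figure is sitting on, here is a rough sketch of the arithmetic. It only uses the Fangraphs numbers quoted in the post (6.0 WAR, $23.8 and $23.9 million) and the "$4-5 million per win" rule of thumb quoted above; the function names are made up for illustration and are not anything Fangraphs publishes.

```python
def dollars_per_war(dollars_millions: float, war: float) -> float:
    """Implied price of one win, in millions of dollars."""
    return dollars_millions / war


def market_value_range(war: float, low_per_win: float = 4.0, high_per_win: float = 5.0):
    """Open-market value range (millions) under the quoted $4-5M per win rule of thumb."""
    return war * low_per_win, war * high_per_win


# Figures quoted in the post (Fangraphs, 2010).
cargo_rate = dollars_per_war(23.8, 6.0)
torres_rate = dollars_per_war(23.9, 6.0)
print(f"CarGo:  ${cargo_rate:.2f}M per win")
print(f"Torres: ${torres_rate:.2f}M per win")

low, high = market_value_range(6.0)
print(f"6.0 WAR at $4-5M per win: ${low:.0f}M to ${high:.0f}M")
```

Both players come out at just under $4 million per win, which sits at the low end of the $4-5 million range.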
I believe UZR is skewed towards the Giants outfielders.
Take Randy Winn as an example: from 1998 to 2004 with TB and SEA he had an average UZR of 1.81, and then with SF his average UZR jumped to 10.32. The season after he left SF his UZR dropped to -6.6.
Andres Torres with the Tigers was never better than a 3.7 UZR score and then suddenly last year as a Giant he posts a 21.2? It appears the Zones are off at PacBell/3Com/HousethatBalcoBuilt.
@charliedrysdale
2 players as the sample
one of which was 36 in the dropoff season in question?
by Andrew Martin on Jan 20, 2011 11:56 AM MST
Several others seem to fit it though
See Aubrey Huff, Pat Burrell, Eugenio Velez, etc.
"I have no special talents. I am only passionately curious." - Albert Einstein
by Andrew T. Fisher on Jan 20, 2011 12:02 PM MST
Check out stat corner WAR
It has Cargo at 6.1 and Torres at 3.4. Xeifrank is right; it’s not WAR you have a problem with, it’s the defensive metric fWAR (Fangraphs WAR) uses. UZR says that Torres was worth 21.2 runs above average while Cargo was worth 2.7 runs below average. In other words, UZR thinks that Torres was worth 24 more runs defensively than Cargo, which WAR converts into 2.4 wins above replacement.
However, fWAR agrees with the other numbers you posted above because it thinks that Cargo was worth over two WAR points more than Torres at the plate.
WAR isn’t perfect, but it’s a darn good metric.
73 more days until the Rockies Home Opener!!!!!!!
by RhodeIslandRoxfan on Jan 18, 2011 6:15 PM MST
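The runs-to-wins step in the comment above can be sketched in a few lines. The UZR values are the ones quoted in the comment; the 10 runs per win conversion is an assumption, implied by "24 more runs" becoming "2.4 wins" but not stated outright, and the function name is hypothetical.

```python
# Assumed rule of thumb: roughly 10 runs equal one win.
RUNS_PER_WIN = 10.0


def fielding_gap_in_wins(fld_a: float, fld_b: float,
                         runs_per_win: float = RUNS_PER_WIN) -> float:
    """Convert the difference between two fielding run values into wins."""
    return (fld_a - fld_b) / runs_per_win


torres_uzr = 21.2   # runs above average, as quoted in the comment
cargo_uzr = -2.7    # runs below average, as quoted in the comment

gap = fielding_gap_in_wins(torres_uzr, cargo_uzr)
print(f"Fielding gap: {gap:.1f} wins")
```

That gap of about 2.4 wins on defense is what offsets the roughly two-win edge fWAR gives Cargo at the plate.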
Thanks. That's a good point. Still getting up to speed with all the various options for WAR out there.
Which brings up my next question: Which is more valued/accepted in the SABR community?
As you say, one puts Torres at 6.0 WAR, another has him at 3.4 WAR . . . that’s a ridiculously big span for one season of one player.
If Torres was a free agent this year, what would his fair market value be? Which WAR would people use to analyze a deal?
This is all fascinating to me. As Xeifrank somewhat mockingly said above, maybe I should create bloomingrockWAR with what I see is more fair/legitimate, since it seems like everybody else has.
by blooming rock on Jan 19, 2011 12:53 AM MST
I believe most SABR guys like the fWAR version because it's easily accessible
and it uses wRC+ for hitters and FIP for pitchers which allows you to better compare players in vastly different settings. wRC+ does this because it’s park adjusted and FIP does this because it eliminates fielders from the equation. (wRC+ is basically wOBA+; you can read a little more about it here)
I used to look at both fWAR (Fangraphs) and bWAR (Baseball Reference) and kind of average the two because I believe the truth usually lies somewhere in the middle with these types of things. Then after getting tired of doing that for our players I actually crunched some numbers a few weeks back and came up with what I guess would be RIRFWAR :-) (I’d hesitate to call it that though because Fangraphs still did about 95% of the work)
Anyway, I used everything Fangraphs used to calculate my WAR except for the defensive metric, because like you, I’m not buying what UZR is selling. I instead took UZR (the defensive metric Fangraphs uses for its WAR), Total Zone (the defensive metric Baseball Reference uses in its WAR), Defensive Runs Saved (another defensive metric available on both Fangraphs and Baseball Reference), and the number from the Fans Scouting Report (listed as FSR on Fangraphs; I use this as the “eye test” part of the metric) and averaged all of them together to get the defensive number for my WAR score. Once I have it, I plug that number in where the UZR number is and get a new WAR score.
When I did this for Cargo and Torres, Cargo’s defensive number went from a -2.7 to a positive 0.8 (not a huge leap but certainly a step in the right direction). Torres’s number was still incredible but it seemed much more realistic; it went from 21.2 down to 14.5. The end result? RIRFWAR says that Cargo was worth 6.3 wins above replacement and Torres was worth 5.3 wins above replacement.
73 more days until the Rockies Home Opener!!!!!!!
by RhodeIslandRoxfan on Jan 19, 2011 7:40 AM MST
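A minimal sketch of the swap described in the comment above, for anyone who wants to try it on other players. It assumes roughly 10 runs per win (not stated in the comment) and uses hypothetical function names; the individual Total Zone, DRS, and FSR values were not posted, so only the quoted averages are plugged in.

```python
from statistics import mean

RUNS_PER_WIN = 10.0  # assumed rule of thumb, not stated in the comment


def blended_defense(uzr: float, total_zone: float, drs: float, fsr: float) -> float:
    """Average the four defensive run estimates named in the comment."""
    return mean([uzr, total_zone, drs, fsr])


def swap_defense(fwar: float, uzr: float, new_defense: float,
                 runs_per_win: float = RUNS_PER_WIN) -> float:
    """Replace the UZR component of fWAR with another defensive run value."""
    return fwar + (new_defense - uzr) / runs_per_win


# Only UZR and the blended averages are given in the thread, so the
# averages (0.8 and 14.5) are used directly instead of blended_defense().
cargo = swap_defense(fwar=6.0, uzr=-2.7, new_defense=0.8)
torres = swap_defense(fwar=6.0, uzr=21.2, new_defense=14.5)
print(f"CarGo: {cargo:.2f}, Torres: {torres:.2f}")
```

Under those assumptions the results land close to the 6.3 and 5.3 quoted above; small differences would come from rounding in the posted inputs.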
See now that I follow.
Not because it makes CarGo’s 2010 look better than Torres’ 2010, but it takes into account all the different defensive metrics out there to normalize that portion of the WAR equation. I like that because I think that the concept of defensive metrics, and therefore all the metrics out there are relatively new and still somewhat under development/review/evolution, which is likely why they vary so much.
Anyway, thanks for holding my hand through this learning process!
by blooming rock on Jan 19, 2011 11:52 AM MST
Just remember
you can’t pick and choose which one you “like” the most. Not that you are saying this, but you can’t say “Tulo is a 7WAR player per fWAR, and Cargo is a 6WAR player per rWAR, so that’s like 13 WAR!”
by Andrew Martin on Jan 19, 2011 1:16 PM MST
Yes and no
I can say that rWAR is probably more accurate for Jason Hammel, as he has started to establish himself as a high BABIP pitcher, but that I’d take fWAR for Esmil Rogers or Ubaldo Jimenez. Neither metric is perfect and they miss on certain types of players. I don’t see why you can’t fashion your choice on each metric’s strengths based on pre-set rules. Like pitchers with 500+ innings and .333+ BABIP = overrated by fWAR.
"I have no special talents. I am only passionately curious." - Albert Einstein
by Andrew T. Fisher on Jan 19, 2011 8:20 PM MST
That's disingenuous though.
the entire point of having stats is to confirm or deny our observations. We can’t just pick and choose which stat we want to use, or else we’re falling victim to our own biases.
If we’re looking at both simultaneously, I guess you can present both as a point of conjecture, but past that, it seems like picking the one you “like” the most.
by Andrew Martin on Jan 19, 2011 9:31 PM MST
But it's not really that way
UZR seems to have real problems with Colorado outfielders. If I eschew fWAR for all Colorado outfielders due to that, I’m consistent. I’m not sure that intelligently moving past the flaws of one stat and floating to another is any more disingenuous than completely and knowingly accepting flaws of certain stats.
It isn’t me saying “I like Ubaldo Jimenez, so I’m going to use rWAR to pump up his 2010 season, but use fWAR to pump up Hammel just cuz.”
"I have no special talents. I am only passionately curious." - Albert Einstein
by Andrew T. Fisher on Jan 19, 2011 9:39 PM MST
The issue with WAR though is that it's not really one stat
It’s a bunch of smaller stats combined together so we can compare players at different positions as well as players who play in different parks. Someone may like the way fWAR accounts for batting while at the same time liking bWAR better for fielding. It only takes one aspect of WAR to be off for it to become significantly less useful and it seems to me that UZR often creates this problem.
This is why I prefer to average the defensive metrics while using the rest of fWAR’s components.
72 more days until the Rockies Home Opener!!!!!!!
by RhodeIslandRoxfan on Jan 20, 2011 7:16 AM MST
the averaging defensive metrics is fine
but people who “like” TZ or UZR or whatever are typically doing it because it “makes sense” which tends to translate to “how well does this line up with my preconceived notions”
by Andrew Martin on Jan 20, 2011 11:57 AM MST
For the most part I agree with this but I will say one thing
TZ seems to be more consistent from year to year than UZR. That’s why if I had to choose I’d pick it over UZR. However, I don’t completely trust any of them alone.
71 more days until the Rockies Home Opener!!!!!!!
by RhodeIslandRoxfan on Jan 20, 2011 1:36 PM MST
Oh and I'm not sure how it would work if Torres was a free agent
I’m sure Torres and his agent might cite that 6.0 number Fangraphs has for him but this is probably a better question for somebody else.
73 more days until the Rockies Home Opener!!!!!!!
by RhodeIslandRoxfan on Jan 19, 2011 7:42 AM MST
WAR is a meaningless, useless stat to me.
It tells me absolutely nothing I want to know. Were I a replacement trying to break into a major league lineup, I’d file a grievance against anyone using WAR to judge me. I want the people judging me in the context of each play on the field. I don’t think WAR accurately reflects that.
It might be useful for fantasy geeks; I don’t think it reflects the real world at all.
"Why are they outlawin' the spit pitch? The curveball is a cheap 'n easy pitch; the spitter aint" Ty Cobb
"When I was pitching 90's in the seventies; I never thought I'd be pitching 70's in the nineties!" Frank Tanana
As I mentioned above
*WAR =/= talent
*WAR is a fantastic comparison metric (albeit neither perfect nor comprehensive) between players. It is the only way to take hitting, pitching, and fielding, and represent them on the same scale.
*WAR doesn’t really apply to “replacement players.” A replacement player is simply a hypothetical “AAAA” player who can be a baseline for all other comparisons. You can’t judge AAA players with WAR, because they don’t accrue it.
Rocktober is not a time of year, it is a religion.
I don't find it useful at all except to justify overspending on fantasy bids.
I don’t find it useful in evaluating talent. That is what I’m concerned with.
"Why are they outlawin' the spit pitch? The curveball is a cheap 'n easy pitch; the spitter aint" Ty Cobb
"When I was pitching 90's in the seventies; I never thought I'd be pitching 70's in the nineties!" Frank Tanana
WAR is useless for fantasy
It has zero correlation with fantasy baseball.
If you aren’t concerned with valuation of player contributions to winning (See: Hitting, Pitching, Fielding), then there isn’t really much left to be concerned with.
I will rephrase my “WAR =/= talent” comment:
Year-to-year WAR =/= talent.
Rocktober is not a time of year, it is a religion.
My argument is this...
Every manager of every team manages their players in the context of the situation at hand. I may not be an effective batter against most flyball pitchers, for example, but own this particular one on the field so I play. I might not have a lot of range in the outfield and would not normally play in a big left field; except my pitcher is not expected to give up any drives to my area vs the other lineup. Or I may have tremendous range and my pitcher is expected to have a lot of hits to my area. I may not hit a lick; my defense outweighs any offensive contribution I might make against the other guy. I don’t think WAR as a single number can accurately value how I’m being used. The context is not there.
You say that it’s adjusted for parks and all of that and I see managers use people in ways that contradict it. Managers have to do a certain amount of trial and error to know what they have. They have to see where I’m confident and where I’m flustered. They have to see how I handle situations to know where I’m useful. When and where I can be expected to rise to the occasion and if I am a better fit.
As a player I have to deal with my manager’s prejudices, moods and intuition more than a stat book. Even the managers who use stats extensively know that stats are compiled on a player by their own subjective use of that player and are thus suspect. WAR might work for comparing Albert Pujols in history vs Albert Belle. In my mind you are still comparing apples to oranges.
"Why are they outlawin' the spit pitch? The curveball is a cheap 'n easy pitch; the spitter aint" Ty Cobb
"When I was pitching 90's in the seventies; I never thought I'd be pitching 70's in the nineties!" Frank Tanana
WAR is not a metric for managers, I can't stress this enough.
It’s not meant to say “This guy is 4 WAR, this guy is 5 WAR. That means the 5 WAR guy should bat 3rd and the 4WAR guy 6th”. WAR is meant to be a front-office evaluating tool to be used for payroll efficiency and the like.
I don’t think WAR as a single number … The context is not there
Which is why you typically look at WAR as well as the components feeding WAR when you do a full evaluation of a player. You use batting metrics, fielding metrics, scouting, player attitude, all of those things when determining how to individually manage a player.
by Andrew Martin on Jan 20, 2011 12:03 PM MST
I don't even think it works for payroll efficiency...
I might not be anywhere near as valuable for team “x” as team “y” and team “z” may want me so badly they might overpay me. To me WAR makes no difference except as justification for “a” or “b” decision in arbitration and even then the arbitrator will ignore WAR. Again I say if I were a player I’d file a grievance against a team using WAR to judge me. I have less problem with other metrics except maybe BABIP and the fielding metrics.
"Why are they outlawin' the spit pitch? The curveball is a cheap 'n easy pitch; the spitter aint" Ty Cobb
"When I was pitching 90's in the seventies; I never thought I'd be pitching 70's in the nineties!" Frank Tanana
Like it or not, baseball is a business
If I were a player, I’d likely be forced to deal with however the team chose to value me.
by Andrew Martin on Jan 21, 2011 5:16 PM MST
Oddly though,
some parts of the business, arbitrators for example, value stats like RBI’s, pitching wins, and saves above all else. It’s kind of funny when you think about it.
70 more days until the Rockies Home Opener!!!!!!!
by RhodeIslandRoxfan on Jan 21, 2011 9:04 PM MST
Arbitrators not being baseball people is one of the biggest criticisms against the arbitration process
by Andrew Martin on Jan 22, 2011 11:19 AM MST
Oh I agree 100%
but they are still a big part of the business as the system is currently constructed.
69 more days until the Rockies Home Opener!!!!!!!
by RhodeIslandRoxfan on Jan 22, 2011 2:20 PM MST
Your criticisms may be useful for alienating a major portion of the fanbase
but that doesn’t make WAR a useless stat. It’s just another method of evaluating player production.
by Andrew Martin on Jan 19, 2011 10:49 AM MST
Obviously I've got a lot to learn about the various WARs out there, thanks for helping bring me up to speed.
One of the reasons I love baseball is because I love stats and the mind games that go on within the course of a single game. I’ve even gotten into some of the advanced metrics, but am far from having a great understanding of all of them . . . particularly for something like WAR when there seem to be several different variations of it out there.
When multiple different WAR values are formulated that can vary so wildly, it seems as though the pure numbers/statistics side of things gets pushed aside a bit and you’re back to relying on subjective data . . . depending on what the creator of that specific WAR formula thinks is more important.
Again, I’m certainly not anti WAR, I just think it shouldn’t be taken as the be all end all for player evaluation . . . just another tool in the tool box.
Your last sentence is absolutely correct, and the creators of the various WAR systems would tell you the same.
We use it a lot more during the off-season because it’s one of the most helpful tools in predicting salaries and projecting competitiveness for the coming season, as well as looking at HOF cases and award candidates, so it probably leads to WAR fatigue for those uncomfortable with it.
That brings up one question I have with the HOF discussions and such:
Are defensive metrics included in the calculated WAR values for players from earlier generations? Like the late 1800’s-early 1900’s when radio wasn’t around. The 1930’s-1940’s when radio was king. Or even the 1950’s-1980’s when there was TV but no internet.
How is the defensive side of things taken into account over all of baseball history to calculate WAR, when there wasn’t such extensive tracking of batted-ball locations, etc.?
I believe there was an article referencing the lifetime WAR of various players, HOF and otherwise, for players from the 1800’s . . . and that question popped into my head.
by blooming rock on Jan 19, 2011 12:01 PM MST
Sean Smith's Total Zone, which Baseball Reference uses, does about as adequate a job with this as we can probably expect without a video record
or without the kind of range/zone stats that they use to calculate UZR. It basically compares batted ball data with assists and putouts per ball in play for the defender position to league levels, and includes a park adjustment for outfielders. You can read more about it here. Obviously it becomes less reliable the further back you go, as less batted ball data was recorded prior to 1956. It still works decently as a rough guesstimate.
One of my big complaints about TZ is that I think I’d like a pitching staff adjustment, as if you have a rotation that’s GB and contact oriented, as say the Orioles of the 1970’s were, you’re going to wind up overrating that team’s infielders without it. And as you can read with the linked article, Mark Belanger proves to be the TZ leader for the time covered. I think Belanger was a fine defender, but he’s likely overrated a bit by TotalZone.
Your last paragraph summarizes what I was thinking as I was reading the article you referenced.
The % of balls caught by a particular outfielder relates back to the pitchers on the mound. For example, if a certain pitcher is a flyball vs. groundball pitcher, obviously that’s going to impact things . . . but more specifically, if a pitcher gives up a higher percentage of line drives as part of those fly balls, that’s going to have a huge impact on what the outfielder catches and doesn’t catch.
A line drive that lands in the outfield should not be treated the same as a pop/fly out that lands in the exact same spot.
by blooming rock on Jan 21, 2011 12:58 PM MST
Is WAR zero-sum?
By which I mean if you add up ALL the WAR in the majors in a season, and then add the notional replacement level baseline, do you get 2430 (i.e. the number of wins in MLB each season)?
Beyond the Boxscore looked at Team WAR versus actual wins a while ago.
If WAR were a perfect stat, the two would be identical. As this article shows, they are darn close. (They used Rally WAR’s database, for those who are wondering.)
Score a goal. Unit. Basket. Go squadron! Do good! Defeat the opponents soundly in this...skirmish.
+1
Great article and fantastic discussion underneath.
See Data Differently: Beyond the Box Score | @justinbopp
by Justin Bopp on Jan 19, 2011 11:32 AM MST
I did the same thing last offseason
I added up all the WAR I calculated from positional splits for all 5 teams, added the replacement level baseline and got 420.9 wins. The five teams in the NL West had 420 wins. Pretty awesome.
"I have no special talents. I am only passionately curious." - Albert Einstein
by Andrew T. Fisher on Jan 19, 2011 8:27 PM MST
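On the zero-sum question, the bookkeeping can be sketched like this. The 2430 figure follows from 30 teams playing 162 games each; the replacement-level winning percentage is a placeholder assumption (each WAR system sets its own baseline, and none is stated in this thread), and the function names are hypothetical.

```python
def total_league_wins(teams: int = 30, games_per_team: int = 162) -> int:
    """Each game has two teams and one winner, so wins = team-games / 2."""
    return teams * games_per_team // 2


def implied_total_war(replacement_win_pct: float, teams: int = 30,
                      games_per_team: int = 162) -> float:
    """League-wide WAR left over after an all-replacement baseline is removed."""
    baseline_wins = teams * games_per_team * replacement_win_pct
    return total_league_wins(teams, games_per_team) - baseline_wins


print(total_league_wins())  # 2430

# 0.300 is an illustrative placeholder, not any system's published baseline.
print(implied_total_war(replacement_win_pct=0.300))
```

So if a system is calibrated this way, league-wide WAR is fixed once the baseline is chosen, and a different baseline changes the total. For a single division the check is empirical instead, as in the comment above: sum the WAR, add that group's baseline, and compare against actual wins.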
I was going to
but 1) it was a lot of work
2) the run environment changed a lot from 2009 to 2010. I didn’t know how to calculate the new linear weights of wOBA and park factors, and they weren’t posted until January.
3) I was trying to keep PR from being too WAR heavy.
"I have no special talents. I am only passionately curious." - Albert Einstein
by Andrew T. Fisher on Jan 22, 2011 9:56 AM MST
I'm sick of WAR to be honest.
Dewey and KBR are just.......too........sweeeeeeeeeeeeeet!!!!!!
The Wolfpac is looking for new soldiers! Change your logo to the black and red!!!
Can't we just have world peace?
"I have no special talents. I am only passionately curious." - Albert Einstein
by Andrew T. Fisher on Jan 19, 2011 8:22 PM MST
Perpetual WAR for perpetual peace.
"No Mission Too Difficult, No Sacrifice Too Great—Duty First" - 1st Infantry Division Motto
SB Nation Denver - The regional hub for Denver sports!
Purple Row - Covering all your Rockies needs!
what is the WAR for peace?
LOL
Dewey and KBR are just.......too........sweeeeeeeeeeeeeet!!!!!!
The Wolfpac is looking for new soldiers! Change your logo to the black and red!!!
Even Whirled Peas can cause conflict..
as we saw in last season’s series of Top Chef.
@charliedrysdale
My take on WAR
When I see these arguments about rWAR or fWAR, about “WAR doesn’t give the right answer to Gonzalez vs. Torres 2010” or “rWAR is better for Hammel, fWAR is better for Rogers” or whatever, to me that’s completely missing the point.
WAR isn’t a predictive metric, nor is it (in any implementation) a particularly good measure of a player’s actual retrospective value in a given season. If you want to project a pitcher, don’t build a projection based on his past WAR totals, just figure out your expectation of the pitcher’s ERA (relative to league average), estimate how many innings he’ll pitch, and determine WAR from there. And if you want a good metric for MVP debates and the like, don’t look at WAR, look at something that measures a player’s actual contribution to his team’s wins and losses, on a game by game basis (unfortunately, no such metric exists anywhere… I’ve played around with it myself, but it takes forever to compile the data, and honestly, it’s not interesting enough for me to bother with).
What WAR is is simply the necessary theoretical backbone for understanding player value. It allows us to synthesize expectations of future performance (i.e. projections) into something useful, putting everyone on the same scale. If I say that Free Agent X is an average defensive third baseman who’s expected to hit .270/.340/.440 this year, how valuable is that in meaningful units (wins and/or dollars)? That question is completely unanswerable without 1) the concept of replacement level and 2) a way of combining all the components of a player’s performance into one number. And that’s what the WAR framework provides. You simply can’t do proper analysis without it, whether you’re trying to evaluate contracts, project the standings for the upcoming season, or whatever.
So, yes, if I were a consumer of this stuff rather than a producer, it would drive me insane to see the competing versions of WAR and to try to make sense of them without being able to dig in and crunch the numbers on my own. But the concept of WAR, independent of the particular implementation, is absolutely indispensable. You can dislike rWAR or fWAR or even (God forbid) HeltonfanWAR, but you can’t dislike WAR itself.
by Heltonfan on Jan 19, 2011 9:08 PM MST
look at something that measures a player’s actual contribution to his team’s wins and losses, on a game by game basis
You mean like WPA?
by Andrew Martin on Jan 19, 2011 9:34 PM MST
I love the concept of WAR
Again, I’m not saying WAR is useless; I think it’s a great idea that is still under development.
However, if there are areas for improvement in “the necessary theoretical backbone for understanding player value” . . . then improvements should be made.
by blooming rock on Jan 21, 2011 1:06 PM MST up reply actions
No
Look at the name. “Win Probability Added.” I’m not talking about probability, I’m talking about actual wins. Something that says, “The Rockies beat the Dodgers 5-2 at Coors today. Let’s divide up the credit for the win among the 25 Rockies and the credit for the loss among the 25 Dodgers, based on the starting assumptions that the Rockies’ pitching should get more credit for the win than their offense, because allowing 2 runs at Coors is a lot more impressive than scoring 5. Likewise, from the Dodgers side, the lion’s share of the blame should go to the offense. But no Rockie can get a negative number for the day, and no Dodger can get a positive.”
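For what it's worth, here is one way the bookkeeping for that idea could be sketched. Nothing like this exists as a real stat: the "twice the park average" yardstick, the per-player weights, and every function name below are assumptions made up for illustration. The only rules taken from the description above are that the winning team's credit sums to one win with nobody below zero, the losing team's blame sums to one loss with nobody above zero, and the split between offense and pitching depends on the run environment.

```python
def unit_shares(runs_scored, runs_allowed, park_avg):
    """Return (offense_share, pitching_share) of one game's credit or blame."""
    baseline = 2.0 * park_avg  # hypothetical yardstick, not from any real stat
    if runs_scored > runs_allowed:
        # Win: offense is credited for runs scored, pitching for runs prevented.
        offense = float(runs_scored)
        pitching = max(baseline - runs_allowed, 0.0)
    else:
        # Loss: offense is blamed for runs not scored, pitching for runs allowed.
        offense = max(baseline - runs_scored, 0.0)
        pitching = float(runs_allowed)
    total = offense + pitching
    return offense / total, pitching / total


def spread(amount, weights):
    """Split `amount` across players in proportion to nonnegative weights.

    `weights` must be a non-empty dict; equal split if every weight is zero.
    """
    total = sum(weights.values())
    if total == 0:
        return {name: amount / len(weights) for name in weights}
    return {name: amount * w / total for name, w in weights.items()}


def game_credit(runs_scored, runs_allowed, park_avg, hitters, pitchers):
    """Credit (+1 total for a win) or blame (-1 total for a loss) per player."""
    sign = 1.0 if runs_scored > runs_allowed else -1.0
    off_share, pit_share = unit_shares(runs_scored, runs_allowed, park_avg)
    credit = spread(sign * off_share, hitters)
    for name, value in spread(sign * pit_share, pitchers).items():
        # A pitcher who also bats collects from both pools.
        credit[name] = credit.get(name, 0.0) + value
    return credit
```

Plugging in the 5-2 game with an assumed Coors average of 5.5 runs per team, the Rockies' pitching gets 9/14 of the win and the offense 5/14, while the Dodgers' offense takes 9/14 of the loss. How the weights for individual hitters and pitchers should be chosen is the hard part, and this sketch leaves that entirely to the caller.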
Bill James was trying to do exactly this with his Win Shares, but the system was a bit of a train wreck.
Right about then, we all went to WAR.
Okay, puns aside, I do take some issue with you saying that WAR’s not a predictive metric and then, in the next paragraph, saying that it’s indispensable for evaluating contracts and projecting standings, both of which are predictive by their very nature. I can’t really see a good argument against using it for awards either, but that’s because I don’t feel the most valuable player in a league has to come from a winning team in any given season, and WAR or some close cousin of it is necessary for comparing players at different positions.
Win Shares was trying to do it on the season level, not the game level.
You still have the problem there of players receiving positive credit for good performance in a loss and negative credit for bad performance in a win. I’m talking about a pure retrospective value metric the likes of which hasn’t been seriously attempted yet.
Sure, you’re free to define MVP as most outstanding player if you want, and if that’s the direction you want to go, WAR is great. Personally, I just don’t find “who contributed the most context-neutral value to his team in 2010?” to be a particularly interesting question. I figure that if we’re trying to measure past value, we should measure it entirely in context (as described above); if we’re trying to measure ability (“best player”), we shouldn’t restrict ourselves to one year of data; and if we’re trying to measure some hybrid of ability and value (which is exactly what historical WAR is), we should figure out why exactly that particular hybrid is of interest to us. In the case of HOF voting, for example, the ability/value hybrid represented by WAR is an excellent proxy for the concept of “greatness” that most of us use to determine a player’s HOF worthiness. That concept doesn’t match my understanding of the phrase “Most Valuable Player”, but to each their own.
When I say WAR isn’t a predictive metric, what I mean is simply that you shouldn’t use past WAR to project future WAR; you have to analyze the components of WAR separately and then add them back together in order to get a proper projection. If you just do a Marcel-like projection using the last three years of WAR, you’ll get some pretty goofy results.
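A rough sketch of the difference, for anyone who wants to see it in numbers. The 5/4/3 weighting of the last three seasons is the one usually associated with Marcel; the regression amounts, the idea of regressing fielding harder than hitting, and the sample player are all assumptions invented for the example, not anyone's actual projection system.

```python
def marcel_like(seasons, weights=(5, 4, 3), regress=0.0, mean=0.0):
    """Weighted average of recent seasons (most recent first), pulled toward `mean`.

    `regress` is the fraction of the way to move toward `mean`
    (an assumed knob for this sketch).
    """
    used = list(zip(seasons, weights))
    weighted = sum(v * w for v, w in used) / sum(w for _, w in used)
    return (1.0 - regress) * weighted + regress * mean


# Hypothetical player, runs above average by season, most recent first.
batting = [20, 15, 10]
fielding = [20, 0, -5]
total = [b + f for b, f in zip(batting, fielding)]

# Shortcut: project the combined number directly.
shortcut = marcel_like(total, regress=0.3)

# Component approach: regress the noisier component harder (assumed amounts).
by_component = marcel_like(batting, regress=0.3) + marcel_like(fielding, regress=0.6)
```

Here the shortcut projects about 16 runs above average and the component approach about 14, because a one-year fielding spike gets discounted more heavily when it is handled on its own. If every component used the same weights and the same regression, the two approaches would agree exactly; they only diverge once the components are treated differently.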
Sorry, I should have read this post before making my last comment
But honestly, if that’s how you perceive value, then why have an MVP award at all (at least for the regular season), given that it’s obscenely biased towards big-market players? In fact, wouldn’t you have to be on the WS-winning team to get it? Which Giant would you choose, as a matter of interest?
Have a “best” player award instead, it’ll at least have some kind of meaning.
MVP is a regular-season award, so it wouldn't have to be a Giant, just someone on a playoff team
So that would mean Votto in the NL, Hamilton in the AL. But I guess Halladay would beat Votto if we make pitchers eligible.
Anyway, the fact that it’s not fair is precisely the point of this metric. It’s not supposed to be fair. It’s supposed to take the “value” concept to its logical conclusion, instead of going halfway and not having any particular justification for stopping at that particular point in the middle of the ability/value continuum.
I don't think it's "supposed" to do that at all. The ability/value continuum only exists in the first place because of the flawed nature of this award.
How fair would it seem that the top “best in baseball” award wouldn’t even be accessible to 22 teams?
How's that fair, though?
If a player hits three dingers in a loss, they have no value simply because it was a loss? I don’t know if even the most hard-hearted MVP decision-maker would use such a brutal interpretation of value.
A player on a losing team
Can easily have a positive WPA. It happens every game. Hitting 3 home runs will almost always give you a positive WPA (unless you also struck out 3 times with the tying/winning runs on base).
That does bring up an interesting question: has there ever been a game where every single player on the losing team had a negative WPA? I’d guess it would have to be during a no-hitter/perfect game, and also a blowout where every pitcher on the losing team gave up runs.
Rocktober is not a time of year, it is a religion.
I'm sure there is
I started looking, but I think I’m comfortable saying it could happen. I did find a game where the best WPA was Taylor Buchholz at 0.09…as a hitter
"I have no special talents. I am only passionately curious." - Albert Einstein
by Andrew T. Fisher on Jan 20, 2011 11:01 PM MST up reply actions
Biondino is responding to Heltonfan's rule
for his proposed new stat: no winning player can get a negative value, and no losing player can get a positive value. I agree with Biondino that the WPA approach is probably more useful, both for the three-homer scenario and the three-error scenario. If Pujols were on the worst team ever constructed, what value would a stat have that showed he contributed 2 wins out of 20 when we all know he is a 10-15 win player? If you used the Heltonfan rule, you would almost be forced to immediately calculate player wins divided by team wins for comparability across players.
Bleed purple
Again, the whole point here is that it's supposed to be unfair.
I would never in a million years suggest that something like this should be used to say that one player is better than another, only that one player made a larger contribution to actual (not theoretical, not probabilistic, but rather actual and incontrovertible) team success than another. Those are completely different things, and we have different stats to answer different questions. I’m giving you a fork, and you’re saying it’s a lousy tool because you can’t eat soup with it. But soup isn’t the only food out there.
What’s interesting about it, I think, is that just doing the thought experiment forces us to realize how wide the gulf is between context-neutral performance and actual past value. If as a result of that thought experiment you end up saying that actual past value doesn’t interest you, well, there’s nothing wrong with that.
I understand, I think
But it’s probably anti-useful to use a pure team stat to judge individual players in any context, not just MVP voting.
