FIP: Why you don't deserve that ERA

Today's Counting Rocks, Boys and Girls, is going to be about FIP. I know I mentioned something about batting metrics and who makes the team, but we'll save that for another day.

FIP, which is short for Fielding-Independent Pitching, is one of many defense-independent pitching metrics used to evaluate a pitcher's performance. Typically they're all just lumped together as DIPS, meaning "Defense-Independent Pitching Statistics," and there are several of them out there that are commonly used.

The purpose of defense-independent pitching metrics is basically to get an idea of how well a pitcher actually pitched, regardless of the team behind him.

Similar to Poseiden's Fist's Counting Rocks article that he so graciously covered for me when I was in Toronto at the WBC, let's do a quick rundown of the commonly used pitching statistics and break them down a bit.

For starters, the Joe Morgan staple, Wins.

Wins are problematic at best. They're based on team performances, and tell you nothing about a pitcher's skill or actual individual efforts. Wins tend to favor pitchers with a lot of longevity, rather than real talent. For example, Livan Hernandez won 13 games last year, while Jorge De La Rosa won only 10. Part of this is obviously due to the fact that DLR pitched out of the pen for part of the season, but I would struggle to find anyone who actually says that Livan Hernandez is a better pitcher than anyone.

We could argue the validity of Wins (and I imagine we will), but then that's going to come down to the subjective nature of how someone values a pitcher, and doesn't really relate to his skill set. Yes, there are games where a pitcher seemingly will take a team on his back and pitch 7 innings and allow only 1 run while his team is able to only eke out 3 runs of offense, but how much of that is coincidence and how much of that is pure winnitudeism?

On the flip side, how many times do we see Barry Zito get shelled for 5 runs over 5 innings, and it's clear he isn't going to be coming out to see the 6th, but then Bengie Molina clubs a 3-run homer to give the Giants a 7-5 lead, which their excellent bullpen holds for the next 4 innings, giving Zito the win? Is Barry Zito still the proven winner?

Let's move past wins. 

ERA is the other big one, and for the most part, I'm not going to argue with it. If you tell me a guy that has a 2.76 ERA is good, I'm gonna believe it. In a general sense, ERA is a good descriptor of the performance of a given pitcher.

But it has its shortcomings as well.

Join me after the jump and we'll discuss why.

 

One of the main vehicles of runs is hits, which I'm sure we can all get behind. People might think that AVG against, or OPS against, or even wOBA against are going to be similar measures of how good a pitcher really is. But how much of AVG/OPS/wOBA against is really a pitcher's fault? Is a hit always a hit?

There is an aspect of every pitcher's game where hits are just going to happen, and they'll fluctuate from pitcher to pitcher. I mean, if you have a guy up who gives up a hit every other PA, he's probably just not a major league pitcher, and is leaving everything up and right over the plate. But in terms of Major League Pitchers, the guys who really do belong there, there's an element of luck and chance and bad defense that will affect how many hits drop and how many find gloves. Is a dying quail really a pitcher's fault? He more than likely got the jamjob/popup, but if it falls between 3 fielders, well, how do you blame that on the pitcher? If a pitcher is having a day where every ball hit is finding gaps and missing gloves, the feared Seeing Eye Grounder, how is that the pitcher's fault? He got the ground ball, the defense should be gobbling it up and turning the 6-4-3.

Now, I'm not suggesting that we're talking about defensive ERRORS. That would be arguing the difference between RA and ERA (and with the level that today's fielding is at compared to 100 years ago, there really isn't much of a difference). I'm saying grounders that go deep into the hole at SS and result in an aired-out jump throw that the 1B stretches for and grabs, but the runner beats the throw by a good 3 steps. I'm talking about balls put into play that any scorekeeper would scribble down as a "1B 5/6" as the ball trickles between the diving 3B and SS to the waiting glove of the LF. Hits that certainly count as hits, but you almost want to score it as something different because, come on, that's awful. The kind of hits that just send a BABIP soaring.
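For anyone who hasn't run into it, BABIP is batting average on balls in play. Here's a minimal Python sketch, assuming the commonly published formula (H - HR) / (AB - K - HR + SF). The function name and the stat line are made up for illustration:

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line against a pitcher, not a real player's numbers.
print(round(babip(hits=200, home_runs=20, at_bats=800, strikeouts=150, sac_flies=5), 3))
```

Strikeouts and home runs come out of both the top and the bottom, which is the whole point: what's left is only the stuff the fielders had a chance at.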

This is why guys like Voros McCracken came up with the concept of Defense Independent Pitching. Because really, all a pitcher can control is his strikeouts, walks, and home runs. There is no defensive element in any of those 3, the 3 true outcomes as it were, and it stands to reason that the 3 of those are the true measures of a pitcher's skill.

And thusly, FIP, DIPS, dERA, etc were born. All 3 pitching metrics are based on the same idea that a weighted total of HRs, Ks, and BBs per inning pitched will result in a Fielding Independent ERA.

The concept is pretty sound, that those 3 elements really do make up the basics of a pitcher. If you can get on board with the idea of hits being a variable aspect of the game, it all kind of falls into place. A fully developed pitcher will have roughly the same K9, BB9, and HR9 from year to year, so if his ERA jumps while those aspects of his game remain the same, you can most likely infer that he was getting a bit unlucky, balls were finding gaps, or maybe he just had garbage defense behind him. See: Marlins, Florida.
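K9, BB9, and HR9 are just strikeouts, walks, and home runs per nine innings. A quick Python sketch, with a hypothetical helper name and a made-up stat line; it assumes innings are already in decimal form (so 200 and a third is 200.333, not the box-score 200.1):

```python
def per_nine(count, innings_pitched):
    """Scale a counting stat to a per-nine-innings rate."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * count / innings_pitched


# Made-up season line for illustration only.
ip, k, bb, hr = 200.0, 180, 60, 20
print(f"K9={per_nine(k, ip):.2f} BB9={per_nine(bb, ip):.2f} HR9={per_nine(hr, ip):.2f}")
```

Run the same three rates for back-to-back seasons and you can eyeball whether the pitcher himself changed or just his results did.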

The one other metric worth mentioning is xFIP, which normalizes the home run piece by swapping a pitcher's actual HRs for what a league-average home run rate on his fly balls would give him. That takes a lot of the park effect out, because, as we know, HRs in Coors and Chase and Citizens Bank Park just aren't fair. So by normalizing those numbers, we pretty much get to see what each pitcher was really worth over the course of the season, you dig?

So here's the thing. When we use metrics like FIP or dERA, we can treat them just like ERA. But the problem is that they typically need a calculator or spreadsheet to really do the stuff we're looking for. The FIP equation is (13*HR+3*BB-2*K)/IP+3.20. Not exactly mental math. But with ERA, we can say "Ok, Marquis gave up 4 runs, 3 earned, over 6 innings. 3*9=27, and divided by 6 is.....uh....4 something....wait no....5.....hangon.... HEY WHAT'S 6*4 - OH WAIT NEVER MIND I GOT IT ok Marquis has an ERA of 4.50 for this game. Well there's no surprise." If you can calculate FIP in your head you're a bigger dork than I.
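If you'd rather let a computer be the dork, here's a minimal Python sketch of both calculations. It uses the FIP equation as written above, 3.20 constant and all. The function names are mine, and the season line fed to FIP is made up, not Marquis's actual numbers:

```python
def era(earned_runs, innings_pitched):
    """Earned run average: earned runs per nine innings."""
    return 9 * earned_runs / innings_pitched


def fip(home_runs, walks, strikeouts, innings_pitched, constant=3.20):
    """Basic FIP: (13*HR + 3*BB - 2*K) / IP + constant."""
    return (13 * home_runs + 3 * walks - 2 * strikeouts) / innings_pitched + constant


# The single-game example from the text: 3 earned runs over 6 innings.
print(f"{era(3, 6):.2f}")

# Hypothetical season line: 20 HR, 60 BB, 180 K in 200 IP.
print(f"{fip(20, 60, 180, 200):.2f}")
```

Note that the unearned run never enters into it, and that hits don't appear anywhere in the FIP function. That omission is the entire idea.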

What FIP and dERA and DIPS and the like are good for is taking a pitcher's ERA during the season, holding it up against his component numbers, and seeing how much of it was a function of the defense behind him and/or how lucky he got.

Great example: NL Cy Young Award in 2008. Tim Lincecum, right? Total stud, strikeout machine, etc etc. But who had the lowest ERA in the NL? That's right, Mr. Best Freaking Pitcher in Baseball Himself, Johan Santana.

Lincecum won the Cy Young with a 2.62 ERA, while Santana put up a 2.53 ERA in about 10 more innings than Lincecum. Yet Lincecum had 60 more strikeouts than Santana. Something isn't quite adding up.

Lincecum had more strikeouts. He also walked more than Santana. But hang on, Johan gave up TWICE as many home runs as Lincecum (23 - 11)! He also gave up 20 more hits than Lincecum! Who had the better season?

This is where we turn to our Fielding Independent Pitching numbers. I'm going to use the basic version of FIP just for the sake of simplicity (and because it's already available on Fangraphs with no calculation needed).

In 2008, Tim Lincecum had a 2.62 ERA. Fantastic. He had a FIP of 2.62. Also fantastic.

Johan Santana had a 2.53 ERA. Magnificent. He had a FIP of 3.51. Hold the phone.

Santana's component numbers suggest that he was just about a full run worse in terms of his individual pitching performance. The pitcher-friendly dimensions of Shea, a studly defense behind him, and probably a bit of luck covered for his career-low K9 and career-high BB9, to the point where his underlying numbers, while still very good, looked hardly elite. Lincecum, on the other hand, was every bit the Cy Young winner that his ERA suggested. He earned every single point of that low, low ERA with his excellent strikeout rate, controlled walk rate, and downright stingy home run rate. Lincecum is the real deal.

What I'm driving at with this example is that FIP can tell you if an ERA should be trusted or not. If ERA is significantly lower than FIP, it most likely means the pitcher caught any number of breaks.
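That check is simple enough to sketch in Python. The ERA and FIP figures are the 2008 numbers quoted above; the function names and the 0.50-run cutoff for "significantly" are my own assumptions, not any official standard:

```python
def era_fip_gap(era, fip):
    """ERA minus FIP. Negative means the ERA looks better than the components."""
    return round(era - fip, 2)


def verdict(era, fip, cutoff=0.50):
    """Label a season; the cutoff is an arbitrary illustration threshold."""
    gap = era_fip_gap(era, fip)
    if gap <= -cutoff:
        return "caught some breaks"
    if gap >= cutoff:
        return "got a raw deal"
    return "ERA looks legit"


# 2008 ERA and FIP as quoted in the text.
for name, e, f in [("Lincecum", 2.62, 2.62), ("Santana", 2.53, 3.51)]:
    print(name, era_fip_gap(e, f), verdict(e, f))
```

Slide the cutoff wherever you like. The sign of the gap is what matters: negative means the defense, the park, or the baseball gods were chipping in.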

The worst offenders on this side of the FIP/ERA discrepancy were Armando Galarraga, Daisuke Matsuzaka, Johan Santana, Joe Saunders, and Gavin Floyd. These 5 pitchers all had ERAs at least .90 runs below what their FIP suggests they really earned. The thing is, there's nothing wrong with taking advantage of a good defense, but you shouldn't let yourself be fooled by their numbers if they're this far off of their skills.

Taking it in the other direction, high ERAs can also be unfair to a pitcher who may not have deserved them. Pitchers who get a lot of contact tend to have these problems.

The top 5 most maligned in this regard were Nate Robertson, Livan Hernandez (say what you will, his FIP was just shy of 5, so he still stinks), Kevin Millwood, Javier Vazquez, and Ian Snell.

To bring it back to the Rockies, the pitcher who caught the most breaks was Taylor Buchholz. While his numbers were excellent, his FIP sat a whole 1.16 runs above his real ERA, at 3.33 as compared to 2.17. Still very good numbers, but don't think that astronomically low ERA was really sustainable. On the other end of the spectrum, Micah Bowie, Mark Redman, Greg Reynolds, and Matt Herges all took the hard end of the bad luck chain, as Herges's ERA was a full run higher than his FIP, and Micah Bowie's FIP was 4 runs lower than his 9 ERA. In fact, all but 3 pitchers from the Rockies had FIPs lower than their actual ERAs, as would be expected based on the extra hits that scatter around Coors.

What you want to see out of FIP and ERA comparisons is the guys whose FIP and ERA don't differ much, telling you that their performance was in fact a dependable one. Guys who fit in this category include Aaron Cook (ERA .20 higher than FIP), Ryan Speier (.16) and Ubaldo Jimenez (.16). I get excited about this because it tells me that Baldo's 2008 was the real deal. He wasn't getting lucky, he wasn't just getting all the nice bounces and shots right at fielders, but rather, he struck out and walked an appropriate number of guys to get the results that the scorecards show.

Looking at projections, the guys we should be most excited about next season are obviously gonna be the ones with the lowest FIP. It's a lot easier to project strikeouts, walks, and home runs than hits and such, because of the variability of hits like we talked about earlier. When you look at the projections, it's no surprise, either. Ubaldo Jimenez, Aaron Cook, Jorge De La Rosa, Jason Marquis, Greg Smith. The best guys, period, are Street, Corpas, and Buchholz, but that was almost to be expected.

So that'll wrap it up for this week's Counting Rocks. Tune in next week when we explore what the optimal spacing of the pinstripes on Ubaldo's uniform is to create a Magic Eye picture that'll distract batters into thinking it's a bowling ball coming toward home plate, and how much wood it does in fact take to show that woodchucks do chuck wood.

 
