So in today's [Thursday's] Rockpile we were discussing the Rockies' team BABIP and whether or not the Rockies were being cheated out of hits. The common belief is that batters' and pitchers' BABIP should normalize around .300. However, if you look at hitters with a very large sample of plate appearances, like Todd Helton (career BABIP of .334 over 8728 PA), you begin to realize that individual players and teams can show a wide range of BABIP.
So the next step becomes how you know what a player's BABIP should be. Obviously how they hit the ball matters: line drives are more likely to be hits than grounders, and grounders are more likely to be hits than fly balls. Taking those thoughts and doing a little math, you arrive at a simple formula:
expected BABIP = .15 * FB% + .24 * GB% + .73 * LD%
which assumes that, on average, 15% of fly balls, 24% of ground balls and 73% of line drives become hits.
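For anyone who wants to run the numbers themselves, here is a minimal Python sketch of the simple formula. The function name and the sample batted-ball profile are made up for illustration, and it assumes the percentages are entered as fractions (0.20 for 20%).

```python
def simple_xbabip(fb_pct, gb_pct, ld_pct):
    """Simple expected BABIP: weight each batted-ball type by its hit rate.

    Inputs are fractions of balls in play (0.20 means 20%).
    """
    return 0.15 * fb_pct + 0.24 * gb_pct + 0.73 * ld_pct


# Hypothetical profile (not a real player): 36% FB, 44% GB, 20% LD.
print(round(simple_xbabip(fb_pct=0.36, gb_pct=0.44, ld_pct=0.20), 3))
```

Because line drives carry the .73 weight, a few points of LD% move the result far more than the same shift between grounders and fly balls.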
If you dig a little deeper into xBABIP, you can find some interesting formulas, all of which are more involved and take a lot of other factors into account. The one I used came from slash12 at Beyond the Box Score, with the original article found here.
xBABIP = 0.391597252 + (LD% * 0.287709436) + ((GB% - (GB% * IFH%)) * -0.151969035) + ((FB% - (FB% * HR/FB%) - (FB% * IFFB%)) * -0.187532776) + ((IFFB% * FB%) * -0.834512464) + ((IFH% * GB%) * 0.4997192)
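The same formula is easier to check when each bucket of batted balls gets its own name. This is only a sketch: the function and variable names are mine, the sample inputs are hypothetical, and it assumes every rate is a fraction, with IFFB% and HR/FB% measured as a share of fly balls and IFH% as a share of ground balls (which is how the products in the formula read).

```python
def xbabip(ld_pct, gb_pct, fb_pct, iffb_pct, hr_fb_pct, ifh_pct):
    """xBABIP using the coefficients quoted above (all inputs as fractions)."""
    infield_hits = ifh_pct * gb_pct           # grounders beaten out for hits
    other_grounders = gb_pct - infield_hits   # every other ground ball
    popups = iffb_pct * fb_pct                # infield fly balls
    home_runs = hr_fb_pct * fb_pct            # fly balls that left the park
    outfield_flies = fb_pct - home_runs - popups

    return (
        0.391597252
        + ld_pct * 0.287709436
        + other_grounders * -0.151969035
        + outfield_flies * -0.187532776
        + popups * -0.834512464
        + infield_hits * 0.4997192
    )


# Hypothetical hitter: 20% LD, 45% GB, 35% FB, 10% IFFB, 10% HR/FB, 6% IFH.
print(round(xbabip(0.20, 0.45, 0.35, 0.10, 0.10, 0.06), 3))
```

Splitting out the buckets makes the signs easy to read: line drives and infield hits push xBABIP up, while popups carry by far the largest penalty.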
FanGraphs, as well as a few other sites I checked out, recommended it as about as accurate as a predictive stat can be. So I went ahead and plugged in the top offensive producers for the Rockies to see how their xBABIP matched up to their actual BABIP. I also wanted to compare this to their career BABIP (cBABIP), my idea being that players whose current BABIP was below both their cBABIP and xBABIP were due for an upward regression. Players with a BABIP lower than their cBABIP but close to their xBABIP, on the other hand, were not due for a regression but simply had flaws in their hitting approach.
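That comparison can be written down as a small decision rule. The sketch below is an illustration only: the function name, the labels and especially the 10-point tolerance for "close to" are my assumptions, since no cutoff was used beyond eyeballing the numbers.

```python
def babip_read(babip, xbabip, cbabip, tol=0.010):
    """Label a hitter by comparing current BABIP to xBABIP and career BABIP.

    tol is a hypothetical cutoff for "close to" (10 points of BABIP).
    """
    below_career = babip < cbabip - tol
    below_expected = babip < xbabip - tol
    near_expected = abs(babip - xbabip) <= tol

    if below_career and below_expected:
        return "due for upward regression"
    if below_career and near_expected:
        return "approach issue, not luck"
    return "no clear signal"


# Hypothetical hitters, not real stat lines.
print(babip_read(babip=0.280, xbabip=0.320, cbabip=0.334))
print(babip_read(babip=0.280, xbabip=0.285, cbabip=0.334))
```

Widening or narrowing `tol` changes who lands in which bucket, so treat the labels as a starting point for discussion rather than a verdict.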
From there you can start to see some interesting things. Todd Helton, for example, is probably hitting the ball as well as his career average but is not getting the ball to fall. Others whose BABIP is below what is expected from both their hitting tendencies this year and their historic norms are Hernandez, Fowler, Tulowitzki and Scutaro.
Carlos Gonzalez is an interesting case because his xBABIP is exactly the same as his actual BABIP, which proved helpful later. Also, while Cuddyer and Colvin have a higher BABIP than normal, they are both in line with what you would expect from the way they have hit the ball this season, so any regression will come from them cooling off at the plate and not from luck.
So my next step was to apply the new xBABIP and see what their batting averages would be, using the number of hits they would have if their BABIP equaled their xBABIP. I found this by taking the BABIP formula
BABIP = (H - HR) / (AB - K - HR + SF)
and solving it for H (hits), so I could plug in their strikeouts, at-bats, home runs, sacrifice flies and xBABIP to get the number of expected hits (exH). That gave me this formula:
exH = HR + xBABIP(AB - K - HR + SF)
I plugged in all the same players, got their new expected hit totals, and divided each by the number of at-bats to get their xBA.
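Here is a short sketch of that step, from xBABIP to expected hits to xBA. The function names and the sample stat line are hypothetical; only the exH formula itself comes from the post.

```python
def expected_hits(xbabip, ab, k, hr, sf):
    """exH = HR + xBABIP * (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    return hr + xbabip * balls_in_play


def expected_avg(xbabip, ab, k, hr, sf):
    """xBA: expected hits divided by at-bats."""
    return expected_hits(xbabip, ab, k, hr, sf) / ab


# Hypothetical line: 300 AB, 60 K, 10 HR, 3 SF, with an xBABIP of .320.
ex_h = expected_hits(0.320, ab=300, k=60, hr=10, sf=3)
x_ba = expected_avg(0.320, ab=300, k=60, hr=10, sf=3)
print(round(ex_h, 1), round(x_ba, 3))
```

Home runs are added back outside the xBABIP term because they are hits that never count as balls in play.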
As you can see from the chart above, I also calculated their xOBP based upon their exH. I used Cargo as a test to make sure that I was executing the BA and OBP formulas correctly: since his xBABIP matched his BABIP, his xBA and xOBP should come out equal to his actual BA and OBP.
I'll be more than willing to discuss any implications of this as well as answer any questions in the comments.
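The Cargo check can be reproduced as a round trip: feed a hitter's actual BABIP back in as his xBABIP, and the expected numbers should match the real ones. This sketch uses a made-up stat line, hypothetical function names, and the standard OBP definition, (H + BB + HBP) / (AB + BB + HBP + SF), which I am assuming is the OBP formula behind the chart.

```python
import math


def babip(h, ab, k, hr, sf):
    """BABIP = (H - HR) / (AB - K - HR + SF)."""
    return (h - hr) / (ab - k - hr + sf)


def expected_hits(xbabip, ab, k, hr, sf):
    """exH = HR + xBABIP * (AB - K - HR + SF)."""
    return hr + xbabip * (ab - k - hr + sf)


def obp(h, bb, hbp, ab, sf):
    """Standard OBP; pass exH as h to get xOBP."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


# Hypothetical line (not Cargo's real stats).
h, ab, k, hr, sf, bb, hbp = 90, 300, 60, 10, 3, 30, 2

# Use the actual BABIP as the "expected" one, as in the Cargo test.
ex_h = expected_hits(babip(h, ab, k, hr, sf), ab, k, hr, sf)

# If the formulas are wired up correctly, nothing should change.
assert math.isclose(ex_h, h)
assert math.isclose(ex_h / ab, h / ab)
assert math.isclose(obp(ex_h, bb, hbp, ab, sf), obp(h, bb, hbp, ab, sf))
print("round trip OK")
```

If any of the assertions fail, the error is in the algebra or the spreadsheet, not in the hitter.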
Thanks for reading