A couple of days ago, Jay Tymkovich published this Rockpile on how Nolan Arenado could be a big factor in the Rockies' success in 2014. The argument was that if he lifted his batting value to just league average, he could be worth somewhere around 5 WAR, thanks to his elite defensive skills.
I remembered that sometime during last season, RIRF wrote something about Nolan's line-drive percentage and how it didn't match the poor offensive numbers he was putting up. So I started doing a little research on batted ball data, to try to evaluate Arenado's 2013 and see how realistic it is to expect at least an average offensive performance from him in 2014.
I assume many of you know about expected BABIP (xBABIP), where it comes from, and what it's used for. But I'll do a quick recap anyway.
For hitters, most batted ball data correlates strongly with BABIP. So it's possible to estimate what a particular player's BABIP should be, based only on batted ball data. Using the correlation between each batted ball type and BABIP over a large sample size, you can come up with a formula for xBABIP that closely resembles the actual BABIP. There are many xBABIP calculators out there; in this case I used this formula published by slash12 on Beyond the Box Score. I ran it for players with at least 1500 plate appearances over the 2008-2013 period, compared it to their actual BABIP, and found a Pearson correlation coefficient of 0.78, which indicates a very strong relationship.
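For anyone who wants to run that kind of check themselves, here is a minimal sketch of the Pearson correlation step. The `pearson` helper and the five sample values are made up for illustration; they are not the actual player data or the slash12 formula.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of the deviations from the mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)

# Placeholder values only (NOT real players): one xBABIP/BABIP pair per hitter
xbabip = [0.310, 0.295, 0.330, 0.285, 0.320]
babip = [0.305, 0.300, 0.341, 0.279, 0.312]

r = pearson(xbabip, babip)
print(f"Pearson r = {r:.2f}")
```

A value near 1 means hitters with a high xBABIP also tend to have a high actual BABIP; a value near 0 means the formula tells you very little.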
As for how to use and interpret a player's xBABIP, what you do is compare its value to the actual BABIP (always keeping the player's career BABIP in mind as well), to see if the hitter has been "lucky" (BABIP much higher than xBABIP) or "unlucky" (BABIP much lower than xBABIP). This is, in my opinion, a much better approach than just comparing a player's BABIP to league average, since different batters sustain different BABIPs, depending on the type of hitter they are.
So what does batted ball data tell us about Arenado's 2013? Well, his LD% ranked 29th among qualified hitters and his pop-up rate was below league average. Both are usually indicators of a high BABIP. In fact, his xBABIP of .335 was above the league average. On the other hand, his actual BABIP was .296, which was below average. This is especially low when you consider the fact that he calls Coors Field home.
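To put a number on that gap, here is a quick sketch using the two Arenado figures above. The `luck_label` helper and its 20-point threshold are my own assumptions for illustration, not an established cutoff.

```python
def babip_gap(babip, xbabip):
    """Actual BABIP minus expected BABIP, rounded to three decimals."""
    return round(babip - xbabip, 3)

def luck_label(babip, xbabip, threshold=0.020):
    """Rough label for a hitter's batted ball luck.

    The 0.020 threshold is an arbitrary assumption, not a standard value.
    """
    gap = babip - xbabip
    if gap > threshold:
        return "lucky"
    if gap < -threshold:
        return "unlucky"
    return "about as expected"

# Arenado's 2013 figures quoted in the text: .296 BABIP, .335 xBABIP
arenado_gap = babip_gap(0.296, 0.335)
arenado_label = luck_label(0.296, 0.335)
print(arenado_gap, arenado_label)
```

With those inputs the gap comes out to about 39 points of BABIP on the "unlucky" side.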
Unfortunately, FanGraphs doesn't have batted ball data for the minor leagues, so there's an obvious sample size issue with Nolan's numbers. But in order to see how repeatable this data is, I took xBABIP for qualifying batters for each of the last three years, and compared each season to the average of the three preceding seasons (i.e., 2013 vs. 2010-2012, 2012 vs. 2009-2011, and 2011 vs. 2008-2010). The average correlation coefficient was 0.73, which again indicates a very strong relationship. To put that in context, I did the same for AVG, OPS, wOBA, and BABIP, and got these results:
As you can see, xBABIP is a fairly predictable stat, especially if you compare it with other important stats, so it's not the craziest thing to expect Arenado to be near that .335 next season. If his actual BABIP can get closer to his xBABIP, he should have a considerably better 2014 at the plate.
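The repeatability check described above (each season against the average of the three seasons before it) could be sketched roughly like this. The `repeatability` function, the data layout, and the player values are all hypothetical placeholders, not the data behind the numbers in this post.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

def repeatability(stat_by_player, target_years, lookback=3):
    """Average correlation between a season and the mean of the prior seasons.

    stat_by_player maps player -> {year: value}. Players missing the target
    year or any of the `lookback` prior years are skipped for that comparison.
    """
    coefs = []
    for year in target_years:
        prior_years = range(year - lookback, year)
        pairs = [
            (mean(seasons[y] for y in prior_years), seasons[year])
            for seasons in stat_by_player.values()
            if year in seasons and all(y in seasons for y in prior_years)
        ]
        if len(pairs) >= 2:
            prior, current = zip(*pairs)
            coefs.append(pearson(prior, current))
    if not coefs:
        raise ValueError("no target year had enough players to compare")
    return mean(coefs)

# Made-up xBABIP values for hypothetical players (NOT real data)
players = {
    "Player A": {2010: 0.300, 2011: 0.310, 2012: 0.320, 2013: 0.315},
    "Player B": {2010: 0.280, 2011: 0.290, 2012: 0.285, 2013: 0.290},
    "Player C": {2010: 0.330, 2011: 0.340, 2012: 0.335, 2013: 0.325},
    "Player D": {2012: 0.300, 2013: 0.310},  # skipped: not enough history
}

score = repeatability(players, target_years=[2013])
print(f"average correlation = {score:.2f}")
```

Running the same function on AVG, OPS, wOBA, or BABIP in place of xBABIP would give the side-by-side comparison.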
Of course, BABIP does not correlate well with OPS or wOBA (0.37 and 0.40, respectively), but it does have a strong relationship with batting average (a coefficient of 0.65). Considering Arenado already has some pop, just by improving on his very poor .267 AVG, he could very well be that average hitter Jay Tymkovich talked about.