Fantasy Baseball Toolkit: Sample Sizes Part 1

With the season winding down, the leaves changing, and fantasy 'ships on the line, I posted a question over on the Reddit Fantasy Baseball sub last week.  I asked what fantasy baseball topic redditors would most like to hear about on these electronic pages.  There were many great responses and although I only committed initially to writing about the subject that got the most upvotes and conversation, I decided "what the hay?" I'll do 'em all.  So, over the offseason, keep checking back here (I will be tweeting when they are posted @park_ro) for more parts to my new series:

Fantasy Baseball Toolkit

These toolkit posts will be an attempt to help you develop skills to identify breakout hitters (why didn't I believe in you J.D. Martinez?) and pitchers (Jake Arrieta?? Really? Who saw that one coming?) and to know when it is time to cut a guy loose because he ain't coming back.  I will introduce some sabermetric stuff that you will need to know to make these determinations.  I will be using Fangraphs a lot, so go ahead and bookmark that right now. Ok, time's up, let's move on.

In the inaugural toolkit post, we will dive into the world of sample sizes.  This was the most discussed suggestion on reddit and it is easy to see why.  "Small sample size" or SSS is often heard across the Internet when fantasy baseball players are discussed.  It is usually used to downplay a player's sudden success or breakout and dismiss it as a temporary fad.  Like so many popular phrases, it is often misused and misunderstood.  Today, let's learn what a "small" sample size really is, what it means, and how we can work with one.


How Small is Small?

It's pretty small.  Ok, I guess we're done with this section, next, I will...oh, I should probably give you a little more than that.  Here are three articles that will give you far more detail than I ever could about sample sizes and what is considered a meaningful sample size: number 1, number 2, and numero tres.  The third one is from Fangraphs and actually references the first two, which are much more in-depth.  Here is my best tl;dr version of these articles: most pitching and hitting statistics stabilize at some number of data points and that number is different for each stat.  Hitting stats, in general, stabilize faster than pitching stats.

Finally, what does "stabilize" mean?  When a stat stabilizes, it means that there is now enough data to adjust a projection of that stat away from the league average and toward the new data, weighting the two at roughly 50/50.  It does not mean that the stat can now be projected to continue at its current pace for the rest of the season.  It simply means that the sample is now large enough to count about as much as the average projection, so we can blend the two evenly.  As the sample grows past the stabilization point, we can give the new data more weight and move the projection even further from the mean.
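One common way to put this into numbers is a simple weighted blend, where the new data gets a weight of n / (n + stabilization point).  That gives exactly 50/50 at the stabilization point and leans further toward the data as the sample grows.  This is only a minimal sketch under that assumption; the function names and the example rates are made up for illustration.

```python
def data_weight(n, stabilization_point):
    """Share of the projection that comes from the new data (0 to 1)."""
    return n / (n + stabilization_point)


def blended_projection(observed, baseline, n, stabilization_point):
    """Blend an observed rate with a baseline (league or career average)."""
    w = data_weight(n, stabilization_point)
    return w * observed + (1 - w) * baseline


# Hypothetical hitter: 25% K rate over 60 PA, against a 20% baseline,
# for a stat assumed to stabilize at 60 PA.
print(round(blended_projection(0.25, 0.20, 60, 60), 3))  # 0.225
print(round(data_weight(120, 60), 3))                    # 0.667
```

At double the stabilization point, the data would carry about two-thirds of the weight in this sketch, which matches the idea of moving further from the mean as the sample grows.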

Ok, so that may be pretty boring and technical, but I want to make sure we are all on the same page here before we get to the fun of using this stuff.  Basically, if you are at or below the stabilization point for a given stat, you are dealing with some SSS.  Often in fantasy, we do have to work with SSS and make roster decisions off of them, but we can leverage the different stabilization points of the different stats to help us out.  Enough with the big blocks of text, let's do some real-world examples!

Quick disclaimer: stabilization points aren't perfect and do not guarantee that a player has reached a significantly better level of play.  He may have only marginally improved over league average.  There are also outliers: some players show improvement past the stabilization point and then come crashing down below average for the rest of the year.  So, don't put blind faith in the methods I am about to show you, but they can be very helpful.

Example Time!

Let's take a look at some sample data for two mystery hitters from this season:
Player A: .374/.418/.616, 5 HR, 23 R, 18 RBI, .242 ISO (112 PA)
Player B: .245/.291/.363, 3 HR, 9 R, 13 RBI, .118 ISO (111 PA)

I'll give you a hint: player A's data is from April and player B's is from August.  Surprise! It's the same player: Charlie Blackmon, everyone's favorite early season waiver darling.  This is the type of thing that SSS preachers use to say that you shouldn't trust small samples of data, and you should probably ignore it completely until you have tons of data (like half a season). It's the "show me" attitude.  In Blackmon's case, what we had was a 28-year-old non-prospect suddenly becoming an all-star.  Nothing in his history suggested that he could sustain his early season pace.  Here are some stats of his if you take out that historic April: .258/.305/.381 with a 4.6% BB rate and 16.5% K rate. Here is his entire four season major league stat line for those same things: .284/.323/.423 with a 4% BB rate and 15.1% K rate. And finally, this is his total 2014 stat line: .279/.326/.423 with a 5% BB rate and a 15% K rate. Look familiar? All these numbers basically say that April was a fluke and August was a fluke too, but in the end, he ended up about where he always does and back at the average projection we would have made for his season.

Let's look at the stabilization points for average, OBP, slugging percentage, ISO, K% and BB%:
  • Average: 910 AB
  • OBP: 460 PA
  • Slugging: 320 AB
  • ISO: 160 AB
  • K%: 60 PA
  • BB%: 120 PA
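As a quick sketch, the thresholds in this list can be stored in a lookup table and checked against a hitter's playing time.  The function names and the 130 PA / 118 AB hitter are hypothetical; only the thresholds come from the list.

```python
# Stabilization points from the list above: (threshold, unit).
STABILIZATION = {
    "AVG": (910, "AB"),
    "OBP": (460, "PA"),
    "SLG": (320, "AB"),
    "ISO": (160, "AB"),
    "K%": (60, "PA"),
    "BB%": (120, "PA"),
}


def is_stabilized(stat, count):
    """True if `count` (in the stat's own unit) has reached the threshold."""
    threshold, _unit = STABILIZATION[stat]
    return count >= threshold


def stabilized_stats(pa, ab):
    """List the stats whose thresholds are met for a given PA and AB total."""
    counts = {"PA": pa, "AB": ab}
    return [stat for stat, (threshold, unit) in STABILIZATION.items()
            if counts[unit] >= threshold]


# Hypothetical hitter about five weeks into a season: 130 PA, 118 AB.
print(stabilized_stats(pa=130, ab=118))  # ['K%', 'BB%']
```

Keeping the unit next to each threshold matters because some stats count at-bats and others count plate appearances, and a hitter always has at least as many PA as AB.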
What we can take from these values is that slugging, OBP, and average take too long to stabilize and we really can't trust them within a season too much (slugging is the shortest and is still about a half-season of at-bats).  However, ISO, K% and BB% are nice and short and take only 2-5 weeks on average to stabilize.  This means we can use them to look at breakout hitters.  Let's try this out:

Here is Player C's stat line for the first two months of his season (no, not Blackmon this time):
6.4 BB%, 0.304 ISO, 24.8 K%, 125 PA

Here are his 2013 numbers for those stats:
6.8 BB%, 0.124 ISO, 17.3 K%, 352 PA

This is Devin Mesoraco's line.  What we can see here is that his ISO jumped well above its 2013 level and his K rate increased significantly too.  His walk rate was basically the same.  His 125 PA are past the stabilization points for K% and BB% and getting close to the one for ISO, so we can project his ISO this year as roughly a 50/50 blend of his 2013 value and his early season value, which works out to about .214.  Well, it turns out his ISO jumped even more than that and now sits at .258 for the season.  But, we could have identified Mesoraco's power breakout after that early season explosion and known that it was likely a real improvement and not just noise.  The same goes for his strikeout rate increase.  His 2014 rate is 22.2%, which is definitely higher than 2013's value and falls between that value and the early season value.  We'll take the K increase when it comes with a power boost, of course.  What is also interesting here is that his 2014 BB% is 9.7%, which the early season numbers did not predict, since they showed no increase in walks.  This is not a perfect method, by any means, but using the three stats I used here is a good idea because they stabilize so quickly and become meaningful in smaller samples than other stats.
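As a rough check on the arithmetic, here is a sketch that blends Mesoraco's 2013 and early 2014 numbers two ways: a straight 50/50 blend, and a blend weighted by plate appearances.  The function names are mine, and neither blend is meant as a definitive projection method.

```python
def even_blend(baseline, sample):
    """Straight 50/50 blend of a baseline value and a new sample."""
    return 0.5 * baseline + 0.5 * sample


def pa_weighted(baseline, baseline_pa, sample, sample_pa):
    """Blend two values, weighting each by its plate appearances."""
    total = baseline_pa + sample_pa
    return (baseline * baseline_pa + sample * sample_pa) / total


# Mesoraco: 2013 (352 PA) vs. the first two months of 2014 (125 PA).
iso_2013, iso_early = 0.124, 0.304
k_2013, k_early = 17.3, 24.8

print(round(even_blend(iso_2013, iso_early), 3))             # 0.214
print(round(pa_weighted(iso_2013, 352, iso_early, 125), 3))  # 0.171
print(round(even_blend(k_2013, k_early), 2))                 # 21.05
```

The even blend gives an ISO of about .214 and the PA-weighted version about .171.  Both sit well above his 2013 mark and both undershoot the .258 he actually finished with, while the blended K% of about 21% lands close to his actual 22.2%.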

Let's see if some small sample (120 PA) ISO values could have predicted these breakout power hitters:

J.D. Martinez: 2013 ISO: .128, 310 PA,
4/21/14 - 6/22/14 ISO: .283, 122 PA,
2014 ISO: .255

Marlon Byrd (his breakout was in 2013): 2011-2012 ISO: 0.098, 635 PA,
4/1/13 - 5/24/13 ISO: .186, 123 PA,
2013 ISO: .220

Steve Pearce: 2013 ISO: .160, 138 PA,
4/4/14-6/21/14 (he didn't play much in April) ISO: .246, 124 PA,
2014 ISO: .263

Clearly, all three showed significant increases in their ISO values early on and should have been signaling to us, "hey pick me up!"  An important note here is that we don't always get to wait even one month to decide whether to pick up a guy, or we risk missing our chance.  That's true, and you sometimes just have to pick up the hot hand and be willing to dump him if he can't sustain it up to the stabilization point or further.  You will pick up guys that turn out to be terrible after a one week hot streak.  That's just part of the game we love.
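To put numbers on those three jumps, here is a small sketch that subtracts each hitter's baseline ISO from his early sample and flags anything over a cutoff.  The .050 cutoff is an arbitrary value chosen for illustration, not a researched threshold, and the names in the code are mine.

```python
# (player, baseline ISO, early-sample ISO, full-season ISO), from the lines above.
HITTERS = [
    ("J.D. Martinez", 0.128, 0.283, 0.255),
    ("Marlon Byrd", 0.098, 0.186, 0.220),
    ("Steve Pearce", 0.160, 0.246, 0.263),
]

JUMP_CUTOFF = 0.050  # arbitrary cutoff for this sketch


def iso_jump(baseline, sample):
    """Change in ISO from the baseline period to the early sample."""
    return round(sample - baseline, 3)


for name, baseline, early, full in HITTERS:
    jump = iso_jump(baseline, early)
    flag = "pick me up!" if jump >= JUMP_CUTOFF else "wait"
    print(f"{name}: {jump:+.3f} early, finished at {full:.3f} ({flag})")
```

All three clear the cutoff, with jumps of .155, .088, and .086, and all three finished the season well above their baselines.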

So, this method works for these three stats well and seems to identify breakout players.  Now I'm going to give you some breakout players this year that this method does not support.

Josh Harrison and Danny Santana have both been excellent all season and no one can take that away from them, but if we look at the stats that are fueling their success, it is two stats in particular: BABIP (batting average on balls in play) and LD% (line drive %).  Now, these two stats are related to one another and a high line drive rate will produce a high BABIP because the league BABIP on liners is over .600.  However, both of these stats take a looooong time to stabilize.  BABIP requires 810 balls in play (between 1.5 and 3 seasons, at least) and LD% takes 600 balls in play (1-2 seasons).  So, we can say that Harrison and Santana are certainly no guarantee to continue to produce at this level next season and we can still expect that their career norms or league averages will still dominate their expected level of play.  Matt Carpenter is an example of what happens when high LD% and BABIPs fuel career years. When those values return to league average values or career average values, a player can quickly return to being a slightly above average guy instead of an all-star.  Carpenter's BABIP in 2013 was .359, fueled by a ridiculous 27% LD%!  This year, his BABIP has returned to just above league average at .313 and his LD% is a good, but not insane 23%, the same as in 2012.  He is still a good player to have on your roster, but he isn't batting .318 this year.

I would add Kennys Vargas to the list above of guys who will come back to earth next year and should not be expected to produce at this level.  His BABIP is .354 and hasn't even come close to stabilizing yet, while he is striking out 26% of the time and only walking 2% of the time (those have stabilized).  He is hitting 48% grounders, which is not good for slow, plodding power hitters (see Jay Bruce and Allen Craig below).  He is still young and in his first year in MLB after jumping from AA ball, but the stabilized stats tell us to be worried.

As far as identifying the anti-breakout players, the guys who take a nosedive in production, we can use two more quickly-stabilizing stats: GB% (groundball %) and FB% (flyball %).  A sudden spike in either can sometimes spell doom.  It only takes 80 balls in play for these to stabilize, which can happen in 30 games or even less.  If you look at Allen Craig's disaster of a year or Jay Bruce's, you can see their down years a mile away.

Allen Craig: 3/31/14-4/29/14, 86 Balls in play, 16.3% LD%, 62.8% GB%, 21% FB%
Allen Craig: 2013 26.9% LD%, 45% GB%, 28% FB%

Even after just one month, it is clear that he had exchanged line drives for grounders, which is a bad thing.  Line drives go for a hit over 60% of the time, while grounders have a BABIP in the .200s.  It was clear it was going to be a long year for Craig and you could have tried to trade him on name recognition before it was too late.  Jay Bruce tells a similar story.

Jay Bruce: 3/31/14-5/29/14, 83 Balls in play, 20.5% LD%, 49.4% GB%, 30% FB%
Jay Bruce: 2013, 24% LD%, 37% GB%, 39% FB%

Once again, with only a little over a month of games (Bruce missed a few weeks in this time), we can see that he has begun to drive the ball into the ground too much, which hurts his batting average and with it, runs and RBI.  We could have also seen that both Craig's and Bruce's ISOs dropped as well, signaling a lack of power on top of the grounders.
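The same comparison can be sketched in a few lines: take each hitter's early 2014 batted-ball mix, subtract his 2013 mix, and check whether the sample has reached the 80 balls in play needed for GB% and FB%.  The data structure and function names are hypothetical; the numbers are the ones listed above.

```python
PROFILES = {
    "Allen Craig": {
        "early_2014": {"BIP": 86, "LD": 16.3, "GB": 62.8, "FB": 21.0},
        "2013": {"LD": 26.9, "GB": 45.0, "FB": 28.0},
    },
    "Jay Bruce": {
        "early_2014": {"BIP": 83, "LD": 20.5, "GB": 49.4, "FB": 30.0},
        "2013": {"LD": 24.0, "GB": 37.0, "FB": 39.0},
    },
}

GB_FB_STABILIZATION = 80  # balls in play, per the text above


def profile_shift(early, baseline):
    """Percentage-point change in each batted-ball type."""
    return {kind: round(early[kind] - baseline[kind], 1)
            for kind in ("LD", "GB", "FB")}


for name, seasons in PROFILES.items():
    early = seasons["early_2014"]
    shift = profile_shift(early, seasons["2013"])
    enough = early["BIP"] >= GB_FB_STABILIZATION
    print(f"{name}: {shift} (GB%/FB% stabilized: {enough})")
```

Craig's grounders are up 17.8 points and Bruce's are up 12.4, both on samples just past 80 balls in play.  The LD% drops are worth noting too, but LD% takes far longer to stabilize, so they deserve less trust than the GB% and FB% shifts.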

Look out for Part II: The Pitchers Strike Back

Well, that was fun.  In fact, we had so much fun that I have run out of time to get to pitchers, who have their own unique relationships with sample sizes and stabilization.  Here's a quick primer to whet your appetite: while hitters only get 4-5 PAs per game, pitchers can throw over 100 pitches per start, which gives us lots more data and faster stabilization in per-pitch metrics.  The downside is that ball-in-play stats take even longer to stabilize, since starters only pitch every 5 days.

I'll tweet out and post on Reddit when part II comes out, but for now, I will just say good luck in your playoffs and final roto weeks! Tschus!

You can follow me or pepper me with baseball questions on Twitter @park_ro or, if you prefer, you can message me on Reddit u/WisconsinsWestCoast