This is a guest post by the data nerds at Homefield Labs. To see the wind forecast updated every hour or to explore more research in golf analytics, check out Homefield Labs or follow @homefieldlabs.
This is the current weekday wind forecast for the upcoming John Deere Classic. As far as wind forecasts go, this one isn't very exciting. Neither side of the draw appears to have a big advantage. Plus, thunderstorms are possible on Thursday and Friday, so the second round might end on Saturday. In the absence of a weather delay, Homefield Labs projects that the Thursday PM/Friday AM draw will face 0.57 mph less wind on average. We project this side of the draw will face conditions 0.18 strokes easier and will account for 50.74% of the players who make the cut.
As golf fans, we believe that stronger wind makes a golf course more difficult. Our intuition is born out of our own struggles in the wind and reinforced every time we see the flags whipping on The Golf Channel. As DFS gamers, we look at the weather forecasts hoping to identify the players who will face less wind than their competitors. But to optimize our lineups, we must first quantify the impact that a change in wind has on scoring.
What is the impact of 1 mph of wind? To project the wind's future impact, we rely on its historical impact. To start, we combine weather and scoring data for full-field PGA Tour events that employ a traditional morning-afternoon wave system off two tees and a 36-hole cut. For example, we do not include the Humana Challenge because everyone tees off in the morning and play rotates among three courses. We also exclude all tournaments where play was suspended for weather reasons during the first two rounds. For each tournament, we split the field into 'Early-Late' and 'Late-Early' halves based on the tee time assignments. We then calculate the scoring average and the average wind speed for each half.
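As a rough illustration of the split described above, here is a minimal Python sketch that turns one tournament into a single plotted point. The record layout, the wave labels, and every number are made up for illustration. Measuring the differences as windier half minus calmer half is our assumption about orientation, not a description of the actual Homefield Labs pipeline.

```python
from statistics import mean

# Hypothetical player-rounds for one tournament:
# (wave, round score, average wind speed in mph during the round).
# These values are invented for illustration only.
player_rounds = [
    ("early_late", 70, 8.0),
    ("early_late", 71, 9.0),
    ("early_late", 69, 7.0),
    ("late_early", 72, 11.0),
    ("late_early", 71, 10.0),
    ("late_early", 73, 12.0),
]


def wave_averages(records, wave):
    """Return (scoring average, average wind speed) for one half of the draw."""
    scores = [score for w, score, _ in records if w == wave]
    winds = [wind for w, _, wind in records if w == wave]
    return mean(scores), mean(winds)


def tournament_point(records):
    """Return (wind difference, scoring difference), windier half minus calmer half."""
    score_a, wind_a = wave_averages(records, "early_late")
    score_b, wind_b = wave_averages(records, "late_early")
    if wind_a >= wind_b:
        return wind_a - wind_b, score_a - score_b
    return wind_b - wind_a, score_b - score_a


wind_diff, score_diff = tournament_point(player_rounds)
print(wind_diff, score_diff)
```

In this toy field, the windier half faced 3 mph more wind and scored 2 strokes higher per round.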
Below is a plot of the difference in scoring average vs. the difference in average wind speed. Each circle is a tournament.
We use Ordinary Least Squares to estimate the parameters of a linear regression model. In this regression, we assume that both halves of the draw are equally skilled; therefore, the intercept is zero. The equation of the line is:
increase in scoring average = 0.32 x increase in average wind speed
In other words, each additional mile per hour of wind has, on average, coincided with a 0.32-stroke increase in the scoring average.
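For readers who want to see the mechanics, here is a minimal sketch of a least-squares fit with the intercept fixed at zero. In that case the slope reduces to sum(x * y) / sum(x * x). The data points below are invented for illustration and are not the tournament sample behind the 0.32 estimate.

```python
def slope_through_origin(x, y):
    """Least-squares slope for y = b * x, with the intercept fixed at zero."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("need at least one nonzero x value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Made-up tournament points: increase in average wind speed (mph)
# and increase in scoring average (strokes).
wind_diff = [1.0, 2.0, 3.0, 4.0]
score_diff = [0.5, 0.4, 1.2, 1.1]

b = slope_through_origin(wind_diff, score_diff)
print(round(b, 2))
```

A handy property of fixing the intercept at zero: flipping which half of the draw is subtracted from which changes the sign of both differences, so the fitted slope stays the same.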
For daily fantasy golf enthusiasts, success requires selecting players who make the cut. To get a sense for how the change in wind affects the likelihood of making the cut, for each tournament in our sample, we calculated the percentage of players who made the cut that came from each side of the draw. Below is a plot of the contribution to the weekend from the side of the draw with more wind vs. the increase in wind speed.
Again, we use Ordinary Least Squares to estimate the parameters of a linear regression model. In this regression, we assume that both halves of the draw are equally skilled; therefore, the intercept is 50%. The equation of the line is:
percentage of made cuts from the windier side = 50 - 1.30 x increase in average wind speed
In other words, each additional mile per hour of wind has, on average, coincided with a 1.30 percentage-point drop in the share of weekend players that side of the draw contributes.
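Here is a minimal sketch of the same idea with the intercept held at 50, followed by a worked projection. The fit uses invented points. The projection uses the two coefficients quoted in this post (0.32 strokes and -1.30 percentage points per mph) and this week's projected 0.57 mph gap. The function and constant names are our own, chosen for illustration.

```python
def slope_with_fixed_intercept(x, y, intercept):
    """Least-squares slope for y = intercept + b * x, with the intercept held fixed."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("need at least one nonzero x value")
    return sum(xi * (yi - intercept) for xi, yi in zip(x, y)) / sxx


# Made-up points: wind increase (mph) and share of made cuts (%) for the windier side.
wind_increase = [1.0, 2.0, 3.0, 4.0]
cut_share = [49.0, 47.0, 46.5, 44.5]
fitted_slope = slope_with_fixed_intercept(wind_increase, cut_share, 50.0)

# Coefficients quoted in the post.
STROKES_PER_MPH = 0.32
CUT_POINTS_PER_MPH = -1.30


def project(wind_change_mph):
    """Return (change in scoring average, share of made cuts in %) for one side of the draw."""
    strokes = STROKES_PER_MPH * wind_change_mph
    share = 50.0 + CUT_POINTS_PER_MPH * wind_change_mph
    return strokes, share


# The calmer side of the draw is projected to face 0.57 mph less wind.
strokes, share = project(-0.57)
print(round(strokes, 2), round(share, 2))
```

For the calmer side, the arithmetic is 0.32 x 0.57, or about 0.18 strokes easier, and 50 + 1.30 x 0.57 = 50.74% of the players who make the cut.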
- Are both sides of the draw equally good? We assumed both halves of the draw, on average, are equally skilled. They may not be. We believe the intent of the morning-afternoon wave system is fairness. We also know that the PGA Tour splits its marquee players between the two sides of the draw to keep the TV viewership happy. But is there a Task Force designed to make sure the hottest players are evenly distributed on both sides of the draw? No.
- Could factors other than change in average wind speed be the reason why the scoring changes? Yes. Changes in wind direction, player preferences, pin placements, and a thousand other factors, including randomness, could explain why the data looks the way it does. This data does not prove the wind makes the course harder (what data could?). All it does is provide evidence in support of our intuition that the wind makes the course harder and put a number on the size of the impact. We will also note that, if average wind speed and scoring average were truly unrelated, the chance of seeing a "relationship" this strong through randomness alone would be less than two in one thousand.
- Can these equations be used to predict what happens? Sort of. Look at how spread out the dots on both plots are, and how few of them fall on the lines. There is just about no chance that the wind in any single tournament has exactly the impact that we project. However, if you want our best estimate of the wind's average impact, here you have it.
- Will this data change over time? Yes. We will update our estimates of the wind’s impact on scoring and the cut every week.
- Does the analysis change when dealing with higher wind speeds? We treat the difference between 10 mph and 12 mph as the same as the difference between 20 mph and 22 mph. It's possible, and maybe even intuitive, that a five mph difference at high wind speeds has more impact on scores than a five mph difference at low speeds. A topic for future investigation...
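The caveat about how spread out the dots are can be put into numbers. Here is a minimal sketch, again on invented points, that reports the typical miss of a zero-intercept line (the residual standard deviation) and the uncertainty in the slope itself (its standard error). These are the standard formulas for a regression through the origin. We are not claiming they reproduce the significance figure quoted above.

```python
from math import sqrt


def origin_fit_spread(x, y):
    """Return (slope, residual standard deviation, slope standard error)
    for a least-squares fit of y = b * x with the intercept fixed at zero."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two points")
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("need at least one nonzero x value")
    b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    residuals = [yi - b * xi for xi, yi in zip(x, y)]
    # One parameter (the slope) is estimated, so n - 1 degrees of freedom remain.
    resid_sd = sqrt(sum(r * r for r in residuals) / (n - 1))
    slope_se = resid_sd / sqrt(sxx)
    return b, resid_sd, slope_se


# Made-up tournament points, not the Homefield Labs sample.
wind_diff = [1.0, 2.0, 3.0, 4.0]
score_diff = [0.5, 0.4, 1.2, 1.1]

b, resid_sd, slope_se = origin_fit_spread(wind_diff, score_diff)
print(round(b, 2), round(resid_sd, 3), round(slope_se, 3))
```

A residual standard deviation that is large relative to the projected effect is the numerical version of "few dots are on the line": the average impact can be estimated reasonably well even when any single tournament lands far from it.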