Wednesday, May 12, 2010

Quantifying Edson Buddle's Hot Start


On pace for one of the greatest goal-scoring seasons ever

Everyone's been amazed by Edson Buddle's fast start to the 2010 MLS season - 9 goals in 8 matches. It's one of the greatest starts to a season, and it was likely a key factor in his selection to the US Men's National Team preliminary squad for the World Cup. His pace can be quantified statistically by comparing it to the Top 5 goal scorers in each MLS season. That method is discussed below.

Statistical Methodology

Given that each player plays a different number of matches due to injuries, service on the national team, and other reasons, I will use a normalized metric rather than the raw total of goals scored. The key to the analysis is to select a normalized metric that is approximately normally distributed - this allows for the most straightforward analysis of how well Buddle is doing.

I tried two statistics. The first was the traditional goals per game. That produced a non-normal data set according to a normality test. I then performed a Box-Cox analysis to see if any power transformation of this data set would yield a normal data set. The result of the Box-Cox analysis was a lambda of -1.0, which corresponds to taking the inverse of the data. In other words, the statistic most likely to be normally distributed is games per goal.

Sure enough, the normality test does not reject normality for games per goal, based upon the p-value of 0.562 in Figure 1.

Figure 1: Graphical summary of games per goal
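
To make the lambda = -1.0 result concrete, here is a minimal sketch of the one-parameter Box-Cox transform in plain Python. The goals-per-game values are made up for illustration and are not the MLS data set, and the `box_cox` helper is my own naming, not the tool used for the analysis above. It shows that at lambda = -1 the transformed value is simply 1 minus games per goal.

```python
import math

def box_cox(x: float, lam: float) -> float:
    """One-parameter Box-Cox transform of a positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)  # limiting case as lambda approaches 0
    return (x ** lam - 1.0) / lam

# Hypothetical goals-per-game values, for illustration only.
goals_per_game = [0.45, 0.60, 0.75, 1.125]

for g in goals_per_game:
    games_per_goal = 1.0 / g
    transformed = box_cox(g, -1.0)
    # At lambda = -1 the transform reduces to 1 - 1/g = 1 - games_per_goal.
    print(f"{g:.3f} goals/game -> {games_per_goal:.3f} games/goal, "
          f"Box-Cox(-1) = {transformed:.3f}")
```

Since 1 minus games per goal is just a shifted and flipped copy of games per goal, a normality test gives the same verdict on either one, which is why games per goal can be used directly.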

Now that we have a normal data set, we can generate the same statistic (games per goal) for Buddle and the four other top goal scorers so far in 2010 and compare them to the historical data via a Z-statistic. The Z-statistic takes the player's value (X), the historical mean (mu), and the historical standard deviation (sigma), and produces a single Z-score, Z = (X - mu) / sigma, that can be converted to a percentile with a standard normal table or an online calculator. Figure 2 shows the equation for Z.

Figure 2: Z statistic equation

Based upon Figure 1, mu = 1.7919 and sigma = 0.3810 for games per goal.
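
As a rough check of the method, the sketch below plugs Buddle's numbers into the Z equation using only Python's standard library. It assumes the 9 goals in 8 matches quoted above are the right inputs for his games per goal; the variable and function names are mine.

```python
from statistics import NormalDist

MU = 1.7919     # historical mean of games per goal (Figure 1)
SIGMA = 0.3810  # historical standard deviation of games per goal (Figure 1)

def z_score(x: float, mu: float = MU, sigma: float = SIGMA) -> float:
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

# Assumption: Buddle's 2010 line so far is 9 goals in 8 matches.
goals, matches = 9, 8
buddle_games_per_goal = matches / goals

buddle_z = z_score(buddle_games_per_goal)
# Convert Z to a percentile with the standard normal CDF.
buddle_percentile = NormalDist().cdf(buddle_z) * 100

print(f"games per goal: {buddle_games_per_goal:.3f}")
print(f"Z: {buddle_z:.2f}")
print(f"percentile: {buddle_percentile:.2f}")
```

This gives a Z of about -2.37, which lands just under the 1st percentile of games per goal.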

Where Buddle sits vs. MLS history

Figure 3 shows the 2010 Top 5 goal scorers' Z statistics and resulting percentiles vs. the 1996-2009 historical data. Because fewer games per goal is better, the percentiles read in reverse of what's normally expected: a low percentile means a high scoring rate. Edson Buddle's performance to date has him in the first percentile - that means he is scoring goals more frequently than 99%+ of the Top 5 scorers in the historical data. What's interesting is that three players (Buddle, Dwayne De Rosario, and Chris Wondolowski) are currently on pace to score more frequently than 95% of them.

Figure 3: 2010 Top 5 goal scorers
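
To put the 95% mark in more familiar terms, this sketch inverts the same normal model to find the games-per-goal cutoff for the 5th percentile, then flips it back to goals per game. It relies on the mu and sigma from Figure 1 and on the normality assumption holding in the tail; the names are mine.

```python
from statistics import NormalDist

# Normal model of historical games per goal, using mu and sigma from Figure 1.
historical = NormalDist(mu=1.7919, sigma=0.3810)

# Lower is better for games per goal, so "scoring more frequently than 95%"
# corresponds to sitting at or below the 5th percentile.
cutoff_games_per_goal = historical.inv_cdf(0.05)
cutoff_goals_per_game = 1.0 / cutoff_games_per_goal

def share_outscored(games_per_goal: float) -> float:
    """Share of the historical distribution with a slower scoring rate."""
    return 1.0 - historical.cdf(games_per_goal)

print(f"cutoff: {cutoff_games_per_goal:.3f} games per goal")
print(f"        {cutoff_goals_per_game:.3f} goals per game")
print(f"share outscored at 8 games per 9 goals: {share_outscored(8 / 9):.3f}")
```

Under this model, a player needs to score at a rate of roughly 0.86 goals per game (about 1.17 games per goal) to clear the 95% mark.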

I will keep this table up to date on a weekly basis and track Buddle's progress toward becoming the most frequent goal scorer in MLS history. I will also spend some time in the coming week or two creating similar analyses for assists, shots, and shots on goal.
