Friday, June 11, 2010

MLS Attendance 2010: Getting the statistics right


This is what 36,000 fans each week looks like. Not every club is so lucky.

I am focused on the World Cup like everyone else, and I am working on a nice little spreadsheet and post to accompany my viewing experience. But I must take this opportunity to provide some statistical commentary.

As a US soccer fan, I am all for the sport's growth, and Major League Soccer plays a key part in that growth. Everyone is hoping for a big bounce in MLS's profile this year due to the World Cup, so I was naturally excited when MLS Daily came out with a post trumpeting 10.8% growth in MLS match attendance. The problem is that after a closer examination of their numbers, I find that their calculation method is wrong and that the actual attendance growth is far smaller. Let me explain.

MLS Daily's method is simple, straightforward, and one that the average fan might use. They take the 2010 average attendance through 93 games (16,472) and divide it by the 2009 average attendance through 91 games (14,862) to get the 10.8% growth in attendance. The problem is that this violates a basic rule of statistics: it's not the average that matters, but the distribution around the average. The correct way to assess the increase in 2010 MLS attendance vs. 2009 MLS attendance is to answer the following two questions:
  • Is 2010 MLS attendance statistically significantly higher than 2009 MLS attendance?
  • If so, by how much?
To answer those two questions, we must look at each individual team's difference and not the league average.
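For reference, MLS Daily's league-wide figure is easy to reproduce from the two averages quoted above. A minimal Python sketch (the variable names are mine):

```python
# League-wide averages quoted in the MLS Daily post
avg_2010 = 16472  # average attendance through 93 games in 2010
avg_2009 = 14862  # average attendance through 91 games in 2009

# Ratio of the two averages, expressed as percentage growth
growth_pct = (avg_2010 / avg_2009 - 1) * 100
print(f"Growth in average attendance: {growth_pct:.1f}%")
```

This single ratio says nothing about how the change is spread across clubs, which is the gap the two questions above are meant to fill.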

The Statistics

To answer the first question, I first must develop the 2009 and 2010 attendance figures by club. This is easy if I trust the MLS Daily numbers by club. In this case I do trust them, and I divide the 2010 numbers they have published by (1 + % change from 2009) for the respective clubs. That gives me this breakdown of 2009 and 2010 attendance (excluding Philly, of course, since it did not play in 2009). We're now ready to make some comparisons.
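The back-calculation described above can be sketched in a few lines of Python. The club names and numbers below are made up purely for illustration and are not MLS Daily's figures; `attendance_2009` is a helper name I chose.

```python
def attendance_2009(attendance_2010, pct_change):
    """Back out the 2009 average from the 2010 average and the % change from 2009."""
    return attendance_2010 / (1 + pct_change / 100)

# Hypothetical clubs and figures, for illustration only (not MLS Daily's data)
clubs_2010 = {
    "Club A": (11000, 10.0),   # (2010 average, % change from 2009)
    "Club B": (18000, -10.0),
}

for club, (att_2010, pct) in clubs_2010.items():
    att_2009 = attendance_2009(att_2010, pct)
    diff = att_2010 - att_2009
    print(f"{club}: 2009 = {att_2009:,.0f}, 2010 = {att_2010:,}, diff = {diff:+,.0f}")
```

Because the published percentages are rounded, the recovered 2009 figures are only as precise as that rounding allows.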

What's the first rule of any statistical analysis? That's right: check for normality. Neither data set tested normal (which is the key assumption the MLS Daily analysis relies upon yet violates), so I looked to see if I could get both data sets to be normal via the commonly recommended Box-Cox transform. No luck again. It's time to turn to the Mann-Whitney nonparametric analysis.
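To make the normality step concrete, here is a rough sketch in plain Python. The post does not say which normality test was used, so the Jarque-Bera test below is my stand-in, chosen only because it needs nothing beyond the standard library; the attendance list is hypothetical, and the function names are mine. The Box-Cox function applies the standard transform for a lambda you pick rather than estimating the best one.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a chosen lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]

def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) from sample skewness and kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    p_value = math.exp(-jb / 2)  # chi-square survival function with 2 degrees of freedom
    return jb, p_value

# Hypothetical club averages with one very large value (assumption, not real data)
sample = [11000, 12500, 13000, 14000, 14500, 15000, 16000, 17000, 20000, 36000]

jb, p = jarque_bera(sample)
print(f"raw data: JB = {jb:.2f}, p = {p:.3f}")
for lam in (-1, 0, 0.5):
    jb, p = jarque_bera(box_cox(sample, lam))
    print(f"lambda = {lam}: JB = {jb:.2f}, p = {p:.3f}")
```

With only about fifteen clubs per season, the asymptotic p-value is a rough guide at best, so treat this as an illustration of the workflow rather than a substitute for a proper statistics package.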

The Mann-Whitney test works with the ranks of the observations rather than their means, and its results are typically reported in terms of medians, to evaluate the difference between two populations - in this case, 2009 and 2010. That allows it to evaluate non-normal distributions. Using the Mann-Whitney test, we can understand whether or not a statistical difference exists and how big it is.
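For readers who want to see the mechanics, here is a minimal sketch of a Mann-Whitney test in plain Python. It is not the software behind Figure 1: it uses the large-sample normal approximation without a tie correction, and it estimates the size of the shift with the median of all pairwise differences (the Hodges-Lehmann estimate). The function names and the data are hypothetical.

```python
from statistics import NormalDist, median

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney test via the normal approximation (no tie correction)."""
    # U for x: number of (x, y) pairs where x is larger, with ties counted as half
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u_x - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p_value

def hodges_lehmann_shift(x, y):
    """Estimate the shift between x and y as the median of all pairwise differences."""
    return median(a - b for a in x for b in y)

# Hypothetical per-club averages (assumption, not the figures used in this post)
att_2010 = [12000, 14500, 15500, 17000, 18500, 36000]
att_2009 = [11500, 13000, 15000, 16000, 17500, 31000]

u, p = mann_whitney_u(att_2010, att_2009)
shift = hodges_lehmann_shift(att_2010, att_2009)
print(f"U = {u}, p = {p:.3f}, estimated shift = {shift:,.0f}")
```

A statistics package will also report a confidence interval for the shift, which is the more informative number when the samples are this small.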

The Results

Running the Mann-Whitney test yields some interesting results. Figure 1 shows the results of the test.

Figure 1: Mann-Whitney test results for 2009 vs. 2010 MLS attendance

As the last line of the test indicates, the difference between the two seasons is significant, which means the 2010 season has seen an increase in attendance. However, the difference is only 802 people, which yields a 5.5% increase in attendance - roughly half of what MLS Daily claims. What's astonishing is that if you look at the likely spread in differences (the 95% interval, to be exact), almost two-thirds of that range is negative. That means that while the observed values indicate a likely attendance growth, they don't rule out a possible attendance drop given the poor showings in some cities.

Getting these statistics right is very important. MLS is on an expansion plan over the next several years - Vancouver and Portland in 2011, and Montreal in 2012. Expansion is important, but key established teams are suffering: New England and San Jose are both down double digits versus last year. Expanding too quickly can hurt the existing teams and stunt league growth. It's important that MLS and its fans have a realistic view of its growth from one year to the next to ensure club and league viability.
