You can't buy this kind of hardware.
I made posts here, here, and here recreating the Soccernomics pay-to-win regression and highlighting some hidden results that the authors of that book didn't publish. Before that series of posts I showed that MLS team payroll disparity isn't increasing, that the average MLS team payroll is increasing at three times the rate of inflation, and that teams with designated players (DP's) are the ones fueling that rise. The next logical question is:
- Is there a correlation between MLS team payroll and finishing position like there is in the English leagues?
Background
For my international readers, I must take a few sentences to explain the Americanized version of the world's game that MLS plays. We don't award our championship to the team that wins the table. Instead, we give them what we call a Supporters' Shield, which qualifies them for the CONCACAF Champions League, and then we break them and the next seven teams into a playoff. That's right - our domestic league uses a knockout round system to determine its champion. This bows to the very American way of determining championships in every other major sport - hockey, basketball, football, and baseball. In all of those other sports except football, playoff teams at least have to win a 5- or 7-game series to advance to the next round - that is, they have to win 3 or 4 games over their opponent. MLS and our football league - the NFL - have largely decided on a single-game format. One match between two sides decides who advances to the next round of the playoffs. Pull off one above-average performance, and you can easily send a team that consistently outperformed you all season into the offseason wondering what all that hard work was for.
Not content to be like every other US sports league, MLS does throw in a home-and-away format in the first round of the playoffs. It's not clear to me why they do this in the first round but not in the second round or the MLS Cup. Perhaps it is an attempt to prevent a large number of first round upsets.
To keep the playoffs interesting, MLS sends eight teams to the playoff tournament every year. This presents some interesting challenges for a growing league. The total number of teams in MLS over the past few seasons is as follows:
- 2005: 12
- 2006: 12
- 2007: 13
- 2008: 14
- 2009: 15
- 2010: 16
The teams added in 2007 through 2010 are not ones promoted from lower divisions - they are new franchises created for the explicit purpose of playing in MLS. This is another key difference from the rest of the world's game. Because of the rapid growth over the last four years, this will be the first year that MLS will not have placed the majority of its teams into the playoff tournament.
All of these factors have a huge impact on the American game. Given that one just needs to make it into the postseason tournament to have a shot at a championship, a number of teams with losing records make it in every year. Indeed, last year's champion, Real Salt Lake, had a losing record and happened to flip a switch at the end of the season to make it into the playoffs and pull off an impressive run of wins once they were in the tournament. Also, given that there are a number of new franchises the last few years looking to make a splash, spending is way up for them yet they still struggle with the usual "expansion team" performance challenges. Finally, there is only the motivation of playoff seeding to push teams to compete for the top spots in the table. All these could potentially affect the drive of teams to spend money and resources to finish first rather than fourth or fifth in the table, thus making a relationship between payroll and performance harder to prove.
The Inputs
As MLS is divided into two conferences (East and West) for the playoff format, I had to combine the two conferences for each season and assign finishing positions based upon each team's total points for the season. Where teams were tied on points, I awarded them the same position in the table and then skipped ahead to the next open finishing position for the first team after those that were tied. Once each team in each season had a finishing position assigned, I averaged the finishing positions for each team across the seasons.
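The tie handling described above can be sketched in a few lines of Python. This is only an illustration of the ranking rule, not the spreadsheet work behind the figures; the function name, team names, and points totals are all hypothetical.

```python
def competition_rank(points_by_team):
    """Rank teams by points, highest first.

    Tied teams share a position and the following position(s) are skipped,
    e.g. 1, 2, 2, 4.
    """
    ordered = sorted(points_by_team.items(), key=lambda kv: kv[1], reverse=True)
    positions = {}
    for index, (team, points) in enumerate(ordered):
        previous_team, previous_points = ordered[index - 1] if index else (None, None)
        if points == previous_points:
            # Same points as the team above: share its position.
            positions[team] = positions[previous_team]
        else:
            # Otherwise the position is the 1-based place in the sorted list.
            positions[team] = index + 1
    return positions


# Hypothetical single-table points totals, NOT real MLS results.
season = {"Team A": 49, "Team B": 45, "Team C": 45, "Team D": 40}
ranks = competition_rank(season)
print(ranks)
```

Running the same function over each season's combined table and averaging a team's positions would give the per-team average finishing position.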
I used the player payroll data to calculate each team's payroll as a multiple of the league average for the season. The team payroll multiple from each season was then compiled to make an average value for each team in MLS.
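As a rough sketch of that calculation, assuming each season is a simple mapping of team to total payroll (the function names and dollar figures below are made up for illustration):

```python
from statistics import mean


def payroll_multiples(payroll_by_team):
    """Express each team's payroll as a multiple of that season's league average."""
    league_average = mean(payroll_by_team.values())
    return {team: payroll / league_average for team, payroll in payroll_by_team.items()}


def average_multiple(seasons, team):
    """Average a team's payroll multiple over the seasons it actually played in."""
    return mean(payroll_multiples(s)[team] for s in seasons if team in s)


# Hypothetical payrolls in $M, NOT the real MLS figures.
seasons = [
    {"Team A": 3.0, "Team B": 2.0, "Team C": 1.0},
    {"Team A": 6.0, "Team B": 2.0, "Team C": 2.0, "Team D": 2.0},  # Team D is an expansion side
]
print(average_multiple(seasons, "Team A"))
print(average_multiple(seasons, "Team D"))
```

Averaging only over the seasons a team played means an expansion team is compared against the league average of its own seasons rather than being penalized for years it did not exist.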
The results of these two compilations can be seen in Figure 1 below.
Figure 1: MLS average league position and payroll, 2005-2009
Just like the Soccernomics analysis, the data above is non-normal and must be transformed before performing any correlation studies or regression analyses. To do this, I initially tried the Soccernomics transformations of translating finishing positions to percentages and then taking natural logs, and found that they worked. See Figure 2 below, where the p-value of the normality test is greater than 0.05, so the assumption of normality is a safe one.
Figure 2: Graphical Summary for ln(p/(16-p))
The team payroll data was also transformed by a natural logarithm, and we can now explore if there is any relationship between the data.
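For readers who want to try the transforms themselves, here is a minimal sketch. It assumes the formula in Figure 2 reads as ln(p/(16-p)), that the 16 is the number of teams in the averaged data (15) plus one, and that p is a team's average finishing position; the function names are my own.

```python
import math


def position_logit(position, n_teams=15):
    """ln(p / (n_teams + 1 - p)); with 15 teams this is ln(p / (16 - p)).

    Assumption: the 16 in the figure caption is n_teams + 1.
    """
    return math.log(position / (n_teams + 1 - position))


def log_payroll(multiple):
    """Natural log of a payroll multiple (1.0 means league-average payroll)."""
    return math.log(multiple)


# Illustrative values only: best, middle, and worst average finishing positions.
for p in (1, 8, 15):
    print(p, round(position_logit(p), 3))
print(log_payroll(1.0))
```

The transform is symmetric around the middle of the table: a mid-table average position of 8 maps to zero, top-of-table positions map to negative values, and bottom-of-table positions map to positive ones. A league-average payroll likewise maps to zero.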
Correlation Test Results
As in my previous post on regression, the first attribute to check is the Pearson correlation coefficient before doing any regression analyses. Doing so will tell us if there is a statistically significant correlation between the two data sets. Figure 3 shows the results of the tests.
Figure 3: Correlation test between average team finish and team payroll multiple (with and without DP salaries included).
As Figure 3 indicates, there is no statistically significant correlation between team payroll and where the team finishes in the table, as neither p-value is below 0.05. What's interesting is that if the DP salaries are excluded (Team% no DP), the correlation statistic actually improves.
I did try a number of other transforms to the data to see if there was one that would generate improved fit. Unfortunately, none of the other transforms I tried improved the correlation statistics. Thus, I conclude that this data shows no statistically significant relationship between team payroll and table finishing position in MLS.
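The figures in this post came from a stats package, but the Pearson statistic itself is simple enough to sketch by hand. The snippet below is an illustration on made-up numbers, not the MLS data, and the function names are hypothetical. It also computes the t statistic that the p-value is derived from.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up transformed values, purely to exercise the functions.
log_payroll = [1, 2, 3, 4, 5]
position_score = [2, 1, 4, 3, 5]

r = pearson_r(log_payroll, position_score)
print(round(r, 3), round(t_statistic(r, len(log_payroll)), 3))
```

The larger the t statistic for a given number of teams, the smaller the p-value, which is why the same correlation can be significant in a 58-team study and not in a 15-team one.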
Reasons for the Lack of Correlation
Given the prominence of the Soccernomics analysis and the different conclusion drawn for MLS, here are some explanations why we might see such a difference in outcomes between the leagues.
- The poor cost/benefit equation of the DP: While the DP sucks up a ton of available payroll (MLS salary cap guidelines notwithstanding), it represents only a single player on the pitch. As we saw in the correlation statistic comparison, the statistical score actually improves when the DP's salary is removed. This is especially true of the LA Galaxy, whose blowout purchase of David Beckham and his $5M+ annual salary has resulted in two bottom-of-the-table finishes followed by an appearance in the MLS Cup in 2009.
- The volatility in the league's makeup: There have been three expansion franchises added to the league in the last three years of the data used in the analysis. Two out of the three have tried to make a big splash by signing DP's, with one experiencing wild success in table position (Seattle Sounders FC) while the other has been in the league basement (Toronto FC). The Soccernomics study used a relatively stable list of teams that fought for positions in a mature league structure, which would produce less of the "special cause" variation seen in MLS's results the last few years.
- Low sample size: As with all statistical tests, sample size is key. The greater the number of samples, the more forgiving the test is and the lower the threshold for concluding a statistically significant relationship exists. See Figure 4 below for an example of how the number of samples affects the critical test statistic. The highlighted column indicates the critical correlation values that a test must be equal to or greater than to ensure less than a 1% chance of error in assuming a correlation exists between two data sets. In the case of the MLS data I used, n=15 so one must observe a correlation statistic of 0.5923 or greater. As I stated in my regression post, the Soccernomics study had 58 teams included and thus only needed to observe a correlation statistic between 0.2948 and 0.3218 to make a conclusion of correlation.
Figure 4: Critical correlation value vs. sample size and significance level
- The league's salary cap structure: Outside of the league's DP rule, MLS does try to maintain some form of a salary cap like most other American leagues. While America prides itself on being one of the most capitalistic societies, its sports leagues are exactly the opposite. In some ways, it makes sense. Capitalism fosters a system of cutthroat competition that eventually leads to a few winners and many losers. This can be counterproductive to providing a healthy, competitive league of 20-30 teams. Providing a ceiling on team payroll may help league parity, but it does make it difficult to rationalize expenditures in hopes of future success.
Ultimately, as MLS moves towards a stable 20-team league in the next few years (increased sample size) and the use of DP's becomes more rational with experience (improved cost/benefit equation), we may see the correlation statistic improve.
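The sample size point in the list above can be checked numerically. The sketch below derives a critical correlation value from Student's t distribution using only the Python standard library. It rests on two assumptions of mine: that the 1% column in Figure 4 is a one-tailed test with n - 2 degrees of freedom (this reading reproduces the quoted 0.5923), and that a simple numerical integration is accurate enough for four decimal places. The function names are hypothetical.

```python
import math


def t_cdf(x, df, steps=2000):
    """Student's t CDF for x >= 0, via Simpson's rule on the density over [0, x]."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(v):
        return norm * (1 + v * v / df) ** (-(df + 1) / 2)

    h = x / steps  # steps is even, as Simpson's rule requires
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3  # 0.5 is the probability mass below zero


def t_quantile(p, df):
    """Invert t_cdf by bisection; intended for 0.5 <= p < 1."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def critical_r(n, alpha=0.01):
    """Smallest correlation that is significant at a one-tailed level alpha (assumed)."""
    df = n - 2
    t = t_quantile(1 - alpha, df)
    return t / math.sqrt(df + t * t)


for n in (15, 58):
    print(n, round(critical_r(n), 4))
```

Under those assumptions, n=15 comes out at roughly 0.592, in line with the 0.5923 quoted above, and n=58 falls between the 0.2948 and 0.3218 bounds quoted for the Soccernomics study.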
Next Steps
If the main goal in MLS is not to win the table but to win the playoff tournament and the MLS Cup, what attributes could be considered in understanding the likelihood of winning the Cup? I will explore this topic in my next post. Until then, enjoy some soccer this weekend!