Monday, March 28, 2011

Quantifying the Impact of the Bias of Arsenal's Referees, Part 1

Special thanks to Dog Face for the data (he and I will be collaborating on the second post in this series), and to Chris from Soccer By The Numbers for help in dissecting the stats.

A little over a month ago I completed a post that quantified the different treatment Arsenal appeared to receive from various referees in the Premier League.  In that post I used statistics from Tim at 7AM Kickoff (shots taken, the ratio of shots-on-goal to shots taken, and Premier League fantasy points for yellow and red cards) to show that Webb, Dean, and Dowd are the least favorable referees for Arsenal while Foy and Atkinson are the most favorable.  What I was unable to do at the time was show how these match statistics affected the outcome of the match.  Luckily, a writer with Untold Arsenal who goes by the name of Dog Face contacted me and supplied data for every Premier League match going back to the 2005/06 season.  This data allowed for the analysis of the impact of such calls on match outcome.

The Data and Statistical Methods Used

Dog Face's data set contains key match statistics from every Premier League match from the 2005/06 season through the latest matches of this season.  To eliminate any error associated with using data from the incomplete 2010/11 season, I focused on the following attributes for the 2005/06 through 2009/10 seasons:
  • Venue (home/away)
  • Shots
  • Shots-on-goal
  • Corners
  • Fouls
  • Yellow cards
  • Red cards
The data came to me paired - each row showing the data for both the home and away teams - so I broke it into unpaired team data.  I then calculated the differential for each statistic except venue, which was coded as a binary statistic (1 = home team, 0 = away team).
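As a rough illustration of that unpairing step, here is a minimal Python sketch.  The column names, the example match, and the sign convention (a team's own count minus its opponent's) are all my assumptions for illustration; the actual layout of Dog Face's data set isn't shown in this post.

```python
# Hypothetical column names; the real data set's layout is not shown in the post.
STATS = ["shots", "shots_on_goal", "corners", "fouls", "yellows", "reds"]

def unpair(match):
    """Split one paired match row into two team rows of differentials.

    Assumed convention: differential = team's own count minus the opponent's.
    """
    rows = []
    for side, other, venue in (("home", "away", 1), ("away", "home", 0)):
        row = {"team": match[f"{side}_team"], "venue": venue}  # 1 = home, 0 = away
        for stat in STATS:
            row[f"{stat}_diff"] = match[f"{side}_{stat}"] - match[f"{other}_{stat}"]
        rows.append(row)
    return rows

# Made-up match, for illustration only.
example = {
    "home_team": "Arsenal", "away_team": "Opponent",
    "home_shots": 15, "away_shots": 9,
    "home_shots_on_goal": 7, "away_shots_on_goal": 3,
    "home_corners": 6, "away_corners": 4,
    "home_fouls": 10, "away_fouls": 14,
    "home_yellows": 1, "away_yellows": 3,
    "home_reds": 0, "away_reds": 1,
}
team_rows = unpair(example)
```

Each match therefore contributes two rows whose differentials are mirror images of one another, which is why 380 matches per season become 760 team data points.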

In attempting to assess the impact of play on the pitch and referee decisions, we have several options.  We could try to determine a relationship between goal differential and the inputs listed above, but this is problematic given the relative paucity of goals and resultant goal differential.  I've done enough analysis of soccer match data to know this is a fool's errand.  The better method is to determine the likelihood of winning a match given the differentials achieved by a team or dealt out by a referee.  To do this, a binary logistic regression analysis was performed using all of the match statistics.  A set of dummy variables based upon the season each data point came from was created to capture the effects of any overlooked variables in the analysis.  Such a regression analysis allowed the construction of a mathematical model to predict the likelihood of winning (earning 3 points), with (1 - likelihood of winning) being equal to the likelihood of not winning (earning 1 point for a tie or 0 points for a loss).  Unfortunately, as its name suggests, binary logistic regression's output is binary in nature and thus cannot differentiate between a soccer match's three possible outcomes.  This is a compromise that must be made to use the analysis.
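For readers unfamiliar with the form of such a model, the sketch below shows how a fitted binary logistic regression turns match statistics into a likelihood of winning.  The coefficient values are placeholders I made up to show the mechanics, not the fitted values from this analysis, and the variable names are hypothetical.

```python
import math

def win_probability(intercept, coefs, features):
    """Logistic model: p(win) = 1 / (1 + exp(-(b0 + sum of b_i * x_i)))."""
    z = intercept + sum(coefs[name] * features[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))

# Placeholder coefficients, NOT the fitted values from this post.
coefs = {"venue": 0.5, "sog_diff": 0.2, "yellow_diff": -0.1, "red_diff": -0.5}

# A made-up home match: +3 shots-on-goal, one more yellow than the opponent.
features = {"venue": 1, "sog_diff": 3, "yellow_diff": 1, "red_diff": 0}

p_win = win_probability(-0.8, coefs, features)
p_not_win = 1.0 - p_win  # covers both a tie and a loss
```

Note that the model only ever returns a single probability, which is why ties and losses end up lumped together as "not winning".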

Like any other statistical analysis, binary logistic regression analysis allows statistical significance to be tested.  In this analysis, the general rule of thumb of p <= 0.05 was used to determine which terms in the analysis were significant (with allowances for slightly higher p-values in team data given the lower sample size).  Based upon these criteria, the following factors were significant in impacting the likelihood of winning a match:
  • Venue (home/away)
  • Shots-on-goal differential
  • Yellow card differential
  • Red card differential
The same factors also ended up being significant when isolating only the Arsenal data within the wider data set.  This allows for a comparison of the impact of various match attributes on the average Premier League team, and of how Arsenal is impacted to a greater or lesser degree by the same match statistic.

The Effect of Yellow & Red Cards

A comparison of the effects of various match statistics could be completed once binary logistic models were created for the league and Arsenal over the five seasons.  Two of them - yellow card and red card differential - are of most interest, as the referee directly controls when a foul is simply a foul and when it is serious enough to warrant a card.  As noted by Chris at Soccer By The Numbers, binary logistic regression predictions present some challenges when trying to provide two dimensional plots of the likelihood of an event (in this case winning) versus a single variable (in this case yellow or red cards).  With Dog Face's data I split the analyses into home and away games, and then set all other variables to their averages for each venue while sweeping the variable of interest (either yellow or red cards) between its min and max values.  The output generated by each sweep came in three forms: the nominal odds, the lower 95th percentile, and the upper 95th percentile.  Such an approach allows us to observe how sample size and the variability of outcome impact the confidence in the model as the data set approaches its extremes (yellow card differentials of 7 or red card differentials of 2).
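The sweep itself can be sketched in a few lines of Python.  Everything numeric here is a placeholder of mine (the intercept, the coefficients, and the venue averages), and the sketch only produces the nominal line; the upper and lower bounds would additionally need the uncertainty of the fitted coefficients, which isn't reproduced in this post.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def sweep(intercept, coefs, averages, variable, lo, hi):
    """Return (value, win probability) pairs, holding every other input at its average."""
    curve = []
    for value in range(lo, hi + 1):
        inputs = dict(averages, **{variable: value})  # override only the swept variable
        z = intercept + sum(coefs[k] * inputs[k] for k in coefs)
        curve.append((value, logistic(z)))
    return curve

# Placeholder numbers for a hypothetical home-only model, NOT the fitted values.
coefs = {"sog_diff": 0.2, "yellow_diff": -0.1, "red_diff": -0.5}
home_averages = {"sog_diff": 2.0, "yellow_diff": 0.0, "red_diff": 0.0}

# Sweep the yellow card differential across the range seen in the data (-7 to +7).
home_curve = sweep(0.3, coefs, home_averages, "yellow_diff", -7, 7)
```

Because the analyses are split by venue, a separate intercept and set of averages would be used for the away sweep rather than a venue term inside the model.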

The plots below show the impact that yellow cards have on match outcome.  The first graph shows the impact at home, while the second graph shows the impact away.  The black lines represent the likelihoods based upon the full league data over the five seasons.  The red lines represent the likelihoods based upon Arsenal's data over the same five seasons.  Solid lines, and their associated equations, represent the nominal predictions from the model, while the dashed lines represent the upper and lower 95th percentile lines.



A few things are clear from the graphs above.
  1. Playing at home clearly has its advantages.  Even with a six yellow card advantage at an away match while achieving their average away number of shots on goal, Arsenal's likelihood of winning an away match is only slightly better than at home when they are even on yellow cards and playing to their average home form.
  2. Clearly the reduction in data points for Arsenal (190 matches) versus the league wide data (1900 matches) contributes to the wide variation shown via the 95th percentile lines.  The relative scarcity of Arsenal matches with an absolute yellow card differential greater than 2 creates the uncertainty at the extremes - 88% of all Arsenal matches ended with an absolute yellow card differential of 2 or less.
  3. A yellow card at home is only slightly less costly than a yellow card away - each yellow card away results in a 0.4% lower likelihood of winning versus a yellow card at home.  Clearly, the difference in home and away likelihoods of winning can't be chalked up to a difference in the impact of yellow cards when the yellow card differential home and away is even.
  4. The non-parallel nature of the Arsenal and league average lines in both graphs indicates that the impact of yellow cards on Arsenal is more severe.  To be exact, it's nearly three times as severe.
Similar odds can be calculated for red cards.  The graphs below show such odds over the range of red cards in the data set, and follow the same conventions as the yellow card graphs above.



A few more conclusions can be drawn based upon the graphs above:
  1. Playing at home has even bigger advantages when it comes to red cards.  In the case of Arsenal, even when they get a red card away their likelihood of winning with average away form is only 0.6, which is still 0.09 (or 15%) lower than the average home performance with no red card advantage or disadvantage.
  2. While the Arsenal data set still shows greater variation than the league wide data due to decreased sample size, it does show greater separation in the data sets (especially at home).  It could be declared that the separation at home between the two data sets for 0 and +1 red card differential shows that Arsenal's improved chances of winning are statistically significant when compared with the league average.
  3. Red cards are certainly a greater detriment to a team's likelihood of winning.  For an average Premier League team, they're 5 times as costly at home and away versus yellow cards.  For Arsenal, they're 4 times as costly at home and nearly 5 times as costly away.
The graphs above indicate the change in the likelihood of winning with each passing yellow or red card in a match, assuming Arsenal is playing at their average form for shots on goal.  They're very useful for illustrative purposes, but not very useful in assessing the impact of the referees identified in my previous posts.  For such an analysis, the individual likelihoods of winning each match are constructed from the match data, and a comparison between the referees is made.

The Impact of Referee's Decisions in Arsenal's Matches

From the graphs above, the impact of yellow and red cards on Arsenal is not the same as on the average Premier League team.  Arsenal pays a much bigger penalty for their red and yellow cards compared to the average Premier League team, and thus the differentiation in referee statistics shown in my last post has a much bigger effect on Arsenal.

Now that a binary logistic regression has been created to predict the effects of various match statistics on the likelihood of an Arsenal win, the contribution from each statistic for each match can be measured.  In studying the referees, the match statistics have been broken into three categories:
  1. Things neither team nor the ref can control - venue
  2. Things the referee tangentially controls - shots on goal, corners, fouls, etc.
  3. Things the referee directly controls - yellow and red cards
There certainly is some interplay between all three - a home team may sense a more lenient ref (see Scorecasting) and will likely commit a higher number of fouls before a yellow card is thrown their way.  Luckily, from a statistical point of view very few of these interactions matter.  The results from the binary logistic regression indicate that precious few variables are statistically significant: venue, shots-on-goal differential, yellow card differential, and red card differential.

To calculate the impact of each referee, a comparison was made between
  1. Each match's likelihood of winning given the match statistics as called versus
  2. How the likelihood of winning would have changed had Arsenal experienced their average number of cards (adjusted for whether the match was home or away).
A general linear model was then constructed with this data to observe the impacts that season and referee had on the difference to the expected average.  The results from the general linear model are presented below via the main effects and interaction effects plots.
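To make the comparison concrete, here is a simplified Python sketch of the per-match calculation.  Note the stand-in: it averages the differences by referee rather than fitting the general linear model with season and referee terms described above, and every coefficient, average, referee label, and match in it is made up for illustration.

```python
import math
from collections import defaultdict

def win_prob(intercept, coefs, x):
    z = intercept + sum(coefs[k] * x[k] for k in coefs)
    return 1.0 / (1.0 + math.exp(-z))

def card_impact(intercept, coefs, match, avg_cards):
    """Win probability as called minus win probability with venue-average cards."""
    avg = avg_cards[match["venue"]]
    baseline = dict(match, yellow_diff=avg["yellow_diff"], red_diff=avg["red_diff"])
    return win_prob(intercept, coefs, match) - win_prob(intercept, coefs, baseline)

def mean_impact_by_referee(intercept, coefs, matches, avg_cards):
    """Simple per-referee mean; a stand-in for the general linear model."""
    impacts = defaultdict(list)
    for m in matches:
        impacts[m["referee"]].append(card_impact(intercept, coefs, m, avg_cards))
    return {ref: sum(v) / len(v) for ref, v in impacts.items()}

# Placeholder values throughout, NOT the fitted model or real match data.
coefs = {"venue": 0.5, "sog_diff": 0.2, "yellow_diff": -0.1, "red_diff": -0.5}
avg_cards = {1: {"yellow_diff": -0.5, "red_diff": 0.0},   # home averages
             0: {"yellow_diff": 0.0, "red_diff": 0.0}}    # away averages
matches = [
    {"referee": "Ref A", "venue": 1, "sog_diff": 3, "yellow_diff": 2, "red_diff": 0},
    {"referee": "Ref A", "venue": 0, "sog_diff": 1, "yellow_diff": 0, "red_diff": 0},
    {"referee": "Ref B", "venue": 1, "sog_diff": 3, "yellow_diff": -0.5, "red_diff": 0},
]
impacts = mean_impact_by_referee(0.0, coefs, matches, avg_cards)
```

A negative value means the cards shown in that referee's matches left the team with a lower likelihood of winning than their average card differential would have, while a value of zero means the referee's matches were right at the average.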



The graphs above confirm that Phil Dowd provides the highest differential against Arsenal from their expected mean.  On average, he costs them 4% per match against their odds of winning a match if they had experienced their average card differential - equivalent to a little more than a yellow card per match officiated.  As mentioned in the previous post on this topic, this is especially odd given the high proportion of home matches that he has officiated (home matches should result in a lower number of cards and thus a higher likelihood of winning).  Howard Webb is the only other official of the eight with a negative differential.  Four of the remaining six officials are right at the average differential of zero, while Chris Foy and Mark Halsey provide the most beneficial treatment of Arsenal.

All of this demonstrates that of the referees who officiate the greatest number of Arsenal matches, Dowd and Webb are the most biased against the Gunners.  Is this due to them actually being biased against Arsenal, or are they simply "tougher" officials when it comes to every team?  The calculations to determine one theory over the other are a good bit more involved, and will have to wait until the second post in this series...
