Is soccer goalscoring Poisson?

A. Heuer, C. Müller, O. Rubner, "Soccer: Is scoring goals a predictable Poissonian process?", Europhysics Letters, 89 (3): 38007, 2010. [arXiv]

Does there exist a formula that can predict the score of a soccer match between two teams?  The answer's not that simple, but three physics researchers from Germany have used German league data from the past 20 seasons to develop an expression that predicts qualitatively the expected outcome of a soccer match as a function of the "fitness level" of both sides.  They show that this fitness level remains constant over a season, establish that correlations between the two teams more often than not result in draws, and question the existence of "goal affirmation" once a team scores a goal.  The most significant result is the existence of a Poisson distribution in the expected goal difference in a match.

For several months I've been working intermittently on an investigation of whether goalscoring distributions are Poisson in nature.  My motivation was that whenever I presented my work on the soccer Pythagorean, invariably I would receive questions asking if I had considered a Poisson goalscoring distribution.  There is some literature out there that questions the use of Poisson distributions for goals scored by a team over the course of a season, and when I attempted to derive the Pythagorean from a Poisson distribution I could never get the expression to work properly. 

The paper that I will discuss touches on a number of topics in soccer goal distributions, from the existence of "self-affirmation" (the football fever that some researchers have mentioned) to the influence of team quality relative to random factors in football results.  Most importantly, the paper will serve as a springboard for presenting some tests on Poisson distributions in the football data that I have.

The authors of this paper are researchers in physical and organic chemistry at a university in Münster, Germany.  The lead writer is Dr. Andreas Heuer, whose research spans experimental and computational work in physical chemistry with an interesting side hobby in sport statistics.  Naturally, his work in sport statistics has generated more media attention than his physics research!  This particular paper was published in Europhysics Letters, which is a journal devoted to publishing brief papers on very recent results (Physical Review Letters in the USA is similar).  It may be only five pages long, but it is a mathematically and technically dense paper: you'll need a cup of coffee, a legal pad, and at least two readings to make sense of it.

A little bit of information about the data used for this analysis is in order.  The authors used German 1. Bundesliga data from the 1987-88 through 2007-08 seasons, excluding 1991-92, which was the reunification season when the Bundesliga had 20 teams.  They did not use data from before 1987 because of what they described as a significant difference in the goal distributions before then, a difference on which they do not elaborate.

There are three major results in this paper, and I'll describe them briefly.

First, team fitness levels remain constant over a season.  The authors characterize the team "fitness" (a better word might be "quality") as its average goal difference normalized by the number of matches played.  This metric is an estimate of the team's true fitness level, which would be reached only after many matches played.  By correlating a team's results with those of all of its rivals over the season (and their rivals as well), one arrives at a measure of how team quality changes over a season.  That measure fluctuates over the course of a season, yet there exists a constant bias term.  This constant term corresponds to the variance of team quality in a league.  So there are some variations in team quality, but on a macro scale that quality remains constant.
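
As a rough illustration of the fitness metric only (not the authors' correlation analysis), here is a minimal Python sketch that computes each team's average goal difference per match.  The tuple layout of `matches`, the function name `fitness_levels`, and the sample results are all assumptions made for this example; they are not taken from the paper or from real Bundesliga data.

```python
from collections import defaultdict


def fitness_levels(matches):
    """Return each team's average goal difference per match played.

    `matches` is an iterable of (home, away, home_goals, away_goals)
    tuples; this layout is an assumption for illustration.
    """
    goal_diff = defaultdict(int)
    played = defaultdict(int)
    for home, away, home_goals, away_goals in matches:
        goal_diff[home] += home_goals - away_goals
        goal_diff[away] += away_goals - home_goals
        played[home] += 1
        played[away] += 1
    return {team: goal_diff[team] / played[team] for team in played}


# Hypothetical results, not real league data.
sample = [
    ("A", "B", 2, 0),
    ("B", "C", 1, 1),
    ("C", "A", 0, 1),
]
levels = fitness_levels(sample)
print(levels)
```

Over a full season every team plays the same number of matches, so these per-team values sum to zero across the league.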

It should be noted that the authors are drawing conclusions solely from same-season data.  So it is very possible, indeed very likely, that team fitness levels change over multiple seasons, due to team turnover, relegation, or promotion.  It would be interesting to see how these fitness levels change for clubs that are either newly formed in a league (e.g. MLS) or recently promoted to a new league, and to assess how different classes of clubs (one-year wonders, elevator clubs, consolidating teams) perform differently in terms of fitness level.

The second major result is that fluctuations in fitness have short-term implications but matter little in the long run.  That result should make sense to most people, but those short-term fluctuations drive the increased number of draws and streaky runs (winning or losing) by teams.  The authors developed a simple model to characterize the match result, which they described in terms of the expected outcome between both teams based on fitness levels, the systematic influences on the match, and random factors.  The systematic influences include external effects, such as injuries, suspensions, weather, or the occasion of the match, and intra-match effects, which include match events such as expulsions or goals scored.  None of these effects can be estimated, of course, but their variance can, and the authors develop some expressions that do that.  (I am still not sure how they derived those expressions after a couple of readings; I might try again at some point in the future.)  By fitting their dataset to the model, they found that the high-order effects that they were modeling fell out, and that the variance due to fitness fluctuations was much smaller than the variance of the expected outcome.

The third major result is that while goal distributions generally aren't Poisson, goal scoring does appear to follow a Poisson process.  The distinction between "process" and "distribution" is the difference between describing the distribution of goals within a single match and describing the distribution of goals scored per match over the course of a season.  The authors develop Poisson distributions from the expected number of goals of each team (from computing the estimated goal difference and estimated goal sums), and show that the resulting distribution of the goal difference holds up very well against the actual data.  The distribution of the actual data does not spread out for lopsided goal differences, which challenges the existence of the goal-affirmation phenomenon proposed by Bittner et al.  The Poisson model of the goal difference fits well except for ties and minimum-goal differences, which together make up a huge proportion of soccer match results.  The issue is that the goals scored by home and away teams are slightly dependent, and it is that slight statistical dependence that accounts for the narrow results and draws.  (Tied matches with more than six goals are more in line with the statistical independence assumption.)
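
To make the independence baseline concrete, here is a minimal Python sketch of the goal-difference distribution that results when home and away goals are treated as two independent Poisson variables.  The scoring rates (1.6 and 1.2), the truncation at 15 goals per team, and the function names are assumptions chosen for illustration; the paper derives its Poisson parameters from estimated goal differences and goal sums, which this sketch does not attempt to reproduce.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals for a Poisson rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def goal_difference_pmf(lam_home, lam_away, max_goals=15):
    """Distribution of (home goals - away goals) under independence.

    Sums the joint probability of every scoreline up to `max_goals`
    per team; the truncation point is an assumption that is harmless
    for typical soccer scoring rates.
    """
    pmf = {}
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            pmf[h - a] = pmf.get(h - a, 0.0) + p
    return pmf


# Hypothetical expected goals for the home and away sides.
dist = goal_difference_pmf(1.6, 1.2)
p_draw = dist[0]
mean_diff = sum(d * p for d, p in dist.items())
print(f"P(draw) = {p_draw:.3f}, mean goal difference = {mean_diff:.3f}")
```

Comparing `dist[0]`, `dist[1]`, and `dist[-1]` against observed frequencies of draws and one-goal results is the kind of check where, per the paper, the independence assumption shows its limits.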

In summary, the class of a team does come out over the course of a league season, and random variations in the teams account for a substantial number of results in soccer, in particular the narrow results and the 0-0, 1-1, and 2-2 draws.  Because the number of goals (points) and scoring opportunities is so low in soccer compared to other sports, random effects are much more significant.  The most interesting description of a soccer match that I've read comes at the end of this paper when the authors state:

"…a soccer match is equivalent to two teams throwing a dice. The
number 6 means goal and the number of attempts of both
teams is fixed already at the beginning of the match, reflecting
their respective fitness in that season."
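
The dice picture in that quote can be sketched in a few lines of Python.  The attempt counts (10 and 7), the seed, and the function name `simulate_match` are hypothetical values picked for this example; the paper does not give specific numbers of attempts.

```python
import random


def simulate_match(attempts_home, attempts_away, rng):
    """Each team rolls a die a fixed number of times; a 6 is a goal."""
    home = sum(rng.randint(1, 6) == 6 for _ in range(attempts_home))
    away = sum(rng.randint(1, 6) == 6 for _ in range(attempts_away))
    return home, away


rng = random.Random(0)  # fixed seed so the run is repeatable
results = [simulate_match(10, 7, rng) for _ in range(10_000)]
mean_home = sum(h for h, _ in results) / len(results)
draws = sum(h == a for h, a in results) / len(results)
print(f"mean home goals = {mean_home:.2f}, share of draws = {draws:.2f}")
```

With a fixed number of attempts the goal count for each team is binomial rather than exactly Poisson, but for a small success probability the two are close, which is why the dice analogy and the Poisson description sit comfortably together.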

If you're willing to brave the highly concentrated mathematics, the abuse of notation, and the hand-waving (necessary in a five-page paper), the paper has plenty to chew on for those who like to think about how seemingly random a goal is in football.  It also illustrates how much of a fool's errand it is to predict the exact score of a football match, but it doesn't stop millions from attempting to do so, for which the betting houses are grateful.