D. R. Brillinger, "A Potential Function Approach to the Flow of Play in Soccer", Journal of Quantitative Analysis in Sports, 3(1): Article 3, 2007.
How does one describe mathematically the run of play in a soccer match? In this paper, the author proposes a scalar potential to describe in time and space the sequence of passes that led to Argentina's goal against Serbia & Montenegro at the 2006 World Cup. The end result is a simulation that shows in a broad sense the evolution of the play from midfield toward goal. It is proposed that this approach has applications to play analysis and modelling.
I think it was my aerodynamics background that attracted me to this paper. Attempting to understand the motions in football through mathematics is a task that many mathematicians and physicists undertake, but soccer, like sport in general, doesn't always follow neat deterministic paths. Nevertheless, David Brillinger, a professor of statistics at UC Berkeley, makes his contribution by applying scalar potential functions to one specific sequence in soccer: the 25-pass possession by Argentina that ended in a goal by Esteban Cambiasso against Serbia and Montenegro at the 2006 FIFA World Cup.
A scalar potential function is a fundamental concept in physics and engineering. As the name suggests, it is a scalar-valued function of any number of variables. One variable defines the potential along a line, two variables define it on a plane, and three define it throughout a volume. A potential function, written here as φ, is used to define a vector field F as its negative gradient:

$$\mathbf{F}(x_1, x_2, x_3) = -\nabla \phi(x_1, x_2, x_3)$$
What this equation is saying is that at a particular point (x1,x2,x3), the field F points in the direction in which the potential decreases most steeply. There is a lot more to say about certain types of vector fields and what they mean mathematically, but that isn't relevant to the conversation, and I don't want to scare any more people away with mathematical notation anyway.
Potential functions are used widely throughout physics and engineering, from fluid mechanics to electrostatics to classical mechanics. Real physical systems also involve nonconservative effects, such as friction, magnetic forces, or purely rotational motion, that a scalar potential alone cannot represent, but potential functions are still used as building blocks for modelling physical phenomena. So what does this mean for soccer? Brillinger fits potential functions to the trajectory of the ball during the play in order to obtain a picture of how the run of play developed.
There are three parts to the analysis. The first is data preparation. Brillinger considered only the ball's location during the 25-pass sequence, which he estimated from video. The method is primitive by 2011 standards: he froze the video frame at the moment each pass was initiated and used the pixel location of the ball and the frame number to estimate location and time, respectively. For a possession that lasted almost 60 seconds, this data collection sounds really brutal. No wonder the paper only covers one pass sequence.
The second part of the analysis is the calculation of the velocity along the ball's path. This is a straightforward procedure: the distance formula gives the displacement between successive ball locations, and dividing by the elapsed time gives the velocity. Knowing the velocity not only allows us to calculate the potential, it also allows us to appreciate how the tempo of the play changes as Argentina approaches the Serbia and Montenegro goal.
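The velocity step can be sketched in a few lines of Python. This is a minimal illustration, not Brillinger's code: the function name `pass_velocities` and the sample numbers are made up, and it assumes the pixel locations and frame numbers have already been converted to metres and seconds.

```python
import math

def pass_velocities(times, xs, ys):
    """Return (vx, vy, speed) between each pair of consecutive ball locations."""
    velocities = []
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        dx = xs[i + 1] - xs[i]
        dy = ys[i + 1] - ys[i]
        # Distance formula for the displacement, divided by the elapsed time.
        speed = math.hypot(dx, dy) / dt
        velocities.append((dx / dt, dy / dt, speed))
    return velocities

# Made-up locations (metres) and times (seconds), for illustration only.
times = [0.0, 2.0, 3.0]
xs = [0.0, 6.0, 6.0]
ys = [0.0, 8.0, 13.0]
velocities = pass_velocities(times, xs, ys)
```

The speed component is what shows the change in tempo: larger values mean the ball is covering more ground per second between passes.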
The third part of the analysis is the estimation of the potential function. Brillinger employs a potential function that describes the motion of the ball toward a point of attraction (in this case, the goal), with some added flexibility included through a general quadratic term. At the risk of scaring more people away, here's the function:
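A form consistent with that description, given here as an assumption rather than a quotation from the paper, combines logarithmic and linear terms in r with a general quadratic in (x, y). The coefficient names α, β, γ and the label (a_x, a_y) for the point of attraction are illustrative.

```latex
H(x, y) = \alpha \log r + \beta r
        + \gamma_1 x + \gamma_2 y
        + \gamma_{11} x^2 + \gamma_{12} x y + \gamma_{22} y^2,
\qquad
r = \sqrt{(x - a_x)^2 + (y - a_y)^2}

% Modelled velocity is the negative gradient, valid for r > 0:
v_x = -\frac{\partial H}{\partial x}
    = -\left[ \alpha \frac{x - a_x}{r^2} + \beta \frac{x - a_x}{r}
      + \gamma_1 + 2 \gamma_{11} x + \gamma_{12} y \right]

v_y = -\frac{\partial H}{\partial y}
    = -\left[ \alpha \frac{y - a_y}{r^2} + \beta \frac{y - a_y}{r}
      + \gamma_2 + \gamma_{12} x + 2 \gamma_{22} y \right]
```

Under this assumed form, both velocity components are linear in the seven coefficients, which is what makes a least-squares fit natural.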
In this function, r is the distance from the point of attraction, and (x,y) is the 2D point on the pitch. The function can be differentiated everywhere except at the point of attraction itself, where r = 0 (the distance function, like an absolute value, is not differentiable at its origin), and Brillinger uses the gradient of the potential to come up with an expression for the velocity. Then it's a matter of plugging in the velocity estimates at the observed field locations and determining the values of the parameters with a least-squares method.
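As a rough sketch of that fitting step, the Python below assumes a potential of the form H = α log r + β r + γ1 x + γ2 y + γ11 x² + γ12 xy + γ22 y². That form is my assumption about the general shape, not a quotation from the paper. The function names, the goal coordinates, and the synthetic data are all hypothetical, and this is plain unweighted least squares, which may differ from the estimator Brillinger actually used.

```python
import numpy as np

def gradient_design(x, y, goal):
    """Design matrices for dH/dx and dH/dy, linear in the 7 parameters
    (alpha, beta, g1, g2, g11, g12, g22) of the assumed potential."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - goal[0], y - goal[1]
    r = np.hypot(dx, dy)
    if np.any(r == 0):
        raise ValueError("potential is not differentiable at the point of attraction")
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    ddx = np.column_stack([dx / r**2, dx / r, ones, zeros, 2 * x, y, zeros])
    ddy = np.column_stack([dy / r**2, dy / r, zeros, ones, zeros, x, 2 * y])
    return ddx, ddy

def fit_potential(x, y, vx, vy, goal):
    """Least-squares estimate of the parameters from velocity = -grad H."""
    ddx, ddy = gradient_design(x, y, goal)
    A = -np.vstack([ddx, ddy])          # stack the x and y equations
    b = np.concatenate([vx, vy])
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params

# Synthetic check: generate noise-free velocities from known (arbitrary)
# parameters and confirm that the fit recovers them.
rng = np.random.default_rng(0)
goal = (52.5, 30.0)                     # hypothetical pitch coordinates, metres
x = rng.uniform(0.0, 50.0, size=40)
y = rng.uniform(0.0, 60.0, size=40)
true_params = np.array([10.0, 0.5, 0.1, -0.2, 0.01, 0.005, -0.02])
ddx, ddy = gradient_design(x, y, goal)
vx, vy = -ddx @ true_params, -ddy @ true_params
estimated = fit_potential(x, y, vx, vy, goal)
```

Because `np.linalg.lstsq` solves the problem through a singular value decomposition, it still returns a minimum-norm solution when the design matrix is rank deficient, which sidesteps forming and inverting a system matrix explicitly.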
The estimated potential function is shown in the plot above. The darker colors refer to smaller magnitudes of the potential. Consistent with the definition of the vector field, the play flows toward the goal, from the left flank into the penalty area. The trajectory of the ball during the goalscoring sequence is plotted for completeness; it is neat to see that the contours of the potential fall in line with the play's path. It is also interesting to note that the potential function is asymmetric; this result is due to the general quadratic terms that were added.
There are a number of directions in which one could take this paper, but in-match data are required. For example, one could look at how a team's play develops during the course of a match and perhaps determine where the play originates and where it tends to end. A related application would be a model of player influences on the play using a source/sink model. One limitation of the potential function approach is the amount of data required to fit the parameters. Also, the least-squares fit could take a long time to compute, and the system matrix may not be invertible. I need to think about the requirements of this approach some more; it's not a trivial problem.