Which players have been most influential in Argentina’s Primera División 2016-17?

In addition to expected goals, passing network analysis has become a key element in soccer analytics.  I’ve written about network analysis in a couple of Paper Discussions, but haven’t gotten around to applying the concepts to data from a football competition.  I’ll start now with an analysis of which players were most influential in the recently completed Primera División season in Argentina.

Graphs and Networks

To understand which players have the greatest influence in their teams’ passing networks, let’s borrow some concepts from graph analysis.  A graph in this case is a collection of points (nodes in mathematical parlance) that are connected by edges that represent some kind of relationship between them.  A graph is a mathematical way of representing a network; a network is a graph with real-world descriptions of the nodes and edges.

Here’s an example of a passing network, from the match that gave the Argentine league title to Boca Juniors:

Passing network of San Lorenzo in the San Lorenzo vs Banfield match, Primera División Argentina, 2016-17 season, matchday 29. Data from DataFactory LatAm.

In this network, circular nodes represent the starting players, and square nodes the players substituted into the match.  The size of the node represents the number of passing touches of the player, and the thickness of the edges between players represents the total number of completed passes between them.  (If fewer than four passes were made between two players, that line is colored gray.)
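To make the construction concrete, here is a minimal sketch of how the edge weights and node sizes could be tallied from a list of completed passes.  The pass list is made up for illustration (it is not DataFactory data), the function names are my own, and I am assuming that a "passing touch" means being on either end of a completed pass.

```python
from collections import Counter

# Hypothetical completed passes as (passer, receiver) shirt numbers.
# These values are invented for illustration only.
passes = [(20, 10), (10, 20), (20, 10), (20, 5), (5, 20), (20, 10), (5, 10)]

def build_edges(passes):
    """Count completed passes between each pair of players, ignoring direction."""
    edges = Counter()
    for passer, receiver in passes:
        edges[frozenset((passer, receiver))] += 1
    return edges

def passing_touches(passes):
    """Count how often each player is on either end of a completed pass
    (an assumed definition of 'passing touches')."""
    touches = Counter()
    for passer, receiver in passes:
        touches[passer] += 1
        touches[receiver] += 1
    return touches

edges = build_edges(passes)        # edge thickness in the plot
touches = passing_touches(passes)  # node size in the plot

# Pairs with fewer than four passes between them would be drawn in gray.
gray_pairs = {pair for pair, count in edges.items() if count < 4}
```

The pair counts double as the entries of a symmetric adjacency matrix, which is the form needed for the centrality calculation later on.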

It’s clear that the #20 player — Néstor Ortigoza — has seen more of the ball than any of his teammates.  (I pointed this out on Twitter last month and received a massive response from San Lorenzo supporters who consider Ortigoza a symbol of the team.) Does the math confirm his apparent level of influence on this match?

The Concept of Centrality

To answer that question, let’s borrow some more tools from network analysis, namely, the idea of centrality.

Centrality is a measure of the relative importance of a node within a network.  There is a diverse collection of centrality metrics, many of which owe their origins to research in social networks.  I have settled on the eigenvector centrality metric, which expresses the importance of a node by observing its connections to other important nodes.  This idea is expressed in mathematics as

\[
c(v_i) \propto \sum_{j=1}^N a_{ij} c(v_j)
\]

where \(c(\cdot)\) represents the centrality metric, \(v_i\) the \(i\)-th node within the network, \(N\) the number of nodes, and \(a_{ij}\) the element of the adjacency matrix, or to relate back to passing network analysis, the number of successful passes between players \(i\) and \(j\).  If you place the above equation in matrix form you get

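\[
A\mathbf{c} = \lambda \mathbf{c}
\]

where \(\mathbf{c}\) is the vector whose elements are the centralities \(c(v_i)\) and \(\lambda\) is the constant of proportionality.  This is an eigenvalue problem, from which the centrality metric receives its name, and the centrality quantities are the elements of the eigenvector associated with the largest eigenvalue of \(A\).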
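For readers who want to see the calculation, here is a minimal sketch that finds the eigenvector of the largest eigenvalue by power iteration, using only the standard library.  The 3x3 pass matrix is invented for illustration, the function name is my own, and scaling the result to unit length is an assumption about normalization rather than a statement of how the figures in this post were produced.

```python
import math

def eigenvector_centrality(A, iterations=1000, tol=1e-10):
    """Power iteration on a symmetric, non-negative adjacency matrix.

    Repeatedly applies A to a trial vector and rescales it to unit length;
    for a connected, non-bipartite network this settles on the eigenvector
    of the largest eigenvalue.
    """
    n = len(A)
    c = [1.0 / math.sqrt(n)] * n  # start with equal importance for everyone
    for _ in range(iterations):
        nxt = [sum(A[i][j] * c[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            raise ValueError("adjacency matrix has no passes")
        nxt = [x / norm for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, c)) < tol:
            return nxt
        c = nxt
    return c

# Hypothetical pass counts between three players (symmetric, zero diagonal).
A = [
    [0, 4, 2],
    [4, 0, 1],
    [2, 1, 0],
]
c = eigenvector_centrality(A)
```

In this toy example player 0, who exchanges the most passes with the other two, ends up with the highest centrality.  A linear algebra library's eigenvalue solver would do the same job in one call; power iteration is used here only to keep the sketch dependency-free.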

Now, centrality in football doesn’t mean that the player with the highest metric is the best player, or even the most effective player.  It does, however, give an idea of which players are most important to the team’s distribution of the ball during the match. It is very possible that such players also correspond to the “best” players, but that’s not necessarily true and it’s important to be aware of that.

All right, caveats over.  Let’s see what results we have.

Centrality in Argentine Primera

The complete list of eigenvector centralities for every player on all 30 teams in this season’s Primera will be placed in Soccermetrics’ Project Data repository on GitHub.  I’ll highlight the centrality scores of a few of the teams.

Boca Juniors

The eigenvector centralities of the champions are displayed in the figure below.  The centralities are split by the position that the player was given in the lineup, so some players are listed multiple times (for example, Carlos Tevez and Ricardo Centurión).  It’s probably not much of a surprise that Boca’s play was heavily influenced by Fernando Gago and Carlos Tevez, two players who had significant European careers.  Tevez of course left for China at the end of 2016, but Boca had other players who were just as influential in Pablo Pérez in the center of the field and Cristian Pavón up front.  Darío Benedetto, who led the league in goalscoring, didn’t have a very high centrality score, but his role was different: he was tasked with placing the ball in the net, which he was very successful at doing.

Eigenvector centrality of Boca Juniors players, 2016-17 Argentina Primera División. Data sourced from DataFactory LatAm.

San Lorenzo

I had done some qualitative analysis of player centrality in San Lorenzo’s passing networks; now, I wanted to take a more quantitative approach.  Below are the eigenvector centralities for the players who appeared in league play.

The one match that I observed at the close of the season wasn’t a fluke: Néstor Ortigoza really was that important to San Lorenzo’s play.  His eigenvector centrality of 0.46 is, to the best of my knowledge, the highest in the division.  No other player is as influential to his team’s play as Ortigoza.  That is what makes finding a replacement so difficult, and it leaves Ciclón followers more than a little uneasy.

Fernando Belluschi is the closest player to Ortigoza in terms of influence (0.40), which isn’t too surprising given his pedigree. Franco Mussis (0.39) isn’t far behind.

Eigenvector centrality of San Lorenzo players, 2016-17 Argentina Primera División. Data sourced from DataFactory LatAm.

Sarmiento

I wanted to see what the centralities of a relegated team looked like, and Sarmiento’s list gives an indication of which players are most likely to move on to other Primera clubs.  As was the case with most of the teams in the division, the midfielders have the highest levels of influence.  The players with the highest centralities by position are Javier Burrai in goal, Guillermo Cosaro in defense, Gervásio Núñez and Walter Busse in the midfield, and Brian Fernández up front.  All of these players except Burrai have since left the club; Cosaro and Fernández have returned to their original clubs (Talleres and Racing, respectively) upon the conclusion of their loan periods, and Núñez and Busse have moved to Atlético Tucumán.  Basically there are a few players of great influence in the club, and those players are the first to leave when the club is forced down a tier.

Eigenvector centralities of Sarmiento players, 2016-17 Argentina Primera División. Data sourced from DataFactory LatAm.

Atlético Tucumán

Atlético Tucumán’s centralities appear to be more balanced than those of most of their competitors.  Among the players who have appeared in more than nine matches, the eigenvector centralities differ by 0.04 to 0.10 within a given position.  El Décano are also different in that their most influential players (nine appearances or more) are defenders: Fernando Evangelista (0.33) and Leonel Di Plácido (0.32). Guillermo Acosta is the only other player with a centrality score greater than 0.30. Evangelista left the club at the conclusion of his loan deal, Di Plácido thought he was going to leave but didn’t read his contract closely enough, and Acosta will stay on with the team and will likely become captain for the 2017-18 season.

Eigenvector centralities of Atlético Tucumán players, 2016-17 Argentina Primera División. Data sourced from DataFactory LatAm.

Conclusion

Eigenvector centrality provides some insight into which players are most influential on a team’s play.  It’s an imperfect picture as currently constructed (average positions and passes stripped of temporal context have limited value), but its findings do correspond well with qualitative observations.  One direction for future work involves the creation of time-varying networks (one such presentation will be the subject of a future Paper Discussion); another involves the use of betweenness centrality, which measures how often a player lies on the shortest passing paths between teammates and so indicates which players are most critical to the success of the team’s passing network.
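As a taste of what the betweenness calculation might look like, here is a minimal sketch of Brandes' algorithm on an unweighted, undirected network.  The four-player network and the function name are hypothetical, and ignoring pass counts is a simplification: a real passing-network analysis would probably weight edges (for example, treating frequent passing links as shorter paths).

```python
from collections import deque

def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    `graph` maps each node to the set of its neighbors.  The score of a node
    is the number of shortest paths between other pairs that pass through it,
    with ties split in proportion to the number of shortest paths.
    """
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}   # -1 marks an unvisited node
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical network: player 20 links to 5, 10 and 9; 10 and 9 also link.
network = {20: {5, 10, 9}, 5: {20}, 10: {20, 9}, 9: {20, 10}}
result = betweenness(network)
```

Here player 20 is the only route between player 5 and the other two, so he is the only one with a nonzero score.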

You can locate the centrality data on Soccermetrics’ Project Data repository on GitHub.
