Moneyball.
The sports film that made you go gaga over numbers. The film that made you
root for the overweight, bespectacled Yale economics graduate over the
athletic baseball players around him. Anyone remotely interested in sports and
analytics has surely watched this gem of a movie and thought about using data
to better understand their own favorite sport. Football is something of a late
adopter in this respect. But the recent case of Midtjylland, who went on to win
the Danish Superliga on the back of analytics, embodies the change that is
afoot.
Basic statistical analysis has been used in football for quite some time.
Statistics like possession percentage, shots on goal, assists, etc., while
useful, fall short of capturing the immense complexity of this beautiful game.
A plethora of footballing strategies abound, each leading to different stats
and making a universal analysis difficult. Yet all the strategies, and indeed
the entire game, have one thing in common: passes. As the much-abused adage
goes, football is a passing game! So why not study how a team passes the ball
to understand the team’s strategy?
To understand the passing strategy of a team, players are modeled as nodes in
a passing network, while the passes themselves are the edges connecting them.
The weight of an edge between any two players is proportional to the number of
passes between them. Aggregating the passing data over a game or a set of games
gives a network that reflects the overall passing strategy of a team. The
temporal evolution of the passing network, in turn, can be used to study how
strategies shift over the course of a game. We look at both of these analyses
in the following sections.
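As a minimal sketch of the construction described above, the snippet below
builds a weighted, directed passing network from a list of passes. The pass log
and the player labels (CM1, CM2, FW1, FW2) are invented for illustration and
are not taken from any real match.

```python
from collections import defaultdict

# Hypothetical pass log: (passer, receiver) pairs, invented for illustration.
pass_log = [
    ("CM1", "CM2"), ("CM2", "CM1"), ("CM1", "FW1"),
    ("CM1", "CM2"), ("FW1", "CM1"), ("CM2", "FW2"),
]


def build_passing_network(log):
    """Return network[passer][receiver] = number of passes (the edge weight)."""
    network = defaultdict(lambda: defaultdict(int))
    for passer, receiver in log:
        network[passer][receiver] += 1
    return network


network = build_passing_network(pass_log)

# Print every directed edge with its weight.
for passer, targets in network.items():
    for receiver, count in targets.items():
        print(f"{passer} -> {receiver}: {count}")
```

Aggregating over several games would simply mean feeding the combined pass
logs into the same function.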
The Aggregated Passing Network
Pena and Touchette [1] do an excellent job of modeling the aggregated passing
network in their paper on network-theory analysis of international football
teams. They use data from the 2010 FIFA World Cup: an aggregated record of all
the passes made by every player of every team that participated in the
competition. Based on this data, Pena and Touchette constructed passing
networks as described earlier. Some of their visual representations are shown
in figure 1. These visualizations oversimplify a football game, as players do
not remain in static positions during a match. However, they do a great job of
providing immediate insight into a team’s tactics.
Figure 1: Passing networks for the Netherlands and Spain drawn before the final
game, using the passing data and tactical formations of the semi-finals.
(Source: Pena and Touchette [1])
Pena and Touchette then go on to analyze the importance and roles of different
players in the passing network. They do so by calculating various centrality
measures for all the players. The rationale behind each measure is described
below:
1. Closeness Centrality: The closeness centrality, or closeness, of a player is
a fairly simple notion of centrality. A high closeness centrality means that
the node is at a short average geodesic distance from all the other nodes in
the network. Thus, a higher closeness centrality means that the player is more
available to receive a pass from any teammate. In principle, this gives central
midfielders a high closeness centrality and strikers a low one. Analyzing the
closeness centrality might indicate whether a player has been completely cut
off from the game by his own movement or by the opposition defence. E.g., Xavi,
being a central midfielder, will usually have a high closeness centrality. His
closeness centrality might be reduced, however, if the opposing team tries to
man-mark him and nullify his influence on the game.
Figure 2: Ci (closeness centrality) for different players of the Spanish
national team. Note that the two central midfielders Xavi and Busquets have the
highest closeness centrality, while the forward Pedro has a very low closeness
centrality. (Source: Pena and Touchette [1])
2. Betweenness Centrality: Betweenness centrality measures the extent to which
a player lies on the shortest path between two other players. A player with a
very high betweenness centrality is thus pivotal to the entire game-play of the
team.
The average betweenness centrality of a team provides a good measure of the
game-play that the team adopts. E.g., in the Spanish team, with its famous
passing game of tiki-taka, multiple paths exist between any two players and no
player is much more important than the others. The Spanish team therefore has a
very low average betweenness centrality. In contrast, a team like Portugal,
which relies more on the brilliance of individual players than on a coherent
passing philosophy, has a high average betweenness centrality.
Figure 3: Average betweenness centrality is quite low for a passing team like
Spain. (Source: Pena and Touchette [1])
3. PageRank Centrality: PageRank centrality roughly assigns to each player the
probability that he will have the ball after a reasonable number of passes has
been made. Since central midfielders are more likely to have the ball after a
given number of passes, Xavi and Busquets were observed to have a high PageRank
centrality, in line with their high closeness centrality.
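The three measures above can be sketched in a few lines of plain Python. This
is an illustrative simplification, not the exact formulas of Pena and
Touchette: the pass counts are invented, the length of an edge is assumed to be
1/weight (so frequent passing partners are "close"), closeness is taken as the
inverse of the mean distance from each teammate to the player, and betweenness
simply counts appearances inside one shortest path per ordered pair (adequate
here because the toy data has no tied shortest paths).

```python
import heapq

# Invented pass counts for a four-player toy team: passes[a][b] = passes a -> b.
passes = {
    "CM1": {"CM2": 12, "FW1": 4, "FW2": 3},
    "CM2": {"CM1": 14, "FW1": 2, "FW2": 1},
    "FW1": {"CM1": 3, "FW2": 3},
    "FW2": {"CM1": 2, "FW1": 1},
}
players = list(passes)


def shortest_paths(source):
    """Dijkstra from `source`, using 1/weight as edge length (an assumption)."""
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in passes[u].items():
            nd = d + 1.0 / w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev


paths = {p: shortest_paths(p) for p in players}

# Closeness: inverse of the mean distance from every teammate to the player.
closeness = {
    p: (len(players) - 1) / sum(paths[q][0][p] for q in players if q != p)
    for p in players
}

# Betweenness: how often a player sits strictly inside another pair's path.
betweenness = dict.fromkeys(players, 0)
for s in players:
    _, prev = paths[s]
    for t in players:
        if t == s:
            continue
        node = prev[t]
        while node != s:
            betweenness[node] += 1
            node = prev[node]


def pagerank(damping=0.85, iterations=100):
    """Power iteration: follow a pass with prob. `damping`, else restart uniformly."""
    rank = dict.fromkeys(players, 1.0 / len(players))
    for _ in range(iterations):
        new = dict.fromkeys(players, (1.0 - damping) / len(players))
        for u in players:
            total = sum(passes[u].values())
            for v, w in passes[u].items():
                new[v] += damping * rank[u] * w / total
        rank = new
    return rank


ranks = pagerank()
for p in players:
    print(p, round(closeness[p], 2), betweenness[p], round(ranks[p], 3))
```

On this toy data the midfielder CM1 comes out on top of all three measures,
which matches the intuition that a central midfielder is both easy to reach and
hard to bypass. A real analysis would use shortest-path counting that handles
ties, or a graph library.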
Thus several conclusions can be drawn about a team’s strategy based on its
passing network. This strategy is, however, not uniform. As the game progresses
over 90 minutes, a team goes through different phases in which it might adopt
different strategies. It might go for a full-frontal attack during the early
stages of the game and try to defend its lead later on. It might press high up
the pitch initially but fail to sustain the same pressure later on as players
run out of stamina. We now turn to a temporal analysis of the passing network
that tries to capture these changes in the flow of a match.
Evolution of the Passing Network over the Course of a Match
Cotta et al. [2] explored the evolution of the passing network of the Spanish
national team during three of the knock-out phase games of the 2010 FIFA World
Cup. The data they required was much more fine-grained: minute-by-minute
descriptions of the passes were necessary. Since this data was not made
available by FIFA, Cotta et al. recorded it manually using television replays.
Besides recording the players between whom the passes were exchanged, they also
noted the part of the field from which each pass was made and the part of the
field where it was received. The resulting dataset was consequently very rich.
The main aim of the study by Cotta et al. was to assess how effective the
opposing team was in nullifying the Spanish tiki-taka. Tiki-taka is
characterized by a very patient build-up play, with a series of consecutive
passes without losing possession. Thus the ‘number of consecutive passes’ was
the major parameter considered in the study. Let us go through their analysis
of one of the matches, Paraguay vs Spain, to appreciate their work better.
Spain played Paraguay in the quarter-final stage of the tournament. Paraguay
was aware that it was outclassed by the silky Spanish team. Its main strategy
was to press high up the pitch and disrupt the passing flow of the Spanish
team. Paraguay succeeded quite well in this endeavor in the first half as well
as through large swathes of the second half. The Spanish team was completing
only 3-4 consecutive passes on average.
However, as the game progressed, the Paraguayan team began to tire and Spain
started to capitalize. Xavi and Xabi Alonso emerged as the most central nodes
in the Spanish passing network and started to dictate the flow of the game.
There was a marked increase in the number of consecutive passes that the
Spanish team could put together. The trend continued and Spain finally had a
breakthrough in the 83rd minute as a potentially offside David Villa slashed
the ball into the Paraguayan net.
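The ‘number of consecutive passes’ measure tracked above can be sketched as
follows. This is only an illustration under stated assumptions: the event log
format, the event labels ("pass", "loss") and all the numbers are invented, and
Cotta et al. may well have windowed and averaged their data differently.

```python
from collections import defaultdict

# Hypothetical event log for one team: (minute, event), where "pass" is a
# completed pass and "loss" ends the possession. All values are invented.
events = [
    (2, "pass"), (2, "pass"), (3, "loss"),
    (10, "pass"), (10, "pass"), (11, "pass"), (11, "loss"),
    (50, "pass"), (50, "pass"), (51, "pass"), (51, "pass"), (52, "pass"),
    (52, "loss"),
    (70, "pass"), (71, "pass"), (71, "pass"), (72, "pass"), (72, "pass"),
    (73, "pass"), (73, "pass"), (74, "loss"),
]


def mean_consecutive_passes(log, window=45):
    """Mean length of pass chains per time window (window size in minutes).

    A chain is assigned to the window in which possession was lost; a chain
    still running when the log ends is ignored.
    """
    chains = defaultdict(list)
    run = 0
    for minute, event in log:
        if event == "pass":
            run += 1
        else:
            chains[minute // window].append(run)
            run = 0
    return {w: sum(c) / len(c) for w, c in sorted(chains.items())}


per_half = mean_consecutive_passes(events)
print(per_half)  # window 0 is the first half, window 1 the second
```

Using a smaller window (say, 10 or 15 minutes) gives a finer picture of when
the pressing side starts to fade.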
Figure 4: Mean consecutive passes vs time during the first and the second
halves of the Spain vs Paraguay game from 2010. Note the marked increase in the
number of mean consecutive passes midway through the
We have explored a few ways of using the passing network to study the myriad
strategies of various football teams. While there is great speculation about
the use of such techniques by mainstream football teams, a lack of free data
makes it difficult for any curious soul to look into the methods in a more
rigorous manner. Passing networks are, however, a valuable way of analyzing any
footballing strategy. A deeper dive into the intricacies of the passing network
will help us better comprehend the beautiful game, and maybe even enhance it!
References:
[1] Pena, J. L., Touchette, H., “A network theory analysis of football
strategies”, 2012, Euromech Physics of Sports Conference, Editions de l’Ecole
Polytechnique.
[2] Cotta, C., Mora, A. M., Merelo-Molina, C., Merelo, J. J., “FIFA World Cup
2010: A Network Analysis of the Champion Team Play”, 2011, Journal of Systems
Science and Complexity.