Filter Bubble: Network Analysis of Football Strategies

Moneyball. The sports film that made you go giddy over numbers. The film that made you root for the overweight, bespectacled Yale economics graduate over the athletic baseball players around him. Anyone remotely interested in sports and analytics has surely watched this gem of a movie and thought about using data to better understand their own favorite sport. Football is something of a late adopter in this respect. But the recent case of Midtjylland, which went on to win the Danish Superliga on the back of analytics, embodies the change that is afoot.

Basic statistical analysis has been used in football for quite some time. Statistics like possession percentage, shots on goal, and assists, while useful, fall short of capturing the immense complexity of this beautiful game. A plethora of footballing strategies abound, each leading to different stats and making a universal analysis difficult. Yet all the strategies, and indeed the entire game, have one thing in common: passes. As the much-abused adage goes, football is a passing game! So why not study how a team passes the ball to study the team’s strategy?

To understand the passing strategy of a team, players are modeled as nodes in a passing network, and the passes between them as edges. The weight of an edge between two players is proportional to the number of passes between them. Aggregating the passing data over a game or a set of games gives a network that describes the team’s overall passing strategy, while the temporal evolution of the network shows how strategies shift over the course of a game. We look at both of these analyses in the following sections.
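The construction described above can be sketched in a few lines of Python. This is only an illustration: the pass log is made up, the function name `build_passing_network` is hypothetical, and treating a pass from A to B as a directed edge (separate from B to A) is an assumption rather than something the text fixes.

```python
from collections import Counter

def build_passing_network(passes):
    """Count passes per (passer, receiver) pair; the count is the edge weight."""
    return Counter(passes)

# Hypothetical pass log for illustration only, not real match data.
passes = [
    ("Xavi", "Iniesta"), ("Iniesta", "Xavi"), ("Xavi", "Iniesta"),
    ("Busquets", "Xavi"), ("Xavi", "Villa"),
]

network = build_passing_network(passes)
# Nodes are all players who appear as passer or receiver.
players = sorted({player for edge in network for player in edge})

for (passer, receiver), weight in sorted(network.items()):
    print(f"{passer} -> {receiver}: {weight}")
```

Aggregating over several games amounts to concatenating their pass logs before counting; restricting the log to a time window gives the network for that phase of the match.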

The Aggregated Passing Network

Pena and Touchette [1] do excellent work modeling the aggregated passing network in their paper on network-theory analysis of international football teams. They use data from the 2010 FIFA World Cup, consisting of an aggregated record of all the passes made by every player of every team that participated in the competition. Based on this data, Pena and Touchette constructed a passing network as described earlier. Some of their visual representations are shown in Figure 1. These visualizations oversimplify a football game, as players do not remain in static positions during a match. However, they do a great job of providing immediate insight into a team’s tactics.

Figure 1: Passing networks for the Netherlands and Spain drawn before the final game, using the passing data and tactical formations of the semi-finals. (Source: Pena and Touchette [1])

Pena and Touchette then go on to analyze the importance and roles of different players in the passing network. They do so by calculating various centrality measures for all the players. The rationale behind the use of the various parameters is described below:

1. Closeness Centrality: The closeness centrality, or closeness, of a player is quite a simple notion of centrality. High closeness centrality means that the node is at a very short average geodesic distance from all the other nodes in the network. Thus, a higher closeness centrality means that the player is more available to receive a pass from any teammate. We would therefore expect a high closeness centrality for central midfielders and a low one for strikers. Analyzing the closeness centrality might indicate whether a player has been completely cut off from the game by his own movement or by the opposition defence. For example, Xavi, being a central midfielder, will usually have a high closeness centrality. Its value might be reduced, however, if the opposing team man-marks him to nullify his influence on the game.

Figure 2: Closeness centrality (C_i) for different players of the Spanish national team. Note that the two central midfielders, Xavi and Busquets, have the highest closeness centrality, while the forward Pedro has a very low closeness centrality. (Source: Pena and Touchette [1])

2. Betweenness Centrality: Betweenness centrality measures the extent to which a player lies on the shortest paths between other pairs of players. A player with a very high betweenness centrality is thus pivotal to the entire game-play of the team.

The average betweenness centrality of a team provides a good measure of the game-play that the team adopts. For example, in the Spanish team, with its famous tiki-taka passing game, multiple paths exist between any two players and no player is much more important than the others. The Spanish team therefore has a very low average betweenness centrality. In contrast, a team like Portugal, which relies more on the brilliance of individual players than on a coherent passing philosophy, has a high average betweenness centrality.

Figure 3: Average betweenness centrality is quite low for a passing team like Spain. (Source: Pena and Touchette [1])

3. PageRank Centrality: PageRank centrality roughly assigns to each player the probability that he will have the ball after a reasonable number of passes has been made. Since central midfielders are more likely to have the ball after a certain number of passes, Xavi and Busquets were observed to have a high PageRank centrality, in line with their high closeness centrality.
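To make two of these measures concrete, here is a minimal sketch of closeness and PageRank on a small made-up network. Several choices are assumptions and not necessarily the definitions used in [1]: the length of an edge is taken as 1 / (number of passes), closeness is the inverse of the mean shortest distance from a player to his teammates along outgoing passes, and PageRank uses the standard damping factor of 0.85. The pass counts are invented for illustration.

```python
import heapq

def shortest_distances(weights, source):
    """Dijkstra on a directed graph; edge length is 1 / pass count (assumption)."""
    adjacency = {}
    for (u, v), w in weights.items():
        adjacency.setdefault(u, []).append((v, 1.0 / w))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in adjacency.get(u, []):
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def closeness(weights, players):
    """Inverse of the mean shortest distance from each player to all teammates."""
    scores = {}
    for p in players:
        dist = shortest_distances(weights, p)
        others = [q for q in players if q != p]
        if any(q not in dist for q in others):
            scores[p] = 0.0  # player cannot reach every teammate
        else:
            scores[p] = len(others) / sum(dist[q] for q in others)
    return scores

def pagerank(weights, players, damping=0.85, iterations=100):
    """Power iteration for PageRank on the weighted, directed passing network."""
    out_total = {p: 0 for p in players}
    for (u, _), w in weights.items():
        out_total[u] += w
    n = len(players)
    rank = {p: 1.0 / n for p in players}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in players}
        for (u, v), w in weights.items():
            new[v] += damping * rank[u] * w / out_total[u]
        # Players with no outgoing passes spread their rank evenly.
        dangling = sum(rank[p] for p in players if out_total[p] == 0)
        for p in players:
            new[p] += damping * dangling / n
        rank = new
    return rank

# Hypothetical pass counts, not taken from the World Cup data.
weights = {
    ("Xavi", "Busquets"): 10, ("Busquets", "Xavi"): 10,
    ("Xavi", "Iniesta"): 8, ("Iniesta", "Xavi"): 8,
    ("Xavi", "Pedro"): 2, ("Pedro", "Xavi"): 2,
}
players = ["Busquets", "Iniesta", "Pedro", "Xavi"]

close = closeness(weights, players)
rank = pagerank(weights, players)
```

In this toy network every pass goes through Xavi, so he comes out on top on both measures, while Pedro, who exchanges the fewest passes, comes out last. To measure how easily a player can be reached as a receiver instead, the same closeness function can be run on the network with every edge reversed.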

Thus several conclusions can be drawn about a team’s strategy from its passing network. This strategy, however, does not stay constant. As the game progresses over 90 minutes, a team goes through different phases in which it might adopt different strategies. It might go for a full-frontal attack during the early stages of the game and try to defend its lead later on. It might press high up the pitch initially but be unable to sustain the same pressure later as players run out of stamina. We now turn to a temporal analysis of the passing network that tries to capture these changes in the flow of a match.

Evolution of the Passing Network over the Course of a Match

Cotta et al. [2] explored the evolution of the passing network of the Spanish national team during three of its knock-out games at the 2010 FIFA World Cup. The data they required was far more fine-grained: a minute-by-minute description of the passes. Since FIFA did not make this data available, Cotta et al. recorded it manually from television replays. Besides the players between whom each pass was exchanged, they also noted the part of the field from which the pass was made and the part where it was received. The resulting dataset was consequently very rich.

The main aim of the study by Cotta et al. was to examine how effective the opposing team was in nullifying the Spanish tiki-taka. Tiki-taka is characterized by a very patient build-up play, with a series of consecutive passes without losing possession. Thus the ‘number of consecutive passes’ was the major parameter considered in the study. Let us go through their analysis of one of the matches, Paraguay vs Spain, to appreciate their work better.

Spain played Paraguay in the quarter-final stage of the tournament. Paraguay was aware that it was outclassed by the silky Spanish team. Its main strategy was to press high up the pitch and disrupt the passing flow of the Spanish team. Paraguay succeeded quite well in this endeavor in the first half as well as through large swathes of the second half. Spain were completing only 3-4 consecutive passes on average.

However, as the game progressed, the Paraguayan team started to tire and Spain started to capitalize. Xavi and Xabi Alonso emerged as the most central nodes in the Spanish passing network and started to dictate the flow of the game. There was a marked increase in the number of consecutive passes that the Spanish team could put together. The trend continued and Spain finally had a breakthrough in the 83rd minute, as a potentially offside David Villa put the ball into the Paraguayan net.

Figure 4: Mean consecutive passes vs time during the first and the second halves of the Spain vs Paraguay game from 2010. Note the marked increase in the number of mean consecutive passes midway through the
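The ‘mean consecutive passes’ measure can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in [2]: the event format (`minute`, `outcome`), the outcome labels "pass" and "loss", the 15-minute window and both function names are hypothetical, and the events are invented.

```python
def pass_sequences(events):
    """Split (minute, outcome) events into runs of consecutive completed passes.

    outcome is "pass" for a completed pass and "loss" when possession is lost.
    Returns a list of (start_minute, run_length) tuples.
    """
    sequences = []
    start, length = None, 0
    for minute, outcome in events:
        if outcome == "pass":
            if length == 0:
                start = minute  # a new run begins
            length += 1
        else:
            if length > 0:
                sequences.append((start, length))
            length = 0
    if length > 0:  # run still open at the final whistle
        sequences.append((start, length))
    return sequences

def mean_run_length_by_window(sequences, window=15):
    """Average run length, grouped by the time window in which each run starts."""
    buckets = {}
    for start, length in sequences:
        buckets.setdefault(start // window * window, []).append(length)
    return {w: sum(v) / len(v) for w, v in sorted(buckets.items())}

# Hypothetical events for illustration only.
events = [
    (1, "pass"), (1, "pass"), (2, "pass"), (2, "loss"),
    (5, "pass"), (6, "loss"),
    (70, "pass"), (70, "pass"), (71, "pass"),
    (71, "pass"), (72, "pass"), (72, "pass"), (73, "loss"),
]

sequences = pass_sequences(events)
means = mean_run_length_by_window(sequences)
print(means)
```

Short runs early on and longer runs late in the match would show up as a rising mean across windows, which is the kind of pattern the figure describes.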

We have explored a few ways of using the passing network to study the myriad strategies of various football teams. While there is much speculation about the use of such techniques by professional football teams, a lack of free data makes it difficult for any curious soul to look into the methods more rigorously. Passing networks, however, are a valuable way of analyzing any footballing strategy. A deeper dive into the intricacies of the passing network will help us better understand the beautiful game and maybe even enhance it!

References:

[1] Pena, J. L., Touchette, H., “A network theory analysis of football strategies”, 2012, Euromech Physics of Sports Conference, Editions de l’Ecole Polytechnique.

[2] Cotta, C., Mora, A. M., Merelo-Molina, C., Merelo, J. J., “FIFA World Cup 2010: A Network Analysis of the Champion Team Play”, 2011, Journal of Systems Science and Complexity.

Sunday, 20 March 2016