Network Centrality Metrics in Football
Definition
Network centrality metrics are mathematical measures borrowed from graph theory and social network analysis (SNA) that quantify the importance, influence, or connectedness of individual nodes (players) within a passing network. When applied to football passing data, they reveal which players are most central to a team's possession structure.
History & Origins
Centrality metrics have deep roots in mathematics and sociology. Degree centrality, a simple count of a node's connections, is the most basic of them. Closeness centrality traces back to Alex Bavelas's work on communication networks around 1950, and betweenness centrality was formalized by Linton Freeman in 1977, both within the SNA tradition.
Their application to football was pioneered by Javier López Peña (a mathematician at University College London) and Hugo Touchette (then at Queen Mary University of London, later at Stellenbosch University) in their 2012 paper "A network theory analysis of football strategies". Using passing data from the 2010 FIFA World Cup, they demonstrated that standard SNA metrics could identify key players and reveal tactical structures, and might even help predict match outcomes to some degree.
This work opened a research stream that has since been expanded by numerous academics, including Luca Pappalardo, David Sumpter (Soccermatics, 2016), and researchers at KU Leuven and the Barcelona Innovation Hub.
Key Metrics
Degree Centrality
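- What it measures: the number of distinct teammates a player exchanges passes with; in a directed network this splits into out-degree (teammates passed to) and in-degree (teammates received from)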
- High degree: a player who passes to and receives from many different teammates (e.g., a central midfielder)
- Low degree: a player who interacts with few teammates (e.g., a target striker or a wide player in a rigid system)
- Formula (normalized): number of connections / (number of players in the network - 1)
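A minimal sketch of the degree calculation in Python with NetworkX (the library listed under Resources). The pass list and position labels are invented for illustration and are not taken from any match.

```python
import networkx as nx

# Hypothetical (passer, receiver) pairs; labels are illustrative only.
passes = [
    ("CM", "LB"), ("CM", "RB"), ("CM", "ST"), ("CM", "CB"),
    ("CB", "LB"), ("CB", "RB"), ("LB", "CM"), ("ST", "CM"),
]

# Undirected view: two players are connected if either passed to the other.
G = nx.Graph()
G.add_edges_from(passes)

# Normalized degree: number of connections / (number of players - 1).
degree = nx.degree_centrality(G)

# Directed view: separate teammates passed to (out) from teammates received from (in).
D = nx.DiGraph()
D.add_edges_from(passes)
out_degree = nx.out_degree_centrality(D)
in_degree = nx.in_degree_centrality(D)

for player in sorted(degree, key=degree.get, reverse=True):
    print(f"{player}: degree={degree[player]:.2f} "
          f"out={out_degree[player]:.2f} in={in_degree[player]:.2f}")
```

In this toy network the central midfielder is linked to all four teammates (degree 1.0), while the striker is linked to only one (0.25).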
Betweenness Centrality
- What it measures: the share of shortest passing paths between pairs of teammates that run through a player
- High betweenness: a player who acts as a bridge or hub — if removed, passing between certain groups of players would require longer routes (e.g., a deep-lying playmaker like Busquets or Pirlo)
- Insight: identifies players whose absence would most disrupt the team's passing flow
- Origin: formalized by Linton Freeman (1977)
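The bridge idea can be sketched with NetworkX on a made-up network in which a defensive group and an attacking group connect only through one holding midfielder. All links and labels are assumptions chosen to make the effect visible.

```python
import networkx as nx

# Hypothetical passing links: two groups joined only through "DM".
links = [
    ("GK", "LCB"), ("GK", "LB"), ("LCB", "LB"),   # defensive triangle
    ("LCB", "DM"), ("LB", "DM"),                  # build-up into midfield
    ("DM", "AM"),                                 # the only route forward
    ("AM", "ST"), ("AM", "RW"), ("ST", "RW"),     # attacking triangle
]
G = nx.Graph(links)

# Fraction of shortest paths between other pairs that pass through each player.
betweenness = nx.betweenness_centrality(G, normalized=True)
bridge = max(betweenness, key=betweenness.get)
print(bridge, round(betweenness[bridge], 2))

# Removing the bridge player leaves the two groups disconnected.
H = G.copy()
H.remove_node(bridge)
components = nx.number_connected_components(H)
print(components)
```

Here "DM" sits on 9 of the 15 possible pairs' shortest paths (0.6), and removing that node splits the network into two components. Real passing networks are rarely this clean, so the gap between players is usually smaller.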
Closeness Centrality
- What it measures: how quickly (in terms of passing steps) a player can reach all other teammates
- High closeness: a player who is "close" to everyone in the network — typically a central player in both position and passing role
- Low closeness: a peripheral player who needs many intermediary passes to connect with distant teammates
- Formula: inverse of the average shortest path length to all other nodes
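As a rough illustration of the formula, the sketch below compares NetworkX's closeness with a hand computation on a small invented network. It assumes the network is connected; the helper name closeness_by_hand is hypothetical.

```python
import networkx as nx

# Hypothetical links: "CM" is central, "LW" only connects through "LB".
links = [("CM", "LB"), ("CM", "RB"), ("CM", "ST"), ("LB", "LW")]
G = nx.Graph(links)

closeness = nx.closeness_centrality(G)

def closeness_by_hand(graph, player):
    """Inverse of the average shortest path length from player to everyone else."""
    lengths = nx.shortest_path_length(graph, source=player)  # includes player itself at 0
    others = [d for node, d in lengths.items() if node != player]
    return 1.0 / (sum(others) / len(others))

for player in ("CM", "LW"):
    print(player, round(closeness[player], 3), round(closeness_by_hand(G, player), 3))
```

"CM" reaches three teammates in one step and one in two (average 1.25, closeness 0.8), while "LW" needs up to three steps (average 2.25, closeness about 0.44).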
Clustering Coefficient
- What it measures: how interconnected a player's passing partners are with each other — do the players you pass to also pass to each other?
- High clustering: indicates triangular passing patterns around a player (tiki-taka-style combination play)
- Low clustering: indicates a player who connects separate groups (passes to the left-back and the right-winger, who never pass to each other)
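A small sketch of the contrast between the two cases, using NetworkX on an invented network: one midfielder sits inside a passing triangle, while another links partners who never pass to each other.

```python
import networkx as nx

# Hypothetical links; labels are illustrative only.
links = [
    ("CM", "LCM"), ("CM", "RCM"), ("LCM", "RCM"),  # midfield triangle
    ("DM", "LB"), ("DM", "RW"), ("DM", "CM"),      # DM's partners never link up
]
G = nx.Graph(links)

# Per player: fraction of pairs of passing partners that are themselves connected.
clustering = nx.clustering(G)
average = nx.average_clustering(G)

for player in ("LCM", "CM", "DM"):
    print(player, round(clustering[player], 2))
print("team average:", round(average, 2))
```

"LCM" scores 1.0 because both of its partners also pass to each other, "DM" scores 0.0 because none of its three partners are connected, and "CM" falls in between at one linked pair out of three.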
Eigenvector Centrality
- What it measures: a player's importance based not just on their connections but on the importance of the players they're connected to
- High eigenvector centrality: a player connected to other highly-connected players (central to the core of the team's passing structure)
- Related measure: Google's PageRank algorithm is a variant of eigenvector centrality
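The difference from plain degree can be sketched as follows. In this invented network two players each have a single connection, but one of them is attached to a better-connected teammate and therefore scores higher. The links are assumptions for illustration.

```python
import networkx as nx

# Hypothetical links: a midfield core (CM, DM, AM) plus peripheral players.
links = [
    ("CM", "DM"), ("CM", "AM"), ("DM", "AM"),  # core triangle
    ("DM", "LB"), ("DM", "RB"), ("CM", "RB"),  # full-backs feed the core
    ("AM", "ST"),                              # striker only links to AM
]
G = nx.Graph(links)

# Power-iteration estimate; max_iter is raised only as a safety margin.
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)

for player in sorted(eigenvector, key=eigenvector.get, reverse=True):
    print(f"{player}: degree={G.degree(player)} eigenvector={eigenvector[player]:.3f}")
```

"LB" and "ST" both have degree 1, yet "LB" ranks higher because its only partner ("DM") is the most central node. NetworkX also provides nx.pagerank for the PageRank variant, which is usually applied to the directed version of the network.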
Graph Density
- What it measures: the ratio of actual passing connections to all possible connections in the team
- High density: the team passes between many different player pairs (diverse, decentralized passing)
- Low density: passing is concentrated along a few routes (predictable or hierarchical)
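A short sketch of the density ratio, computed by hand and with NetworkX on an invented five-player network treated as undirected.

```python
import networkx as nx

# Hypothetical passing links between five players.
links = [
    ("CM", "LB"), ("CM", "RB"), ("CM", "ST"), ("CM", "CB"),
    ("CB", "LB"), ("CB", "RB"),
]
G = nx.Graph(links)

n = G.number_of_nodes()
m = G.number_of_edges()
possible = n * (n - 1) // 2          # all possible undirected player pairs
density_by_hand = m / possible
density = nx.density(G)

print(m, possible, density_by_hand, density)
```

Six of the ten possible pairs are connected, giving a density of 0.6. On real data, players who completed no passes should be added as nodes explicitly (for example with G.add_nodes_from); otherwise they are missing from the denominator and density is overstated.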
What They Reveal in Football Context
- Tactical structure: comparing centrality distributions across teams reveals different styles. Barcelona's tiki-taka era is typically associated with high graph density and clustering, whereas a direct team tends to show low density with high betweenness for the target man
- Key player identification: the player with the highest betweenness centrality is often the tactical linchpin — disrupting them disrupts the team
- Formation detection: average positions from network nodes often reveal actual formations more accurately than stated lineups
- Substitution impact: computing networks before and after a substitution reveals how a change restructured the team's passing
- Opponent scouting: identifying an opponent's most central players suggests pressing targets
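The substitution comparison above can be sketched by building one network per time window. The event list, the minute-60 substitution, and the helper name window_network are all hypothetical.

```python
import networkx as nx

# Hypothetical pass events: (minute, passer, receiver).
events = [
    (5, "CB", "DM"), (12, "DM", "AM"), (20, "DM", "LB"),
    (33, "DM", "RB"), (48, "AM", "ST"),
    # Assumed substitution at minute 60: "DM" leaves the pitch.
    (62, "CB", "LB"), (65, "LB", "AM"), (70, "CB", "RB"),
    (75, "RB", "AM"), (80, "AM", "ST"),
]

def window_network(events, start, end):
    """Build an undirected passing network from events with start <= minute < end."""
    G = nx.Graph()
    for minute, passer, receiver in events:
        if start <= minute < end:
            G.add_edge(passer, receiver)
    return G

before = nx.degree_centrality(window_network(events, 0, 60))
after = nx.degree_centrality(window_network(events, 60, 91))

print("hub before:", max(before, key=before.get))
print("hub after:", max(after, key=after.get))
```

In this toy example the hub shifts from "DM" before the change to "AM" afterwards. The same windowing approach can be used to compare halves of a match instead of aggregating the full 90 minutes.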
Limitations & Debates
- Static aggregation: computing centrality over a full match averages out temporal dynamics — a team that played differently in each half will show a blended, potentially misleading network
- Possession dependency: teams with more possession naturally generate denser networks, which can inflate centrality metrics without reflecting superior organization
- Doesn't capture pass quality: a backward pass to the goalkeeper counts the same as a line-breaking through ball for centrality computation
- Weighted vs. unweighted: basic centrality uses binary connections (did they pass or not?), but weighting by pass count, value (xT), or success rate produces different and often more meaningful results
- Diminishing returns in practice: while academically elegant, many practitioners find that simpler metrics (progressive passes, xT) are more actionable than centrality scores for coaching
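To illustrate the weighted vs. unweighted point, the sketch below computes betweenness twice on the same invented network: once with binary links, and once treating frequently used links as shorter. Converting pass counts to distances as 1 / count is one common convention and is an assumption here, as are the counts themselves.

```python
import networkx as nx

# Hypothetical pass counts between player pairs (direction ignored for brevity).
pass_counts = {
    ("CB", "DM"): 2, ("DM", "ST"): 2,
    ("CB", "CM"): 20, ("CM", "AM"): 20, ("AM", "ST"): 20,
}

G = nx.Graph()
for (a, b), count in pass_counts.items():
    # Assumption: frequent connections are "short", so distance = 1 / pass count.
    G.add_edge(a, b, weight=count, distance=1.0 / count)

unweighted = nx.betweenness_centrality(G)                     # binary links
weighted = nx.betweenness_centrality(G, weight="distance")    # count-aware paths
strength = dict(G.degree(weight="weight"))                    # total passes per player

for player in G.nodes:
    print(f"{player}: unweighted={unweighted[player]:.2f} "
          f"weighted={weighted[player]:.2f} passes={strength[player]}")
```

With binary links every player in this ring has the same betweenness (1/6). Once pass counts are taken into account, the rarely used route through "DM" drops to 0 while "CM" and "AM" rise to 1/3, which is closer to how the ball actually moved in this made-up data.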
Key People
- Linton Freeman — formalized betweenness centrality (1977)
- Javier López Peña — pioneered SNA application to football (2012)
- Hugo Touchette — co-author of the foundational football network paper
- David Sumpter — popularized network thinking in football (Soccermatics)
Resources
- López Peña & Touchette, "A network theory analysis of football strategies" (2012)
- Freeman, "Centrality in social networks: Conceptual clarification" (1979)
- Sumpter, Soccermatics (2016)
- NetworkX (Python), a widely used open-source library that implements all of the centrality metrics described here
Tags: #football #analytics #network-analysis #centrality #graph-theory #SNA
