Network Centrality Metrics in Football

Definition

Network centrality metrics are mathematical measures borrowed from graph theory and social network analysis (SNA) that quantify the importance, influence, or connectedness of individual nodes (players) within a passing network. When applied to football passing data, they reveal which players are most central to a team's possession structure.
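
As a rough illustration of the data structure involved, the sketch below builds a passing network from a list of completed passes. The pass log, the position labels, and the helper names (build_passing_network, to_undirected_adjacency) are all hypothetical and chosen only for this example; real event data would carry player IDs, timestamps, and coordinates.

```python
from collections import Counter

# Hypothetical pass log: one (passer, receiver) pair per completed pass.
passes = [
    ("GK", "CB"), ("CB", "DM"), ("DM", "CM"), ("CM", "DM"),
    ("DM", "CM"), ("CM", "ST"), ("CB", "DM"), ("DM", "LB"),
]

def build_passing_network(pass_log):
    """Return directed edge weights: (passer, receiver) -> number of passes."""
    return Counter(pass_log)

def to_undirected_adjacency(edge_weights):
    """Drop direction and counts: player -> set of teammates they exchanged passes with."""
    adjacency = {}
    for passer, receiver in edge_weights:
        adjacency.setdefault(passer, set()).add(receiver)
        adjacency.setdefault(receiver, set()).add(passer)
    return adjacency

edges = build_passing_network(passes)
adjacency = to_undirected_adjacency(edges)

for player in sorted(adjacency):
    print(player, sorted(adjacency[player]))
```

The directed, weighted form keeps the most information. The undirected adjacency is the simplified view that the basic (binary) versions of the metrics below operate on.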

History & Origins

Centrality metrics have deep roots in mathematics and sociology. Degree centrality is the simplest of them and follows directly from the graph-theoretic notion of a node's degree. Betweenness centrality was formalized by Linton Freeman in 1977 in the context of social networks. Closeness centrality also comes from the SNA tradition, with early formulations usually credited to Alex Bavelas (1950) and Gert Sabidussi (1966).

Their application to football was pioneered by Javier López Peña (mathematician at University College London) and Hugo Touchette (then at Queen Mary University of London, later at Stellenbosch University) in their 2012 paper "A network theory analysis of football strategies". They demonstrated that standard SNA metrics, when applied to passing data from the 2010 FIFA World Cup, could identify key players, reveal tactical structures, and even predict match outcomes to some degree.

This work opened a research stream that has since been expanded by numerous academics, including Luca Pappalardo, David Sumpter (Soccermatics, 2016), and researchers at KU Leuven, Barcelona Innovation Hub, and various universities.

Key Metrics

Degree Centrality

What it measures: the number of unique passing connections a player has
High degree: a player who passes to and receives from many different teammates (e.g., a central midfielder)
Low degree: a player who interacts with few teammates (e.g., a target striker or a wide player in a rigid system)
Formula: number of connections / (total players - 1)
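
The formula above can be sketched in a few lines. The five-player network and its position labels are invented for illustration, and the function name degree_centrality is simply this sketch's own choice.

```python
# Hypothetical undirected passing links between five players.
links = [("CB", "LB"), ("CB", "CM"), ("LB", "CM"), ("CM", "RW"), ("CM", "ST")]

adjacency = {}
for a, b in links:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

def degree_centrality(adjacency):
    """Number of distinct passing partners divided by (total players - 1)."""
    n = len(adjacency)
    return {player: len(partners) / (n - 1) for player, partners in adjacency.items()}

scores = degree_centrality(adjacency)
for player, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{player}: {score:.2f}")
```

In this toy network the central midfielder links with all four teammates and scores 1.0, while the striker links only with the midfielder and scores 0.25.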

Betweenness Centrality

What it measures: the share of shortest passing paths between pairs of teammates that run through a player
High betweenness: a player who acts as a bridge or hub — if removed, passing between certain groups of players would require longer routes (e.g., a deep-lying playmaker like Busquets or Pirlo)
Insight: identifies players whose absence would most disrupt the team's passing flow
Origin: formalized by Linton Freeman (1977)
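
A minimal sketch of the computation on an unweighted, undirected network follows. It counts shortest paths with breadth-first search and, for every pair of teammates, adds the fraction of their shortest paths that pass through each other player. This is a straightforward pairwise method written for readability, not the faster Brandes algorithm, and the toy network and function names are assumptions made for the example.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected passing links between five players.
links = [("CB", "LB"), ("CB", "CM"), ("LB", "CM"), ("CM", "RW"), ("CM", "ST")]
adjacency = {}
for a, b in links:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

def bfs_counts(adjacency, source):
    """Return (distance, number of shortest paths) from source to each reachable player."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:            # first time v is reached
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u is a predecessor of v on a shortest path
                paths[v] += paths[u]
    return dist, paths

def betweenness_centrality(adjacency):
    """Normalized betweenness for an undirected, unweighted graph with at least 3 nodes."""
    info = {s: bfs_counts(adjacency, s) for s in adjacency}
    scores = {v: 0.0 for v in adjacency}
    for s, t in combinations(adjacency, 2):
        dist_s, paths_s = info[s]
        if t not in dist_s:
            continue                     # s and t are not connected
        for v in adjacency:
            if v in (s, t) or v not in dist_s:
                continue
            dist_v, paths_v = info[v]
            # v lies on a shortest s-t path when the two legs add up exactly.
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                scores[v] += paths_s[v] * paths_v[t] / paths_s[t]
    n = len(adjacency)
    pair_count = (n - 1) * (n - 2) / 2   # pairs of other players
    return {v: score / pair_count for v, score in scores.items()}

betweenness = betweenness_centrality(adjacency)
print({player: round(score, 3) for player, score in betweenness.items()})
```

Here the midfielder sits on every shortest route between the back line and the forwards, so they hold all of the betweenness in the network while every other player scores zero.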

Closeness Centrality

What it measures: how quickly (in terms of passing steps) a player can reach all other teammates
High closeness: a player who is "close" to everyone in the network — typically a central player in both position and passing role
Low closeness: a peripheral player who needs many intermediary passes to connect with distant teammates
Formula: inverse of the average shortest path length to all other nodes
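
The formula above can be sketched with a breadth-first search that counts passing steps. The example assumes a connected network (every player reachable from every other); the toy data and function names are invented for illustration.

```python
from collections import deque

# Hypothetical undirected passing links between five players.
links = [("CB", "LB"), ("CB", "CM"), ("LB", "CM"), ("CM", "RW"), ("CM", "ST")]
adjacency = {}
for a, b in links:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

def passing_steps(adjacency, source):
    """Fewest passing steps from source to every reachable player (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness_centrality(adjacency):
    """Inverse of the average shortest path length to the other reachable players."""
    result = {}
    for player in adjacency:
        dist = passing_steps(adjacency, player)
        total = sum(dist.values())       # the player's own distance is 0
        others = len(dist) - 1
        result[player] = others / total if total > 0 else 0.0
    return result

closeness = closeness_centrality(adjacency)
print({player: round(score, 3) for player, score in closeness.items()})
```

The midfielder reaches everyone in one step and scores 1.0. The striker needs two steps to reach three of the four teammates, giving 4/7.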

Clustering Coefficient

What it measures: how interconnected a player's passing partners are with each other — do the players you pass to also pass to each other?
High clustering: indicates triangular passing patterns around a player (tiki-taka style combinative play)
Low clustering: indicates a player who connects separate groups (passes to the left-back and the right-winger, who never pass to each other)
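
One common way to compute this is the local clustering coefficient: the number of links that exist among a player's passing partners, divided by the number of partner pairs that could be linked. The sketch below assumes that definition, treats players with fewer than two partners as having a coefficient of 0, and uses invented toy data.

```python
from itertools import combinations

# Hypothetical undirected passing links between five players.
links = [("CB", "LB"), ("CB", "CM"), ("LB", "CM"), ("CM", "RW"), ("CM", "ST")]
adjacency = {}
for a, b in links:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

def clustering_coefficient(adjacency):
    """Share of a player's partner pairs that also pass to each other."""
    result = {}
    for player, partners in adjacency.items():
        k = len(partners)
        if k < 2:
            result[player] = 0.0         # no partner pairs to form a triangle
            continue
        linked_pairs = sum(1 for a, b in combinations(partners, 2) if b in adjacency[a])
        result[player] = linked_pairs / (k * (k - 1) / 2)
    return result

clustering = clustering_coefficient(adjacency)
print({player: round(score, 3) for player, score in clustering.items()})
```

The centre-back scores 1.0 because both of their partners also combine with each other (a closed triangle). The midfielder scores only 1/6 because most of their partners never link up directly.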

Eigenvector Centrality

What it measures: a player's importance based not just on their connections but on the importance of the players they're connected to
High eigenvector centrality: a player connected to other highly-connected players (central to the core of the team's passing structure)
Related: Google's PageRank algorithm is a variant of eigenvector centrality
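
Eigenvector centrality is usually computed iteratively. The sketch below uses plain power iteration: each player's score is repeatedly replaced by their own score plus the sum of their partners' scores, then rescaled. Keeping the player's own score in the update is a choice made here to help the iteration settle; the toy network, tolerance, and iteration cap are likewise assumptions for the example.

```python
import math

# Hypothetical undirected passing links between five players.
links = [("CB", "LB"), ("CB", "CM"), ("LB", "CM"), ("CM", "RW"), ("CM", "ST")]
adjacency = {}
for a, b in links:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

def eigenvector_centrality(adjacency, max_iter=1000, tol=1e-10):
    """Power iteration; returns scores scaled to unit Euclidean length."""
    scores = {player: 1.0 for player in adjacency}
    for _ in range(max_iter):
        # Each player inherits importance from the players they are linked to.
        updated = {
            player: scores[player] + sum(scores[q] for q in adjacency[player])
            for player in adjacency
        }
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {player: value / norm for player, value in updated.items()}
        change = sum(abs(updated[p] - scores[p]) for p in adjacency)
        scores = updated
        if change < tol:
            break
    return scores

eigen = eigenvector_centrality(adjacency)
for player, score in sorted(eigen.items(), key=lambda item: -item[1]):
    print(f"{player}: {score:.3f}")
```

In this toy network the two defenders outrank the winger and the striker even though all four are linked to the midfielder, because the defenders are also linked to each other.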

Graph Density

What it measures: the ratio of actual passing connections to all possible connections in the team
High density: the team passes between many different player pairs (diverse, decentralized passing)
Low density: passing is concentrated along a few routes (predictable or hierarchical)
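
For an undirected network the ratio described above is the number of links divided by n(n - 1)/2, where n is the number of players. A short sketch on invented toy data:

```python
# Hypothetical undirected passing links between five players.
links = [("CB", "LB"), ("CB", "CM"), ("LB", "CM"), ("CM", "RW"), ("CM", "ST")]
adjacency = {}
for a, b in links:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

def graph_density(adjacency):
    """Actual links divided by all possible links between distinct players."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    # Every undirected link appears in two partner sets, so halve the total.
    actual_links = sum(len(partners) for partners in adjacency.values()) / 2
    possible_links = n * (n - 1) / 2
    return actual_links / possible_links

density = graph_density(adjacency)
print(f"density = {density:.2f}")
```

Five of the ten possible player pairs exchange passes here, so the density is 0.5.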

What They Reveal in Football Context

Tactical structure: comparing centrality distributions across teams reveals different styles. Barcelona's tiki-taka era is usually associated with high graph density and clustering, while a direct team tends to show lower density with high betweenness for the target man
Key player identification: the player with the highest betweenness centrality is often the tactical linchpin — disrupting them disrupts the team
Formation detection: plotting each node at the player's average pass location often reveals the shape a team actually played, which can differ from the listed formation
Substitution impact: computing networks before and after a substitution reveals how a change restructured the team's passing
Opponent scouting: identifying an opponent's most central players suggests pressing targets
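
The substitution comparison above can be sketched by splitting a timestamped pass log at the minute of the change and counting each player's distinct passing partners on either side. The event tuples, the "SUB" label, and the 60th-minute substitution are all invented for this example.

```python
# Hypothetical events: (minute, passer, receiver). "SUB" replaces "CM" at minute 60.
events = [
    (5, "DM", "CM"), (12, "CM", "ST"), (20, "DM", "LB"), (33, "CM", "DM"),
    (41, "LB", "CM"), (62, "DM", "LB"), (70, "LB", "ST"), (75, "DM", "ST"),
    (81, "SUB", "ST"), (88, "DM", "SUB"),
]

def partners(pass_log):
    """Map each player to the set of teammates they exchanged passes with."""
    adjacency = {}
    for passer, receiver in pass_log:
        adjacency.setdefault(passer, set()).add(receiver)
        adjacency.setdefault(receiver, set()).add(passer)
    return adjacency

def partner_counts_around(events, minute):
    """Return player -> (partners before minute, partners from minute onward)."""
    before = partners((p, r) for m, p, r in events if m < minute)
    after = partners((p, r) for m, p, r in events if m >= minute)
    players = set(before) | set(after)
    return {
        player: (len(before.get(player, ())), len(after.get(player, ())))
        for player in players
    }

change = partner_counts_around(events, minute=60)
for player in sorted(change):
    print(player, change[player])
```

In this invented match the striker goes from one passing partner to three after the change, which is the kind of restructuring the before-and-after comparison is meant to surface. The same split could feed any of the centrality measures rather than a raw partner count.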

Limitations & Debates

Static aggregation: computing centrality over a full match averages out temporal dynamics — a team that played differently in each half will show a blended, potentially misleading network
Possession dependency: teams with more possession naturally generate denser networks, which can inflate centrality metrics without reflecting superior organization
No pass quality: a backward pass to the goalkeeper counts the same as a line-breaking through ball when edges record only whether, or how often, two players combined
Weighted vs. unweighted: basic centrality uses binary connections (did they pass or not?), but weighting by pass count, value (xT), or success rate produces different and often more meaningful results
Diminishing returns in practice: while centrality scores are academically elegant, many practitioners find simpler metrics (progressive passes, xT) more actionable for coaching
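
To make the weighted versus unweighted point concrete, the sketch below compares a binary degree (number of distinct partners) with a weighted degree, often called strength (total passes exchanged). The pass counts are invented, and each pair is treated as undirected for simplicity.

```python
# Hypothetical pass counts per player pair (direction ignored).
pass_counts = {
    ("CB", "DM"): 14, ("DM", "CM"): 9, ("CM", "CB"): 3,
    ("CM", "ST"): 2, ("CM", "LW"): 1, ("CM", "RW"): 1,
}

def binary_degree(pass_counts):
    """Number of distinct passing partners per player."""
    degree = {}
    for a, b in pass_counts:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree

def strength(pass_counts):
    """Total passes exchanged per player (weighted degree)."""
    totals = {}
    for (a, b), count in pass_counts.items():
        totals[a] = totals.get(a, 0) + count
        totals[b] = totals.get(b, 0) + count
    return totals

degree = binary_degree(pass_counts)
weighted = strength(pass_counts)
print("most partners:", max(degree, key=degree.get))
print("most passes:  ", max(weighted, key=weighted.get))
```

With these numbers the two views disagree: CM has the most partners (5), but DM is involved in the most passes (23 against CM's 16). Swapping the counts for xT or completion rate would follow the same pattern with different weights.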

Key People

Linton Freeman — formalized betweenness centrality (1977)
Javier López Peña — pioneered SNA application to football (2012)
Hugo Touchette — co-author of the foundational football network paper
David Sumpter — popularized network thinking in football (Soccermatics)

Resources

López Peña & Touchette, "A network theory analysis of football strategies" (2012)
Freeman, "Centrality in social networks: Conceptual clarification" (1979)
Sumpter, Soccermatics (2016)
NetworkX (Python): widely used open-source library that implements the centrality metrics covered here

Tags: #football #analytics #network-analysis #centrality #graph-theory #SNA