OpenTactic

Data Sources & Libraries

Data Sources & Libraries for Football Analytics

Overview

Football analytics data comes in two main forms: event data (discrete on-ball actions with coordinates) and tracking data (continuous positions of all 22 players and the ball). The availability, cost, and granularity of these data types shape what analyses are possible.


Event Data

What It Contains

Timestamped records of on-ball actions: passes, shots, tackles, fouls, dribbles, etc. Each event has x,y coordinates on the pitch, plus metadata (body part, outcome, assist type, etc.).
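As a rough illustration of the record described above, a single event can be modelled as a small data class. This is a minimal sketch: the field names (`minute`, `event_type`, `metadata`, and so on) are hypothetical and do not follow any particular provider's schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class Event:
    """One on-ball action. Field names are illustrative, not a provider schema."""
    minute: int
    second: int
    team: str
    player: str
    event_type: str                 # e.g. "pass", "shot", "tackle"
    x: float                        # pitch coordinates in the provider's own units
    y: float
    outcome: Optional[str] = None   # e.g. "complete", "goal", "saved"
    metadata: dict = field(default_factory=dict)  # body part, assist type, ...


def shots_by_team(events: list, team: str) -> list:
    """Return only the shot events taken by the given team."""
    return [e for e in events if e.event_type == "shot" and e.team == team]


# Made-up events, for illustration only.
events = [
    Event(12, 3, "Home", "Player A", "pass", 60.0, 40.0, "complete"),
    Event(12, 7, "Home", "Player B", "shot", 108.0, 36.0, "saved",
          {"body_part": "right foot"}),
    Event(30, 41, "Away", "Player C", "tackle", 35.0, 20.0, "won"),
]

home_shots = shots_by_team(events, "Home")
```

Real provider feeds nest this information more deeply, but most analyses start by flattening events into rows of roughly this shape.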

Providers

StatsBomb

  • Founded by Ted Knutson in 2017
  • Widely regarded as offering some of the most detailed publicly available event data
  • Unique features: freeze-frame data (positions of all visible players at the moment of each shot), detailed event taxonomy
  • Open Data: free dataset covering select competitions (Men's & Women's World Cups, select La Liga and NWSL seasons, etc.), hosted on GitHub (statsbomb/open-data)
  • Commercial data covers 100+ competitions worldwide
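To show what the shot freeze frames look like in practice, here is a minimal sketch that pulls them out of a StatsBomb-style event list using only the standard library. The raw-file URL pattern and the JSON keys (`type.name`, `shot.freeze_frame`, `teammate`) reflect the open-data layout as best understood here and should be checked against the repository's documentation. The sample events are made up.

```python
import json
from urllib.request import urlopen

# Assumed raw-file location in the open-data repository; verify before relying on it.
EVENTS_URL = (
    "https://raw.githubusercontent.com/statsbomb/open-data/"
    "master/data/events/{match_id}.json"
)


def load_events(match_id: int) -> list:
    """Download one match's event list (needs network access)."""
    with urlopen(EVENTS_URL.format(match_id=match_id)) as resp:
        return json.load(resp)


def shot_freeze_frames(events: list) -> list:
    """Return (shot_location, [(player_location, is_teammate), ...]) per shot."""
    result = []
    for ev in events:
        if ev.get("type", {}).get("name") != "Shot":
            continue
        frame = ev.get("shot", {}).get("freeze_frame", [])
        players = [(tuple(p["location"]), p["teammate"]) for p in frame]
        result.append((tuple(ev["location"]), players))
    return result


# Made-up events in the assumed shape, so the parser can be tried offline.
sample_events = [
    {"type": {"name": "Pass"}, "location": [60.0, 40.0]},
    {
        "type": {"name": "Shot"},
        "location": [108.0, 36.0],
        "shot": {
            "freeze_frame": [
                {"location": [118.5, 40.0], "teammate": False},
                {"location": [105.0, 42.0], "teammate": True},
            ]
        },
    },
]

frames = shot_freeze_frames(sample_events)
```

For regular work the statsbombpy library listed under Python Libraries wraps the same data in pandas DataFrames, so hand-parsing like this is mainly useful for understanding the format.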

Opta (Stats Perform)

  • One of the oldest football data companies, founded 1996
  • Widely used by media (Premier League broadcasts, FBref, WhoScored)
  • Comprehensive event coverage, though its event records carry less positional context than StatsBomb's shot freeze frames
  • Commercial only (no free tier)

Wyscout

  • Scouting-focused platform combining video and event data
  • Used extensively by clubs for recruitment
  • Pappalardo dataset: ~1,900 matches covering the 5 top European leagues (2017/18) plus the 2018 World Cup and Euro 2016, released for academic research. Available on Figshare. Published by Pappalardo et al. in Scientific Data, a Nature Portfolio journal (2019).
  • Commercial platform for broader access
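Events in the Pappalardo dataset encode outcomes as numeric tags rather than text labels, which is a common stumbling block. The sketch below decodes one event under assumptions recalled from the dataset's documentation: the key names (`eventName`, `subEventName`, `tags`, `positions`, `matchPeriod`, `eventSec`), tag 101 meaning a goal, tag 1801 meaning an accurate action, and positions given as percentages of pitch length and width. Verify all of these against the published tag table before use. The sample event is made up.

```python
# Tag ids assumed from the dataset's tag table; check before relying on them.
GOAL_TAG = 101
ACCURATE_TAG = 1801


def summarize_wyscout_event(ev: dict) -> dict:
    """Flatten one Wyscout-style event into a small, readable summary."""
    tag_ids = {t["id"] for t in ev.get("tags", [])}
    start = ev["positions"][0]  # first entry is assumed to be the start location
    return {
        "type": ev["eventName"],
        "subtype": ev["subEventName"],
        "period": ev["matchPeriod"],
        "seconds": ev["eventSec"],
        "start_pct": (start["x"], start["y"]),  # percent of pitch length/width
        "is_goal": GOAL_TAG in tag_ids,
        "is_accurate": ACCURATE_TAG in tag_ids,
    }


# Made-up event in the assumed shape.
sample_event = {
    "eventName": "Shot",
    "subEventName": "Shot",
    "tags": [{"id": 101}, {"id": 1801}],
    "positions": [{"x": 88, "y": 41}, {"x": 0, "y": 0}],
    "matchPeriod": "1H",
    "eventSec": 94.6,
}

summary = summarize_wyscout_event(sample_event)
```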

InStat

  • Popular in Eastern Europe, Scandinavia, and lower-tier leagues
  • Combines video analysis with event data
  • More affordable entry point than StatsBomb/Opta for smaller clubs

Free/Public Sources

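| Source | Coverage | Access |
|---|---|---|
| StatsBomb Open Data | Select competitions | GitHub |
| Wyscout/Pappalardo | 5 top European leagues, 2017/18 | Figshare |
| FBref | Major leagues (xG sourced from StatsBomb until 2022, then Opta) | Web (fbref.com) |
| Understat | Top 5 European leagues plus the Russian Premier League | Web (understat.com) |
| Transfermarkt | Market values, squad info | Web |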

Tracking Data

What It Contains

Continuous x,y (sometimes z) positions of all 22 players and the ball, typically sampled at 25 frames per second. Enables analysis of off-ball movement, pressing, space creation, speed, and pitch control.
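Speed is the simplest quantity that tracking data makes available and event data does not. As a minimal sketch, assuming positions are already in metres and sampled at a constant 25 frames per second, per-frame speed is just the distance between consecutive samples multiplied by the frame rate. The function name and sample track are illustrative.

```python
import math

FPS = 25  # sampling rate stated above; pass a different value if your data differs


def speeds_m_per_s(positions: list, fps: int = FPS) -> list:
    """Speed between consecutive (x, y) samples, in metres per second.

    Assumes positions are in metres and evenly spaced in time.
    """
    return [
        math.hypot(x1 - x0, y1 - y0) * fps
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]


# Made-up track: 0.2 m per frame along x, then 0.3 m along y.
track = [(0.0, 0.0), (0.2, 0.0), (0.4, 0.0), (0.4, 0.3)]
speeds = speeds_m_per_s(track)  # about 5.0, 5.0 and 7.5 m/s
```

Raw tracking positions are noisy, so real pipelines usually smooth the positions or the resulting speeds (for example with a moving average) before computing sprints or accelerations.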

Providers

Second Spectrum (Genius Sports)

  • Optical tracking via stadium cameras
  • Official tracking provider for the Premier League, MLS, and others
  • Founded by Rajiv Maheswaran and Yu-Han Chang

Hawk-Eye (Sony)

  • Known for ball-tracking in tennis and cricket
  • Provides tracking for some football leagues
  • Supplies goal-line technology and VAR systems in the Premier League; its optical tracking supported semi-automated offside technology at the 2022 FIFA World Cup

SkillCorner

  • Uses broadcast video (no stadium cameras needed) to generate tracking data
  • Makes tracking data accessible for leagues without optical infrastructure
  • Growing provider, recently partnered with multiple competitions

Metrica Sports

  • Provides tracking data and analysis tools
  • Released a free sample dataset for research, one of the few publicly available tracking datasets

Availability

Tracking data is far less accessible than event data. Most is proprietary and expensive. Public options are largely limited to small samples such as Metrica's sample data, plus synthetic datasets used in academic papers.


Python Libraries

| Library | Maintainer | Purpose |
|---|---|---|
| statsbombpy | StatsBomb | Python client for StatsBomb open data and the StatsBomb API |
| socceraction | KU Leuven | SPADL format, VAEP, xT computation |
| mplsoccer | Community | Pitch plotting, heat maps, shot maps, Voronoi, pass sonars |
| kloppy | Community | Normalizes event/tracking data from multiple providers into a common format |
| floodlight | Community | Analysis framework for tracking data (kinematic features, synchronization) |
| codeball | Metrica Sports | Tactical analysis tools built on tracking and event data |
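To make "normalizes into a common format" concrete, the sketch below rescales coordinates from three providers onto one pitch measured in metres. It is not kloppy's API. The native ranges used here (StatsBomb 120 x 80, Wyscout 0 to 100 percent, Metrica 0 to 1) are assumptions taken from each provider's commonly documented conventions, and `PROVIDER_SPECS` and `to_metres` are hypothetical names.

```python
from typing import NamedTuple


class PitchSpec(NamedTuple):
    """Maximum x and y values in a provider's native coordinate system."""
    x_max: float
    y_max: float


# Assumed native ranges; confirm against each provider's documentation.
PROVIDER_SPECS = {
    "statsbomb": PitchSpec(120.0, 80.0),
    "wyscout": PitchSpec(100.0, 100.0),  # percent of length / width
    "metrica": PitchSpec(1.0, 1.0),      # normalized to the unit square
}


def to_metres(provider: str, x: float, y: float,
              length: float = 105.0, width: float = 68.0) -> tuple:
    """Rescale native coordinates to metres on a length x width pitch.

    Only scaling is handled here. Axis orientation and attacking direction
    also differ between providers and are deliberately left out.
    """
    spec = PROVIDER_SPECS[provider]
    return (x / spec.x_max * length, y / spec.y_max * width)


# The centre spot should land on the same point for every provider.
centre_statsbomb = to_metres("statsbomb", 60.0, 40.0)
centre_wyscout = to_metres("wyscout", 50.0, 50.0)
centre_metrica = to_metres("metrica", 0.5, 0.5)
```

Libraries such as kloppy also reconcile axis orientation, playing direction, and event vocabularies, which is where most of the real work lies. The scaling step alone is only a starting point.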

Key People in Football Data

  • Ted Knutson: founder of StatsBomb and a major figure in making analytics accessible to clubs
  • Luca Pappalardo: released the Wyscout public dataset for research
  • William Spearman: pitch control models; previously at Hudl, later at Liverpool FC
  • Javier Fernández: EPV (Expected Possession Value), previously at FC Barcelona
  • Luke Bornn: EPV co-author, statistics professor, previously at Sacramento Kings and AS Roma
  • David Sumpter: author of Soccermatics and co-creator of the Friends of Tracking YouTube series
  • Laurie Shaw: Friends of Tracking contributor, author of an open-source pitch control implementation
  • Karun Singh: created the xT (expected threat) framework

Tags: #football #analytics #data #libraries #providers

© 2026 · Created by RAZT Studio