Guide to Football/Soccer data and APIs

Last updated: July 1 2016

Where can I actually find football/soccer data?

There are three main ways to get data. You can parse/scrape it from a hobbyist project/website, you can pay for it or you can try to collect it yourself.

Football Data Report Newsletter

If you would like more frequent updates regarding football APIs, football data and open source football projects, then sign up for the newsletter. Emails will be sent about once or twice per month. I will try to highlight free data and open source projects. Please email any links or newsletter suggestions.

Open Source Data on Github

openfootball - aka football.db

openfootball (aka football.db) has started a free, open source, public domain football database. The data is historical, meaning no live scores, but it does include the schedule, teams and players for the 2014 World Cup along with global league data. This is a very promising project and has the potential to be the definitive source of historical data for the public. See the opensport Google Group for discussion and questions. The data is stored in various repos on GitHub. Consider contributing any data you have yourself and be sure to thank Gerald Bauer. All the various repos can be intimidating; a good place to start is github.com/openfootball.

An example of the plain text (custom format):

[Sat Aug/16]
  12.45  Manchester United    1-2  Swansea City
  15.00  Leicester City       2-2  Everton FC
  15.00  Queens Park Rangers  0-1  Hull City
  15.00  Stoke City           0-1  Aston Villa
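
The sample above is regular enough to parse with a couple of regular expressions. The following is only a minimal sketch: the format rules are inferred from these five lines alone (the real football.db format may have more cases), and the names `Match` and `parse_matches` are made up for this example.

```python
import re
from typing import List, NamedTuple, Optional


class Match(NamedTuple):
    date: Optional[str]   # raw date header text, e.g. "Sat Aug/16"
    time: str             # kickoff time as written, e.g. "12.45"
    home: str
    home_goals: int
    away_goals: int
    away: str


# A date header such as "[Sat Aug/16]"
DATE_RE = re.compile(r"^\[(?P<date>[^\]]+)\]$")
# A match line such as "12.45  Manchester United    1-2  Swansea City"
MATCH_RE = re.compile(
    r"^(?P<time>\d{1,2}\.\d{2})\s+(?P<home>.+?)\s+"
    r"(?P<hg>\d+)-(?P<ag>\d+)\s+(?P<away>.+)$"
)

SAMPLE = """\
[Sat Aug/16]
  12.45  Manchester United    1-2  Swansea City
  15.00  Leicester City       2-2  Everton FC
  15.00  Queens Park Rangers  0-1  Hull City
  15.00  Stoke City           0-1  Aston Villa
"""


def parse_matches(text: str) -> List[Match]:
    """Parse date headers and match lines; lines that fit neither are skipped."""
    matches = []
    current_date = None
    for raw in text.splitlines():
        line = raw.strip()
        header = DATE_RE.match(line)
        if header:
            current_date = header.group("date")
            continue
        m = MATCH_RE.match(line)
        if m:
            matches.append(Match(current_date, m.group("time"), m.group("home"),
                                 int(m.group("hg")), int(m.group("ag")),
                                 m.group("away")))
    return matches


for match in parse_matches(SAMPLE):
    print(match)
```

Each match carries the most recent date header, so the result can be turned into CSV or JSON without losing the grouping.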

Check out the following github organizations for more listings:

There is also an open source football.db HTTP JSON(P) API for demo purposes. Example Endpoints:

http://footballdb.herokuapp.com/api/v1/event/world.2014/teams
http://footballdb.herokuapp.com/api/v1/event/world.2014/round/1

jokecamp/FootballData

jokecamp/FootballData - my own hodgepodge of JSON and CSV Football/Soccer data on GitHub with a focus on the EPL

soccerstats.us

soccerstats.us is a GitHub organization with multiple repositories for sets of data (with a focus on North American data). The parser is written in Python and looks like it was designed to parse the rsssf.com text data.

Other smaller Projects/Repos

Free APIs

football-data.org (beta)

football-data.org is a RESTful API in beta with regularly updated data. If you register for a free API key you will get CORS support. I recommend registering for a key to show your support and help the service track usage. However, a key is not required yet so you can try out the endpoints right now. I am excited to see this API grow and mature!

Available endpoints

/soccerseasons/
/soccerseasons/{id}/ranking
/soccerseasons/{id}/fixtures
/fixtures
/soccerseasons/{id}/teams
/teams/{id}
/teams/{id}/fixtures/

Some example calls:

Example JSON output for a team:

{
   "_links":{
      "self":{
         "href":"http://api.football-data.org/v1/teams/5"
      },
      "fixtures":{
         "href":"http://api.football-data.org/v1/teams/5/fixtures"
      },
      "players":{
         "href":"http://api.football-data.org/v1/teams/5/players"
      }
   },
   "name":"FC Bayern München",
   "code":"FCB",
   "shortName":"Bayern",
   "squadMarketValue":"559,100,000 €",
   "crestUrl":"http://upload.wikimedia.org/wikipedia/commons/c/c5/Logo_FC_Bayern_München.svg"
}
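
Some values in this response are formatted for display rather than for computation. As a rough sketch (the helper names `market_value_eur` and `team_id_from_href` are my own, and the only input assumed is the shape of the sample above), here is one way to pull a numeric market value and the team id out of it in Python:

```python
import json

# A trimmed copy of the sample team response shown above.
TEAM_JSON = """
{
   "_links": {
      "self": {"href": "http://api.football-data.org/v1/teams/5"},
      "fixtures": {"href": "http://api.football-data.org/v1/teams/5/fixtures"}
   },
   "name": "FC Bayern München",
   "code": "FCB",
   "shortName": "Bayern",
   "squadMarketValue": "559,100,000 €"
}
"""


def market_value_eur(value: str) -> int:
    """Turn a display string like '559,100,000 €' into an integer of euros."""
    return int("".join(ch for ch in value if ch.isdigit()))


def team_id_from_href(href: str) -> int:
    """Take the trailing path segment of a team link as the team id."""
    return int(href.rstrip("/").rsplit("/", 1)[-1])


team = json.loads(TEAM_JSON)
print(team["name"],
      team_id_from_href(team["_links"]["self"]["href"]),
      market_value_eur(team["squadMarketValue"]))
```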

Sports Open Data API

This free RESTful API (hosted on Mashape) is an impressive service provided by http://sportsopendata.net/. Introduced in early 2016, it is very promising. The data is under a Creative Commons license. Usage is limited to 10,000 requests a month, and Mashape registration is required.

Available Endpoints:

/leagues
/leagues/{league_slug}/seasons
/leagues/{league_slug}/seasons/{season_slug}/managers
/leagues/{league_slug}/seasons/{season_slug}/referees
/leagues/{league_slug}/seasons/{season_slug}/topscorers
/leagues/{league_slug}/seasons/{season_slug}/rounds/{round_slug}/matches
/leagues/{league_slug}/seasons/{season_slug}/rounds/{round_slug}/matches/{match_slug}
/leagues/{league_slug}/seasons/{season_slug}
/leagues/{league_slug}/seasons/{season_slug}/rounds
/leagues/{league_slug}/seasons/{season_slug}/rounds/{round_slug}
/leagues/{league_slug}/seasons/{season_slug}/standings
/leagues/{league_slug}/seasons/{season_slug}/standings/{position}
/leagues/{league_slug}/seasons/{season_slug}/teams
/leagues/{league_slug}/seasons/{season_slug}/teams/{team_slug}/players
/stadiums

Example Season JSON

{
  "identifier": "0e5b67aa8cff15b2c6781454e55d1609",
  "league_identifier": "bc425f51d5ee924580c35c38da138de8",
  "season_slug": "15-16",
  "name": "2015-2016",
  "season_start": "2015-07-01T00:00:00+0200",
  "season_end": "2016-06-30T00:00:00+0200"
}

Example Standings JSON

{
    "identifier": "a963a7776379b3c8b818930ae9d9d0ca",
    "position": 1,
    "team_identifier": "a9ef824ba73b0a57e982df21467c3efc",
    "team": "Juventus",
    "overall": {
        "wins": 26,
        "draws": 9,
        "losts": 3,
        "points": 87,
        "scores": 72,
        "conceded": 24,
        "last_5": "WNWWN",
        "macthes_played": 0,
        "goal_difference": 0
    }
}
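
In this sample the matches-played and goal-difference fields are both 0, which does not agree with the other numbers, so it may be safer to derive them yourself. A small sketch, assuming the usual 3 points for a win and 1 for a draw, and using the field names exactly as they appear in the sample (the function name `derive_totals` is hypothetical):

```python
# The "overall" object from the sample standings entry above.
OVERALL = {
    "wins": 26,
    "draws": 9,
    "losts": 3,
    "points": 87,
    "scores": 72,
    "conceded": 24,
}


def derive_totals(overall: dict) -> dict:
    """Recompute totals from wins/draws/losses and goals for/against."""
    return {
        "played": overall["wins"] + overall["draws"] + overall["losts"],
        # Assumption: 3 points per win, 1 per draw.
        "points": 3 * overall["wins"] + overall["draws"],
        "goal_difference": overall["scores"] - overall["conceded"],
    }


totals = derive_totals(OVERALL)
print(totals)
# The derived points should agree with the value the API reports.
print("points match:", totals["points"] == OVERALL["points"])
```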

openfooty API

openfooty API had promising API documentation but a quick look at the developer forums shows a stale community and questions about why no one seems to actually be able to get a developer key.

Betlines Ninja API (betting/odds)

This free API (hosted on Mashape) is for betting odds but contains a lot of upcoming fixture data. The API is provided by Betlines Ninja. Along with match data, the service provides recent odds data from all major sportsbooks (11 currently, including Bwin, Paddy Power, Betfair, etc.). On the free plan, results can be obtained for a maximum of 3 days back. There is also a data-dump database of historical data for sale.

Subscription Services/APIs

football-api

football-api.com is a paid API service. The API restricts access by IP address and limits calls based on your package. It includes endpoints for competitions, teams, standings, live scores, fixtures and commentaries. See the pricing page. Prices range from $15 to $200 per month.

Example endpoints:

api/?Action=competitions&APIKey=xxxx
api/?Action=standings&comp_id=1204&APIKey=xxxx
api/?Action=today&comp_id=1204&APIKey=xxxx
api/?Action=fixtures&comp_id=1024&match_date=[DATE_IN_d.m.Y_FORMAT]&APIKey=xxxx
api/?Action=commentaries&APIKey=xxxx&match_id=[MATCH_ID]
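
The fixtures call expects the date in d.m.Y format (for example 05.03.2016). Here is a minimal sketch of building that query string in Python. The base URL below is a placeholder, since the list above only shows relative paths, and the function name `fixtures_url` is my own.

```python
from datetime import date
from urllib.parse import urlencode

# Placeholder host: substitute the real base URL from the provider's docs.
BASE_URL = "http://example.invalid/api/"


def fixtures_url(comp_id: int, match_date: date, api_key: str,
                 base_url: str = BASE_URL) -> str:
    """Build a fixtures request with the date formatted as d.m.Y."""
    params = {
        "Action": "fixtures",
        "comp_id": comp_id,
        "match_date": match_date.strftime("%d.%m.%Y"),
        "APIKey": api_key,
    }
    return base_url + "?" + urlencode(params)


print(fixtures_url(1204, date(2016, 3, 5), "xxxx"))
```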

The demo (EPL) free feature is no longer available. Here is part of the email notice sent:

Discontinuation Of The Demo Plan (March 3 2016)

In the last couple of weekends, we experienced a very high load on our servers. This lead to delayed responses and very slow data return. We had to take emergency measures and unfortunately, we had to suspend the demo access to the English Premier League in favour of our paying customers. This decision is hard for us since we would like to offer this demo access for free. However, with around 3000 demo users, the load on our servers was too high.

Since the demo plan was not planned for production purposes (hence the name), we apologize if we have disturbed your development and surprised you with a lot of error messages.

This is why we can't have nice things.

CrowdScores and FastestLiveScores API

CrowdScores is a UK company that uses a crowd-sourced football data collection process. You sign up for an account and report game events to their servers. They have iPhone and Android apps for reporting. The collected data is then available as an API on FastestLiveScores.com. They currently offer three API tiers: Free trial ($0), Basic ($100 per month) and Pro (price unlisted). View the API documentation.

Example endpoints and parameters:

/teams{?round_ids,competition_ids}
/matches{?team_id,round_ids,competition_id,from,to}
/competitions
/rounds{?competition_ids}
/seasons
/league-table/{round_id}
/league-tables{?team_id,round_id,competition_id}
/football_states
/events
/playerstats{?team_ids,round_ids,competition_ids,season_ids}

SPAPI

SPAPI offers subscriptions for a RESTful Sports API including live scores, player statistics, betting odds, pre-game data and match event data. The data responses look pretty comprehensive.

Pricing:

Starter Plan is $299 per month
Growth Plan is $499 per month
Pro Plan is $899 per month

Look at the example of one of the action data objects. It looks like data for a defensive clearance including x,y coordinates.

{
    "minute": 53,
    "second": 10,
    "team_id": 32,
    "start_x": 20.8,
    "start_y": 69.5,
    "expanded_minute": 57,
    "period": {
      "name": "SecondHalf",
      "value": 2
    },
    "type": {
      "name": "Clearance",
      "value": 12
    },
    "outcome_type": {
      "name": "Successful",
      "value": 1
    },
    "qualifiers": [{
      "type": "Head"
    }, {
      "type": "Zone",
      "value": "Back"
    }],
    "is_touch": true
  }
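
To give a feel for working with such an object, here is a small Python sketch that flattens it into a one-line summary and a qualifier lookup. It assumes only the shape of the sample above. The function names are hypothetical, and the meaning of the numeric `value` codes is not documented here, so they are left alone.

```python
import json

# A trimmed copy of the sample action object shown above.
ACTION_JSON = """
{
  "minute": 53, "second": 10, "team_id": 32,
  "start_x": 20.8, "start_y": 69.5,
  "type": {"name": "Clearance", "value": 12},
  "outcome_type": {"name": "Successful", "value": 1},
  "qualifiers": [{"type": "Head"}, {"type": "Zone", "value": "Back"}],
  "is_touch": true
}
"""


def qualifier_map(action: dict) -> dict:
    """Map qualifier type to its value (None when no value is given)."""
    return {q["type"]: q.get("value") for q in action.get("qualifiers", [])}


def summarize_action(action: dict) -> str:
    """One-line description: clock, event type, outcome and start position."""
    clock = f'{action["minute"]}:{action["second"]:02d}'
    kind = action["type"]["name"]
    outcome = action["outcome_type"]["name"]
    return f'{clock} {kind} ({outcome}) at ({action["start_x"]}, {action["start_y"]})'


action = json.loads(ACTION_JSON)
print(summarize_action(action))
print(qualifier_map(action))
```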

Example endpoints:

https://spapi.pw/api/v1/competitions/5/upcoming_matches
https://spapi.pw/api/v1/competitions/5/finished_matches
https://spapi.pw/api/v1/competitions/5/standings?standing_type=standings
https://spapi.pw/api/v1/competitions/5/player_rankings?ranking_type=all
https://spapi.pw/api/v1/teams
https://spapi.pw/api/v1/matches/17588/scores
https://spapi.pw/api/v1/livescores
https://spapi.pw/api/v1/players/1/statistics

Pretty impressive, but this level of detail comes at a price. The authentication method is an API key in the query string.

Soccerama.pro

soccerama.pro offers a JSON API. There are plans with prices ranging from 15 to 125 euros per month. There is also an option for custom plans. This API supports lazy loading, meaning you pass parameters in your request to load relationships and nested relationships.

Registration to this API is free and every plan has a 14 day trial.

Some example endpoints are:

https://api.soccerama.pro/v1/livescore?api_token=__YOURTOKEN__
https://api.soccerama.pro/v1/competitions?api_token=__YOURTOKEN__
https://api.soccerama.pro/v1/competitions/{id}?api_token=__YOURTOKEN__
https://api.soccerama.pro/v1/matches/{id}?api_token=__YOURTOKEN__&include=hometeam,awayTeam,events
https://api.soccerama.pro/v1/statistics/match/{id}?api_token=__YOURTOKEN__
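
The `include` parameter in the matches endpoint above is a comma-separated list of relationships. A hedged sketch of building such a URL in Python follows. The function name `match_url` and the match id used are made up for illustration; the relationship names are copied from the example endpoint.

```python
from urllib.parse import urlencode

BASE = "https://api.soccerama.pro/v1"


def match_url(match_id: int, api_token: str, include=()) -> str:
    """Build a match URL, optionally asking for related objects."""
    params = {"api_token": api_token}
    if include:
        # Relationships are joined with commas, as in the example endpoint.
        params["include"] = ",".join(include)
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{BASE}/matches/{match_id}?" + urlencode(params, safe=",")


# 42 is an arbitrary id chosen for the example.
print(match_url(42, "__YOURTOKEN__", ["hometeam", "awayTeam", "events"]))
```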

And an example response looks like:

{
  "home": {
    "team_id": 39,
    "shots_on_goal": 6,
    "shots_total": 16,
    "fouls_total": 11,
    "corners_total": 5,
    "offsides_total": 5,
    "possesion": 68,
    "yellowcards": 2,
    "redcards": 0,
    "saves": 1,
    "team": {
      "id": 39,
      "name": "Bayern München",
      "logo": "/img/teams/GER/39.png",
      "twitter": "@FCBayern"
    }
  },
  "away": {
    "team_id": 123,
    "shots_on_goal": 1,
    "shots_total": 10,
    "fouls_total": 12,
    "corners_total": 2,
    "offsides_total": 2,
    "possesion": 32,
    "yellowcards": 2,
    "redcards": 0,
    "saves": 5,
    "team": {
      "id": 123,
      "name": "Benfica",
      "logo": "/img/teams/POR/123.png",
      "twitter": "@SL_Benfica"
    }
  }
}
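
Raw counts like these are easy to turn into derived stats. As a small illustration (assuming only the field names in the sample above, and with a function name of my own choosing), shot accuracy per side could be computed like this:

```python
# Only the fields needed for the calculation, copied from the sample response.
STATS = {
    "home": {"team_id": 39, "shots_on_goal": 6, "shots_total": 16},
    "away": {"team_id": 123, "shots_on_goal": 1, "shots_total": 10},
}


def shot_accuracy(side: dict) -> float:
    """Share of shots that were on goal; 0.0 when no shots were taken."""
    total = side["shots_total"]
    return side["shots_on_goal"] / total if total else 0.0


for name, side in STATS.items():
    print(name, f"{shot_accuracy(side):.1%}")
```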

XMLSoccer.com

xmlsoccer.com is another subscription service. You can demo the service for free with the Scottish Premier League. Pricing is 10 € for 1 month, 25 € for 3 months and 90 € for 12 months. To see the types of calls available, you can browse the demo web methods at http://www.xmlsoccer.com/FootballDataDemo.asmx and the WSDL. The data is only returned in XML.

Each call must provide an API Key. You can get a free demo API Key by registering.

Example results for a team

<XMLSOCCER.COM>
  <Team>
    <Team_Id>45</Team_Id>
    <Name>Aberdeen</Name>
    <Country>Scotland</Country>
    <Stadium>Pittodrie Stadium</Stadium>
    <HomePageURL>http://www.afc.co.uk</HomePageURL>
    <WIKILink>http://en.wikipedia.org/wiki/Aberdeen_F.C.</WIKILink>
  </Team>
</XMLSOCCER.COM>

Player data

<Player>
  <Id>2523</Id>
  <Name>David Goodwillie</Name>
  <Height>1.7</Height>
  <Weight>70.29</Weight>
  <Nationality>Scotland</Nationality>
  <Position>Forward</Position>
  <Team_Id>45</Team_Id>
  <PlayerNumber>17</PlayerNumber>
  <DateOfBirth>1989-03-28T00:00:00-08:00</DateOfBirth>
  <DateOfSigning>2014-07-07T00:00:00-07:00</DateOfSigning>
  <Signing>Free</Signing>
</Player>
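
Since the service only returns XML, you will need an XML parser on your side. Here is a minimal sketch using Python's standard library on the player sample above. The function name `parse_player` is hypothetical, and the units (metres and kilograms) are my guess from the values, not something the sample states.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# A trimmed copy of the sample player element shown above.
PLAYER_XML = """
<Player>
  <Id>2523</Id>
  <Name>David Goodwillie</Name>
  <Height>1.7</Height>
  <Weight>70.29</Weight>
  <Position>Forward</Position>
  <Team_Id>45</Team_Id>
  <DateOfBirth>1989-03-28T00:00:00-08:00</DateOfBirth>
</Player>
"""


def parse_player(xml_text: str) -> dict:
    """Convert a <Player> element into a dict with typed values."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": int(root.findtext("Id")),
        "name": root.findtext("Name"),
        "height_m": float(root.findtext("Height")),    # unit assumed
        "weight_kg": float(root.findtext("Weight")),   # unit assumed
        "position": root.findtext("Position"),
        "team_id": int(root.findtext("Team_Id")),
        # The timestamp carries a UTC offset; keep only the calendar date.
        "date_of_birth": datetime.fromisoformat(
            root.findtext("DateOfBirth")).date(),
    }


player = parse_player(PLAYER_XML)
print(player)
```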

Open source Python client for XmlSoccer API

Resultados de Fútbol API

Resultados-futbol.com is a Spanish football website with live scores and historical data, with a database of more than four million matches and hundreds of competitions from all around the world. When they decided to make their own mobile apps, they created an API to serve both the apps and the website and made it available to developers.

Their API covers a lot, from live scores with events to player stats, transfers, betting odds, etc. Chances are that if you find some info on the website, it will also be available in the API. It is well documented, but in Spanish only.

Their prices range from 99€ per year for around 1,000 requests per day to 499€ per month for 100k requests per day. They offer a one-month free trial with full access to the API and 500 requests per day. The authentication method is an API key in the query string.

opta

opta is one of the industry leaders. This is what the TV networks use and likely what the actual football clubs use for scouting. If only this data were public! Opta used to provide a developer program under the title "Opta Playground", but it seems that the site has been removed and now shows a 404 error. The site used to read "Opta can provide data for programmers wishing to develop a mobile app or website with selected historical data available to download." You had to request permission by email. I applied and they sent me the XML data set for 10 rounds of games from the start of the 2007/2008 Bundesliga 2. The more detailed game data included x,y coordinates of game events. It is a very impressive dataset, but it felt more like an advertisement. I had no interest in the data provided, and I'm not sure why an indie developer would spend time working on a data set they could never afford. They even track this data point: "Spectator on pitch." Read FiveThirtyEight's behind-the-scenes look at how opta tracks data (spoiler: young male gamers).

An example of an "event" in XML:

<Event id="1115853439" event_id="9" type_id="3"
    period_id="1" min="0" sec="19" player_id="21202"
    team_id="1744" outcome="1" x="64.9" y="11.6"
    timestamp="2007-08-19T13:02:08.482" last_modified="2007-08-19T13:02:13">
        <Q id="152113216" qualifier_id="56" value="Right"/>
</Event>
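
Almost everything in this format lives in attributes rather than element text. Here is a hedged Python sketch of reading one event with the standard library. The function name `parse_event` is mine, and the meaning of `type_id` and `qualifier_id` is not documented in this sample, so the ids are kept as plain integers.

```python
import xml.etree.ElementTree as ET

# The sample event shown above.
EVENT_XML = """
<Event id="1115853439" event_id="9" type_id="3"
    period_id="1" min="0" sec="19" player_id="21202"
    team_id="1744" outcome="1" x="64.9" y="11.6"
    timestamp="2007-08-19T13:02:08.482" last_modified="2007-08-19T13:02:13">
        <Q id="152113216" qualifier_id="56" value="Right"/>
</Event>
"""


def parse_event(xml_text: str) -> dict:
    """Read the attributes of one <Event> and its <Q> qualifier children."""
    ev = ET.fromstring(xml_text.strip())
    return {
        "id": int(ev.get("id")),
        "type_id": int(ev.get("type_id")),
        "player_id": int(ev.get("player_id")),
        "team_id": int(ev.get("team_id")),
        "clock": (int(ev.get("min")), int(ev.get("sec"))),
        "x": float(ev.get("x")),
        "y": float(ev.get("y")),
        # qualifier_id -> value; the id meanings are not given in the sample.
        "qualifiers": {int(q.get("qualifier_id")): q.get("value")
                       for q in ev.findall("Q")},
    }


event = parse_event(EVENT_XML)
print(event)
```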

prozone

prozone is another large commercial data provider.

Match Analysis

Match Analysis is another large commercial data provider that lists Fox Soccer Channel, US National Team and the MLS among their clients.

Other Websites

FootballSquads

footballsquads.co.uk has current and historical squad details for clubs (rosters) and national teams from all across the world for many leagues and competitions, including the 2014 World Cup squads.

An example of the squad/roster data:

Num  Name          Nat  Pos  Height  Weight  Date of Birth  Birth Place  Previous Club
1    David De Gea  ESP  G    1.92    82      07-11-90       Madrid       Atlético Madrid
2    Rafael        BRA  D    1.72    65      09-07-90       Petrópolis   Fluminense

Rec.Sport.Soccer Statistics Foundation (RSSSF)

Rec.Sport.Soccer Statistics Foundation (RSSSF) has a massive collection of formatted plain text statistics. Here is an example of English Premier League results.

Example of the data for table results:

1.Chelsea                  8   7  1  0  23- 8  22
2.Manchester City          8   5  2  1  18- 8  17
3.Southampton              8   5  1  2  19- 5  16
4.West Ham United          8   4  1  3  15-11  13

and scores:

Round 1
[Aug 16]
Arsenal       2-1 Crystal P
Leicester     2-2 Everton
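
The league table rows shown earlier in this section are fixed enough to parse with a regular expression. This is only a sketch: I am assuming the columns are played, won, drawn, lost, goals for-against and points (the numbers are consistent with that, e.g. 7 wins and 1 draw giving 22 points), and `parse_table_row` is a name made up for this example.

```python
import re
from typing import Optional

# e.g. "1.Chelsea                  8   7  1  0  23- 8  22"
ROW_RE = re.compile(
    r"^\s*(?P<pos>\d+)\.(?P<team>.+?)\s+"
    r"(?P<played>\d+)\s+(?P<won>\d+)\s+(?P<drawn>\d+)\s+(?P<lost>\d+)\s+"
    r"(?P<gf>\d+)-\s*(?P<ga>\d+)\s+(?P<points>\d+)\s*$"
)

SAMPLE_ROWS = [
    "1.Chelsea                  8   7  1  0  23- 8  22",
    "2.Manchester City          8   5  2  1  18- 8  17",
    "3.Southampton              8   5  1  2  19- 5  16",
    "4.West Ham United          8   4  1  3  15-11  13",
]


def parse_table_row(line: str) -> Optional[dict]:
    """Parse one table row; return None if the line does not fit the pattern."""
    m = ROW_RE.match(line)
    if not m:
        return None
    row = m.groupdict()
    team = row.pop("team")
    parsed = {key: int(value) for key, value in row.items()}
    parsed["team"] = team
    return parsed


for line in SAMPLE_ROWS:
    print(parse_table_row(line))
```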

football-data.co.uk

football-data.co.uk is a betting and odds website that has made a lot of historical league data available as CSV files. The data includes results and a lot of betting/odds related data. I have tried to aggregate and clean up the data in the following repo: github.com/jokecamp/FootballData

Leagues and divisions included:

England Football Results    Premiership & Divs 1,2,3 & Conference
Scotland Football Results   Premiership & Divs 1,2 & 3
Germany Football Results    Bundesligas 1 & 2
Italy Football Results      Serie A & B
Spain Football Results      La Liga (Primera & Segunda)
France Football Results     Le Championnat & Division 2
Netherlands Football Results    KPN Eredivisie
Belgium Football Results    Jupiler League
Portugal Football Results   Liga I
Turkey Football Results     Ligi 1
Greece Football Results     Ethniki Katigoria

The key/legend of all the field abbreviations gives you an idea of what is available in the CSV files:

Div = League Division
Date = Match Date (dd/mm/yy)
HomeTeam = Home Team
AwayTeam = Away Team
FTHG = Full Time Home Team Goals
FTAG = Full Time Away Team Goals
FTR = Full Time Result (H=Home Win, D=Draw, A=Away Win)
HTHG = Half Time Home Team Goals
HTAG = Half Time Away Team Goals
HTR = Half Time Result (H=Home Win, D=Draw, A=Away Win)

Match Statistics (where available)
Attendance = Crowd Attendance
Referee = Match Referee
HS = Home Team Shots
AS = Away Team Shots
HST = Home Team Shots on Target
AST = Away Team Shots on Target
HHW = Home Team Hit Woodwork
AHW = Away Team Hit Woodwork
HC = Home Team Corners
AC = Away Team Corners
HF = Home Team Fouls Committed
AF = Away Team Fouls Committed
HO = Home Team Offsides
AO = Away Team Offsides
HY = Home Team Yellow Cards
AY = Away Team Yellow Cards
HR = Home Team Red Cards
AR = Away Team Red Cards
HBP = Home Team Bookings Points (10 = yellow, 25 = red)
ABP = Away Team Bookings Points (10 = yellow, 25 = red)
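
With the legend in hand, the files can be read with any CSV library. Below is a minimal Python sketch using only column names from the legend. The one-row sample is typed in by hand for illustration (it is not copied from a real file, and the `E0` division code is just an example value), and the function names are my own.

```python
import csv
import io
from datetime import datetime

# Hand-written sample row for illustration only.
SAMPLE_CSV = (
    "Div,Date,HomeTeam,AwayTeam,FTHG,FTAG,FTR\n"
    "E0,16/08/14,Leicester,Everton,2,2,D\n"
)


def full_time_result(home_goals: int, away_goals: int) -> str:
    """Recompute FTR: H = home win, D = draw, A = away win."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


def booking_points(yellows: int, reds: int) -> int:
    """Bookings points as defined in the legend: 10 per yellow, 25 per red."""
    return 10 * yellows + 25 * reds


def read_results(csv_text: str) -> list:
    """Read result rows, converting the date (dd/mm/yy) and goal counts."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "date": datetime.strptime(row["Date"], "%d/%m/%y").date(),
            "home": row["HomeTeam"],
            "away": row["AwayTeam"],
            "fthg": int(row["FTHG"]),
            "ftag": int(row["FTAG"]),
            "ftr": row["FTR"],
        })
    return rows


results = read_results(SAMPLE_CSV)
print(results)
```

Recomputing FTR from the goal columns is a cheap sanity check when aggregating files from several seasons.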

european-football-statistics.co.uk

www.european-football-statistics.co.uk is a visually dated website but has a lot of historical football data (mostly an overview of league/tournament results) displayed in nice clean HTML tables. Looks like they already have 2014 EPL stats. The site claims "The target of this site is to collect european football statistics which are not easily found on internet."

openligadb.db

openligadb.db has an old-school Windows ASMX web service with methods such as "GetGoalsByMatch()".

Wikipedia

Wikipedia has a lot of structured data and is also crowd/public sourced. You can use their API to query and then parse the data. The data is fragmented across specific pages, which makes this a good source if you are looking for very specific team/player data. For example, here is a table of Manchester United season results: http://en.wikipedia.org/wiki/List_of_Manchester_United_F.C._seasons.

Post War English & Scottish Football League A - Z Player's Database

Post War English & Scottish Football League A - Z Player's Database contains a lot of HTML tables of "players who appeared for their clubs between 1946/47 and the end of the 2013/14 season and who have now left their clubs." Here is a list of ex-Manchester United players.

Stats included are: NAME, POS, SEASONS, SOURCE, TRANSFERRED TO, APPS, GOALS

2015 Women's World Cup Data

FIFA PDF files - includes unformatted data on participating teams, schedules and random statistics

world-cup-women on GitHub - plain text file list of teams and schedule

2015 Women's World Cup Wikipedia page - includes a great visual bracket view

2014 World Cup APIs

kimono labs 2014 World Cup API - has a very nice RESTful API available. Free registration is required to access the API. The API has player, team, club, matches and playerseasonstats endpoints. See the documentation and start making calls with the API explorer.

2014 World Cup JSON API - Soccer for Good - The API is available now. The author explains that the data comes from a scraper, so availability is not guaranteed, but it should be available throughout the tournament. There are endpoints for teams, matches, today, tomorrow and current. The Ruby on Rails source code is available on GitHub.

World Cup in JSON - an open source Ruby project available at github.com/estiens/worldcupjson that scrapes a few sources and combines them into an API. The API is available at http://worldcup.sfg.io/matches

Unofficial FIFA.com JSON API for Mobile Apps - This is unofficial and I wouldn't be surprised if it is protected/unavailable soon. Until then, it's nice to see data straight from the source. Known endpoints: matches, teams or detailed match info

Deprecated/Retired - "The Graveyard of APIs"

ESPN has an API for registered users (free). You can get lists of players and teams, including all the players in the EPL, but the data is otherwise very limited: all fixtures and scores are restricted to "strategic partners." The Public API is being retired on Monday, December 8, 2014. Read the announcement.

StatsFC used to have a RESTful JSON API of all EPL scores and fixtures. It was about $8 (US dollars) a month but was recently shut down. See their official statement. They still offer widgets and they plan on reviving their services. See their comments at the bottom of the page.

Other Reading / Resources

opendata.stackexchange forum

Are there any open datasets for Soccer statistics? - keep your eye on this open data forum for more answers.

Linked Soccer Data

Linked Soccer Data (pdf) is a white paper on one group's attempt to "create a dataset including reliable information about soccer events covering as many historical data as available including recent competition results." Some dead links but worth a skim.

Fantasy Data

Even more links to explore

You've made it this far. Why stop now?

Share your own sources -- What have I missed?

Please let me know about your own data sources or add a pull request on GitHub. I have mainly searched for EPL data and would love to add data from other leagues/competitions to the list.