Context

League of Legends is a popular global online game played by millions of players every month. Over the past few years, the League of Legends e-sports industry has shown phenomenal growth: in 2020, the World Championship finals drew 3.8 million peak viewers! While e-sports still lags behind traditional sports in popularity and viewership, it has grown exponentially in certain regions with fast-growing economies, such as Vietnam and China, making it a prime sponsorship target for foreign companies looking to spread brand awareness in these regions.

While the e-sports data industry is also growing gradually, little published analysis of individual games is publicly available. This may be because the games change quickly compared to traditional sports: the developers frequently and arbitrarily change rules and game stats. Nevertheless, it is an interesting field for fun research, which explains the many pet projects and graduate-level papers dedicated to it.

All existing League of Legends games (except custom games, which include games from competitions) are available through Riot's API. However, requesting and parsing the data for every single relevant game is quite tedious; this dataset intends to save you that work. To make things (hopefully) easier, I parsed all JSON files returned by the Riot API into CSV files, with each row corresponding to one game.

Components

This dataset consists of three parts: root games, root2tail, and tail games.

I found that when trying to predict the outcome of a match before it is played, a player's historical matches prior to that game are often an important factor (Hall, 2017). For this purpose, root games contains 1087 games from which tail games branches out.

Tail games contains the historical matches of each player for every game in root games. For each player in a root game, root2tail maps the player's account ID and the ID of the champion that player controlled to a list of matches that can be found in tail games.

To simplify the explanation, if you want to access the historical matches of a player in the root games file:

1. Get the player's account ID and the game ID.
2. Load the root2tail file.
3. Query for the row matching the account ID and game ID.
4. The matching row contains a list of game IDs that can be queried in the tail_games files.
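
The lookup steps can be sketched in Python roughly as follows. This is only a minimal sketch using the standard library: the column names (`accountId`, `gameId`, `tail_game_ids`), the assumption that the list of game IDs is stored as text such as `"[11, 12]"`, and the tiny in-memory CSV data are all hypothetical, so check the header rows of the real files before adapting it.

```python
import ast
import csv
import io

# Hypothetical column names; check the header row of the real files.
ACCOUNT_COL = "accountId"
GAME_COL = "gameId"
TAIL_LIST_COL = "tail_game_ids"


def find_tail_game_ids(root2tail_rows, account_id, game_id):
    """Return the list of tail game IDs for one player in one root game."""
    for row in root2tail_rows:
        if row[ACCOUNT_COL] == account_id and row[GAME_COL] == str(game_id):
            # Assumes the list is stored as text such as "[11, 12]".
            return ast.literal_eval(row[TAIL_LIST_COL])
    return []  # no matching row found


def select_tail_games(tail_rows, tail_ids):
    """Keep only the tail game rows whose game ID is in tail_ids."""
    wanted = {str(i) for i in tail_ids}
    return [row for row in tail_rows if row[GAME_COL] in wanted]


# Tiny in-memory stand-ins for the root2tail and tail_games CSV files.
root2tail_csv = io.StringIO(
    'accountId,gameId,tail_game_ids\n'
    'abc,100,"[11, 12]"\n'
    'xyz,100,"[]"\n'
)
tail_games_csv = io.StringIO(
    "gameId,gameDuration\n"
    "11,1500\n"
    "12,1800\n"
    "13,2100\n"
)

root2tail_rows = list(csv.DictReader(root2tail_csv))
tail_rows = list(csv.DictReader(tail_games_csv))

tail_ids = find_tail_game_ids(root2tail_rows, "abc", 100)
history = select_tail_games(tail_rows, tail_ids)
print(tail_ids)
print([row[GAME_COL] for row in history])
```

With the real files, the `io.StringIO` objects would be replaced by opened CSV files. The helper returns an empty list both when no row matches and when the stored list itself is empty.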

Note that root2tail records the most recent 5 matches, or the matches played within the past 5 weeks, prior to the game creation date of the corresponding "root game". It also records only games the player played with the same champion they played in the "root game". An empty list means the player had not played a single match with that champion within the past 5 weeks.

Content

How was this data collected?

On December 5th, 2020, I fetched the list of current players in Challenger tier, then recursively gathered the historical matches of those players to build root games. December 5th, 2020 is therefore the data collection date.

What do the rows and columns of the csv data represent?

Root2tail is self-explanatory. As for the other files, each row represents a single game. The columns are quite confusing, however, as they are a flattened version of a JSON file with nested lists of dictionaries.

I tried to think of a way to make the columns comprehensible, but looking at the original JSON file is most likely the simplest way to understand the structure. Use tools like https://jsonformatter.curiousconcept.com/ to inspect the dummy_league_match.json file.
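
To get a feel for how nested lists of dictionaries turn into flat column names, here is a minimal Python sketch. The `flatten` helper and the dotted naming scheme are illustrative assumptions, not necessarily the scheme used to produce the CSV files, and the sample is a made-up, heavily trimmed stand-in for a match JSON file.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict keyed by dotted paths."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, path))
    elif isinstance(value, list):
        # List items are addressed by their position in the list.
        for index, child in enumerate(value):
            path = f"{prefix}.{index}" if prefix else str(index)
            flat.update(flatten(child, path))
    else:
        flat[prefix] = value  # leaf value: number, string, bool or null
    return flat


# Made-up sample; a real match file has many more fields.
sample = json.loads("""
{
  "gameId": 100,
  "participants": [
    {"participantId": 1, "stats": {"kills": 3, "win": true}},
    {"participantId": 2, "stats": {"kills": 5, "win": false}}
  ]
}
""")

columns = flatten(sample)
for name in columns:
    print(name)
```

To try this on the real structure, the sample could be replaced with the parsed contents of dummy_league_match.json, loaded with `json.load`.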

A very simple explanation: participant.stats._ and participant.timeline._ contain pretty much all match-related statistics of a player during the game.

Also, note that the "accountId" fields use encrypted account IDs which are specific to my API key. If you want to do additional research using player account IDs, you should fetch the match file first and get your own list of player account IDs.

Acknowledgements

The following are great resources I got a lot of help from:

These two actually explain everything you need to get started on your own project with Riot API.

The following are links to related projects that may help you get ideas!

Kim, Seouk Jun, https://towardsdatascience.com/discussing-the-champion-specific-player-win-rate-factor-in-league-of-legends-match-prediction-3d83d7e50a94 (2020)
Huang, Thomas, Kim, David, and Leung, Gregory, https://thomasythuang.github.io/League-Predictor/ (2015)
Jiang, Jinhang, https://towardsdatascience.com/lol-match-prediction-using-early-laning-phase-data-machine-learning-4c13c12852fa (2020)
Hall, Kenneth T., “Deep Learning for League of Legends Match Prediction”, https://github.com/minihat/LoL-Match-Prediction (2017)

Complete In-Depth Dataset for League of Legends

49.12MB