Evaluating Features in Classification of LoL Rank


For this project, I compared the effectiveness of various feature sets from League of Legends (LoL) game data in classification tasks. Data for individual LoL matches were scraped from the North American match history servers. A decision tree, a k-nearest-neighbors model, and a multi-layer perceptron neural network were each configured and tested on their ability to identify player ranks from game data. I used 13 different feature sets with each classifier and compared the results. The neural network always outperformed the other two models, and the best feature set combined creep score intervals, gold-earned intervals, and vision ward placement/destruction. The worst feature set was the losing team’s KDA by itself.


League of Legends, created by Riot Games, is the most popular online game in the world, with over 100 million active players each month (2016) and 27 million active players daily (2014) [2][3]. It pits two teams of five players against each other in a fixed arena. Every match has exactly the same rules and possibilities, so the outcome depends mostly on the skill and strategy of each team. To capture this, League of Legends has instituted a 6-tier competitive ranking system that reflects the general skill of the players at each rank.

In my project, I tested the effectiveness of common artificial intelligence models at identifying these ranks when presented with specific matches’ statistics and events. I chose to compare a decision tree, a k-nearest-neighbors model, and a multi-layer perceptron neural network. Their effectiveness was evaluated by measuring how close each identification attempt came to the actual rank.


A decision tree is an AI model that classifies a target by making a series of logical evaluations, with each evaluation branching from the previous result. The height and complexity of this tree structure grow during training, and each decision function tests characteristics and qualities of the input data [1].
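As a minimal illustration (not the project’s actual configuration), scikit-learn’s DecisionTreeClassifier learns such branching evaluations from labeled examples. The feature values and rank labels here are made up:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy feature vectors: [total kills, total gold (thousands)], with made-up rank labels.
X = [[20, 40], [25, 48], [45, 70], [50, 75]]
y = [1, 1, 5, 5]  # hypothetical numeric rank labels

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)
print(tree.predict([[48, 72]]))  # follows the branch learned from the high-rank examples
```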

A k-nearest-neighbors model plots the training data in ℝn, with one dimension for each entry in the input vector. To classify a target, it is plotted in the same space and a distance function finds the k nearest training points (the nearest neighbors). Using multiple neighbors prevents classification from being skewed by outlying data and anomalies. K is usually odd, which prevents a “tie” between binary classes and can help prevent a tie in more complex classifiers if the classes are mostly isolated [1].
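A sketch of this idea with scikit-learn’s KNeighborsClassifier, using an odd k on toy 2-D points (the data here are illustrative, not match data):

```python
from sklearn.neighbors import KNeighborsClassifier

# Five toy points in R^2; an odd k (3) avoids a tie between the two classes.
X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5]]
y = [0, 0, 0, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
print(knn.predict([[5, 6]]))  # two of the three nearest neighbors are class 1
```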

A multi-layer perceptron neural network uses a data structure called a perceptron that is modeled after a human neuron to evaluate input data. These perceptrons are organized in layers, where inputs enter each perceptron function and an output is passed to the perceptrons of the next layer. Each input is weighted, and the learning algorithm adjusts these weights during training. The last layer of perceptrons is responsible for returning the classification [1].
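A corresponding sketch with scikit-learn’s MLPClassifier, where `hidden_layer_sizes` sets the layers of perceptrons between input and output (the clusters and layer size below are arbitrary, not the project’s tuned configuration):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Two well-separated toy clusters standing in for "low rank" and "high rank" matches.
X = np.vstack([rng.normal(0, 0.5, (20, 3)), rng.normal(5, 0.5, (20, 3))])
y = np.array([0] * 20 + [1] * 20)

# One hidden layer of 8 perceptrons; weights are adjusted during fit().
mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
mlp.fit(X, y)
print(mlp.predict([[5, 5, 5], [0, 0, 0]]))
```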

Since each of these classifiers accepts a vector of data as its evaluation target, I selected a set of features from a typical League of Legends match and arranged them into a matrix.


To collect the data for the project, I wrote a scraping script using the Requests Python library and a Riot Games API key. I had to conform to the rate limits imposed by Riot, so managing the timing of requests was important. The script would begin with a certain player and traverse their match history. To avoid data skewed by the patterns of a particular player, the script would switch to a new player after 10 matches had been collected, or at random. A list of saved match IDs was held in memory to avoid wasting precious API calls. The scraping tended to gravitate toward the middle ranks over time, so I added an exception to the 10-matches-per-player rule for the highest ranks in order to get an adequate sample. I also restarted the scraping process with 10 new player seeds at both the lowest and highest ranks for increased variety. Ultimately, I collected approximately 6,500 match records.
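The traversal bookkeeping (deduplicating match IDs in memory and hopping to a new player after 10 matches or at random) can be sketched as below. The real Riot API endpoints, key handling, and rate-limit pacing are omitted; `fetch_match_ids` and `fetch_participants` are hypothetical stand-ins for those calls:

```python
import random
import time

MATCHES_PER_PLAYER = 10  # switch players after this many matches (assumed constant)

def crawl(seed_players, fetch_match_ids, fetch_participants, limit=100, switch_prob=0.05):
    """Sketch of the traversal: walk match histories, dedupe match IDs in memory,
    and hop to a new player after 10 matches (or at random with switch_prob)."""
    seen = set()               # saved match IDs, kept to avoid wasting API calls
    queue = list(seed_players)
    collected = []
    while queue and len(collected) < limit:
        player = queue.pop(0)
        taken = 0
        for match_id in fetch_match_ids(player):
            if match_id in seen:
                continue
            seen.add(match_id)
            collected.append(match_id)
            queue.extend(fetch_participants(match_id))  # candidates for later hops
            taken += 1
            time.sleep(0)  # placeholder for real rate-limit pacing
            if taken >= MATCHES_PER_PLAYER or random.random() < switch_prob:
                break  # move on to another player
    return collected
```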

The match data are in individual JSON files, so I had to parse them into arrays compatible with scikit-learn. The NumPy library provided C-like arrays that were well suited to holding match features. From the available match stats, I calculated the following features:

Total minions killed
Total gold collected
Winning team’s longest killing spree
Kills per team
Assists per team
Deaths per team
Ratio of magic damage to physical damage per team
Number of vision wards placed per team
Number of vision wards destroyed per team
Total baron kills
Total dragon kills
Match length
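Parsing one match into a fixed-length NumPy feature vector might look like the sketch below. The key names here are illustrative placeholders, not the actual Riot API schema:

```python
import numpy as np

def match_to_features(match):
    """Flatten one parsed match dict into a fixed-length feature vector.
    Field names are assumptions, not the real Riot JSON keys."""
    teams = match["teams"]  # assumed: a list of two per-team stat dicts
    row = [
        sum(t["minions_killed"] for t in teams),   # total minions killed
        sum(t["gold"] for t in teams),             # total gold collected
        max(t["longest_spree"] for t in teams),    # winning team's longest spree
    ]
    for t in teams:  # per-team features
        row += [t["kills"], t["assists"], t["deaths"],
                t["magic_damage"] / t["physical_damage"],
                t["wards_placed"], t["wards_destroyed"]]
    row += [match["baron_kills"], match["dragon_kills"], match["duration"]]
    return np.array(row, dtype=float)
```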

After organizing the features, I had to assign a rank to each match, because the overall rank is not included in the JSON data. To do this, I assigned numeric values to the 6 unique ranks and took the average rank of the 10 players. Once each match’s features were labeled with a rank, I split the data into training and testing sets in a 2:1 ratio. The training and test data had to be scaled in order to work with the neural network, so I used the scikit-learn scaler tool to do so.
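A sketch of the labeling, 2:1 split, and scaling with scikit-learn. The tier names and numeric values in the mapping are assumptions (the project’s exact grouping of the 6 ranks is not spelled out), and the feature matrix here is a random placeholder:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumed numeric mapping for the 6 unique ranks.
RANKS = {"Bronze": 1, "Silver": 2, "Gold": 3, "Platinum": 4, "Diamond": 5, "Master": 6}

def match_rank(player_ranks):
    """Average rank of the 10 players in a match, via the numeric mapping."""
    return float(np.mean([RANKS[r] for r in player_ranks]))

# Placeholder data: 30 matches with 5 features each.
X = np.random.rand(30, 5)
y = np.random.randint(1, 7, 30)

# 2:1 train/test split, then scale features for the neural network.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)
scaler = StandardScaler().fit(X_train)  # fit on training data only
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)
```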

To implement each of the classification models, I used the scikit-learn library. It provided a simple, well-documented solution for training a decision tree, a k-nearest-neighbors model, and a neural network. However, its included accuracy report only presented the percentage of perfect classifications and did not take into consideration “how wrong” the incorrect classifications were. To remedy this, I wrote a distance function that calculated how far each prediction on the test set was from the correct answer (using the same numeric system used to average the ranks).
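Such a distance function could be as simple as the mean absolute difference between the predicted and true numeric ranks, so an off-by-one-tier prediction counts less than an off-by-three one (a sketch, not necessarily the exact function used):

```python
import numpy as np

def mean_rank_distance(y_true, y_pred):
    """Average absolute distance between predicted and true numeric ranks."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# e.g. predicting ranks [3, 3, 5] against true ranks [3, 4, 1]:
print(mean_rank_distance([3, 4, 1], [3, 3, 5]))  # (0 + 1 + 4) / 3
```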


I tested the three classifiers on the listed feature sets, each of which included the match length feature, and reported the average classification distance. The neural network consistently performed better than the k-nearest-neighbors model, which was in turn consistently more accurate than the decision tree.


The best feature set for all three classifiers was the combination of creep score intervals, gold-earned intervals, and vision ward placement/destruction. This combination yielded a neural network distance of 0.509. The worst feature set was the losing team’s KDA, with a decision tree distance of 1.377.

Future work for this project would include continued collection of match data, testing a variety of neural network configurations, and using folds in the testing and training sets. Because of the bell-curve rank distribution, the highest ranks were harder to collect samples for and may have been underrepresented, with only 123 Master/Challenger samples. Different neural network configurations may have been better suited to individual feature sets, but I used a single configuration that I determined to be best when working with the “Everything” feature set.
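The folds idea from the future work above could look like the following k-fold cross-validation sketch, in which every match serves in a test set exactly once instead of relying on a single 2:1 split (placeholder data, decision tree shown only as an example model):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

X = np.random.rand(60, 4)          # placeholder feature matrix
y = np.random.randint(1, 7, 60)    # placeholder rank labels

# 3-fold cross-validation: each fold is held out once as the test set,
# giving a more stable estimate than a single train/test split.
scores = []
for train_idx, test_idx in KFold(n_splits=3, shuffle=True, random_state=0).split(X):
    model = DecisionTreeClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))
print(len(scores))  # one score per fold
```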


[1] Russell, S., & Norvig, P. (2010). Artificial Intelligence: A Modern Approach. Prentice Hall.

[2] Sherr, I. (2014). Player Tally for ‘League of Legends’ Surges. The Wall Street Journal. https://blogs.wsj.com/digits/2014/01/27/player-tally-for-league-of-legends-surges/.

[3] Riot Games. Our Games. https://www.riotgames.com/our-games.
