The following charts show how various ranks of Summoners fared when thrown into custom games (Factions matches) containing uneven mixes of ranks, such as Platinum-Silver-Silver-Bronze-Bronze vs. Gold-Gold-Silver-Silver-Silver. Using this data, we have attempted to assess how much game-winning impact various ranks of players bring to such matches.
Preliminary results indicate that Bronze/Silver are fairly close together in terms of game-winning impact, as are Platinum/Diamond, with a significant jump from Bronze/Silver to Gold and then from Gold to Platinum/Diamond.
Note: I would not place much significance on the “Sub-30” and “Unranked” data, for various reasons. (For example, in most of the data, they aren’t distinguished from Bronze.) I’m including them above mainly because I know that people will want to see them.
Let’s say you arrange a match of 5 Silvers (SSSSS) versus 1 Gold and 4 Silvers (GSSSS). How fair is this match? If you held 10 such matches, how often would the GSSSS team beat the SSSSS team?
What about SSSSS vs. GSSSB? Which side has the advantage? Is the “upgrade” from Silver to Gold more or less significant in terms of game-winning impact than the upgrade from Bronze to Silver?
For that matter, how accurately does rank reflect actual game-winning impact in the first place? Will adding a typical Gold into an all-Silvers match completely unbalance it, or have little impact?
Proposed answers to these questions vary from “rank is meaningless” to “the lower ranks are all the same, but Plats and Diamonds are godly” to “every single rank gap is massively significant.” It’s about time, I think, to put these theories to the test.
Why We Care
I imagine that many LoL players have wondered just how much awesomer Diamonds are than Bronzes, and how the various ranks might fare in a mixed-rank brawl. We know the climb gets harder and harder as you move up, but are people dramatically improving as they advance, or just polishing ever finer details of their gameplay?
Moving beyond academic curiosity, though, this is of particular importance for the Factions game mode.
Factions matches are weighted based on relative skill levels
Factions is a community game mode based around faction versus faction matches, such as Noxus vs. Ionia, using faction-specific Champion lists. Matches are organized as custom games. Factions gain points from victories and lose points from defeats. Although major tournaments are mirror-balanced, the average Factions pick-up game is a varied mix: “Plat-Gold-Silver-Bronze-Bronze vs. Diamond-Gold-Gold-Silver-Bronze” would not be unusual. We have to some extent made a virtue of necessity, and the novel experience of a mixed-rank game (where you might be laning against a Bronze and then get ganked by a Diamond) has become a central aspect of the mode. However, this does raise a question of fairness. If Noxus beats Ionia, it should be because the Noxian Summoners did a better job of strategizing, or because they practiced together more, or because axes are just inherently superior to katanas — not because for whatever reason Noxus is full of Plats and Ionia is full of Silvers.
Factions uses an automated point-adjust formula to weight matches based on how even they are. A “mirrored” match is worth a base value of 5 points on Summoner’s Rift. As the difference in ranks between the teams increases, the value changes: predictable stomps (e.g. 5 Golds beat 5 Bronzes) are worth fewer points than normal, while upset victories (e.g. 5 Bronzes beat 5 Golds) are worth more points.
To balance matches, we must assess relative skill
The problem: we don’t really know how to weight the tiers appropriately. The point values Factions currently uses to weight match outcomes are based on a survey of perceptions — we essentially asked players to decide how many points each tier should be worth, if a Silver player is worth 100 points. This produces tidy numerical values, but they’re entirely based on player intuition. Generally speaking, the popular consensus is that the gaps between tiers get more significant as you go up in rank: e.g., the Platinum/Diamond gap is believed to be the largest.
We’re running experimental test matches to gather data on tier strength
We’ve recently begun a new round of balance testing. This involves playing non-Factions custom matches and reporting the results. (Non-Factions matches are better for pure balance testing in that they exclude the confounding variable of faction strength.) Although we only have a handful of match results so far, I hope to collect more during the intermission period between the end of Hextech Revolution and the start of the next arc.
This study represents only the very first steps toward applying empirical analysis to these questions. To move forward, we need more data, specifically more non-Factions data. You can contribute data very simply: just play a non-Factions balance testing match and make sure someone sends in the results. Instructions can be found here.
Here are some initial results, mostly using Factions data. The underlying data is available here.
Winrate by ranked tier
As a starting point, I’ve looked at the winrates for various ranked tiers. The methodology is explained further below.
This rudimentary analysis works as follows. Each match is treated as up to 5 “wins” and 5 “losses”, one for each player. First, we cancel out mirrored pairs, because we’re starting with an additive skill model: if both teams have 3 Silvers, we simply ignore those 6 players and focus on the players whose ranks differ. Then, we award a win or a loss to the rank of each remaining player.
For example, let’s say a team of 5 Silvers (SSSSS) beat a team of 3 Silvers, 1 Gold, and 1 Bronze (GSSSB). This would be counted as:
- 2 wins for Silver
- 1 loss for Gold
- 1 loss for Bronze
We divide the “blame” for the loss equally between Gold and Bronze. We could shift the blame toward the lower rank, but this would amount to reasserting our assumptions rather than testing them.
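The tallying described above can be sketched in a few lines of Python. This is only an illustration, not the actual spreadsheet logic: the function names `tally_match` and `winrates` are made up, and ranks are written as single letters (B, S, G, P, D) purely for convenience.

```python
from collections import Counter

def tally_match(winners, losers):
    """Cancel mirrored ranks, then credit the leftover ranks with wins and losses."""
    w, l = Counter(winners), Counter(losers)
    wins = w - l    # Counter subtraction drops zero and negative counts,
    losses = l - w  # so only the unmatched players on each side remain
    return wins, losses

def winrates(matches):
    """Aggregate per-rank winrates over (winners, losers) pairs."""
    total_wins, total_losses = Counter(), Counter()
    for winners, losers in matches:
        wins, losses = tally_match(winners, losers)
        total_wins += wins
        total_losses += losses
    ranks = set(total_wins) | set(total_losses)
    return {r: total_wins[r] / (total_wins[r] + total_losses[r]) for r in ranks}

# The example from the text: SSSSS beats GSSSB.
wins, losses = tally_match("SSSSS", "GSSSB")
print(dict(wins), dict(losses))  # {'S': 2} {'G': 1, 'B': 1}
```

Note that a fully mirrored match contributes nothing at all under this scheme, since every player cancels out.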
A second model takes a different approach: rather than simply awarding non-specific “win” or “loss” points to each rank, it views match data as supporting or contradicting specific claims. For example, if SSSSS beats GSSSB, that is recorded as:
- decreased support for the claim that “Golds are stronger than Silvers”
- increased support for the claim that “Silvers are stronger than Bronzes”
A “support factor” of 1.00 means the two tiers seem about the same strength.
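Here is one way such a support factor could be computed. To be clear, this is a guess at a reasonable definition rather than the formula behind the charts: it assumes the factor is the number of supporting observations divided by the number of contradicting ones, that each distinct pair of leftover tiers counts once per match, and that tiers are single letters ordered B < S < G < P < D. The names `claim_observations` and `support_factor` are hypothetical.

```python
import math
from collections import Counter

TIER_ORDER = "BSGPD"  # assumed ordering: Bronze < Silver < Gold < Platinum < Diamond

def claim_observations(winners, losers):
    """Yield (higher, lower, supported) for each leftover tier pair in one match."""
    w, l = Counter(winners), Counter(losers)
    for win_tier in (w - l):        # tiers left on the winning side after cancelling
        for lose_tier in (l - w):   # tiers left on the losing side after cancelling
            hi, lo = sorted((win_tier, lose_tier), key=TIER_ORDER.index, reverse=True)
            # The claim "hi is stronger than lo" is supported if hi was on the winning side.
            yield hi, lo, win_tier == hi

def support_factor(matches, higher, lower):
    """Assumed definition: supporting observations / contradicting observations."""
    support = contradict = 0
    for winners, losers in matches:
        for hi, lo, supported in claim_observations(winners, losers):
            if (hi, lo) == (higher, lower):
                if supported:
                    support += 1
                else:
                    contradict += 1
    if support == 0 and contradict == 0:
        return None  # no data on this claim
    return math.inf if contradict == 0 else support / contradict
```

Under this definition, one result for and one result against “Golds are stronger than Silvers” gives a factor of exactly 1.00, which matches the reading that the two tiers seem about the same strength.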
Rank distribution by faction
Right now, we only have the Factions dataset. Some factions end up stronger than others, whether due to an especially enthusiastic playerbase, a high level of intra-faction strategizing, or just a more powerful roster of Champions. To the extent that the rank distribution may vary from faction to faction, this could in turn skew the results.
This chart shows not the composition of sign-ups for each faction, but actual match representation: if a faction had an even blend of tiers, but for some reason only their Silvers ever played matches, this chart would show them at 100% Silver.
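For anyone who wants to reproduce this kind of breakdown, a minimal sketch follows. It assumes the match data has been flattened into one (faction, tier) record per player per match played, which is not necessarily how the underlying spreadsheet is laid out; the function name `representation` and the sample records are made up.

```python
from collections import Counter, defaultdict

def representation(appearances):
    """Percent of each faction's match appearances that came from each tier.

    `appearances` is an iterable of (faction, tier) pairs, one per player
    per match actually played (not one per sign-up).
    """
    counts = defaultdict(Counter)
    for faction, tier in appearances:
        counts[faction][tier] += 1
    return {
        faction: {tier: 100 * n / sum(tiers.values()) for tier, n in tiers.items()}
        for faction, tiers in counts.items()
    }

# Hypothetical records: only Silvers ever show up for this faction.
print(representation([("Zaun", "S"), ("Zaun", "S")]))  # {'Zaun': {'S': 100.0}}
```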
Notably, Piltover and Bandle City have more Bronzes playing for them than Zaun and Demacia do. Piltover has the most Diamond representation, and Zaun has the least. Given that Zaun is doing pretty well this arc, that might be the source of the low Diamond winrate: if Diamonds mostly end up on teams facing a strong Zaun, their losses would reflect faction strength rather than individual skill. (Of course, the causal direction could run the other way.)
These rank distributions are only for Hextech Revolution and Shon-Xan; I haven’t gotten around to adding in Mirrorwater yet.
Contribute by Playing Test Matches
Want to help? You can do so by playing matches. Factions matches certainly do add data, but if you really want to be helpful, you can also throw together a non-Factions match from time to time. Instructions are available here. Basically, join the balance chat and auto-join:
So if you feel like playing with some Factions friends with a full Champion roster, or showing off with your main, consider organizing a balance match or two.
I am thinking about applying another adjustment to the scoring algorithm before the end of the arc, but I’m concerned that changing the rules so close to the end might seem unfair. A compromise would be to leave previous scoring intact and only change the scoring going forward.
So, should we use a revised scoring system for future matches, while leaving previously scored matches alone? Or should we stick with the survey-based scoring system until the arc ends?
If we did change the scoring system, here’s how I would do it.
- I’d take the percent winrate for each tier and multiply it by 10. Thus, a winrate of 55.3% would become “553”.
- I would use these numbers as the new point values for each tier.
- Exception: I would arbitrarily set the value for Sub-30 Summoners as being somewhat below Bronzes.
- I’d update the skill adjust magnitude accordingly, so that for every 1 point of average skill difference between the teams, the value of the match goes up or down by 1%.
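As a rough sketch, the steps above might look like this in Python. The base value of 5 comes from the Summoner’s Rift rule described earlier; everything else is an assumption for illustration, including the function names, the choice to floor the match value at zero, and the tier values in the example, which are placeholders rather than real results.

```python
BASE_VALUE = 5.0  # points for a mirrored match on Summoner's Rift

def tier_value(winrate_percent):
    """Percent winrate times 10, e.g. 55.3 -> 553."""
    return round(winrate_percent * 10)

def match_value(winner_values, loser_values, base=BASE_VALUE):
    """Scale the base value by 1% per point of average skill difference.

    A positive gap means the stronger team won (a predictable result, worth
    less); a negative gap means an upset (worth more). The floor at zero is
    an assumption, not part of the proposal.
    """
    gap = sum(winner_values) / len(winner_values) - sum(loser_values) / len(loser_values)
    return base * max(0.0, 1 - gap / 100)

# Placeholder tier values, not real data.
favourites, underdogs = [510] * 5, [500] * 5
print(match_value(favourites, underdogs))  # favourites win: worth less than 5
print(match_value(underdogs, favourites))  # upset: worth more than 5
```

One thing the sketch makes obvious: with tier values on a 0 to 1000 scale, an average gap of 100 points is already enough to reduce a favourite’s win to zero, so the 1% magnitude may need tuning.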
Although come to think of it, the point adjust math should probably be a bit different; we don’t necessarily want to vary it 1:1 with winrate, but instead to make it so that, if there were no other variables (such as faction), we’d expect the risk times the reward to be the same for both sides.
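One way to read “risk times reward is the same for both sides” is as an expected-value condition: the chance of winning multiplied by the points at stake should come out equal for each team. The sketch below takes that reading as an assumption, along with the idea that we could estimate the favourite’s win probability from tier winrates; `balanced_stakes` is a made-up name.

```python
def balanced_stakes(p_favourite, base=5.0):
    """Return (points if the favourite wins, points if the underdog wins).

    Chosen so that p * favourite_points == (1 - p) * underdog_points,
    and so that an even match (p = 0.5) is worth `base` either way.
    """
    if not 0 < p_favourite < 1:
        raise ValueError("win probability must be strictly between 0 and 1")
    return 2 * base * (1 - p_favourite), 2 * base * p_favourite

fav_points, dog_points = balanced_stakes(0.75)
print(fav_points, dog_points)  # 2.5 7.5
```

With a 75% favourite, the favourite’s win is worth 2.5 points and the upset is worth 7.5, so 0.75 × 2.5 and 0.25 × 7.5 both come to 1.875.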
Alternatively, we could use a multiplicative method; among other things, this would reflect a theoretical belief that players amplify or dampen their teammates’ contributions, whereas the additive model views players as independently adding to or subtracting from their team’s total power.
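To make the difference between the two models concrete, here is a toy comparison. The numbers are invented for illustration only (Silver = 100 points or a factor of 1.0, Gold = 120 or 1.2, Bronze = 80 or 0.8), and the function names are hypothetical.

```python
import math

def additive_power(values):
    """Team power as the sum of independent player contributions."""
    return sum(values)

def multiplicative_power(factors):
    """Team power as the product of player factors, so players scale each other."""
    return math.prod(factors)

def upgrade_gain(power_fn, teammates, before, after):
    """How much team power changes when one player is swapped from `before` to `after`."""
    return power_fn(teammates + [after]) - power_fn(teammates + [before])

# Additive: upgrading a Silver to a Gold adds the same 20 points on any team.
print(upgrade_gain(additive_power, [120] * 4, 100, 120))
print(upgrade_gain(additive_power, [80] * 4, 100, 120))

# Multiplicative: the same upgrade is worth more alongside four Golds
# than alongside four Bronzes.
print(upgrade_gain(multiplicative_power, [1.2] * 4, 1.0, 1.2))
print(upgrade_gain(multiplicative_power, [0.8] * 4, 1.0, 1.2))
```

If the multiplicative picture is closer to the truth, cancelling out mirrored pairs before tallying would no longer be strictly valid, since the cancelled players would still be scaling everyone else’s contribution.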
This wouldn’t change how any previous matches have been scored, but would affect how future matches are scored.
Thoughts? Join in at one of the discussion threads:
- GD Thread: How much does rank matter? Initial experimental results.
- r/lol Thread: Initial results of rank-impact analysis
- r/leaguefactions Thread: Should we rework the scoring system?
And again, please do consider playing a balance match or two to help us build a better dataset.
nice analysis :)
Thanks! Also just updated it.