Help Us Improve Factions Match Balancing

UPDATE: See this newer post.

 

We’re going to be running some test matches to gather data to improve our balancing system. Any help you can provide would be greatly appreciated. Read on for more.

If you want to volunteer as a lab rat, join the appropriate in-client chat for your rank (higher of S3 and S4 solo queue SR rank): e.g. “BRONZE”, “PLATINUM”, “SILVER”. We need lots of Silvers in particular.

Set the chat to auto-join by clicking the gear icon in the upper-right corner of the chatbox and selecting “auto-join”. Then you can just forget about it; you’ll eventually get a custom game invite or two. People who are setting up these test matches will pull from those chats once they reach sufficient size. You can also ask around in the general Factions chat.

Even better: If you’re willing to run some of these matches, read on for instructions.

Also, if you haven’t yet, you can take this poll to share your intuitions about the size of the tier gaps.

Primary Objective: Weight Mixed-Tier Matches Appropriately

Factions does not currently have the benefit of the automated Matchmaker system. Thus, unlike normal LoL games, Factions matches are usually not razor-balanced 50/50 win/loss matchups. We do encourage people to take the time to at least roughly balance the teams in pick-up matches, and we try to be strict about that for tournaments and featured matches, but it’s safe to say that one side has the skill advantage going into any particular Factions match.

In some ways, I think this has become a strength of Factions: you get a chance to play with and against Summoners of a broad range of skill levels, whereas normally in League you play 99% of games with people in the same narrow MMR band.

However, to make things fair when scoring matches, we do need to take skill gaps into account. This requires some non-trivial analysis of a few key questions:

  • What are the relative “strength values” of various Tiers? If a typical Silver brings “3 points” of game-winning power, what’s the number for a Gold? A Bronze? A Diamond?
  • How do you combine these strength values to generate a number for an entire team? Is it an average? Or more of a multiplication? Something more complex that factors in “outliers”?
  • Ultimate question: what is the best formula for computing how slanted a match is, given the Tiers of Summoners making up each team?

When scoring Factions matches, we want to apply appropriate weighting. A predictable stomp (5 Diamonds beating 5 Silvers) should be worth few points, while an impressive upset victory (5 Bronzes beating 5 Golds) should be worth more points.
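
To make the idea concrete, here is a minimal sketch of one possible weighting. It assumes we already have an estimate of the winning team's pre-match win probability, which is exactly what we don't have yet. The function name `match_points`, the `base_points` value, and the linear formula are all illustrative assumptions, not the scoring rule Factions uses.

```python
def match_points(winner_win_prob, base_points=10.0):
    """Hypothetical weighting: the more the winner was favored, the fewer points.

    winner_win_prob is the estimated pre-match chance (0..1, exclusive) that
    the team which actually won would win. An even match (0.5) is worth
    exactly base_points.
    """
    if not 0.0 < winner_win_prob < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return base_points * 2.0 * (1.0 - winner_win_prob)


# A heavy favorite winning earns little; an underdog winning earns a lot.
stomp = match_points(0.9)   # favored team won: about 2 points
upset = match_points(0.1)   # underdog won: about 18 points
even = match_points(0.5)    # evenly matched: base_points
```

Any formula with this shape would do the job. The hard part is estimating the win probability from the Tiers on each team in the first place.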

The Status Quo

We’ve done our best to jerry-rig a rudimentary point-weighting system for Factions. For reference, I’ll briefly describe our current approach. We model team skill as an additive average, using the following point values:

  • Bronze/Unranked: 1
  • Silver: 3
  • Gold: 5
  • Platinum: 8
  • Diamond: 12

So, for example, the following teams have the following averages:

  • Gold, Gold, Gold, Silver, Silver: 4.2
  • Diamond, Plat, Bronze, Bronze, Bronze: 4.6
  • Diamond, Diamond, Diamond, Silver, Silver: 8.4
  • Gold, Gold, Bronze, Bronze, Bronze: 2.6
  • Plat, Silver, Bronze, Bronze, Bronze: 2.8
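
For anyone who wants to check a team by hand, the additive-average model can be sketched in a few lines of Python. The point values are the ones listed above; the names `TIER_POINTS` and `team_average` are just illustrative.

```python
# Point values from the current model (Bronze also covers Unranked).
TIER_POINTS = {
    "Bronze": 1,
    "Silver": 3,
    "Gold": 5,
    "Platinum": 8,
    "Diamond": 12,
}


def team_average(tiers):
    """Return the mean point value of a team, given its members' Tier names."""
    if not tiers:
        raise ValueError("team must have at least one member")
    return sum(TIER_POINTS[t] for t in tiers) / len(tiers)


print(team_average(["Gold", "Gold", "Gold", "Silver", "Silver"]))    # 4.2
print(team_average(["Gold", "Gold", "Bronze", "Bronze", "Bronze"]))  # 2.6
```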

Though this produces nice, tidy numbers with decimal points, it’s really just guesswork. For example, this model presumes that upgrading a Plat to a Diamond (+4) is a bigger boost for a team than upgrading a Bronze to a Silver (+2). I based that on the non-linear distribution of Elo ratings, which is indirect evidence at best of actual game impact. Many Summoners have remarked to me that they have a different intuition: they think the low-Tier upgrades are worth more.

I think we’re going to need to perform some empirical investigation to move beyond guesswork and intuition here. So, after some thought, I’ve designed a first round of experiments to help us get a rudimentary idea of what might be going on here.

Initial Experiment Design

I want to run some matches and try to get a feel for the relative power gaps between the Tiers in mixed-Tier matches. To do so, I invite Summoners to run matches with the following parameters and submit the results. Apologies for the nitpicky details: obviously, I appreciate any help you can provide, but I really want to make sure we have good, clean data, so I’m being rather strict with these parameters.

We’re going to start with an assessment of the relative gaps between the Tiers, using Silver as a baseline. We’ll do this with large numbers of trial matches with the following configurations:

  • Team A: Silver, Silver, Silver, Silver, Silver
  • Team B: ____, ____, Silver, Silver, Silver
    • Fill in the blanks with a pair of Summoners from the same non-Silver Tier: Bronze, Gold, Platinum, or Diamond.

The key statistics we’ll be tracking will be the win rates of the various Team B arrangements against the baseline all-Silver team. For example, if 2 Golds and 3 Silvers beat 5 Silvers 70% of the time, then there should be a large point difference between Golds and Silvers when weighting matches.
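
This is also why we need large numbers of matches. As a rough sketch, here is one standard way to put error bars on an observed win rate: the Wilson score interval. It is my choice for illustration, not something the experiment design prescribes, and the function name `win_rate_interval` is hypothetical.

```python
import math


def win_rate_interval(wins, games, z=1.96):
    """Return (win_rate, low, high) using a Wilson score interval.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if games <= 0 or not 0 <= wins <= games:
        raise ValueError("need games > 0 and 0 <= wins <= games")
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return p, centre - half, centre + half


# Same 70% win rate, very different certainty.
few = win_rate_interval(7, 10)
many = win_rate_interval(70, 100)
```

With 7 wins in 10 games the interval runs from roughly 40% to 89%, so it still includes an even 50/50 matchup. With 70 wins in 100 games it narrows to roughly 60% to 78%.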

Instructions

  • Bump this thread to help recruit Silvers. Check the SILVER in-client chat for them.
  • Here’s a list of people who have volunteered as balance testers.
  • Matches are to be Tournament Draft. Use up to 5 minutes of pause time per side if someone d/cs or has similar problems.
  • Use the full (100+) Champion roster, not Factions rosters. (This removes the potentially troublesome variable of faction strength, albeit at the cost of not being able to examine specifically how Tiers perform in the Factions context, a possible subject for future study.)
  • Use bans as normal (as, e.g., in a Ranked game).
  • Play your best: these matches are imba by design, and we need precise measurements of how much imbalance different Tiers create when added to the mix. Although your Elo isn’t at stake, the data will be compromised if people derp around with joke builds or whatnot. Don’t go easy on the other team if you’re the higher-Tier team: the goal here is not to have fair matches but to quantitatively measure the degree of imbalance resulting from different Tier variations. To do that, we need people to cast aside their weighted training clothing, eat any Senzu Beans they have, and fight at their true power level. Pretend you’re playing Ranked. Play to win. (Of course, you should still be sportsmanlike and friendly in chat and such.)
  • Each Silver player should toss a coin to decide which side they’re on for each match. (Obviously, if one side fills up, any new entrants just go on the remaining side.)
  • Use the following Tier arrangements:
    • One team should be all-Silver.
    • One team should be 3 Silvers plus a pair of Summoners from another Tier. (Both should be from the same Tier, e.g. 2 Bronzes or 2 Diamonds.)
    • Use a coin toss to decide whether the all-Silver team is Blue Side or Purple Side.
  • Tier is defined as the higher of end-of-S3 solo queue SR rank and current S4 solo queue SR rank. In other words: use LoL Nexus and take the higher of the two ranks that appear.
    • op.gg can also show you at a glance a Summoner’s S3 and S4 Tiers.
    • Bronze can mean Bronze or Unranked. That said, for these experimental matches, we prefer “actual” Bronzes, and want to stay away from people who, e.g., were Plat in S2 but haven’t played Ranked since then.
    • Example: FrozhanKhetos was Plat in S3 and is now Gold. For our purposes, they are a Plat.
  • Verify Tiers before the game starts.
  • The match organizer is also responsible for submitting the results.
  • The match organizer should open up LoL Nexus while the game loads and screenshot that.
  • Submit results here at the end.

An example:

  • Blue Side: Silver, Silver, Silver, Silver, Silver
  • Purple Side: Plat, Plat, Silver, Silver, Silver

One side should always be all-Silver. The other side will have a pair of Summoners from another Tier.
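
If you’re organizing matches and want a quick sanity check, the rules above can be sketched roughly as follows. Every function name here is made up for illustration, and the code assumes you have already looked up each Summoner’s S3 and S4 Tiers yourself (it does not talk to LoL Nexus or op.gg).

```python
import random

TIER_ORDER = ["Bronze", "Silver", "Gold", "Platinum", "Diamond"]


def effective_tier(s3_tier, s4_tier):
    """Higher of end-of-S3 and current S4 solo queue Tier.

    None stands for Unranked, which is treated as Bronze.
    """
    ranks = [TIER_ORDER.index(t or "Bronze") for t in (s3_tier, s4_tier)]
    return TIER_ORDER[max(ranks)]


def is_valid_test_match(team_a, team_b):
    """True if one team is all-Silver and the other is 3 Silvers plus a
    pair of Summoners from the same non-Silver Tier."""
    def all_silver(team):
        return len(team) == 5 and all(t == "Silver" for t in team)

    def mixed(team):
        others = [t for t in team if t != "Silver"]
        return len(team) == 5 and len(others) == 2 and others[0] == others[1]

    return (all_silver(team_a) and mixed(team_b)) or (
        all_silver(team_b) and mixed(team_a)
    )


def pick_all_silver_side(rng=random):
    """Coin toss for which side the all-Silver team plays on."""
    return rng.choice(["Blue", "Purple"])


# Plat in S3, Gold now: counts as Platinum for these tests.
tier = effective_tier("Platinum", "Gold")
ok = is_valid_test_match(
    ["Silver"] * 5,
    ["Platinum", "Platinum", "Silver", "Silver", "Silver"],
)
```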

You can copy and paste the following text into Factions chat to help you gather players (edited as appropriate for the specific Tiers you want to test):

Running a balance test match! We need lots of Silvers and two Golds. (Tier is higher of S3 and current rank, solo queue SR.) If you’re available, join the appropriate chat room: SILVER or GOLD.

Why This Design?

I thought a lot about this, and I think the best first step is to try to suss out where the various Tiers stand in relation to Silver in a mixed-Tier match. Our initial results from this round of tests will help us figure out where to go next.

Rewards

This data will be very, very useful to us, and will help us make Factions more fair in future arcs. People who organize lots of these matches will be commended and may receive some kind of reward from CupcakeTrap as a token of appreciation.

Other Issues

There are many, many more issues we’re planning to investigate to improve balance. While this is only a rudimentary first step, the move from guesswork and intuition to empirical testing is a crucial one.

—CupcakeTrap

Caitlyn, you monster.

9 comments on “Help Us Improve Factions Match Balancing”
  1. […] you’d like to help run a match, visit this post and read the “Instructions” section. That section also contains a link to the match […]

  2. […] a primary objective as we head into the next arc. To that end, we’re performing some balance testing experiments designed to generate empirical evidence on the appropriate point-weighting of the tiers. We need […]

  3. Nikolai says:

    I’m glad they actually listened or paid attention to my endless raging last weekend. xD

    • CupcakeTrap says:

      Ha! No problem. :)

      This has actually been on my mind for a while now. As always, there are lots of things to work on with Factions, and we’re rotating through them as best we can. Right now, though, I’d probably put balance at the very top of the list. I want to get something better than the current system in place ASAP.

      A secondary issue: we’re thinking about modifying the Featured Match format from a full “rainbow” (DPGSB) to something narrower. For example, we might try GSSSB for the first match of the set, DPGGS for the second, and SSSSS for the third, or something. (Those are just random examples.) Or maybe we’ll start running more monochromatic matches as Featured Matches, e.g., SSSSS followed by GGGGG followed by PPPPP or BBBBB or DDDDD.

  4. Dan says:

    What do you mean by “If you’re available, join the appropriate chat room: SILVER or GOLD.”?
    Do you mean on League?
