Friday, 16 August 2019

Beginner balance versus pro balance

Game balance is often approached from the angle of the pro-gamer: how strong are things when used by a skilled player? However, the balance as it's experienced by beginners is equally important, since a large portion of the playerbase will never reach pro skill levels but will still want to have a fun experience. Today I'd like to discuss three different approaches we've used in our games Awesomenauts and Swords & Soldiers II to make the gameplay fun for beginners but also balanced for experienced players.

The big challenge here is that in a complex game with varied characters/weapons/factions/etc. it's nearly impossible to achieve perfect balance under all circumstances. Balance is influenced by almost everything and if you want variation, then that variation is undoubtedly going to upset the balance. For example, if some characters are faster than others, then having maps of different sizes can greatly change the balance on one map compared to another. The alternative is to make all maps the same size, but that's boring. For this reason balance is a moving target that you're constantly trying to get closer to but a certain amount of imbalance is (grudgingly) accepted in almost every game.



To make a competitive game fun for players of all skill levels, the ideal situation is to make it balanced for pro players, intermediate players and beginners alike. An example of striving for this goal are the changes we've made to the Awesomenauts character Gnaw. At some point Gnaw was heavily overpowered for beginning players, but mediocre at best for pro players. That meant that if we had nerfed (weakened) Gnaw for the sake of the beginner experience, Gnaw would have become totally useless for pro players. If simple nerfing is not an option, then what to do instead?

When a character is overpowered specifically in matches with beginners, this is often either because the character is too easy to play well, or because the counters are too difficult to figure out and perform. The nice thing is that if we can change the difficulty, it won't matter much for pro players: if the character becomes harder to play, pros will still master it. And if the character becomes easier to counter, that also doesn't matter for pro players: their opponents already mastered the counters anyway. Realising this doesn't make fixing the balance easy, but at least it gives us a starting point for where to look for a fix.



To modify Gnaw's difficulty for beginners, our designers made a number of changes over the years. Most of these revolved around making it more work to be effective with Gnaw. For example, initially Gnaw's Weedlings (little creatures Gnaw can leave behind to attack enemies) would live forever. One of the changes we made was that Weedlings would die after a while, requiring Gnaw to place them again. Weedlings were also changed to start weaker and become stronger over time. The result of these changes is that the player needs to be a lot more active to keep their Weedlings in the right places. Also, if the enemy has destroyed the Weedlings, Gnaw can't just replace them with equally strong ones right away.



Of course these changes also influenced the balance for pro players, but combined with some further tweaks we managed to keep Gnaw about equally strong for pro players while making him harder to play well for beginning players. Hence the pro balance remained similar while the beginner balance was improved.

Another Awesomenauts character with a similar problem was Ayla. Here too we made changes to make her harder to play well and easier to counter. While our designers did manage to improve Ayla's balance for beginning players, she remained problematic to counter for beginners.

Therefore when going free-to-play we employed a different tactic. In the free-to-play version of Awesomenauts, characters need to be unlocked with Awesomepoints, which the player can collect by playing the game. Characters have different prices. For example, some characters are difficult to play so we made them expensive to keep beginners from unlocking them right away. Since we had a problem with Ayla not being so much fun to play against for beginners, we chose to make Ayla expensive as well. This way few beginners will have Ayla and thus few beginners will encounter other beginners who are playing Ayla. This feels like a crude solution, but sometimes crude is the best one can do.



An even cruder solution that might be considered: if a character is too damaging for the beginner experience, then maybe it's worthwhile to nerf the character to the point where it's okay for beginners, even though this makes the character no longer viable for pro play. An important thing to realise here is that with a cast of dozens of characters, a single character not being viable in pro play doesn't make that big of a difference since there are so many other options. However, a single character being overpowered for beginners means that lots of beginners will play this character and it will ruin many matches.



The third example of fixing beginner balance that I'd like to discuss today comes from our real-time strategy game Swords & Soldiers 2. This game has 3 wildly different factions (Vikings, Persians and Demons), which gives enough headaches in terms of balance already. However, on top of that there is also a tactic that may be balanced, but that's just not fun for beginning players: rushing.

Experienced Swords & Soldiers 2 players can have a lot of fun harassing each other's economy as early as possible, forcing the opponent to spend their gold on defence instead of upgrades, or maybe occasionally even winning the game in under a minute. However, beginning players rarely start a match effectively. They're likely still reading some upgrade descriptions or thinking about what to do next. The result is that if the opponent employs even a very ineffective rush tactic, a beginning player will still be overwhelmed and lose in less than a minute.

For beginners this is highly frustrating, but since rushing is so much fun for pro players we didn't want to remove rush tactics from the game altogether. Instead we came up with something that's just for beginners: Starting Gates. Starting Gates are gates in front of each player's base that need to be destroyed before the base can be reached. This slows down rush tactics a lot and gives the defending player quite a lot of extra time to respond once the enemy soldiers come into view. Starting Gates are truly only for beginners: in matchmade online matches they're not placed if both players have played a bunch of online matches already.



Making the balance as fun for beginners as it is for experienced players is a hard and sometimes nearly impossible challenge. In this post I've given examples of 3 different tricks we've employed to improve beginner balance: Gnaw was made more difficult to play without making him stronger, Ayla was made more expensive so that beginners would encounter her less, and rushing in Swords & Soldiers 2 was changed for beginners by introducing a new mechanic that only applies to beginner matches.

These are all examples of looking at balance as a creative challenge, not just as a topic that's about spreadsheets and tweaking numbers. Have you used any nice tricks to improve balance? Please share in the comments!

Saturday, 3 August 2019

New Song: A Century Sails By

I've finished a new cello song! :D This one started out as a travelling theme for a game concept with a little boat, but will end up on my album as the intermezzo where the story jumps from the 1850s to the 1950s. The title nods to both uses. I hope you like it!



Musically the idea behind this song was to play with delay/echo: two of the instruments get a rhythmic echo. This is a fun little musical effect with a big impact, as it results in these instruments constantly playing chords with their own previous notes. It's easy to mess this up by choosing the wrong progression of chords and making it sound dissonant and ugly (or to do this on purpose, if dissonance is the goal). During most of the song I steer away from this and choose notes that sound nice with the notes that came before. However, at the end of the cello melody (at 1:10 in the video) I go from C# to C. This is very dissonant but at the same time brings us back to the chords at the start of the song. I really like the tension this creates. This effect is so strong that I needed to keep the music in the same chord for four bars before starting the chord progression again, just to make it settle back in and feel good.

As with all my songs, sheet music for cello can be found at music.joostvandongen.com, including a recording of the song without the cello, to play along to.

I've also made an arrangement of the song for an acoustic quartet with two violins and two cellos. Since acoustic instruments can't do a real echo without additional equipment, I've added in some extra notes where possible to mimic the echo. This captures the feel of the song surprisingly well, as I experienced when I played this version with a couple of friends a few years ago. Sheet music for this version is also on music.joostvandongen.com.

Sunday, 3 March 2019

The psychology of matchmaking

Matchmaking is a touchy subject and this has previously made me somewhat hesitant to write about it in an open and frank manner. Today's topic especially so, since some players might interpret this post as one big excuse for any faults in the Awesomenauts matchmaking. However, the psychology of matchmaking is a very important topic when designing a matchmaking system, so today I'm going to discuss it anyway. For science! :)

While we were designing the Galactron matchmaking systems I did quite a lot of research into how the biggest multiplayer games approach their matchmaking. The devs themselves often don't say all that much about it, but there are plenty of comments and analyses by the players of those games. The one thing they all have in common is that whatever game you look at, you'll always find tons of complaints from users claiming the matchmaking for that particular game sucks.



My impression is that no matter how big the budget and how clever the programmers, a significant part of the community will always think they did a bad job regarding matchmaking. Partially this might be because many games indeed have bad or mediocre matchmaking, but there's also a psychological factor: I think even a theoretical 'perfect' implementation will meet a lot of negativity from the community. Today I'd like to explore some of the causes for that.

The first and most obvious reason is that matchmaking is often a scapegoat. Lost a match? Must be because of the crappy matchmaking. My teammates suck? Must be the crappy matchmaking. Got disconnected? Definitely not a problem in my own internet connection, must be the crappy matchmaking. Undoubtedly in many cases the matchmaking is indeed part of the problem, but these issues will exist even with 'perfect' matchmaking. Sometimes you're just not playing well. Sometimes a teammate has a bad day and plays much worse than they normally do. Sometimes your own internet connection dropped out. No matchmaker can solve these issues.

There's a strong psychological factor at play here: it's human nature for many people to look for causes outside themselves. I think this is a coping mechanism: if you're not to blame, then you don't have to feel bad about yourself either.

Of course there is such a thing as better or worse matchmaking. The fact that players use matchmaking as a scapegoat is no excuse to not try to make better matchmaking. But it sure makes it difficult to assess the quality of your matchmaking systems. Whether your matchmaker is doing well or not, there will practically always be a lot of complaints. The more players you have, the more complaints.

For this reason it's critical to collect a lot of metrics about how your matchmaking is objectively doing. We gather overall metrics, like average ping and match duration and such, but we also store information about individual matches. This way when a player complains we can look up their match and check what happened exactly. This allows us to analyse whether specific complaints are caused by problems in the matchmaker, are something that the matchmaker can't fix (like a beginner and a pro being in a premade together) or whether the user is using matchmaking as a scapegoat for something else.
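As a hedged illustration of this idea (the names and fields below are hypothetical, not the actual Galactron code), storing per-match records alongside overall aggregates could look something like this:

```python
from dataclasses import dataclass

@dataclass
class MatchRecord:
    """One stored match, so an individual complaint can be looked up later."""
    match_id: int
    player_ids: list
    average_ping_ms: float
    duration_s: float
    skill_spread: float  # gap between highest and lowest player rating

def aggregate(records):
    """Overall health metrics across all recorded matches."""
    n = len(records)
    return {
        "avg_ping_ms": sum(r.average_ping_ms for r in records) / n,
        "avg_duration_s": sum(r.duration_s for r in records) / n,
        "avg_skill_spread": sum(r.skill_spread for r in records) / n,
    }
```

The aggregates tell you whether the system is healthy overall, while the individual records are what let you answer a specific "my last match was terrible" complaint with facts instead of guesses.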



Another problem for matchmaking is that players don't have a single, predictable skill level. The matchmaker matches a player based on their average skill, but how well they actually play varies from match to match. One match they might do really well, and then the next they might do badly. For example, maybe the player gets overconfident and makes bad decisions in the next match because of that. Or maybe the player is just out of luck and misses a couple of shots by a hair that they would normally hit. Or maybe the player got home drunk from a party and decided to play a match in the middle of the night, playing far below their normal skill level. These are things that a matchmaker can't predict. This will often make it seem like the matchmaker didn't match players of similar skill. Sometimes this might result in a teammate who would normally be as good as you but happens to play like a bag of potatoes during this one match in which they're in your team.

While this problem is not truly solvable by matchmaking, there are some things developers can do to improve on it. For example, in Heroes Of The Storm you select your hero before you go into matchmaking. This allows the matchmaker to take into account that you might be better at some heroes than at others. If it detects that you're playing a hero that you haven't played in a long while, then maybe it should matchmake you below your skill level. I have no idea whether Heroes Of The Storm actually does this, but it's certainly a possibility. This would allow detecting some of the cases in which a player is normally really good, but is currently trying something new that they haven't practised yet.
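A sketch of what such a familiarity adjustment could look like; the function name, the penalty size and the ramp length are all made up for illustration, not taken from any actual game:

```python
def effective_rating(base_rating, matches_on_hero, penalty=200.0, ramp=20):
    """Lower a player's matchmaking rating when they pick a hero they have
    barely played. The penalty fades out linearly as the player approaches
    `ramp` matches on that hero."""
    familiarity = min(matches_on_hero, ramp) / ramp  # 0.0 (new) .. 1.0 (practised)
    return base_rating - penalty * (1.0 - familiarity)
```

So a 1500-rated player picking a hero for the first time would be matched as if rated 1300, and after twenty matches on that hero the penalty is gone entirely.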



However, this particular trick comes at a heavy cost, which is why we decided not to put it into Awesomenauts: if players select their hero before matchmaking happens then matchmaking is severely limited in who can play with whom, which damages other matchmaking criteria. (I've previously discussed this in my blogpost about why you need huge player numbers for good matchmaking.)

A very different kind of psychological aspect is that players are often bad at estimating their own skill level. The following example is something that I have no doubt will be recognised by many people who play multiplayer games. A while ago I played a match in which a teammate was constantly complaining about how badly I was playing and how I was causing us to lose the match. However, looking at the scoreboard I could see that he was constantly dying and doing by far the worst of all players in the match. Apparently that player didn't realise that he was far worse at the game than he thought.

There's more to this than anecdotal evidence however. The developers of League of Legends have described that on average, players rate their own skill 150 points higher than their real MatchMaking Rating. (That part of the post has been removed in the meantime, but you can still find it here on Wayback Machine.) As they also mention there, psychology actually has a term for this: it's called the Dunning-Kruger effect. A League of Legends player analysed a poll about this here and gives an excellent explanation of how it works:
"According to the Dunning-Kruger-Effect people overestimate themselves more the more unskilled they are. This isn’t caused by arrogance or stupidity, but by the fact that the ability to judge a certain skill and actually being good at that skill require the same knowledge. For example if I have never heard of wave management in LoL I am unable to notice that I lack this skill, because how would I notice something if I don’t even know it exists? I would also not notice this skill in other people, which is why I would overestimate my own skill if I compared myself to others. This it what causes the Dunning-Kruger-Effect." - Humpelstilzche

As some readers suggested, all of these psychological factors invite an interesting thought: would it help to give players more control? After all, if you have no control you blame the system, while if you do have more choice, you also have yourself to blame and maybe accept the results more. Giving players choice over matchmaking is in many ways old-fashioned and reminds me of the early days of online multiplayer, where you selected your match yourself from a lobby browser. The common view these days seems to be that players expect a smooth and automated experience. They don't want to be bothered and just want to click PLAY and get a fun match. Or at least, most devs seem to assume that that's what players want.

For Awesomenauts I've actually been curious for a while what would happen if we removed the automated matchmaking and instead relied entirely on opening and selecting lobbies. The modding scene would surely thrive a lot more that way, but how would it affect player experience and player counts? It would be a risky change and also too big a change to just try though, so I doubt we'll ever get to know the answer to that question. Also, I'm not sure whether our current lobby browser provides a smooth enough experience to carry that much weight.

I do think it's interesting to explore this further though. I wonder what would be the result if matchmaking were from the beginning designed around being a combination of automation, communication and player control.

Before ending this blogpost I should mention one more psychological aspect of matchmaking: the developer's side. As a developer it can be really frustrating to get negative comments on something you've spent a lot of time on. Understanding why can help in coping with this frustration. Just like matchmaking can be a scapegoat for players after losing a match, the psychology of matchmaking can be a scapegoat for developers after getting negative feedback from players.

None of the psychological factors discussed in this post are an excuse to just claim that the matchmaking in a game is good despite players complaining. However, for a developer it's really important to realise that these factors do exist. Understanding the psychology of matchmaking allows you to build better matchmaking and helps to interpret player comments. I have no doubt that I've only scratched the surface of this topic, so I'm quite curious: what other psychological aspects influence how matchmaking is experienced by players?

Sunday, 24 February 2019

New song: Hear Her Typewriter Humming

I've finished a new song for my cello album! This one revolves around a gently bobbing... typewriter! Most of the song has my cello as the main melodic instrument, plus some humming vocals here and there. My goal with this composition was to give it a really relaxed feel, which I think worked out quite well. :)



As with all my songs, sheet music for cello can be found at music.joostvandongen.com. I've also included a recording of the song without the cello, to play along to.

Vocals and artwork were done by Marissa Delbressine. Guitar was played by Thomas van Dijk.

One thing I had a lot of fun with for this composition, was cutting up a sample of a typewriter into separate keystrokes, so that I could then play typewriter on my keyboard. This allowed me to create a typewriter that's typing freeform, but at the same time does keep to the rhythm a little bit.
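The keystroke cutting can be sketched as a simple onset detector: find positions where the amplitude rises above a threshold after a stretch of silence, then cut the buffer at those positions. A minimal illustration of the idea (not the actual tool I used; the threshold and gap values are arbitrary):

```python
def split_keystrokes(samples, threshold=0.1, min_gap=256):
    """Return the onset positions of keystrokes in a mono sample buffer:
    places where the amplitude rises above `threshold` after at least
    `min_gap` samples without a loud sample."""
    onsets = []
    last_loud = -min_gap
    for i, s in enumerate(samples):
        loud = abs(s) >= threshold
        if loud and i - last_loud >= min_gap:
            onsets.append(i)
        if loud:
            last_loud = i
    return onsets

def slice_at(samples, onsets):
    """Cut the buffer into one clip per detected keystroke."""
    ends = onsets[1:] + [len(samples)]
    return [samples[a:b] for a, b in zip(onsets, ends)]
```

Each resulting clip can then be mapped to a key on a MIDI keyboard, which is what makes it possible to 'play typewriter' freely while still keeping roughly to the rhythm.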

Sunday, 6 January 2019

An overview of many ways of doing a beta

Giving players access to the beta of a new game or new content before it's released is a great way to get feedback and find bugs, allowing you to add that extra bit of polish, balance and quality before the official full release. There are many different ways to give players access to a beta. Which to choose? In this article I'd like to give a comprehensive list of options in today's market and discuss the differences.

Traditionally bugs in games are found by QA testing companies. However, hiring a QA company to exhaustively test a complex game is very expensive. Many smaller companies don't have the budget to hire QA at all, or can only get a limited amount of QA and can't let QA cover every aspect of the game, let alone do so repeatedly for every update. However, even if you do have the budget for large amounts of QA testing, that won't give good feedback on whether a new feature is actually fun or balanced. That requires real players, experiencing the content in the wild. So whether you can afford paid QA or not, a beta might still be a good idea.

There are many aspects to doing a beta. Should everyone get access, or only a limited number of players? Is the beta for a new game that hasn't released yet, or for new content for an existing game? Is the beta also intended to gather additional development funds, or only for testing purposes?

Another interesting topic is what to actually put in a beta. Should it be all the content, or only a portion so as not to spoil the main release too much? (There's an interesting bit about that in this talk about Diablo 3.) How early should we do a beta? Although these are important questions, to limit the scope of this post I'm going to ignore the content of the beta: today I'm focusing exclusively on how the beta is delivered to customers.

Over the years at Ronimo we've tried a bunch of different approaches to betas. With Awesomenauts and the recently released Swords & Soldiers 2 Shawarmageddon we've done betas before release and for new content, through DLC and through beta branches, limited paid betas and open betas, and more. That means a large portion of this post is based on our own experiences, but since I want this list to be as comprehensive as possible I'll also discuss approaches that we haven't tried ourselves.



Since consoles offer very few possibilities for betas and since Steam is the biggest and most complete platform on PC, this post mostly lists options in Steam. Some of these will probably be possible in similar ways on competing platforms like GOG or Itch.io. If I missed anything that's fundamentally different on other platforms or if I missed some approaches altogether, then please let me know below in the comments so that I can add them.

Steam beta branches

For updates to an already released game

This is the most common way to do a beta on Steam. When uploading a build you can select in which branch it should go live. This makes it possible to release a build under a 'beta' branch only. Users can then simply right click the game in Steam and select the branch they want, after which Steam will download it and replace the main game with the version from the branch.
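For reference, uploading to a branch is typically done through the Steamworks ContentBuilder tools, where the app build script can put the build live on a branch directly. A minimal sketch from memory (the IDs and paths are placeholders; check the Steamworks documentation for the exact fields):

```
"AppBuild"
{
    "AppID"       "204300"              // placeholder app ID
    "Desc"        "Beta build"          // shown in the Steamworks build list
    "SetLive"     "beta"                // put this build live on the 'beta' branch
    "ContentRoot" "..\content\"
    "BuildOutput" "..\output\"
    "Depots"
    {
        "204301"  "depot_build_204301.vdf"
    }
}
```

Leaving "SetLive" empty uploads the build without making it live anywhere, so it can be assigned to a branch manually in the Steamworks web interface instead.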



If you want to limit access to the beta you can set a password for the branch. This works fine, but it's a single password for all users, so if you share this password with players there's a good chance that some might share it further with others. For Awesomenauts we got lucky with our community: no players posted the passwords for closed betas publicly online. Undoubtedly some players did share a password with a few friends privately, but that never caused any problems.

Branches are also great for internal testing purposes. When we want to test a build internally or want to provide a build to QA we also use Steam branches and simply share the password only internally or with the QA company.

Beta through (free) DLC

For updates

A downside of Steam beta branches is that when you switch to or from a branch, Steam downloads and updates the game to this version. In other words: switching takes time and bandwidth and it's not possible for users to have both the beta and the main game on their computer simultaneously. If updates are hundreds of megabytes or even bigger then this gets cumbersome for users. In cases where an Awesomenauts beta didn't get a lot of feedback we often saw players mention that they didn't like to wait for the download.

Our solution was to not use beta branches anymore and instead put the entire beta build in a DLC. Users can then enable the DLC to get the beta. This allows them to have both the main game and the beta on their computer simultaneously. On Steam it's possible to set a DLC to being disabled by default, so users can deliberately choose to get betas or not by enabling the DLC in the Steam interface.

That the beta is a DLC doesn't mean it needs to be paid: it's possible to do free DLC on Steam. Nevertheless, the option to make the beta paid is useful in some cases. For example, backers of the Starstorm Kickstarter campaign were initially the only players who got Awesomenauts beta access. We could have handled this by making the DLC unlisted in the Steam store and sending keys for it to Kickstarter backers. I guess it's even possible to make a beta a paid DLC directly on Steam, although I imagine this might rub some players the wrong way.

Giving out keys for the main game before launch

For new games

This is the easiest way to do a beta before the release of a game. Just put the build live before the store opens and give keys to the players you want to have access to the beta.

A big question with this approach is what to do once the game actually releases. Do those users keep the game, or do you revoke those keys? If this was an open beta then you'll probably want to revoke the keys, but in case of a closed beta with a small group you may also choose to consider the game a gift for those who did beta testing.

If you choose to revoke the keys, be sure to do so a few days before launch. I've heard of cases where users couldn't buy the game on launch because the revocation of the keys had been done too shortly before launch: Steam apparently hadn't processed that entirely yet. I have no idea how much time is needed for that to not go wrong, but revoking the keys a few days before launch seems safe enough I guess.

Beta as a separate app

Both for new games and for updates

An issue with giving users keys to the main game and then revoking those before launch is that if they had the game wishlisted, then the wishlisting will be gone after this. A solution for this is to do the beta in a separate app entirely. This way it's an entirely separate game with its own settings, achievements, leaderboards and AppID. This version of the game is not listed in the store, so it only exists for those users who activated the game with a beta key.



An additional benefit of doing your pre-launch beta through a separate app is that the app can continue being used after the game has launched, at which point it can host betas for updates.

Note that some developers have reported that Steam didn't allow them to do this. We've applied this method in Autumn 2018 ourselves and it wasn't a problem then. Apparently Steam's rules for whether this is allowed or not are not entirely clear.

Steam Early Access

For new games

Early Access allows you to sell a game that's not actually finished yet. This way development of the game can be done in a very public and interactive manner, getting constant player feedback while still adding core systems to the game. Another benefit is that Early Access games are generally paid, so this can help generate additional funding before the actual launch of the game.

Common wisdom seems to be that the launch into Early Access should be considered the main launch of the game. In many cases the final launch of a game is completely ignored by press and players alike, unless the game became a success during Early Access already. This means that a game should be really strong before going into Early Access: it might have missing features and bugs but if it's not super fun yet, then you likely won't get a second chance when the game releases out of Early Access.

An interesting aspect of Early Access is that it functions as a strong excuse towards players for bugs, balance issues and a lack of content. Reviews by both players and press for an Early Access game will often mention things like "It's buggy but that's okay since it's still in Early Access." The equivalent of that for a normally released game is "Don't buy this buggy mess."

For some users Early Access has left a sour taste because some games never launched out of it or didn't deliver on the promised features. Nevertheless, Early Access remains a strong category with many successful games.

Xbox Game Preview

For new games

While Xbox Game Preview is roughly the same as Steam Early Access, I'm listing it as a separate option because it's the only form of beta or early access that's currently available on consoles (as far as I know), making it quite a unique thing.

I don't know what Microsoft's policy around Game Preview is exactly, but I expect this option is not open to just everyone, so if you want to go this route you probably need to talk to Microsoft. I'm guessing that for the right projects, Microsoft might even have some budget to help get them into Game Preview. One thing to keep in mind is that competing platforms might be less interested in featuring your game on launch if it has already been on Game Preview for a while, just as they are generally less interested in featuring a game that launched on another console first.

Soft-launch on a smaller platform

For new games

As I mentioned above, the Early Access launch version of a game needs to be pretty strong. That's kind of counter to the goal of Early Access: getting feedback in an early stage. To work around this problem some developers choose to release their games on a smaller platform like Itch.io first, and then come to Steam (with or without Early Access) once the game is strong enough.



One might expect this strategy doesn't work: by the time the game gets into Steam Early Access, it's been available elsewhere for a while so it's old news. However, I've heard from some devs that if a PC game is not on Steam, to a lot of people it doesn't exist. So even if it's been on another store for a while, the moment it gets to Steam is apparently considered the 'real' launch. (This logic of course doesn't apply to juggernauts like Fortnite, Minecraft and League of Legends.)

Regional soft launch

For new games and for updates

This is a common approach in the world of free to play mobile games: launch in a specific country, improve the game until it makes enough money per user, and only then launch worldwide. I haven't heard of any PC games using this approach, but undoubtedly it has been done. I expect the challenge here would be that hardcore gamers are much more informed and internationally connected than casual free-to-play mobile gamers. If your game is to sell through word-of-mouth then it's going to be weird if the word on Reddit and Discord ends at a single nation's border. Still, I imagine this approach might work in some cases, especially for single player games.

So, that's it! These are all the relevant ways I know of doing a beta in today's market. Did I miss any? Let me know below in the comments so that I may add them! Which approach do you prefer?

Sunday, 23 September 2018

Interior Mapping: rendering real rooms without geometry

The recently released game Marvel's Spider-Man has interiors behind windows in many buildings. This looks great and it seems to be done using a rendering trick: the geometry for the interiors isn't actually there and is generated using a shader. I haven't seen any official statement by Insomniac regarding how they made this, but based on how it looks it seems very likely that they implemented interior mapping: a technique I came up with in 2007 as part of my thesis research. I've never written about this on my blog before so I figure this is a good moment to explain the fun little shader trick I came up with.

Let's start by having a look at some footage from Marvel's Spider-Man. The game looks absolutely amazing and Kotaku has captured some footage of the windows in particular:



As you can see around 0:40 in this video the rooms aren't actually there in the geometry: there's a door where there should clearly be a window. You also see a different interior when you look at the same room from a different corner of the building. In some cases there's even a wall that's beyond a corner of the building. All of these suggest that the rooms are faked, but nevertheless they are entirely perspectively correct and have real depth. I expect the faults of these rooms don't matter much because while playing you normally don't look at rooms as closely as in that video: they're just a backdrop, not something to scrutinise. I think creating rooms this way adds a lot of depth and liveliness to the city without eating up too much performance.



Before I continue I'd like to clarify that this post is not a complaint: I'm thrilled to see my technique used in a big game and I'm not claiming that Insomniac is stealing or anything like that. As I stated in the original publication of interior mapping, I'd be honoured if anyone were to actually use it. If Insomniac indeed based their technique on my idea then I think that's pretty awesome. If they didn't, then they seem to have come up with something oddly similar, which is fine too and I'd be curious to know what they did exactly.

So, how does interior mapping work? The idea is that the building itself contains no extra geometry whatsoever. The interiors exist only in a shader. This shader performs raycasts against walls, ceilings and floors to figure out what you should be seeing of the interior.



The ray we use is simply the ray from the camera towards the pixel. The pixel that we're rendering is part of the exterior of the building so we only use the part of the ray beyond the pixel, since that's the part of the ray that's actually inside the building.

Doing raycasts may sound complex and expensive, but it's actually really simple and fast in this particular case. The trick is to add a simple limitation: with interior mapping, ceilings and walls are at regular distances. Knowing this we can easily calculate which room we're in and where the ceiling and walls of that room are. Ceilings and walls themselves are infinite geometric planes. Calculating the intersection between an infinite plane and a ray takes only a few steps and costs little performance.



A room has 6 planes: a ceiling, a floor and 4 walls. However, we only need to consider 3 of those since we know in which direction we're looking. For example, if we're looking upward then we don't need to check the floor below because we'll be seeing the ceiling above. Similarly, of the 4 walls we only need to consider the 2 that are in the direction in which we're looking.

To figure out exactly what we're seeing, we calculate the intersection of the ray with each of those 3 planes. Which intersection is closest to the camera tells us which plane we're actually seeing at this pixel. We then use the intersection point as a texture coordinate to look up the colour of the pixel. For example, if the ray hits the ceiling at position (x,y,z), then we use (x,y) as the texture coordinate, ignoring z.
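To make this concrete, here's a hypothetical CPU-side sketch in Python of what the shader does per pixel. It assumes rooms of size 1 in every direction; the plane names and the function itself are my own illustration, not actual shader code from any game:

```python
import math

def interior_raycast(pixel_pos, ray_dir):
    """Return (plane_name, hit_point): which interior plane is visible
    through an exterior pixel at pixel_pos when looking along ray_dir."""
    best_t = math.inf
    best_plane = None
    # Per axis we only need the one plane we're looking towards:
    # the ceiling above OR the floor below, and 2 of the 4 walls.
    for axis, name_pos, name_neg in ((0, 'right wall', 'left wall'),
                                     (1, 'ceiling', 'floor'),
                                     (2, 'back wall', 'front wall')):
        d = ray_dir[axis]
        if d == 0:
            continue  # ray is parallel to this pair of planes
        # Planes repeat every 1 unit, so rounding the pixel position
        # up or down gives us the next plane in the viewing direction
        # (and implicitly tells us which room we're in).
        p = pixel_pos[axis]
        plane = math.floor(p) + 1 if d > 0 else math.ceil(p) - 1
        t = (plane - p) / d  # ray parameter at the intersection
        # Only consider the part of the ray beyond the pixel (t > 0)
        # and keep the intersection closest to the camera.
        if 1e-6 < t < best_t:
            best_t = t
            best_plane = name_pos if d > 0 else name_neg
    hit = tuple(pixel_pos[i] + best_t * ray_dir[i] for i in range(3))
    return best_plane, hit
```

The two coordinates of the hit point that don't belong to the intersected plane's axis are then used as texture coordinates, just like the (x,y) ceiling example above.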

A nice optimisation I could do here at the time is that we can do part of the intersection calculations for each of the three planes at the same time. Shaders used to be just as fast when using a float4 as when using a float, so by cleverly packing variables we can perform all 3 ray-plane intersections simultaneously. This saved a little bit of performance and helped achieve a good framerate with interior mapping even back in 2007 when I came up with this technique. I've been told that modern videocards are faster with float than float4, so apparently this optimisation doesn't achieve much anymore on today's hardware.



For more details on exactly how interior mapping works, have a look at the paper I wrote on interior mapping. This paper was published at the Computer Graphics International Conference in 2008. Having a real peer-reviewed publication is my one (and only) claim to fame as a scientist. This paper also includes some additional experiments for adding more detail, like varying the distance between walls for rooms of uneven size and randomly selecting textures from a texture atlas to reduce repetition in the rooms. It also goes into more detail on the two variations shown in the images below.



Since we're only doing raycasts with planes, all rooms are simple squares with textures. Any furniture in the room has to be in the texture and is thus flat. This is visible in Spider-Man in close-ups: the desks in the rooms are in fact flat textures on the walls. As you can see in the image below it's possible to extend our raycasting technique with one or more additional texture layers in the room, although at an additional performance cost.



After having published this blogpost one of the programmers of SimCity (2013) told me that interior mapping was also used in that game. It looks really cool there and they have a nice video showing it off. They improved my original idea by storing all the textures in a single texture and having rooms of varying depth. The part about interior mapping starts at 1:00 in this video:



If you'd like to explore this technique further you can download my demo of interior mapping, including source code. If you happen to be an Unreal Engine 4 user you can also find interior mapping as a standard feature in the engine in the form of the InteriorCubeMap function.

After all these years it's really cool to finally see my interior mapping technique in action in a big game production! If you happen to know any other games that use something similar, let me know in the comments since I'd love to check those out.

Sunday, 16 September 2018

The Awesomenauts matchmaking algorithm

Matchmaking is a big topic with lots of challenges, but at its core is a very simple question: who should play with whom? A couple of years after releasing Awesomenauts we rebuilt our entire matchmaking, releasing the new system in the Galactron update. Today I'd like to discuss how Galactron chooses who you get to play with. While the question is simple enough, the answer turns out to be pretty complex.



Many different factors influence what's a good match. Should players of similar skill play together? Should players from the same region or language play together? Should we take premades into account? Ping? Should we avoid matching players who just played together already? Should we use matchmaking to let griefers play against each other and keep them away from normal folks?

If the answer to all of these questions is 'Yes, let's take that into account!', then you'd better have a lot of players! I've previously written about how many players you need for that and the short answer is: tens of thousands of simultaneous players, all the time. That's not realistic except for the very biggest hits, so you'll usually need to make do with fewer players and make the best of your matchmaking.

For Awesomenauts we chose to let the matchmaker gather a lot of players and then once every couple of minutes match them all at once. This way the matchmaker has as many players as possible to look for the best possible combinations.



We'd like to match more than 100 people at the same time so that we have plenty of choice. More than that would be even better: our research shows that our algorithm keeps getting better up to as many as 300 people per round, because smaller, distant regions (like Australia) don't have enough players for proper matchmaking until we gather that many. However, the more players we wait for, the worse the player experience gets because the waiting times become too long.

Scoring the quality of a match


Okay, so we have 100+ players that all need to be put in matches; how do we decide what the best match-up is? For example, if we can choose between making a match where the players have equal skill, or one where the ping is low, which do we prefer? And which do we prefer if the difference is more subtle, like we can get slightly more equal skill or slightly better ping? Where do we draw the line? And how do we balance this against wanting to let premades play against each other? Etc. etc.

What we need here is some kind of metric so that we can compare match-ups. Somehow all of those matching criteria need to culminate in one score for the entire group. Using that, the computer can look for the best match-up: the one that gives us the highest score.

To calculate a total score we start by calculating the matchmaking score for each match. We do this by taking a weighted average of a bunch of factors:
  • Equally skilled teams. This is the most obvious factor in matchmaking: both teams should be equally good. Turning this requirement into a number is simple: we take the average skill per team and look at the difference between those. The bigger the difference, the lower our score. Note that this does require some kind of skill system in your game, like ELO or Trueskill. This also means that beginning players are difficult to match, because you don't know their actual skill until they've played a bit.
  • Equally skilled players. Even if the two teams are perfectly balanced, the match might not be. For example, if both teams contain two pros and one beginner, then the average skills of the teams are the same, but the match won't be fun. So we use a separate score that ignores teams and simply looks at the skill differences between all players. In a 6 player match there are 15 combinations of players and we simply average all of those. The bigger the difference, the lower the score.
  • Premades. Ideally a group of 3 friends who coordinate together should play against a similar group and not against three individual players who don't know each other. Assigning a score to this is simple: if both teams have the same situation, then we have a 100% score. If there's a small difference (like for example a 3 player premade versus a 2 player premade + a solo player) then we get a 60% score. For a big difference (a 3 player premade versus 3 solo players) the score is 0%.
  • Ping with enemies. For each player we check the ping with all 3 players in the enemy team. The higher the ping, the lower the score. There are 9 combinations of players this way and we simply average those 9 scores to get the total ping score for this match.
  • Opponent variation. This is a subtle one that we added a while after launching Galactron. Since our matchmaker basically looks for players of similar skill with a good connection to each other, it tends to repeatedly put the same players against each other. We expected this to be rare enough that it would be fun when it happened, but in practice our players encountered the same people too often. To counter this we give a match a lower score if players encountered each other in the previous match as well. If they encountered each other as opponents but are now teammates (or vice versa) we give that a slightly better score than if they meet each other in the same situation (opponents before, opponents now; or teammates before, teammates now). This gives the matchmaker a slight tendency to swap teams around if despite this rule it ends up making another match with the same players. This rule has a very low weight since we value the other rules more, but still it improved the situation significantly and we got far fewer complaints from players about this.
  • Two possible servers. Since Awesomenauts is peer-to-peer, each match should have a player who can connect with everyone in the match, so that this player can be the server. Ideally there is also a second player in the match who can connect with everyone. This way if the first server-player drops out of the match, host migration can happen and the match can continue for the remaining players. This rule is only needed for peer-to-peer games: in a game with dedicated servers or relay servers this rule is irrelevant.
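A minimal sketch of how such a weighted average could look in code. The factor names, weights and the skill cap here are illustrative placeholders, not our actual values:

```python
def team_skill_score(team_a, team_b, max_diff=5000.0):
    # 'Equally skilled teams': compare the average skill per team.
    # The bigger the difference, the lower the score (0.0 - 1.0).
    diff = abs(sum(team_a) / len(team_a) - sum(team_b) / len(team_b))
    return max(0.0, 1.0 - diff / max_diff)

def match_score(factor_scores, weights):
    # Weighted average of all factor scores for one match.
    total_weight = sum(weights[name] for name in factor_scores)
    return sum(factor_scores[name] * weights[name]
               for name in factor_scores) / total_weight
```

For example, `match_score({'skill': 0.9, 'ping': 0.6}, {'skill': 1.0, 'ping': 2.0})` weighs ping twice as heavily as skill.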



There's one important factor missing here: ping with teammates. Why is this not taken into account? The reason for this is that for every factor we add, the others become a little bit less important. In Awesomenauts a bad connection with your teammates is usually not a big problem because you never need to dodge their bullets. A bad connection with an opponent is much worse, because dodging becomes really difficult if the ping is too high. By ignoring the ping with teammates, we make the ping with opponents a lot more important.

Something to think about when calculating scores is whether we want them to be linear. For example, is a ping improvement from 210ms to 200ms as worthwhile as one from 110ms to 100ms? Both are 10ms improvements, but the latter is relatively twice as big. In Awesomenauts we've indeed tweaked our scoring formulas to better match the perceived difference instead of the absolute difference.

Another consideration is limiting the range of the scores. For example, in Awesomenauts the highest ranked players have a skill of over 20,000, but only 0.3% of all players are above 18,000. In other words: there's a huge difference in skill score in the very top, but this is quite useless to the matchmaker since there are too few players with such a high score to actually take this into account. So before calculating the skill scores we cap them to workable ranges. This way the matchmaker will consider a match-up between two players with skill 18,000 to be as good as one between an 18,000 player and a 20,000 player. Ideally you don't cap anything, but since we have to balance between all the different matchmaking goals capping at some points will make other ranges and aspects more important.
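As an illustration of both ideas, here's a sketch of capping skills to a workable range and of a ping score based on relative rather than absolute differences. All numbers and curves here are made up for the example; they're not Awesomenauts' real formulas:

```python
import math

def capped_skill(skill, low=2000.0, high=18000.0):
    # Skill differences outside the workable range carry little
    # information for matchmaking, so clamp before comparing.
    return min(max(skill, low), high)

def ping_score(ping_ms, best=30.0, worst=300.0):
    # Score pings on a logarithmic scale: a drop from 110ms to 100ms
    # (relatively roughly twice as big as 210ms to 200ms) also
    # improves the score roughly twice as much.
    p = min(max(ping_ms, best), worst)
    return 1.0 - (math.log(p / best) / math.log(worst / best))
```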

Combining scores


For each of the above 6 rules we now have a score from 0% (really bad match-up) to 100% (really good), but we need 1 score, not 6. To get this 1 score we take the average of all the scores. We use a weighted average for this so that we can make certain things more important than other things. For example, in a fast-paced game like Awesomenauts, ping should be pretty important. Opponent variation on the other hand is a detail that we don't want to stand in the way of good ping, so that one gets a pretty low weight.

An interesting thing that happens here is that some of these scores react much more extremely than others. Getting a 0% skill score requires having 3 pro players and 3 beginners in a match, which is extremely rare. Getting a 0% premade score however happens much more easily, since all that's required for that is having a three player premade play against three solo players. Some scores reacting more subtly than others is something we need to take into account when choosing the weights: since premades react so strongly, we need to give them a lower weight to keep them from overshadowing scores that respond more smoothly, like skill and ping.

Now that we have a single score for each match, we can calculate the total score for the 100+ players we're matching. We do so by simply averaging the scores of all the matches. The result is one big megascore.



Since we look at the total score, the algorithm gets certain preferences. If swapping two players increases the score of one match by 5% but decreases the score of the other match by 10%, then it won't do that since in total that makes things worse. This sounds obvious, but in some cases one might want to deviate from this. For example, you might prefer improving a match with a 50% score (which is really bad) to 55% at the cost of decreasing a 90% match to 80% (which is still pretty good). A way to achieve this might be to take the square root of all the match scores before averaging them. That way improving bad matches becomes relatively more important. For Awesomenauts we decided against this, because we think the individual scores already represent well enough how big a match quality improvement really is.
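The square root trick (which, again, we decided against) is easy to express in code. In the example trade below, plain averaging considers the result worse, while the square-root version prefers it because it rescued the really bad match:

```python
import math

def megascore(match_scores):
    # Plain average: a gain in one match exactly cancels an equal
    # loss in another, regardless of how good those matches were.
    return sum(match_scores) / len(match_scores)

def megascore_sqrt(match_scores):
    # Square root before averaging: improving a bad match counts
    # for more than the same improvement to an already-good match.
    return sum(math.sqrt(s) for s in match_scores) / len(match_scores)
```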

Finding the match-up with the best score


Now that we have a scoring system that defines what the best match-up is (the one with the highest score) we get to the algorithmic part of the problem: how to actually find that best match-up. With 100+ players the number of combinations we can make is insane so we can't brute-force our way out of this by simply trying all combinations.

I figured this might actually be a graph theory problem so I tried looking for node graph algorithms that could help me, but the problem turned out to be so specific that I didn't find any. Even after consulting a hardcore academic algorithmic expert nothing turned up, so I decided to look for some good guesstimate and just accept that the actual best match-up is probably not findable. The problem might even be NP-complete, but I didn't actually try to prove that. Finding a better algorithm might be a fun thesis topic for a computer science student somewhere.

The approach I ended up with is to first create a somewhat sensible match-up using a simple rule. I tried a couple of rules for this, like simply sorting all players by skill and then putting players in matches based on that sorting. So the top six players together get into one match, then the next six, etc. This is easy enough to build and produces the best possible skill scores, but since it ignores all the other scores entirely it performs really badly for ping and premades.

We now have 6 players per match, but we haven't decided who's in the red team and who's in the blue team yet. So the next step is to decide how the six players in each match are divided over the two teams. Here we can brute-force the problem: we simply try every possible combination and select the best one. Here we take into account all 6 scores, so not just skill. The number of combinations is pretty low, especially since the order in which players are in each team doesn't matter. Since we try every combination we know for sure that the players will be split over the teams in the best possible way.
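A sketch of this starting point in Python. The helper names are mine; `score_split` stands in for the full 6-rule match score described earlier:

```python
from itertools import combinations

def initial_matches(players, skill):
    # Sort all waiting players by skill and chop the result into
    # groups of six: the top six form a match, the next six, etc.
    ordered = sorted(players, key=skill, reverse=True)
    return [ordered[i:i + 6] for i in range(0, len(ordered), 6)]

def best_team_split(six_players, score_split):
    # Brute-force the red/blue split. Order within a team doesn't
    # matter, so there are only 10 distinct splits (this loop tries
    # each of them twice, which is harmless for a sketch).
    best, best_score = None, float('-inf')
    pool = set(six_players)
    for red in combinations(six_players, 3):
        blue = tuple(pool - set(red))
        s = score_split(red, blue)
        if s > best_score:
            best, best_score = (red, blue), s
    return best
```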

We now have a good starting point. The next step is to look for improvements: we're going to look for swaps of players that will improve the total score. We do this one swap at a time. Here we can again brute-force our way out of this problem: we simply try all swaps and then perform the best one. There are lots of different possible swaps and we also need to recalculate the split over the two teams for each swap, so this uses a lot of processing power. However, modern computers are super fast so with some optimisations we can do a few hundred swaps per second this way.

Especially the first swaps will improve the matches drastically, but we start seeing diminishing returns up to the point where no swaps can be found that are actually an improvement. There might still be some bad matches there, but no single swap will be an improvement overall. In other words: we've reached a local optimum. We almost certainly didn't find the best match-up possible for those 100+ players, but we can't improve this one any further with our current swapping algorithm.
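The swap loop itself can be sketched as a simple hill climber. This simplified version skips the team-split recalculation and the optimisations mentioned above; `total_score` stands in for the full megascore:

```python
def improve_by_swapping(matches, total_score):
    # Hill climbing: try every swap of two players from different
    # matches, perform the single best improving swap, and repeat
    # until no swap helps anymore (a local optimum).
    while True:
        current = total_score(matches)
        best_gain, best_swap = 0.0, None
        for a in range(len(matches)):
            for b in range(a + 1, len(matches)):
                for i in range(len(matches[a])):
                    for j in range(len(matches[b])):
                        # Try the swap, score it, then undo it.
                        matches[a][i], matches[b][j] = matches[b][j], matches[a][i]
                        gain = total_score(matches) - current
                        matches[a][i], matches[b][j] = matches[b][j], matches[a][i]
                        if gain > best_gain:
                            best_gain, best_swap = gain, (a, i, b, j)
        if best_swap is None:
            return matches  # local optimum reached
        a, i, b, j = best_swap
        matches[a][i], matches[b][j] = matches[b][j], matches[a][i]
```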



To get out of that local optimum I tried a few things. The first thing I tried is to do a number of forced swaps where the players in the very worst positions are forced into other matches, despite the overall result becoming worse. After a dozen or so forced swaps we start doing normal swaps again. This indeed resulted in a slightly better score, but it did make the algorithm take much longer. Since processing is already taking seconds at this point, performance is very relevant. We don't want players to wait half a minute extra just for this algorithm to finish.

I then tried a different approach: I just leave the first result as it is, and start over again but from a different starting point. I semi-randomly generate a completely new match-up and start doing swaps on that. This will also bring us to a local optimum, but it will be a different one that might actually be better than the previous local optimum.

In practice within 5 seconds we can usually do a couple of retries this way (fewer if there are more players) and then we simply pick the best one we found. This turns out to produce much better results than trying to force ourselves out of a local optimum, so I threw away that approach and instead we just try a bunch of times. To limit waiting times we simply check how long we've been going so far and do another retry only if we haven't spent too much time yet.
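Put together, the retry scheme can be sketched like this; the time budget and helper functions are placeholders for the real Galactron components:

```python
import time

def matchmake(players, build_random_matchup, hill_climb, total_score,
              time_budget=5.0):
    # Random restarts: hill-climb from several random starting points
    # within a fixed time budget and keep the best local optimum found.
    deadline = time.monotonic() + time_budget
    best, best_score = None, float('-inf')
    while best is None or time.monotonic() < deadline:
        matchup = hill_climb(build_random_matchup(players))
        score = total_score(matchup)
        if score > best_score:
            best, best_score = matchup, score
    return best
```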

To see how good this could get I tried letting my computer run for a whole night, constantly retrying from different starting points. This produced tens of thousands of different match-ups, all at different local optima. It turns out that the difference between trying a couple of times and trying tens of thousands of times is actually surprisingly small. This showed us that doing just a couple of retries is already good enough.



Downsides


So far I've explained how we do matchmaking and why. Now that Galactron has been running for almost two years in this way it's also a good moment to look back: did our approach work out as nicely as hoped? Overall I'm pretty happy with the result, but there is one big choice in here for which I'm not sure whether it's actually the best one: matching everyone at the same time. Doing this gives our matchmaking algorithm the most flexibility to find the best match-ups, but it turns out that players are spread out over the world even more unevenly than I had estimated beforehand, causing problems here.

For example, if we look at a matchmaking round at 20:00 Western European time, then the vast majority of those players will be from Europe and the rest of the players will be spread out over the rest of the world. That means that in a matchmaking round of 100 players, there are only a few Australians or South Americans. Those players won't get a good match in terms of ping because there simply aren't enough players in their region in that round.

Solving this requires having even larger numbers of players per round: I think around 300 would be ideal. However, this means either having extremely long waiting times, or having a gigantic playerbase. We'd love to have a playerbase as large as League of Legends, but unfortunately that's not realistic for all but the biggest hit games. The result is that we chose a middle ground: we wait for those 100 players, which can take 5 minutes or even more, and then we just perform matchmaking and accept that the result won't be as good as we'd want for some people. If the number of players is too low then at some point we just perform matchmaking anyway so that waiting times never go beyond a maximum.

While building Galactron we've found very little information on how the big multiplayer games do their matchmaking exactly, but a talk by a designer of Heroes of the Storm (which I unfortunately can't find anymore on YouTube) suggested that at some point in that game, the system was that a match would be started as soon as it could be created with a good enough match quality. In our case this would mean that if there are 6 players who are similar enough in skill and have a good ping with each other, then the match is started immediately. If it takes too long to fill a match, then the requirements are gradually decreased. At some point a player who has waited too long becomes top priority and just gets the 5 most fitting players that the matchmaker can find, even if match quality is lower than desired.



The benefit of such an approach is flexibility: in areas with lots of players one would get matches much more quickly, while areas with fewer players would get longer waiting times. This kind of flexibility is also nice for pro players: one can make them wait longer so that they only play against others of almost exactly the same skill. Our own algorithm can't diversify like that since players from the whole world are matchmade simultaneously at fixed moments.

In comparison I expect our own method produces slightly better results, because it can juggle with a lot more players at once to find the best match-ups. However, this benefit might not be big enough to outweigh the benefit of lower waiting times.

There is one really nice benefit that we get from our fixed matchmaking moments: we can tell the player beforehand exactly how long they'll have to wait for matchmaking to happen. I think waiting is more endurable if you know how much longer you have to wait, and showing this is a really nice touch that few other games have.

A direct comparison between the live matchmaking algorithms of Awesomenauts and Heroes of the Storm is unfortunately not possible because Heroes of the Storm has so many more players. Any matchmaking algorithm will do better with more players, so it is to be expected that regardless of the algorithm, the matchmaking quality in a big game like Heroes of the Storm will be much higher than in Awesomenauts anyway.

Conclusion


Building a matchmaking system starts with defining what good matchmaking is. No algorithm can produce good match-ups if you don't give the computer rules for what that actually means. In this post I've discussed the rules we use, which are mostly based around skill, ping and premades. Depending on what's important for your game you can add more rules or define them differently.

Just keep in mind that the more rules you add, the less important each individual rule becomes, and every rule you weigh more heavily makes the others matter less. Restraint and balancing are key here. Also, you'll probably need a while to fine-tune the rules once your matchmaking is live, like we did with the Galactron matchmaker in Awesomenauts by adding the 'opponent variation' rule later on.