Saturday 28 June 2014

Finding bugs through autotesting

Some bugs and issues can only be found by playing the game for ages in a single play session, or by triggering lots of random situations. As a small studio we don't have the resources to hire a ton of people to do such tests, but luckily there is a good alternative: a fun method is to hack automated controls into the game and let the game test itself. We have used this method in both Swords & Soldiers and Awesomenauts and found a bunch of issues this way.

Autotests are quite easy to build. The core idea is to let the game press random buttons automatically and leave it running for many hours. This is very simple to hack into the game. However, such a simplistic approach is also pretty ineffective: randomly pressing buttons might mean it takes ages to simply get from the menu to actual gameplay, let alone ever actually finishing any levels. It gets better if you make it a little bit smarter: increase the likelihood of pressing certain buttons, or even just automatically press the right button in certain menus to get through them quickly. With some simple modifications you can make sure the autotesting touches upon many different parts of the game.

This kind of autotesting has very specific limited purposes. There are a lot of issues you will never find this way, like if animations are not playing, texts are not displaying, or characters are glitching through walls. The autotester does not care and keeps pressing random buttons. Basically anything that needs to be seen and interpreted is difficult to find with autotesting, unless you already know what you are looking for and can add a breakpoint in the right position beforehand.

Nevertheless there are important categories of issues that can be found very well through autotesting: crashes, soft-crashes and memory leaks. A soft-crash is a situation where the game does not actually crash, but the user cannot make anything happen any more. This happens for example if the game is waiting for a certain event but the event is never actually triggered. Memory leaks are when the game forgets to clean up memory after usage, causing the amount of memory the game uses to keep rising until it crashes. Especially subtle memory leaks can take many hours before they crash and are thus often never found during normal development and playtesting.

Another category of issues that can be found very well through autotesting is networking bugs. This one is very important for Awesomenauts, which has a complex matchmaking system and features like host migration that are hard to thoroughly test. Our autotesting automatically quits and joins matches all the time, potentially triggering all kinds of timing issues in the networking. If you leave enough computers randomly joining and quitting for long enough, almost any combination of timings is likely to happen at some point.

Recently we needed this in Awesomenauts. After we launched patch 2.5 a couple of users had reported a rare crash. We couldn't reproduce the crash, but did hear that in at least one case the connection was very laggy. Patch 2.5 added Skree, a character that uses several new gameplay features (most notably chain lightning and spawnable collision blocks). This made it likely that the crash was somewhere in Skree's netcode.

We tried reproducing the crash by playing with Skree for hours and triggering all kinds of situations by hand. To experiment with different bad network situations we used the great little tool Clumsy. However, we couldn't reproduce the crash.

I really wanted to find this issue, so I reinvigorated Awesomenauts' autotesting system. We had not used that in a while, so it was not fully functional anymore and lacked some features. After some work it functioned again. I made the autotester enter and leave matches every couple of minutes. Since I didn't know whether Skree was really the issue I made the game choose him more often, but not always. I also made the autotester select a random loadout for every match and made it immediately cheat to buy all upgrades. The autotester is not likely to accidentally buy upgrades otherwise, so I needed this to have upgrades tested as well.

I ran this test on around ten computers during the soccer match Netherlands-Australia. While we beat the Australians our computers were beating this bug. Using Clumsy I gave some of those computers really high artificial packet loss, out-of-order and packet duplication.

Watching the computer press random buttons is surprisingly captivating, especially as it might leave the match at any moment. Simple things like a computer being stuck next to a low wall become exciting events: will it manage to press jump before quitting the match?

Here is a video showing a capture of four different computers running our autotest. The audio is from the bottom-left view. Note how the autotester sometimes randomly goes back to the menu, and can even randomly trigger a win (autotesters are not tactical enough to destroy the base otherwise):

And indeed, after a couple of hours already three computers had crashed! Since I had enabled full crash dumps in Windows, I could load up the debugger and see exactly what the code was doing when it was crashing.

The bug turned out to be quite nice: it required a very specific situation in combination with network packets going out of order in a specific way. When Skree dies just after he has started a chain lightning attack, the game first sends a chain lighting packet and then a character destroy packet. If these go out of order because of a really bad internet connection, the character destroy packet can arrive first. In this case the Skree has already been destroyed when his lightning packet is received. Chain lightning always happens between two characters, so the game needs both Skree and his target to create a chain lightning.

Of course we know that this kind of thing can happen when sending messages over the internet, so our code actually did check whether Skree and his target still existed. However, due to a typo it also created the chain lightning if only one of the two characters existed, instead of if they both existed. This caused the crash. Crashes are often just little typos, and in this case accidentally typing || ("OR") instead of && ("AND") caused this crash.

Once we knew where the bug was, it was really easy to fix it (the fix went live in hotfix 2.5.2). Thus the trick was not in fixing the code, but in reproducing the issue. This is a common situation in game programming and autotesting is a great tool to help reproduce and find certain types of issues.

Saturday 14 June 2014

Solving path finding and AI movement in a 2D platformer

When we started developent of the bots for Awesomenauts, we started with the most complex part: how to do path finding and movement? When people think of path finding, they usually think of A*. This well-known standard algorithm indeed solves the finding of the path, and in games like an RTS that is all there is to it, since the path can easily be traversed. In a platformer however the step after the finding of the path is much more complex: doing actual movement over the path. Awesomenauts features a ton of different platforming mechanics, like jumps, double jumps, flying, hovering, hopping and air control. We also have moving platforms and the player can upgrade his jump height. How should an AI know which jumps it can make, how to time the jump, how much air control is needed? This turned out to be big challenge!

Since there are so many potential subtleties in platforming movement, my first thought was that handling it in our behaviour trees might not be doable at all. Behaviour trees are good at making decisions, but might not be as good at doing subtle controls during a jump. Add to that that the AIs only execute their behaviour trees 10 times per second because of performance limitations, and I expected trouble.

The solution I came up with was to record tons of gameplay by real players and generate playable path segments from this. By recording not just the player's position but also his exact button presses, I figured we could get enough information to replicate real movement with precise control. Player movement would be split into short bits for moving from one platform to another. The game could then stitch these together to generate specific paths for going from A to B.

To perform movement this way, the behaviour tree would choose where it wants to go and then executes a special block that takes control and fully automatically handles the movement towards the goal. Very much like playing back a replay of segments of a players' previous movement. The behaviour tree could of course stop execution of such movement at any given time to engage in combat, which would again be controlled entirely by the behaviour tree.

While the above sounds interesting and workable, the devil is in the details. We would have to write recording and playback code, plus a complex algorithm to analyse the recorded movement and turn that into segments. But it doesn't end there. There were six character classes at the time, each with their own movement mechanics. They could buy upgrades that make them walk faster and jump higher. There are moving platforms that mean that certain jumps are only achievable in combination with certain timing of the platform position. All of these variations increase the complexity of the algorithms needed, and the amount of sample recordings needed to make it work. Then their would need to be a way to stich segments together for playback: momentum is not lost instantly, so going into a jump while previously going to the left is not the same as when going into that same jump while previously going to the right.

The final blow to this plan was that the levels were constantly changing during development. Every change would mean rerecording and reprocessing the movement data.

These problems together made this solution feel just too complicated and too much work to implement. I can still imagine it might have worked really well, but not within the scope of a small indie team building an already too complex and too large multiplayer game. We needed a simpler approach, not something like this.

Looking for a better solution we started experimenting. Programmer Bart Knuiman did an internship at Ronimo at the time and his internship topic was AI, so he started experimenting with this. He made a small level that included platforming, but that did not need path finding because there were no walls or gaps. Bart's goal with this level was to make a Lonestar AI that was challenging and fun to play against, using only our existing behaviour tree systems. Impressively, he managed to make something quite good from scratch in less than a week. Most Ronimo team members lost their first battle against this AI and took a couple of minutes to find the loopholes and oversights one needed to abuse to win. For such a short development time that was a really good result, so we concluded that for movement and combat, the behaviour trees were good enough after all.

The only thing really impossible with the systems we had back then was path finding in complex levels. We designed a system for this and Bart built this as well. The important choice we made here was to split path finding and movement into a local solver and a global solver. I didn't know that terminology back then, but someone told me later that it was a common thing with an official name. For finding the global route towards the goal we used path finding nodes and standard A* to figure out which route to take over them. The nodes are spaced relatively far from each other and the local solver figures out how to get to the next node.

The local solver differs per character class and can use the unique properties of that type of character. A jumping character presses the jump button to get up, while one with a jetpack just pushes the stick upwards. The basics of a local solver are pretty simply, but in practice handling all the complexities of platforming is a lot more difficult, yet still doable.

The complex recording system outlined at the start of this post was incredibly complex, while the solution with the local and global solvers is so much simpler. The reason it could be so simple is that although platforming mechanics in Awesomenauts are diverse, they are rarely complex: no pixel-precise wall jumps or air control are needed like in Super Meat Boy, and moving platforms don't move all that fast so getting on to them doesn't require super precise timing. These properties simplify the problem enough that creating a local solver in AI is quite doable.

One aspect that I haven't mentioned yet is how we get the path finding graph. How do we generate the nodes and edges that A* needs to find a route from A to B? The answer is quite simple: our designers place them by hand. Awesomenauts has only four levels and a level needs well below one hundred path finding nodes, so this is quite doable.

While placing the pathfinding we need to take the different movement mechanics into account. Some characters can jump higher than others, and Yuri can even fly. Each class has his own local solver to handle its own movement mechanics, but how do we handle this in the global solver? How do we handle that some paths are only traversable by some characters? Here the solution is also straightforward: any edge we place in the path finding graph can list included or excluded classes. When running A* we just ignore any edges that are not for the current character type. This was originally mostly needed for flyer Yuri: all other classes were similar enough that they could use the same path finding edges.

A similar problem is that falling down from a high platform is possible even if it is too high to jump towards, and jumppads can also only be used in one direction. These things are easy to include in the path finding graph by making those edges only traversable in one direction.

Creating the path finding graph by hand has a couple of added benefits. The first is that we can exclude routes that may work but are too difficult for the local solver to traverse, or are not desirable to use (they might be too dangerous for example). Placing the nodes and edges by hand adds some control to make the movement look sensible. Another nicety is that we can add markers to nodes. The AI can use these markers to decide where it wants to go. For example, an AI can ask the global solver for a route towards "FrontTurretTop" or towards "HealthpackBottom".

The path finding nodes were placed by our designers and they also built the final AIs for the bots. Jasper had made the skirmish AIs for Swords & Soldiers and he also designed the bot behaviours for Awesomenauts.

Awesomenauts launched with only six classes and only four of them had a bot AI. Since then many more characters were added, but they also never received bot AIs. Now that modders are making AIs for those we will probably have to update the edges to take things like Swiggins' low jump and the hopping flight of Vinnie & Spike into account.

To me the most interesting aspect of pathfinding in Awesomenauts is that after prototyping we ended up with a much simpler solution than originally expected. This is a good reminder that whenever we think about building something complex or something that is too much work, we should spent some time prototyping simpler solutions, hoping one of them might work.

PS. The AI tools for Awesomenauts have been released for modders in patch 2.5. Modders and interested developers can try them out, including our pretty spectacular AI debugging tools. They are free to use for non-commercial use, check the included license file for more details. Visit our basic modding guide and modding subforum for more info on how to use these tools.

Sunday 1 June 2014

The AI tools for Awesomenauts

With the next Awesomenauts patch (patch 2.5) we will release our AI editor and enable players to load modded AIs in Custom Games. The editor is in beta right now and a surprisingly large amount of new AIs have already popped up. Other game developers can also use our AI editor for non-commercial purposes, or contact us to discuss the possibility of using our tool in a commercial product. This all makes for a great occasion to discuss how we made the AIs and what kinds of tools we have developed for this.

Anyone who wants to give making AIs for Awesomenauts a try can check this little starting guide that explains the basics.

I have previously discussed in two blogposts how we made the AI for Swords & Soldiers (part 1 and part 2). Since then we have changed some of the fundamentals and those blogposts are well over three years old now, so I will write this blogpost assuming you didn't read them.

When people think about "AI" they usually think about advanced self-learning systems, maybe even truly intelligent thinking computers. However, those are more theory than practice and attempts in that direction are rarely made for games. AI that really comes up with new solutions is incredibly difficult to build and even more difficult to control: what if it uses lame but efficient tactics and thus kills the game's fun? The goal of game AI is not to be intelligent, but to be fun to play against. As a game developer you usually need control over what kinds of things the AI does. Nevertheless, some games have used techniques that can be described as real AI, especially Creatures and Black & White are known for this. I suppose for them it worked because the AI is at the very core of the game.

What almost all games use instead is an entirely scripted AI. The designer or programmer creates a big set of rules for how the AI should behave in specific circumstances and that's it. Add enough rules for enough situations, plus some randomness, and you can achieve a bot that seems to act very intelligently, although in reality it is nothing but a big rulebook written by the developer.

Awesomenauts is no different. The AI system is a highly evolved version of what we made for Swords & Soldiers. The inspiration for it came from an article Bungie wrote about their behaviour systems in Halo 2. Something similar was also presented at GDC years ago as being used in Spore and a couple of other games that I forgot the names of.

The basic idea in our AIs is that they are a big if-else tree, connecting conditions and actions. If certain conditions are met, certain actions are done. For example, if the player is low on health and enemies are near, he retreats to heal. If he also happens to have a lot of money, he buys a bunch of upgrades.

These big if-else structures are shaped like a tree and are quite easy to read. Certainly much easier than reading real code. The whole principle is best explained by a screenshot from the AI editor:

Before we made our AI editor we tried some other approaches as well. In an old school project I programmed the AI in C++, and for our cancelled 3D action adventure Snowball Earth we used LUA scripting. We were quite unhappy with both: although programming gives the most flexibility, creating such big sets of if-then-else rules is just very cumbersome in a real programming language. The endless exceptions and checks quickly become an enormous amount of confusing code.

So we set out to make a tool specifically for making AIs. Our AI editor is structured entirely around these combinations of conditions and actions and makes the problem a lot more workable. It is true that our AI editor is less flexible than code and cannot do certain things (most notably for-loops), but being faster and clearer to work with makes it possible for us to make much better AIs in the same amount of time.

Each type of action and condition in our AIs corresponds to a class in C++. For example, the condition "canPayUpgrade" corresponds to a C++ class called "ConditionCanPayUpgrade". This class looks up the price of the upgrade and the amount of money the player currently has to determine whether the player has enough money to buy the upgrade.

Since the blocks are programmed in C++ they can do very complex things. A core principle is that we try to hide the complexity and performance inside the blocks. If we need to do something in an AI tree that is not possible with simple if-else trees, then we can always add a new type of block that can do that. A great example of this is our block "isCharacterInArea", which under the hood does a collision query and checks for things like line of sight, class and health. There is quite a bit of code behind that block, but to the AI designer it is a simple and understandable block.

Our AI editor evolved and changed significantly from Swords & Soldiers to Awesomenauts. The two biggest differences are the debugging tools and the general structure. At the time of Swords & Soldiers our designers could not see any information on a running AI. To find and debug AI problems they just had to play the game and observe what the AI was doing. AIs in Awesomenauts contain thousands of blocks, so better debugging tools became necessary. Therefore we added the AI observer, internally known as "the F4 editor", since it is opened by pressing F4. The AI observer shows the state of the AI, and we even added a real debugger that can be used to step through AI updates and see the exact path through the AI.

The structure of the AI changed as well when we adapted them for Awesomenauts. In Swords & Soldiers the AI trees where "priority trees", similar to those in Halo 2 and Spore. This means that the goal of the tree is to find one action to perform, for example "flee", "attack", "reload" or "seek cover". The top-most action that has all its conditions satisfied is always executed, and nothing else is.

Priority trees are great when an AI should do only one thing at a time, but they turned out to be way too rigid for us. In practice an AI might want to move somewhere and shoot at whatever it passes and observe the situation to make a choice later. Our designers wanted to perform more than one action per tick so badly that they ended up making all kinds of weird workarounds, so for Awesomenauts we ditched the whole concept of priority trees and instead turned it into simple if-else trees. These are not only more flexible, but also much easier to understand.

The original version of our AI editor was programmed by Ted de Vries, who did an intership at the time and later joined us as a full-time programmer (he currently works on Assassin's Creed at Ubisoft). The AI observer and debugger were also programmed by an intern: Rick de Water.

Next week I will dive into a surprisingly complex aspect of AI: path finding and navigation in a 2D platformer. While standard path finding is pretty easy and can just use A* and that's mostly it, adding platforming mechanics and different movement mechanics per class made this topic much more interesting that we had expected beforehand. Double jumps, jetpacks, kites, moving platforms: we needed something that could handle all of it.