Recently we managed to reduce the number of network errors in Awesomenauts by 10% by improving our throttling algorithm. While automatic throttling by a router is disastrous for a game, smart throttling by the game itself can be a good tool to make the game work better on crappy internet connections. Today I would like to explain what throttling is and how we approached this topic.
(Note that the throttling improvements were in Awesomenauts patch 2.5.3. This post is unrelated to the bandwidth optimisations in patch 2.5.4 that are currently being tested in beta.)
The basic idea of throttling is that if you detect that an internet connection cannot handle as much as you are sending, then you start sending less. This can happen on the side of the game, or on the side of the connection itself (by the modem or router, for example). If we keep sending more than the connection can handle, then either we lose a lot of packets or, even worse, the internet connection is lost altogether, causing a network error. Throttling is intended to keep this from happening.
The basic idea may sound simple enough, but actually implementing this is a lot more difficult. There are two problems: how can we reduce bandwidth dynamically, and how can we detect whether we need to throttle?
Let's start with how to reduce bandwidth. Modems usually have an approach to this that is simple enough: they just throw away some packets. This is a very efficient way of reducing bandwidth, but it is disastrous for any game. If a packet with important data is dropped then it needs to be resent. We cannot do this immediately as the internet connection doesn't notify the game that a packet was dropped. The only thing we can do is just wait for the acknowledgement to come in. If after a while we still haven't received an acknowledgement, then we conclude that the packet has probably been lost and needs to be resent.
Resending based on acknowledgements is the best we can do, but it is a pretty imperfect solution: the acknowledgement might still be under way. We cannot know whether it is, so we just need to pick a duration and decide to resend if that much time has passed. If we set this duration too long, then it takes very long before the packet is resent, causing extra delay in the gameplay if the packet was really dropped. If we choose a very short resending duration, then we will probably be resending a lot of data that had actually already arrived. This wastes a lot of bandwidth and is not an option either.
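To make the resend timer a bit more concrete, here is a minimal Python sketch of the idea. This is not the actual Awesomenauts code: the names `PendingPacket` and `ReliableSender` and the 0.25 second timeout are made up for illustration, and a real game would tune that duration carefully.

```python
from dataclasses import dataclass, field


@dataclass
class PendingPacket:
    sequence: int
    payload: bytes
    last_sent_at: float  # time in seconds at which we last sent this packet


@dataclass
class ReliableSender:
    # Hypothetical value: how long we wait for an acknowledgement before resending.
    resend_after_seconds: float = 0.25
    pending: dict = field(default_factory=dict)

    def send(self, sequence, payload, now):
        # Remember the packet until its acknowledgement arrives.
        self.pending[sequence] = PendingPacket(sequence, payload, now)

    def acknowledge(self, sequence):
        # The other side confirmed receipt, so we can forget the packet.
        self.pending.pop(sequence, None)

    def packets_to_resend(self, now):
        # We are never told that a packet was dropped. All we can do is
        # assume it was lost once the acknowledgement is overdue.
        due = []
        for packet in self.pending.values():
            if now - packet.last_sent_at >= self.resend_after_seconds:
                packet.last_sent_at = now  # restart the timer for the resend
                due.append(packet.sequence)
        return due
```

A larger `resend_after_seconds` means a dropped packet arrives later, while a smaller value means more packets are resent even though they had already arrived.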
Since we cannot pick such a short resending time, we need to wait a little while before resending. This means that dropped packets arrive in the long run, but with an enormous delay. If for example the packet contained information on a player's death then he might die one second too late, which is really bad for the gameplay experience.
In other words: we never ever want the modem to throttle. We want to decide ourselves what gets thrown away, so that we can make sure that the really important packets are never dropped. If we need to throttle, then we want to at least throttle data that is less important. The problem with this is that if we could get away with sending less, then we would always do that instead of only when throttling. After all, an important goal in multiplayer programming is to use as little bandwidth as possible. This means that throttling comes with a drawback and we don't want to do it unless necessary.
In Awesomenauts we reduce bandwidth and packet count when throttling by sending fewer position updates. During normal gameplay Awesomenauts will send the position of a character 30 times per second. This way if a player turns around quickly, other players will know about it as soon as possible. If we send position updates less often, then we essentially add a bit of lag: we don't send the latest information as soon as we know it. We would prefer not doing that of course. However, if the alternative is that the modem is going to throttle by randomly throwing away packets that might be important, then our hand is forced and we prefer sending fewer position updates.
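As a rough illustration of what sending position updates less often could look like, here is a small Python sketch. The 30 updates per second comes from the text above. The throttled rate of 15 per second and the `PositionSender` class are assumptions for the example, not the values or code used in Awesomenauts.

```python
NORMAL_UPDATES_PER_SECOND = 30     # normal rate, as described in the text
THROTTLED_UPDATES_PER_SECOND = 15  # hypothetical throttled rate


class PositionSender:
    def __init__(self):
        self.throttled = False
        self.next_send_at = 0.0  # time in seconds of the next allowed update

    def interval(self):
        # Time between two position updates at the current rate.
        if self.throttled:
            return 1.0 / THROTTLED_UPDATES_PER_SECOND
        return 1.0 / NORMAL_UPDATES_PER_SECOND

    def should_send(self, now):
        # Called every frame: returns True when a position update is due.
        if now < self.next_send_at:
            return False
        self.next_send_at = now + self.interval()
        return True
```

With `throttled` set to `True` the gap between updates doubles, which halves the number of position packets at the cost of the extra lag described above.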
Now that we know how to throttle we get to the much more important question: when to throttle. How can we know whether we need to throttle? It is not possible to just ask an internet connection how much bandwidth it can handle. Nor do you get notifications when the connection starts dropping packets because it is too much. We therefore can never know for sure whether throttling is needed and have to deduce this somehow.
Our initial approach to this was to throttle if the ping was too high. The idea is that if a connection cannot handle the packets it needs to send, then latency will increase and we can detect this. This works fine for connections that normally have low ping: if the standard ping is 50ms and suddenly it rises to 300ms, then it is extremely likely that we are sending too much and need to throttle to keep the connection from being lost altogether.
This approach is too simplistic however: internet connections are a very complex topic and can have all kinds of properties. Some people might indeed have a connection with low ping and a painfully low maximum bandwidth. However, if an Australian and a European player are playing together and they both have a really good internet connection, then their ping will still be high because the distance is so large. In this case throttling won't help at all. In fact, since our throttling essentially increases lag by sending less often, throttling in this case will actually decrease the quality of the connection!
This brings us to the change we recently made in Awesomenauts patch 2.5.3. Instead of looking at ping, we now look at packet loss. Awesomenauts uses UDP packets and we have our own manual reliability system, since various parts of the game require various degrees of reliability. This means that we send and receive our own acknowledgements and thus know exactly how many packets are lost. This is a much better indicator of connection problems than ping. If a lot of packets are dropped by the connection, then apparently we need to throttle to keep from sending too much over a limited internet connection.
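Measuring packet loss from your own acknowledgements can be sketched roughly like this in Python. The `PacketLossTracker` class and the window of the last 100 packets are assumptions for the example. The post does not describe how Awesomenauts actually stores these numbers.

```python
from collections import deque


class PacketLossTracker:
    def __init__(self, window_size=100):
        # Outcome of the most recent packets: True = acknowledged, False = lost.
        # The window size is a hypothetical choice.
        self.outcomes = deque(maxlen=window_size)

    def record(self, acknowledged):
        # Call once per packet, as soon as we know whether it was acknowledged
        # or whether we gave up waiting for its acknowledgement.
        self.outcomes.append(acknowledged)

    def loss_fraction(self):
        # Fraction of recent packets that were lost, between 0.0 and 1.0.
        if not self.outcomes:
            return 0.0
        lost = sum(1 for acknowledged in self.outcomes if not acknowledged)
        return lost / len(self.outcomes)
```

Because only recent packets are counted, the measured loss follows changes in the connection instead of averaging over the whole match.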
It doesn't end there though. I already mentioned that internet connections are a complex topic, and this new plan too is thwarted. Some internet connections are just inherently lossy. For example, maybe someone is playing on a wireless connection and has a wall in between the computer and the modem. Maybe this causes 10% of all packets to be lost, no matter how many packets are sent. I don't know whether wireless routers actually work like that, but we have definitely seen connections that always drop a percentage of the packets, no matter how few we send. Since throttling increases lag we only want to do it when it significantly improves the internet connection. If, like in this case, throttling does not reduce the number of dropped packets, then we do not want to throttle.
Ronimo programmer Maarten came up with a nice approach to solve this problem. His throttling algorithm is based on letting the game perform little experiments. If a player has high packet loss, then the game enables throttling and starts sending less. Then it measures the packet loss again. If packet loss decreased significantly, then we keep throttling. If packet loss remains roughly the same, then we stop throttling and start sending at maximum sending rate again.
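The experiment idea can be sketched as a small state machine. The Python below is only an illustration and not Maarten's actual implementation: the thresholds `HIGH_LOSS` and `REQUIRED_IMPROVEMENT` are made-up numbers, and the pause between experiments (`COOLDOWN_PERIODS`) is my own assumption, since the post does not say how often experiments are repeated or when throttling is switched off again once it has helped.

```python
from dataclasses import dataclass

HIGH_LOSS = 0.05            # hypothetical: loss fraction that triggers an experiment
REQUIRED_IMPROVEMENT = 0.5  # hypothetical: loss must at least halve to keep throttling
COOLDOWN_PERIODS = 3        # hypothetical: pause after an experiment that did not help


@dataclass
class ThrottleExperiment:
    throttling: bool = False
    experiment_running: bool = False
    loss_before: float = 0.0
    cooldown: int = 0

    def update(self, measured_loss):
        """Feed one packet loss measurement (0.0 to 1.0) per measurement
        period and get back whether we should throttle."""
        if self.experiment_running:
            # The experiment is over: did throttling reduce the packet loss?
            self.experiment_running = False
            helped = measured_loss <= self.loss_before * REQUIRED_IMPROVEMENT
            self.throttling = helped
            if not helped:
                self.cooldown = COOLDOWN_PERIODS
        elif self.cooldown > 0:
            # Throttling did not help recently, so wait before trying again.
            self.cooldown -= 1
        elif not self.throttling and measured_loss >= HIGH_LOSS:
            # High packet loss: start an experiment by throttling for one period.
            self.loss_before = measured_loss
            self.throttling = True
            self.experiment_running = True
        return self.throttling
```

On a connection that always loses 10% of its packets, this sketch throttles for one period, sees no improvement and goes back to full sending rate. On a connection where loss drops from 20% to 4% while throttled, it keeps throttling.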
The result of this approach is that we only throttle if it actually improves the internet connection. If throttling does not help, then we only throttle shortly during those experiments. These experiments take place automatically during gameplay, but are short and subtle enough that players won't actually notice this happening. If the connection is really good, then we never ever throttle: we don't even do those experiments.
The result of adding this throttling algorithm is that network errors due to losing the connection have been reduced by 10%. This is not a spectacular improvement that many players will have noticed directly, but it is definitely significant enough that we are happy with this result.
In conclusion I would like to stress that internet connections are extremely unpredictable. We have seen all kinds of weird situations: connections that are really fast but stop for a few seconds every couple of minutes, connections that send packets in groups instead of immediately, connections that have low ping but also low bandwidth capacity, and many other combinations of properties. The big lesson we have learned from this is to not make assumptions about properties of internet connections, and to assume any random weirdness can happen for anyone's internet connection. This is why I like the approach with the experiments so much: instead of assuming throttling works, it just tries it.