Difficulty is Hard

December 19, 2007 at 12:30 pm (Difficulty, Game Design)

We usually want games to be challenging, but reasonably beatable: not too easy, not too hard. Finding this point can be difficult; even worse, the desirable amount of difficulty varies from player to player, and even for a single player as they get better at the game. Getting the difficulty right is…well, difficult. There are several common techniques for trying to produce the right level of difficulty, with varying degrees of effectiveness.

The Difficulty Curve

The first and most important tool is the difficulty curve: games traditionally get progressively more difficult as you get further into them. This serves two important functions. First, as the player plays through the game, she is expected to get better at it, and therefore will want greater difficulty towards the end of the game than at the beginning; the difficulty curve satisfies this desire.

The second and more subtle benefit is that this tends to draw the player to the part of the game where the difficulty is most appropriate. A player who finds the game too difficult will have trouble completing game tasks, slowing or halting her advancement, and thereby allowing her to gain practice before moving on to a more difficult segment. A player who finds the game too easy will complete tasks quickly and easily, moving more quickly to an area of greater difficulty. Thus, to a certain extent, the player’s skill automatically adjusts to fit the difficulty curve, making the game suitable for a wider skill range than a static difficulty could achieve.

However, in order to get this effect, it is necessary to ensure that greater skill allows the player to advance more quickly, not just more reliably, and that players facing an overly difficult section of the game will be getting good practice, rather than simply failing.

Difficulty Settings

Some games offer a global adjustable difficulty level, so that players can choose how hard they want the game to be. On the face of it, this is clearly a good thing: the player can select the most appropriate out of all the available difficulties. It may require a little work to figure out which setting is the best one, but that’s a relatively easy problem (the best guide I’ve seen was in Serious Sam, where the difficulties were given advisements like “for inexperienced FPS players,” “for experienced FPS players,” and “for experienced Serious Sam players”).

The difficulty comes in when you start worrying about how to create all of those difficulty settings. Manually fine-tuning the difficulty of each separate setting is time-consuming, especially when you end up needing different content for different settings (for example, an “easy” puzzle and a “hard” puzzle can easily end up with completely different solutions).

Many games try to offer multiple difficulty settings for a minimum development cost by making subtle, global changes instead of fine-tuning the content for each setting–in other words, algorithmic changes to difficulty, rather than manual ones. The most common example of this is probably games where higher difficulties cause the player to receive more damage when hurt and deal less damage to enemies. But while these sorts of changes are easy to make, they rarely have the desired effect. Simply changing the numerical parameters of a game is seldom an effective way to adjust the difficulty outside of a narrow range; you can fundamentally change the gameplay and required tactics surprisingly quickly.

Changing the amount of damage the player deals, for example, can easily make the difference between killing a target in one combo or two, or change whether you have enough time to eliminate one enemy before another gets within range–these, in turn, force you to continue worrying about the first target when you would otherwise devote your attention to something else, which can quickly result in a fundamental shift in your tactics. While this makes the game harder, it can also shift emphasis away from the gameplay elements that are designed to bear the player’s attention and towards secondary considerations that are less interesting–for example, by forcing the player to fortify or snipe instead of going on the offensive, or by forcing a player to perform some set of actions repetitively that would otherwise be done infrequently.
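To make the "one combo or two" point concrete, here is a minimal sketch with made-up numbers (the health, damage, and multiplier values are assumptions for illustration, not taken from any particular game). It shows that a small change to a damage multiplier can cross a threshold and change how many hits a kill takes.

```python
import math


def hits_to_kill(enemy_health: float, base_damage: float, damage_multiplier: float) -> int:
    """Number of hits needed to defeat an enemy at a given damage multiplier."""
    return math.ceil(enemy_health / (base_damage * damage_multiplier))


# Hypothetical values: a 100-health enemy and a 50-damage attack.
for multiplier in (1.0, 0.9, 0.5):
    print(multiplier, hits_to_kill(100, 50, multiplier))
```

With these numbers, a 10% damage reduction already turns a two-hit kill into a three-hit kill, which is the kind of step change that can reshape tactics rather than merely slow them down.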

Reducing the player’s health is often less disruptive, as it usually just results in a lower tolerance for error. However, many game tasks are structured in such a way that even someone who is playing very well is expected to sustain damage, or where the player must make a trade-off between avoiding damage and some other goal, and such tasks can be adversely affected by reducing the damage the player is allowed. Reducing health also places more emphasis on reliability in the player’s performance, which, as I previously discussed, can be hazardous.

With that previous discussion in mind, another simple approach would be to impose a higher standard on the accuracy of the player’s basic actions–for example, requiring more precise timing and positioning in order for an action to succeed. Depending on the game, this could be done by reducing the avatar’s speed or jumping height, reducing the area targeted by the player’s attacks, increasing her hitbox, etc., most of which seem fairly easy to implement (though some might be difficult to calibrate to a specific degree of difficulty).

Unfortunately, this approach has another problem, which is that it can be very jarring to players transitioning from one difficulty to another. A player may have learned a very specific way of doing something that reliably keeps her performance within the tolerance of the game, but not at the center of that tolerance–for example, if a button needs to be pressed between 0.2 and 0.5 seconds after some signal, the player may have trained reflexes and muscle memory to press it after 0.4 seconds, instead of the “optimal” 0.35 seconds. If you reduce the window from 0.2–0.5 seconds down to 0.3–0.4 seconds, the player’s target time is no longer safely inside the window, and the player will begin to fail frequently even if her behavior is extremely consistent. Thus, the player is forced to “re-learn” skills that are already very reliable, but “miscalibrated,” and the old skills may already be very entrenched.
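The timing example can be checked with a rough calculation. The sketch below assumes the player's press times are normally distributed around 0.4 seconds with a small spread of 0.02 seconds; both the distribution and the spread are my assumptions, chosen only to model a very consistent player.

```python
import math


def normal_cdf(x: float, mean: float, sigma: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def success_rate(window: tuple[float, float], mean_press: float, sigma: float) -> float:
    """Probability that a press lands inside the (low, high) timing window."""
    low, high = window
    return normal_cdf(high, mean_press, sigma) - normal_cdf(low, mean_press, sigma)


# A consistent player who habitually presses at 0.4 s (assumed spread: 0.02 s).
wide = success_rate((0.2, 0.5), mean_press=0.4, sigma=0.02)
narrow = success_rate((0.3, 0.4), mean_press=0.4, sigma=0.02)
print(f"wide window: {wide:.3f}, narrow window: {narrow:.3f}")
```

Under these assumptions the player succeeds almost every time with the wide window but only about half the time with the narrow one, even though her consistency has not changed at all.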

The best difficulty settings seem to show up in games whose challenge is already modeled around composing many simple tasks, and where higher difficulties simply add on more simultaneous tasks, or replace some of the tasks in a set with similar but harder ones.

For example, in Geometry Wars, a single enemy is not ordinarily a credible threat–you can easily predict its movement and avoid it or destroy it. The challenge of the game comes from facing many enemies (and many kinds of enemies) simultaneously; you cannot destroy them all at once, and thus are forced to target a few at a time while evading the others, and while a single enemy is easy to predict and evade, simultaneously predicting and avoiding several enemies in different locations (or with different movement patterns) becomes more challenging. The difficulty of an encounter can be smoothly scaled up by adding additional enemies, or substituting one type of enemy for another that is similar, but individually harder.

However, it is extremely important to understand how all the individual tasks interact to create a composite challenge. Geometry Wars falls short in this regard: it does not account for interactions between enemies with very different behaviors.

Enemies in Geometry Wars can be divided into two general classes: enemies that deliberately chase you (which I will call “aggressive”) and enemies that essentially ignore you and move erratically (which I will call “passive”). Aggressive enemies are a more pressing threat, because they home in on your position, but this makes their movements more predictable and means that groups of aggressive enemies tend to coalesce into thick swarms that are easier to predict and avoid. Passive enemies pose little danger by themselves, being easy to avoid and unlikely to wander into your path, but they spread out and increase the general hazard of the level by forcing you to pay attention to threats from many different directions.

Most of the challenge of the game comes from fighting both at once, because the aggressive enemies put pressure on you and force you to keep moving, while the passive enemies make moving quickly more dangerous and force you to divide your concentration. Fighting a mixed group of aggressive and passive enemies is substantially more difficult than fighting a uniform group of either.

When introducing a new, powerful type of passive enemy–the snake–the game first spawns a small number, so you can familiarize yourself with them, then spawns a larger number, challenging you to handle many at once. The next logical step would be to spawn a small number of snakes and a group of aggressive enemies, forcing you to learn to handle snakes while under the pressure of aggressive foes, and then slowly scaling up the numbers; but instead, the game jumps immediately to a huge swarm of snakes simultaneous with aggressive monsters. This probably seemed to the designers like a reasonable step–throwing the player back into a mixed group (which she’s fought before), but with a new type of enemy in the mix. In fact, they probably just substituted snakes into an existing algorithm for scaling up the numbers of enemies that was already calibrated. But because the snakes are much stronger than (and substantially different than) the previous passive enemies encountered by the player, combining them suddenly and in large numbers with aggressive enemies actually represents a large immediate jump in difficulty, instead of the gradual transition the designers probably intended.
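A toy model can illustrate why substituting a stronger enemy into an already-calibrated spawn table produces a jump. Everything here is hypothetical: the threat weights, the mixing multiplier, and the enemy names "wanderer" and "chaser" are placeholders I made up, not values from Geometry Wars.

```python
# Hypothetical threat weights per enemy; illustrative only, not Geometry Wars data.
THREAT = {"wanderer": 1.0, "snake": 3.0, "chaser": 2.0}
PASSIVE = {"wanderer", "snake"}
AGGRESSIVE = {"chaser"}
MIX_MULTIPLIER = 1.5  # assumed penalty for facing passive and aggressive enemies together


def encounter_threat(counts: dict[str, int]) -> float:
    """Estimate the difficulty of a wave from its enemy counts."""
    base = sum(THREAT[kind] * n for kind, n in counts.items())
    has_passive = any(n > 0 for kind, n in counts.items() if kind in PASSIVE)
    has_aggressive = any(n > 0 for kind, n in counts.items() if kind in AGGRESSIVE)
    return base * MIX_MULTIPLIER if has_passive and has_aggressive else base


familiar_wave = {"wanderer": 10, "chaser": 10}
substituted_wave = {"snake": 10, "chaser": 10}  # same counts, stronger passive enemy
gradual_wave = {"snake": 3, "chaser": 10}       # fewer snakes to start

print(encounter_threat(familiar_wave))     # 45.0
print(encounter_threat(substituted_wave))  # 75.0
print(encounter_threat(gradual_wave))      # 43.5
```

In this model, keeping the counts and swapping in snakes raises the estimated threat by about two thirds, while starting with a handful of snakes stays close to the level the player has already handled. A real game would need weights derived from playtesting rather than guessed.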

Auto-Adjusting Difficulty

Some games attempt to gauge the player’s performance and switch between different difficulty settings automatically. As an optional feature, this is potentially helpful. Again, however, the implementation can be trickier than it seems.

First, there is a danger of giving the player something other than she wants. The perfect difficulty does not depend only on the player’s abilities; it also depends on her personality and mood–how much she wants a challenge compared to how much she just wants to win. For this reason, there should probably be an option to override any automatic difficulty adjustment.

The other problem is that accurately measuring player performance and adjusting the difficulty accordingly is surprisingly complicated. My brother told me once about a problem he had with the auto-adjusting difficulty in Max Payne. Before buying the game, he spent much time playing the demo (consisting of the first couple levels of the full game). When he bought the real game, he had played the first levels so many times that he was able to complete them near-flawlessly, thus convincing the game it should move to the highest difficulty. But as soon as he got past the areas he knew, he was not able to play nearly as well, and the highest difficulty was too much. However, he discovered that if he died, the difficulty did not change–apparently being processed only at the end of a level–and that if he successfully completed a level, that meant he had performed well enough to warrant continuing at the highest difficulty (in the game’s estimation).

The game was biased towards escalation and had no reliable way to adjust the difficulty downward, trapping the player on an inappropriate difficulty setting. Having no way to change the difficulty manually without restarting the game (arguably another poor decision), he was forced to play each level many times–and occasionally seek help from friends–in order to finish the game. Though this problem may seem obvious in retrospect, it wasn’t obvious in advance–predicting pathological cases of complex algorithms is very hard. You need to be careful that an auto-adjusting difficulty is really taking an accurate measure of player performance, and that it doesn’t get “stuck” on an inappropriate setting.
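As a rough sketch of how an adjuster might avoid getting stuck, the class below reacts to deaths as well as level completions and lets the player override the result. This is not how Max Payne works; the class name, thresholds, and method names are all hypothetical.

```python
class DifficultyAdjuster:
    """Hypothetical auto-difficulty that can move down as well as up."""

    def __init__(self, levels: int = 5, start: int = 2, deaths_to_lower: int = 3):
        self.levels = levels                    # settings are numbered 0 .. levels - 1
        self.level = start
        self.deaths_to_lower = deaths_to_lower  # assumed threshold for easing off
        self.deaths_this_level = 0
        self.manual_override = None             # set by the player to pin a setting

    @property
    def current(self) -> int:
        """The setting actually in effect; a manual choice always wins."""
        if self.manual_override is not None:
            return self.manual_override
        return self.level

    def on_death(self) -> None:
        """Lower the difficulty after every few deaths, not only at level end."""
        self.deaths_this_level += 1
        if self.deaths_this_level % self.deaths_to_lower == 0:
            self.level = max(0, self.level - 1)

    def on_level_complete(self) -> None:
        """Raise the difficulty only after a level finished without dying."""
        if self.deaths_this_level == 0:
            self.level = min(self.levels - 1, self.level + 1)
        self.deaths_this_level = 0


adjuster = DifficultyAdjuster(start=4)
for _ in range(3):
    adjuster.on_death()
print(adjuster.current)  # 3
```

The key differences from the behavior described above are that failure feeds back into the measurement, merely finishing a level is not treated as evidence of mastery, and the player can always overrule the algorithm.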

Bonuses

Another technique for accommodating additional skill levels is to include optional objectives for more highly-skilled players. The “bonus” objective is either harder than the required objective, or pursuing it makes the required objective more difficult (perhaps by consuming resources, like time, or putting additional constraints on the player’s behavior). This allows players who want a greater challenge to seek one out, without hindering the players for whom the regular game is enough.

However, this is confused somewhat by an unrelated purpose served by bonuses: adding content into the game, increasing the overall play time and value of the game. This is likely the more common purpose of bonus objectives: as a type of content that is of lower quality and lower cost than the “main” gameplay, increasing the value of the game for enthusiastic players and giving them something to continue to aim for even after they’ve burned through all of the premium content, without consuming too many development resources.

These two objectives for bonuses can be synergistic, since there’s likely a lot of overlap between skilled players and completionist players, but their requirements are not identical, and so if you wish to implement both goals, both must be kept in mind.

Testing

Ultimately, the only reliable way to gauge the game’s difficulty is to have people play it and see how they do. Don’t fall into the trap of testing only for bugs: checking for bugs only verifies the game’s implementation. In order to verify the game’s design, you need to test for gameplay.

Keep in mind that testers will get better as they continue to play, and that new testers may vary widely in their experience with similar games and their natural aptitude.

Unfortunately, good testing is resource-intensive, and it requires multiple testers–one person can’t do all the work, because like all empirical methods, playtesting relies on multiple samples to eliminate aberrations. This means testing can be difficult, especially for amateur projects.

Still, no matter how good your testing is, you need someone with a good abstract understanding of how the game works and what makes it hard in order to find effective solutions to the problems you discover. Testing only reveals problems; it doesn’t diagnose or fix them. Also, remember that testing can only verify the existence of problems, not the absence of problems.

The Result

This may seem like a lot of trouble for what is effectively a straw man: the point of all this carefully calibrated difficulty is not, after all, to prevent the player from winning, but merely to provide something to win against. But isn’t the opponent the entire reason for winning in the first place?

The benefit of getting the game’s difficulty just right is the thrill the player feels in struggling against it and the triumph she feels when she wins. If you’ve ever felt the satisfaction of finally beating a game on the edge of your skill, you know that that perfect difficulty is irreplaceable.

4 Comments

spellman23 said,

January 14, 2008 at 5:56 pm

Yay! Difficulty. Oh man….

There was a similar thing mentioned by the developers of Civ4 when I watched their development movie on prototyping. They mentioned that the difficulty of the AI was something they added in later. In fact, all of the AI was built near the end. That way, the other fun issue of balance was worked out prior. They actually quoted (or you quoted) how the AI is there so it’s fun to win against something.

Now, once again I’m a bit concerned. You’ve mentioned different ways of changing the difficulty, and several pitfalls for all of them. But I still don’t see much of what you think should be done. What, in your opinion, is the best way to deal with difficulty? Perhaps actually tooling each one individually? Create varying levels of AI? Manage the curve differently (personally Valve does this very well)? What is a good example of a well-made difficulty setup?

Anyways, enough of my griping. Nicely done overall. Curse those who think reducing damage I put out and increasing the amount I take is the only way to make things harder.

Tommi said,

April 8, 2008 at 10:52 am

Simply changing the numerical parameters of a game is seldom an effective way to adjust the difficulty outside of a narrow range; you can fundamentally change the gameplay and required tactics surprisingly quickly.

So, assume the gameplay and tactics are significantly changed. This means that the game essentially has several similar games in one package, some of them harder than the others (and tactics that work at harder levels almost always also work at lower levels). Personally, I would say this is a good thing.

Antistone said,

April 8, 2008 at 1:00 pm

It would be a good thing if every one of those game variations was interesting, enjoyable, and had the intended level of difficulty. However, that is unlikely to be the consequence unless you manually calibrate each variant, and the entire point of making an algorithmic change to difficulty is to avoid doing that.

Tommi said,

April 9, 2008 at 4:58 am

I’ll talk about Civ 4. Difficulty levels basically determine how large an economic bonus the player and the AI get (barbarians also work in a slightly different way, depending on the difficulty level). I think this is very much an algorithmic change.

Manipulating the economy is approximately what the game is about. The harder difficulty levels kill some (many, most) strategies that work on the lower ones. At the lowest levels a significant number of game elements, like health and happiness, take quite a while before becoming relevant. Not so at the high ones.

Are the game experiences different? I’d say so, though I am not a hardcore player and actually enjoy the lower difficulties more. Are they all successful? Depends on how you define “successful”. I like the game for its value as a toy more than for the challenging side, but this doesn’t make the higher difficulties badly designed.

Basically that is my point: A game that contains many similar games is more likely to appeal to a given player who can enjoy one of the levels, even if others hold little appeal.


Gaming’s Alembic