If the goal of depth is to create as many states as possible, how do we arrive at a system with many possible states? When thinking about how to build depth, we need to think about the effect that every element has on every other element. We need to consider the bang for our buck. This coincidentally plays on an age-old idiom about games, “simple to learn, hard to master”. No production has unlimited funding, and arguably there’s an upper limit to the complexity humans can process in a given game. More practically, many people don’t want to play a game with a high upfront complexity.
Thus some ways of building a game are more efficient than others. Certain ways of configuring rules will result in more states for less effort. Depth is a criterion that is agnostic to the means by which it is achieved, so we could manually build a large amount of diverse content and carefully balance it all to be relevant, and that would be fine. Practically, though, we need to pick our battles and be strategic with how much we invest in development. Additionally, if we imbue our designs with traits that result in larger state spaces every step of the way, then when we do have the budget to make a lot of content, we’ll end up with more depth than if we hadn’t.
To this end, I have what I call the 4 criteria for depth, though realistically they’re more like rules of thumb from a designer’s perspective.
- Give everything a niche, don’t let stuff completely invalidate each other (make your elements differentiated)
- Allow everything to have multiple functions (don’t build stuff to do just one thing, let it do a bunch of different stuff)
- Let the way players input modulate the output of the move (like holding jump in Mario to go higher or releasing early to go lower, and controlling your speed as you move in the air. Add nuance to a single mechanic)
- Create ways for elements to affect one another, producing combination effects.
The first two aren’t directly related to state space; the second especially is more about design elegance than anything strictly practical. The first helps make sure your elements are differentiated and that you’re not adding content just to have the appearance of a lot of content. The second plays off of the third and fourth criteria, giving you a more tangible way to judge your success. The third and fourth criteria, by contrast, are purely interested in cultivating raw possibility space.
The third criterion is about the nuance of a single element. A game that exemplifies this is Getting Over It with Bennett Foddy, a game about a man in a pot who holds a hammer that you control with the mouse. It only has one mechanic: moving your hammer. It does not have any enemies or level design gimmicks. It just has the physics of your own character and some diverse levels, and it manages to do a massive amount with these two things. The range of motion and propulsion possible just by moving your hammer is tremendous. The game is responsive to the full range of motion across which you move your hammer, the speed at which you move it, the direction you move it, the angular momentum of the hammer and the character, the friction of the hammer’s and character’s contact with surfaces, and the weight of the character and hammer. Across all of these factors, the number of ways you can approach any given obstacle is so vast that you are practically guaranteed never to approach anything the same way twice. This single mechanic is everything a mechanic should be, and possibly the deepest single mechanic I have ever encountered in a game.
The depth of this mechanic is in no small part due to the amount of information delivered through the mouse. The mouse delivers a series of [x,y] coordinates over time. From this there are many emergent properties, such as location, speed, angle, curve of motion, and gesture. Buttons by contrast are just an on/off switch, but even from those there are many emergent properties that can be gleaned: tap versus press versus hold duration, mash speed, rhythm, and press and release timing. Mario’s jump is based on how long you hold the button down. Revving Nero’s Red Queen in DMC4 is based on pressing and holding the rev button in a rhythm. Many fighting game characters have different moves for pressing (attack), holding (charged attack), and releasing (puppet character attack) buttons. Multiple button and stick inputs can be combined to create even more dynamic results. Buttons can act as modifiers for other buttons. Tilts and smashes in Smash Bros are based on the synchronicity of your button press with the stick input, and on how far the stick is tilted over what time interval. Gestural inputs of multiple directions in sequence are the basis of fighting game moves.
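To make the idea of emergent properties concrete, here is a minimal sketch of how speed and direction could be derived from nothing but a stream of timed [x,y] samples. The function name `motion_features` and the sample format are assumptions made for illustration; this is not how any particular game reads its input.

```python
import math

def motion_features(samples):
    """Derive speed and direction from raw pointer samples.

    samples: list of (time_seconds, x, y) tuples, oldest first.
    Returns one (speed, angle_degrees) pair per consecutive pair of samples.
    """
    features = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        dx, dy = x1 - x0, y1 - y0
        speed = math.hypot(dx, dy) / dt           # distance per second
        angle = math.degrees(math.atan2(dy, dx))  # direction of travel
        features.append((speed, angle))
    return features
```

Curvature and gesture recognition could be layered on top of the same list by comparing consecutive angles, which is the sense in which a simple coordinate stream carries far more information than an on/off switch.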
Tying inputs to outputs that are modulated over a range is especially effective for creating depth. Again, Mario’s variable height jump is a classic example of this. The analog range of time that you hold the button down is converted into the analog range of jump height. In many games, the range of speed you mash buttons at is converted into a range of possible output, such as speed escaping from stun, or maximizing a bonus. In Tony Hawk, you balance yourself by attempting to keep the reticule in the center of the line as it veers from right to left. Matching analog input to analog output is a powerful strategy for creating state space. Matching discretely differentiated inputs to discretely differentiated outputs is similarly effective.
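As a rough illustration of matching an analog input range to an analog output range, a variable-height jump might be sketched like this. The function name and every number here are made-up illustration values, not the tuning of any real Mario game.

```python
def jump_height(hold_seconds, min_height=1.0, max_height=4.0, max_hold=0.4):
    """Map how long the jump button is held to a jump height.

    All defaults are hypothetical values chosen for illustration.
    """
    # Clamp the hold time so holding past max_hold adds nothing extra.
    clamped = max(0.0, min(hold_seconds, max_hold))
    # Normalize to the 0..1 range, then interpolate between the two heights.
    t = clamped / max_hold
    return min_height + t * (max_height - min_height)
```

Every distinct hold time between 0 and `max_hold` produces a distinct height, so one button yields a continuous range of outcomes rather than a single one.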
The fourth criterion is about allowing elements to interact in order to create combined states they wouldn’t create alone. The designers of Breath of the Wild referred to their style of design for the game as “multiplicative”. This is a good term for it. Supposing that every element can uniquely interact with every other element, for every element you add to a game the total number of interactions grows. This can look a lot like a Punnett Square, mapping all the possible combinations. With 2 elements, there are 4 resulting interactions. With 3 elements, you get 9, and so on.
However, you might notice that this includes elements interacting with themselves, where element 1 crosses on the Punnett Square with element 1, element 2 with element 2, and so on. Another thing you might notice is that this assumes element 2 interacting with element 3 is different from element 3 interacting with element 2. In most games, this isn’t true: there isn’t a directionality to the influence of one element on another, and elements tend not to interact with themselves. When you take these rules into account, the Punnett Square is reduced to only one half, without the diagonal where elements cross over themselves. So a square with 2 elements only has 1 interaction, not 4. A square with 3 elements has 3 interactions, not 9. It’s about the number of combinations, ignoring permutations.
However, the number of combinations still grows at an accelerating rate as you introduce more elements. With 4 elements, you have 6 combinations. 5 makes 10, 8 makes 28, 10 makes 45, 20 makes 190. Of course, you don’t have to limit yourself to this. If you can build ways for elements to affect each other directionally, where the result differs based on which one affects which, you’re creating a larger state space. If you allow multiple instances of the same element to interact with each other, then you increase the range even more. And then on top of that, if you build every element with respect to the third criterion, AND have the particular modulation of each element reflected when those elements interact with other elements, then you have an explosion in state space.
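The three ways of counting described above can be checked with a few lines of Python. The function names are my own labels for the cases in the text: undirected pairs with no self-interaction, directed pairs with no self-interaction, and the full Punnett Square.

```python
from math import comb

def unordered_pairs(n):
    """Half the square, no diagonal: n choose 2, i.e. n * (n - 1) / 2."""
    return comb(n, 2)

def ordered_pairs(n):
    """Direction matters, but elements do not interact with themselves."""
    return n * (n - 1)

def full_square(n):
    """Direction matters and self-interaction is allowed: the whole square."""
    return n * n

for n in (2, 3, 4, 5, 8, 10, 20):
    print(n, unordered_pairs(n), ordered_pairs(n), full_square(n))
```

For 10 elements this gives 45 undirected pairs, 90 directed pairs, and 100 cells in the full square, so making interactions directional roughly doubles the space before any input modulation is considered.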
These dynamics can also be represented with a node-graph. Elements (nodes) that can interact with each other are connected by edges, with the direction of the edge indicating which one has an effect on which, and a bidirectional edge indicating they can affect each other. This creates a clear goal: create elements that have as many connections as possible. Every cell on the Punnett Square starts out null, and is activated when there is an edge connecting it in the node-graph. If it’s a one-way interaction or a symmetrical interaction (like Mario collecting a powerup), only the cell on one side gets activated. If it’s a bidirectional interaction (like Mario either stomping an enemy or getting hit by an enemy), then the cells on both sides get activated.
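A minimal sketch of that bookkeeping follows, assuming each edge is written as a hypothetical `(source, target, bidirectional)` tuple. The element names are just the Mario examples from the paragraph above.

```python
def interaction_grid(elements, edges):
    """Build the Punnett Square as a grid of booleans from a list of edges.

    edges: list of (source, target, bidirectional) tuples.
    grid[row][col] is True when the row element has an effect on the col element.
    """
    index = {name: i for i, name in enumerate(elements)}
    grid = [[False] * len(elements) for _ in elements]  # every cell starts null
    for source, target, bidirectional in edges:
        grid[index[source]][index[target]] = True
        if bidirectional:
            grid[index[target]][index[source]] = True  # activate the mirror cell
    return grid

def active_cells(grid):
    """Count how many cells of the square have been activated."""
    return sum(cell for row in grid for cell in row)

elements = ["mario", "powerup", "enemy"]
edges = [
    ("mario", "powerup", False),  # collecting: one cell
    ("mario", "enemy", True),     # stomp or get hit: both cells
]
grid = interaction_grid(elements, edges)
```

Here `active_cells(grid)` is 3 out of 9 possible cells, which gives a rough number to compare against as elements and edges are added.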
This can help make tangible exactly how much possibility space there is in a game, in a way that’s not too complicated to implement. There are more complex types of interaction, such as chains of objects having effects on each other (like Mario hitting a block, bouncing something on top of the block), but that’s a bit harder to represent in this format.
This also makes clear why it can be dangerous to segregate a game into different modes of gameplay, like with minigames. By doing this, it creates localized ghettos of elements that have connections with each other, but not any elements outside that particular mode. Reusing elements between different modes alleviates this problem.
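One way to spot this kind of segregation is to count how many disconnected clusters the graph falls into. The sketch below ignores edge direction, and the element names are invented for illustration.

```python
def clusters(elements, edges):
    """Group elements into sets connected by any chain of interactions.

    edges: list of (a, b) pairs; direction is ignored here.
    """
    neighbors = {name: set() for name in elements}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, groups = set(), []
    for start in elements:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # depth-first walk from the starting element
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    return groups

elements = ["sword", "bomb", "enemy", "fishing_rod", "fish"]
main_game = [("sword", "enemy"), ("bomb", "enemy")]
minigame = [("fishing_rod", "fish")]
separate = clusters(elements, main_game + minigame)
shared = clusters(elements, main_game + minigame + [("bomb", "fish")])
```

With the fishing minigame isolated, `separate` contains two clusters. Letting the bomb also affect fish joins them, so `shared` contains one, which is the effect of reusing elements between modes.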
Another shortcoming of this model is that it doesn’t fairly represent things like loadouts. In the abstract, unabridged graph there may be many possible interactions, but loadouts require you to slim those down to only a few, so once a loadout is selected the graph gets a lot smaller. This does not mean loadouts are inherently bad, because there are other considerations, like balance and differentiation, that are also important to depth, whereas this article is focusing purely on state space. If everyone has access to everything at all times, then much of it might be redundant or irrelevant to play, resulting in a samey game experience where only a few options are chosen and only a few interactions are shown off, even if vastly more are possible overall. Card games have to contend with this type of problem and do their best to encourage variety when players can assemble a deck with practically any card available, but imagine how much worse it would be if you didn’t have to assemble a deck and could simply play any card you wanted at any time. For that matter, imagine how impossible a game like that would be for a human to play.
Additionally, not everything neatly fits this model. Some games, such as Go or Reversi, don’t have discrete elements that interact, and instead have many ways the same element can interact with itself.
Here’s a graph of the super effective interactions in Pokemon. I’m not graphing out everything, because I’m not insane.
You wouldn’t believe how wide my eyes are right now.
Evoland 2 (and to a certain extent Evoland 1) had disjoint minigames. I think the game overall suffered from a lack of depth, but I liked how there were minor shared abilities: the same abilities from the overworld ‘worked’ in most other minigames (a character has a straightforward long-range projectile attack on the overworld, and it worked the same way in the shoot-em-up mode).