matthewmatosis

Project Proxima Developer Diary 002 (Patreon)

Published:

2021-04-30 17:17:20

Edited:

2021-04-30 17:29:24

Imported:

2021-07

Content

The First Mistake
What do Shovel Knight, Axiom Verge, Hollow Knight, Celeste, The Messenger and many retro games like Mega Man have in common? None of them have slopes. This isn’t some kind of metaphor, I mean they literally don’t contain slanted surfaces.

Perhaps the preceding list is enough of an argument to demonstrate that slopes aren’t necessary but some others rely on them. Dustforce, N++ and Sonic wouldn’t be the same if you removed them. Super Metroid and similar games would remain mostly intact but stand to benefit from slopes regardless, even just atmospherically. If you want to have a cave level then slanted ceilings and floors will make it much more convincing.

I am presumably both a designer and a programmer (although I barely feel like either yet) which means I’m wearing two hats at the same time but since these metaphorical hats are inside my head they talk to each other freely. So far, Mr. Designer has been very insistent that Mr. Programmer not limit his potential by building systems that are too rigid, something which causes Mr. Programmer a lot of heartache as he spends an excessive amount of time trying to solve edge cases that may or may not be necessary. Something tells me this relationship exists in the real world too. I hope these two can form a healthy relationship but for now I think Mr. Designer is right. We have somewhere between zero and four games to make, a solid foundation seems like a worthwhile goal.

I may be poor and Irish but at least I have two hats.

All good in theory but before I continue I need to discuss the first major mistake so far which is really two smaller mistakes that compounded on each other.

Firstly, I announced that I would discuss the collision detection stuff in advance.

Secondly, I made a framework instead of solving the problem.

The first part is simple and makes me think of that quantum effect where measuring the properties of a particle apparently changes those properties. Simply put, knowing that I was under observation changed how I behaved for the worse. I felt like I had to get everything perfectly in order before I could talk about it which caused me to dwell on things much more than I should’ve. There’s a difference between “good enough for now” and “good enough to tell everyone about” which sucked the fun out of work at a time when I should’ve been enjoying myself. Long time followers probably know about my struggles with perfectionism, hopefully you can see how this would compound the issue. This led to some paralysis at the start of the month but I’ve since corrected course, shifting my focus to prototyping Proxima which has been much more productive and enjoyable.

The second part is one which I might not have recognised without hearing Mike Acton put it into words. To paraphrase, he says that programmers have a tendency to create frameworks which will allow someone else to solve a problem rather than just solving the problem themselves. In my defense, I think it was necessary to create some scaffolding before proceeding with collision detection. After all, I need a reasonable way to store tile data and a reasonable way to access it. Thankfully these parts are looking good so far. My mistake comes from not understanding the full scope of the problem which is what Mike is really getting at. The problem wasn’t “how do I store and access tile data?” it was “how does the player character interpret and respond to the tiles around them?” I’ve answered the first question but I still have work to do on the second. By itself I suppose that would be fine but it’s demoralising to think you’re nearly finished with something only to realise you’ve barely begun. Lessons learned, don’t raise topics in advance and be more mindful of the big picture before you start congratulating yourself.

With all that said, let’s highlight the issue because by this point you might be wondering why slopes are even a problem in the first place. Take a look at this seemingly harmless image:

What you’re seeing is a player’s bounding box resting on a slope. In order to make this look natural we want to position their center coordinate just above the ground, otherwise they’d appear to be floating in the air like so:

Fixing this causes the bottom right of their box to intersect with the floor. In other words, we can no longer trust that the player’s bounding box should always be outside of a surface because we sometimes want to allow overlap. This is also how we move up slopes in the first place. First move along the x-axis, only stopping if blocked by a full tile. Once the x position has been decided, it’s easy to tell how far we have to push the player up so they stand on the slope. None of this is rocket science but selectively allowing the player to slip inside the terrain introduces a new layer of complexity. You might be surprised by the kind of edge cases this causes.
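As a rough sketch of those two steps (not the actual Proxima code), assume whole pixel movement, a hypothetical `surface_y` list giving the ground height under each pixel column, and a hypothetical `blocked` set marking the columns filled by full tiles:

```python
def move_and_snap(x, dx, surface_y, blocked):
    """Move the player's centre-bottom point along x, then stand it on the ground.

    x         -- current pixel column of the centre-bottom point
    dx        -- signed whole-pixel movement for this frame
    surface_y -- hypothetical list: surface_y[column] is the ground y there
    blocked   -- hypothetical set of columns occupied by full (wall) tiles
    """
    step = 1 if dx > 0 else -1
    # Step 1: move along the x-axis, stopping only at a full tile.
    for _ in range(abs(dx)):
        if (x + step) in blocked:
            break
        x += step
    # Step 2: x is decided, so look up how high the ground is under it.
    y = surface_y[x]
    return x, y


# Flat ground at y=10, a slope rising to y=6, then a wall at column 6.
ground = [10, 10, 9, 8, 7, 6, 6, 6]
print(move_and_snap(1, 3, ground, {6}))  # (4, 7)
```

Only the centre column decides the height here, which is exactly why the corners of the bounding box end up overlapping the slope.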

On top of that slopes also dramatically exacerbate tunneling effects. This is a glitch which can happen if the player is moving quickly. Here’s a rough illustration:

If the player moves fast enough, just looking for terrain wherever we happen to be doesn’t offer enough precision. The character can’t “see” the block on either frame so they simply move through it. If you’re just using full or empty tiles this problem can be solved by clamping the player’s movement to one tile per frame. That might sound restrictive but it’s probably a reasonable solution in many cases. For the sake of comparison think about a shinespark in Super Metroid. It moves uncontrollably fast yet it seems to be capped at about one tile per frame. Super Metroid has slopes though and a speed cap doesn’t work for slopes because they can be much thinner at the edges, increasing the likelihood of tunneling.

It's much easier to tunnel through a surface which is only a few pixels thick.

One way of solving this is to make sure that slopes are always “backed up” by a full tile. That way the player will collide normally with the supporting tile regardless.

Mr. Designer is unhappy with that approach though, it would be nice to be able to place slopes more freely. Like so:

This is where the worst edge case presents itself:

If I jump sideways into this slope how should I stop the player from moving into it while also allowing them to slip inside once their feet have cleared the bottom? Most importantly, can I do this without storing any extra information in the tile itself and if so how do I do this without messing up some other interaction? I have ideas but no simple answer yet.

Right now it’s hard to explain any more beyond this. Suffice to say I can solve many of the resulting problems in isolation but fixing one has ripple effects elsewhere. I could just blindly hammer down every edge case until they don’t present issues anymore but I have a feeling that a more elegant approach is somewhere on the periphery of my understanding. Maybe I’m going about this the wrong way but I think programming is partly driven by intuition even though the end product is entirely logical. My intuition tells me a better, more simple approach will reveal itself sooner or later. It would be nice to have something I feel comfortable reusing and even explaining in a future post. For now, I’m prototyping gameplay elements instead which is going well. I wish I could show more of that right now but I'd rather keep Proxima under wraps.

In the meantime there are some success stories. Mike Dailly has a video covering the basics of pixel perfect collision detection in a 2D game. This is similar to what you’ll find in Sonic Retro’s excellent breakdown of Sonic’s physics. The approach is to pre-calculate how tall each column of each tile is then store that data so you can easily look it up at runtime. This can mean thousands of height values but that’s only a couple of kilobytes of data at most, well worth the space.
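A minimal sketch of that pre-calculation, assuming each tile is described as rows of text where `#` marks a solid pixel (a format invented here for illustration, not the one used by Mike Dailly or Sonic Retro) and that height is measured from the bottom of the tile up to the topmost solid pixel in each column:

```python
def column_heights(tile_rows):
    """Return one height per column for a square tile.

    tile_rows -- list of strings, top row first, '#' meaning solid.
    A height of 0 means the column is empty.
    """
    size = len(tile_rows)
    heights = []
    for x in range(size):
        height = 0
        for y, row in enumerate(tile_rows):
            if row[x] == "#":
                # The first solid pixel from the top sets the height.
                height = size - y
                break
        heights.append(height)
    return heights


# A tiny 4x4 slope rising to the right.
SLOPE = [
    "...#",
    "..##",
    ".###",
    "####",
]
print(column_heights(SLOPE))  # [1, 2, 3, 4]
```

At one byte per value a 16 pixel tile needs 16 bytes, so 128 distinct tiles would come to 2,048 bytes.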

I’ve implemented Mike’s method for building up height tables and also adapted it so that I have a table for each of the four directions. On top of that I built up a sensor system similar to the one described on Sonic Retro which allows me to point at any pixel on the map and get a number telling me how close it is to the floor or ceiling. This solves the tunneling problem mentioned earlier since I can draw a line from the player character to the ceiling and catch anything in the way, even if it’s only one pixel thick. Think of it like raytracing, only in this case it’s axis aligned to be as quick as possible.
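To give a feel for how a downward sensor could work, here is a hedged sketch rather than the system described above. It assumes y grows downward, a hypothetical `tilemap` stored as rows of tile indices, a hypothetical `heights` table holding one list of column heights per tile, and that every column is solid from its surface down to the bottom of the tile. The sensor must stay inside the map horizontally.

```python
TILE_SHIFT = 4
TILE_SIZE = 1 << TILE_SHIFT   # 16
TILE_MASK = TILE_SIZE - 1     # 15


def floor_distance(tilemap, heights, x, y, max_dist):
    """Distance in pixels from (x, y) straight down to the first solid pixel.

    Returns max_dist if nothing is hit within range, and a negative
    number if (x, y) is already inside the terrain.
    """
    column = x & TILE_MASK            # which column inside the tile
    tx = x >> TILE_SHIFT              # tile column in the grid
    ty = y >> TILE_SHIFT              # first tile row to inspect
    last_ty = min((y + max_dist) >> TILE_SHIFT, len(tilemap) - 1)
    while ty <= last_ty:
        height = heights[tilemap[ty][tx]][column]
        if height > 0:
            surface_y = (ty << TILE_SHIFT) + TILE_SIZE - height
            return min(surface_y - y, max_dist)
        ty += 1                       # empty column, try the tile below
    return max_dist


# Tile 0 is empty, tile 1 is full, tile 2 is a floor one pixel thick.
heights = [[0] * 16, [16] * 16, [1] * 16]
tilemap = [
    [0, 0],
    [0, 0],
    [1, 2],
]
print(floor_distance(tilemap, heights, 5, 3, 64))   # 29
print(floor_distance(tilemap, heights, 20, 3, 64))  # 44
```

Because the sensor walks tile by tile and reads the stored height, the one pixel floor in tile 2 is found just as reliably as the full tile next to it.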

The green lines show what values the sensors would return. My system allows these to be extended or clamped to any size and point in any of the four directions. This could have uses beyond simple collision detection. For example Deedlit in Wonder Labyrinth features a hover ability which is only possible when the player character is within a certain distance of the floor. Using this sensor tool would make that trivial to implement.

Outside of the aforementioned stumbling blocks, the results are overall impressive with the ability to draw tiles of many arbitrary shapes and have them behave reasonably.

Bitwisdom
(This section discusses some bitwise operations, if you don’t know how numbers are represented in binary, it would be worth reading a primer on that first.)

While it has been frustrating at times, building the sensor systems and tinkering with the height maps have dramatically increased my familiarity with tile based systems. I’ve even realised how GameMaker is performing certain tasks behind the scenes which would help me build my own engine some day if desired. One important example is translating absolute coordinates to grid coordinates or vice versa. Hopefully the diagram below makes this distinction clear.

White numbers represent pixels, red numbers represent tiles.

It’s useful to have a fast way of switching from one system to the other. For example, if I know the player is at position (154,273) then I probably want to know what tile is occupying that space in the grid so I can decide whether they need to collide with something or not. This can be achieved by simple division. If we have a tile size of 16 then we just divide 154 by that and drop the remainder.

154 / 16 = 9 (9.625 with the fraction dropped)

Position 154 is in tile 9 of the grid, counting from zero.

Division is relatively slow on a processor level and converting from one coordinate system to the other is something we’ll be doing a lot so it would be good to have a faster method. Thankfully bit shifting solves this problem at lightning speed as long as the tile size is a power of two (8, 16, 32, 64, 128, etc.) because a bitwise shift to the right is the equivalent of dividing by 2. All we need to do is bitshift four times to divide by 16. It sounds complicated but if you picture how the numbers are stored on a binary level this becomes much easier to understand.

10011010 (154 in binary)
----> Bitshift right by 4 (rightmost numbers lost, leftmost replaced by zeroes)
00001001 (9 in binary)

You can see that it works but if you’re confused about why it works think about a single bitshift instead.

10011010 (154)
-> Bitshift right by 1
01001101 (77)

Each “1” is now worth half as much as it was before. Hence if we shift to the right four times we’ve divided by 16. Note that you can’t divide by 3 (or anything but powers of two) this way because each shift is a further division by 2. If you shift twice you’ve essentially divided by 4 and so on. Maybe there’s a better term for what I’m about to say but the phrase I’ve come up with is that powers of two are “binary aligned”. This “binary aligned” property allows you to manipulate them very quickly since bitwise operations are simple and thus fast.
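The conversion can be sketched in a few lines. The function names and constants below are made up for illustration, and the tile size is assumed to be 16:

```python
TILE_SIZE = 16
TILE_SHIFT = 4  # because 2 ** 4 == 16


def pixel_to_tile(position):
    """Grid coordinate of the tile containing this pixel coordinate."""
    return position >> TILE_SHIFT


def tile_to_pixel(tile):
    """Pixel coordinate of the first pixel in this tile."""
    return tile << TILE_SHIFT


print(pixel_to_tile(154))  # 9
print(pixel_to_tile(273))  # 17
print(tile_to_pixel(9))    # 144
```

Shifting left goes the other way, multiplying by 16 to give the pixel where a tile starts.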

A similar method exists for finding a position within a tile. Since positions inside a 16 pixel tile range from 0 to 15, a position within a tile will always fall within that range. In other words we want the position modulo 16. If you’re not familiar with modulo it’s related to division only instead of looking at the answer of the division we look at the remainder. When you divide something by 16, the remainder will always be less than 16.

For powers of two:
x modulo(y) == x AND(y-1)

Using the same example as above:

10011010 (154)
00001111 (15)
------------- Bitwise AND (only keep 1s if they’re present in both numbers)
00001010 (10)

Position 154 is 10 units into that tile.

Notice that 15 is represented by 4 ones in a row. This is why we use 15 instead of 16. The same is true for every power of two minus one (7, 15, 31, 63, 127, etc.)

00011111 (31)
00111111 (63)
01111111 (127)
11111111 (255)

The AND operation works because these numbers are always all zeroes followed by all ones, creating a mask. Essentially we keep everything below the boundary and discard everything above the boundary. Since we just want to know where the numbers between 0 and 15 sit, we can jettison anything higher than 15 to get the answer.
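Here is a small sketch of the mask in use. The helper names are hypothetical, and the power of two check is an extra safeguard added for this example:

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, 16 and so on."""
    return n > 0 and (n & (n - 1)) == 0


def offset_in_tile(position, tile_size=16):
    """How many pixels into its tile a position sits."""
    if not is_power_of_two(tile_size):
        raise ValueError("the AND trick only works for powers of two")
    return position & (tile_size - 1)


print(offset_in_tile(154))  # 10
print(offset_in_tile(273))  # 1
print(154 % 10, 154 & 9)    # 4 8
```

The last line shows why the restriction matters: with a tile size of 10 the mask and the true remainder disagree.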

If this has been difficult to understand don’t worry, some practice would clarify things much more quickly. It’s only worth attempting to conquer if you want to make a game yourself and even then it’s not strictly necessary depending on what you want to make. I’ve been watching seminars about various programming topics and one of the most enlightening was about how efficient compilers can be. Divide a number by a power of two and the compiler will probably use a bitshift anyway, as long as that number is a constant. Curiously, GameMaker seems to allow for tile sizes that aren’t powers of two. Perhaps it has some further optimizations if the user selects a different size but it’s hard to imagine anything being quite so streamlined as a power of two, they really are magic numbers when it comes to computing.

Working on the Teio Form mod dramatically increased my comfort with bitwise operations and understanding the compiler. I’m far from being an expert but it was a valuable experience. If you have a game you’ve wanted to make some minor alterations to, I highly recommend learning how to use Cheat Engine and taking a crack at it, you might surprise yourself and learn something along the way.

Fully Automatic
When you’re working with tilemaps that store information on a per-pixel level, it makes sense to check collisions against whole numbers. This leads to a problem though. Suppose you’re moving at a constant rate of 1.3 units per frame, if you floor that number (drop everything after the decimal) then you’ll only ever move 1 pixel per frame which doesn’t reflect the player’s true speed. Worse yet, if your speed gradually increases to 3.0 units per frame then you’ll jump from moving at a constant rate of 1 to 2 to 3. Basically the movement would look like this:

1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, etc.

This feels like the character is “changing gears”, it’s truly terrible if you experience it hands-on. Instead we want a much more gradual acceleration like this:

1, 1, 1, 2, 1, 1, 2, 1, 2, 2, 2, 3, 2, 2, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, etc.

Thankfully this is easy. Just accumulate the fractional portions of the number and let them spill over into a single frame speed burst every once in a while. It looks like this:
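In code, one possible version of that accumulator (a sketch with invented names, not the game's implementation) might be:

```python
import math


class SubpixelMover:
    """Carries the fractional part of the speed from frame to frame."""

    def __init__(self):
        self.remainder = 0.0

    def step(self, speed):
        """Return the whole number of pixels to move this frame."""
        self.remainder += speed
        whole = math.trunc(self.remainder)  # whole pixels, toward zero
        self.remainder -= whole             # keep the leftover fraction
        return whole


mover = SubpixelMover()
print([mover.step(1.25) for _ in range(8)])  # [1, 1, 1, 2, 1, 1, 1, 2]
```

The example uses a speed of 1.25 rather than 1.3 because 1.25 is stored exactly in binary floating point, which keeps the printed pattern predictable. Over many frames the total distance matches the true speed even though each frame only moves by whole pixels.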

You could question whether movement should be more fine-grained but personally I’m attached to the idea of rendering and processing snapped to a pixel grid, especially since the game will likely use low resolution sprite art. There’s a clarity to this approach. When a pixel is a pixel you know exactly where you stand, literally.

Made of Words
As you can probably tell from the rest of this post, I have some interest in the art of programming. I’ve never considered myself an exceptional programmer (although I’m certainly not bad either) but there’s something about having sole ownership over a project which forces you to take this stuff more seriously. I suppose I got a taste of this when I was leaving my day job because I took it upon myself to refactor much of my previous work for the sake of my replacement. It’s important to think of your future self in much the same way, after all they will have to look back on your work to understand it.

In theory I’m a believer in the idea of “self-documenting code”, something which is so clear its purpose is obvious from the code itself, reducing the need for comments which can be misleading. In practice, I haven’t accomplished this goal as often as I’d like. The aforementioned sensor system is one such place. I’ve tested this thoroughly by attaching it to the mouse cursor and checking it under every condition I can imagine, it always returns the correct values but the code behind it isn’t as streamlined as it could be. I’ve already refactored it several times to arrive at my current approach which I was initially proud of but now think is so messy it needs another pass.

Just like regular language, there are many ways to say the same thing. Nowadays I’ve come to enjoy the editing process while writing and I’ve also spent far longer mulling over the specifics of this blog post than I should’ve. Refactoring code can be similarly enjoyable but it’s important to know when you can move on from a certain task and worry about making it better later. To go back to Mike Acton again, I believe he would call this “getting the clowns out of the car”. Set it up in a way which won’t give you headaches later, make it good enough for now and then move on. My sensors fit that description because they always return a simple integer value which is always correct, this makes them a component which can be easily swapped out later.

The programming community has its own kind of ideological warfare with people claiming to hold the answers to your woes if you’ll only use one approach or another. Many smart people with much more experience than me are quite invested in these wars so maybe they know something I don't but from my perspective they all seem to have similar goals. Everyone wants their programs to be made of small, simple components that work in unison to accomplish something greater. I'm beginning to think that the interaction between two components is more important than the components themselves because components can always be improved whereas the interaction between them has a way of getting entrenched and difficult to change. Anything which gets in the way of simplifying those interactions seems to be an over-complication. We all know that you can’t stop yourself from making a mess but you can make your mess easier to clean up later.