Hi Everyone,

let's do some game debugging today. It will start with some easier tasks and then lead us all the way down to undocumented behavior of the PSX.

Be aware that it gets very technical in some parts.

The game we will look at is the Destruction Derby 2 Demo on the PAL Registration Demo disc. What happens in the last release when booting it up?

The game only shows a black screen and hangs forever, with the Error Overlay showing "ED".

ED is the code for CPU Data/Bus request timeout. What does that mean?

The CPU might request data or instructions from the RAM, the BIOS or other devices like the GPU. Whenever such a request doesn't return data within a fixed time limit, the error is generated. This can happen either if the requested device has hung up or if the requested memory area doesn't exist.

You can see here that not every area of the 32-bit address space is mapped to something, and if one of the unmapped areas is accessed, the core will hang up.

In this special case, the request that hung up was an instruction fetch, which means the program counter jumped into some unmapped area.
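To make the idea of a request that nobody answers a bit more concrete, here is a minimal sketch. The region table is deliberately incomplete and only illustrative (just RAM and BIOS, without mirrors or I/O), and `lookup` and `BusTimeout` are made-up names, not anything from the core.

```python
# Illustrative, deliberately incomplete map: (start, size, name).
# A real PSX memory map has more regions (I/O, scratchpad, mirrors, ...).
REGIONS = [
    (0x00000000, 2 * 1024 * 1024, "RAM"),
    (0x1FC00000, 512 * 1024, "BIOS"),
]


class BusTimeout(Exception):
    """Hypothetical stand-in for the "ED" error of the overlay."""


def lookup(physical_address):
    """Return the name of the region that answers, or raise BusTimeout."""
    for start, size, name in REGIONS:
        if start <= physical_address < start + size:
            return name
    # Nothing is mapped here, so no device will ever return data.
    raise BusTimeout(f"no device answers at 0x{physical_address:08X}")


print(lookup(0x001FFFFC))  # RAM
print(lookup(0x1FC00000))  # BIOS
```

An instruction fetch from an address that falls through the whole table is the situation described above: the program counter points somewhere that no device serves.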

This is one of the hardest failures that can happen, but fortunately it's very good for us, because it tells us that this must be a bug: it's impossible that the original console could work that way.

And sometimes we need even more luck. I tried the game in my emulator and it worked...but why?

It turned out I was still using the 8Mbyte option from a previous test there and forgot to remove it. The 8Mbyte option, which is also available in the core, simulates a development console that has 4 times as much memory as the retail consoles.

Now it seems we have everything we need: the issue is 100% reproducible and we have a workaround, so we can analyze the difference between the working and non-working state.

- First step: find out the address in memory that contains the jump to the unmapped area.

- Second step: find out what is writing the content of that address.

- Third step: find out if the content or the address of that write was wrong.

The first and second steps are done very quickly using the emulator's debugger with a trigger on memory writes to the address in question.
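A trigger on memory writes can be pictured roughly like this. This is only a sketch of the idea, not the emulator's actual debugger: `WatchedRam`, `watch_writes` and the `source` label are hypothetical, and the watched address is a placeholder because the real one is not given here.

```python
class WatchedRam:
    """Toy RAM model that records who writes to one watched address."""

    def __init__(self):
        self.words = {}      # sparse storage: address -> 32-bit value
        self.watched = None  # address with a write trigger, if any
        self.hits = []       # (source, value) for every triggered write

    def watch_writes(self, address):
        self.watched = address

    def write32(self, address, value, source):
        if address == self.watched:
            # A real debugger would halt here; the sketch only logs.
            self.hits.append((source, value))
        self.words[address] = value


PLACEHOLDER_ADDRESS = 0x1FF000  # placeholder, not the real address

ram = WatchedRam()
ram.watch_writes(PLACEHOLDER_ADDRESS)
ram.write32(0x001000, 0x12345678, source="CPU")
ram.write32(PLACEHOLDER_ADDRESS, 0x1FEFFC, source="OTC DMA")
print(ram.hits)
```

Only the second write lands in `ram.hits`, which is exactly the information needed for step two: which unit wrote the bad content.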

It turns out the OTC DMA was writing there. The OTC DMA can be used only for preparing a GPU display list in RAM. It will write the address of the next data block to the current address in RAM.
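As a rough model of what the OTC DMA produces, here is a small sketch. It assumes the behavior described in the public PSX documentation: the DMA counts its address down, each word receives the address of the word below it, and the last word receives the end marker 0xFFFFFF. The function name and the dictionary used as RAM are made up for the example.

```python
END_MARKER = 0xFFFFFF  # terminator of the linked list


def otc_dma(ram, start_address, word_count):
    """Fill `word_count` words, counting down from `start_address`."""
    address = start_address
    for index in range(word_count):
        is_last = index == word_count - 1
        # Each entry points to the entry 4 bytes below it.
        ram[address] = END_MARKER if is_last else (address - 4) & 0xFFFFFF
        address -= 4


ram = {}
otc_dma(ram, start_address=0x100C, word_count=4)
for address in sorted(ram, reverse=True):
    print(f"0x{address:06X}: 0x{ram[address]:06X}")
```

The output is a chain 0x100C -> 0x1008 -> 0x1004 -> 0x1000 -> end marker. These words are list pointers for the GPU, not CPU instructions.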

This is again very helpful, because it tells us that the content can under no circumstances be meant to be executed as instructions by the CPU, which in turn means the address it's writing to is wrong.

It's not just wrong, it's VERY wrong.

The OTC DMA can only write to RAM. And there is 2 Mbyte of it, or 8 Mbyte in the case of the Dev consoles. Therefore, it's absolutely clear that the DMA uses an address space of ... 16 Mbyte?

(Addresses are written as hexadecimal numbers here. The "0x" prefix shows that the number is in hexadecimal, with every digit being 4 bits. This is done because it's much easier to see each individual bit compared to decimal.)

Indeed the DMA has 24 bits for the address, so it could work with PSX consoles with 16Mbyte of RAM.
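The numbers can be checked with a few lines of arithmetic. This is just a worked example; the helper names are made up.

```python
MBYTE = 1024 * 1024


def addressable_bytes(address_bits):
    """Number of bytes reachable with the given number of address bits."""
    return 1 << address_bits


def bits_needed(ram_bytes):
    """Address bits needed to reach every byte of a RAM of this size."""
    return (ram_bytes - 1).bit_length()


print(addressable_bytes(24) // MBYTE)  # 16
for size in (2 * MBYTE, 8 * MBYTE, 16 * MBYTE):
    print(f"{size // MBYTE:2d} Mbyte = 0x{size:07X}, needs {bits_needed(size)} bits")
```

So a 2 Mbyte console needs 21 address bits, the 8 Mbyte dev console needs 23, and the 24 bits of the DMA cover 16 Mbyte.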

After so many weird bugs in the past, it's probably no surprise now if I tell you that plenty of games write beyond the 2 Mbyte of RAM with the DMA. Bugs or oversights are all around with so many games.

But why does writing beyond 2Mbyte even work on a console with 2Mbyte of RAM? Because the memory will wrap around.

The last accessible address for the DMA on a 2 Mbyte console is 0x1FFFFC, but if you let the DMA write to 0x200000, which is just beyond that, it will write to 0x000000 instead. This is normal behavior and makes sense: if you just ignore the address bits 21 and 22 by not soldering them, this is what would happen.

The Destruction Derby 2 Demo, however, configures the DMA to write to 0xFFFFFC in the 16 Mbyte area. Assuming this would also wrap around, it ends up at address 0x1FFFFC on the 2Mbyte console and 0x7FFFFC on the 8Mbyte dev console.
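The assumed wrap-around is nothing more than dropping the upper address bits, which can be written as a mask. A minimal sketch, valid only for power-of-two RAM sizes; `wrap` is a made-up helper name:

```python
MBYTE = 1024 * 1024


def wrap(address, ram_bytes):
    """Drop all address bits above the RAM size (power-of-two sizes only)."""
    return address & (ram_bytes - 1)


print(hex(wrap(0x200000, 2 * MBYTE)))  # 0x0
print(hex(wrap(0xFFFFFC, 2 * MBYTE)))  # 0x1ffffc
print(hex(wrap(0xFFFFFC, 8 * MBYTE)))  # 0x7ffffc
```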

At least this makes clear why with the 8Mbyte option, the core can run the game: the OTC DMA doesn't overwrite the data that is stored around the address 0x1FFFFC in RAM.

However, it does not explain why the 2Mbyte console would work when the game is doing that, so we need to dig deeper.


If we look in detail at this DMA transfer, it looks very strange all around. Not only is it writing to such an odd address, it's also executed with a length of zero.

That sounds amazing!

A DMA with length of zero will write zero words to the RAM and cannot overwrite anything, right?

Unfortunately, no.

A length of zero is a special case in the documentation. It's actually used for writing 0x10000 = 65536 words instead. It's also implemented like this in the PSX core.
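To get a feeling for how much memory such a transfer touches, here is a small sketch. It assumes the OTC DMA counts its address down by 4 for every word, and it uses the naively wrapped start address 0x1FFFFC from above; the function names are made up.

```python
def effective_word_count(length_field):
    """A length field of zero means 0x10000 words, not zero words."""
    return 0x10000 if length_field == 0 else length_field


def footprint(start_address, length_field):
    """Lowest and highest word address touched by a down-counting DMA."""
    words = effective_word_count(length_field)
    lowest = start_address - 4 * (words - 1)
    return lowest, start_address


low, high = footprint(0x1FFFFC, 0)
print(effective_word_count(0))          # 65536
print(f"0x{low:06X} .. 0x{high:06X}")   # 0x1C0000 .. 0x1FFFFC
print((high - low + 4) // 1024, "KiB")  # 256 KiB
```

Under these assumptions the "zero length" transfer would overwrite the top 256 KiB of a 2 Mbyte RAM.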

But maybe that is not true? Let's take a look at different emulators:

- DuckStation implements that special case and the game will crash unless you set the console to 8MByte of RAM

- Mednafen does not implement that special case and the game works fine

- PCSX does not implement that special case and the game works fine

Sounds like 2:1 win?

Or 2:2 tie if we also consider the documentation?

I asked in the PSX emudev channel and it quickly became clear that the special case is indeed real; the other behavior is just a bug in those emulators.

Good bugs in this case, as they prevent the game from crashing, but removing our correct implementation is still not the goal.


So what else could be the issue?

Maybe the wraparound from 0xFFFFFC to 0x1FFFFC is not what is happening?

So I wrote a test to run on a real PSX.

The test writes fixed values to addresses 0x000000 and 0x1FFFFC, then starts several DMAs at different addresses and with different lengths, and reports whether each DMA overwrites that data.

Yes, yes, please forgive me, I didn't trust enough, so I also ran the test with a length of zero, and it turns out the documentation was right: the DMA writes 65536 words in this case (green box).

The tests in the red box are more interesting. 

The DMA starts at the end of each memory area, and it can be seen that writes to 0x7FFFFC and 0x3FFFFC indeed act just like writes to 0x1FFFFC, only with different content.

But the write to 0xFFFFFC does not modify the data at 0x1FFFFC at all. Instead, it turns out that if the uppermost address bit is set, the write does not end up in RAM, but is ignored.
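The observed behavior can be summarized in a small model of a 2 Mbyte console. This is a sketch of the finding only, not the core's implementation; the names, the sentinel value and the dictionary used as RAM are made up for the example.

```python
RAM_BYTES = 2 * 1024 * 1024
TOP_ADDRESS_BIT = 0x800000  # bit 23, the uppermost of the 24 DMA address bits
SENTINEL = 0xDEADBEEF       # arbitrary marker value for the test


def dma_target(address):
    """RAM address a DMA write lands on, or None if the write is ignored."""
    address &= 0xFFFFFF
    if address & TOP_ADDRESS_BIT:
        return None                     # uppermost bit set: write is dropped
    return address & (RAM_BYTES - 1)    # otherwise wrap into the 2 Mbyte RAM


def dma_write(ram, address, value):
    target = dma_target(address)
    if target is not None:
        ram[target] = value


def sentinel_survives(dma_address):
    """Mimic the hardware test: does a DMA write leave 0x1FFFFC intact?"""
    ram = {0x1FFFFC: SENTINEL}
    dma_write(ram, dma_address, 0x12345678)
    return ram[0x1FFFFC] == SENTINEL


for address in (0x1FFFFC, 0x3FFFFC, 0x7FFFFC, 0xFFFFFC):
    print(f"0x{address:06X}: sentinel survives = {sentinel_survives(address)}")
```

Only the write to 0xFFFFFC leaves the sentinel alone, which matches the result in the red box.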

That completely explains why the game was not working, and after implementing the fix, it works:

The tests also revealed another edge case when the DMA is counting down the address when it's already at 0x000000.

Maybe some game out there will depend on that, too.

I summarized all my findings and they are now part of the documentation:

Given how long it took me to find such tiny details, I can't thank the authors of the documentation enough.

What they have built for the next generation of PSX emulator and FPGA core developers is massive.

Have fun!

Comments

Anonymous

Another fantastic article, loving these insights into the methods you’re using to analyse and debug the psx core!

Thorias

Thanks for your hard work on the Psx core, please carry on!