
Content

Hi everyone,


I've been working mostly on the CPU part of the N64 core over the last 2 weeks and made some good progress there. Yesterday the wonderful n64-systemtest finally started to work on the core.

This alone is great, but it will also allow me to track progress and regressions much better, see what is still left to do, and help with fixing these things.

So let's take a look at what this test suite checks on the CPU side and how the core performs there, then we can judge how much work is still open.


- Arithmetic

These tests check all basic calculations (ADD, SUB, MUL, DIV) as well as logical operations (AND, OR, XOR, NOR) and bit shifts. I fixed a DIV bug yesterday and now all tests pass:


- Jumps

The MIPS CPU supports the standard conditional and unconditional jumps and branches.

What is special about these in the MIPS architecture is the branch delay slot. To let the CPU execute 1 instruction in every clock cycle, even when it jumps to a different address, the 1 instruction following each branch is always executed, no matter whether the branch was taken or not. This allows the CPU to keep full speed without needing any branch prediction. But it also means you get many edge cases, like branches in a branch delay slot.
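To make the delay slot a bit more concrete, here is a minimal Python sketch of a toy interpreter that tracks both the current and the next program counter. The tuple-based instruction format and the names `run`, `pc` and `next_pc` are made up for illustration and have nothing to do with how the core is written.

```python
def run(program, max_steps=100):
    """Toy MIPS-like interpreter that models the branch delay slot.

    program is a list of tuples, indexed by instruction address:
      ("addi", reg, imm)     regs[reg] += imm
      ("beqz", reg, target)  branch to target if regs[reg] == 0
      ("halt",)              stop
    """
    regs = {"r1": 0, "r2": 0}
    pc, next_pc = 0, 1
    trace = []                      # addresses of executed instructions
    for _ in range(max_steps):
        op = program[pc]
        if op[0] == "halt":
            break
        trace.append(pc)
        # By default, execution continues sequentially after next_pc.
        new_next = next_pc + 1
        if op[0] == "addi":
            regs[op[1]] += op[2]
        elif op[0] == "beqz" and regs[op[1]] == 0:
            # Branch taken: the instruction at next_pc (the delay slot)
            # still runs, only the fetch after it comes from the target.
            new_next = op[2]
        pc, next_pc = next_pc, new_next
    return regs, trace


program = [
    ("beqz", "r1", 3),    # 0: taken, because r1 == 0
    ("addi", "r2", 1),    # 1: delay slot, executed anyway
    ("addi", "r2", 100),  # 2: skipped by the branch
    ("halt",),            # 3
]
regs, trace = run(program)
```

In this toy model a taken branch sitting in a delay slot would run exactly one instruction at the first target before continuing at the second target, which is the kind of edge case mentioned above.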

Compared to the PSX MIPS CPU, the one in the N64 also has new branch instructions, which are like the old ones, but the branch is now "likely". That means that the instruction in the branch delay slot is only executed when the branch is taken and is discarded otherwise. This may sound easy, but it leads to even more edge cases.

In any case, the last of these edge cases was fixed today and now all tests in the Jumps category pass:
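For the likely variants, the MIPS III rule is that the delay slot instruction is only discarded when the branch is not taken. A small sketch of that rule, with hypothetical helper names (`delay_slot_executes`, `next_fetch`) and plain 4-byte instruction addresses:

```python
def delay_slot_executes(taken: bool, likely: bool) -> bool:
    """Return True if the instruction in the delay slot is executed.

    Normal branches (e.g. BEQ) always execute the delay slot.
    Likely branches (e.g. BEQL) execute it only when the branch is taken
    and nullify it otherwise.
    """
    if likely:
        return taken
    return True


def next_fetch(pc, target, taken, likely):
    """Addresses executed right after a branch located at pc."""
    executed = []
    if delay_slot_executes(taken, likely):
        executed.append(pc + 4)                    # delay slot
    executed.append(target if taken else pc + 8)   # branch target or fall-through
    return executed
```

For example, `next_fetch(0x100, 0x200, False, True)` skips the delay slot entirely, while the same call with `likely=False` executes it before falling through.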


- Exception Instructions

The CPU offers some instructions like SYSCALL to alter the program flow immediately. They can be used for multithreading or to call some management functionality. Those instructions act like a jump, but the target is fixed and the CPU saves some context to special registers in Coprocessor 0 (COP0).

The core still fails some of these tests and I need to investigate. It could be some edge case where they are executed in a branch delay slot.
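The delay slot case is special because the saved return address has to point at the branch, not at the faulting instruction. A minimal sketch of the context saving, assuming a 32-bit view of the registers, Status.BEV = 0, and a hypothetical dict-based COP0 (the function name `raise_exception` is also just for illustration):

```python
CAUSE_BD = 1 << 31           # Cause bit 31: exception happened in a delay slot
STATUS_EXL = 1 << 1          # Status bit 1: exception level
EXC_SYSCALL = 8              # ExcCode value for SYSCALL
GENERAL_VECTOR = 0x80000180  # general exception vector (32-bit view, BEV = 0)


def raise_exception(cop0, pc, exc_code, in_delay_slot):
    """Save context into a dict of COP0 registers and return the new PC."""
    # EPC and the BD flag are only updated if we are not already
    # inside an exception handler (EXL clear).
    if not cop0["status"] & STATUS_EXL:
        if in_delay_slot:
            # Point EPC at the branch so it is re-executed on return.
            cop0["epc"] = (pc - 4) & 0xFFFFFFFF
            cop0["cause"] |= CAUSE_BD
        else:
            cop0["epc"] = pc
            cop0["cause"] &= ~CAUSE_BD & 0xFFFFFFFF
    # ExcCode lives in Cause bits 6..2.
    cop0["cause"] = (cop0["cause"] & ~(0x1F << 2) & 0xFFFFFFFF) | (exc_code << 2)
    cop0["status"] |= STATUS_EXL
    return GENERAL_VECTOR


cop0 = {"status": 0, "cause": 0, "epc": 0}
new_pc = raise_exception(cop0, 0x80001004, EXC_SYSCALL, in_delay_slot=True)
```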


- Overflow Exceptions

Some arithmetic instructions can raise an exception on overflow, for example if adding two positive numbers gives a negative result because the true result is too large for 32/64 bit. In theory there could be code in the exception handler to resolve that case and continue execution.
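The overflow condition itself is simple: both operands have the same sign and the result has a different one. A minimal sketch for the 32-bit ADD case, with hypothetical helper names and operands assumed to already be in the signed 32-bit range:

```python
def to_signed32(value):
    """Wrap an integer into the signed 32-bit range."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def add_with_overflow(a, b):
    """32-bit signed addition. Returns (wrapped_result, overflow_flag)."""
    result = to_signed32(a + b)
    # Overflow: operands share a sign, but the result's sign differs.
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, overflow
```

On the real CPU, ADD raises the exception and leaves the destination register untouched when the flag would be set, while ADDU just keeps the wrapped result.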

I'm not sure yet how N64 games handle it, but in PSX games this feature was never used. Some PSX emulators don't check for the overflow at all for performance reasons, because if it happens the game would crash anyway.

In any case, those checks currently don't work on the core: the CPU hangs in an infinite wait with a black screen, so no screenshot of the result here.


- Traps

Those are relatively similar in behavior to overflow exceptions. Trap instructions check for certain conditions and, if they are met, trigger an exception. They are mostly used for debugging purposes and typically not used in games. Nonetheless, they should work, and currently the core just presents a black screen there, so I need to take a look.
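The conditions checked by the register forms of the trap instructions boil down to a handful of comparisons. A minimal sketch, assuming register values are passed as Python integers in the signed 64-bit range (the table and the function name `trap_taken` are hypothetical):

```python
import operator

MASK64 = (1 << 64) - 1

# Signed comparisons used by the register forms of the trap instructions.
SIGNED_CONDITIONS = {
    "TEQ": operator.eq,
    "TNE": operator.ne,
    "TGE": operator.ge,
    "TLT": operator.lt,
}


def trap_taken(mnemonic, rs, rt):
    """Return True if the trap exception would be raised."""
    if mnemonic in ("TGEU", "TLTU"):
        # Unsigned variants compare the raw 64-bit patterns.
        rs, rt = rs & MASK64, rt & MASK64
        mnemonic = mnemonic[:-1]
    return SIGNED_CONDITIONS[mnemonic](rs, rt)
```

The signed and unsigned variants disagree as soon as one operand is negative, for example `trap_taken("TGE", -1, 0)` versus `trap_taken("TGEU", -1, 0)`.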


- COP0

Coprocessor 0 in the CPU has several tasks: it stores some context on an exception, it handles different privilege modes, it has a free-running counter that can be used to create timer interrupts, and it also does all the TLB handling.
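The timer part works through the Count and Compare registers: when Count reaches Compare, the timer interrupt bit in Cause is set, and writing Compare clears it again. A minimal sketch, where the class name `Cop0Timer` is hypothetical and one call to `tick()` stands for one Count increment, however many CPU cycles that takes:

```python
IP7 = 1 << 15  # Cause bit 15: timer interrupt pending


class Cop0Timer:
    """Toy model of the COP0 Count/Compare timer."""

    def __init__(self):
        self.count = 0
        self.compare = 0
        self.cause = 0

    def tick(self):
        """Advance Count by one and flag the interrupt on a match."""
        self.count = (self.count + 1) & 0xFFFFFFFF
        if self.count == self.compare:
            self.cause |= IP7

    def write_compare(self, value):
        """Writing Compare also acknowledges a pending timer interrupt."""
        self.compare = value & 0xFFFFFFFF
        self.cause &= ~IP7


timer = Cop0Timer()
timer.write_compare(3)
timer.tick()
timer.tick()
pending_before = bool(timer.cause & IP7)   # Count is 2, no match yet
timer.tick()
pending_after = bool(timer.cause & IP7)    # Count is 3, matches Compare
```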

The COP0 tests of the systemtest however only test some basic read/write patterns on the COP0 registers, so all but one of them pass already:


Overall this looks very promising already. Mainly some edge cases and exception handling need to be fixed. I hope to get all these working in the next 2 weeks.

Does that mean the CPU is complete then?

Unfortunately no. There are several parts not tested at all and some parts I left out:

- Instruction and Data Cache: I implemented the instruction cache last week, but the data cache is still missing and needs to be added. Both are not covered by the systemtest, but there is a different test for them.

- Interrupts: They are still missing and unfortunately I don't have any good test ROM for interrupts. This was a serious problem with my emulator, but since I figured it out there already, I'm optimistic I can get it working without too much effort.

- TLB: There are plenty of TLB tests in the systemtest, but the TLB is not implemented and I will not do it yet. At least 50% of all games can work without TLB, so I will postpone the work on that until some games run fine.


Then there is the big elephant in the room: the floating point unit. The FPU is implemented as coprocessor 1 in the CPU and is therefore part of the CPU. Still, I don't consider it part of the normal CPU in my development flow, because it can be added more or less independently. The handling of the FPU in the CPU can be compared to the GTE in PSX, which is also a coprocessor. There is a defined interface for reading and writing FPU registers and executing FPU commands.

But yes, the FPU must be done at some point and since 99% of all games use it, it's absolutely mandatory. Even the systemtest uses it to calculate the time a test took in seconds. Because it is missing, you can see the "E2" overlay, which signals that some (currently) unsupported instruction was executed.


Overall the next tasks are very clear: get the 6 main CPU test categories to work fully and add the data cache and interrupts. Then the CPU is ready for the first games and work on the other components (FPU, RSP, ...) can start.

While doing that, maybe I'll throw in some RDP or audio work so we can hear something or see more than just text.


Have fun!

Comments

Matt Hargett

Branch delay slot simulation is *very* tricky. In 2003, I founded a company called BugScan that used symbolic execution (inspired by UltraHLE and Bleem) to find exploitable security vulnerabilities. We supported x86, but our biggest customers cared a LOT about MIPS and SPARC as well. We would analyze multi-hundred megabyte commercial products, then go through our report and verify exploitability. BDS on SPARC and MIPS took over a month of testing and refinement to get (known) false positives down to 0. I'm hoping it'll be easier for you with a proper CPU test suite!

FPGAzumSpass

Thankfully I don't have to test software, only the CPU part, and the combinations of different operations are somewhat limited. But yes, it's hard to judge what can be problematic without running into it. The systemtest does cover at least the cases with the highest risk of being wrong.

Anonymous

Thank you for the updates Robert! Very exciting seeing the n64-systemtest from Lemmy working. Are you programming the core to read/write from DDR3 memory or the SDRAM module? Does the FPGA have its own internal 1MB memory?

FPGAzumSpass

RDRAM is placed in DDR3 and the game data (ROM and saves) will be placed in the SDRAM. The FPGA has about 500 KByte of internal RAM available for the core; this will be used for CPU caches, RSP memories and RDP buffers.