Dynamic Recompilation
Jeremy Chadwick - February 15, 1998
The following conversation was logged on February 14th, 1998 and deals with
dynamic recompilation. The views of Jeremy are not the views of God. Some
people usually think they are =)
- Hiya Jer!
- Heya
- I've been really curious about something in emulation that I believe you once thought about...
Executor was the first emulator that I was aware of that used dynamic recompilation.
Now, it seems that the N64 emulation scene wants to use this method.
Do you understand what goes into dynamic recompilation?
- Yup
I've considered it prior to VeNES.
In fact, the first idea I had about emulation was doing 'dynamic recompilation'
(what a bullshit word. Heh)
- Hehehe...
- The correct term is 'code translation'
Do you know what the actual process does?
It's quite complex...
- So, is there any reason why this hasn't been exploited?
- Yeah.
One big reason: the amount of time it takes to write it.
- I understand that it converts op codes to be recognized as native op codes.
- Kind-of.
It's more complex than that.
If you wanna log, I can discuss what it is...
- I was thinking that this might kinda provide the PC with a reference table for converting automated...
- * Y0SHi taps the mic.
So first, let's take the term itself, and dissect it.
'dynamic recompilation'
Dynamic, meaning ever-changing. Recompilation, meaning re-writing or
re-creating. There's multiple methods of incorporating this idea into an
emulator... All in all, it's a bitch to write. :-/. So first off, I'd like
to state that consoles (and in the case of Executor, PCs) are very timing
sensitive... If you're one cycle (less than a blink of an eye, in English)
off, the result on-screen can turn to crap. Basically, what dynamic
recompilation is, is the process of translating opcodes which're native to
the emulated system, into opcodes which're native to the system running the
emulator.
Now, as I said, there's multiple methods of doing this. Some people like
doing this realtime -- meaning, while the emulator runs. I prefer the method
of doing it all during load-time (while the emulator loads the ROM, etc.)
(NOTE: qNES was originally supposed to use this idea, but due to the
complexity of it all, I decided against it). So, as an example, let's take an
opcode from the 65816 (SNES) and turn it into a PC opcode. Here's some code
which, say, a 65816 could do:
LDX #5
LDA ($2000,X)
Now, the complexity of this code isn't very high, but it'll serve as a great
example to explain what 'dynamic recompilation' is. On a 65816 CPU, the
following happens when the above piece of code is executed:
- X index register is assigned the value '5'
- CPU flags are changed based on the '5' value loaded. For instance, the
(z)ero flag would be 0, because 5 is not 0.
- The CPU prepares itself to get some data from the specific address: in
this example, this is the indirect address of $2000 + the X register.
The CPU adds $2000 to whatever is in X, which is 5. Hence, we get the
address $2005. At this point, the CPU has to do indirect addressing. In
English... If you want an apple, you walk over to the basket full of apples
and get one. This is called direct addressing. If you want an apple, and
you walk over to the basket full of apples and find a note which says 'see
the banana basket', you will go to the banana basket and get an apple. This
is called indirect addressing. The CPU then gets the actual 16-bit address
which is at $2005. So, let's pretend at $2005 and $2006 (because it's
16-bit, two memory locations are needed), we have the values:
$2005 = 34
$2006 = 12
The 65c816 is little-endian, meaning the upper part (high byte) of the
16-bit address is loaded from the later address. So, what the CPU then gets
is the address $1234.
- So, finally, the CPU gets the actual data from the address $1234.
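The steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: a flat 64K `mem` array stands in for the console's memory map, the function name `lda_indexed_indirect` is made up, and the data byte stored at $1234 is an arbitrary value chosen for the example.

```python
# Hypothetical flat 64K memory; a real SNES memory map is far more involved.
mem = bytearray(0x10000)

# The 16-bit pointer lives at $2005/$2006, low byte first (little-endian).
mem[0x2005] = 0x34
mem[0x2006] = 0x12
mem[0x1234] = 0x99  # the data we are ultimately after (arbitrary value)


def lda_indexed_indirect(mem, base, x):
    """Return (value, zero_flag) for LDA (base,X) as walked through above."""
    pointer_addr = (base + x) & 0xFFFF       # $2000 + 5 = $2005
    lo = mem[pointer_addr]                   # low byte of the pointer
    hi = mem[(pointer_addr + 1) & 0xFFFF]    # high byte from the later address
    target = (hi << 8) | lo                  # $1234
    value = mem[target]                      # the actual data
    return value, value == 0                 # zero flag is set only for 0


x = 5  # LDX #5
a, zero = lda_indexed_indirect(mem, 0x2000, x)
print(hex(a), zero)
```

Every line of that function is work an interpreting emulator repeats each time the instruction runs.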
Now, as you can see, this is a very complex process which happens in a
matter of microseconds. To show you how to use 'dynamic recompilation',
we must dissect each opcode the 65816 uses and make them into opcodes which
an x86 (PC) uses. Our above example will work great. The above code would
LITERALLY translate to:
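MOV ECX, 5                      ; X = 5
MOV EDX, 2000h                  ; base address
MOVZX EDX, WORD PTR [EDX+ECX]   ; fetch the 16-bit pointer stored at $2005
MOV EAX, [EDX]                  ; fetch the data the pointer refers to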
Emulators, in a way, already do this. But there's MUCH MUCH __MUCH__ more
going on. They must emulate all of the steps I listed above, all via
software, rather than using the x86 (or whatever CPU)'s built-in registers,
flags, etc. to do NATIVE emulation. Plus, you have to take memory into
consideration. The PC and the Mac and other systems don't all think
address $2005 holds the data they want, because that $2005 address is
specific to the console (NES, SNES, etc.). So you have to EMULATE the
$2005 "register" -- doing this may, say, change the video mode, make
graphics show up, turn off the screen, etc. There's lots to an emulator
which makes it complex... How can 'dynamic recompilation' help us?
Instead of emulating all of the opcodes as 65816 opcodes, why not
translate all the 65816 opcodes into PC opcodes and run the code 100%
native to the PC? That's what the concept is.
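A rough sketch of the translate-once, run-many idea, in Python. To be clear about what is swapped in: no real x86 code is generated here. Python closures stand in for the emitted native code, and the instruction names (`LDX_IMM`, `LDA_IND_X`), the `state` dictionary, and the flat `mem` array are all hypothetical. The point is only that decoding happens a single time, at load, and the run loop no longer inspects opcodes.

```python
def translate(program):
    """Turn decoded guest instructions into host callables, once, at load time."""
    translated = []
    for op, operand in program:
        if op == "LDX_IMM":
            # The operand is bound now, so nothing is decoded at run time.
            def host_op(state, mem, value=operand):
                state["x"] = value
                state["z"] = (value == 0)
        elif op == "LDA_IND_X":
            def host_op(state, mem, base=operand):
                ptr = (base + state["x"]) & 0xFFFF
                target = mem[ptr] | (mem[(ptr + 1) & 0xFFFF] << 8)
                state["a"] = mem[target]
                state["z"] = (state["a"] == 0)
        else:
            raise ValueError("unhandled instruction: " + op)
        translated.append(host_op)
    return translated


def run(translated, state, mem):
    """Run the pre-translated code; no opcode decoding happens here."""
    for host_op in translated:
        host_op(state, mem)


mem = bytearray(0x10000)
mem[0x2005], mem[0x2006], mem[0x1234] = 0x34, 0x12, 0x42  # 0x42 is arbitrary
state = {"a": 0, "x": 0, "z": False}
code = translate([("LDX_IMM", 5), ("LDA_IND_X", 0x2000)])
run(code, state, mem)
print(state)
```

A real recompiler would emit host machine code into a buffer instead of building closures, which is where most of the man-hours go.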
How can 'dynamic recompilation' hurt us?
The amount of time it takes to do this is phenomenal. Meaning, the amount
of man-hours... You _REALLY_ have to know how a CPU works to know how to
do this. This brings me to my final point, timing.
On, say, the NES... If you're a few cycles off, the little car (a sprite)
may end up looking like it's driving on the sidewalk. Not on the road...
So, systems like the NES, the SNES, and other consoles, are timing
intensive. One simple blink of an eye, and the car goes from being where
it's supposed to be, to, say, where your hi-score is. So, the big problem
with dynamic recompilation is the timing aspects to the consoles....
How do you get around this problem? It's quite easy, but pretty rough on
the programmer... You literally have to emulate time. It's called 'cycle
counting' right now. As time moves on, we will find ways to optimize cycle
counting and truly make emulation something which is native.
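Cycle counting might look roughly like this in Python. Treat everything here as an assumption for illustration: the cycle costs in `CYCLES` are placeholders rather than verified 65816 timings, the per-scanline budget is kept tiny, and `run_scanline` is a made-up name. The emulated CPU is only allowed to run until it has used up the cycles that fit into one scanline, then video gets its turn.

```python
# Illustrative cycle costs; placeholders, not verified 65816 timings.
CYCLES = {"LDX_IMM": 2, "LDA_IND_X": 6, "NOP": 2}
CYCLES_PER_SCANLINE = 12  # hypothetical budget, kept tiny for the example


def run_scanline(program, pc, budget=CYCLES_PER_SCANLINE):
    """Step through instructions until the scanline's cycle budget is used up.

    Returns the new program counter and the cycles actually spent.
    """
    spent = 0
    while pc < len(program) and spent + CYCLES[program[pc]] <= budget:
        # A real emulator would execute the instruction here.
        spent += CYCLES[program[pc]]
        pc += 1
    return pc, spent


program = ["LDX_IMM", "LDA_IND_X", "NOP", "NOP", "NOP", "NOP", "NOP", "NOP"]
pc, spent = run_scanline(program, 0)
print(pc, spent)   # where the CPU stopped, and how many cycles it used
pc2, spent2 = run_scanline(program, pc)
print(pc2, spent2)
```

With translated code the same bookkeeping still has to happen, for example by adding up the cycle cost of each translated block ahead of time.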
Oh, one other note :-) -- The amount of RAM dynamic recompilation takes up
can be pretty phenomenal. That 1MB (megabyte) SNES game you like, which, if
loaded, takes up 1MB of RAM plus, say, 300K for the emulator, may now
take up, say, 5MB of RAM total. It won't take up any more disk space,
just more RAM, and in some cases it may even take up __LESS__ RAM!
Depends on the system :-)
- Therefore, your standard N64 ROM ... being 8MB at
the least ...
- Nintendo 64? Wow... It's a RISC CPU -- Reduced Instruction Set...
The entire point of a RISC CPU is to reduce the number of opcodes the CPU has.
Hence making it faster.
- Do you have any insights on Project Reality -- the
most advanced Nintendo 64 emulator so far, which intends to exploit
dynamic recompilation?
- So the entire process could take up PHENOMENAL amounts of RAM, if
natively run... All depends on the author you know -- everyone has their own
ideas! :-) Actually I don't, since I'm not a big N64 fan. But I can say
this. bpoint is a tactician of sorts, and I have no doubt in my mind he can
implement dynamic recompilation with fast, and very decent, results. I'm
looking forward to trying out Project Reality when dynamic recompilation is
implemented :-)
Just remember one important thing -- Emulation itself is truly remarkable.
What a little chip the size of your thumb can do in a thousandth of a
microsecond, can take up a full second when emulated on a different system.
As technology advances, and programmers come up with new ideas, expect
emulation of high-speed systems such as the N64, to become more and more
realistic. It just takes time -- because time is truly what emulation is
all about. Bathroom for me :-)
- Oooh.
I understand that last statement more than the first part of this...
It's a been-there-seen-it-done-it type of knowledge. I'm curious for your
opinion on why authors haven't seemed to have thought about dynamic
recompilation before. Usually, many authors like to brainstorm on a web page
(esp. vaporware authors)... I've seen operating systems (SneOS / VSMC) as
ideas... then we've seen the classic "I'm using full ASM to code an
unportable emulator" statement -- but it seems that no authors have thought
about dynamic recompilation with the exception of bpoint and I'm surprised
that the idea has never caught on until now.
- I think the idea has been around for a long time...
It's sifted through my head about a zillion times, even prior to VeNES.
In fact, the first idea which went through my head when it came to SNES
emulation, was "Why not write a SNES emulator for the Apple IIGS?".
I mean, it's the same CPU, no opcode emulation would be necessary -- just
memory emulation. I obviously figured out the answer. No one uses an Apple
IIGS anymore. And besides, the Apple IIGS has a slower CPU than the SNES.
And, besides that, the Apple IIGS can't display more than 16 colours per
horizontal scanline (while the SNES can do 256 per scanline, if necessary).
The concept of dynamic recompilation has been around for a long time; it's
just that everyone's been afraid of it.
Heck, I'm afraid of it :-)
It's an excellent idea, and it _CAN_ be implemented correctly. It just takes
long, long hours of testing... and lots of Chinese food :-) In a way...
Emulation right now is doing a form of 'dynamic recompilation'. But what
Project Reality and Executor are up to is basically a super-optimized version
of it. I'd have to call it 'native emulation' because the CPU emulation is
then kicked out the door -- only memory & video emulation is left. The
reason it's now surfacing is pretty obvious to me.
People are finally asking themselves:
"Why is playing a Super Nintendo game on my P233MHz system still slow,
when the Super Nintendo is only 3.58MHz?"
And the answer is, "I don't know! Let's do it!" So, people're doing it
:-) Any other questions?