Self-modifying code
Jan Christiaan van Winkel
jc at atcmp.nl
Fri Oct 12 21:33:59 AEST 1990
From article <829 at neccan.oz>, by peter at neccan.oz (Peter Miller):
> 1. On a Z80 I wrote some code which used a NMI (non-maskable interrupt).
This reminds me of the code used in Microsoft's 8080 BASIC interpreter.
It had several entry points into the error routine. The error routine expected
an error number in register b. What they had done was:
ld hl,<some number> ; registerpair hl gets the value <some number>
ld hl,<some other number>
ld hl,<some other number>
and so on.
The 16-bit numbers themselves were actually instructions:
ld b,errorcode
By jumping into the middle of one of the ld hl,... instructions, they would
load the error code into b, and then execute some dummy ld hl,... instructions
that would not clobber the value in b, even though the ld b,xxx instructions
were just a byte away.
Although this is not self-modifying code, it is 'shifting the bits a bit and
interpreting the result'. Very clever.
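
To make the overlap concrete, here is a minimal sketch in C that decodes a few
bytes the way a Z80 would. The opcodes (21h for ld hl,nn, 06h for ld b,n, 76h
for halt) are the real Z80 encodings; the error codes 1..3, the byte layout and
the name run_from are my own assumptions for illustration, not the actual
interpreter code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Real Z80 opcodes; everything else in this sketch is hypothetical. */
enum { OP_LD_B_N = 0x06, OP_LD_HL_NN = 0x21, OP_HALT = 0x76 };

/* Three ld hl,nn instructions whose operands are really ld b,n. */
static const uint8_t code[] = {
    OP_LD_HL_NN, OP_LD_B_N, 1,   /* ld hl,0106h  (offset 1: ld b,1) */
    OP_LD_HL_NN, OP_LD_B_N, 2,   /* ld hl,0206h  (offset 4: ld b,2) */
    OP_LD_HL_NN, OP_LD_B_N, 3,   /* ld hl,0306h  (offset 7: ld b,3) */
    OP_HALT                      /* stand-in for the error routine */
};

/* Decode from offset pc until the halt; return the final value of b,
   or -1 if an unknown opcode is met. */
static int run_from(size_t pc)
{
    uint8_t b = 0;
    uint16_t hl = 0;

    assert(pc < sizeof code);
    while (pc < sizeof code) {
        switch (code[pc]) {
        case OP_LD_B_N:            /* two bytes: opcode, value */
            b = code[pc + 1];
            pc += 2;
            break;
        case OP_LD_HL_NN:          /* three bytes: opcode, low, high */
            hl = (uint16_t)(code[pc + 1] | (code[pc + 2] << 8));
            pc += 3;
            break;
        case OP_HALT:
            (void)hl;              /* hl is only a scratch register here */
            return b;
        default:
            return -1;
        }
    }
    return -1;
}
```

Entering at offset 1, 4 or 7 executes exactly one ld b,n; every later ld b,n is
swallowed as the operand of a dummy ld hl,nn, so b survives until the halt.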
> 4. At some point, I realized that using a compiler is rather like
> self-modifying code. The compiler, itself a binary data file, chews on a
> text file and makes a binary data file. When we run the program we just
> compiled, we are asking the OS to load a binary data file and leap into it.
Hmmm. I think you should read Ken Thompson's Turing Award lecture. He
discussed the possibility of getting code into a C compiler without having it
in the source. The trick is illustrated with the addition of a new escaped
character like \n. In the lexical analyzer there is some sort of code like this:
case '\\': switch(getnewchar()) {
case 'n': return '\n';
case 'a': return '\007'; /* the newly added character */
/* my name's Bond, James Bond :-) */
.
.
Now compile the compiler, and you'll have a new compiler that recognizes '\a'.
Now edit the source code to look like this:
case 'a': return '\a';
This is possible because the compiler will be compiled with the compiler that
knows about '\a'. The result is a C compiler that knows that '\a' is in
reality '\007', but nowhere in the source of the C compiler is that knowledge
stored. It is inherited from the previous generation of the C compiler.
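
As a rough sketch of that second-generation source (the names decode_escape
and check_bell_value are made up, and this is not Thompson's actual code), the
escape handler below never spells out 007. The number appears only in the
separate check, which assumes an ASCII-based compiler.

```c
#include <assert.h>

/* Hypothetical stand-in for the lexer's escape handling. c is the
   character that followed the backslash; -1 means "unknown escape". */
static int decode_escape(int c)
{
    switch (c) {
    case 'n':  return '\n';
    case 'a':  return '\a';   /* value supplied by the compiling compiler */
    case '\\': return '\\';
    default:   return -1;
    }
}

/* Not part of the lexer: confirms what the previous generation taught
   this one. Holds on ASCII systems, which is an assumption here. */
static void check_bell_value(void)
{
    assert(decode_escape('a') == 7);
}
```

Whatever value the compiler that builds this file assigns to '\a' is the value
the new compiler will hand out in turn.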
JC
--
___ __ ____________________________________________________________________
|/ \ Jan Christiaan van Winkel Tel: +31 80 566880 jc at atcmp.nl
| AT Computing P.O. Box 1428 6501 BK Nijmegen The Netherlands
__/ \__/ ____________________________________________________________________