Calling functions by address

Wed Aug 31 20:35:05 AEST 1988

In article <679 at mssx.UUCP> src at mssx.UUCP (Pleschutznig Andreas) writes:
: Suppose following:
:
: We want to write a software emulation for another processor on UNIX
:
: So the main problem is to get into the emulation routines as fast as possible,
: and therefore it does not seem to be good enough to do
:
:       switch (code) {
:               case ..
:               case ..
:               }
:
: So we thought of doing that job by declaring the addresses of the emulation
: routines and jumping to the routines by address like this
:
:       (*addressarray[code]);
:
: I know, I know that *does not* work, but maybe there is someone knowing to
: get around.

The correct call is

	(*addressarray[code])();

However, you will probably find the switch to be faster.
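
For reference, here is a minimal sketch of the pointer-table form in
full. The handler names (op_nop, op_inc, op_dec), the acc variable, and
the dispatch() wrapper are hypothetical, made up for illustration; only
the call expression itself comes from the line above.

```c
#include <stddef.h>

/* Hypothetical emulator state and handlers, for illustration only. */
static int acc;

static void op_nop(void) { }
static void op_inc(void) { acc++; }
static void op_dec(void) { acc--; }

/* Array of pointers to functions taking no arguments and returning void. */
static void (*addressarray[])(void) = { op_nop, op_inc, op_dec };

#define NOPS (sizeof addressarray / sizeof addressarray[0])

/* Dispatch one opcode; returns 0 on success, -1 if code is out of range. */
static int dispatch(unsigned code)
{
    if (code >= NOPS)
        return -1;
    (*addressarray[code])();    /* the trailing () is what makes this a call */
    return 0;
}
```

Calling dispatch(1) runs op_inc through the table. Without the
trailing parentheses the expression merely names the function and
calls nothing, which is why the version in the quoted article does not
work.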

----

The following analysis presumes that we are running on something
of the order of a Vax, 386, or 68k, and have a reasonable C
compiler.

Here is what the switch is likely to generate:

	move    a0,code         (written in GASM: Generic Assembler)
	cmp     a0,256
	jgrtr   after           (branch if *unsigned* greater than 256)
	shiftl  a0,2
	jmp     table[a0]

The call should generate something like

	move    a0,code
	shiftl  a0,2
	call    table[a0]

and the function should add to this

	sub     sp,#locals      (or something else to make a stack frame)
	...
	return

Side by side, with the switch on the left and the call on the right:

	move    a0,code         move    a0,code
	cmp     a0,256          shiftl  a0,2
	jgrtr   after           call    table[a0]
	shiftl  a0,2            sub     sp,#locals
	jmp     table[a0]       return

On the machines I am familiar with, the cmp and sub instructions
are about equivalent in execution time, so leaving those out along
with the common instructions, we have:

	jgrtr   after           call    table[a0]
	jmp     table[a0]       return

I would expect the not-taken branch plus the indirect jump to
take significantly less time than the call plus the return.  I
have also been *very* generous in my assumptions about how fast
the call will be; it often will have more overhead than the one
instruction I added.
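
For comparison, the switch form in C might look like the sketch below.
The opcode names, the acc variable, and step() are hypothetical. The
point is only that the case bodies sit inline, so a compiler that
builds a jump table needs no call or return to reach them; whether a
given compiler actually builds a jump table depends on the compiler
and on how dense the case values are.

```c
/* Hypothetical opcodes and emulator state, for illustration only. */
enum { OP_NOP, OP_INC, OP_DEC };

static int acc;

/* Execute one opcode inline; returns 0 on success, -1 for an unknown code. */
static int step(unsigned code)
{
    switch (code) {
    case OP_NOP:
        break;
    case OP_INC:
        acc++;
        break;
    case OP_DEC:
        acc--;
        break;
    default:            /* plays the role of the "after" label above */
        return -1;
    }
    return 0;
}
```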

You could also get rid of the cmp and jgrtr instructions, if you
are *certain* that the code passed will always be within range.
Some compilers let you create assembler output instead of object
output; you take that assembler output, run a sed script on it to
delete the range check, and then assemble the result.  You can use
this method to tweak all sorts of other things, too.  But do this
only if you are *really* desperate for speed.
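
A different, source-level way to make the range check unnecessary (not
the sed approach, and applied here to the pointer-table form rather
than the switch) is to give the table one slot for every value the
index can take. The sketch below assumes 8-bit opcodes; the handler
names, acc, and init_table() are hypothetical, and I have not measured
whether this is faster on any particular machine.

```c
/* Hypothetical handlers and emulator state, for illustration only. */
static int acc;

static void op_inc(void) { acc++; }
static void op_dec(void) { acc--; }
static void op_bad(void) { acc = -1; }   /* catches every unassigned opcode */

/* One slot for every possible value of an 8-bit opcode. */
static void (*table[256])(void);

/* Fill every slot, so that no index in 0..255 holds a null pointer. */
static void init_table(void)
{
    int i;

    for (i = 0; i < 256; i++)
        table[i] = op_bad;
    table[1] = op_inc;
    table[2] = op_dec;
}

/* The mask keeps the index in 0..255, so no comparison is needed. */
static void dispatch(unsigned code)
{
    (*table[code & 0xff])();
}
```

The cost is a full 256-entry table and a catch-all handler for the
opcodes that are not assigned.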

(I wanna #pragma :-)

---
Bill
novavax!proxftl!bill