structure function returns -- how?
Robert Firth
firth at sei.cmu.edu
Wed Dec 17 07:54:26 AEST 1986
In article <7403 at utzoo.UUCP> henry at utzoo.UUCP (Henry Spencer) writes:
>>Suppose a is declared as a structure and b is a function which
>>returns a structure. In the statement:
>> a = b () ;
>>when and how should the copying into a take place?
>
>It's an awkward problem, since struct values generally don't fit in the
>registers that are used to return ordinary values. The best solution is
>for the caller to allocate space for the returned value and communicate
>the address to the callee somehow, so the callee can copy the value there
>before returning. This does require that the caller know the returned
>type, and there is a lot of sloppiness about this in C, especially when
>the returned value is not being used. (Although said sloppiness may be
>less common for struct-valued functions.) There can also sometimes be
>difficulties in implementing it. There are various alternatives, some
>of which indeed are not signal-proof. Using a static area to return the
>value leads to trouble in the (quite uncommon) case of struct-returning
>functions being used in signal handlers, but has the virtue of being easy
>to retrofit into old compilers.
>--
> Henry Spencer @ U of Toronto Zoology
> {allegra,ihnp4,decvax,pyramid}!utzoo!henry
This problem occurs also in other languages, and it's a shame to see
people reinvent solutions over and over. Here, then, are all the
solutions I know of, with comments.
(a) Registers. If the structured object is small enough, use the
registers anyway. A two-word struct, for instance, can surely
be returned in <r0,r1> or your machine's equivalent.
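A minimal sketch of what (a) looks like from the C side. The struct and function names are made up for illustration. Note that C only says the struct is returned by value; whether a two-word result really travels in a register pair is up to the compiler and the platform's calling convention.

```c
#include <assert.h>

/* Hypothetical two-word struct; names are illustrative only. */
struct pair {
    long quot;
    long rem;
};

/* Returned by value. Whether the result comes back in a register pair
   or through a hidden pointer is decided by the ABI, not by C. */
struct pair divide(long num, long den)
{
    struct pair p;

    assert(den != 0);
    p.quot = num / den;
    p.rem  = num % den;
    return p;
}
```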
(b) Caller preallocates. The caller passes, as an extra parameter, a
pointer to a result area that it has allocated. Note that this requires
the caller to know the size of the result, which is always the
case in C but not the case in, e.g., Ada.
There is a subtle trap here if we have code like
    static structthing s;
    ...
    s = f();
The temptation is to pass &s as the pointer, which causes strange
results if f() has another access path to s. That's called 'aliasing'.
Caller should allocate a local temporary unless really sure it is safe
to pass a pointer to a declared variable. If the ultimate caller does
the right thing then f() can pass that pointer down to an inner call
of a similar function g(), as has been mentioned.
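The aliasing trap in (b) can be shown by writing the hidden result pointer out by hand. This is only a sketch: struct thing, the body of f, and the two assign_* helpers are invented for the demonstration, with f deliberately reading s while it stores its result.

```c
#include <assert.h>

struct thing { int x; int y; };

struct thing s = { 1, 2 };   /* the declared variable being assigned to */

/* Method (b) by hand: 'result' plays the role of the hidden pointer.
   f returns s with its fields swapped, so it reads s while storing. */
void f(struct thing *result)
{
    assert(result != 0);
    result->x = s.y;
    result->y = s.x;   /* if result == &s, s.x has already been overwritten */
}

/* Unsafe lowering of "s = f()": the result pointer aliases s. */
void assign_direct(void)
{
    f(&s);
}

/* Safe lowering: build the result in a local temporary, then copy. */
void assign_via_temp(void)
{
    struct thing tmp;

    f(&tmp);
    s = tmp;
}
```

Starting from s = {1, 2}, assign_via_temp() leaves {2, 1} as intended, while assign_direct() leaves {2, 2}.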
(c) Function allocates static space and returns pointer. Caller copies.
This is not reentrant, as has been pointed out. Nor does it work if
the function is recursive. Since reentrancy bugs are very hard to
find, as a compiler writer I would absolutely NEVER use this technique.
A variation is for the function to return a pointer to a declared
object holding the value. For example, if the function does a lookup
of an external array of struct objects, it can simply return a pointer
into the table.
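A small sketch of why (c) is fragile, again with the mechanism written out by hand. The names are hypothetical. Every call shares one static result area, so a second call made before the first result has been copied out silently replaces it; a signal handler or a recursive call can do the same thing at a much less convenient moment.

```c
#include <assert.h>

struct point { int x; int y; };

/* Method (c): the result lives in one static area shared by all calls. */
struct point *make_point(int x, int y)
{
    static struct point result;

    result.x = x;
    result.y = y;
    return &result;
}

/* Returns 1 if a second call clobbered the first result before the
   caller had copied it out. */
int second_call_clobbers_first(void)
{
    struct point *p = make_point(1, 2);
    struct point *q = make_point(3, 4);

    assert(p == q);                  /* both point at the same static area */
    return p->x == 3 && p->y == 4;
}
```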
(d) Function allocates heap space, rest as above.
The technique used by several Algol-68 implementations. It requires a
true heap (with garbage collection) in Algol-68. In C you can have the
function allocate and the caller always free after doing the copy. This
always works but can be rather expensive.
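The C variant of (d), sketched by hand with hypothetical names. The function allocates its result with malloc, and the caller always copies and then frees. The expense mentioned above is the allocate/free pair on every call.

```c
#include <assert.h>
#include <stdlib.h>

struct record { int id; double value; };

/* Method (d): the function allocates heap space for its result. */
struct record *make_record(int id, double value)
{
    struct record *r = malloc(sizeof *r);

    if (r != NULL) {
        r->id = id;
        r->value = value;
    }
    return r;
}

/* Caller side of "*dst = make_record(...)": copy, then always free.
   Returns 0 on success, -1 if the allocation failed. */
int assign_record(struct record *dst, int id, double value)
{
    struct record *tmp = make_record(id, value);

    assert(dst != NULL);
    if (tmp == NULL)
        return -1;
    *dst = *tmp;
    free(tmp);
    return 0;
}
```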
(e) Function leaves result on stack.
This is usually easiest for the function. The problem is that on many
systems an interrupt or signal will destroy the result. (flame)(This is a
symptom of a major and persistent system design error: the use of a
hardware register pointing into user space as a place to dump junk. It
is compounded by language implementations that use the hardware stack
to allocate the LIFO address space of local variables). There are
ways around this:
1. Protect the user stack. You can do this in 4.3BSD, for example, by
arranging for signals to use a separate stack.
2. Ignore the hardware stack and allocate local variables somewhere
else. This is done by, for instance, BCPL on the PDP-11 and VAX-11.
3. Have a magic routine "return_result". This takes as its parameter
the result (and, probably, its size). It is called as the last act
of the function. It returns to the caller's caller, doing the right
things to the stack, and always keeping the stack pointer below any
valid data.
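A sketch of workaround 1, assuming a POSIX system. It uses sigaltstack() and SA_ONSTACK as a present-day stand-in; the interface 4.3BSD offered at the time was different. All names other than the POSIX calls are hypothetical. The handler checks that one of its own locals lies inside the alternate stack, which shows that the signal frame was not pushed onto the user stack.

```c
#define _XOPEN_SOURCE 700   /* expose sigaltstack() under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>

static char *alt_base;                     /* start of the alternate stack */
static size_t alt_size;
static volatile sig_atomic_t ran_on_alt_stack;

static void handler(int signo)
{
    char marker;                           /* a local in the handler's frame */
    uintptr_t here = (uintptr_t)&marker;
    uintptr_t base = (uintptr_t)alt_base;

    (void)signo;
    ran_on_alt_stack = (here >= base && here < base + alt_size);
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it did not,
   -1 if setup failed. The stack is left installed and is not freed. */
int demo_signal_on_alternate_stack(void)
{
    stack_t ss;
    struct sigaction sa;

    alt_size = (size_t)SIGSTKSZ;
    assert(alt_size > 0);
    alt_base = malloc(alt_size);
    if (alt_base == NULL)
        return -1;

    ss.ss_sp = alt_base;
    ss.ss_size = alt_size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    sa.sa_handler = handler;
    sa.sa_flags = SA_ONSTACK;              /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    raise(SIGUSR1);                        /* handler runs before raise returns */
    return ran_on_alt_stack;
}
```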
If the size of the result is always known at compile time, I'd pick method
(b). Overall it's simplest and not too expensive.
Otherwise, I'd use e2, e1, or e3 (in diminishing order of preference).