structure alignment question
Michael Tiemann
tiemann at mcc-pp.UUCP
Tue Sep 23 00:31:39 AEST 1986
In article <7479 at sun.uucp>, guy at sun.uucp (Guy Harris) writes:
> > The last 68000 compiler I used aligned strings on WORD boundaries.
> > This would cost one byte per string, half the time. But there was a
> > big speed payoff: I could do word operations in my strnlen, strncmp,
> > strncpy, and whatever other string processing functions I happened to
> > write.
>
> Oh, really?
>
> char string1[] = "foo";
> char string2[] = "xfoo";
>
> return(strcmp(string1, string2 + 1));
>
> If you can do this with straightforward word operations ...
> [ ... USW ... ]
> *!NOTHING*! guarantees that the arguments to the string routines always
> point to the *first* byte of a string. Furthermore, nothing guarantees that
> "strcat" and "strncat" will always do aligned copies, even if the arguments
> always point to the first byte of a string; the reason for this should be
> obvious.
> --
> Guy Harris
> {ihnp4, decvax, seismo, decwrl, ...}!sun!guy
> guy at sun.com (or guy at sun.arpa)
What I meant was... There are times when I *can* (can-to can-to can-to)
guarantee that I am referring to the first byte of a string. Sure,
when I am doing SUB-string operations I have to be prepared to
pay the price, but if I happen only to want to do STRING operations,
where I *always* refer *only* to the string as a whole, then it
sure would be nice to be rewarded for passing that extra knowledge
on to my compiler.
When I don't know what I'm dealing with, I use strcmp, strcat, etc.
(and I expect them to make pessimistic decisions).
I wrote my own C routines "fmove" and "fzero", which are like
"bcopy" and "bzero", except that I only call them when I know I'll
be using word-aligned things. Also, "fcmp" depends on the fact that a
word-aligned string is terminated by a null word. That is too much to
ask of a compiler (a possible overhead of 2*sizeof(int)-1 bytes per
string), but if they're two of MY strings, I call it.
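A minimal sketch of what a word-at-a-time comparison in the spirit of "fcmp" might look like. This is not the original routine: the names word_t, WORDS, wstr_init and fcmp_sketch are hypothetical, and it assumes every string lives in a word-aligned buffer that is zero-padded and followed by at least one all-zero word. It reports only equal or not equal, since ordering by whole words would depend on byte order.

```c
#include <string.h>

typedef unsigned int word_t;   /* assumed machine word */
#define WORDS 8                /* hypothetical fixed buffer size, in words */

/* Copy a C string into a word buffer and zero-fill the rest, so the string
   is always followed by at least one all-zero word.
   Returns 1 on success, 0 if the string does not fit. */
static int wstr_init(word_t buf[WORDS], const char *s)
{
    size_t n = strlen(s);

    if (n > (WORDS - 1) * sizeof(word_t))
        return 0;
    memset(buf, 0, WORDS * sizeof(word_t));
    memcpy(buf, s, n);
    return 1;
}

/* Compare two such buffers one word at a time.
   Returns 0 if the strings are equal, nonzero otherwise. */
static int fcmp_sketch(const word_t *a, const word_t *b)
{
    for (;;) {
        if (*a != *b)
            return 1;          /* some byte in this word differs */
        if (*a == 0)
            return 0;          /* both reached the null word: equal */
        a++;
        b++;
    }
}
```

The null word is what lets the loop test for the end of the string with a single word comparison instead of inspecting each byte.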
The point is, a word aligned string does not seem to be too much
to ask for, and the performance improvement IS noticeable:
Let's be hypothetical for a moment:
int lookup(s)
char *s;
{
    int where;

    /* s must be word aligned; constants assume big-endian ASCII (68000) */
    switch (*(int *)s) {
    case 0x61626300: /* "abc" */
        return ABC;
    case 0x78797a00: /* "xyz" */
        return XYZ;
    default:
        /* must look up in symbol table... */
        do { ... };
        return where;
    }
}
Keep your reserved words < 4 characters and off you go!
No flames about "portability": these were handled by a little
emacs function (generates hex and ascii comment for short strings),
and each emacs (VAX, SUN, whatever) has its own version.
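One way to avoid per-machine constants altogether is to build the key with shifts, both for the case labels and at run time. The sketch below is a different technique from the generated hex constants described above, and every name in it (KEY3, key_of, lookup_sketch, the TOK_ codes) is hypothetical. It reads the string a byte at a time, so it assumes neither alignment nor byte order, and it gives up some of the speed of the single aligned word load in exchange.

```c
#include <stdint.h>

/* Pack three characters plus a terminating zero into one 32-bit key,
   first character in the most significant byte. */
#define KEY3(a, b, c) \
    (((uint32_t)(unsigned char)(a) << 24) | \
     ((uint32_t)(unsigned char)(b) << 16) | \
     ((uint32_t)(unsigned char)(c) << 8))

enum { TOK_OTHER = 0, TOK_ABC, TOK_XYZ };   /* hypothetical token codes */

/* Build the same key from a string, byte by byte. Reading stops at the
   terminating null. A string of four or more characters leaves a nonzero
   low byte, so it can never equal a KEY3 constant. */
static uint32_t key_of(const char *s)
{
    uint32_t k = 0;
    int i;

    for (i = 0; i < 4; i++) {
        k <<= 8;
        if (*s != '\0')
            k |= (unsigned char)*s++;
    }
    return k;
}

static int lookup_sketch(const char *s)
{
    switch (key_of(s)) {
    case KEY3('a', 'b', 'c'):
        return TOK_ABC;
    case KEY3('x', 'y', 'z'):
        return TOK_XYZ;
    default:
        return TOK_OTHER;   /* a real version would search the symbol table */
    }
}
```

Because KEY3 is an integer constant expression, the compiler computes each case label, and the comment next to a constant can no longer drift out of step with its value.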
"A little knowledge is a dangerous thing"... programmers should
*never* learn about machine architectures: they will only want to
write fast code...
I hope that this time I have clarified my position: everybody can
now say I'm wrong (of course I am) or right.
Michael
tiemann at mcc.com
"I'm that C programmer your mother warned you about"