Bug constant folding unsigned chars

Sat Jul 28 06:15:17 AEST 1990

In article <9007250601.AA07615 at csvax.cs.caltech.edu> daveg at CSVAX.CS.CALTECH.EDU (David Gillespie) writes:
>I tried the following program on gcc 1.37.1 under HP-UX 7.0.  I was
>checking that IsUnsigned would be folded to a constant at compile-time.
>It was, but in two cases the result was wrong!

Sorry, but the macro is K&R correct (sometimes) and ANSI wrong: the value
conversion rules were left open in K&R, but are defined in ANSI.

What you are seeing is ANSI compiler behavior.

>
>	#define IsUnsigned(x)   (((x)*0-2)/2+1)
>
>	int a;
>	unsigned int b;
>	short c;
>	unsigned short d;
>	char e;
>	unsigned char f;
>
>	main() {
>	  printf("%d, %d, %d, %d, %d, %d\n",
>		 IsUnsigned(a), IsUnsigned(b),
>		 IsUnsigned(c), IsUnsigned(d),
>		 IsUnsigned(e), IsUnsigned(f));
>	}
>
>The output using cc is
>
>	0, -2147483648, 0, -2147483648, 0, -2147483648

One of several correct non-ANSI answers.

>
>but the output using gcc is
>
>	0, -2147483648, 0, 0, 0, 0

The only correct ANSI answer (on 32-bit machines).  See below for explanation.

>
>(This IsUnsigned macro is due to Karl Heuer on comp.lang.c.)
>
>								-- Dave

What you're seeing here is the difference between ANSI and pre-ANSI
compilers.

In ANSI C, before an arithmetic operator is applied, each operand is
promoted to *int* if an *int* can represent every value of the operand's
type; otherwise it is converted to *unsigned int*.  This applies to all
integral types no larger than an *int*.

In your program, all of the signed and unsigned types, except *unsigned int*,
have values representable as *int*, so they are converted to *int* before
the operations are performed.  The constants are already *int*, so the whole
expression is evaluated in signed arithmetic.

That is why ONLY the *unsigned int* case works.

Longs and unsigned longs are subject to a different (but equivalent) 
conversion rule.

See section 3.2 of the ANSI standard (ANSI X3.159-1989) for more detail.

Here is an extension of the above program which demonstrates my assertions.
The only correct answers with an ANSI compiler on a 32-bit machine are:

0, -2147483648, 0, 0, 0, 0, 0, -2147483648
0, -2147483648, 0, 0, 0, 0, 0, -2147483648

For any compiler on any architecture, if the two lines are different from
each other, COMPLAIN LOUDLY to your vendor until they fix it!

---------------cut here------------------
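#include <stdio.h>

#define IsUnsigned(x)   (((x)*0-2)/2+1)

signed int a;
unsigned int b;
signed short c;
unsigned short d;
signed char e;
unsigned char f;
long g;
unsigned long h;

int main(void) {
  /* First line: the macro applied to each global.  Every result is
     printed with %d whatever its type, as in the original test; this
     relies on int and long both being 32 bits. */
  printf("%d, %d, %d, %d, %d, %d, %d, %d\n",
	 IsUnsigned(a), IsUnsigned(b),
	 IsUnsigned(c), IsUnsigned(d),
	 IsUnsigned(e), IsUnsigned(f),
	 IsUnsigned(g), IsUnsigned(h));

  {
    /* Second line: the same arithmetic, but on variables cast to each
       type instead of on literal constants. */
    int zero= 0;
    int one = 1;
    int two = 2;
#define F(type) (((type)zero-(type)two)/(type)two+(type)one)

    printf ("%d, %d, %d, %d, %d, %d, %d, %d\n",
	    F(int), F(unsigned int),
	    F(short), F(unsigned short),
	    F(char), F(unsigned char),
	    F(long), F(unsigned long));
  }
  return 0;
}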

---------------cut here------------------

Frankly, I am surprised that Karl Heuer blew the macro; he is the walking
lint, after all!

Most of this was hashed about in ~1986, I think, when the debate about
"value-preserving" vs. "sign-preserving" raged across the net.
-- 
Kirk Hays - I'm the NRA, NRA-ILA, CCRKBA, SAF, and Neal Knox is my lobbyist.