Inherent imprecision of floating point variables
Andrew Koenig
ark at alice.UUCP
Sun Jul 8 23:29:07 AEST 1990
In article <7906 at ncar.ucar.edu>, steve at groucho.ucar.edu (Steve Emmerson) writes:
> In <14429 at lambda.UUCP> jlg at lambda.UUCP (Jim Giles) writes:
>
> >...
> >The job of the conversion routines is to convert the decimal into the internal
> >representations and vice-versa. They _should_ do this job as accurately as
> >possible - if the number is exactly representable in both bases you have a
> >right to expect exact conversion.
> A right?
> Granted by whom?
Granted by the IEEE floating-point standard, for one thing.
If I am using a system whose vendor claims that it supports
IEEE floating point, then I can expect that:

  - input conversion of a floating-point number with an
    exact binary representation will be exact;

  - input conversion at compile time will give precisely
    the same result as input conversion at run time;

  - if I write out any number at all with enough significant
    digits, I can read it back and get exactly the same value;

and a whole bunch of other useful properties.
Interestingly, this does not require that conversion be as accurate
as possible, and one can reasonably argue that conversions should
generally *not* be as accurate as possible, because it's quite expensive
to get it exactly right compared with what it costs to get it
*almost* right. In particular, I do not know how to write an
exact input conversion routine without doing unbounded precision
arithmetic.
For that reason, the IEEE standard does *not* require conversion
to be as accurate as possible -- it allows an error of up to 0.47
times the least significant bit of the number being converted,
in addition to the inevitable error implied by rounding. The
number 0.47 is carefully chosen to guarantee all the properties
I mentioned above; the IEEE standard refers the reader to Jerome
Coonen's PhD thesis for further detail.
--
--Andrew Koenig
ark at europa.att.com