Uses of "float:16" ?
Doug Gwyn <gwyn>
gwyn at brl-tgr.ARPA
Mon Oct 14 11:45:31 AEST 1985
Earlier, I mentioned having read about a floating-point data
representation that obtains increased dynamic range at the cost
of precision by using a variable number of bits for the exponent.
I have been informed that Bob Morris holds a patent on this
scheme, which is called "tapered floating point" and was
described in some IEEE publication over 10 years ago.
Rummaging around in my notes, I discovered the article I had
in mind, entitled "FOCUS Microcomputer Number System" by
Albert D. Edgar & Samuel C. Lee, in the March 1979 issue of
CACM, pp. 166-177. It turns out that FOCUS represents
floating-point quantities as their base-2 logarithms using a
fixed number of bits for the fractional part of the logarithm.
For example, an 8-bit FOCUS datum is interpreted as follows:
sign,excess-8_fractional_exponent     meaning
1,1001.000                            -2^1       =  -2
1,0000.000                            -2^(-8)   ~=  -0.004
0,0111.000                            +2^(-1)    =   0.5
0,1000.000                            +2^0       =   1
0,1000.101                            +2^(5/8)  ~=   1.5
0,1111.111                            +2^(63/8) ~= 235
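To make the table concrete, here is a small C sketch of a decoder
for that 8-bit layout, assuming bit 7 holds the sign, bits 6-3 the
excess-8 integer part of the base-2 logarithm, and bits 2-0 the
fraction in eighths.  The exact FOCUS encoding (including how true
zero is handled) is defined in the article, not here:

    #include <math.h>
    #include <stdio.h>

    /* Decode an 8-bit FOCUS-style value, assuming the layout
     * sign | 4-bit excess-8 integer exponent | 3-bit fraction (eighths).
     * The real FOCUS encoding (e.g. its treatment of zero) may differ;
     * see the CACM article for the authoritative definition.
     */
    double focus8_to_double(unsigned char x)
    {
        int sign  = (x >> 7) & 1;
        int ipart = ((x >> 3) & 0xF) - 8;   /* excess-8 integer part */
        int fpart = x & 0x7;                /* fractional part, in eighths */
        double mag = pow(2.0, ipart + fpart / 8.0);

        return sign ? -mag : mag;
    }

    int main(void)
    {
        /* the six entries from the table above */
        unsigned char t[] = { 0xC8, 0x80, 0x38, 0x40, 0x45, 0x7F };
        int i;

        for (i = 0; i < 6; i++)
            printf("0x%02X -> %g\n", t[i], focus8_to_double(t[i]));
        return 0;
    }

Compiled with -lm, this reproduces (to within rounding) the values
in the "meaning" column above.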
This scheme has the advantage of not needing any bits to specify
field sizes; otherwise it has characteristics similar to the
scheme that trades exponent bits against mantissa bits: large
dynamic range combined with higher relative precision for numbers
near 1.  The article claims that FOCUS software implementations
run faster on average than fixed-point operations (presumably
because multiply and divide are cheap for FOCUS; see the sketch
below).  Note that there is a jump in representable values around
true 0, so some adaptation of algorithms may be needed to work
well with FOCUS.  For further information, read the article.
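Here is a rough illustration of why multiplication is cheap in a
logarithmic representation like FOCUS: a product is just an
integer addition of the stored log fields (excess-64 when viewed
as a single 7-bit field) plus an XOR of the sign bits.  This is my
own sketch under the same layout assumption as above, with crude
clamping in place of whatever overflow, underflow, and zero
handling the article actually prescribes:

    /* Multiply two values in the assumed 8-bit FOCUS-like format.
     * Because the stored 7-bit field is (8 * log2(magnitude)) + 64,
     * the product's field is field1 + field2 - 64, with signs XORed.
     * Overflow, underflow, and zero handling are crudely clamped here;
     * the article's real rules are surely more careful.
     */
    unsigned char focus8_mul(unsigned char a, unsigned char b)
    {
        int sign   = (a ^ b) & 0x80;                /* sign of the product */
        int logfld = (a & 0x7F) + (b & 0x7F) - 64;  /* add the log fields  */

        if (logfld < 0)   logfld = 0;               /* clamp on underflow  */
        if (logfld > 127) logfld = 127;             /* clamp on overflow   */
        return (unsigned char)(sign | logfld);
    }

For example, focus8_mul(0x48, 0x38) (that is, +2 times +0.5) yields
0x40, which decodes to +1; division is the same idea with a
subtraction in place of the addition.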
This obviously doesn't belong in net.lang.c, but that's where
the discussion started.