C and national character sets
Martin Minow
minow at decvax.UUCP
Thu Aug 30 11:41:27 AEST 1984
Keld J|rn Simonsen brings up an important point concerning C
and its standardization. (By the way, the | is the oe ligature
character, needed in the Scandinavian languages as well as
German.) He notes that several characters used by C are
reserved by ISO standards for "national replacement characters" (NRCs).
The reserved characters are #$@[\]^`{|}~ -- most of which are
used in some way by C. There isn't any really good solution --
it is highly unlikely that the C standardization committee will
remove these characters from the language. While most of them
can be replaced by suitable #defines, several cannot, notably
backslash. The only short-term solution would be for the
parties affected to write NRC-specific pre-processors.
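As a rough sketch of the #define approach, a header along the following lines could be written once on an ASCII terminal and then included everywhere else. All of the macro names here are hypothetical and only illustrate the idea. The sketch also shows why backslash is the hard case: macros are not expanded inside string literals or character constants, so an escape such as '\n' cannot be respelled this way, and the # that introduces each directive is still needed.

```c
/* Hypothetical replacement spellings; none of these names come from
 * any standard or from the original discussion. */
#define BEGIN   {
#define END     }
#define LSUB    [
#define RSUB    ]
#define OR      ||
#define BITNOT  ~

/* Sum the n elements of v without typing a bracket or brace in the body. */
int sum(const int *v, int n)
BEGIN
    int i, total = 0;
    for (i = 0; i < n; i++)
        total += v LSUB i RSUB;
    return total;
END

/* Logical OR spelled as a word: yields 1 if either argument is nonzero. */
int either(int a, int b)
BEGIN
    return a OR b;
END

/* Clear the bits of mask in x, using BITNOT in place of the tilde. */
int clear_bits(int x, int mask)
BEGIN
    return x & BITNOT mask;
END
```

Code written this way compiles as ordinary C, since the pre-processor restores the real punctuation before the compiler proper sees it.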
In the long term, however, the problem will go away as people
move to an 8-bit character set such as DEC Multinational or
the pending ISO standard that is almost identical to it.
In this standard, the characters in the range 0-127 are identical
to the U.S. ASCII 7-bit standard. Characters in the range
128-159 are used for additional controls, and 160-255 for
additional graphics.
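The layout just described can be sketched as a small classifier. This is only an illustration of the three ranges, taking the 7-bit ASCII codes to be 0 through 127. The enum and function names are made up for the example.

```c
/* Hypothetical names for the three regions of the 8-bit code space. */
enum code_region {
    REGION_ASCII,          /* 0-127: same as 7-bit U.S. ASCII */
    REGION_EXTRA_CONTROL,  /* 128-159: additional control codes */
    REGION_EXTRA_GRAPHIC   /* 160-255: additional graphic characters */
};

/* Report which region of the 8-bit set a character code falls in. */
enum code_region classify_code(unsigned char c)
{
    if (c < 128)
        return REGION_ASCII;
    if (c < 160)
        return REGION_EXTRA_CONTROL;
    return REGION_EXTRA_GRAPHIC;
}
```

Under this layout every character C needs, braces and backslash included, stays in the lower half, and the national letters get codes of their own in the upper half.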
It is actually possible -- though rather messy -- to intermix
NRCs and Multinational, allowing Standard C to be written from
a terminal that normally displays a non-English NRC set.
Unfortunately, this will require a pre-processor that understands
the character-set switching escape sequences. This could
be done as a Unix filter, of course.
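A minimal sketch of the core of such a filter follows, under several assumptions that are not from the discussion above: the text starts out in a Swedish NRC, the sequence ESC ( B switches to ASCII, ESC ( H or ESC ( 7 switches back to the national set, and the small table below gives the 8-bit codes for the national letters. The table is partial and illustrative, and the function name nrc_filter is hypothetical. A real filter would need the full table for its own national set and would have to carry the current mode across buffer boundaries.

```c
#include <stddef.h>

#define ESC 0x1B

/* Assumed, partial Swedish NRC table: 7-bit code -> 8-bit code.
 * Codes without a national meaning pass through unchanged. */
static unsigned char nrc_to_8bit(unsigned char c)
{
    switch (c) {
    case '[':  return 0xC4;  /* capital A with diaeresis */
    case '\\': return 0xD6;  /* capital O with diaeresis */
    case ']':  return 0xC5;  /* capital A with ring */
    case '{':  return 0xE4;  /* small a with diaeresis */
    case '|':  return 0xF6;  /* small o with diaeresis */
    case '}':  return 0xE5;  /* small a with ring */
    default:   return c;
    }
}

/* Copy n bytes from src to dst, dropping the switching sequences and
 * translating national letters to their 8-bit codes. dst must have
 * room for n bytes. Returns the number of bytes written. */
size_t nrc_filter(const unsigned char *src, size_t n, unsigned char *dst)
{
    int national = 1;   /* assumption: input starts in the national set */
    size_t i = 0, out = 0;

    while (i < n) {
        if (src[i] == ESC && i + 2 < n && src[i + 1] == '(' &&
            (src[i + 2] == 'B' || src[i + 2] == 'H' || src[i + 2] == '7')) {
            national = (src[i + 2] != 'B');  /* 'B' selects ASCII */
            i += 3;
            continue;
        }
        dst[out++] = national ? nrc_to_8bit(src[i]) : src[i];
        i++;
    }
    return out;
}
```

Wrapped in a loop that reads standard input and writes standard output, this would sit in a pipeline ahead of the C compiler. Braces typed while the terminal is switched to ASCII come through as braces, and the same codes typed in national mode come through as letters.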
Hope this helps. Hej s} l{nge.
Martin Minow
decvax!minow