Report on WG15 Rapporteur Group
David Wheeler
wheeler at ida.org
Sat Mar 17 09:35:09 AEST 1990
From: wheeler at ida.org (David Wheeler)
domo at tsa.co.uk (Dominic Dunlop):
= From: Dominic Dunlop <domo at tsa.co.uk>
=
= Report on ISO/IEEE JTC1/SC22/WG15 Rapporteur Group on
= Internationalization Meeting of 5th - 7th
= March, 1990, Copenhagen, Denmark
=
= Dominic Dunlop -- domo at tsa.co.uk
=
= The Standard Answer Ltd.
=
I enjoyed your posting, thank you! You included a lot of "what this
phrase really means" that I appreciated.
=
= 3. ISO 646[4], the earliest ISO standard for information
= technology, is the international derivative of ASCII.
= Its Danish variant replaces ASCII's } with aa. Around
= the world, #$@[\]^`{|}~, all of which have a special
= meaning to the shell, are replaced by other characters
= in standards derived from ISO 646. See [5] for much
= more information.
=
Isn't there a standard 8-bit character set that defines the first 128
characters as a standard set (say USASCII; provincial, I'm afraid, but it
would break no Unix tools) and then assigns all the international
characters to values > 127? If the POSIX standard used it, wouldn't
that solve many problems for those using a Latin-based alphabet? Or
is this standard unused in the real world? Admittedly this leaves out
the non-Latin alphabet world, and that is a weakness.
=
= Apart from all this organizational stuff, we did review some
= existing documents. For example, DTR (draft technical
= report) 10176, a product of SC14, discusses the treatment of
= characters appearing in language constructs, variable names,
= literals and comments, and turns out to have implications
= for sh, awk, yacc and the other ``little languages'' defined
= in DP 9945-2, the forthcoming international standard for the
= shell and tools. And a document from SC22's study group on
= character sets suggests that source files should have some
= means of announcing the character set that they're using.
= Could this mean typed files or resource forks for POSIX6?
= Gee. How would we hide that?
=
Some C programs would have to be fixed to cope with signed characters,
but at least the rules would be simple: characters with values of 128
and above are ordinary characters and can be used in identifiers, etc.
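
To make the signed-character point concrete, here is a rough sketch in C
of the kind of fix I have in mind. It assumes an 8-bit character set whose
lower half is ASCII; the function names (count_alpha, byte_index,
is_ident_byte) are made up for illustration and come from no standard.
Where plain char is signed, a byte with a value of 128 or above turns
negative, so it has to be converted to unsigned char before it is used as
a table index or handed to the <ctype.h> functions.

```c
#include <ctype.h>
#include <stddef.h>

/* Count alphabetic bytes in a NUL-terminated string.
 * Each byte is converted to unsigned char before calling isalpha(),
 * because passing a negative value other than EOF is undefined. */
size_t count_alpha(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}

/* Return a byte's value as a table index in the range 0..255,
 * whether or not plain char is signed on this platform. */
unsigned int byte_index(char c)
{
    return (unsigned int)(unsigned char)c;
}

/* Assumed rule (illustration only): bytes with values of 128 and above
 * count as ordinary identifier characters, alongside ASCII letters,
 * digits, and underscore. */
int is_ident_byte(char c)
{
    unsigned char u = (unsigned char)c;
    if (u >= 128)
        return 1;
    return u == '_' || (u >= '0' && u <= '9') ||
           (u >= 'A' && u <= 'Z') || (u >= 'a' && u <= 'z');
}
```

With these, a byte such as 0xE5 indexes slot 229 of a 256-entry table
instead of a negative offset, and is_ident_byte() accepts it, while '}'
and the space character are still rejected.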
Source file tagging for language sounds like an abomination!
--- David A. Wheeler
wheeler at ida.org
Volume-Number: Volume 18, Number 80