Trigraphs: a program
Richard A. O'Keefe
ok at quintus.UUCP
Sat May 28 07:13:28 AEST 1988
There has recently been some discussion of trigraphs in this newsgroup,
with distaste and apprehension being the predominant themes. I share
the distaste, but I decided to do something about the apprehension.
Here is a program which can be used to determine whether ANSI trigraph
processing will have an adverse effect on your code. It is a filter
which copies its standard input to its standard output, replacing
trigraphs by the corresponding ASCII characters (even in comments).
Now you can fix your programs _before_ the ANSI compiler arrives.
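
To make the hazard concrete, here is a minimal sketch (not part of 3g.c; the function names safe_message and show_message are invented for illustration) of a string literal that trigraph processing would otherwise alter. Written naively, the text ends in the three characters ??!, which an ANSI compiler reads as |. The sketch assumes a compiler that accepts the ANSI \? escape.

```c
#include <stdio.h>

/* Hypothetical example, not from the original posting.
   Escaping the second question mark as \? keeps two question marks
   from ever being adjacent in the source, so the literal denotes the
   same seven characters whether or not trigraphs are processed. */
const char *safe_message(void)
{
    return "What?\?!";
}

/* Print the message followed by a newline. */
void show_message(void)
{
    printf("%s\n", safe_message());
}
```

Running the filter over a file and finding a difference at such a literal is the cue to apply a rewrite of this kind.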
Be warned: it is _your_ responsibility to check this program before you
use it. I believe it to be correct, but I'm not getting any money and
I'm not taking any responsibility. Ying tong iddle i po!
-------------------------------- cut here --------------------------------
/*  File   : 3g.c
    Author : Richard A. O'Keefe @ Quintus Computer Systems, Inc.
    Updated: 27 May 1988
    Purpose: Trigraph elimination for C.

    The draft ANSI standard for C introduces so-called "trigraphs" so
    that certain characters in ASCII which are not in the ISO 646 base
    can be represented.  The trigraphs are
        ??=     #
        ??(     [
        ??/     \
        ??)     ]
        ??'     ^
        ??<     {
        ??!     |
        ??>     }
        ??-     ~
    Although there are other characters which could benefit from such
    treatment, C doesn't use them.  The ?? combination is left as is
    if it is not part of one of these sequences.

    Trigraphs are not a popular feature, and people are worried about
    whether their programs will work in ANSI C.  This program is meant
    to serve as a tool for finding out.

        3g <stdin >stdout

    replaces all the trigraph sequences in its standard input stream
    by the appropriate ASCII characters, and otherwise copies its
    standard input to its standard output.

    To find out whether a program of yours will be adversely affected by
    trigraphs, filter it through this program and compare the result with
    the original.  In UNIX:

        #!/bin/sh
        #Usage: 3gc foobaz.c
        3g <$1 | diff - $1

    Note that the ease with which a filter like this can be written makes
    the claim that such a facility is needed in the _language_ somewhat
    dubious.
*/
#include <stdio.h>
#define TGCHAR '?'
/*ARGSUSED*/
main(argc, argv)
int argc;
char **argv;
{
    register FILE *card = stdin;
    register FILE *line = stdout;
    register int c;
    register int state;

    /*  There are three states:
        0 : not in a trigraph sequence
        1 : first character of a possible trigraph sequence read
        2 : second character of a possible trigraph sequence read
    */
    for (state = 0; (c = getc(card)) != EOF; ) {
        if (c == TGCHAR) {
            if (state == 2) putc(c, line);
            else state++;
        } else
        switch (state) {
            case 1:
                state = 0;
                putc(TGCHAR, line);
                /* FALL THROUGH */
            case 0:
                putc(c, line);
                break;
            case 2:
                switch (c) {
                    case '=':  c = '#';  break;
                    case '(':  c = '[';  break;
                    case '/':  c = '\\'; break;
                    case ')':  c = ']';  break;
                    case '\'': c = '^';  break;
                    case '<':  c = '{';  break;
                    case '!':  c = '|';  break;
                    case '>':  c = '}';  break;
                    case '-':  c = '~';  break;
                    default:   putc(TGCHAR, line);
                               putc(TGCHAR, line); break;
                }
                putc(c, line);
                state = 0;
        }
    }
    switch (state) {
        case 2: putc(TGCHAR, line); /* FALL THROUGH */
        case 1: putc(TGCHAR, line); /* FALL THROUGH */
        case 0: break;
    }
    exit(0);
}
-------------------------------- cut here --------------------------------
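
For anyone who wants to check the replacement rules on small inputs without piping files through the filter, the same mapping can be sketched as a function on strings. This is an illustrative rewrite, not the posted program: the names trigraph_value and detrigraph are made up here, it uses ANSI-style prototypes, and it looks ahead in an in-memory string instead of running the three-state machine over a stream. It is intended to produce the same output as the filter for the same text.

```c
#include <stddef.h>

/* Map the third character of a candidate sequence "??c" to its
   replacement, or return 0 if "??c" is not a trigraph. */
static int trigraph_value(int c)
{
    switch (c) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Copy src to dst, replacing each trigraph by its single character.
   dst must have room for strlen(src) + 1 bytes; the result is never
   longer than the input.  Returns the number of trigraphs replaced. */
size_t detrigraph(const char *src, char *dst)
{
    size_t replaced = 0;

    while (*src != '\0') {
        int r = 0;

        /* src[2] is only examined when src[0] and src[1] are both '?',
           so the lookahead never runs past the terminating NUL. */
        if (src[0] == '?' && src[1] == '?')
            r = trigraph_value(src[2]);

        if (r != 0) {
            *dst++ = (char)r;
            src += 3;
            replaced++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return replaced;
}
```

A nonzero return value plays the same role as a non-empty diff from the 3gc script: the text contains at least one sequence that an ANSI compiler would read differently. Note that a run of three question marks followed by = yields one question mark and then #, just as the filter does.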