Trigraphs: a program

Richard A. O'Keefe ok at quintus.UUCP
Sat May 28 07:13:28 AEST 1988


There has recently been some discussion of trigraphs in this newsgroup,
with distaste and apprehension being the predominant themes.  I share
the distaste, but I decided to do something about the apprehension.
Here is a program which can be used to determine whether ANSI trigraph
processing will have an adverse effect on your code.  It is a filter
which copies its standard input to its standard output, replacing
trigraphs by the corresponding ASCII characters (even in comments).
Now you can fix your programs _before_ the ANSI compiler arrives.

Be warned: it is _your_ responsibility to check this program before you
use it.  I believe it to be correct, but I'm not getting any money and
I'm not taking any responsibility.  Ying tong iddle i po!

-------------------------------- cut here --------------------------------
/*  File   : 3g.c
    Author : Richard A. O'Keefe @ Quintus Computer Systems, Inc.
    Updated: 27 May 1988
    Purpose: Trigraph elimination for C.

    The draft ANSI standard for C introduces so-called "trigraphs" so
    that certain characters in ASCII which are not in the ISO 646 base
    can be represented.  The trigraphs are
	??=	#
	??(	[
	??/	\
	??)	]
	??'	^
	??<	{
	??!	|
	??>	}
	??-	~
    Although there are other characters which could benefit from such
    treatment, C doesn't use them.  The ?? combination is left as is
    if it is not part of one of these sequences.

    Trigraphs are not a popular feature, and people are worried about
    whether their programs will work in ANSI C.  This program is meant
    to serve as a tool for finding out.

    3g <stdin >stdout
	replaces all the trigraph sequences in its standard input stream
	by the appropriate ASCII characters, and otherwise copies its
	standard input to its standard output.

    To find out whether a program of yours will be adversely affected by
    trigraphs, filter it through this program and compare the result with
    the original.  In UNIX:
	#!/bin/sh
	#Usage: 3gc foobaz.c
	3g <"$1" | diff - "$1"

    Note that the ease with which a filter like this can be written makes
    the claim that such a facility is needed in the _language_ somewhat
    dubious.
*/

#include <stdio.h>
#include <stdlib.h>		/* declares exit() */
#define TGCHAR '?'
int main(void)
    {
	register FILE *card = stdin;
	register FILE *line = stdout;
	register int c;
	register int state;

	/*  There are three states:
	    0 : not in a trigraph sequence
	    1 : first character of a possible trigraph sequence read
	    2 : second character of a possible trigraph sequence read
	*/
	for (state = 0; (c = getc(card)) != EOF; ) {
	    if (c == TGCHAR) {
		if (state == 2) putc(c, line);
		else state++;
	    } else
	    switch (state) {
		case 1:
		    state = 0;
		    putc(TGCHAR, line);
		    /* FALL THROUGH */
		case 0:
		    putc(c, line);
		    break;
		case 2:
		    switch (c) {
			case '=':   c = '#';   break;
			case '(':   c = '[';   break;
			case '/':   c = '\\';  break;
			case ')':   c = ']';   break;
			case '\'':  c = '^';   break;
			case '<':   c = '{';   break;
			case '!':   c = '|';   break;
			case '>':   c = '}';   break;
			case '-':   c = '~';   break;
			default:    putc(TGCHAR, line);
				    putc(TGCHAR, line); break;
		    }
		    putc(c, line);
		    state = 0;
	    }
	}
	switch (state) {
	    case 2: putc(TGCHAR, line); /* FALL THROUGH */
	    case 1: putc(TGCHAR, line); /* FALL THROUGH */
	    case 0: break;
	}
	exit(0);
    }

-------------------------------- cut here --------------------------------


