Comments on your program
Joseph S. D. Yao
jsdy at hadron.UUCP
Sat Nov 30 10:10:17 AEST 1985
In article <135 at brl-tgr.ARPA> cottrell at nbs-vms.arpa (COTTRELL, JAMES) writes:
> ... `Now why didn't you think before posting?'
>> ... This program was written
>> to help decode a bitnet routing table that I had been netcopy'd
>> to me and didn't get translated into ascii. So after running dd
>> over it, the line markers had disappeared into never never land.
>> But from looking real closely at the file I could see that each
>> line was supposed to start with ROUTE..... thus this program:
>
>It doesn't work. Suppose the sequence `ROUROUTE' occurs. The second
>`R' will not be recognized as the start of the sequence!
>
>I thought of ways to use existing tools to do the job. How about this:
>1) run thru `tr' to change all `R's to newlines. This gives you all
>possible places where a line might start. Now run an `ex' script that
>chex (wheat, corn, rice) each line begins with OUTE. If it doesn't,
>then put back the R. Then for each line that begins with an R, join
>it with the previous line. Finally, put back an R on each line.
Yes, Herron's algorithm won't work without some way of backing up.
No, Cottrell's algorithm won't work either. It assumes that ALL
NL's have been removed, which is a possible but not necessary
interpretation of the originally stated problem. In C, one way
to do things is:
	while ((c = my_getchar()) != EOF) {
		if (c != 'R') {
			putchar(c);
			last_put = c;
			continue;
		}
		/* gather 4 more chars (EOF may cut this short) */
		/* test for ROUTE */
		/* if so, print NL + 5 chars; last_put = 'E'; */
		/* else print the 'R'; last_put = 'R'; ungetchar the */
		/*	chars gathered (which is why my_getchar()) */
	}
	if (last_put != NL)	/* almost certainly so */
		putchar(NL);
This assumes that Herron is correct in his assumption that the
word "ROUTE" was one-to-one with line starts.
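For anyone who wants the sketch above filled in, here is one way it
might look. Everything named here is made up for illustration: the
reader struct, my_ungetchar(), and split_on_route() are hypothetical,
and the input is taken from a memory buffer rather than stdin to keep
the example short. Two assumptions go beyond the sketch: a NL is
inserted only when the previous character written was not already one
(so surviving NL's are not doubled), and the start of output counts as
a line start.

```c
#include <stddef.h>
#include <stdio.h>	/* EOF */
#include <string.h>

#define NL '\n'

/* Hypothetical in-memory reader with a 4-deep pushback stack,
 * standing in for my_getchar()/ungetchar in the sketch. */
struct reader {
	const char *p, *end;
	int pushed[4];
	int npushed;
};

static int my_getchar(struct reader *r)
{
	if (r->npushed > 0)
		return r->pushed[--r->npushed];
	return r->p < r->end ? (unsigned char)*r->p++ : EOF;
}

static void my_ungetchar(struct reader *r, int c)
{
	r->pushed[r->npushed++] = c;	/* never more than 4 deep here */
}

/* Copy in[0..n) to out, starting a new line before every "ROUTE".
 * out needs room for n + n/5 + 2 bytes and is NUL-terminated.
 * Returns the number of bytes written, not counting the NUL. */
size_t split_on_route(const char *in, size_t n, char *out)
{
	static const char rest[] = "OUTE";
	struct reader r = { in, in + n, {0}, 0 };
	size_t o = 0;
	int c, last_put = NL;	/* assumption: output starts at a line start */

	while ((c = my_getchar(&r)) != EOF) {
		int got[4], k = 0, i, match;

		if (c != 'R') {
			out[o++] = (char)c;
			last_put = c;
			continue;
		}
		/* Gather up to 4 more characters; EOF may cut this short. */
		while (k < 4 && (c = my_getchar(&r)) != EOF)
			got[k++] = c;
		match = (k == 4);
		for (i = 0; match && i < 4; i++)
			match = (got[i] == rest[i]);
		if (match) {
			if (last_put != NL)	/* assumption: no doubled NL */
				out[o++] = NL;
			memcpy(out + o, "ROUTE", 5);
			o += 5;
			last_put = 'E';
		} else {
			/* Not a marker: emit the 'R' and push the rest back,
			 * last character first, so it is re-read in order. */
			out[o++] = 'R';
			last_put = 'R';
			while (k > 0)
				my_ungetchar(&r, got[--k]);
		}
	}
	if (last_put != NL)
		out[o++] = NL;
	out[o] = '\0';
	return o;
}
```

Calling split_on_route("ROUROUTEx", 9, out) leaves "ROU", NL,
"ROUTEx", NL in out: the second `R' is seen because the four characters
gathered after the first one are pushed back and read again.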
Note also that Herron implies a conversion from E***** to ASCII.
If the original tape/file was blocked with fixed-length records,
then there is a dd arg to size lines (cbs=, I believe). If var-
length, he may have to read all lines in the original for the
record sizes and substitute for them the E@#$%^ NL character
before dd'ing.
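The fixed-length case can be pictured with a small sketch: cut the data
into records of cbs bytes, drop the trailing blanks from each, and end
each with a NL. The function name unblock() and its interface are
invented for this example, and it assumes the data has already been
converted to ASCII, so that the padding character is an ASCII space.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split in[0..n) into cbs-byte records, strip
 * trailing spaces from each, and terminate each with a newline.
 * out needs room for n + n/cbs + 2 bytes and is NUL-terminated.
 * Returns the number of bytes written, not counting the NUL. */
size_t unblock(const char *in, size_t n, size_t cbs, char *out)
{
	size_t o = 0, pos;

	if (cbs == 0) {			/* no record size, nothing to do */
		out[0] = '\0';
		return 0;
	}
	for (pos = 0; pos < n; pos += cbs) {
		/* The last record may be shorter than cbs. */
		size_t len = n - pos < cbs ? n - pos : cbs;

		while (len > 0 && in[pos + len - 1] == ' ')
			len--;
		memcpy(out + o, in + pos, len);
		o += len;
		out[o++] = '\n';
	}
	out[o] = '\0';
	return o;
}
```

With cbs = 4, the eight bytes "AB  CD  " come out as "AB", NL, "CD",
NL. The variable-length case is not covered: there the record sizes
have to come from the original file.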
--
Joe Yao hadron!jsdy at seismo.{CSS.GOV,ARPA,UUCP}