Strings library for C (shar format)
ok at edai.UUCP
Sat May 12 04:24:22 AEST 1984
................
Here is a collection of useful string-handling functions in/for C.
Many of them are based on UNIX routines with the same names, but
the code was independently derived. Each of the routines has been
tested, and seems to work. But free software is worth what you
pay for it, and I dare say this is no exception. I make no claim
that any of this is good for anything at all. Use it at your peril.
Manual pages? You must be joking. 2853 lines follow the dots.
................
# to unbundle, csh this file
echo Makefile
cat >Makefile <<'EOF'
# File : strings.d/Makefile
# Author : Richard A. O'Keefe.
# Updated: 4 May 1984.
# Purpose: UNIX make(1)file for the strings library.
# If you are not using a Vax, or if your strings might be 2^16
# characters long or longer, use
# CFLAGS=-O
# On the Vax we can use the string instructions some but not all the time.
CFLAGS=-O -DVaxAsm
# The SIII functions are the ones described in the System III
# string(3) manual page, and also in ctype(3), atoi(3).
SIII=strcat.o strncat.o strcmp.o strncmp.o strcpy.o strncpy.o strlen.o\
strchr.o strrchr.o strpbrk.o strspn.o strcspn.o strtok.o\
_c2type.o str2int.o getopt.o
# The BSD2 functions are the ones described in the 4.2bsd
# bstring(3) manual page, plus a couple of my additions.
# All except ffs have VAX-specific machine code versions.
BSD2=bcmp.o bcopy.o bfill.o bmove.o bzero.o ffs.o
# The "mine" functions are the ones which are entirely my own
# invention, though they are supposed to fit into the SIII conventions.
mine=strmov.o strnmov.o strrpt.o strnrpt.o strcase.o strncase.o strend.o\
strnlen.o strcpbrk.o int2str.o _str2map.o _str2pat.o _str2set.o\
strpack.o strcpack.o strtrans.o strntrans.o strpref.o strsuff.o\
strtrim.o strctrim.o strfield.o strkey.o
# The "find" functions are my code, but they are based on published
# work by Boyer, Moore, and Horspool. (See _str2pat.c.)
find=strfind.o strrepl.o
strings.a: ${SIII} ${BSD2} ${mine} ${find}
	rm -f strings.a; ar rc strings.a *.o; ranlib strings.a
scan=strpbrk.o strcpbrk.o strspn.o strcspn.o strpack.o strcpack.o \
strtrim.o strctrim.o strtok.o
${scan} _str2set.o: _str2set.h
tran=strtrans.o strntrans.o
${tran} _str2map.o: _str2map.h
${find}: _str2pat.h
str2int.o: ctypes.h
${SIII} ${BSD2} ${mine} ${find}: strings.h
clean:
	-rm *.o
'EOF'
echo READ-ME
cat >READ-ME <<'EOF'
File : READ-ME
Author : Richard A. O'Keefe.
Updated: 30 April 1984
Purpose: Explain the new strings package.
The UNIX string libraries (described in the string(3) manual page)
differ from UNIX to UNIX (e.g. strtok is not in V7 or 4.1bsd). Worse,
the sources are not in the public domain, so that if there is a string
routine which is nearly what you want but not quite you can't take a
copy and modify it. And of course C programmers on non-UNIX systems
are at the mercy of their supplier.
This package was designed to let me do reasonable things with C's
strings whatever UNIX (V7, PaNiX, UX63, 4.1bsd) I happen to be using.
Everything in the System III manual is here and does just what the S3
manual says it does. There are also lots of new goodies. I'm sorry
about the names, but the routines do have to work on asphyxiated-at-
birth systems which truncate identifiers. The convention is that a
routine is called
str [n] [c] <operation>
If there is an "n", it means that the function takes an (int) "length"
argument, which bounds the number of characters to be moved or looked
at. If the function has a "set" argument, a "c" in the name indicates
that the complement of the set is used. Functions or variables whose
names start with _ are support routines which aren't really meant for
general use. I don't know what the "p" is doing in "strpbrk", but it
is there in the S3 manual so it's here too. "istrtok" does not follow
this rule, but with 7 letters what can you do?
I have included new versions of atoi(3) and atol(3) as well. They
use a new primitive str2int, which takes a pair of bounds and a radix,
and does much more thorough checking than the normal atoi and atol do.
The result returned by atoi & atol is valid if and only if errno == 0.
There is also an output conversion routine int2str, with itoa and ltoa
as interface macros. Only after writing int2str did I notice that the
str2int routine has no provision for unsigned numbers. On reflection,
I don't greatly care. I'm afraid that int2str may depend on your "C"
compiler in unexpected ways. Do check the code with -S.
Several of these routines have "asm" inclusions conditional on the
VaxAsm option. These insertions can make the routines which have them
quite a bit faster, but there is a snag. The VAX architects, for some
reason best known to themselves and their therapists, decided that all
"strings" were shorter than 2^16 bytes. Even when the length operands
are in 32-bit registers, only 16 bits count. So the "asm" versions do
not work for long strings. If you can guarantee that all your strings
will be short, define VaxAsm in the makefile, but in general, and when
using other machines, do not define it.
To use this library, you need the "strings.a" library file and the
"strings.h" and "ctypes.h" header files. The other header files are
for compiling the library itself, though if you are hacking extensions
you may find them useful. General users really shouldn't see them.
I've defined a few macros I find useful in "strings.h"; if you have no
need for "index", "rindex", "streql", and "beql", just edit them out.
On the 4.1bsd system I am using, declaring all these functions 'extern'
does not mean that they will all be loaded; only the ones you use are.
When using lesser systems you may find it necessary to break strings.h
up, or you could get by with just adding "extern" declarations for the
functions you want as you need them. Many of these functions have the
same names as functions in the "standard C library", by design as this
is a replacement/reimplementation of part of that library. So you may
have to talk the loader into loading this library first. Again, I've
found no problems on 4.1bsd.
You may wonder at my failure to provide manual pages for this code.
For the things in V7, 4.?, or SIII, you should be able to use whichever
manual page came with that system, and anything I might write would be
so like it as to raise suspicions of violating AT&T copyrights. In the
sources you will find comments which provide far more documentation for
these routines than AT&T ever provided for their strings stuff; I just
don't happen to have put it in nroff -man form. Had I done so, the .3
files would have outbulked the .c files!
These files are in the public domain. This includes getopt.c, which
is the work of Henry Spencer, University of Toronto Zoology, who says of
it "None of this software is derived from Bell software. I had no access
to the source for Bell's versions at the time I wrote it. This software
is hereby explicitly placed in the public domain. It may be used for
any purpose on any machine by anyone." I would greatly prefer it if *my*
material received no military use.
'EOF'
echo _c2type.c
cat >_c2type.c <<'EOF'
/* File : _c2type.c
Author : Richard A. O'Keefe.
Updated: 4 May 1984
Purpose: Map character codes to types
The mapping used here is such that we can use it for converting
numbers expressed in a variety of radices to binary as well as for
classifying characters.
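As an illustration of the radix use (a sketch only: digit_value() and
parse_unsigned() are hypothetical names, not routines in this package,
and the arithmetic lookup merely stands in for the table), a character
counts as a digit in radix r exactly when its code value is less than r:

```c
#include <assert.h>

// Hypothetical stand-in for the table lookup: 0..35 for digits and
// letters of either case, 36 for anything else (ASCII assumed).
static int digit_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return 36;
}

// Convert the leading digits of s in the given radix (2..36), stopping
// at the first character that is not a digit in that radix.
// No sign handling and no overflow check: this is only a sketch.
static long parse_unsigned(const char *s, int radix)
{
    long n = 0;
    int d;
    assert(radix >= 2 && radix <= 36);
    while ((d = digit_value((unsigned char)*s++)) < radix)
        n = n * radix + d;
    return n;
}
```

Because every non-digit code is 36 or more, the single comparison
against the radix both classifies the character and ends the loop.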
*/
char _c2type[129] =
{ 37, /* EOF == -1 */
37, 37, 37, 37, 37, 37, 37, 37, 37, 38, 39, 39, 39, 39, 37, 37,
37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37,
38, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36,
00, 01, 02, 03, 04, 05, 06, 07, 8, 9, 36, 36, 36, 36, 36, 36,
36, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 36, 36, 36, 36,
36, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 36, 36, 36, 37
};
'EOF'
echo _str2map.c
cat >_str2map.c <<'EOF'
/* File : _str2map.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: _map_vec[], _str2map().
_str2map(option, from, to) constructs a translation table. If from
or to is NullS, the same string is used as last time, so if you want
to translate a whole lot of strings using the same mapping you don't
have to reconstruct it each time. The options are
0: initialise the map to the identity function,
then map each from[i] to the corresponding to[i].
If to[] is shorter than from[], its last character is
repeated as often as needed.
1: as 0, but don't initialise the map.
2: initialise the map to send every character to to[0],
then map each from[i] to itself.
For example, to build a map which forces letters to lower case but
sends everything else to blank, call
_str2map(2, "abcdefghijklmnopqrstuvwxyz", " ");
_str2map(1, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz");
Only strtrans() and strntrans() in this package call _str2map; if
you want to build your own maps this way you can "fool" them into
using it, as when the two strings are NullS they don't change the
map. As an extra-special dubious *hack*, _map_vec has an extra NUL
character at the end, so after calling _str2map(0, "", ""), you can
use _map_vec+1 as a string of the 127 non-NUL characters (or if the
_AlphabetSize is 256, of the 255 non-NUL characters).
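A minimal sketch of the option 0 behaviour described above, assuming
8-bit characters. The names build_map(), apply_map() and map_vec are
hypothetical and exist only for this illustration; unlike the real
routine, the sketch has no "same string as last time" convention.

```c
#include <assert.h>
#include <stddef.h>

#define ALPHABET 256   // assumption: 8-bit characters

static unsigned char map_vec[ALPHABET];

// Identity map first, then from[i] -> to[i]; when "to" is the shorter
// string its last character is repeated as often as needed.
static void build_map(const char *from, const char *to)
{
    int i;
    assert(from != NULL && to != NULL);
    for (i = 0; i < ALPHABET; i++) map_vec[i] = (unsigned char)i;
    if (*to == '\0') return;          // nothing to map to
    for (; *from != '\0'; from++) {
        map_vec[(unsigned char)*from] = (unsigned char)*to;
        if (to[1] != '\0') to++;      // otherwise stay on the last character
    }
}

// Translate a string in place through the current map.
static void apply_map(char *s)
{
    for (; *s != '\0'; s++) *s = (char)map_vec[(unsigned char)*s];
}
```

For instance, build_map("lo", "0") followed by apply_map() on
"hello world" would turn every l and o into 0.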
*/
#include "strings.h"
#include "_str2map.h"
static _char_ *oldFrom = "?";
static char *oldTo = "?";
char _map_vec[_AlphabetSize+1];
void _str2map(option, from, to)
int option;
register _char_ *from;
register char *to;
{
register int i, c;
if (from == NullS && to == NullS) return;
if (from == NullS) from = oldFrom; else oldFrom = from;
if (to == NullS) to = oldTo; else oldTo = to;
switch (option) {
case 0:
for (i = _AlphabetSize; --i >= 0; _map_vec[i] = i) ;
/* FALLTHROUGH: option 0 continues with the mapping loop of option 1 */
case 1:
while (i = *from++) {
_map_vec[i] = *to++;
if (!*to) {
c = *--to;
while (i = *from++) _map_vec[i] = c;
return;
}
}
return;
case 2:
c = *to;
for (i = _AlphabetSize; --i >= 0; _map_vec[i] = c) ;
while (c = *from++) _map_vec[c] = c;
return;
}
}
'EOF'
echo _str2map.h
cat >_str2map.h <<'EOF'
/* File : _str2map.h
Author : Richard A. O'Keefe.
Updated: 11 April 1984
Purpose: Definitions from _str2map.c
*/
extern char _map_vec[_AlphabetSize+1];
extern void _str2map(/*int,_char_^,char^*/);
'EOF'
echo _str2pat.c
cat >_str2pat.c <<'EOF'
/* File : _str2pat.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: _pat_lim, _pat_vec[], _str2pat()
Searching in this package is done by an algorithm due to R. Nigel
Horspool, described in Software Practice & Experience 1980, p505.
Elsewhere I have a version of it which does exact case or either
case match, word mode or literal mode, forwards or backwards, and
will look for the Nth instance. For most applications that is too
much and a simple exact case forward search will do. Horspool's
algorithm is a simplification of the Boyer-Moore algorithm which
doesn't guarantee linear time, but in practice is very good indeed.
_str2pat(pat) builds a search table for the string pat. As usual in
this package, if pat == NullS, the table is not changed and the last
search string is re-used. To support this, _str2pat returns the
actual search string.
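For readers who have not met the algorithm, here is a self-contained
sketch of a Horspool-style search. It is not the strfind() of this
package: horspool_find() is a hypothetical name, the shift table is
local rather than shared, and the standard strlen() and memcmp() are
assumed to be available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Returns a pointer to the first occurrence of pat in text, or NULL.
// An empty pattern matches at the start of text.
static const char *horspool_find(const char *text, const char *pat)
{
    size_t shift[256];
    size_t n, m, i, pos;

    assert(text != NULL && pat != NULL);
    n = strlen(text);
    m = strlen(pat);
    if (m == 0) return text;

    // Default shift is the whole pattern length; a character occurring
    // in pat[0..m-2] shifts by its distance from the last position.
    for (i = 0; i < 256; i++) shift[i] = m;
    for (i = 0; i + 1 < m; i++) shift[(unsigned char)pat[i]] = m - 1 - i;

    // The shift is always chosen by the text character aligned with
    // the last pattern position, whether or not the window matched.
    for (pos = 0; pos + m <= n; pos += shift[(unsigned char)text[pos + m - 1]]) {
        if (memcmp(text + pos, pat, m) == 0) return text + pos;
    }
    return NULL;
}
```

The table built by the second loop plays the same part as _pat_vec:
one entry per character, giving how far the pattern may slide.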
*/
#include "strings.h"
#include "_str2pat.h"
int _pat_lim;
int _pat_vec[_AlphabetSize];
static _char_ *oldPat = "";
_char_ *_str2pat(pat)
register _char_ *pat;
{
register int L, i;
if (pat == NullS) pat = oldPat; else oldPat = pat;
for (L = 0; *pat++; L++) ;
for (i = _AlphabetSize; --i >= 0; _pat_vec[i] = L) ;
_pat_lim = --L;
pat = oldPat;
for (i = L; i > 0; i--) _pat_vec[*pat++] = i;
return oldPat;
}
'EOF'
echo _str2pat.h
cat >_str2pat.h <<'EOF'
/* File : _str2pat.h
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Purpose: Definitions from _str2pat.c
*/
extern int _pat_lim;
extern int _pat_vec[];
extern _char_ *_str2pat(/*_char_^*/);
'EOF'
echo _str2set.c
cat >_str2set.c <<'EOF'
/* File : _str2set.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: _set_ctr, _set_vec[], _str2set().
Purpose: Convert a character string to a set.
*/
/* The obvious way of representing a set of characters is as a
vector of 0s and 1s. The snag with that is that to convert a
string to such a vector, we have to clear all the elements to
0, and then set the elements corresponding to characters in
the string to 1, so the cost is O(|alphabet|+|string|). This
package uses another method, where there is a vector of small
numbers and a counter. A character is in the current set if
and only if the corresponding element of the vector is equal
to the current value of the counter. Every so often the
vector elements would overflow and we have to clear the
vector, but the cost is reduced to O(|string|+1).
Note that NUL ('\0') will never be in any set built by _str2set.
While this method reduces the cost of building a set, it would
be useful to avoid it entirely. So when the "set" argument is
NullS the set is not changed. Use NullS to mean "the same set
as before." MaxPosChar is the largest integer value which can
be stored in a "char". Although we might get a slightly wider
range by using "unsigned char", "char" may be cheaper (as on a
PDP-11). By all means change the number from 127 if your C is
one of those that treats char as unsigned, but don't change it
just because _AlphabetSize is 256; the two are unrelated. And
don't dare change it on a VAX: it is built into the asm code!
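The counter method can be sketched as follows. This is an illustration
under assumptions, not the code below: make_set(), in_set(), stamp_vec
and stamp_ctr are hypothetical names, 8-bit characters are assumed, and
the standard memset() does the occasional clearing.

```c
#include <assert.h>
#include <string.h>

#define ALPHABET 256    // assumption: 8-bit characters
#define MAX_STAMP 127   // largest counter value stored in the vector

static unsigned char stamp_vec[ALPHABET];
static int stamp_ctr = MAX_STAMP;    // forces a clear on first use

// Make "set" the current set.  The cost is O(strlen(set)) except on the
// occasional call where the counter wraps and the vector is cleared.
static void make_set(const char *set)
{
    assert(set != NULL);
    if (++stamp_ctr > MAX_STAMP) {
        memset(stamp_vec, 0, sizeof stamp_vec);
        stamp_ctr = 1;
    }
    for (; *set != '\0'; set++)
        stamp_vec[(unsigned char)*set] = (unsigned char)stamp_ctr;
}

// Nonzero if c is in the current set.  NUL is never a member.
static int in_set(int c)
{
    return c != '\0' && stamp_vec[(unsigned char)c] == stamp_ctr;
}
```

Stale entries from earlier sets are harmless: they hold an old counter
value, which cannot equal the current one until after the next clear.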
*/
#include "strings.h"
#include "_str2set.h"
#define MaxPosChar 127
int _set_ctr = MaxPosChar;
char _set_vec[_AlphabetSize];
void _str2set(set)
register char *set;
{
if (set == NullS) return;
if (++_set_ctr == MaxPosChar+1) {
#if VaxAsm
asm("movc5 $0,4(ap),$0,$128,__set_vec");
#else ~VaxAsm
register char *w = &_set_vec[_AlphabetSize];
do *--w = NUL; while (w != &_set_vec[0]);
#endif VaxAsm
_set_ctr = 1;
}
while (*set) _set_vec[*set++] = _set_ctr;
}
'EOF'
echo _str2set.h
cat >_str2set.h <<'EOF'
/* File : _str2set.h
Updated: 10 April 1984
Purpose: External declarations for strpbrk, strspn, strcspn &c
Copyright (C) 1984 Richard A. O'Keefe.
*/
extern int _set_ctr;
extern char _set_vec[];
extern void _str2set(/*char^*/);
'EOF'
echo ascii.h
cat >ascii.h <<'EOF'
/* File : strings.d/ascii.h
Author : Richard A. O'Keefe
Updated: 28 April 1984
Purpose: Define Ascii mnemonics.
This file defines the ASCII control characters. Note that these
names refer to their use in communication; it is an ERROR to use
these names to talk about keyboard commands. For example, DO NOT
use EOT when you mean "end of file", as many people prefer ^Z (if
the Ascii code were taken seriously, EOT would log you off and
hang up the line as well). Similarly, DO NOT use DEL when you
mean "interrupt", as many people prefer ^C. When writing a screen
editor, you should speak of tocntrl('C') rather than ETX (see the
header file "ctypes.h").
*/
#define NUL '\000' /* null character */
#define SOH '\001' /* Start Of Heading, start of message */
#define STX '\002' /* Start Of Text, end of address */
#define ETX '\003' /* End of TeXt, end of message */
#define EOT '\004' /* End Of Transmission */
#define ENQ '\005' /* ENQuiry "who are you" */
#define ACK '\006' /* (positive) ACKnowledge */
#define BEL '\007' /* ring the BELl */
#define BS '\010' /* BackSpace */
#define HT '\011' /* Horizontal Tab */
#define TAB '\011' /* an unofficial name for HT */
#define LF '\012' /* Line Feed (does not imply cr) */
#define NL '\012' /* unix unofficial name for LF: new line */
#define VT '\013' /* Vertical Tab */
#define FF '\014' /* Form Feed (new page starts AFTER this) */
#define CR '\015' /* Carriage Return */
#define SO '\016' /* Shift Out; select alternate character set */
#define SI '\017' /* Shift In; select ASCII again */
#define DLE '\020' /* Data Link Escape */
#define DC1 '\021' /* Device Control 1 */
#define XON '\021' /* transmitter on, resume output */
#define DC2 '\022' /* Device Control 2 (auxiliary on) */
#define DC3 '\023' /* Device Control 3 */
#define XOFF '\023' /* transmitter off, suspend output */
#define DC4 '\024' /* Device Control 4 (auxiliary off) */
#define NAK '\025' /* Negative AcKnowledge (signal error) */
#define SYN '\026' /* SYNchronous idle */
#define ETB '\027' /* End of Transmission Block, logical end of medium */
#define CAN '\030' /* CANcel */
#define EM '\031' /* End of Medium */
#define SUB '\032' /* SUBstitute */
#define ESC '\033' /* ESCape */
#define FS '\034' /* File Separator */
#define GS '\035' /* Group Separator */
#define RS '\036' /* Record Separator */
#define US '\037' /* Unit Separator */
#define SP '\040' /* SPace */
#define DEL '\177' /* DELete, rubout */
'EOF'
echo bcmp.c
cat >bcmp.c <<'EOF'
/* File : bcmp.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: bcmp()
bcmp(s1, s2, len) returns 0 if the "len" bytes starting at "s1" are
identical to the "len" bytes starting at "s2", non-zero if they are
different. The 4.2bsd manual page doesn't say what non-zero value
is returned, though the BUGS note says that it takes its parameters
backwards from strcmp. This suggests that it is something like
for (; --len >= 0; s1++, s2++)
if (*s1 != *s2) return *s2-*s1;
return 0;
There, I've told you how to do it. As the manual page doesn't come
out and *say* that this is the result, I tried to figure out what a
useful result might be. (I'd forgotten that strncmp stops when it
hits a NUL, which the above does not do.) What I came up with was:
the result is the number of bytes in the differing tails. That is,
after you've skipped the equal parts, how many characters are left?
To put it another way, N-bcmp(s1,s2,N) is the number of equal bytes
(the size of the common prefix). After deciding on this definition
I discovered that the CMPC3 instruction does exactly what I wanted.
The code assumes that N is non-negative.
Note: the "b" routines are there to exploit certain VAX order codes,
but the CMPC3 instruction will only test 65535 characters. The asm
code is presented for your interest and amusement.
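The relation between the result and the common prefix can be sketched
like this. tail_cmp() repeats the portable loop used below under a
different (hypothetical) name so that it can stand alone, and
common_prefix() is likewise only an illustrative helper.

```c
#include <assert.h>

// Number of bytes in the differing tails: 0 when the first len bytes of
// s1 and s2 are identical, otherwise len minus the common prefix length.
static int tail_cmp(const char *s1, const char *s2, int len)
{
    assert(len >= 0);
    while (--len >= 0 && *s1++ == *s2++) ;
    return len + 1;
}

// Length of the common prefix of two len-byte blocks.
static int common_prefix(const char *s1, const char *s2, int len)
{
    return len - tail_cmp(s1, s2, len);
}
```

With "abcdef" and "abcxyz" and a length of 6, the differing tails are
"def" and "xyz", so the result is 3 and the common prefix is 3 bytes.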
*/
#include "strings.h"
#if VaxAsm
int bcmp(s1, s2, len)
char *s1, *s2;
int len;
{
asm("cmpc3 12(ap),*4(ap),*8(ap)");
}
#else ~VaxAsm
int bcmp(s1, s2, len)
register char *s1, *s2;
register int len;
{
while (--len >= 0 && *s1++ == *s2++) ;
return len+1;
}
#endif VaxAsm
'EOF'
echo bcopy.c
cat >bcopy.c <<'EOF'
/* File : bcopy.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: bcopy()
bcopy(src, dst, len) moves exactly "len" bytes from the source "src"
to the destination "dst". It does not check for NUL characters as
strncpy() and strnmov() do. Thus if your C compiler doesn't support
structure assignment, you can simulate it with
bcopy(&from, &to, sizeof from);
BEWARE: the first two arguments are the other way around from almost
everything else. I'm sorry about that, but that's the way it is in
the 4.2bsd manual, though they list it as a bug. For a version with
the arguments the right way around, use bmove().
No value is returned.
Note: the "b" routines are there to exploit certain VAX order codes,
but the MOVC3 instruction will only move 65535 characters. The asm
code is presented for your interest and amusement.
*/
#include "strings.h"
#if VaxAsm
void bcopy(src, dst, len)
char *src, *dst;
int len;
{
asm("movc3 12(ap),*4(ap),*8(ap)");
}
#else ~VaxAsm
void bcopy(src, dst, len)
register char *src, *dst;
register int len;
{
while (--len >= 0) *dst++ = *src++;
}
#endif VaxAsm
'EOF'
echo bfill.c
cat >bfill.c <<'EOF'
/* File : bfill.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: bfill()
bfill(dst, len, fill) moves "len" fill characters to "dst".
Thus to set a buffer to 80 spaces, do bfill(buff, 80, ' ').
Note: the "b" routines are there to exploit certain VAX order codes,
but the MOVC5 instruction will only move 65535 characters. The asm
code is presented for your interest and amusement.
*/
#include "strings.h"
#if VaxAsm
void bfill(dst, len, fill)
register char *dst;
int len;
int fill; /* actually char */
{
asm("movc5 $0,*4(ap),12(ap),8(ap),*4(ap)");
}
#else ~VaxAsm
void bfill(dst, len, fill)
register char *dst;
register int len;
register int fill; /* char */
{
while (--len >= 0) *dst++ = fill;
}
#endif VaxAsm
'EOF'
echo bmove.c
cat >bmove.c <<'EOF'
/* File : bmove.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: bmove()
bmove(dst, src, len) moves exactly "len" bytes from the source "src"
to the destination "dst". It does not check for NUL characters as
strncpy() and strnmov() do. Thus if your C compiler doesn't support
structure assignment, you can simulate it with
bmove(&to, &from, sizeof from);
The standard 4.2bsd routine for this purpose is bcopy. But as bcopy
has its first two arguments the other way around you may find this a
bit easier to get right.
No value is returned.
Note: the "b" routines are there to exploit certain VAX order codes,
but the MOVC3 instruction will only move 65535 characters. The asm
code is presented for your interest and amusement.
*/
#include "strings.h"
#if VaxAsm
void bmove(dst, src, len)
char *dst, *src;
int len;
{
asm("movc3 12(ap),*8(ap),*4(ap)");
}
#else ~VaxAsm
void bmove(dst, src, len)
register char *dst, *src;
register int len;
{
while (--len >= 0) *dst++ = *src++;
}
#endif VaxAsm
'EOF'
echo bzero.c
cat >bzero.c <<'EOF'
/* File : bzero.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: bzero()
bzero(dst, len) moves "len" 0 bytes to "dst".
Thus to clear a disc buffer to 0s do bzero(buffer, BUFSIZ).
Note: the "b" routines are there to exploit certain VAX order codes,
but the MOVC5 instruction will only move 65535 characters. The asm
code is presented for your interest and amusement.
*/
#include "strings.h"
#if VaxAsm
void bzero(dst, len)
char *dst;
int len;
{
asm("movc5 $0,*4(ap),$0,8(ap),*4(ap)");
}
#else ~VaxAsm
void bzero(dst, len)
register char *dst;
register int len;
{
while (--len >= 0) *dst++ = 0;
}
#endif VaxAsm
'EOF'
echo ctypes.demo
cat >ctypes.demo <<'EOF'
EOF . . . . . . . . . # . .
ch DD? OD? XD? AN? AF? LC? UC? PT? PR? CT? SP? EL?
^@ . . . . . . . . . # . .
^A . . . . . . . . . # . .
^B . . . . . . . . . # . .
^C . . . . . . . . . # . .
^D . . . . . . . . . # . .
^E . . . . . . . . . # . .
^F . . . . . . . . . # . .
^G . . . . . . . . . # . .
^H . . . . . . . . . # . .
^I . . . . . . . . . # # .
^J . . . . . . . . . # # #
^K . . . . . . . . . # # #
^L . . . . . . . . . # # #
^M . . . . . . . . . # # #
^N . . . . . . . . . # . .
^O . . . . . . . . . # . .
ch DD? OD? XD? AN? AF? LC? UC? PT? PR? CT? SP? EL?
^P . . . . . . . . . # . .
^Q . . . . . . . . . # . .
^R . . . . . . . . . # . .
^S . . . . . . . . . # . .
^T . . . . . . . . . # . .
^U . . . . . . . . . # . .
^V . . . . . . . . . # . .
^W . . . . . . . . . # . .
^X . . . . . . . . . # . .
^Y . . . . . . . . . # . .
^Z . . . . . . . . . # . .
^[ . . . . . . . . . # . .
^\ . . . . . . . . . # . .
^] . . . . . . . . . # . .
^^ . . . . . . . . . # . .
^_ . . . . . . . . . # . .
ch DD? OD? XD? AN? AF? LC? UC? PT? PR? CT? SP? EL?
. . . . . . . . # . # .
! . . . . . . . # # . . .
" . . . . . . . # # . . .
# . . . . . . . # # . . .
$ . . . . . . . # # . . .
% . . . . . . . # # . . .
& . . . . . . . # # . . .
' . . . . . . . # # . . .
( . . . . . . . # # . . .
) . . . . . . . # # . . .
* . . . . . . . # # . . .
+ . . . . . . . # # . . .
, . . . . . . . # # . . .
- . . . . . . . # # . . .
. . . . . . . . # # . . .
/ . . . . . . . # # . . .
ch DD? OD? XD? AN? AF? LC? UC? PT? PR? CT? SP? EL?
0 # # # # . . . . # . . .
1 # # # # . . . . # . . .
2 # # # # . . . . # . . .
3 # # # # . . . . # . . .
4 # # # # . . . . # . . .
5 # # # # . . . . # . . .
6 # # # # . . . . # . . .
7 # # # # . . . . # . . .
8 # . # # . . . . # . . .
9 # . # # . . . . # . . .
: . . . . . . . # # . . .
; . . . . . . . # # . . .
< . . . . . . . # # . . .
= . . . . . . . # # . . .
> . . . . . . . # # . . .
? . . . . . . . # # . . .
ch DD? OD? XD? AN? AF? LC? UC? PT? PR? CT? SP? EL?
@ . . . . . . . # # . . .
A . . # # # . # . # . . .
B . . # # # . # . # . . .
C . . # # # . # . # . . .
D . . # # # . # . # . . .
E . . # # # . # . # . . .
F . . # # # . # . # . . .
G . . . # # . # . # . . .
H . . . # # . # . # . . .
I . . . # # . # . # . . .
J . . . # # . # . # . . .
K . . . # # . # . # . . .
L . . . # # . # . # . . .
M . . . # # . # . # . . .
N . . . # # . # . # . . .
O . . . # # . # . # . . .
ch DD? OD? XD? AN? AF? LC? UC? PT? PR? CT? SP? EL?
P . . . # # . # . # . . .
Q . . . # # . # . # . . .
R . . . # # . # . # . . .
S . . . # # . # . # . . .
T . . . # # . # . # . . .
U . . . # # . # . # . . .
V . . . # # . # . # . . .
W . . . # # . # . # . . .
X . . . # # . # . # . . .
Y . . . # # . # . # . . .
Z . . . # # . # . # . . .
[ . . . . . . . # # . . .
\ . . . . . . . # # . . .
] . . . . . . . # # . . .
^ . . . . . . . # # . . .
_ . . . . . . . # # . . .
ch DD? OD? XD? AN? AF? LC? UC? PT? PR? CT? SP? EL?
` . . . . . . . # # . . .
a . . # # # # . . # . . .
b . . # # # # . . # . . .
c . . # # # # . . # . . .
d . . # # # # . . # . . .
e . . # # # # . . # . . .
f . . # # # # . . # . . .
g . . . # # # . . # . . .
h . . . # # # . . # . . .
i . . . # # # . . # . . .
j . . . # # # . . # . . .
k . . . # # # . . # . . .
l . . . # # # . . # . . .
m . . . # # # . . # . . .
n . . . # # # . . # . . .
o . . . # # # . . # . . .
ch DD? OD? XD? AN? AF? LC? UC? PT? PR? CT? SP? EL?
p . . . # # # . . # . . .
q . . . # # # . . # . . .
r . . . # # # . . # . . .
s . . . # # # . . # . . .
t . . . # # # . . # . . .
u . . . # # # . . # . . .
v . . . # # # . . # . . .
w . . . # # # . . # . . .
x . . . # # # . . # . . .
y . . . # # # . . # . . .
z . . . # # # . . # . . .
{ . . . . . . . # # . . .
| . . . . . . . # # . . .
} . . . . . . . # # . . .
~ . . . . . . . # # . . .
DEL . . . . . . . # . # . .
'EOF'
echo ctypes.h
cat >ctypes.h <<'EOF'
/* File : ctypes.h
Author : Richard A. O'Keefe.
Updated: 26 April 1984
Purpose: Reimplement the UNIX ctype(3) library.
isaneol(c) means that c is a line terminating character.
isalnum, ispunct, isspace, isaneol, and isxdigit are defined on the
range -1..127, i.e. on ASCII U {EOF}, while all the other
macros are defined for any integer.
isodigit(c) checks for Octal digits.
isxdigit(c) checks for heXadecimal digits.
*/
#define isdigit(c) ((unsigned)((c)-'0') < 10)
#define islower(c) ((unsigned)((c)-'a') < 26)
#define isupper(c) ((unsigned)((c)-'A') < 26)
#define isprint(c) ((unsigned)((c)-' ') < 95)
#define iscntrl(c) ((unsigned)((c)-' ') >= 95)
#define isascii(c) ((unsigned)(c) < 128)
#define isalpha(c) ((unsigned)(((c)|32)-'a') < 26)
extern char _c2type[];
#define isalnum(c) (_c2type[(c)+1] < 36)
#define ispunct(c) (_c2type[(c)+1] == 36)
#define isspace(c) (_c2type[(c)+1] > 37)
#define isaneol(c) (_c2type[(c)+1] > 38)
#define isxdigit(c) (_c2type[(c)+1] < 16)
#define isodigit(c) ((unsigned)((c)-'0') < 8)
/* The following "conversion" macros have been in some versions of UNIX
but are not in all. tocntrl is new. The original motivation for ^?
being a name for DEL was that (x)^64 mapped A..Z to ^A..^Z and also
? to DEL. The trouble is that this trick doesn't work for lower case
letters. The version given here is not mine. I wish it was. It has
the nice property that DEL is mapped to itself (and EOF to DEL as well).
tolower(c) and toupper(c) are only defined when isalpha(c).
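To see what the expressions do, here they are written as functions in
a small sketch. to_cntrl() and in_range() are hypothetical names used
only here; the first copies the tocntrl expression, the second shows
the single unsigned comparison behind isdigit, islower and friends.
ASCII and two's complement arithmetic are assumed.

```c
#include <assert.h>

// Same expression as the tocntrl macro, written as a function.
static int to_cntrl(int c)
{
    return ((((c) + 1) & ~96) - 1) & 127;
}

// One unsigned comparison tests lo <= c < lo+count: values below lo
// wrap round to very large unsigned numbers and so fail the test.
static int in_range(int c, int lo, unsigned count)
{
    assert(count > 0);
    return (unsigned)(c - lo) < count;
}
```

So to_cntrl('C') and to_cntrl('c') both give 3 (that is, ^C), and
to_cntrl('?') gives 127 (DEL).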
*/
#define tolower(c) ((c)|32)
#define toupper(c) ((c)&~32)
#define tocntrl(c) (((((c)+1)&~96)-1)&127)
#define toascii(c) ((c)&127)
'EOF'
echo ffs.c
cat >ffs.c <<'EOF'
/* File : ffs.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: ffs(), ffc()
ffs(i) returns the index of the least significant 1 bit in i,
where 1 means the least significant bit and 32 means the
most significant bit, or returns -1 if i is zero.
ffc(i) returns the index of the least significant 0 bit in i,
where 1 means the least significant bit and 32 means the
most significant bit, or returns -1 if i has no 0 bits (all ones).
These functions mimic the VAX FFS and FFC instructions, except that
the latter return much more sensible values. This file only exists
to make it easier to move 4.2bsd programs to System III (which is
rather like moving up from a Rolls Royce to a model T Ford), and so
I haven't bothered with assembly code versions.
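The numbering convention can be checked with a small sketch. It is
not the code below: first_set() and first_clear() are hypothetical
names, and the argument is taken as unsigned so that the result does
not depend on how the compiler shifts negative numbers.

```c
#include <assert.h>
#include <limits.h>

// 1-based index of the least significant 1 bit of i, or -1 if none.
static int first_set(unsigned i)
{
    int n;
    for (n = 1; i != 0; n++, i >>= 1) {
        assert(n <= (int)(sizeof i * CHAR_BIT));  // never past the top bit
        if (i & 1) return n;
    }
    return -1;
}

// 1-based index of the least significant 0 bit, or -1 if every bit is 1.
static int first_clear(unsigned i)
{
    return first_set(~i);
}
```

Thus first_set(8) is 4, first_clear(7) is 4, and the only arguments
giving -1 are 0 for first_set and all ones for first_clear.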
*/
#include "strings.h"
int ffs(i)
register int i;
{
register int N;
for (N = 8*sizeof(int); --N >= 0; i >>= 1)
if (i&1) return 8*sizeof(int)-N;
return -1;
}
int ffc(i)
register int i;
{
register int N;
for (N = 8*sizeof(int); --N >= 0; i >>= 1)
if (!(i&1)) return 8*sizeof(int)-N;
return -1;
}
'EOF'
echo getopt.3
cat >getopt.3 <<'EOF'
.TH GETOPT 3 local
.DA 25 March 1982
.SH NAME
getopt \- get option letter from argv
.SH SYNOPSIS
.ft B
int getopt(argc, argv, optstring)
.br
int argc;
.br
char **argv;
.br
char *optstring;
.sp
extern char *optarg;
.br
extern int optind;
.ft
.SH DESCRIPTION
.I Getopt
returns the next option letter in
.I argv
that matches a letter in
.IR optstring .
.I Optstring
is a string of recognized option letters;
if a letter is followed by a colon, the option is expected to have
an argument that may or may not be separated from it by white space.
.I Optarg
is set to point to the start of the option argument on return from
.IR getopt .
.PP
.I Getopt
places in
.I optind
the
.I argv
index of the next argument to be processed.
Because
.I optind
is external, it is normally initialized to zero automatically
before the first call to
.IR getopt .
.PP
When all options have been processed (i.e., up to the first
non-option argument),
.I getopt
returns
.BR EOF .
The special option
.B \-\-
may be used to delimit the end of the options;
.B EOF
will be returned, and
.B \-\-
will be skipped.
.SH SEE ALSO
getopt(1)
.SH DIAGNOSTICS
.I Getopt
prints an error message on
.I stderr
and returns a question mark
.RB ( ? )
when it encounters an option letter not included in
.IR optstring .
.SH EXAMPLE
The following code fragment shows how one might process the arguments
for a command that can take the mutually exclusive options
.B a
and
.BR b ,
and the options
.B f
and
.BR o ,
both of which require arguments:
.PP
.RS
.nf
main(argc, argv)
int argc;
char **argv;
{
int c;
extern int optind;
extern char *optarg;
\&.
\&.
\&.
while ((c = getopt(argc, argv, "abf:o:")) != EOF) {
switch (c) {
case 'a':
if (bflg) errflg++; else aflg++;
break;
case 'b':
if (aflg) errflg++; else bflg++;
break;
case 'f':
ifile = optarg;
break;
case 'o':
ofile = optarg;
break;
case '?':
default:
errflg++;
break;
}
}
if (errflg) {
fprintf(stderr, "Usage: ...");
exit(2);
}
for (; optind < argc; optind++) {
\&.
\&.
\&.
}
\&.
\&.
\&.
}
.RE
.PP
A template similar to this can be found in
.IR /usr/pub/template.c .
.SH HISTORY
Written by Henry Spencer, working from a Bell Labs manual page.
Behavior believed identical to the Bell version.
.SH BUGS
It is not obvious how
`\-'
standing alone should be treated; this version treats it as
a non-option argument, which is not always right.
.PP
Option arguments are allowed to begin with `\-';
this is reasonable but reduces the amount of error checking possible.
.PP
.I Getopt
is quite flexible but the obvious price must be paid: there is much
it could do that it doesn't, like
checking mutually exclusive options, checking type of
option arguments, etc.
'EOF'
echo getopt.c
cat >getopt.c <<'EOF'
/* File : getopt.c
Author : Henry Spencer, University of Toronto
Updated: 28 April 1984
Purpose: get option letter from argv.
*/
#include <stdio.h>
#include "strings.h"
char *optarg; /* Global argument pointer. */
int optind = 0; /* Global argv index. */
int getopt(argc, argv, optstring)
int argc;
char *argv[];
char *optstring;
{
register int c;
register char *place;
static char *scan = NullS; /* Private scan pointer. */
optarg = NullS;
if (scan == NullS || *scan == '\0') {
if (optind == 0) optind++;
if (optind >= argc) return EOF;
place = argv[optind];
if (place[0] != '-' || place[1] == '\0') return EOF;
optind++;
if (place[1] == '-' && place[2] == '\0') return EOF;
scan = place+1;
}
c = *scan++;
place = index(optstring, c);
if (place == NullS || c == ':') {
fprintf(stderr, "%s: unknown option %c\n", argv[0], c);
return '?';
}
if (*++place == ':') {
if (*scan != '\0') {
optarg = scan, scan = NullS;
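} else if (optind < argc) {
optarg = argv[optind], optind++;
} else {
/* The option needs an argument but argv has run out. */
fprintf(stderr, "%s: option %c requires an argument\n", argv[0], c);
return '?';
}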
}
return c;
}
'EOF'
echo int2str.c
cat >int2str.c <<'EOF'
/* File : int2str.c
Author : Richard A. O'Keefe
Updated: 30 April 1984
Defines: int2str(), itoa(), ltoa()
int2str(dst, radix, val)
converts the (long) integer "val" to character form and moves it to
the destination string "dst" followed by a terminating NUL. The
result is normally a pointer to this NUL character, but if the radix
is dud the result will be NullS and nothing will be changed.
If radix is -2..-36, val is taken to be SIGNED.
If radix is 2.. 36, val is taken to be UNSIGNED.
That is, val is signed if and only if radix is. You will normally
use radix -10 only through itoa and ltoa; for radix 2, 8, or 16,
unsigned is what you generally want.
_dig_vec is public just in case someone has a use for it.
The definitions of itoa and ltoa are actually macros in strings.h,
but this is where the code is.
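
The signed/unsigned radix convention can be sketched in ANSI C as
below. This is only an illustration, not the code in this file: it
does every division in unsigned long instead of using the two-stage
trick further down, and the digit table is private rather than the
public _dig_vec.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

// Illustrative restatement of int2str: a negative radix means
// "val is signed", a positive radix means "val is unsigned".
char *int2str(char *dst, int radix, long val)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char buffer[sizeof(long) * CHAR_BIT + 1];
    char *p = &buffer[sizeof buffer - 1];
    unsigned long u = (unsigned long)val;

    if (radix < 0) {
        if (radix < -36 || radix > -2) return NULL;
        if (val < 0) {
            *dst++ = '-';
            u = 0UL - u;    // magnitude, safe even for LONG_MIN
        }
        radix = -radix;
    } else if (radix < 2 || radix > 36) {
        return NULL;
    }
    *p = '\0';
    do {                    // digits come out least significant first
        *--p = digits[u % (unsigned long)radix];
        u /= (unsigned long)radix;
    } while (u != 0);
    while ((*dst++ = *p++) != '\0')
        ;
    return dst - 1;         // points at the NUL just stored
}
```

With this sketch int2str(buf, -10, -42L) stores "-42" and
int2str(buf, 16, 255L) stores "ff"; a radix of 1 yields a nil pointer
and leaves buf alone.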
*/
#include "strings.h"
char _dig_vec[] =
"0123456789abcdefghijklmnopqrstuvwxyz";
char *int2str(dst, radix, val)
register char *dst;
register int radix;
register long val;
{
char buffer[33];
register char *p;
if (radix < 0) {
if (radix < -36 || radix > -2) return NullS;
if (val < 0) {
*dst++ = '-';
val = -val;
}
radix = -radix;
} else {
if (radix > 36 || radix < 2) return NullS;
}
/* The slightly contorted code which follows is due to the
fact that few machines directly support unsigned long / and %.
Certainly the VAX C compiler generates a subroutine call. In
the interests of efficiency (hollow laugh) I let this happen
for the first digit only; after that "val" will be in range so
that signed integer division will do. Sorry 'bout that.
CHECK THE CODE PRODUCED BY YOUR C COMPILER. The first % and /
should be unsigned, the second % and / signed, but C compilers
tend to be extraordinarily sensitive to minor details of style.
This works on a VAX, that's all I claim for it.
*/
p = &buffer[32];
*p = '\0';
*--p = _dig_vec[(unsigned long)val%(unsigned long)radix];
val = (unsigned long)val/(unsigned long)radix;
while (val != 0) *--p = _dig_vec[val%radix], val /= radix;
while (*dst++ = *p++) ;
return dst-1;
}
'EOF'
echo str2int.c
cat >str2int.c <<'EOF'
/* File : str2int.c
Author : Richard A. O'Keefe
Updated: 27 April 1984
Defines: str2int(), atoi(), atol()
str2int(src, radix, lower, upper, &val)
converts the string pointed to by src to an integer and stores it in
val. It skips leading spaces and tabs (but not newlines, formfeeds,
backspaces), then it accepts an optional sign and a sequence of digits
in the specified radix. The result should satisfy lower <= *val <= upper.
The result is a pointer to the first character after the number;
trailing spaces will NOT be skipped.
If an error is detected, the result will be NullS, the value put
in val will be 0, and errno will be set to
EDOM if there are no digits
ERANGE if the result would overflow or otherwise fail to lie
within the specified bounds.
Check that the bounds are right for your machine.
This looks amazingly complicated for what you probably thought was an
easy task. Coping with integer overflow and the asymmetric range of
twos complement machines is anything but easy.
So that users of atoi and atol can check whether an error occurred,
I have taken a wholly unprecedented step: errno is CLEARED if this
call has no problems.
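
The central trick, keeping the running value negative so that the
most negative integer stays reachable, can be sketched for radix 10
in ANSI C as below. It is a simplified illustration, not the routine
in this file: parse_long is a hypothetical name, it works left to
right, it takes no bounds arguments, and it assumes C99 division
(truncation toward zero).

```c
#include <assert.h>
#include <limits.h>

// Parse an optionally signed decimal number into *out.
// Returns 1 on success, 0 if there are no digits or on overflow.
int parse_long(const char *s, long *out)
{
    int negative = 0;
    long acc = 0;                   // always <= 0

    if (*s == '+') s++;
    else if (*s == '-') s++, negative = 1;
    if (*s < '0' || *s > '9') return 0;
    for (; *s >= '0' && *s <= '9'; s++) {
        int d = *s - '0';
        // acc*10 - d must not fall below LONG_MIN
        if (acc < (LONG_MIN + d) / 10) return 0;
        acc = acc * 10 - d;
    }
    if (!negative) {
        if (acc < -LONG_MAX) return 0;  // |LONG_MIN| is not a long
        acc = -acc;
    }
    *out = acc;
    return 1;
}
```

Accumulating downward means the only sign flip happens at the very
end, where a single comparison against -LONG_MAX is enough.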
*/
#include "strings.h"
#include "ctypes.h"
#include <errno.h>
extern int errno;
/* CHECK THESE CONSTANTS FOR YOUR MACHINE!!! */
#if pdp11
# define MaxInt 0x7fffL /* int = 16 bits */
# define MinInt (-0x7fffL-1) /* 0x8000L would be +32768 as a long */
# define MaxLong 0x7fffffffL /* long = 32 bits */
# define MinLong (-0x7fffffffL-1)
#else /* !pdp11 */
# define MaxInt 0x7fffffffL /* int = 32 bits */
# define MinInt (-0x7fffffffL-1)
# define MaxLong 0x7fffffffL /* long = 32 bits */
# define MinLong (-0x7fffffffL-1)
#endif /* pdp11 */
char *str2int(src, radix, lower, upper, val)
register char *src;
register int radix;
long lower, upper, *val;
{
int sign; /* is number negative (+1) or positive (-1) */
int n; /* number of digits yet to be converted */
long limit; /* "largest" possible valid input */
long scale; /* the amount to multiply next digit by */
long sofar; /* the running value */
register int d; /* value of the next digit (subtracted, not added) */
char *answer;
/* Make sure *val is sensible in case of error */
*val = 0;
/* Check that the radix is in the range 2..36 */
if (radix < 2 || radix > 36) {
errno = EDOM;
return NullS;
}
/* The basic problem is: how do we handle the conversion of
a number without resorting to machine-specific code to
check for overflow? Obviously, we have to ensure that
no calculation can overflow. We are guaranteed that the
"lower" and "upper" arguments are valid machine integers.
On sign-and-magnitude, twos-complement, and ones-complement
machines all, if +|n| is representable, so is -|n|, but on
twos complement machines the converse is not true. So the
"maximum" representable number has a negative representative.
Limit is set to min(-|lower|,-|upper|); this is the "largest"
number we are concerned with. */
/* Calculate Limit using Scale as a scratch variable */
if ((limit = lower) > 0) limit = -limit;
if ((scale = upper) > 0) scale = -scale;
if (scale < limit) limit = scale;
/* Skip leading spaces and check for a sign.
Note: because on a 2s complement machine MinLong is a valid
integer but |MinLong| is not, we have to keep the current
converted value (and the scale!) as *negative* numbers,
so the sign is the opposite of what you might expect.
Should the test in the loop be isspace(*src)?
*/
while (*src == ' ' || *src == '\t') src++;
sign = -1;
if (*src == '+') src++; else
if (*src == '-') src++, sign = 1;
/* Check that there is at least one digit */
if (_c2type[1+ *src] >= radix) {
errno = EDOM;
return NullS;
}
/* Skip leading zeros so that we never compute a power of radix
in scale that we won't have a need for. Otherwise sticking
enough 0s in front of a number could cause the multiplication
to overflow when it needn't.
*/
while (*src == '0') src++;
/* Move over the remaining digits. We have to convert from right
to left in order to avoid overflow. Answer is after last digit.
*/
for (n = 0; _c2type[1+ *src++] < radix; n++) ;
answer = --src;
/* The invariant we want to maintain is that src is just
to the right of n digits, we've converted k digits to
sofar, scale = -radix**k, and scale < sofar < 0. Now
if the final number is to be within the original
Limit, we must have (to the left)*scale+sofar >= Limit,
or (to the left)*scale >= Limit-sofar, i.e. the digits
to the left of src must form an integer <= (Limit-sofar)/(scale).
In particular, this is true of the next digit. In our
incremental calculation of Limit,
IT IS VITAL that (-|N|)/(-|D|) = |N|/|D|
*/
for (sofar = 0, scale = -1; --n >= 0; ) {
d = _c2type[1+ *--src];
if (-d < limit) {
errno = ERANGE;
return NullS;
}
limit = (limit+d)/radix, sofar += d*scale;
if (n != 0) scale *= radix; /* watch out for overflow!!! */
}
/* Now it might still happen that sofar = -32768 or its equivalent,
so we can't just multiply by the sign and check that the result
is in the range lower..upper. All of this caution is a right
pain in the neck. If only there were a standard routine which
says generate thus and such a signal on integer overflow...
But not enough machines can do it *SIGH*.
*/
if (sign < 0 && sofar < -MaxLong /* twos-complement problem */
|| (sofar*=sign) < lower || sofar > upper) {
errno = ERANGE;
return NullS;
}
*val = sofar;
errno = 0; /* indicate that all went well */
return answer;
}
int atoi(src)
char *src;
{
long val;
str2int(src, 10, MinInt, MaxInt, &val);
return (int)val;
}
long atol(src)
char *src;
{
long val;
str2int(src, 10, MinLong, MaxLong, &val);
return val;
}
'EOF'
echo strcase.c
cat >strcase.c <<'EOF'
/* File : strcase.c
Author : Richard A. O'Keefe.
Updated: 4 May 1984
Defines: strcase()
strcase(dst, src, op) copies characters from src to dst until a NUL
is encountered, changing the alphabetic case of letters according to
the op. The operations available are
0 -> convert to lower case llllll
1 -> convert to upper case UUUUUU
2 -> capitalise each word Cccccc
3 -> change each letter to the opposite case
3 isn't particularly useful unless you know that all the letters in
src are already in the same case.
BEWARE: this is set up for ASCII only. You can use the same idea
for EBCDIC, but the magic numbers are different. I haven't used an
#ifdef because (a) I don't know what name to use (ebcdic? Ebcdic?)
and (b) I don't suppose many people will want it.
The result is a pointer to the NUL which now ends dst.
You can use strcase(buff, buff, op) safely.
*/
#include "strings.h"
#include "ctypes.h"
#define UPPER 0 /* EBCDIC: 64 */
#define LOWER 32 /* EBCDIC: 0 */
#define OTHER 32 /* EBCDIC: 64 */
char *strcase(dst, src, op)
register char *dst, *src;
int op;
{
register int d; /* Should be char */
register int mask; /* Should be char */
char initial, rest;
switch (op) {
case 0: initial = LOWER, rest = LOWER; break;
case 1: initial = UPPER, rest = UPPER; break;
case 2: initial = UPPER, rest = LOWER; break;
case 3: while (d = *src++)
*dst++ = isalpha(d) ? d^OTHER : d;
goto done;
}
for (mask = initial; d = *src++; *dst++ = d)
if (isalpha(d)) {
d = (d &~ OTHER) | mask, mask = rest;
} else {
mask = initial;
}
done: *dst = '\0';
return dst;
}
'EOF'
echo strcat.c
cat >strcat.c <<'EOF'
/* File : strcat.c
Author : Richard A. O'Keefe.
Updated: 10 April 1984
Defines: strcat()
strcat(s, t) concatenates t on the end of s. There had better be
enough room in the space s points to; strcat has no way to tell.
Note that strcat has to search for the end of s, so if you are doing
a lot of concatenating it may be better to use strmov, e.g.
strmov(strmov(strmov(strmov(s,a),b),c),d)
rather than
strcat(strcat(strcat(strcpy(s,a),b),c),d).
strcat returns the old value of s.
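
A sketch of why the strmov chain is cheaper, written in ANSI C for
illustration (join4 is a hypothetical helper, and this strmov is a
restatement, not the copy in strmov.c): each call resumes at the NUL
the previous call stored, so no part of the result is scanned twice.

```c
#include <assert.h>
#include <string.h>

// Copy src (with its NUL) to dst and return a pointer to the NUL
// just written, so that calls can be chained.
char *strmov(char *dst, const char *src)
{
    while ((*dst++ = *src++) != '\0')
        ;
    return dst - 1;
}

// Join four pieces into s; the result points at the final NUL.
char *join4(char *s, const char *a, const char *b,
            const char *c, const char *d)
{
    return strmov(strmov(strmov(strmov(s, a), b), c), d);
}
```

The strcat chain produces the same string, but every strcat call
starts again from s and walks over everything stored so far.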
*/
#include "strings.h"
char *strcat(s, t)
register char *s, *t;
{
char *save;
for (save = s; *s++; ) ;
for (--s; *s++ = *t++; ) ;
return save;
}
'EOF'
echo strchr.c
cat >strchr.c <<'EOF'
/* File : strchr.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strchr(), index()
strchr(s, c) returns a pointer to the first place in s where c
occurs, or NullS if c does not occur in s. This function is called
index in V7 and 4.?bsd systems; while not ideal the name is clearer
than strchr, so index remains in strings.h as a macro. NB: strchr
looks for single characters, not for sets or strings. To find the
NUL character which closes s, use strchr(s, '\0') or strend(s). The
parameter 'c' is declared 'int' so it will go in a register; if your
C compiler is happy with register _char_ change it to that.
*/
#include "strings.h"
char *strchr(s, c)
register _char_ *s;
register int c;
{
for (;;) {
if (*s == c) return s;
if (!*s++) return NullS;
}
}
'EOF'
echo strcmp.c
cat >strcmp.c <<'EOF'
/* File : strcmp.c
Author : Richard A. O'Keefe.
Updated: 10 April 1984
Defines: strcmp()
strcmp(s, t) returns > 0, = 0, or < 0 when s > t, s = t, or s < t
according to the ordinary lexicographical order. To test for
equality, the macro streql(s,t) is clearer than !strcmp(s,t). Note
that if the string contains characters outside the range 0..127 the
result is machine-dependent; PDP-11s and VAXen use signed bytes,
some other machines use unsigned bytes.
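
The machine dependence can be seen by comparing through plain char
and through unsigned char. The sketch below is in ANSI C and the two
function names are hypothetical; cmp_unsigned follows the rule ISO C
later adopted for strcmp, which compares bytes as unsigned char.

```c
#include <assert.h>

// Compare as this file does, through plain char, whose signedness
// is up to the implementation.
int cmp_plain(const char *s, const char *t)
{
    while (*s == *t++)
        if (!*s++) return 0;
    return s[0] - t[-1];
}

// Compare with every byte forced to unsigned.
int cmp_unsigned(const char *s, const char *t)
{
    const unsigned char *a = (const unsigned char *)s;
    const unsigned char *b = (const unsigned char *)t;
    while (*a == *b++)
        if (!*a++) return 0;
    return a[0] - b[-1];
}
```

For strings of 7-bit characters the two agree. For "\200" against
"a", cmp_unsigned is always positive, while the sign of cmp_plain
depends on whether char is signed on the machine at hand.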
*/
#include "strings.h"
int strcmp(s, t)
register char *s, *t;
{
while (*s == *t++) if (!*s++) return 0;
return s[0]-t[-1];
}
'EOF'
echo strcpack.c
cat >strcpack.c <<'EOF'
/* File : strcpack.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strcpack()
strcpack(dst, src, set, c)
copies characters from src to dst, stopping when it finds a NUL. If
c is NUL, characters not in the set are not copied to dst. If c is
not NUL, sequences of characters not in the set are copied as a
single c. strcpack is to strpack as strcspn is to strspn. If your C
compiler is happy with register _char_, change the declaration of c.
The result is the address of the NUL byte that now terminates "dst".
Note that dst may safely be the same as src.
*/
#include "strings.h"
#include "_str2set.h"
char *strcpack(dst, src, set, c)
register _char_ *dst, *src;
char *set;
register int c;
{
register int chr;
_str2set(set);
while (chr = *src++) {
if (_set_vec[chr] != _set_ctr) {
while ((chr = *src++) && _set_vec[chr] != _set_ctr) ;
if (c) *dst++ = c; /* 1. If you don't want trailing */
if (!chr) break; /* 2. things turned into "c", swap */
} /* lines 1 and 2. */
*dst++ = chr;
}
*dst = 0;
return dst;
}
'EOF'
echo strcpbrk.c
cat >strcpbrk.c <<'EOF'
/* File : strcpbrk.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strcpbrk()
strcpbrk(s1, s2) returns a pointer to the first character of s1 which
does not occur in s2. It is to strpbrk as strcspn is to strspn. It
relies on NUL never being in a set.
*/
#include "strings.h"
#include "_str2set.h"
char *strcpbrk(str, set)
register _char_ *str;
char *set;
{
_str2set(set);
while (_set_vec[*str++] == _set_ctr);
return *--str ? str : NullS;
}
'EOF'
echo strcpy.c
cat >strcpy.c <<'EOF'
/* File : strcpy.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strcpy()
strcpy(dst, src) copies all the characters of src (including the
closing NUL) to dst, and returns the old value of dst. Maybe this
is useful for doing i = strlen(strcpy(dst, src)); I've always found
strmov handier.
*/
#include "strings.h"
char *strcpy(dst, src)
register char *dst, *src;
{
char *save;
for (save = dst; *dst++ = *src++; ) ;
return save;
}
'EOF'
echo strcspn.c
cat >strcspn.c <<'EOF'
/* File : strcspn.c
Author : Richard A. O'Keefe.
Updated: 11 April 1984
Defines: strcspn()
strcspn(s1, s2) returns the length of the longest prefix of s1
consisting entirely of characters which are NOT in s2 ("c" is
"complement"). NUL is considered to be part of s2. As _str2set
will never include NUL in a set, we have to check for it explicitly.
*/
#include "strings.h"
#include "_str2set.h"
int strcspn(str, set)
register _char_ *str;
char *set;
{
register int L;
_str2set(set);
for (L = 0; *str && _set_vec[*str++] != _set_ctr; L++) ;
return L;
}
'EOF'
echo strctrim.c
cat >strctrim.c <<'EOF'
/* File : strctrim.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strctrim()
strctrim(dst, src, set, ends)
copies src to dst, but will skip leading characters not in set if
ends <= 0 and will skip trailing characters not in set if ends >= 0.
Thus there are three cases:
ends < 0 : trim a prefix
ends = 0 : trim a prefix and a suffix both
ends > 0 : trim a suffix
This is to strtrim as strcspn is to strspn.
*/
#include "strings.h"
#include "_str2set.h"
char *strctrim(dst, src, set, ends)
register char *dst, *src;
char *set;
int ends;
{
_str2set(set);
if (ends <= 0) {
register int chr;
while ((chr = *src++) && _set_vec[chr] != _set_ctr) ;
--src;
}
if (ends >= 0) {
register int chr;
register char *save = dst;
while (chr = *src++) {
*dst++ = chr;
if (_set_vec[chr] == _set_ctr) save = dst;
}
dst = save, *dst = NUL;
} else {
while (*dst++ = *src++) ;
--dst;
}
return dst;
}
'EOF'
echo strend.c
cat >strend.c <<'EOF'
/* File : strend.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: strend()
strend(s) returns a character pointer to the NUL which ends s. That
is, strend(s)-s == strlen(s). This is useful for adding things at
the end of strings. It is redundant, because strchr(s,'\0') could
be used instead, but this is clearer and faster.
Beware: the asm version works only if strlen(s) < 65535.
*/
#include "strings.h"
#if VaxAsm
char *strend(s)
char *s;
{
asm("locc $0,$65535,*4(ap)");
asm("movl r1,r0");
}
#else ~VaxAsm
char *strend(s)
register char *s;
{
while (*s++);
return s-1;
}
#endif VaxAsm
'EOF'
echo strfield.c
cat >strfield.c <<'EOF'
/* File : strfield.c
Author : Richard A. O'Keefe.
Updated: 21 April 1984
Defines: strfield()
strfield(src, fields, chars, blanks, tabch)
is based on the key specifications of the sort(1) command.
tabch corresponds to 'x' in -t'x'. If it is NUL, a field
is leading layout (spaces, tabs &c) followed by at least
one non-layout character, and is terminated by the next
layout character or NUL. If it is not NUL, a field is
terminated by tabch or NUL.
fields is the number of fields to skip over. It corresponds
to m in -m.n or +m.n . There must be at least this many
fields, and only the last may be terminated by NUL.
chars is the number of characters to skip after the fields
have been skipped. At least this many non-NUL characters
must remain after the fields have been skipped. Note that
it is entirely possible for this skip to cross one or more
field boundaries. This corresponds to n in +m.n or -m.n .
Finally, if blanks is not 0, any layout characters will be
skipped. There need not be any. This corresponds to the
letter b in +2.0b or -0.4b .
The result is NullS if the source ran out of fields or ran
out of chars. Otherwise it is a pointer to the first
character of src which was not skipped. It is quite possible
for this character to be the terminating NUL.
Example:
to skip to the user-id field of /etc/passwd:
user_id = strfield(line, 2, 0, 0, ':');
to check whether "line" is at least 27 characters long:
if (strfield(line, 0, 27, 0, 0)) then-it-is;
to select the third blank-delimited field in a line:
head = strfield(line, 2, 0, 1, 0);
tail = strfield(head, 1, 0, 0, 0);
(* the field is the tail-head characters starting at head *)
It's not a bug, it's a feature: "layout" means any ASCII character
in the range '\1' .. ' ', including '\n', '\f' and so on.
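
The examples above can be tried with the ANSI C restatement below.
It is a sketch for experimenting, not a replacement for the code in
this file; it uses NULL where the package uses NullS, and it skips
blanks without stepping past the first non-layout character.

```c
#include <assert.h>
#include <stddef.h>

char *strfield(char *src, int fields, int chars, int blanks, int tabch)
{
    if (tabch <= 0) {
        // layout-delimited fields: skip layout, then the field body
        while (--fields >= 0) {
            while (*src <= ' ') if (!*src++) return NULL;
            while (*++src > ' ') ;
        }
    } else if (fields > 0) {
        // tabch-delimited fields: step over "fields" separators
        do if (!*src) return NULL;
        while (*src++ != tabch || --fields > 0);
    }
    while (--chars >= 0) if (!*src++) return NULL;
    if (blanks) while (*src && *src <= ' ') src++;
    return src;
}
```

Given the line "root:x:0:0:Super-User:/:/bin/sh", the call
strfield(line, 2, 0, 0, ':') returns line+7, the start of the
user-id field. Given "alpha  beta gamma delta", the head/tail pair
from the third example brackets the five characters of "gamma".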
*/
#include "strings.h"
char *strfield(src, fields, chars, blanks, tabch)
register char *src;
int fields, chars, blanks, tabch;
{
if (tabch <= 0) {
while (--fields >= 0) {
while (*src <= ' ') if (!*src++) return NullS;
while (*++src > ' ') ;
}
} else
if (fields > 0) {
do if (!*src) return NullS;
while (*src++ != tabch || --fields > 0);
}
while (--chars >= 0) if (!*src++) return NullS;
if (blanks) while (*src && *src <= ' ') src++; /* stop AT the first non-layout char */
return src;
}
'EOF'
echo strfind.c
cat >strfind.c <<'EOF'
/* File : strfind.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: strfind()
strfind(src, pat) looks for an instance of pat in src. pat is not a
regex(3) pattern, it is a literal string which must be matched exactly.
As a special hack to prevent infinite loops, the empty string will be
found just once, at the far end of src. This is hard to justify. The
result is a pointer to the first character AFTER the located instance,
or NullS if pat does not occur in src. The reason for returning the
place after the instance is so that you can count the number of instances
by writing
_str2pat(ToBeFound);
for (p = src, n = 0; p = strfind(p, NullS); n++) ;
If you want a pointer to the first character of the instance, it is up
to you to subtract strlen(pat).
If there were a strnfind it wouldn't have to look at all the characters
of src; this version does, because otherwise it could miss the closing NUL.
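
The counting idiom relies only on the result pointing just past the
match. The ANSI C sketch below shows the same idea with the standard
strstr standing in for strfind; find_after and count_instances are
hypothetical names, and the empty-pattern rule is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Return a pointer just past the first instance of pat in src,
// or NULL if there is none. pat must be non-empty.
const char *find_after(const char *src, const char *pat)
{
    const char *hit = strstr(src, pat);
    return hit ? hit + strlen(pat) : NULL;
}

// Count non-overlapping instances by resuming after each one.
int count_instances(const char *src, const char *pat)
{
    const char *p;
    int n = 0;
    for (p = src; (p = find_after(p, pat)) != NULL; n++)
        ;
    return n;
}
```

Because the search resumes after the whole instance, overlapping
occurrences are not counted: "aba" is found twice in "abababa".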
*/
#include "strings.h"
#include "_str2pat.h"
char *strfind(src, pat)
char *src, *pat;
{
register char *s, *p;
register int c, lastch;
pat = _str2pat(pat);
if (_pat_lim < 0) {
for (s = src; *s++; ) ;
return s-1;
}
/* The pattern is non-empty */
for (c = _pat_lim, lastch = pat[c]; ; c = _pat_vec[c]) {
for (s = src; --c >= 0; )
if (!*s++) return NullS;
c = *s, src = s;
if (c == lastch) {
for (s -= _pat_lim, p = pat; *p; )
if (*s++ != *p++) goto not_yet;
return s;
not_yet:; }
}
}
'EOF'
echo strings.h
cat >strings.h <<'EOF'
/* File : strings.h
Updated: 30 April 1984
Purpose: Header file for the "string(3C)" package.
Copyright (C) 1984 Richard A. O'Keefe.
All the routines in this package are the original work of
R.A.O'Keefe. Any resemblance between them and any routines in
licensed software is due entirely to these routines having been
written using the "man 3 string" UNIX manual page, or in some cases
the "man 1 sort" manual page as a specification. See the READ-ME to
find the conditions under which these routines may be used & copied.
*/
#define NullS (char*)0
#define NUL '\0'
#ifndef _AlphabetSize
#define _AlphabetSize 128
#endif
#if _AlphabetSize == 128
typedef char _char_;
#endif
#if _AlphabetSize == 256
typedef unsigned char _char_;
#endif
/* NullS is the "nil" character pointer. NULL would work in most
cases, but in some C compilers pointers and integers may be of
different sizes, so it is handy to have a nil pointer that one can
pass to a function as well as compare pointers against.
NUL is the "end of string character". Strings are deemed to end at
the first NUL, or, for the routines which take an N argument, when N
is exhausted. None of the routines in this package works on the
length alone. (NUL is the ASCII name for this character.)
The routines which move characters around don't care whether they
are signed or unsigned. But the routines which compare a character
in a string with an argument, or use a character from a string as an
index into an array, do care. I have assumed that
_AlphabetSize = 128 => only 0..127 appear in strings
_AlphabetSize = 256 => only 0..255 appear in strings
The files _str2set.c and _str2map.c declare character vectors using
this size. If you don't have unsigned char, your machine may treat
char as unsigned anyway.
*/
extern char *strcat(/*char^,char^*/);
extern char *strncat(/*char^,char^,int*/);
extern int strcmp(/*char^,char^*/);
extern int strncmp(/*char^,char^,int*/);
#define streql !strcmp
#define strneql !strncmp
extern char *strcpy(/*char^,char^*/);
extern char *strncpy(/*char^,char^,int*/);
extern int strlen(/*char^*/);
extern int strnlen(/*char^,int*/);
extern char *strchr(/*char^,_char_*/);
extern char *strrchr(/*char^,_char_*/);
#define index strchr
#define rindex strrchr
extern char *strmov(/*char^,char^*/);
extern char *strnmov(/*char^,char^,int*/);
extern char *strend(/*char^*/);
extern char *strpbrk(/*char^,char^*/);
extern char *strcpbrk(/*char^,char^*/);
extern int strspn(/*char^,char^*/);
extern int strcspn(/*char^,char^*/);
extern char *strtok(/*char^,char^*/);
extern void istrtok(/*char^,char^*/);
extern char *strpack(/*_char_^,_char_^,char^,int*/);
extern char *strcpack(/*_char_^,_char_^,char^,int*/);
extern int strrpt(/*char^,char^,int*/);
extern int strnrpt(/*char^,int,char^,int*/);
extern void strtrans(/*_char_^,_char_^,_char_^,_char_^*/);
extern void strntrans(/*_char_^,_char_^,int,_char_^,_char_^*/);
extern char *strtrim(/*char^,char^,char^,int*/);
extern char *strctrim(/*char^,char^,char^,int*/);
extern char *strfield(/*char^,int,int,int,int*/);
extern char *strkey(/*char^,char^,char^,char^*/);
extern char *strfind(/*char^,char^*/);
extern char *strrepl(/*char^,char^,char^,char^*/);
extern void bcopy(/*char^,char^,int*/);
extern void bmove(/*char^,char^,int*/);
extern void bfill(/*char^,int,char*/);
extern void bzero(/*char^,int*/);
extern int bcmp(/*char^,char^,int*/);
#define beql !bcmp
extern int ffs(/*int*/);
extern int ffc(/*int*/);
extern char *str2int(/*char^,int,long,long,long^*/);
extern int atoi(/*char^*/);
extern long atol(/*char^*/);
extern char *int2str(/*char^,int,long*/);
#define itoa(d, n) int2str(d, -10, (long)(n))
#define ltoa(d, n) int2str(d, -10, (long)(n))
'EOF'
echo strkey.c
cat >strkey.c <<'EOF'
/* File : strkey.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strkey()
strkey(dst, head, tail, options)
copies tail-head characters from head to dst according to the
options. If tail is NullS, it copies up to the terminating
NUL of head. This function is meant for doing comparisons as
by sort(1). The options are thus a string of characters
taken from "bdfin". In case the options came from somewhere
else other letters are ignored.
-b: leading layout characters are not copied.
-d: only letters, digits, and blanks are copied.
-i: only graphic characters (32..126) are copied.
-n: a numeric string is copied.
The -d, -i, and -n options are mutually incompatible; the last one given is taken.
-f: upper case letters are copied as lower case.
The question of what to do with a numeric string is an interesting
one, and I don't claim that this is a brilliant answer. However,
the solution used here does mean that the caller can compare two
strings as strings without needing to know that they are numeric. A
number is copied as <sign><9 digits>.<remaining digits>, where
<sign> is '-' for a negative number and '0' for a positive number.
The magic number 9 is defined to be DigitMagic.
The idea is that to compare two lines using the keys
-tx +m1.n1<flags> -m2.n2
you do
h1 = strfield(line1, m1, n1, 0, 'x');
t1 = strfield(h1, 1, 0, 0, 'x');
strkey(buff1, h1, t1, "flags");
h2 = strfield(line2, m2, n2, 0, 'x');
t2 = strfield(h2, 1, 0, 0, 'x');
strkey(buff2, h2, t2, "flags");
... strcmp(buff1, buff2) ...
The point of all this, of course, is to make it easier to write new
utilities which are compatible with sort(1) than ones which are not.
*/
#include "strings.h"
#define DigitMagic 9
char *strkey(dst, head, tail, flags)
register char *dst, *head, *tail;
char *flags;
{
register int c;
int b = 0; /* b option? */
int f = 0; /* f option? */
int k = 0; /* 3->n, 2->d, 1->i, 0->none of them */
while (*flags) switch (*flags++|32) {
case 'b': b++; break;
case 'f': f++; break;
case 'i': k = 1; break;
case 'd': k = 2; break;
case 'n': k = 3; break;
default : /*ignore*/break;
}
flags = dst; /* save return value */
if (tail == NullS) for (tail = head; *tail; tail++) ;
if (b) while (head != tail && *head <= ' ') head++;
switch (k) {
case 0:
if (f) {
while (head != tail) {
c = *head++;
if (c >= 'A' && c <= 'Z') c |= 32;
*dst++ = c;
}
} else {
while (head != tail) *dst++ = *head++;
}
break;
case 1:
if (f) {
while (head != tail) {
c = *head++;
if (c >= 32 && c <= 126) {
if (c >= 'A' && c <= 'Z') c |= 32;
*dst++ = c;
}
}
} else {
while (head != tail) {
c = *head++;
if (c >= 32 && c <= 126) *dst++ = c;
}
}
break;
case 2:
if (f) f = 32;
while (head != tail) {
c = *head++;
if (c >= '0' && c <= '9' || c >= 'a' && c <= 'z' || c == ' ') {
*dst++ = c;
} else
if (c >= 'A' && c <= 'Z') {
*dst++ = c|f;
}
}
break;
case 3:
if (*head == '-' && head != tail) {
*dst++ = *head++; /* copy the sign; the digits start right after it */
} else {
*dst++ = '0';
}
b = 0;
while (head != tail) {
c = *head;
if (c < '0' || c > '9') break;
b++, head++;
}
f = DigitMagic-b;
while (--f >= 0) *dst++ = '0';
head -= b;
while (--b >= 0) *dst++ = *head++;
if (*head == '.' && head != tail) {
*dst++ = *head++;
while (head != tail) {
c = *head++;
if (c < '0' || c > '9') break;
*dst++ = c;
}
/* now remove trailing 0s and possibly the '.' as well */
while (*--dst == '0') ;
if (*dst != '.') dst++;
}
break;
}
*dst = NUL;
return flags; /* saved initial value of dst */
}
'EOF'
echo strlen.c
cat >strlen.c <<'EOF'
/* File : strlen.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: strlen()
strlen(s) returns the number of characters in s, that is, the number
of non-NUL characters found before the closing NUL. Note: some
non-standard C compilers for 32-bit machines take int to be 16 bits;
either put up with short strings or change int to long throughout
this package. Better yet, BOYCOTT such shoddy compilers.
Beware: the asm version works only if strlen(s) < 65536.
*/
#include "strings.h"
#if VaxAsm
int strlen(s)
char *s;
{
asm("locc $0,$65535,*4(ap)");
asm("subl3 r0,$65535,r0");
}
#else ~VaxAsm
int strlen(s)
register char *s;
{
register int L;
for (L = 0; *s++; L++) ;
return L;
}
#endif VaxAsm
'EOF'
echo strmov.c
cat >strmov.c <<'EOF'
/* File : strmov.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strmov()
strmov(dst, src) moves all the characters of src (including the
closing NUL) to dst, and returns a pointer to the new closing NUL in
dst. The similar UNIX routine strcpy returns the old value of dst,
which I have never found useful. strmov(strmov(dst,a),b) moves a//b
into dst, which seems useful.
*/
#include "strings.h"
char *strmov(dst, src)
register char *dst, *src;
{
while (*dst++ = *src++) ;
return dst-1;
}
'EOF'
echo strncase.c
cat >strncase.c <<'EOF'
/* File : strncase.c
Author : Richard A. O'Keefe.
Updated: 4 May 1984
Defines: strncase()
strncase(dst, src, n, op) copies characters from src to dst until
n runs out or a NUL is copied, whichever occurs first. It changes
the alphabetic case of letters according to op. The options are
0 -> convert to lower case llllll
1 -> convert to upper case UUUUUU
2 -> capitalise each word Cccccc
3 -> change each letter to the opposite case
This is the "n" version of strcase(). The result is a character
pointer to the closing NUL if one was transferred, otherwise to
the next character after the last one transferred. (The idea is
that strncase(dst, src, n, op)-dst = strnlen(src, n).)
You can use strncase(buff, buff, n, op) safely.
*/
#include "strings.h"
#include "ctypes.h"
#define UPPER 0 /* EBCDIC: 64 */
#define LOWER 32 /* EBCDIC: 0 */
#define OTHER 32 /* EBCDIC: 64 */
char *strncase(dst, src, n, op)
register char *dst, *src;
int n;
int op;
{
register int d; /* Should be char */
register int mask; /* Should be char */
char initial, rest;
switch (op) {
case 0: initial = LOWER, rest = LOWER; break;
case 1: initial = UPPER, rest = UPPER; break;
case 2: initial = UPPER, rest = LOWER; break;
case 3: while (--n >= 0 && (d = *src++))
*dst++ = isalpha(d) ? d^OTHER : d;
goto done;
}
for (mask = initial; --n >= 0 && (d = *src++); *dst++ = d)
if (isalpha(d)) {
d = (d &~ OTHER) | mask, mask = rest;
} else {
mask = initial;
}
done: if (n >= 0) *dst = '\0';
return dst;
}
'EOF'
echo strncat.c
cat >strncat.c <<'EOF'
/* File : strncat.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strncat()
strncat(dst, src, n) copies up to n characters of src to the end of
dst. As with strcat, it has to search for the end of dst. Even if
it abandons src early because n runs out it will still close dst
with a NUL. See also strnmov.
*/
#include "strings.h"
char *strncat(dst, src, n)
register char *dst, *src;
register int n;
{
char *save;
for (save = dst; *dst++; ) ;
for (--dst; --n >= 0; )
if (!(*dst++ = *src++)) return save;
*dst = NUL;
return save;
}
'EOF'
echo strncmp.c
cat >strncmp.c <<'EOF'
/* File : strncmp.c
Author : Richard A. O'Keefe.
Updated: 10 April 1984
Defines: strncmp()
strncmp(s, t, n) compares the first n characters of s and t.
If they are the same in the first n characters it returns 0,
otherwise it returns the same value as strcmp(s, t) would.
*/
#include "strings.h"
int strncmp(s, t, n)
register char *s, *t;
register int n;
{
while (--n >= 0) {
if (*s != *t++) return s[0]-t[-1];
if (!*s++) return 0;
}
return 0;
}
'EOF'
echo strncpy.c
cat >strncpy.c <<'EOF'
/* File : strncpy.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strncpy()
strncpy(dst, src, n) copies up to n characters of src to dst. It
will pad dst on the right with NUL or truncate it as necessary to
ensure that n characters exactly are transferred. It returns the
old value of dst as strcpy does.
*/
#include "strings.h"
char *strncpy(dst, src, n)
register char *dst, *src;
register int n;
{
char *save;
for (save = dst; --n >= 0; ) {
if (!(*dst++ = *src++)) {
while (--n >= 0) *dst++ = NUL;
return save;
}
}
return save;
}
'EOF'
echo strnlen.c
cat >strnlen.c <<'EOF'
/* File : strnlen.c
Author : Richard A. O'Keefe.
Updated: 10 April 1984
Defines: strnlen()
strnlen(s, n) returns the number of characters up to the first NUL
in s, or n, whichever is smaller.
*/
#include "strings.h"
int strnlen(s, n)
register char *s;
register int n;
{
register int L;
for (L = 0; --n >= 0 && *s++; L++) ;
return L;
}
'EOF'
echo strnmov.c
cat >strnmov.c <<'EOF'
/* File : strnmov.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strnmov()
strnmov(dst, src, n) moves up to n characters of src to dst. It
always moves exactly n characters to dst; if src is shorter than n
characters dst will be extended on the right with NULs, while if src
is longer than n characters dst will be a truncated version of src
and will not have a closing NUL. The result is a pointer to the
first NUL in dst, or is dst+n if dst was truncated.
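
The two outcomes, padded and truncated, can be seen with the ANSI C
restatement below. It is a sketch for illustration that mirrors the
routine in this file with prototypes added.

```c
#include <assert.h>

char *strnmov(char *dst, const char *src, int n)
{
    while (--n >= 0) {
        if ((*dst++ = *src++) == '\0') {
            char *nul = dst - 1;    // first NUL stored in dst
            while (--n >= 0) *dst++ = '\0';     // pad out to n
            return nul;
        }
    }
    return dst;     // truncated: no NUL was stored
}
```

Moving "hi" with n = 5 stores 'h', 'i' and three NULs and returns
dst+2. Moving "hello" with n = 3 stores "hel" with no NUL and
returns dst+3, leaving whatever followed in dst untouched.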
*/
#include "strings.h"
char *strnmov(dst, src, n)
register char *dst, *src;
register int n;
{
while (--n >= 0) {
if (!(*dst++ = *src++)) {
src = dst-1;
while (--n >= 0) *dst++ = NUL;
return src;
}
}
return dst;
}
'EOF'
echo strnrpt.c
cat >strnrpt.c <<'EOF'
/* File : strnrpt.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strnrpt()
strnrpt(dst, n, src, k) "RePeaTs" the string src into dst k times,
but will truncate the result at n characters if necessary. E.g.
strnrpt(dst, 7, "hack ", 2) will move "hack ha" to dst WITHOUT the
closing NUL. The result is the number of characters moved, not
counting the closing NUL. Equivalent to strrpt-ing into an infinite
buffer and then strnmov-ing the result.
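
A rough sketch of the behaviour described above, assuming a modern
ANSI C compiler. demo_strnrpt is a hypothetical name used only for
this illustration; like the code in this file it writes nothing when
k <= 0 and does not pad dst with NULs.

```c
#include <assert.h>
#include <string.h>

// Hypothetical stand-in for strnrpt(): repeat src k times into dst,
// writing at most n characters.  A closing NUL is written only if
// there is still room for it within those n characters.
// Returns the number of non-NUL characters written.
int demo_strnrpt(char *dst, int n, const char *src, int k)
{
    int moved = 0;
    if (k <= 0) return 0;                   // nothing to repeat
    while (--k >= 0) {
        const char *p;
        for (p = src; *p != '\0'; p++) {
            if (moved >= n) return moved;   // out of room: truncated
            dst[moved++] = *p;
        }
    }
    if (moved < n) dst[moved] = '\0';       // room left for the NUL
    return moved;
}
```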
*/
#include "strings.h"
int strnrpt(dst, n, src, k)
register char *dst;
register int n;
char *src;
int k;
{
char *save;
for (save = dst; --k >= 0; dst--) {
register char *p;
for (p = src; ; ) {
if (--n < 0) return dst-save;
if (!(*dst++ = *p++)) break;
}
n++; /* the NUL copied above is not counted against n */
}
return dst-save;
}
'EOF'
echo strntrans.c
cat >strntrans.c <<'EOF'
/* File : strntrans.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strntrans()
strntrans(dst, src, n, from, to)
copies exactly n characters from src to dst, translating characters in
from[] to the corresponding characters in to[]. It will not stop when
it encounters a NUL, so you can use it with a table which maps NUL
to something different. No value is returned.
*/
#include "strings.h"
#include "_str2map.h"
void strntrans(dst, src, n, from, to)
register _char_ *dst, *src;
register int n;
_char_ *from, *to;
{
_str2map(0, from, to);
while (--n >= 0) *dst++ = _map_vec[*src++] ;
}
'EOF'
echo strpack.c
cat >strpack.c <<'EOF'
/* File : strpack.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strpack()
strpack(dst, src, set, c)
copies characters from src to dst, stopping when it finds a NUL. If
c is NUL, characters in set are not copied to dst. If c is not NUL,
sequences of characters from set are copied as a single c.
strpack(d, s, " \t", ' ') can be used to compress white space,
strpack(d, s, " \t", NUL) to eliminate it. To translate characters
in set to c without compressing runs, see strtrans(). The result is
the address of the NUL byte now terminating dst. Note that dst may
safely be the same as src.
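
The two uses above can be sketched as follows. This is only an
illustration: demo_strpack is a hypothetical name, and it looks
characters up with the standard strchr() rather than the _str2set
table this file uses.

```c
#include <assert.h>
#include <string.h>

// Hypothetical stand-in for strpack(): squeeze each run of characters
// from set down to a single c, or drop the run when c is NUL.
// Returns the address of the NUL that terminates dst.
char *demo_strpack(char *dst, const char *src, const char *set, int c)
{
    while (*src != '\0') {
        if (strchr(set, *src) != NULL) {
            // skip the whole run of set characters
            while (*src != '\0' && strchr(set, *src) != NULL) src++;
            if (c != '\0') *dst++ = (char)c;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return dst;
}
```

dst never gets ahead of src, which is why packing a buffer in place
works.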
*/
#include "strings.h"
#include "_str2set.h"
char *strpack(dst, src, set, c)
register _char_ *dst, *src;
char *set;
register int c;
{
register int chr;
_str2set(set);
while (chr = *src++) {
if (_set_vec[chr] == _set_ctr) {
while ((chr = *src++) && _set_vec[chr] == _set_ctr) ;
if (c) *dst++ = c; /* 1. If you don't want trailing */
if (!chr) break; /* 2. things turned into "c", swap */
} /* lines 1 and 2. */
*dst++ = chr;
}
*dst = 0;
return dst;
}
'EOF'
echo strpbrk.c
cat >strpbrk.c <<'EOF'
/* File : strpbrk.c
Author : Richard A. O'Keefe.
Updated: 11 April 1984
Defines: strpbrk()
strpbrk(s1, s2) returns NullS if no character of s2 occurs in s1, or
a pointer to the first character of s1 which occurs in s2 if there
is one. It generalises strchr (v7=index). It wouldn't be useful to
consider NUL as part of s2, as that would occur in every s1.
*/
#include "strings.h"
#include "_str2set.h"
char *strpbrk(str, set)
register _char_ *str;
char *set;
{
_str2set(set);
while (_set_vec[*str] != _set_ctr)
if (!*str++) return NullS;
return str;
}
'EOF'
echo strpref.c
cat >strpref.c <<'EOF'
/* File : strpref.c
Author : Richard A. O'Keefe.
Updated: 11 April 1984
Defines: strpref()
strpref(src, prefix)
checks whether prefix is a prefix of src. If it is not, the
result is NullS. If it is, the result is a pointer to the
first character of src after the prefix (src+strlen(prefix)).
*/
#include "strings.h"
char *strpref(src, prefix)
register char *src, *prefix;
{
while (*prefix) if (*src++ != *prefix++) return NullS;
return src;
}
'EOF'
echo strrchr.c
cat >strrchr.c <<'EOF'
/* File : strrchr.c
Author : Richard A. O'Keefe.
Updated: 10 April 1984
Defines: strrchr(), rindex()
strrchr(s, c) returns a pointer to the last place in s where c
occurs, or NullS if c does not occur in s. This function is called
rindex in V7 and 4.?bsd systems; while not ideal the name is clearer
than strrchr, so rindex remains in strings.h as a macro. NB:
strrchr looks for single characters, not for sets or strings. The
parameter 'c' is declared 'int' so it will go in a register; if your
C compiler is happy with register char change it to that.
*/
#include "strings.h"
char *strrchr(s, c)
register _char_ *s;
register int c;
{
register char *t;
t = NullS;
do if (*s == c) t = s; while (*s++);
return t;
}
'EOF'
echo strrepl.c
cat >strrepl.c <<'EOF'
/* File : strrepl.c
Author : Richard A. O'Keefe.
Updated: 23 April 1984
Defines: strrepl()
strrepl(dst, src, pat, rep, times) copies src to dst, replacing the
first "times" non-overlapping instances of pat by rep. pat is not a
regex(3) pattern, it is a literal string which must be matched
exactly. As a special hack, since strfind claims to find "" just
once at the end of the src string, strrepl does a strcat when pat is
an empty string "". If times <= 0, it is just strmov.
The result is a pointer to the NUL which now terminates dst.
BEWARE: even when rep is shorter than pat it is NOT necessarily safe
for dst to be the same as src. ALWAYS make sure dst and src do not/
will not overlap. You have been warned.
There really ought to be a strnrepl with a bound for the size of the
destination string, but there isn't.
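
For comparison, here is a sketch of the same contract written with
the standard strstr() instead of the table-driven search used below.
demo_strrepl is a hypothetical name; treat this as an illustration of
the intended results, not as a replacement for the code in this file.

```c
#include <assert.h>
#include <string.h>

// Hypothetical stand-in for strrepl(): copy src to dst, replacing the
// first `times` non-overlapping occurrences of pat by rep.
// dst must not overlap src and must be big enough for the result.
// Returns the address of the NUL that terminates dst.
char *demo_strrepl(char *dst, const char *src,
                   const char *pat, const char *rep, int times)
{
    size_t plen = strlen(pat), rlen = strlen(rep);
    const char *hit;

    if (plen == 0) {                        // "" matches once, at the end
        size_t slen = strlen(src);
        memcpy(dst, src, slen);
        dst += slen;
        if (times > 0) { memcpy(dst, rep, rlen); dst += rlen; }
        *dst = '\0';
        return dst;
    }
    while (times > 0 && (hit = strstr(src, pat)) != NULL) {
        size_t gap = (size_t)(hit - src);
        memcpy(dst, src, gap);              // text before the match
        memcpy(dst + gap, rep, rlen);       // the replacement
        dst += gap + rlen;
        src = hit + plen;                   // resume after the match
        times--;
    }
    strcpy(dst, src);                       // whatever is left over
    return dst + strlen(src);
}
```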
*/
#include "strings.h"
#include "_str2pat.h"
char *strrepl(dst, src, pat, rep, times)
char *dst, *src, *pat, *rep;
int times;
{
register char *s, *p;
register int c, lastch;
pat = _str2pat(pat);
if (times <= 0) {
for (p = dst, s = src; *p++ = *s++; ) ;
return p-1;
}
if (_pat_lim < 0) {
for (p = dst, s = src; *p++ = *s++; ) ;
for (--p, s = rep; *p++ = *s++; ) ;
return p-1;
}
/* The pattern is non-empty and times is positive */
c = _pat_lim, lastch = pat[c];
for (;;) {
for (s = src, p = dst; --c >= 0; )
if (!(*p++ = *s++)) return p-1;
c = *s, src = s, dst = p;
if (c == lastch) {
for (s -= _pat_lim, p = pat; *p; )
if (*s++ != *p++) goto not_yet;
for (p = dst-_pat_lim, s = rep; *p++ = *s++; ) ;
--p;
if (--times == 0) {
for (s = src; *p++ = *++s; ) ;
return p-1;
}
dst = p, src++, c = _pat_lim;
} else {
not_yet: c = _pat_vec[c];
}
}
}
'EOF'
echo strrpt.c
cat >strrpt.c <<'EOF'
/* File : strrpt.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strrpt()
strrpt(dst, src, k) "RePeaTs" the string src into dst k times. E.g.
strrpt(dst, "hack ", 2) will move "hack hack " to dst. If k <= 0 it
does nothing. The result is the number of characters moved, except
for the closing NUL. src may be "" but may not of course be NullS.
*/
#include "strings.h"
int strrpt(dst, src, k)
register char *dst;
char *src;
int k;
{
char *save;
for (save = dst; --k >= 0; --dst) {
register char *p;
for (p = src; *dst++ = *p++; ) ;
}
return dst-save;
}
'EOF'
echo strspn.c
cat >strspn.c <<'EOF'
/* File : strspn.c
Author : Richard A. O'Keefe.
Updated: 11 April 1984
Defines: strspn()
strspn(s1, s2) returns the length of the longest prefix of s1
consisting entirely of characters in s2. NUL is not considered to
be in s2, and _str2set will not include it in the set.
*/
#include "strings.h"
#include "_str2set.h"
int strspn(str, set)
register _char_ *str;
char *set;
{
register int L;
_str2set(set);
for (L = 0; _set_vec[*str++] == _set_ctr; L++) ;
return L;
}
'EOF'
echo strsuff.c
cat >strsuff.c <<'EOF'
/* File : strsuff.c
Author : Richard A. O'Keefe.
Updated: 11 April 1984
Defines: strsuff()
strsuff(src, suffix)
checks whether suffix is a suffix of src. If it is not, the
result is NullS. If it is, the result is a pointer to the
character of src where suffix starts (which is the same as
src+strlen(src)-strlen(suffix) ).
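
A small sketch of the same test written with strlen() and memcmp()
instead of the pointer walk used below. demo_strsuff is a
hypothetical name chosen for this illustration only.

```c
#include <assert.h>
#include <string.h>

// Hypothetical stand-in for strsuff(): returns the place in src where
// suffix starts, or NULL if src does not end with suffix.
const char *demo_strsuff(const char *src, const char *suffix)
{
    size_t n = strlen(src), m = strlen(suffix);
    if (m > n) return NULL;                 // suffix longer than src
    return memcmp(src + n - m, suffix, m) == 0 ? src + n - m : NULL;
}
```

An empty suffix matches at the closing NUL of src.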
*/
#include "strings.h"
char *strsuff(src, suffix)
register char *src, *suffix;
{
register int L; /* length of suffix */
for (L = 0; *suffix++; L++)
if (!*src++) return NullS;
while (*src++) ;
for (--src, --suffix; --L >= 0; )
if (*--src != *--suffix) return NullS;
return src;
}
'EOF'
echo strtok.c
cat >strtok.c <<'EOF'
/* File : strtok.c
Author : Richard A. O'Keefe.
Updated: 11 April 1984
Defines: istrtok(), strtok()
strtok(src, set)
skips over initial characters of src[] which occur in set[].
The result is a pointer to the first character of src[]
which does not occur in set[]. It then skips until it finds
a character which does occur in set[], and changes it to NUL.
If src is NullS, it is as if you had specified the place
just after the last NUL was written. If src[] contains no
characters which are not in set[] (e.g. if src == "") the
result is NullS.
To read a sequence of words separated by spaces you might write
p = strtok(sequence, " ");
while (p) {process_word(p); p = strtok(NullS, " ");}
This is unpleasant, so there is also a function
istrtok(src, set)
which builds the set and notes the source string for future
reference. With this function, you can write
for (istrtok(wordlist, " \t"); p = strtok(NullS, NullS); )
process_word(p);
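
The word loop above can be tried out with the sketch below. It is an
illustration under stated assumptions: demo_strtok and demo_next are
hypothetical names, the set must be passed on every call (the
NullS-set shorthand is not modelled), and the standard strspn() and
strcspn() stand in for the _str2set table.

```c
#include <assert.h>
#include <string.h>

static char *demo_next = "";   // where the next scan resumes

// Hypothetical stand-in for strtok(src, set); pass NULL as src to
// carry on with the string used last time.
char *demo_strtok(char *src, const char *set)
{
    char *tok;
    if (src == NULL) src = demo_next;
    src += strspn(src, set);                // skip leading separators
    if (*src == '\0') { demo_next = src; return NULL; }
    tok = src;
    src += strcspn(src, set);               // find the end of the token
    if (*src != '\0') *src++ = '\0';        // cut, unless at the end
    demo_next = src;
    return tok;
}
```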
*/
#include "strings.h"
#include "_str2set.h"
static char *oldSrc = "";
void istrtok(src, set)
char *src, *set;
{
_str2set(set);
if (src != NullS) oldSrc = src;
}
char *strtok(src, set)
register char *src;
char *set;
{
char *save;
_str2set(set);
if (src == NullS) src = oldSrc;
while (_set_vec[*src] == _set_ctr) src++;
if (!*src) return NullS;
save = src;
while (*++src && _set_vec[*src] != _set_ctr) ;
if (*src) *src++ = NUL; /* do not step past the end of the string */
oldSrc = src;
return save;
}
'EOF'
echo strtrans.c
cat >strtrans.c <<'EOF'
/* File : strtrans.c
Author : Richard A. O'Keefe.
Updated: 11 April 1984
Defines: strtrans()
strtrans(dst, src, from, to)
copies characters from src[] to dst[], stopping when dst gets a
NUL character, translating characters in from[] to corresponding
characters in to[]. Courtesy of _str2map, if from or to is null
its previous value will be used, and if both are NullS the table
will not be rebuilt. Note that copying stops when a NUL is put
into dst[], which can normally happen only when a NUL has been
fetched from src[], but if you have built your own translation
table it may be earlier (if some character is mapped to NUL) or
later (if NUL is mapped to something else). No value is
returned.
*/
#include "strings.h"
#include "_str2map.h"
void strtrans(dst, src, from, to)
register _char_ *dst, *src;
_char_ *from, *to;
{
_str2map(0, from, to);
while (*dst++ = _map_vec[*src++]) ;
}
'EOF'
echo strtrim.c
cat >strtrim.c <<'EOF'
/* File : strtrim.c
Author : Richard A. O'Keefe.
Updated: 20 April 1984
Defines: strtrim()
strtrim(dst, src, set, ends)
copies src to dst, but will skip leading characters in set if "ends"
is <= 0, and will skip trailing characters in set if ends is >= 0.
Thus there are three cases:
ends < 0 : trim a prefix
ends = 0 : trim a prefix and a suffix both
ends > 0 : trim a suffix
To compress internal runs, see strpack. The normal use of this is
strtrim(buffer, buffer, " \t", 0); The result is the address of the
NUL which now terminates dst.
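
The three cases can be sketched as below. demo_strtrim is a
hypothetical name, and the sketch uses the standard strspn(),
strchr() and memmove() rather than the _str2set table; it is meant
only to show the expected results.

```c
#include <assert.h>
#include <string.h>

// Hypothetical stand-in for strtrim(): ends < 0 trims a prefix,
// ends > 0 trims a suffix, ends == 0 trims both.
// Returns the address of the NUL that terminates dst.
char *demo_strtrim(char *dst, const char *src, const char *set, int ends)
{
    size_t len;
    if (ends <= 0) src += strspn(src, set); // drop leading set chars
    len = strlen(src);
    if (ends >= 0)                          // drop trailing set chars
        while (len > 0 && strchr(set, src[len - 1]) != NULL) len--;
    memmove(dst, src, len);                 // memmove: dst may equal src
    dst[len] = '\0';
    return dst + len;
}
```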
*/
#include "strings.h"
#include "_str2set.h"
char *strtrim(dst, src, set, ends)
register char *dst, *src;
char *set;
int ends;
{
_str2set(set);
if (ends <= 0) {
while (_set_vec[*src] == _set_ctr) src++;
}
if (ends >= 0) {
register int chr;
register char *save = dst;
while (chr = *src++) {
*dst++ = chr;
if (_set_vec[chr] != _set_ctr) save = dst;
}
dst = save, *dst = NUL;
} else {
while (*dst++ = *src++) ;
--dst;
}
return dst;
}
'EOF'