Are 3B1 "pipes" really slower than molasses?
Thad P Floryan
thad at cup.portal.com
Tue Nov 27 17:26:14 AEST 1990
Yet another chapter in the ongoing saga of "Don't shoe-shine MY data!"
While investigating why the tape backup operation on the 3B1 is so s-l-o-w,
even with double-buffering techniques, I finally pinpointed what appears to
be the cause: PIPES. Pipes are used to transfer data to "tapecpio" in all
the supplied shell scripts, and pipes are typically used to pass data from
a "find" (i.e. "find * -print | cpio -oc > whatever").
"Piping" was the ONLY thing in common with all my testing, so I decided to
instrument some pipe runs and see what gives. Seems the 3B1 pipes leak bits
out into the Great Bit Bucket or sumtin'. This is the first time I've ever
had something "bad" to say about the 3B1. And this "problem" affects more
than just backups; it affects ANYTHING using pipes, so this should be of
interest to you no matter what system you're using.
Specifically: the BEST performance observed is approx. 35 KBytes/Second between
two processes which are piped together. Adding more "drains" to the "pipe"
worsens performance: with one extra process in the middle of the pipeline,
throughput drops to approx. 18-19 KBytes/Second. I tested 4 UNIXPC systems,
ranging from 4MB RAM/85MB HD to 1MB RAM/10MB HD, and the results are all in
the same ballpark: 35-36 KBytes per second.
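One way to separate the cost of the pipe itself from the stdio and shell
overhead is to drive a pipe directly with read() and write() inside a single
program. What follows is only a sketch of that idea, not part of the test
suite in this posting. It is written for a current POSIX system (waitpid,
ssize_t), so it would need adapting for the 3B1, and the names pipe_run and
MAXBLK are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/times.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAXBLK 65536            /* hypothetical upper bound on block size */

/* Send `total` bytes through a pipe to a forked reader, writing at most
 * `blksize` bytes per write() call.  Stores the elapsed clock ticks in
 * *ticks and returns the byte count the reader saw, or -1 on error. */
static long pipe_run(long total, size_t blksize, long *ticks)
{
    static char buf[MAXBLK];
    int data[2], result[2];
    long seen = 0, left = total;
    struct tms t;
    clock_t start;
    pid_t pid;

    assert(blksize > 0 && blksize <= MAXBLK);
    if (pipe(data) < 0)
        return -1;
    if (pipe(result) < 0) {
        close(data[0]); close(data[1]);
        return -1;
    }
    memset(buf, 'X', sizeof buf);
    start = times(&t);                  /* start time in clock ticks */

    if ((pid = fork()) < 0) {
        close(data[0]); close(data[1]);
        close(result[0]); close(result[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: plays the "recv" role */
        ssize_t n;
        close(data[1]);
        close(result[0]);
        while ((n = read(data[0], buf, sizeof buf)) > 0)
            seen += n;
        if (write(result[1], &seen, sizeof seen) != (ssize_t)sizeof seen)
            _exit(1);
        _exit(0);
    }

    close(data[0]);                     /* parent: plays the "send" role */
    close(result[1]);
    while (left > 0) {
        size_t want = left < (long)blksize ? (size_t)left : blksize;
        ssize_t n = write(data[1], buf, want);
        if (n <= 0)
            break;
        left -= n;
    }
    close(data[1]);                     /* reader sees EOF */
    if (read(result[0], &seen, sizeof seen) != (ssize_t)sizeof seen)
        seen = -1;
    close(result[0]);
    waitpid(pid, NULL, 0);
    *ticks = (long)(times(&t) - start); /* elapsed time in clock ticks */
    return seen;
}
```

Calling pipe_run() with the same byte counts as test.sh and a few different
block sizes, then dividing the bytes by (ticks / sysconf(_SC_CLK_TCK)), gives
CPS figures in the same units as recv.c. If the numbers change a lot with the
block size, the write size is a factor; if they don't, the pipe itself is the
likelier suspect.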
Perhaps there's something I'm just not seeing, or perhaps some "ktune" params
are not obvious. I'm working on the assumption that "pipes" are a performance
bottleneck on the UNIXPC and so I went and grabbed some tape utils from site
wsmr-simtel20.army.mil to see if a non-piped tape backup/restore program can
improve performance. This will take some time to check out, so in the meantime
here are two things I'm asking:
1) Enclosed are my test programs, a Makefile, and a shell-script to
exercise the tests. Try them on your system. If the results are
substantially different, please post them along with your present
"ktune" parameters (you get these by: "su; ktune -d"). By
"substantially different" results I mean you're getting 200 KBytes/Sec
or something else radically different from my results (below).
2) If you know of ways to improve pipe performance, please post them.
I don't recall any discussions of this "problem" mentioned in this
newsgroup before, so maybe I've opened a new "can-of-worms" here;
wouldn't be the first time and definitely won't be the last! :-)
Enclosed with this posting is a "shar" of my test suite. You may need to
change the "gcc" in the Makefile to be "cc", but I tried both with no change
in the observed performance. If nothing else, you may find the timing code
in "recv.c" interesting. To run the tests, do either:
$ ./test.sh (OR) $ nohup ./test.sh &
That second form places its output in a file named "nohup.out". In all cases,
the output will look something like:
$ ./test.sh
send <n> | recv
100000 characters received in 2.783 seconds for 35928 CPS
200000 characters received in 5.833 seconds for 34285 CPS
300000 characters received in 8.350 seconds for 35928 CPS
400000 characters received in 11.933 seconds for 33519 CPS
500000 characters received in 14.100 seconds for 35460 CPS
1000000 characters received in 28.200 seconds for 35460 CPS
send <n> | pass | recv
100000 characters received in 5.566 seconds for 17964 CPS
200000 characters received in 10.333 seconds for 19354 CPS
300000 characters received in 16.200 seconds for 18518 CPS
400000 characters received in 21.200 seconds for 18867 CPS
500000 characters received in 26.733 seconds for 18703 CPS
1000000 characters received in 53.050 seconds for 18850 CPS
If you see any flaws in my testing techniques, I'd appreciate knowing about
them, too. But I've checked this out quite thoroughly and I'm convinced that
what I'm seeing with the results (above) is the actual piping throughput.
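The integer arithmetic that recv.c uses to turn clock ticks into seconds and
CPS can be checked against the sample output by hand. Below is a minimal
sketch of the same math as a standalone function. The tick rate of 60 per
second is an assumption: the posting only says HZ comes from <sys/param.h>,
but every line of the sample output is reproduced exactly with that value.
The names struct rate and ticks_to_rate are hypothetical.

```c
#include <assert.h>

/* Seconds, thousandths of a second, and characters per second,
 * all truncated toward zero as in recv.c. */
struct rate {
    long secs;
    long millis;
    long cps;
};

/* Convert a character count and an elapsed tick count into the three
 * numbers recv.c prints.  hz is the number of clock ticks per second
 * (ASSUMED to be 60 for the sample output in this posting). */
static struct rate ticks_to_rate(long numchrs, long elapsed, long hz)
{
    struct rate r;

    assert(hz > 0 && elapsed > 0);
    r.secs   = elapsed / hz;
    r.millis = ((elapsed % hz) * 1000L) / hz;
    r.cps    = (numchrs * hz) / elapsed;
    return r;
}
```

For example, 100000 characters in 167 ticks comes out as 2.783 seconds and
35928 CPS, which matches the first line of the sample output. Note that
numchrs * hz is computed in a long, so where long is 32 bits the product
overflows once numchrs exceeds roughly 35 million at 60 ticks per second.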
The "ktune" parameters on my systems are (the comments are my annotations):
# ktune -d
nbuf     100   #number of system buffers for block devices
ninode   400   #number of memory-resident inodes at one time
nfile    300   #number of files open on system at one time
nproc    100   #number of processes existing at one time
ntext    75    #number of text structures allocated in kernel
nclist   150   #number of clist buffers available
npbuf    16    #number of buffer headers in the raw I/O pool
ncall    32    #number of callouts allowed in the kernel
nttyhog  1024  #number of chars in tty buffers before implicit flush
Some other systems I've already tested with the same suite include (with the
results for 1,000,000 chars in both tests rounded to nearest 1000):
HP-9000/840 (Spectrum RISC), HP-UX 3.01, 240000 CPS and 120000 CPS
HP-9000/350 (Motorola 68030), HP-UX 7.0, 156000 CPS and 85000 CPS
Thad
Thad Floryan [ thad at cup.portal.com (OR) ..!sun!portal!cup.portal.com!thad ]
---- Cut Here and unpack ----
#!/bin/sh
# This is a shell archive (shar 3.32)
# made 11/27/1990 05:18 UTC by thad at thadlabs
# Source directory /u/thad/Filecabinet/WORK/pipe-test
#
# existing files WILL be overwritten
#
# This shar contains:
# length mode name
# ------ ---------- ------------------------------------------
# 485 -rw-r--r-- Makefile
# 247 -rw-r--r-- pass.c
# 824 -rw-r--r-- recv.c
# 332 -rw-r--r-- send.c
# 411 -rwxr-xr-x test.sh
#
if touch 2>&1 | fgrep 'amc' > /dev/null
then TOUCH=touch
else TOUCH=true
fi
# ============= Makefile ==============
echo "x - extracting Makefile (Text)"
sed 's/^X//' << 'SHAR_EOF' > Makefile &&
X# 3B1 makefile for pipe speed testing
X#
XCC = gcc
XCFLAGS = -O
XLDFLAGS = -s
XLIBS = /lib/crt0s.o /lib/shlib.ifile
XNAME1 = send
XOBJS1 = send.o
XNAME2 = recv
XOBJS2 = recv.o
XNAME3 = pass
XOBJS3 = pass.o
X
Xall : $(NAME1) $(NAME2) $(NAME3)
X
X$(NAME1): $(OBJS1)
X $(LD) $(LDFLAGS) -o $(NAME1) $(OBJS1) $(LIBS)
X
X$(NAME2): $(OBJS2)
X $(LD) $(LDFLAGS) -o $(NAME2) $(OBJS2) $(LIBS)
X
X$(NAME3): $(OBJS3)
X $(LD) $(LDFLAGS) -o $(NAME3) $(OBJS3) $(LIBS)
X
Xclean :
X rm -f $(OBJS1) $(OBJS2) $(OBJS3) core *~
SHAR_EOF
$TOUCH -am 1126050290 Makefile &&
chmod 0644 Makefile ||
echo "restore of Makefile failed"
set `wc -c Makefile`;Wc_c=$1
if test "$Wc_c" != "485"; then
echo original size 485, current size $Wc_c
fi
# ============= pass.c ==============
echo "x - extracting pass.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > pass.c &&
X/* pass.c
X *
X * just passes/handoffs chars from stdin to stdout until EOF for testing
X * the speed of pipes on the system.
X *
X * Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X
Xmain()
X{
X int c;
X
X while ( (c = getchar()) != EOF ) putchar(c);
X
X}
SHAR_EOF
$TOUCH -am 1126045490 pass.c &&
chmod 0644 pass.c ||
echo "restore of pass.c failed"
set `wc -c pass.c`;Wc_c=$1
if test "$Wc_c" != "247"; then
echo original size 247, current size $Wc_c
fi
# ============= recv.c ==============
echo "x - extracting recv.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > recv.c &&
X/* recv.c
X *
X * just receives chars from stdin until EOF for testing the speed
X * of pipes on the system.
X *
X * Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X#include <sys/param.h> /* for def of HZ */
X#include <sys/types.h>
X#include <sys/times.h>
X
Xmain()
X{
X extern long times();
X
X long startime, endtime, elapsed;
X struct tms timebuf;
X long numchrs = 0;
X
X startime = times(&timebuf); /* get start time in HZ units */
X
X while ( getchar() != EOF ) ++numchrs;
X
X endtime = times(&timebuf); /* get completion time in HZ units */
X
X if ( (elapsed = endtime - startime) != 0L )
X {
X printf("%d characters received in %d.%03d seconds for %d CPS\n",
X numchrs,
X elapsed / HZ,
X ((elapsed % HZ) * 1000L) / HZ,
X ((numchrs * HZ) / elapsed ));
X }
X else
X {
X printf("Insufficient timer resolution for supplied input\n");
X }
X}
SHAR_EOF
$TOUCH -am 1126045390 recv.c &&
chmod 0644 recv.c ||
echo "restore of recv.c failed"
set `wc -c recv.c`;Wc_c=$1
if test "$Wc_c" != "824"; then
echo original size 824, current size $Wc_c
fi
# ============= send.c ==============
echo "x - extracting send.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > send.c &&
X/* send.c
X *
X * just sends argv[1] number of characters out for testing the speed
X * of pipes on the system.
X *
X * Thad Floryan, 26-Nov-1990
X */
X
X#include <stdio.h>
X
Xmain(argc, argv)
X int argc;
X char *argv[];
X{
X long numchrs;
X
X numchrs = atol(argv[1]); /* dismiss error checks for now */
X
X while ( --numchrs >= 0L ) putchar('X');
X}
SHAR_EOF
$TOUCH -am 1126044190 send.c &&
chmod 0644 send.c ||
echo "restore of send.c failed"
set `wc -c send.c`;Wc_c=$1
if test "$Wc_c" != "332"; then
echo original size 332, current size $Wc_c
fi
# ============= test.sh ==============
echo "x - extracting test.sh (Text)"
sed 's/^X//' << 'SHAR_EOF' > test.sh &&
Xecho "\nsend <n> | recv\n"
X./send 100000 | ./recv
X./send 200000 | ./recv
X./send 300000 | ./recv
X./send 400000 | ./recv
X./send 500000 | ./recv
X./send 1000000 | ./recv
Xecho "\nsend <n> | pass | recv\n"
X./send 100000 | ./pass | ./recv
X./send 200000 | ./pass | ./recv
X./send 300000 | ./pass | ./recv
X./send 400000 | ./pass | ./recv
X./send 500000 | ./pass | ./recv
X./send 1000000 | ./pass | ./recv
SHAR_EOF
$TOUCH -am 1126175890 test.sh &&
chmod 0755 test.sh ||
echo "restore of test.sh failed"
set `wc -c test.sh`;Wc_c=$1
if test "$Wc_c" != "411"; then
echo original size 411, current size $Wc_c
fi
exit 0