Performance Tuning a DEC 5000 Ultrix 4.0 Risc Workstation
Corey Satten
corey at milton.u.washington.edu
Fri Sep 21 03:02:13 AEST 1990
: ----- cut here ----- cut here ----- cut here ----- cut here -----
: This is a "shell archive". Save everything after the cut mark
: in a file called thisstuff, then feed it to sh by typing sh thisstuff.
: SHAR archive format. Archive created Thu Sep 20 09:13:16 PDT 1990
echo x - READ_ME
echo '-rw-r--r-- 2 corey 6125 Sep 20 09:12 READ_ME (as sent)'
sed 's/^-//' >READ_ME <<'+FUNKY+STUFF+'
- Performance Tuning a DEC 5000 Ultrix 4.0 Risc Workstation
-
- Corey Satten, corey at cac.washington.edu
- and
- Laurence Lundblade, lgl at cac.washington.edu
-
- Networks and Distributed Computing
- University of Washington
- Seattle, Washington
- September 1990
-
-
-
-History:
-
- Until August 1990, our department was using a rather maximally
- configured pmax (DEC 3100 running Ultrix 3.1) as a time-sharing host.
- It had five disks, mostly Maxtor 660 megs. It served /usr/local/bin
- via NFS to about a dozen workstations; was the departmental electronic
- mail machine; host to some campus wide mailing lists; our anonymous FTP
- server; one of two campus default domain nameservers; and also
- time-sharing host for about 16 X-terminals plus about a dozen other users
- connected via telnet. We were supporting about 150 megs of swap space
- on some small portion of the 24 megabyte physical memory. A 'ps aux'
- listing usually had 250-300 lines in it.
-
- As you might guess, the machine wasn't always snappy, but it did
- admirably. It was clearly disk i/o limited -- mostly, we assumed,
- because it was usually thrashing. Still, the load average was usually
- between 1 and 2, and it was mostly the spikes which were annoying.
-
- In mid-August we upgraded to a 3max (DEC 5000) running Ultrix 4.0. We
- doubled our RAM to 48 megs, increased our MIPS rating by 50-80%, and yet
- felt that the system was slower than ever. According to the `ps' program
- we were still thrashing even though our active virtual memory was less
- than the physical memory available to support it. As we looked more
- closely, we discovered that the system wasn't even paging; it was
- swapping, and making stupid choices of what to swap, at that!
-
-Analysis:
-
- Eventually we decided that the constants involved in the 2-handed
- clock paging algorithm are no longer appropriate. In particular, the
- defaults on our system were:
-
- lotsfree = 128 (512k)
- desfree = 64 (256k)
- minfree = 24 (96k)
- maxpgio = 60 (4k pages per second)
- slowscan = 94 (computed)
- fastscan = 47 (computed/2)
-
- In the old days, programs were small and the extra memory needed
- to start several could be obtained from a 512k-byte free list. Today,
- programs are bloated with X libraries, etc. Our average process is
- about 500k. At the scan rates we were seeing (100-200 4k pages/second),
- scanning simply couldn't keep up with the demand. Our free list hovered
- right around the minimum threshold which triggers swapping.
-
- We examined some old source code and discovered what factors can
- trigger swapping. Several of these, such as load > 2, are compiled into
- the kernel as constants or are computed into local variables. These
- can only be changed by recompiling the kernel, something we can't do
- until DEC releases the current source. Fortunately, a significant number
- of the terms in the equation are stored in global variables which can
- be changed on a running system. By changing a few values, we believe
- we have virtually eliminated swapping on our system and raised the
- interactive performance level substantially.
-
- On our system we have made the following changes:
-
- lotsfree = 1280 (5 meg)
- desfree = 256 (1 meg)
- minfree = 64 (256k)
- maxpgio = 125 (4k pages per second)
- slowscan = 30
- fastscan = 10
-
- In this way, we try to have 5 megs of free list for programs to
- absorb transient loads, we can replenish the free list 5 times faster
- than the default, and we've increased the allowable page-in plus page-out
- rate to 125. (I can easily make our system burst to 150 and sustain 125,
- so I don't think 125 is indicative that the paging system is in distress.
- Also, when choosing your own numbers, remember that vmstat displays `pi'
- and `po' in 1k pages, so 125 4k pages per second shows up there as a
- combined 500.)
-
- Since DEC has phased out adb, I wrote a program to allow us to make
- these changes. I've called it `kmem' and it works like this:
-
- prompt% kmem lotsfree desfree # to read values
- lotsfree(0x8014ba40) 1280
- desfree(0x8014ba48) 256
-
- prompt% kmem -w lotsfree=1281 desfree=257 # to write values
- lotsfree(0x8014ba40) 1280 -> 1281
- desfree(0x8014ba48) 256 -> 257
-
- Once you find values you're happy with, put the `kmem -w' command in
- /etc/rc.local so they are set again at every boot. The source to `kmem'
- is included in this directory.
-
-Final Disclaimers:
-
- By re-compiling the kernel, we expect we can do still better. We
- believe the clock paging algorithm still isn't working very well. Even
- though we see better performance when paging than when swapping, we
- suspect that because the "global page replacement" algorithm is making
- its decisions on very local (2 megabyte spread between hands) page use
- data, we aren't making very good use of physical memory. To support this
- claim, we notice that our cpu usually shows substantial idle time even
- when the load is greater than 1, and the "active real memory" field we
- print in our "vmstat" listing (from t_arm) usually shows that lots of
- our physical memory is "inactive" when we think it shouldn't be.
-
- By increasing desfree to 640 (2.5 meg) we can partially re-enable
- swapping of only "deadwood" (jobs sleeping for longer than 20 seconds).
- We find this helps increase our active real memory and decrease our idle
- cpu, but at the cost of an unacceptable degradation in interactive
- response time.
-
- Before I finish, I should probably point out that in addition to the
- load you might expect on our system, we have 3 anomalies. First, we
- have about 60-80 processes such as xclock, which wake up every now and
- then to check/update something and then sleep for a short while longer.
- Second, we have an unusually large number of very popular shell scripts
- which start dozens of little awks, seds, greps, etc. Third, we have
- 3 swap disks configured and we think we've done a good job of spreading
- all disk requests across all the drives.
-
---------
-Corey Satten, corey at cac.washington.edu
-Networks and Distributed Computing
-University of Washington
+FUNKY+STUFF+
chmod u=rw,g=r,o=r READ_ME
ls -l READ_ME
echo x - kmem.c
echo '-rw-r--r-- 2 corey 2910 Sep 10 20:20 kmem.c (as sent)'
sed 's/^-//' >kmem.c <<'+FUNKY+STUFF+'
-/*
- * a tool to use in place of adb (on systems without adb) which lets you
- * peek and poke at the values of kernel variables in /dev/kmem
- *
- * usage: kmem var1 var2 ... varN
- * or
- * usage: kmem -w var1=val1 var2=val2 ... varN=valN
- *
- * Corey Satten, corey at cac.washington.edu, 9/6/90 - Ultrix 4.0 version
- */
-#include<stdio.h>
-#include<nlist.h>
-#include<sys/file.h>
-
-struct nlist *nl; /* how we find locations of names */
-int *nv; /* the new values for each name */
-int w_flag = 0; /* write new values? */
-char *file = "/vmunix"; /* default file to read symbols from */
-int kmem;
-
-main(argc, argv)
-    int argc;
-    char *argv[];
-{
-    int f;              /* walks argv up to index of first non-flag */
-    int i;              /* walks through remaining arguments */
-    int value = 0;
-    int rc = 0;
-
-    /*
-     * flag parsing
-     */
-    for (f=1; f<argc && *(argv[f]) == '-'; ++f) {
-        switch(argv[f][1]) {
-        default:
-            fprintf(stderr, "%s: unknown flag -%c\n", argv[0], argv[f][1]);
-            exit(1);
-        case 'w':
-            w_flag = 1;
-            break;
-        case 'f':
-            /* -f needs a following argument naming the symbol file */
-            if (++f >= argc) {
-                fprintf(stderr, "%s: -f requires a file name\n", argv[0]);
-                exit(1);
-            }
-            file = argv[f];
-            break;
-        }
-    }
-
-    /*
-     * handle the remaining arguments as either symname or symname=value
-     * depending on whether -w (w_flag) was specified.
-     */
-
-    nl = (struct nlist *) malloc( sizeof(*nl) * (argc-f+1) );
-    nv = (int *) malloc( sizeof(int) * (argc-f+1) );
-    if (!nv || !nl) {perror("malloc"); exit(1);}
-
-    for (i=0; i<argc-f; ++i) {
-        /* room for the whole argument plus its terminating NUL */
-        char *name = (char *)malloc(strlen(argv[i+f]) + 1);
-
-        if (!name) {perror("malloc"); exit(1);}
-        rc = sscanf(argv[i+f], "%[^=]=%d", name, &value);
-        /* expect 1 field (name) when reading, 2 (name=value) with -w */
-        if (rc - w_flag != 1) {
-            fprintf(stderr, "%s: bad argument: %s\n", argv[0], argv[i+f]);
-            exit(1);
-        }
-        nl[i].n_name = name;
-        nv[i] = value;
-    }
-    nl[i].n_name = "";  /* an empty name terminates the nlist array */
-
-    /*
-     * now figure out where to read/write in /dev/kmem and do it
-     */
-
-    nlist(file, nl);
-
-    kmem = open("/dev/kmem", w_flag ? O_RDWR : O_RDONLY);
-    if (kmem < 0) {
-        perror("/dev/kmem open");
-        exit(1);
-    }
-
-    for (i=0; i<argc-f; ++i) {
-        long seekto = (long)nl[i].n_value;
-
-        if (nl[i].n_type == 0) {
-            fprintf(stderr, "%s: symbol `%s' not found in namelist of %s\n",
-                argv[0], nl[i].n_name, file);
-            /*
-             * We promise to do all writes in command line order, so if one
-             * is going to fail, we'd best bail out rather than continue.
-             */
-            if (w_flag) exit(2);
-            else continue;
-        }
-        if ( lseek(kmem, seekto, 0) != seekto ) {
-            perror("/dev/kmem lseek"); exit(2);
-        }
-        if ( read(kmem, &value, sizeof(int)) != sizeof(int) ) {
-            perror("/dev/kmem read"); exit(2);
-        }
-
-        printf("%s(0x%x)\t%d", nl[i].n_name, nl[i].n_value, value);
-
-        if (w_flag) {
-            /* the read advanced the offset, so seek back before writing */
-            if ( lseek(kmem, seekto, 0) != seekto ) {
-                perror("/dev/kmem lseek"); exit(2);
-            }
-            value = nv[i];
-            printf(" -> %d", value);
-            if ( write(kmem, &value, sizeof(int)) != sizeof(int) ) {
-                perror("/dev/kmem write"); exit(2);
-            }
-        }
-        putchar('\n');
-    }
-    exit(0);
-}
+FUNKY+STUFF+
chmod u=rw,g=r,o=r kmem.c
ls -l kmem.c
echo x - Makefile
echo '-rw-r--r-- 2 corey 28 Sep 10 17:56 Makefile (as sent)'
sed 's/^-//' >Makefile <<'+FUNKY+STUFF+'
-kmem: kmem.o
-	cc -o $@ $@.o
+FUNKY+STUFF+
chmod u=rw,g=r,o=r Makefile
ls -l Makefile
exit 0
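
The page arithmetic in the READ_ME (4k pages for the tunables, 1k units in
vmstat) is easy to get wrong when choosing your own numbers. The following
is a minimal sketch of that arithmetic, assuming the 4k page size stated in
the READ_ME and a POSIX-style sh with functions and $(( )) arithmetic. The
helper names pages_to_kb, kb_to_pages and show are hypothetical and are not
part of kmem. The last line only prints a kmem command line; it does not
run it.

```shell
#!/bin/sh
# Sketch only. Assumes 4k kernel pages, as stated in the READ_ME.
# pages_to_kb, kb_to_pages and show are hypothetical helper names.

PAGE_KB=4

# pages_to_kb N: kilobytes occupied by N 4k pages
pages_to_kb() { echo $(( $1 * $PAGE_KB )); }

# kb_to_pages N: number of 4k pages in N kilobytes
kb_to_pages() { echo $(( $1 / $PAGE_KB )); }

# show NAME PAGES: print a tunable together with its size in kilobytes
show() { echo "$1=$2 ($(pages_to_kb "$2")k)"; }

# The tuned values from the READ_ME, with their sizes.
show lotsfree 1280
show desfree 256
show minfree 64

# maxpgio counts 4k pages per second; vmstat shows pi and po in 1k units.
echo "maxpgio=125 corresponds to pi+po of $(pages_to_kb 125) in vmstat"

# Build, but do not run, a command line of the form shown in the READ_ME.
echo "kmem -w lotsfree=$(kb_to_pages 5120) desfree=$(kb_to_pages 1024) minfree=$(kb_to_pages 256)"
```

Starting from sizes in kilobytes and converting to pages keeps the three
free-list thresholds in the units kmem expects, and the printed line can be
reviewed before it is copied into /etc/rc.local.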