Should 'sync' stop terminal/system activity ? (SUMMARY)
Andy-Krazy-Glew
aglew at urbana.mcd.mot.com
Wed Sep 6 11:29:26 AEST 1989
To: tuvie!dpmizar!lcz
Subject: Buffers and interactive response
Reply-To: aglew at urbana.mcd.mot.com
Bcc:
Date: Tue, 05 Sep 89 21:14:38 CDT
From: Andy 'Krazy' Glew <aglew at chant>
I'm trying to post this, but having problems.
In-reply-to: ee at atbull.UUCP's message of 5 Sep 89 01:44:01 GMT
Newsgroups: comp.unix.wizards
Followup-To: comp.unix.wizards
Subject: Re: Should 'sync' stop terminal/system activity ? (SUMMARY)
References: <265 at atbull.UUCP> <266 at atbull.UUCP>
Distribution: world
>In article <265 at atbull.UUCP> i write:
>>_The configuration:
>> 60830,UNIX V.3,16MB
>>
>>_The Situation:
>> 2MB for BUFFERS
>> 'sync' stops all terminal/system activity for
>> some seconds ( until buffers are written to
>> disk ? )
>>
>> 1MB for BUFFERS
>> no troubles with sync
>>
>> ( of course 2MB BUFFERS is preferable, but users
>> complain about terminals freezing for no obvious
>> reason )
>>
>>_The question:
>> Is it normal/ok for sync to freeze terminal activity ?
>Answers :
>1>From: tuvie!dpmizar!lcz (Lee Ziegenhals)
>1>
>1>I would appreciate any information you have on this problem. I ran into the
>1>same thing on a Motorola 68030 system running SystemV/68. Motorola's response
>1>was basically (1) set the file hardening switch (which turns the cache into
>1>a write-through cache -- at a tremendous performance hit), or (2) use fewer
>1>buffers. I don't really consider either of these an acceptable solution.
>1>
>1>I'm hoping to get more information from the engineers at Motorola, but I'm
>1>not holding my breath...
>1>
>1>-Lee Ziegenhals
This isn't the correct forum for a formal announcement of functionality,
and what I say must not be understood as an official Motorola policy, but
I feel a bit bad about Lee Ziegenhals' "not holding his breath" for help
from Motorola, so...
Yep, we found this performance problem, large buffer caches producing
big jerks in interactive response, as soon as we started living on large
memory machines. So far the biggest jerk I measured was 13 seconds!
The problem was an O(n^2) algorithm in the buffer cache scanning code
(standard UNIX), when a lot of buffers were dirty.
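To make the quadratic cost concrete, here is a toy model in Python. It is not the actual kernel code: it assumes a flusher that goes back to the head of the buffer list after every write, which is one plausible way such behaviour can arise. The function names are made up for illustration.

```python
def restart_scan_cost(nbuf, dirty):
    """Headers examined by a flusher that restarts from the head of the
    list after every write (hypothetical model, not the real kernel code)."""
    dirty = set(dirty)
    examined = 0
    while True:
        for i in range(nbuf):
            examined += 1
            if i in dirty:
                dirty.discard(i)  # "write" the buffer, then start over
                break
        else:
            return examined       # a full pass found nothing dirty


def single_pass_cost(nbuf, dirty):
    """Headers examined by a flusher that writes as it goes, in one pass."""
    dirty = set(dirty)
    examined = 0
    for i in range(nbuf):
        examined += 1
        dirty.discard(i)          # "write" if dirty, keep going
    return examined


if __name__ == "__main__":
    for n in (500, 1000, 2000):
        print(n, restart_scan_cost(n, range(n)), single_pass_cost(n, range(n)))
```

With every buffer dirty, doubling the buffer count roughly quadruples the work of the restarting scan, while the single pass only doubles.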
In Motorola SYSTEM V/68 R3V6 we have provided a different buffer
cache scanning algorithm, that is O(n), but, moreover, reduces the
"jerk" by scanning the buffer cache in segments. So, if you are
scanning the buffer cache (BDFLUSHR) once every second, then we can
now split up the work into, say, 1/60 as much on every clock tick.
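A rough sketch of the segmented idea, again as a Python toy model rather than the R3V6 code. The class name, the cursor field, and the choice of 60 ticks per pass are assumptions made for illustration; the point is only that each tick examines a bounded slice of the cache.

```python
class SegmentedFlusher:
    """Toy model: scan 1/ticks_per_pass of the buffer cache on each tick."""

    def __init__(self, nbuf, ticks_per_pass=60):
        self.dirty = [False] * nbuf
        # Ceiling division so that a full pass never needs more than
        # ticks_per_pass ticks.
        self.segment = -(-nbuf // ticks_per_pass)
        self.cursor = 0

    def mark_dirty(self, i):
        self.dirty[i] = True

    def tick(self):
        """Examine one segment; return how many buffers were written."""
        n = len(self.dirty)
        end = min(self.cursor + self.segment, n)
        written = 0
        for i in range(self.cursor, end):
            if self.dirty[i]:
                self.dirty[i] = False   # stands in for starting the write
                written += 1
        self.cursor = 0 if end >= n else end  # wrap around after a full pass
        return written
```

The total work per pass is still O(n), but no single tick touches more than one segment, so the worst-case pause is roughly 1/60 of what a whole-cache scan would cost.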
Yes, it performs a lot better. First of all, empirically (it's my
job to measure these things). Secondly, "feel" -- we installed the fix
on our production machines, and then took it off so that I could
provide before and after measurements of jerkiness on a real system.
I was almost lynched when I took it off. It's back now (down, down,
angry programmers!)
There are still a few other O(n^2) algorithms in UNIX (remember,
simple, not sophisticated, algorithms? Uh-huh), but I think the buffer
cache was the biggy. Tell me if it's still a problem after you update
to R3V6 -- I know how to fix some of 'em, just need the time and
justification (I cannot go fixing things that we have no evidence are
problems - not without real good reason).
Motorola System V/68 R3V6 is not, I believe, formally released yet,
and you may want to double-check that the "syncfix" functionality is
in it.
For the moment, if you have a Motorola System V/68 R3V5 or earlier system,
and are having trouble with interactive response, you might:
-- reduce NBUF (to reduce the number of buffers you need to scan;
take a look at your buffer cache hit statistics
to see if you really need all those buffers)
-- turn on FILEHARDN (with the problems mentioned above)
-- change BDFLUSHR
This is one that hasn't yet been mentioned, that you might
want to consider. However, it's a bit of a toughie:
The O(n^2) behaviour I describe above is really
more like O(n*d), where d is the number of dirty buffers
(if d=c*n, then O(n^2)).
If your workload's buffer writing characteristics are
such that you dirty a lot of different buffers, you may want
to reduce BDFLUSHR (increase the rate at which scanning is done).
This way, you'll be writing the data out more frequently,
but hopefully fewer buffers will have been dirtied each time,
so the scan will take less time - you'll have smaller jerks,
more frequently.
However, if you are constantly redirtying the same buffers,
then a higher flush rate will just mean that you're writing out
more data - probably not good. In this case, you might increase
BDFLUSHR (frankly, I would reduce NBUF first - but I'm paranoid
about reliability).
If you really feel daring, you can patch the value of
"bdflushr" on the fly in your kernel:
    bdflushcnt = tune.t_bdflushr
Change tune.t_bdflushr. (In R3V6 you have to change a different
variable). This way you could dynamically try out a few
values, and see which you prefer.
NB. THIS IS NOT MOTOROLA RECOMMENDED STANDARD PROCEDURE!!!
We do not recommend changing tuneable parameters except via
sysgen, and any potential damage you do is on your own head.
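The BDFLUSHR trade-off above can be put into a back-of-the-envelope model. This is a sketch under stated assumptions, not a measurement: it takes the scan cost as n*d, treats the flush interval as a number of seconds, and uses two made-up workloads (one dirtying distinct buffers at a steady rate, one redirtying a fixed "hot" set). All names are hypothetical.

```python
def spread_workload(nbuf, dirty_per_sec, interval):
    """Workload that dirties *different* buffers at a steady rate."""
    d = min(dirty_per_sec * interval, nbuf)   # dirty buffers at scan time
    return {"dirty_per_scan": d,
            "scan_cost": nbuf * d,            # the O(n*d) term
            "writes_per_sec": d / interval}


def hot_workload(nbuf, hot_buffers, interval):
    """Workload that keeps redirtying the *same* set of buffers."""
    d = min(hot_buffers, nbuf)                # same d whatever the interval
    return {"dirty_per_scan": d,
            "scan_cost": nbuf * d,
            "writes_per_sec": d / interval}


if __name__ == "__main__":
    for interval in (2, 1):
        print(interval,
              spread_workload(2000, 100, interval),
              hot_workload(2000, 100, interval))
```

In this model, halving the interval halves the per-scan cost for the spread workload while the write rate stays the same, but for the hot workload the per-scan cost is unchanged and the write rate doubles.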
Hope this helps.
If there was a standard newsgroup for Motorola SYSTEM V/68 (and 88) systems,
I'd cross-post to it.
...
Now, finally - mind if I be commercial for just a little bit? (I'm
normally a really good net.citizen, talking about anything *except* my
company's products, but I've got this character flaw: I'm proud of the
company I work for (and I only work for companies I'm proud of)):
I'm sorry that some of you "don't hold your breath" for help from
Motorola, but -- Motorola Microcomputer Division (the part of Motorola
that sells computer systems as opposed to parts) is full of people trying
to make our products better. Yeah, we've had problems, but we're getting
better quickly. We've been challenged to produce the same sort of quality
in computer systems, hardware and software, that other parts of Motorola
put into chips and communications equipment. What 99.9999% defect free
means to software isn't always clear, but it certainly means solving
our customers' problems. So keep those bug reports coming - it may take
a while to get 'em fixed, but we're gonna.
End of inspirational commercial.
Please, please, please - report those bugs to your sales office or
customer support. I didn't know about this buffer cache scanning
problem until I started working on a system with a lot of memory
myself.
This isn't a commercial - I know that other companies have difficulty
getting customers to report problems. Hell - I know that when I was
a sysadmin at school I was often too lazy to report bugs. But, believe
it or not, people at system shops actually do look at your problem
reports. Keep 'em coming!!
--
Andy "Krazy" Glew, Motorola MCD, aglew at urbana.mcd.mot.com
1101 E. University, Urbana, IL 61801, USA. {uunet!,}uiucuxc!udc!aglew
My opinions are my own; I indicate my company only so that the reader
may account for any possible bias I may have towards our products.