I remember hearing about Berkeley's new semaphore scheme, which depends on shared memory and uses an atomic test-and-set instruction to avoid making syscalls. Does anybody know how (or whether) this will be modified to work on machines without such an instruction?
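
To be concrete about the kind of fast path I mean, here is a minimal sketch. It is only my guess at the general shape of the idea, not Berkeley's actual code: the names `shm_lock`, `shm_lock_try`, and `shm_lock_release` are made up, and C11's `atomic_flag` stands in for the hardware test-and-set instruction. The lock word is assumed to live in a segment shared by the cooperating processes.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: one lock word placed in a shared-memory segment. */
typedef struct {
    atomic_flag locked;
} shm_lock;

/* Put the lock into the "free" state. */
static void shm_lock_init(shm_lock *l)
{
    atomic_flag_clear(&l->locked);
}

/*
 * Fast path: one atomic test-and-set, no kernel entry.
 * Returns true if the caller now holds the lock, false if it was
 * already held (the caller would then fall back to a syscall to sleep).
 */
static bool shm_lock_try(shm_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Release the lock so the next test-and-set succeeds. */
static void shm_lock_release(shm_lock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The uncontended case never enters the kernel, which is the whole attraction. On a machine with no such instruction, `shm_lock_try` has nothing to be built from, hence the question.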