VAX-11/750 bugs
utzoo!decvax!ucbvax!ihnss!houxi!houxm!houxg!lime!vax135!jfr
Wed Mar 10 11:13:37 AEST 1982
re: VAX-11/750 bugs
The VAX-11/750 has a long (and continuing) history of bugs
in memory management vs. the CALLS instruction.
I have an operating system for the VAX-11 that features
demand paging and the equivalent of TENEX PMAP.
The system has been running on the VAX-11/780 for
about 2 years. Due to microcode/hardware bugs,
I have had an extremely difficult time getting the
system to run on the VAX-11/750. Even with the latest
release (my SID registers say 02005E03), I need a
heuristic software patch to correct a microcode error.
An exchange of messages with Bill Munson in February 1982
leaves me with the distinct impression that DEC is not
interested in doing anything effective to fix the bugs
until July 1983, if ever.
Here is a synopsis:
August 1980: My operating system won't run on Greg
Chesson's beta-test Comet (11/750), microcode version <50.
Dennis Ritchie determines that "CALLS $0,..." with
a write-protected stack reports the faulting address
as the contents of PC rather than the contents of SP.
A new set of ROMs fixes the problem on Greg's machine,
but the general fix is not promised until level 62,
and many level-less-than-62 machines are shipped from
the factory to customers.
July 1981: I get two 11/750s, level 62. The fault address
is now correct, but the fault type word doesn't always
say "write or modify intent"; usually it's zero ("read").
After a week of intensive debugging, I produce a standalone
program which gives pure garbage for the fault type
parameter, and (7/30/81) send the program to Peter Jessel
and Armando Stettner. Response is "Yeah, there's a problem,
we'll look into it."
August 28, 1981: I send a followup message requesting a
schedule for the fix. No response.
Fall 1981: Jessel leaves DEC; problem languishes. Same bad
behavior occurs on other level-62 750s. Meanwhile
I find a heuristic which detects and patches around
the error. The heuristic has not failed yet, but we
have a very light load on the 750s.
January 1982: My 11/750s are upgraded to level 94. The
standalone program still bombs in the same way, except
that the system ID register says 02005E03.
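The SID values quoted in this synopsis and in the transcripts can be unpacked mechanically. The following is a minimal sketch, assuming the field layout implied by the transcript annotation "02003EFF ;; 11/750, level 62 microcode": CPU type in the top byte and microcode level in bits 15:8. Treating the low byte as a hardware revision is an assumption, and `decode_sid` with its field names is purely illustrative.

```python
def decode_sid(sid):
    """Split a VAX-11/750 SID longword into fields (layout is an assumption)."""
    return {
        "cpu_type": (sid >> 24) & 0xFF,        # 2 on the machines in this report
        "microcode_level": (sid >> 8) & 0xFF,  # 0x3E = 62, 0x5E = 94
        "hardware_rev": sid & 0xFF,            # assumed meaning of the low byte
    }


if __name__ == "__main__":
    for sid in (0x02003EFF, 0x02005E03):
        f = decode_sid(sid)
        print(f"{sid:08X}: type={f['cpu_type']} "
              f"microcode={f['microcode_level']} "
              f"hw_rev=0x{f['hardware_rev']:02X}")
```

Under that assumed layout, 02003EFF decodes to microcode level 62 and 02005E03 to level 94, which agrees with the levels named above.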
Here is a copy of the message I sent on July 30, 1981:
*****************************************************************
re: 11/750 CALLS on write-protected stack
I am having trouble with the memory management on a VAX-11/750.
The fault parameter word for an access control violation does
not always have bits set as described in the VAX Hardware Handbook
(1980-81 p.76 Fig. 4-17). In particular, a CALLS instruction
in user mode with zero parameters and with the stack valid but
write-protected, sometimes results in a parameter word of 0
instead of 4.
I include a console transcript with appropriate registers and
memory locations examined. I also include a standalone program
which can be deposited and run, producing a different and
even more horrible fault parameter word.
John F. Reiser
Bell Laboratories 4F-635
Holmdel, NJ 07733
(201) 949-3942
vax135!jfr
===================================================================
Console transcript which gets fault parameter word of 0
for CALLS on readonly stack
-------------------------------------------------------------------
>>>B/1
%%
*unix.uerr2
real mem = 1048576
free mem = 896000
# cat /etc/rc
date >>/dev/console
rm -f /etc/mtab
/etc/mount /dev/rp0h /usr
/usr/lib/ex3.6preserve -a
cd /tmp
rm -f *
cd /
rm -f /usr/spool/uucp/STST.* /usr/spool/uucp/LCK.*
rm -f /usr/spool/lpd/lock
/etc/update&
/etc/cron&
/etc/dzkload >>/dev/console
# ;; <ctrl-D> typed to enter multiuser mode
80000CCE 06 ;; the fault in question
>>>E P
00C00004 ;; kernel mode, kernel stack
>>>E/G E
G 0000000E 7FFFFFF0
>>>E/V 7FFFFFF0 ;; the fault parameter words
P 0002FDF0 00000000 ;; should be 00000004
>>>E
P 0002FDF4 7FFFF510 ;; faulting address
>>>E
P 0002FDF8 00003453 ;; pc
>>>E
P 0002FDFC 03C00004 ;; psl
>>>E/I 3E ;; SID register
I 0000003E 02003EFF ;; 11/750, level 62 microcode
>>>E/I 11 ;; SCBB
I 00000011 00000200
>>>E/P 220 ;; access control violation vector
P 00000220 80000CC8
>>>E/V 80000CC8 ;; the fault handler code itself
P 00000CC8 126E00D1 ;; CMPL $0,(SP)
>>>E ;; BNEQ 1$
P 00000CCC 1AE10001 ;; HALT
>>>E ;;1$:
P 00000CD0 00010CAE
>>>E
P 00000CD4 AED03FBB
>>>E/I 8 ;; current mapping registers
I 00000008 8001FE00 ;; P0BR
>>>E
I 00000009 00000025 ;; P0LR
>>>E
I 0000000A 7F820000 ;; P1BR
>>>E
I 0000000B 001FFFF7 ;; P1LR
>>>E/V 8001FFE0 ;; page table for end of P1
P 0002C5E0 20000000 ;; 7ffff000
>>>E
P 0002C5E4 20000000
>>>E
P 0002C5E8 FD00015F ;; 7ffff400
>>>E
P 0002C5EC FD00017D
>>>E
P 0002C5F0 E4000181 ;; 7ffff800
>>>E
P 0002C5F4 E4000180
>>>E
P 0002C5F8 E000017F
>>>E
P 0002C5FC E400017E
>>>E/V 7FFFF510 ;; the faulting address
P 0002BF10 20000000
>>>E
P 0002BF14 00000000
>>>E
P 0002BF18 20000000
>>>E
P 0002BF1C 7FFFF584
>>>E
P 0002BF20 7FFFF55C
>>>E/V 3453 ;; code which caused the fault
P 0002F853 48CF00FB ;; CALLS $0,^W...(pc)
>>>E
P 0002F857 CF00FBF2
>>>E/I 3 ;; USP
I 00000003 7FFFF52C ;; same page as fault address
>>>
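The page-table lookups in the transcript above can be re-derived by hand. The sketch below assumes 512-byte pages and the usual VAX PTE layout (valid bit 31, protection code in bits 30:27, modify bit 26, page frame number in bits 20:0). The protection-code table is written from memory of the VAX architecture and should be treated as an assumption; the KW and UR entries do agree with the comments in the standalone program. All function names are illustrative.

```python
PAGE_SHIFT = 9  # 512-byte pages

# Protection codes for PTE bits 30:27 (assumed table; index 1 is reserved).
PROT_NAMES = ["NA", "reserved", "KW", "KR", "UW", "EW", "ERKW", "ER",
              "SW", "SREW", "SRKW", "SR", "URSW", "UREW", "URKW", "UR"]


def p1_pte_address(p1br, va):
    """System virtual address of the PTE that maps P1 address `va`."""
    vpn = (va & 0x3FFFFFFF) >> PAGE_SHIFT
    return (p1br + 4 * vpn) & 0xFFFFFFFF


def decode_pte(pte):
    """Break a page table entry into the fields used in this report."""
    return {
        "valid": bool((pte >> 31) & 1),
        "prot": PROT_NAMES[(pte >> 27) & 0xF],
        "modified": bool((pte >> 26) & 1),
        "pfn": pte & 0x1FFFFF,
    }


def physical_address(pte, va):
    """Physical address of `va`, given the PTE that maps it."""
    return (decode_pte(pte)["pfn"] << PAGE_SHIFT) | (va & 0x1FF)


if __name__ == "__main__":
    fault_va = 0x7FFFF510
    print(hex(p1_pte_address(0x7F820000, fault_va)))
    print(decode_pte(0xFD00015F))
    print(hex(physical_address(0xFD00015F, fault_va)))
```

With P1BR = 7F820000, the PTE for 7FFFF510 lies at 8001FFE8, the third longword of the dump that starts at 8001FFE0. Its value FD00015F decodes as valid with protection UR, which is the "valid but write-protected" stack page, and the resulting physical address 0002BF10 matches the console's answer to E/V 7FFFF510.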
=====================================================================
Standalone program for producing bad fault parameter word
---------------------------------------------------------------------
#
# page contents
# 0 this program
# 1 SCB
# 2 SCB UNIBUS extension
# 3 HALTs
#
.set PCBB,0x10
.set SCBB,0x11
.set SBR,0x0c
.set SLR,0x0d
.set MAPEN,0x38
.set TBIA,0x39
# p.3 is HALTs
movc5 $0,(r0),$0,$0x200,*$0x600
# SCB on p.1
movab *$0x200,r0
mtpr r0,$SCBB
# vectors 000 through 0fc halt at same offset on p.3
movl $0x100/4,r2
L100:
movab 0x80000400(r0),(r0)+
sobgtr r2,L100
# vectors 100 through 3fc rei
movl $(0x400-0x100)/4,r2
L200:
movl $0x80000000+_rei,(r0)+
sobgtr r2,L200
nop
jmp *$0x80000000+ready
ready:
movl $0x80000000+istack,sp
mtpr $pcb,$PCBB
mtpr $sbr,$SBR
mtpr $4,$SLR
mtpr $1,$TBIA
mtpr $1,$MAPEN
ldpctx
rei
foo:
.word 0
calls $0,foo
halt
.align 2
_rei:
rei
.align 2
sbr:
.long 0x90000000 # V KW page0
.long 0x90000001 # V KW page1
.long 0x90000002 # V KW page2
.long 0x90000003 # V KW page3
p0br:
.long 0xf8000000 # V UR page0
pcb:
.long 0x80000000+kstack,-1,-1,ustack
.long 0,0,0,0,0,0,0,0,0,0,0,0,0,0 # r0 through r13(fp)
.long foo+2,0x03c00000 # pc, psl
.long 0x80000000+p0br, 0x04000001 # P0
.long 0x7f800000+p0br+4,0x001fffff # P1 on top of P0
.long 0,0,0,0
istack:
.long 0,0,0,0
kstack:
.long 0,0,0,0,0,0,0
ustack:
-----------------------------------------------------------------------
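The vector-setup loop in the program above points each of the first 64 SCB vectors at a HALT at the same offset on page 3, so the PC reported at the halt identifies which vector was taken. A minimal sketch of that arithmetic follows, assuming HALT is a one-byte instruction so that the reported PC is one past the HALT. The constant and function names are illustrative; the single vector name in the table is taken from the first transcript, where SCBB is 200 and location 220 is labelled the access control violation vector.

```python
SCB_PHYS = 0x200           # SCB lives on page 1 (movab *$0x200,r0)
HALT_PAGE_VA = 0x80000600  # page 3, seen through system space

# Only the entry named in the transcript; other offsets left out on purpose.
VECTOR_NAMES = {0x20: "access control violation"}


def vector_target(offset):
    """Value stored by `movab 0x80000400(r0),(r0)+` for SCB offset `offset`."""
    return 0x80000400 + SCB_PHYS + offset


def vector_from_halt_pc(pc):
    """Recover the SCB offset from the PC shown after the HALT."""
    return (pc - 1) - HALT_PAGE_VA


if __name__ == "__main__":
    off = vector_from_halt_pc(0x80000621)
    print(hex(off), VECTOR_NAMES.get(off, "unknown"))
```

Applied to the halt PC 80000621 in the execution transcript, this gives offset 0x20, the same vector that is examined at physical 220 there.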
Execution of above program
>>>I
>>>D/P/L 0 60002C
>>>D + 9F02008F
>>>D + 600
>>>D + 2009F9E
>>>D + DA500000
>>>D + 8FD01150
>>>D + 40
>>>D + E09E52
>>>D + 80800004
>>>D + D0F652F5
>>>D + C08F
>>>D + 8FD05200
>>>D + 8000006C
>>>D + F652F580
>>>D + 3F9F1701
>>>D + D0800000
>>>D + F48F
>>>D + 8FDA5E80
>>>D + 84
>>>D + 708FDA10
>>>D + C000000
>>>D + DA0D04DA
>>>D + 1DA3901
>>>D + 20638
>>>D + EF00FB00
>>>D + FFFFFFF7
>>>D + 0
>>>D + 2
>>>D + 90000000
>>>D + 90000001
>>>D + 90000002
>>>D + 90000003
>>>D + F8000000
>>>D + 80000104
>>>D + FFFFFFFF
>>>D + FFFFFFFF
>>>D + 120
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 61
>>>D + 3C00000
>>>D + 80000080
>>>D + 4000001
>>>D + 7F800084
>>>D + 1FFFFF
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>D + 0
>>>
>>>E P
041F0000
>>>S 0
80000621 06
>>>E P
00C00000
>>>E/G E
G 0000000E 800000F4
>>>E/V 800000F4
P 000000F4 00800010 ;; pure garbage
>>>E
P 000000F8 00000108
>>>E
P 000000FC 00000061
>>>E
P 00000100 03C00000
>>>E/I 3E
I 0000003E 02003EFF
>>>E/I 11
I 00000011 00000200
>>>E/P 200
P 00000200 80000600
>>>E
P 00000204 80000604
>>>E
P 00000208 80000608
>>>E
P 0000020C 8000060C
>>>E
P 00000210 80000610
>>>E
P 00000214 80000614
>>>E
P 00000218 80000618
>>>E
P 0000021C 8000061C
>>>E
P 00000220 80000620
>>>E
P 00000224 80000624
>>>E/V 80000620
P 00000620 00000000
>>>E/I 8
I 00000008 80000080
>>>E
I 00000009 00000001
>>>E
I 0000000A 7F800084
>>>E
I 0000000B 001FFFFF
>>>E/V 80000080
P 00000080 F8000000
>>>E
P 00000084 80000104
>>>E/V 108
P 00000108 00000000
>>>
-----------------------------------------------------------------------
If the user-mode stack pointer in the assembled PCB above is changed
to 0x80000000 and the program is run, I get a correct fault parameter
word of 00000004.
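For comparison, the three fault parameter words seen in this report (00000000, 00000004 and 00800010) can be decoded with the sketch below. Only bit 2, "write or modify intent", is confirmed by the text above. The meanings given to bits 0 and 1 (length violation, fault during a PTE reference) are recalled from the VAX architecture and should be read as assumptions, as should the treatment of all higher bits as undefined. `decode_fault_param` is an illustrative name.

```python
def decode_fault_param(word):
    """Decode an access-control-violation fault parameter word."""
    return {
        "length_violation": bool(word & 0x1),  # assumed meaning of bit 0
        "pte_reference": bool(word & 0x2),     # assumed meaning of bit 1
        "write_or_modify": bool(word & 0x4),   # bit 2, per the report
        "unexpected_bits": word & 0xFFFFFFF8,  # anything else is suspect
    }


if __name__ == "__main__":
    for word in (0x00000000, 0x00000004, 0x00800010):
        print(f"{word:08X}", decode_fault_param(word))
```

The word 00000004 decodes cleanly as a write, 00000000 as a read, and 00800010 has none of the three low bits set while carrying bits outside them, which is what "pure garbage" amounts to under these assumptions.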