Sunday, September 24, 2017

What's the time?

Previously in this blog: ...two ways from here: either fix the floppy emulation, or make OFW for 40p with no floppy...

... or skip the call. You know, I have an armed debugger here and am not afraid to use it. So just turn the fatal call:


/usr/lib/methods/cfgfda_isa -2 -l fda0

into something harmless, like:

/bin/echo -2 -l fda0

by using

set *(int *) 0x200c11a8 = 0x2f62696e
set *(int *) 0x200c11ac = 0x2f656368
set *(int *) 0x200c11b0 = 0x6f000000

Well actually it probably should have been "/usr/bin/echo", there is no "/bin/echo" in the system. But obviously the attempt above was good enough for AIX, as it doesn't really need the floppy disk adapter (nor mouse & keyboard which I had to hack in a similar way at the second attempt). This brings AIX here:


Completed method for: fda0, Elapsed time = 0
Return code = 127
*** no stdout ****
***** stderr *****
sh: /usr/lib/methods/cfgfda_isa:  not found

Method error (/usr/lib/methods/cfgfda_isa -2 -l fda0 ):
        0514-068 Cause not known.
...
exec(/../usr/sbin/lqueryvg,-phdisk0,-L)
exec(/../usr/bin/grep,00000000000000000000000000000000)
exec(/usr/bin/dosread,-S,/preload,/preload)
exec(/usr/lpp/bosinst/datadaemon)
exec(/../usr/bin/sleep,1)

Where it hangs forever. And now the problem is sort of obvious. Yesterday I wrote that the boot log hadn't had shown any hint. But it did:


Time: 0 LEDS: 0x539
...
Time: 0 LEDS: 0x78a
...
Completed method for: bus0, Elapsed time = 0
...
Time: 0 LEDS: 0x539
...
Time: 0 LEDS: 0x868
...
Completed method for: scsi0, Elapsed time = 0
...

See? The clock is not ticking (it's probably caused by a QEMU bug, that "loadvm" command sometimes doesn't restore one of the machine timers. And I used the command a lot during the yesterdays session).

So basically there are two scenarios:
 - the clock is ticking - in this case AIX doesn't start any methods after spawning the init process
 - the clock is stopped - in this case it starts the methods up to the point where the timeouts are important. Probably if the clock had worked properly the boot process wouldn't had stopped at the floppy detection method.

Which means that debug process is getting real complicated. Now I have to debug the kernel scheduler, which is tricky. And obviously is different from AIX 4.2 which doesn't hang at that point.

The KDB from 5.1 has some features to see the scheduled timers, but I'm not sure it can be used to debug the interrupt handling. At least Solaris kadb was not good for debugging the interrupts, as it made a lot of side effects, and mostly hanged the system right after setting the breakpoint.

So, the good news: the most of the QEMU's 40p model devices are working properly. The bad news: finding a black sheep in a dark room is pretty hard.

Saturday, September 23, 2017

Some experiments with AIX 5.1

Since I could not find the AIX 4.2 install for Motorola, I gave AIX 5.1 under qemu-system-ppc a shot. The feelings are mixed, on one hand I've got no reference machine to check things, on the other hand the KDB debugger in AIX 5.1 is much more powerful than in 4.2. The initialization process of 5.1 is close to 4.2,  so I can recognize some structures. Which is good: the version 4.2 is quite different from 4.1.4 which I tried first. So I was afraid they made an equal leap in 4.x->5.x transition. Well, partially they did. Although the function names are more or less the same, the debugger made a great leap forward.

This stack trace looked like a flashback.

[01DE1BCC]init_pcicfg+000000 (2FF3A910 [??])
[01DE1380]config_pal+000030 (??)
[01DE12F8]config_planar_pal+0001D8 (??, ??)
[004832AC]config_kmod+000184 (??, ??, ??)
[004836E4]sysconfig+000104 (??, ??, ??)
[00003A94].sys_call+000000 ()
[10002668]cfgpal_rspc+0003E8 ()
[100016C0]main+000110 (??, ??, ??)
[10000188]__start+000088 ()

Under 4.2 it was like this:

(gdb) bt
#0  0x018d2b7c in ?? () -- pci_rw
#1  0x00088114 in ?? ()
#2  0x00088114 in ?? ()
#3  0x018d1db0 in ?? () -- init_crashdump 0x018d1d70
#4  0x018cf410 in ?? () -- config_pal  0x018cf35c
#5  0x018cf30c in ?? () -- config_planar_pal 0x018cf100
#6  0x000f9b3c in ?? () -- config_kmod 0x000f9a5c, 2 params, size 0x118
#7  0x000f9eb8 in ?? ()
#8  0x000037a8 in ?? ()

See the formatting differences and gaps? That's because under 4.2 I had to make the trace manually. Did I mention that the 4.2 debugger is eighties style? So, now I'm really enjoying the luxury of having a modern tool.

Also there is a possibility to make the output verbose:

KDB(0)> mw enter_dbg
enter_dbg+000000:  00000000  = 42
n_core+000000:  00000032  = .
KDB(0)>

But then there are also some bad news. There are still bugs (or missing features) in qemu. Even worse, there is at least one Heisenbug. Some times it gets to the PCI initialization and sometimes not. And in the cases where it doesn't get to PCI init it's really unclear why: it just sits in the idle loop, interrupts are enabled, and it receives the interrupts from the timer. Just for some reason it thinks there is nothing to do. Debugging such cases is a real nightmare.
So, I thought maybe go as far it can in case where it does reach PCI init and see any clues in the log.
No obvious clues, but here goes a pretty long log:

Time: 0 LEDS: 0x539
Number of running methods: 0
 cfgmgr LED{539}
----------------
Attempting to configure device 'bus0'
 cfgmgr LED{78A}
Time: 0 LEDS: 0x78a
Invoking /usr/lib/methods/cfgbus_pci -1 -l bus0
exec(/bin/sh,-c,/usr/lib/methods/cfgbus_pci -1 -l bus0)
Number of running methods: 1
exec(/usr/lib/methods/cfgbus_pci,-1,-l,bus0)
Breakpoint
.bus_register+000000     mflr    r0                  <01DEADA0>
KDB(0)> g
exec(/bin/sh,-c,/usr/lib/methods/define_rspc -c bus -s pci -t isa -p bus0 -w 88 -L 04-A0 -d)
exec(/usr/lib/methods/define_rspc,-c,bus,-s,pci,-t,isa,-p,bus0,-w,88,-L,04-A0,-d)
exec(/bin/sh,-c,/usr/lib/methods/cfgbus_isa -1 -l bus1)
exec(/usr/lib/methods/cfgbus_isa,-1,-l,bus1)
Breakpoint
.bus_register+000000     mflr    r0                  <01DF8FF0>
KDB(0)> g
exec(/bin/sh,-c,/usr/lib/methods/define_rspc -d -c adapter -s isa_sio -t fda -p bus1 -w PNP0700ffffffff -L 01-B0)
exec(/usr/lib/methods/define_rspc,-d,-c,adapter,-s,isa_sio,-t,fda,-p,bus1,-w,PNP0700ffffffff,-L,01-B0)
exec(/bin/sh,-c,/usr/lib/methods/define_rspc -d -c adapter -s isa_sio -t isa_keyboard -p bus1 -w PNP0303ffffffff -L 01-D0)
exec(/usr/lib/methods/define_rspc,-d,-c,adapter,-s,isa_sio,-t,isa_keyboard,-p,bus1,-w,PNP0303ffffffff,-L,01-D0)
exec(/bin/sh,-c,/usr/lib/methods/define_rspc -d -c adapter -s isa_sio -t isa_mouse -p bus1 -w PNP0F03ffffffff -L 01-E0)
exec(/usr/lib/methods/define_rspc,-d,-c,adapter,-s,isa_sio,-t,isa_mouse,-p,bus1,-w,PNP0F03ffffffff,-L,01-E0)
exec(/bin/sh,-c,/usr/lib/methods/define_rspc -d -c adapter -s isa_sio -t s1a -p bus1 -w PNP05011 -L 01-F0)
exec(/usr/lib/methods/define_rspc,-d,-c,adapter,-s,isa_sio,-t,s1a,-p,bus1,-w,PNP05011,-L,01-F0)
exec(/bin/sh,-c,/usr/lib/methods/define_rspc -c adapter -s pci -t ncr810 -p bus0 -w 96 -L 04-B0 -d)
exec(/usr/lib/methods/define_rspc,-c,adapter,-s,pci,-t,ncr810,-p,bus0,-w,96,-L,04-B0,-d)
----------------
Completed method for: bus0, Elapsed time = 0
Return code = 0
***** stdout *****
:devices.isa_sio.IBM000E :devices.isa_sio.PNP0400 :devices.pci.22100020
fda0,sioka0,sioma0,sa0,scsi0

*** no stderr ****
----------------
Time: 0 LEDS: 0x539
Number of running methods: 0
 cfgmgr LED{539}
----------------
Attempting to configure device 'fda0'
Method: /usr/lib/methods/cfgfda_isa not in boot image, configure in phase 2
----------------
Attempting to configure device 'sioka0'
Method: /usr/lib/methods/cfgkm_isa not in boot image, configure in phase 2
----------------
Attempting to configure device 'scsi0'
 cfgmgr LED{868}
Time: 0 LEDS: 0x868
Invoking /usr/lib/methods/cfgncr_scsi -1 -l scsi0
exec(/bin/sh,-c,/usr/lib/methods/cfgncr_scsi -1 -l scsi0)
exec(/usr/lib/methods/cfgncr_scsi,-1,-l,scsi0)
exec(/bin/sh,-c,/etc/methods/define -c disk -s scsi -t osdisk -p scsi0 -w 0,0)
exec(/etc/methods/define,-c,disk,-s,scsi,-t,osdisk,-p,scsi0,-w,0,0)
exec(/bin/sh,-c,/etc/methods/define -c cdrom -s scsi -t oscd -p scsi0 -w 2,0)
exec(/etc/methods/define,-c,cdrom,-s,scsi,-t,oscd,-p,scsi0,-w,2,0)
Number of running methods: 1
----------------
Completed method for: scsi0, Elapsed time = 0
Return code = 0
***** stdout *****
hdisk0 cd0
*** no stderr ****
----------------
Time: 0 LEDS: 0x539
Number of running methods: 0
 cfgmgr LED{539}
----------------
Attempting to configure device 'hdisk0'
Method: /etc/methods/cfgscdisk not in boot image, configure in phase 2
----------------
Attempting to configure device 'cd0'
 cfgmgr LED{723}
Time: 0 LEDS: 0x723
Invoking /etc/methods/cfgsccd -1 -l cd0
exec(/bin/sh,-c,/etc/methods/cfgsccd -1 -l cd0)
exec(/etc/methods/cfgsccd,-1,-l,cd0)
Number of running methods: 1
----------------
Completed method for: cd0, Elapsed time = 0
Return code = 0
*** no stdout ****
*** no stderr ****
----------------
Time: 0 LEDS: 0x539
Number of running methods: 0
 cfgmgr LED{539}
----------------
Time: 0 LEDS: 0x538
Invoking top level program -- "/usr/lib/methods/deflvm"
 cfgmgr LED{538}
exec(/bin/sh,-c,/usr/lib/methods/deflvm )
 cfgmgr LED{539}
Time: 0 LEDS: 0x539
Return code = 127
*** no stdout ****
***** stderr *****
sh: /usr/lib/methods/deflvm:  not found

Method error (/usr/lib/methods/deflvm):
        0514-068 Cause not known.
sh: /usr/lib/methods/deflvm:  not found

----------------
Time: 0 LEDS: 0x538
Invoking top level program -- "/usr/lib/methods/fdarcfgrule"
 cfgmgr LED{538}
exec(/bin/sh,-c,/usr/lib/methods/fdarcfgrule )
 cfgmgr LED{539}
Time: 0 LEDS: 0x539
Return code = 127
*** no stdout ****
***** stderr *****
sh: /usr/lib/methods/fdarcfgrule:  not found

Method error (/usr/lib/methods/fdarcfgrule):
        0514-068 Cause not known.
sh: /usr/lib/methods/fdarcfgrule:  not found

----------------
Time: 0 LEDS: 0x538
Invoking top level program -- "/usr/lib/methods/defssar"
 cfgmgr LED{538}
exec(/bin/sh,-c,/usr/lib/methods/defssar )
 cfgmgr LED{539}
Time: 0 LEDS: 0x539
Return code = 127
*** no stdout ****
***** stderr *****
sh: /usr/lib/methods/defssar:  not found

Method error (/usr/lib/methods/defssar):
        0514-068 Cause not known.
sh: /usr/lib/methods/defssar:  not found

 cfgmgr LED{FFF}
Configuration time: 0 seconds
+ 1> /etc/filesystems
+ /usr/lib/methods/showled 0x517
exec(/usr/lib/methods/showled,0x517)
 showled LED{517}
+ bootinfo -b
exec(/usr/sbin/bootinfo,-b)
exec(/usr/lib/boot/bin/bootinfo_rspc,-b)
+ mount -v cdrfs -o ro /dev/cd0 /SPOT
exec(/usr/sbin/mount,-v,cdrfs,-o,ro,/dev/cd0,/SPOT)
exec(/usr/bin/sh,-c,/usr/sbin/wlmcntrl -u -d "" > /dev/null 2>&1)
+ [ 0 -ne 0 ]
+ /usr/lib/methods/showled 0x512
exec(/usr/lib/methods/showled,0x512)
 showled LED{512}
+ /SPOT/usr/bin/rm -r /etc/init /usr/bin /usr/lib/boot /usr/lib/drivers/ataide /usr/lib/drivers/ataidepin /usr/lib/drivers/cfs.ext /usr/lib/drivers/idecdrom /usr/lib/drivers/idecdrompin /usr/lib/drivers/isa /usr/lib/drivers/pci /usr/lib/drivers/planar_pal_rspc /usr/lib/drivers/scdisk /usr/lib/drivers/scdiskpin /usr/lib/methods/cfgataide /usr/lib/methods/cfgbus_isa /usr/lib/methods/cfgbus_pci /usr/lib/methods/cfgidecdrom /usr/lib/methods/cfgncr_scsi /usr/lib/methods/cfgsccd /usr/lib/methods/cfgsys_rspc /usr/lib/methods/chggen /usr/lib/methods/chggen_rspc /usr/lib/methods/define /usr/lib/methods/define_rspc /usr/lib/methods/defsys /usr/lib/methods/showled /usr/lib/methods/ucfgdevice /usr/sbin
exec(/SPOT/usr/bin/rm,-r,/etc/init,/usr/bin,/usr/lib/boot,/usr/lib/drivers/ataide,/usr/lib/drivers/ataidepin,/usr/lib/drivers/cfs.ext,/usr/lib/drivers/idecdrom,/usr/lib/drivers/idecdrompin,/usr/lib/drivers/isa,/usr/lib/drivers/pci,/usr/lib/drivers/planar_pal_rspc,/usr/lib/drivers/scdisk,/usr/lib/drivers/scdiskpin,/usr/lib/methods/cfgataide,/usr/lib/methods/cfgbus_isa,/usr/lib/methods/cfgbus_pci,/usr/lib/methods/cfgidecdrom,/usr/lib/methods/cfgncr_scsi,/usr/lib/methods/cfgsccd,/usr/lib/methods/cfgsys_rspc,/usr/lib/methods/chggen,/usr/lib/methods/chggen_rspc,/usr/lib/methods/define,/usr/lib/methods/define_rspc,/usr/lib/methods/defsys,/usr/lib/methods/showled,/usr/lib/methods/ucfgdevice,/usr/sbin)
...
Attempting to configure device 'fda0'
 cfgmgr LED{828}
Time: 0 LEDS: 0x828
Invoking /usr/lib/methods/cfgfda_isa -2 -l fda0
exec(/bin/sh,-c,/usr/lib/methods/cfgfda_isa -2 -l fda0)
Number of running methods: 1
exec(/usr/lib/methods/cfgfda_isa,-2,-l,fda0)

Now it hangs on the floppy disk adapter init. Looks like there is no timeout. Strange.
There are two ways from here: either fix the floppy emulation, or make OFW for 40p with no floppy...

Saturday, September 16, 2017

AIX under QEMU boots up to NFS

Launching a proprietary OS under QEMU is never boring because every next problem has to do with yet another component. A few weeks ago I was mostly doing Forth to make a bootable firmware, then I fought with the missing residual data, at which point it was mostly debugging ODM database using the PPC assembly, then extracted the mock residual data from the live system, then spent some time with NCR/LSI script, and now after fixing the PCI layout it gets to the point of starting the NIS and NFS services. It works much slower than the real machine, and also much slower than Linux/PPC, but still:

MOT PowerStack2 (e0), Serial #0, 62 MiB memory installed
Open Firmware , Built  September 01, 2017 16:11:38
Copyright (c) 1995-2000, FirmWorks.
Copyright (c) 2014,2017, Artyom Tarasenko.

Rebooting with command: boot /pci/scsi@2/disk@0,0
Boot device: /pci/scsi@2/disk@0,0  Arguments:

+ swcons -c

Saving Base Customize Data to boot disk
Starting the sync daemon
Starting the error daemon
System initialization completed.
Starting Multi-user Initialization
 Performing auto-varyon of Volume Groups
 Activating all paging spaces
swapon: Paging device /dev/hd6 activated.
 Performing all automatic mounts
mount: 1831-010 server axxxxs01 not responding:
RPC: 1832-018 Port mapper failure - RPC: 1832-006 Unable to send
mount: backgrounding
axxxxs01:/home
Multi-user initialization completed
Checking for srcmstr active...complete
Starting tcpip daemons:
0513-056 Timeout waiting for command response.
0513-056 Timeout waiting for command response.
vmtune:  current values:
  -p       -P        -r          -R         -f       -F       -N        -W
minperm  maxperm  minpgahead maxpgahead  minfree  maxfree  pd_npages maxrandwrt
   2968    11872       2          8        115      123     524288        0

  -M       -w       -k       -c         -b          -B          -u
maxpin   npswarn  npskill  numclust  numfsbufs   hd_pbuf_cnt  lvm_bufcnt
  12692     1536      384        0       93           64           9

number of valid memory pages = 15864    maxperm=74.8% of real memory
maximum pinable=80.0% of real memory    minperm=18.7% of real memory
number of file memory pages = 1443      numperm=9.1% of real memory

vmtune:  new values:
  -p       -P        -r          -R         -f       -F       -N        -W
minperm  maxperm  minpgahead maxpgahead  minfree  maxfree  pd_npages maxrandwrt
   2968    11872       2          8        115      123     524288       64

  -M       -w       -k       -c         -b          -B          -u
maxpin   npswarn  npskill  numclust  numfsbufs   hd_pbuf_cnt  lvm_bufcnt
  12692     1536      384        1       93           64           9

number of valid memory pages = 15864    maxperm=74.8% of real memory
maximum pinable=80.0% of real memory    minperm=18.7% of real memory
number of file memory pages = 1444      numperm=9.1% of real memory

Starting NIS services:
Starting NFS services:

0513-056 Timeout waiting for command response.
NIS: server not responding for domain "axxxxs01"; still trying.
NIS: server not responding for domain "axxxxs01"; still trying.
0513-056 Timeout waiting for command response.
NIS: server not responding for domain "axxxxs01"; still trying.
NIS: server not responding for domain "axxxxs01"; still trying.
NIS: server not responding for domain "axxxxs01"; still trying.

Woo-hoo! I could be proud that I made the second proprietary OS boot under QEMU. Or even the third one, because Solaris/sun4m is quite different from Solaris/sun4v, so probably each one counts for one.

But now I'm puzzled. Initially I planned to build a throw-away prototype for Powestack II Utah, fix the QEMU bugs preventing booting AIX with the existing machines, and dispose the prototype. The Motorola AIX I've got seems to expect the machine to be called "MOT PowerStack2 (e0)". Not sure if pushing such a model upstream wouldn't cause any copyright/trademark issues.

Now it boots to the same point as on the physical machine, but except for the PCI layout problems, I haven't found any bugs. And the PCI layout can not be the reason of AIX failing on the 40p machine, because it fails much earlier.

So obviously it's time to change the plan. And here is where things are going to slow down. I don't have the install media for the Motorola AIX 4.2, so I can not make my disk usable - it waits for NFS/NIS servers forever. I can not share the HDD image because it may contain the private data, so publishing the Powerstack II Utah target for QEMU makes a little sense for now. So, either I find the Motorola AIX 4.2 install CD (and probably a physical UW-SCSI drive if it won't work at the first attempt under QEMU), or the work has to be done again for the 40p target. Yes, now I have some know-how about the AIX boot process, but still at the moment I don't feel like starting again from scratch with the 40p target.

The good news is that AIX can definitely be booted under qemu-system-ppc.

Saturday, September 9, 2017

From SCSI to PCI

20 years ago SCSI devices ruled the world. Reading the NCR53c810 manual, I see that it was basically a computer in a computer. It can be programmed to transfer data from/to/between the disks without using the CPU at all.

Trying to understand what happens in the NCR/LSI script:

lsi_scsi: Select LUN 0
lsi_scsi: Extended message 0x1 (len 3)
lsi_scsi: SDTR (ignored)
lsi_scsi: SCRIPTS dsp=81c3b924 opcode 80080000 arg 81c3b9a4
lsi_scsi: Jump to 0x81c3b9a4
lsi_scsi: SCRIPTS dsp=81c3b9a4 opcode 870b0000 arg 81c3b9c4
lsi_scsi: Compare phase 2 == 7
lsi_scsi: Control condition failed
lsi_scsi: SCRIPTS dsp=81c3b9ac opcode 860a0000 arg 81c3b94c
lsi_scsi: Compare phase 2 == 6
lsi_scsi: Control condition failed
lsi_scsi: SCRIPTS dsp=81c3b9b4 opcode 98080000 arg 00000022
lsi_scsi: Interrupt 0x00000022
...
lsi_scsi: SCRIPTS dsp=81c3bb6c opcode 0e000002 arg 81c3bdb4
lsi_scsi: MSG out len=2
lsi_scsi: Select LUN 0
lsi_scsi: MSG: ABORT tag=0x0
lsi_scsi: SCRIPTS dsp=81c3bb74 opcode 80080000 arg 81c3bbbc
lsi_scsi: Jump to 0x81c3bbbc
lsi_scsi: SCRIPTS dsp=81c3bbbc opcode 60000008 arg 00000000
lsi_scsi: Clear ATN

Looks like it aborts if the selected SCSI target doesn't change phase to MSG_OUT or MSG_IN. So I implemented a hack for SDTR reply and it doesn't abort here. But indeed it's a red herring. AIX can also work with the devices which do not support the synchronous or wide transfers.

The actual problem happens later:

lsi_scsi: SCRIPTS dsp=81c43444 opcode c0000004 arg 010000dc
lsi_scsi: memcpy dest 0x81c435fc src 0x010000dc count 4

Or, with a bit more enhanced logging:

lsi_scsi: memcpy dest 0x81c475fc (Mem) src 0x010000dc (IO) count 4
lsi_mem_read, address_space_read status 2
lsi_scsi: the first 4 bytes: 00 00 00 00

It tries to read the port 0x10000dc and save it. QEMU doesn't have anything at the port 0x10000dc, so no wonder the NCR script fails. But what is supposed to be there? The Motorola Ultra 603/Ultra 603e/Ultra 604 Programmer’s Reference Guide suggests it must be the PCI I/O space.

So looks like I've learned enough of NCR/LSI script. Time to see how the PCI bus mastering is supposed to work on this machine.

Saturday, September 2, 2017

More fun with AIX cfgncr_scsi

<in the previous part>... doesn't find SCSI disks. Here it is tricky, it may be a problem with the interrupt routing, or DMA or SCSI host emulation...

... or a bug in the AIX driver itself.

As AIX 4.2 tries to perform scsi inquiry, that's what happens in the QEMU log:

(qemu) lsi_scsi: Write reg ??? ac = e4
lsi_scsi: Write reg ??? ad = 38
lsi_scsi: Write reg ??? ae = c4
lsi_scsi: Write reg ??? af = 81

The register at 0xac-0xaf is DSA Relative Selector (DRS). Is known to qemu, but seems to be not used in any operations.

The newer LSI53c1010-66 manual says:

"This register supplies AD[63:32] during Table Indirect
Fetches and Load/Store Data Structure Address (DSA)
relative operations"

So, maybe just add the support of this register to QEMU and allow the 64 bit DMA transfers, right?

Wrong. The write to this register is the last write and it doesn't start any SCSI command. Let's look where it happens:

p8xx_start_chip:
...
   0x018ff854:  stw     r8,-4(r7)
   0x018ff858:  li      r4,44         ; 0x2c
   0x018ff85c:  b       0x18fe348     ; p8xx_write_reg <= write happens here

The register r4 is 0x2c, but the procedure writes to 0xac. Weird.

Let's look at the other registers:

0x018fe368 in ?? ()
(gdb) info registers r3 r5 r4
r3             0x18f5000        26169344
r5             0x31000080       822083712
r4             0x2c     44

What's that 80 at the end of r5? 0x80 + 0x2c is 0xac. Coincidence? Don't think so.

So, what happens here is the driver tries to write 0x2c, but the bus is shifted, so it hits 0xac. After some chasing I found where this shift is coming from:

p8xx_config:
...
   0x018fdb9c:  bl      0x1909938
   0x018fdba0:  lwz     r2,20(r1)
   0x018fdba4:  cmpwi   cr1,r3,19
   0x018fdba8:  beq     cr1,0x18fdbc8
   0x018fdbac:  li      r8,1
   0x018fdbb0:  stb     r8,256(r28)
   0x018fdbb4:  li      r3,0
   0x018fdbb8:  bl      0x1909938
   0x018fdbbc:  lwz     r2,20(r1)
   0x018fdbc0:  lwz     r8,160(r28)
   0x018fdbc4:  b       0x18fdbd0
   0x018fdbc8:  stb     r29,256(r28)
   0x018fdbcc:  lwz     r8,160(r28)
   0x018fdbd0:  lis     r11,4096
   0x018fdbd4:  addic   r10,r8,128    ; this is the 0x80 I'm looking for
   0x018fdbd8:  li      r8,-1
   0x018fdbdc:  rlwinm  r31,r26,1,15,30
   0x018fdbe0:  addic   r23,r28,10580
   0x018fdbe4:  stw     r10,252(r28)
   0x018fdbe8:  li      r25,1
   0x018fdbec:  addi    r30,r28,0
   0x018fdbf0:  stwu    r25,10512(r30)
   0x018fdbf4:  stw     r8,10588(r28)
   0x018fdbf8:  lwz     r8,10500(r28)
   0x018fdbfc:  stw     r11,10520(r28)
   0x018fdc00:  stw     r10,10532(r28) ; and here it is stored
...

It is added and stored unconditionally. If I drop this addic, something different happens:

(qemu) lsi_scsi: Write reg DSP0 2c = e4
lsi_scsi: Write reg DSP1 2d = 58
lsi_scsi: Write reg DSP2 2e = c4
lsi_scsi: Write reg DSP3 2f = 81
lsi_scsi: SCRIPTS dsp=81c458e4 opcode 41000000 arg 81c45a44
lsi_scsi: Selected target 0
lsi_scsi: SCRIPTS dsp=81c458ec opcode 78370000 arg 00000000
lsi_scsi: Read-Modify-Write reg 0x37 MOV data8=0x00 sfbr=0x00
...

Why would it work on the physical hardware? I guess because the addresses are aliased. Pretty similar to the le bug in Solaris.

So, it's not that QEMU has some unimplemented registers. In this case it has too many implemented ones.

On the other hand, it still doesn't detect the scsi disk, so maybe it has not just too much features, but too few as well...

/Stay tuned