Previously in this blog: ...two ways from here: either fix the floppy emulation, or make OFW for 40p with no floppy...
... or skip the call. You know, I have an armed debugger here and am not afraid to use it. So just turn the fatal call:
/usr/lib/methods/cfgfda_isa -2 -l fda0
into something harmless, like:
/bin/echo -2 -l fda0
by using
set *(int *) 0x200c11a8 = 0x2f62696e
set *(int *) 0x200c11ac = 0x2f656368
set *(int *) 0x200c11b0 = 0x6f000000
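For reference, the three words above are just the ASCII bytes of "/bin/echo" plus a terminating NUL, packed as big-endian 32-bit integers (the guest is a PowerPC machine). A minimal Python sketch of that packing follows. The helper names pack_words and gdb_commands are made up for illustration, and it assumes that 0x200c11a8 is where the path string starts in memory:

```python
import struct


def pack_words(text: str) -> list[int]:
    """Pack an ASCII string into NUL-padded big-endian 32-bit words."""
    data = text.encode("ascii") + b"\x00"   # C string terminator
    data += b"\x00" * (-len(data) % 4)       # pad up to a word boundary
    return [word for (word,) in struct.iter_unpack(">I", data)]


def gdb_commands(base: int, text: str) -> list[str]:
    """Render one debugger 'set' command per packed word (hypothetical helper)."""
    return [
        f"set *(int *) {base + 4 * i:#x} = {word:#010x}"
        for i, word in enumerate(pack_words(text))
    ]


# Assumption: the path string starts at this address, as in the commands above.
for line in gdb_commands(0x200C11A8, "/bin/echo"):
    print(line)
```

Passing "/usr/bin/echo" instead produces four words rather than three, which still fits inside the original 27-character method path.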
Well, actually it probably should have been "/usr/bin/echo", since there is no "/bin/echo" on the system. But the attempt above was evidently good enough for AIX, as it doesn't really need the floppy disk adapter (nor the mouse and keyboard, which I had to hack in a similar way on the second attempt). This brings AIX here:
Completed method for: fda0, Elapsed time = 0
Return code = 127
*** no stdout ****
***** stderr *****
sh: /usr/lib/methods/cfgfda_isa: not found
Method error (/usr/lib/methods/cfgfda_isa -2 -l fda0 ):
        0514-068 Cause not known.
...
exec(/../usr/sbin/lqueryvg,-phdisk0,-L)
exec(/../usr/bin/grep,00000000000000000000000000000000)
exec(/usr/bin/dosread,-S,/preload,/preload)
exec(/usr/lpp/bosinst/datadaemon)
exec(/../usr/bin/sleep,1)
Where it hangs forever. And now the problem is sort of obvious. Yesterday I wrote that the boot log hadn't shown any hint. But it had:
Time: 0 LEDS: 0x539
...
Time: 0 LEDS: 0x78a
...
Completed method for: bus0, Elapsed time = 0
...
Time: 0 LEDS: 0x539
...
Time: 0 LEDS: 0x868
...
Completed method for: scsi0, Elapsed time = 0
...
See? The clock is not ticking. It's probably caused by a QEMU bug: the "loadvm" command sometimes doesn't restore one of the machine timers, and I used that command a lot during yesterday's session.
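A stuck clock like this is easy to miss by eye, so it may be worth checking a captured boot log mechanically. The sketch below is only an illustration: it assumes the log has been saved as text and that timestamps look exactly like the "Time: N" fields shown above. The function name clock_is_ticking is made up.

```python
import re

# Matches the "Time: N" stamps; "Elapsed time = N" is deliberately not matched.
TIME_RE = re.compile(r"\bTime:\s*(\d+)")


def clock_is_ticking(log_text: str) -> bool:
    """Return True if the 'Time:' stamps take more than one distinct value."""
    stamps = {int(value) for value in TIME_RE.findall(log_text)}
    return len(stamps) > 1


# Excerpt in the same format as the boot log above.
sample = """\
Time: 0 LEDS: 0x539
Time: 0 LEDS: 0x78a
Completed method for: bus0, Elapsed time = 0
Time: 0 LEDS: 0x539
Time: 0 LEDS: 0x868
Completed method for: scsi0, Elapsed time = 0
"""

if not clock_is_ticking(sample):
    print("clock appears to be stopped")
```

A log with only one timestamp also reports "not ticking", so the check is only meaningful once a few methods have run.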
So basically there are two scenarios:
- the clock is ticking - in this case AIX doesn't start any methods after spawning the init process
- the clock is stopped - in this case it starts the methods up to the point where the timeouts become important. If the clock had worked properly, the boot process probably wouldn't have stopped at the floppy detection method.
Which means that the debugging process is getting really complicated. Now I have to debug the kernel scheduler, which is tricky. And the scheduler obviously behaves differently from the one in AIX 4.2, which doesn't hang at that point.
The KDB from 5.1 has some features for inspecting the scheduled timers, but I'm not sure it can be used to debug interrupt handling. At least the Solaris kadb was no good for debugging interrupts: it caused a lot of side effects and usually hung the system right after a breakpoint was set.
So, the good news: most of the devices in QEMU's 40p model are working properly. The bad news: finding a black sheep in a dark room is pretty hard.