Saturday, January 25, 2025

Self-hosting OpenSolaris under qemu-system-sparc64 -M niagara

Back to the Niagara target, which I haven't touched since 2017. Back then I was playing with the Machine Description files, but AFAIR OpenSolaris and Debian had different expectations. Unfortunately I didn't write down my experiments, so I don't remember exactly how I created these machine definitions. The format itself is documented in FWARC 2005/115, but it takes time to understand: in some chapters it looks like a unidirectional directed acyclic graph, while other chapters refer to "fwd" and "back". Anyway, I've uploaded whatever I produced to my GitHub. At least it's now possible to have 1 GiB of RAM.

Having 1 GiB allows QEMU to boot various OpenSolaris/illumos distributions, for instance the last dilos release for sun4v (dilos-net-1.3.7.136-sparc64.iso), but let’s start with the OpenSolaris image ramdisk.snv-b77-nd.no-boot-time-network.gz, which was released as part of OpenSparc T1. It is optimized for the Niagara machine that QEMU emulates. In particular, it has the hsimd driver for a RAM disk. The other illumos distributions don’t seem to have it, although it’s released under GPLv2, so it should be possible to build it anywhere.

The v9os and Tribblix distributions use a boot archive and have many more features, which makes them too heavy for QEMU at its current performance. Booting Tribblix takes about an hour on my laptop. The sun4v emulation can definitely be optimized significantly; I will get back to this topic later.

As the name suggests, the image has no network support. So, let’s add it. Not having a network card seems to be a challenge, but then again, back in the nineties I didn’t have a network card either. There is a serial line, and this is enough to start hacking. At my university we dialed into a UNIX console, executed slirp and then started pppd.

Chapter 1.  Experimenting with networking

The gunzipped snv_77 image can be mounted, for instance, under the QEMU sun4m emulation or on any physical machine. Maybe it can even be mounted read/write under Linux, but RHEL9/OL9 doesn’t have the ufs.ko driver out of the box, so I haven’t tried it.

Initially I planned to use the authentic slirp.10c.sol24sparc binary, but alas, it wasn’t archived, and my google-fu was not strong enough to find it anywhere. So I used a Solaris 9 machine to compile the binary myself. It wanted some crypt libraries which differ between my Solaris 9 installation and OpenSolaris snv_77. I don’t need any encryption for the communication as long as I stay on localhost, so I simply did

sed -e 's#crypt#nocrypt#g' configure > configure-nocrypt && chmod +x configure-nocrypt && ./configure-nocrypt && make
Let's see if it works on snv_77:
# /usr/local/bin/slirp -P
Slirp v1.0.16 (BETA) 

Copyright (c) 1995,1996 Danny Gasparovski and others. 
All rights reserved. 
This program is copyrighted, free software. 
Please read the file COPYRIGHT that came with the Slirp 
package for the terms and conditions of the copyright. 

IP address of Slirp host: 192.168.186.100 
[none found] 
Your address is 10.0.2.15 
(or anything else you want) 

Type five zeroes (0) to exit. 

[talking PPP, 115200 baud] 

SLiRP Ready ...
Nice! At this point I killed the user socat session and used another socat to create a virtual serial port.
socat pty,link=/dev/snv77,raw UNIX:/tmp/snv77.sock
Then I started pppd, and I swear I could hear a phantom modem connect sound. The “anything else” message gave me the idea of specifying a different address (I already have 10.0.2.15 on another interface), but the el9 pppd failed to negotiate it with slirp. So I kept 10.0.2.15 for the moment (later I changed it to 10.0.5.15). Let’s see if the machine is reachable.
$ telnet 192.168.186.100 
Trying 192.168.186.100...
And nothing happens. And actually, where is this 192.168.186.100 coming from? Oh, it’s defined in /etc/hosts. The physical FPGA machine for this image would have this address on its network card. But I don’t have a network card. The only one out there is the loopback with 127.0.0.1, which cannot be used for obvious routing reasons. No problem, another loopback to the rescue:
# ifconfig lo0:1 plumb 
# ifconfig lo0:1 192.168.186.100 up
Is it reachable now?
$ telnet 192.168.186.100 
Trying 192.168.186.100... 
Connected to 192.168.186.100. 
Escape character is '^]'. 
login: root 
Last login: Tue Jan 19 06:46:15 on console 
Sun Microsystems Inc.   SunOS 5.11      snv_77  October 2007 
#
Good. Telnet is nice, but extremely inconvenient for transferring files. And since I’m on a nineties trip, let’s use rsh for authenticity. Luckily I still have the OpenSolaris b77 DVD with rsh and all the necessary libraries. So I added in.rshd from the SUNWrcmdr package and /usr/lib/libcmd.so.1 from SUNWcsl.
$ rsh 192.168.186.100 
::ffff:192.168.186.100: Connection refused
What? Why? This happens because rsh without a command works like rlogin, which talks to a totally different daemon on a different port (513 instead of 514). I think this is a violation of the main UNIX principle: one program should do just one thing. So, rsh went against the rules and where is it now?
rsh -l root 192.168.186.100 ls -l
This one just hangs.

In my previous post I was wondering whether anyone used rsh to execute commands on remote hosts back in the nineties. I was sure I used to do it, but found no success reports on the Net.

I even thought that this was further evidence of the Mandela Effect: the only reference I could find stated that sending commands over rsh did not work.

I’ve looked at the code and found that internally slirp acts as a proxy, executing another rsh and piping the data back to the client. So I simply removed the support for rsh, 

$  git diff
diff --git a/src/ctl.h b/src/ctl.h
index 4a8576d..3518fb3 100644
--- a/src/ctl.h
+++ b/src/ctl.h
@@ -3,5 +3,5 @@
 #define CTL_ALIAS      2
 #define CTL_DNS                3
 
-#define CTL_SPECIAL    "10.0.2.0"
-#define CTL_LOCAL      "10.0.2.15"
+#define CTL_SPECIAL    "10.0.5.0"
+#define CTL_LOCAL      "10.0.5.15"
diff --git a/src/tcp_subr.c b/src/tcp_subr.c
index c14755a..0049778 100644
--- a/src/tcp_subr.c
+++ b/src/tcp_subr.c
@@ -563,7 +563,7 @@ struct tos_t tcptos[] = {
          {0, 23, IPTOS_LOWDELAY, 0},   /* telnet */
          {0, 80, IPTOS_THROUGHPUT, 0}, /* WWW */
          {0, 513, IPTOS_LOWDELAY, EMU_RLOGIN|EMU_NOCONNECT},   /* rlogin */
-         {0, 514, IPTOS_LOWDELAY, EMU_RSH|EMU_NOCONNECT},      /* shell */
+/* don't         {0, 514, IPTOS_LOWDELAY, EMU_RSH|EMU_NOCONNECT},       shell */
          {0, 544, IPTOS_LOWDELAY, EMU_KSH},            /* kshell */
          {0, 543, IPTOS_LOWDELAY, 0},  /* klogin */
          {0, 6667, IPTOS_THROUGHPUT, EMU_IRC}, /* IRC */

and added the rsh ports to the port forwarding. With this change, the Solaris in.rshd immediately complained:

 Jan 19 14:22:11 t1-fpga-00 rsh[270]: [ID 521673 daemon.notice] connection from 192.168.186.100 (192.168.186.100) - bad port
Makes sense. Slirp is usually started as a normal user, so it communicates from unprivileged ports, whereas rshd expects the client port to be in the range 512-1023:
 bad_port = (port >= IPPORT_RESERVED ||
		port < (uint_t)(IPPORT_RESERVED/2));
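
The check above can be replayed in the shell to see which client ports pass. This is only a sketch: the helper name rshd_port_ok is made up for illustration, and IPPORT_RESERVED is assumed to be 1024, its usual value in <netinet/in.h>.

```shell
#!/bin/sh
# Assumed value of IPPORT_RESERVED from <netinet/in.h>.
IPPORT_RESERVED=1024

# rshd_port_ok PORT: succeed if the bad_port test quoted above would
# accept PORT, i.e. IPPORT_RESERVED/2 <= PORT < IPPORT_RESERVED.
# (Hypothetical helper, not part of in.rshd.)
rshd_port_ok() {
    [ "$1" -ge $((IPPORT_RESERVED / 2)) ] && [ "$1" -lt "$IPPORT_RESERVED" ]
}

# Try a few ports around both boundaries plus a typical unprivileged one.
for port in 511 512 1023 1024 40000; do
    if rshd_port_ok "$port"; then
        echo "$port: accepted"
    else
        echo "$port: bad port"
    fi
done
```

An unprivileged slirp connects from a high port such as 40000, which is why the daemon logs "bad port".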

Fine, let’s hack in.rshd and remove this check:

 ./gdb-sparc64-solaris  --write -q in.rshd 
Reading symbols from /home/tyom/snv77/prep-rsh/usr/sbin/in.rshd...(no debugging symbols found)...done. 
(gdb) set *(int *) 0x000139c4=0x01000000 
(gdb) quit

And after that I was back where I started: in.rshd doesn’t complain, but “rsh -l root 192.168.186.100 ls -l” still waits for something. Then it occurred to me that I had recently switched my laptop to OL9. I had checked the iptables settings immediately after first encountering the hanging rsh process, and the ports were open. But you know what? RHEL9/OL9 use firewalld by default. And there, connections to ports 513-1023 are indeed not permitted by default, which totally makes sense.

But since I don’t expect to be hacked from my own VM, I’ve permitted these connections. And now

$ rsh -lroot 192.168.186.100 "uname -a" 
SunOS t1-fpga-00 5.11 snv_77 sun4v sparc sun4v

Woo-hoo! We’ve got a networked sparc64 OpenSolaris image which can be used in self-hosting mode. You can put whatever files you want there with

  cat filename | rsh -lroot 192.168.186.100 "cat >filename" 
Then shut it down,
#  init 5
And finally in the qemu monitor,
 (qemu) pmemsave 0x1f40000000 83886080 vdisk.ram

Where 83886080 is the current size of the virtual disk in bytes (it can be checked with ls -l). The next time you boot with vdisk.ram, you’ll find the changes made to the FS before "init 5" and "pmemsave". Of course, I don’t encourage anyone to use rsh; this was done just for fun. If you want to try it yourself, the next chapter describes how to use the image created by the experiments described above.
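
The size argument does not have to be typed by hand. Below is a minimal sketch that builds the monitor command from the size of the image file; the helper name pmemsave_cmd is made up, and the base address 0x1f40000000 is simply the one used above.

```shell
#!/bin/sh
# 83886080 bytes is exactly 80 MiB:
echo $((80 * 1024 * 1024))   # prints 83886080

# pmemsave_cmd IMAGE [OUT]: print the QEMU monitor command that saves a
# RAM disk as large as the file IMAGE into OUT (default: vdisk.ram).
# (Hypothetical helper; it only prints the command, it does not talk to QEMU.)
pmemsave_cmd() {
    [ -r "$1" ] || return 1
    # wc pads its output with spaces on some systems, so strip them
    size=$(wc -c < "$1" | tr -d ' ')
    echo "pmemsave 0x1f40000000 $size ${2:-vdisk.ram}"
}
```

For example, "pmemsave_cmd vdisk.ram" prints the line to paste at the (qemu) prompt for an image of the same size.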

Chapter 2. Using the snv-with-slirp image

The image for the experiments below currently resides here. Unpack it first.

Launching it takes 6 terminals, 5 running with user privileges and 1 as root:

  1. qemu-system-sparc64 -M niagara:
     ./qemu-system-sparc64 -M niagara -L ../1GiB-snv_77/ -m 1024 -nographic -serial unix:/tmp/snv77.sock,server -drive if=pflash,readonly=on,file=../sparc-disks/snv-with-slirp

    At the end of the session, the (qemu) prompt can optionally be used to save the RAM disk contents into a file:
     (qemu) pmemsave 0x1f40000000 83886080 vdisk.ram
  2. socat
     socat STDIO,raw,echo=0 UNIX:/tmp/snv77.sock
    
    This one is used as a temporary helper to submit the boot command, log in, configure the IP address alias and start the slirp process:
     boot -vV
    

    As soon as the login prompt appears
    SunOS Release 5.11 Version snv_77 64-bit
    Copyright 1983-2007 Sun Microsystems, Inc.  All rights reserved.
    Use is subject to license terms.
    os-io Ethernet address = 0:80:3:de:ad:3
    Using default device instance data
    mem = 1048576K (0x40000000)
    avail mem = 945700864
    root nexus = Sun Fire T2000
    pseudo0 at root
    pseudo0 is /pseudo
    scsi_vhci0 at root
    scsi_vhci0 is /scsi_vhci
    virtual-device: hsimd0
    hsimd0 is /virtual-devices@100/disk@0
    root on /virtual-devices@100/disk@0:a fstype ufs
    pseudo-device: dld0
    dld0 is /pseudo/dld@0
    cpu0: UltraSPARC-T1 (cpuid 0 clock 5 MHz)
    pseudo-device: devinfo0
    devinfo0 is /pseudo/devinfo@0
    Hostname: t1-fpga-00
    
    t1-fpga-00 console login:
    
    log in as root and then
    ifconfig lo0:1 plumb
    ifconfig lo0:1 192.168.186.100 up
    echo + >/etc/hosts.equiv
    /usr/local/bin/slirp-1.0.16-no-rsh-emu -P "redir 1023 1023" "redir 1022 1022" "redir 1021 1021"
    
  3. After slirp is started, kill the temporary socat
    killall socat
    
    (Beware that it will also kill any other socat processes of the same user. If you have any, you should use something more clever than killall. I don’t, so it works for me.)

  4. As root:
     socat pty,link=/dev/snv77,raw UNIX:/tmp/snv77.sock & pppd local -detach /dev/snv77
    
    This socat connects the unix socket to pty, which is used by pppd.
     
  5. Optionally start a telnet session. It is handy for seeing what is available out there.
     
  6. This one is used to execute rsh commands on the Solaris guest. It can be used for scripting and file transfer.

Of course, this can be done with far fewer than 6 terminals. QEMU can be connected directly to a pty, and booting and starting slirp can be done, for instance, by a pppd expect script or by changing the guest init sequence. Having the 6 terminals just gives more control and makes it easier to debug if something breaks.
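
For the scripting and file transfer mentioned for terminal 6, a whole directory can be sent through a tar pipe. This is a minimal sketch: push_dir and the RSH variable are names made up here, the -C option assumes GNU tar on the host, and the destination directory must already exist on the guest and contain no spaces.

```shell
#!/bin/sh
# Remote command prefix; the default matches the rsh calls used above.
# Override it (for example RSH="sh -c") to try the pipe locally.
RSH=${RSH:-rsh -lroot 192.168.186.100}

# push_dir SRC DEST: pack the contents of the local directory SRC and
# unpack them into the directory DEST on the other side.
# (Hypothetical helper; $RSH is left unquoted on purpose so that it
# splits into the command and its options.)
push_dir() {
    tar cf - -C "$1" . | $RSH "cd $2 && tar xf -"
}
```

With the default RSH, "push_dir ./build /tmp" would copy the contents of ./build into /tmp on the guest.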

/Happy hacking

Tuesday, January 21, 2025

rsh and slirp

Hey! I'm back with a maybe strange question for those who are aged 35+ (ahem, well, primarily for those a little older than that). Have you used slirp on real machines to connect to the Internet via pppd?

And if yes, did you for instance do something like the following?

 tar cf - myfiles* | rsh somehost tar xf -

Am I the only one who remembers doing this? Or did you immediately start with ssh, and never used rsh because of the well known reasons? Or were you the lucky ones who didn't have to use slirp because you had the network access from your home? :-)

Last week I started a kinda neozeed-style experiment, and hope to have something to share soon.

/Stay tuned

Sunday, June 14, 2020

Running AIX with 2 GiB of RAM and beyond

Trying to find out how much RAM can be given to a PPC PReP machine. In the IBM 40p the PCI host controller is sitting at 0x80000000, which means that in theory 2 GiB can be easily given:

QEMU PReP/40p, Serial #0, 2 GiB memory installed
Open Firmware, built  June 14, 2020 13:25:09
Copyright (c) 1995-2000, FirmWorks.
Copyright (c) 2014,2017,2019,2020 Artyom Tarasenko.

Rebooting with command: boot /pci/scsi@1/disk@0,0:1
Boot device: /pci/scsi@1/disk@0,0:1  Arguments: 

Saving Base Customize Data to boot disk
Starting the sync daemon
Starting the error daemon
System initialization completed.
Starting Multi-user Initialization
 Performing auto-varyon of Volume Groups 
 Activating all paging spaces 
0517-075 swapon: Paging device /dev/hd6 is already active.

And it is even recognized by AIX 5.1:

AIX Version 5
(C) Copyrights by IBM and by others 1982, 2000.
Console login: root
*******************************************************************************
*                                                                             *
*                                                                             *
*  Welcome to AIX Version 5.1!                                                *
*                                                                             *
*                                                                             *
*  Please see the README file in /usr/lpp/bos for information pertinent to    *
*  this release of the AIX Operating System.                                  *
*                                                                             *
*                                                                             *
*******************************************************************************

#  lsattr -El mem0 
size     2048 Total amount of physical memory in Mbytes  False
goodsize 2048 Amount of usable physical memory in Mbytes False
# 
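
The 2 GiB figure follows directly from the controller address: everything below 0x80000000 is free for RAM. A quick sketch of that arithmetic (the function name ram_limit_mib is made up for illustration):

```shell
#!/bin/sh
# ram_limit_mib ADDR: print how many MiB fit below the address ADDR.
# (Hypothetical helper; ADDR may be given in hex, as in the text above.)
ram_limit_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

ram_limit_mib 0x80000000   # prints 2048
```

The result matches the 2048 Mbytes reported by lsattr.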

Now I wonder: there were some 32-bit RS/6000 machines with 3 GiB of RAM; were there any PReP machines among them?

Friday, April 17, 2020

Playing with z/OS on Hercules

My new toy. Initially one of the data sets was damaged, but in the end I managed to get the networking, including FTP and SSH, to work.

TSO login, that's easy


L CICS. Now what?
The 3270 terminals work out of the box. The networking had to be set up.


System log. Finally no red lines.
For some reason ssh is only allowed for the webadm user,  but it's possible to su - ibmuser.

and since OMVS is working, I can ssh
Maybe it's time to learn Cobol now :-)

Saturday, April 13, 2019

PReP/40p updates

Sent the qemu patches for the upstream review. Also fixed a couple of issues in OFW: clock rush and interrupt routing for the PCNet (it still used interrupt 13, which was correct in 2017, but has been changed meanwhile).

Will update the links in the how-to shortly.

/Stay tuned

Saturday, April 6, 2019

The next How-To


Nearly 10 years after writing Solaris/SPARC under QEMU How-To, now it’s time for the AIX/PReP under QEMU How-To.

Back then my strategy was using the Power-On Self Tests and other tests from the original firmware to verify and improve qemu-system-sparc.

This time I took a different approach, as some tests are synthetic and check for typical hardware-specific problems like broken and shorted wires or faulty memory chips. The result of the IBM firmware diagnostics is something like “replace your motherboard”, which is not exactly helpful for finding out, for instance, whether there is a problem with the interrupt or DMA emulation. And yeah, there are some problems with the DMA emulation; that’s why qemu-system-ppc -M 40p cannot use IDE CD-ROMs under AIX, and probably some other DMA devices like the sound card (haven't tried it yet).

The approach this time was to make the emulation good enough and to describe it in such a way that it

  • matches the hardware implemented in QEMU well enough
  • has a driver in AIX

The latter was tricky, as AIX supports only a very limited range of hardware. It checks that your IDE controller is from Winbond (does anyone still remember them?) and checks the exact chip model. It doesn’t care if your chip is compatible, it wants the exact match.

As a result we have a -M 40p model in QEMU which does not perfectly match the physical IBM PPS 6015, and a firmware which describes it in such a way that AIX 5.1 can see the onboard devices.

But anyways, it was fun 10 years ago and it's still fun.


AIX/PReP under QEMU How-To

Fetch the 40p-20190406-aix-boots branch and compile qemu-system-ppc:

./configure --target-list=ppc-softmmu && make

Download the OFW image q40pofw-serial.rom configured for the serial line.

Create an empty hard disk image:

qemu-img create -f qcow2 aix-hdd.qcow2 8G

Concerning the VGA graphics: OFW can utilize the S3-Trio emulation done by Hervé, but AIX 5.1 can’t use it yet. For now, the serial line rules, but if you feel adventurous you can try it by omitting the -vga none -nographic part.

qemu-system-ppc -M 40p -bios q40pofw-serial.rom -serial telnet::4441,server -hda aix-hdd.qcow2 -cdrom /path/to/aix-5.1-cd1.iso -vga none -nographic

Then in another terminal window:
telnet localhost 4441

The following text will appear:

QEMU PReP/40p, Serial #0, 128 MiB memory installed
Open Firmware, built  April 06, 2019 17:47:55
Copyright (c) 1995-2000, FirmWorks.
Copyright (c) 2014,2017,2019 Artyom Tarasenko.

Type any key to interrupt automatic startup
Boot device: /pci/ethernet  Arguments: 
The DHCP server did not specify a boot server

Boot load failed

ok

Once you see the “ok” prompt, type

ok boot cdrom:2
 
Then be patient: it takes some minutes until the first greeting appears, and then some more before the installer starts.
Then answer the installer questions. On my machine the copy process takes nearly one hour. At 93% it stalls after installing “mtools” for something like 10 minutes, and then for another 10 minutes after the “FAILURES” section, but don’t panic, eventually it will continue.

Once the install is done the emulated machine reboots to the “ok” prompt again. Type

ok boot disk

Supported AIX versions

I tested it with AIX 5.1 only. In theory it might work with 4.3.3 – 5.1 (a smoke test shows that at least the installer does start with AIX 4.3.3); let me know if you test it. The 6015 support was officially discontinued in AIX 5.2, and the corresponding drivers were probably removed. I haven’t looked it up, as I don’t have any 5.2 media.

Networking in AIX 4.3.3 - 5.1 under QEMU

It looks like the PCNet driver (aka kent) is broken in AIX. I think the "busio" value used to look different in the previous versions. The networking can still be set up, though. After performing the install, log in as root and do the following (^D and ^C are Control-D and Control-C respectively):

# cat > lance-chg.asc
CuAt:
        name = "ent0"
        attribute = "busio"
        value = "0x01000000"
        type = "O"
        generic = "D"
        rep = "nr"
        nls_index = 3
^D
# odmchange -o CuAt -q "name=ent0 and attribute=busio" lance-chg.asc
# rmdev -l ent0
# mkdev -l ent0
# ifconfig en0 10.0.2.15
# ping 10.0.2.2
PING 10.0.2.2: (10.0.2.2): 56 data bytes
64 bytes from 10.0.2.2: icmp_seq=0 ttl=255 time=4 ms
64 bytes from 10.0.2.2: icmp_seq=1 ttl=255 time=3 ms
^C
#

If you try it with a different AIX version, first check whether busio has to be modified at all:

odmget -q "name=ent0 and attribute=busio" CuAt

If you get

value = "0x01000000"

you don't have to change it.
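
The check and the change can be combined so that the ODM is only touched when needed. The sketch below assumes that odmget prints the value line in the same form as the stanza above; busio_ok is a made-up helper name.

```shell
#!/bin/sh
# busio_ok: read odmget output on stdin and succeed if the busio value
# is already 0x01000000. (Hypothetical helper; the pattern tolerates
# varying whitespace around the "=" sign.)
busio_ok() {
    grep -q 'value[[:space:]]*=[[:space:]]*"0x01000000"'
}

# Intended use on the AIX guest (commands taken from the steps above):
#   odmget -q "name=ent0 and attribute=busio" CuAt | busio_ok ||
#       odmchange -o CuAt -q "name=ent0 and attribute=busio" lance-chg.asc
```

After an odmchange, the rmdev/mkdev steps shown earlier are still required.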

Your feedback is welcome!

Last updated on 2020.02.07