Saturday, January 25, 2025

Self-hosting OpenSolaris under qemu-system-sparc64 -M niagara

Back to the Niagara target, which I haven't touched since 2017. Back then I was playing with the Machine Description files, but as far as I recall OpenSolaris and Debian had different expectations. Unfortunately I didn't write down my experiments, so I don't remember exactly how I created these machine definitions. The format is actually documented in FWARC 2005/115, but it takes time to understand: in some chapters it looks like a unidirectional directed acyclic graph, while other chapters contain references to "fwd" and "back". Anyway, I've uploaded whatever I produced to my GitHub. At least it's now possible to have 1 GiB of RAM.

Having 1 GiB lets QEMU boot various OpenSolaris/illumos distributions, for instance the last dilos release for sun4v (dilos-net-1.3.7.136-sparc64.iso), but let’s start with the OpenSolaris ramdisk.snv-b77-nd.no-boot-time-network.gz, which was released as part of OpenSPARC T1. It is tailored to the Niagara machine that QEMU emulates. In particular, it has the hsimd driver for a RAM disk. The other illumos distributions don’t seem to have it, although it’s released under GPLv2, so it should be possible to build it anywhere.

The v9os and Tribblix distributions use a boot archive and have many more features, which makes them too heavy for QEMU at its current performance. Booting Tribblix takes ~1 hour on my laptop. The sun4v emulation can definitely be optimized significantly; I'll get back to this topic later.

As the name suggests, the image has no network support. So, let’s add it. Not having a network card seems to be a challenge, but then again, back in the nineties I didn’t have a network card either. There is a serial line, and this is enough to start hacking. At my university we dialed into a UNIX console, ran slirp and then started pppd.

Chapter 1. Experimenting with networking

The gunzipped snv_77 image can be mounted, for instance, under QEMU's sun4m emulation or on any physical machine. Maybe it can even be mounted read/write under Linux, but RHEL9/OL9 don’t ship the ufs.ko driver out of the box, so I haven’t tried it.

Initially I planned to use the authentic slirp.10c.sol24sparc binary, but alas, it wasn’t archived, and my google-fu was not strong enough to find it anywhere. So I used a Solaris 9 machine to compile the binary myself. It wanted some crypt libraries, which differ between my Solaris 9 installation and OpenSolaris snv_77. I don’t need any encryption as long as the communication stays on localhost, so I simply did:

sed -e 's#crypt#nocrypt#g' configure > configure-nocrypt && chmod +x ./configure-nocrypt && ./configure-nocrypt && make
Let's see if it's OK for snv_77:
# /usr/local/bin/slirp -P
Slirp v1.0.16 (BETA) 

Copyright (c) 1995,1996 Danny Gasparovski and others. 
All rights reserved. 
This program is copyrighted, free software. 
Please read the file COPYRIGHT that came with the Slirp 
package for the terms and conditions of the copyright. 

IP address of Slirp host: 192.168.186.100 
[none found] 
Your address is 10.0.2.15 
(or anything else you want) 

Type five zeroes (0) to exit. 

[talking PPP, 115200 baud] 

SLiRP Ready ...
Nice! At this point I killed the user socat session and used another socat to create a virtual serial port:
socat pty,link=/dev/snv77,raw UNIX:/tmp/snv77.sock
Then I started pppd, and I swear I could hear a phantom modem connect sound. The “anything else” message gave me the idea of specifying a different address (I already have 10.0.2.15 on another interface), but el9 pppd failed to negotiate it with slirp. So I kept 10.0.2.15 for the moment. (Later I changed it to 10.0.5.15.) Let’s see if the machine is reachable.
$ telnet 192.168.186.100 
Trying 192.168.186.100...
And nothing happens. And actually, where is this 192.168.186.100 coming from? Oh, it’s defined in /etc/hosts. The physical FPGA machine for this image would have this address on its network card. But I don’t have a network card. The only one out there is the loopback with 127.0.0.1, which cannot be used for obvious routing reasons. No problem, another loopback to the rescue:
# ifconfig lo0:1 plumb 
# ifconfig lo0:1 192.168.186.100 up
Is it reachable now?
$ telnet 192.168.186.100 
Trying 192.168.186.100... 
Connected to 192.168.186.100. 
Escape character is '^]'. 
login: root 
Last login: Tue Jan 19 06:46:15 on console 
Sun Microsystems Inc.   SunOS 5.11      snv_77  October 2007 
#
Good. Telnet is nice, but it is extremely inconvenient for transferring files. And since I’m on a nineties trip, let’s use rsh for authenticity. Luckily I still have an OpenSolaris b77 DVD with rsh and all the necessary libraries. So I added in.rshd from the SUNWrcmdr package and /usr/lib/libcmd.so.1 from SUNWcsl.
$ rsh 192.168.186.100 
::ffff:192.168.186.100: Connection refused
What? Why? This happens because rsh without a command works like rlogin, which talks to a totally different daemon on a different port (513 instead of 514). I think this is a violation of the main UNIX principle: one program should do just one thing. So, rsh went against the rules, and where is it now?
$ rsh -l root 192.168.186.100 ls -l
This one just hangs.

In my previous post I was wondering if anyone used rsh to execute commands on remote hosts back in the nineties. I was sure I used to do it, but found no success reports on the Net.

I even thought that this was more evidence of the Mandela Effect: the only reference I could find stated that sending commands over rsh did not work.

I’ve looked at the code and found that internally slirp acts as a proxy, executing another rsh and piping the data back to the client. So I simply removed the support for rsh, 

$  git diff
diff --git a/src/ctl.h b/src/ctl.h
index 4a8576d..3518fb3 100644
--- a/src/ctl.h
+++ b/src/ctl.h
@@ -3,5 +3,5 @@
 #define CTL_ALIAS      2
 #define CTL_DNS                3
 
-#define CTL_SPECIAL    "10.0.2.0"
-#define CTL_LOCAL      "10.0.2.15"
+#define CTL_SPECIAL    "10.0.5.0"
+#define CTL_LOCAL      "10.0.5.15"
diff --git a/src/tcp_subr.c b/src/tcp_subr.c
index c14755a..0049778 100644
--- a/src/tcp_subr.c
+++ b/src/tcp_subr.c
@@ -563,7 +563,7 @@ struct tos_t tcptos[] = {
          {0, 23, IPTOS_LOWDELAY, 0},   /* telnet */
          {0, 80, IPTOS_THROUGHPUT, 0}, /* WWW */
          {0, 513, IPTOS_LOWDELAY, EMU_RLOGIN|EMU_NOCONNECT},   /* rlogin */
-         {0, 514, IPTOS_LOWDELAY, EMU_RSH|EMU_NOCONNECT},      /* shell */
+/* don't         {0, 514, IPTOS_LOWDELAY, EMU_RSH|EMU_NOCONNECT},       shell */
          {0, 544, IPTOS_LOWDELAY, EMU_KSH},            /* kshell */
          {0, 543, IPTOS_LOWDELAY, 0},  /* klogin */
          {0, 6667, IPTOS_THROUGHPUT, EMU_IRC}, /* IRC */

and added the rsh ports to the port-forwarding list. With that change, Solaris in.rshd immediately complained:

 Jan 19 14:22:11 t1-fpga-00 rsh[270]: [ID 521673 daemon.notice] connection from 192.168.186.100 (192.168.186.100) - bad port
Makes sense. Slirp is usually started as a normal user, so it communicates from unprivileged ports, whereas rshd expects the port to be in the range 512-1023:
 bad_port = (port >= IPPORT_RESERVED ||
		port < (uint_t)(IPPORT_RESERVED/2));
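
To make the accepted range explicit, here is a minimal shell sketch of the same predicate. It assumes IPPORT_RESERVED is 1024, as defined in <netinet/in.h>; the function name is hypothetical and is not part of in.rshd.

```shell
# Hypothetical helper mirroring the bad_port test quoted above,
# assuming IPPORT_RESERVED is 1024 as in <netinet/in.h>.
IPPORT_RESERVED=1024

# Succeeds (exit 0) when the client port would pass the check,
# i.e. when it lies in 512..1023.
rshd_port_ok() {
    port=$1
    [ "$port" -lt "$IPPORT_RESERVED" ] &&
        [ "$port" -ge $((IPPORT_RESERVED / 2)) ]
}

for p in 511 512 1023 1024 40000; do
    if rshd_port_ok "$p"; then
        echo "$p accepted"
    else
        echo "$p bad port"
    fi
done
```

Only 512 and 1023 are reported as accepted. A connection proxied by an unprivileged slirp comes from a high port such as 40000, which is why the daemon logs "bad port".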

Fine, let’s hack in.rshd and remove this check (0x01000000 is the encoding of the SPARC nop instruction):

 ./gdb-sparc64-solaris  --write -q in.rshd 
Reading symbols from /home/tyom/snv77/prep-rsh/usr/sbin/in.rshd...(no debugging symbols found)...done. 
(gdb) set *(int *) 0x000139c4=0x01000000 
(gdb) quit

And after that I was back where I started: in.rshd doesn’t complain, but “rsh -l root 192.168.186.100 ls -l” still waits for something. Then it occurred to me that I had recently moved my laptop to OL9. I had checked the iptables settings immediately after first encountering the hanging rsh process, and the ports were open. But you know what? RHEL9/OL9 use firewalld by default. And there, indeed, communication to the reserved ports rsh relies on is not permitted by default, which totally makes sense.

But since I don’t expect to be hacked from my own VM, I’ve permitted these connections. And now

$ rsh -lroot 192.168.186.100 "uname -a" 
SunOS t1-fpga-00 5.11 snv_77 sun4v sparc sun4v

Woo-hoo! We’ve got a networked sparc64 OpenSolaris image which can be used in self-hosting mode. You can copy whatever files you want onto it with

  cat filename | rsh -lroot 192.168.186.100 "cat >filename" 
Then shut it down,
#  init 5
And finally in the qemu monitor,
 (qemu) pmemsave 0x1f40000000 83886080 vdisk.ram

Where 83886080 is the current size of the virtual disk in bytes (it can be checked with ls -l). The next time you boot with vdisk.ram, you’ll find the changes made to the filesystem before "init 5" and "pmemsave". Of course, I don’t encourage anyone to use rsh; this was done just for fun. If you want to try it yourself, the next chapter describes how to use the image created in the experiments above.
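
The size argument is simply a byte count: 83886080 bytes is exactly 80 MiB. As a rough sketch, the monitor command can be generated from the size of an image file. The helper name is hypothetical, and it assumes the file passed in is the raw disk image; only the address 0x1f40000000 and the pmemsave syntax come from the text above.

```shell
# Hypothetical helper: print the QEMU monitor command that dumps the
# RAM disk, taking the byte count from an existing raw image file.
# 0x1f40000000 is the address used in the text above.
pmemsave_cmd() {
    image=$1
    out=${2:-vdisk.ram}
    # wc -c pads its output with spaces on some systems, so strip them.
    size=$(wc -c < "$image" | tr -d ' ')
    echo "pmemsave 0x1f40000000 $size $out"
}

# Sanity check of the number used above: 80 MiB expressed in bytes.
echo $((80 * 1024 * 1024))
```

Running pmemsave_cmd on the disk image prints a line that can be pasted at the (qemu) prompt, which avoids retyping the size if the image is ever resized.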

Chapter 2. Using the snv-with-slirp image

The image for the experiments below currently resides here. Unpack it first.

Launching it takes 6 terminals, 5 running with user privileges and 1 as root:

  1. qemu-system-sparc64 -M niagara:
     ./qemu-system-sparc64 -M niagara -L ../1GiB-snv_77/ -m 1024 -nographic -serial unix:/tmp/snv77.sock,server -drive if=pflash,readonly=on,file=../sparc-disks/snv-with-slirp

    At the end of the session, the (qemu) prompt can optionally be used to save the RAM disk contents into a file:
     (qemu) pmemsave 0x1f40000000 83886080 vdisk.ram
  2. socat
     socat STDIO,raw,echo=0 UNIX:/tmp/snv77.sock
    
    This one is used as a temporary helper to submit the boot command, log in, configure the IP address alias and start the slirp process:
     boot -vV
    

    As soon as the login prompt appears
    SunOS Release 5.11 Version snv_77 64-bit
    Copyright 1983-2007 Sun Microsystems, Inc.  All rights reserved.
    Use is subject to license terms.
    os-io Ethernet address = 0:80:3:de:ad:3
    Using default device instance data
    mem = 1048576K (0x40000000)
    avail mem = 945700864
    root nexus = Sun Fire T2000
    pseudo0 at root
    pseudo0 is /pseudo
    scsi_vhci0 at root
    scsi_vhci0 is /scsi_vhci
    virtual-device: hsimd0
    hsimd0 is /virtual-devices@100/disk@0
    root on /virtual-devices@100/disk@0:a fstype ufs
    pseudo-device: dld0
    dld0 is /pseudo/dld@0
    cpu0: UltraSPARC-T1 (cpuid 0 clock 5 MHz)
    pseudo-device: devinfo0
    devinfo0 is /pseudo/devinfo@0
    Hostname: t1-fpga-00
    
    t1-fpga-00 console login:
    
    log in as root and then run:
    ifconfig lo0:1 plumb
    ifconfig lo0:1 192.168.186.100 up
    echo + >/etc/hosts.equiv
    /usr/local/bin/slirp-1.0.16-no-rsh-emu -P "redir 1023 1023" "redir 1022 1022" "redir 1021 1021"
    
  3. After slirp is started, kill the temporary socat
    killall socat
    
    (Beware that this also kills any other socat processes owned by the same user. If you have any, you should use something more clever than killall. I don’t, so it works for me.)

  4. As root:
     socat pty,link=/dev/snv77,raw UNIX:/tmp/snv77.sock & pppd local -detach /dev/snv77
    
    This socat connects the UNIX socket to a pty, which is then used by pppd.
     
  5. Optionally start a telnet session. It is handy for seeing what is available out there.
     
  6. The last terminal is used to execute rsh commands on the Solaris guest, which is useful for scripting and file transfer.

Of course, this can be done with far fewer than 6 terminals. QEMU can be connected directly to a pty, and booting and starting slirp can be done, for instance, by a pppd expect script or by changing the guest init sequence. Having the 6 terminals just gives more control and makes it easier to debug if something breaks.
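
For the file transfer done from terminal 6, whole directory trees can be pushed with the classic tar pipe instead of one cat per file. This is only a sketch: the function name and the REMOTE variable are hypothetical, the default assumes the guest address and root login used above, and the target directory must already exist.

```shell
# Hypothetical helper: copy a directory tree through a tar pipe.
# REMOTE is the command prefix that runs a shell command on the target;
# the default assumes the guest address and root login used above.
REMOTE=${REMOTE:-"rsh -l root 192.168.186.100"}

push_tree() {
    src=$1    # local directory to send
    dst=$2    # existing directory on the target
    # $REMOTE is intentionally unquoted so it splits into command + args.
    (cd "$src" && tar cf - .) | $REMOTE "cd '$dst' && tar xf -"
}
```

For example, push_tree ./mytools /export/mytools would send the local directory to the guest. Setting REMOTE="sh -c" makes the same function copy between two local directories, which is a cheap way to try it without a running guest.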

/Happy hacking

Tuesday, January 21, 2025

rsh and slirp

Hey! I'm back with a maybe strange question for those aged 35+ (ahem, well, primarily for those a little older than that). Have you used slirp on real machines to connect to the Internet via pppd?

And if yes, did you for instance do something like the following?

 tar cf - myfiles* | rsh somehost tar xf -

Am I the only one who remembers doing this? Or did you immediately start with ssh, and never use rsh for the well-known reasons? Or were you one of the lucky ones who didn't have to use slirp because you had network access from home? :-)

Last week I started a kinda neozeed-style experiment, and I hope to have something to share soon.

/Stay tuned