Sunday, November 22, 2009

Hidden OBP feature found

debugging the initial Power-On-Self-Test of OBP 2.29 I found a secret level a cool undocumented feature, PromDiag. Whenever I turn it on, instead of getting a usual OBP "OK" prompt I get:

PromDiag
NOK>

I wonder what is "NOK"? Does it mean "Not OK"? Anyway, I played with it a little. It runed out that it can launch single POST tests, and there are some more features, which have to be discovered yet. All in all it accepts just a few symbols: numbers, dot, comma, c, h, l, q, r, s:

  • numbers, dot and comma: start a single test or a group of tests, i.e. "7.2.2" would start a single test, "7" would start all 7.* tests. "7.2.1, 7.2.2" starts the two specified tests.
  • h - when starting a group of tests, halt after the first error
  • r - loop (repeat) a specified test
  • l - loop (repeat) a specified test
  • c,s - unknown. I see no effect. Feel free to shed the light if you know what they do.
  • q - quit and start OBP
Running tests or entering the PromDiag mode is controlled by the byte 0x3 in NVRAM.  It only affects the OBP version 2.29 (ss5-170.bin). Patching qemu with m48t59_write(nvram, 3, X) gives the following variants for the X:
  • 0x0x - skip the tests
  • 0x2x - start tests, after the first failure process with the booting.
  • 0x4x - start tests,  run all the tests regardless the status, process with the booting.
  • 0x8x - enter the  PromDiag NOK> mode.

39 comments:

Anonymous said...

У меня есть SunSPARC CLASSIC X без диска, с фактически неизвестным состоянием железа. Если будет интересно взять образ ПЗУ то шлите инструкции. Только пишите их для тупых :-)

atar said...

Образ ПЗУ - интересно. Для этого нужен нуль-модемный кабель и комп, к которому CLASSIC можно этим кабелем подключить. Если то и другое имеется - смело подключайте.
Заодно можно будет понять состояние железа.

Кстати, если не трудно, я предпочёл бы общаться на "ты". Я безнадёжно испорчен временами, когда сеть только появилась, и общаться в сети на Вы мне всё ещё кажется непривычным.

Anonymous said...

Хе, а инструкцию? Или инструмент. Я скажем так очень начинающий пользователь линуха. Но на отдельном винте Lenny стоит. Так что могу и из линуха сделать образ.

atar said...

Не знаю, как это сделать из под линуха. Скорее всего, ядерный драйвер надо писать (аналогичный тому, который я использовал для выдёргивания биоса из рутера, где-то в начале этого блога).

Если есть нульмодемный кабель - имидж можно выдернуть без привлечения внешних средств: нужен только терминал, который может записывать в файл протокол сессии.

У тебя есть нульмодемыный кабель? Как ты с этим спарком общаешься? Если через него, то пришли, что спарк пишет после включения (до того как ОС начнёт грузится), мне там надо несколько адресов посмотреть. После этого - пришлю простую инструкцию.

Anonymous said...

Hello, Артем - great stuff!

I am just working on bringing back to life one SPARCstation 10 and SPARCstation 20 (with different CPUs and lots of graphics cards).

I was thinking about checking out plan9 on those machines, as plan9 is long dead on sparc architecture.

Do you have your patches in your own git repository, so that we can follow up immediately on your progress?

--Marcin

atar said...

Hi Marcin,

I don't have a public git. I submit all the clean things into the main git tree (all of them, except the last scsi one are already accepted in the git/head). Up to now no-one wanted to have the non-clean things (basically they are only needed for OBP).

Did you try to launch your things in the qemu git/HEAD?

Brent said...

This is exciting stuff!

I'm trying to reproduce your results with a clone from today's Qemu git repository. OBP does start up, but I can't seem to do anything with the SCSI. Nothing shows up for "probe-scsi".

Is there some magic incantation to get a disk image to appear, does it not except a QCOW2 image (I prepped it with a Sun disklabel and dd'ed a root filesystem from a Solaris box to the first partition), or are there some important patches that still haven't made it into the Qemu repository?

atar said...

Brent,

git/Head is not there yet. A couple of hacks is still needed. But actually it should fail before probe-scsi, sbus probing should have produced an error. What OBP version (and machine) are you using? Can you post the complete boot log?

Brent said...

I don't think OBP sees a scsi device at all, as you said, although I don't see any SBUS probing errors (or really any sign that it is even looking for SBUS devices; I'm not sure it even sees ethernet).

I've tried SS-5 and SS-20. SS-20 does give an "ESP ERROR: esp_mem_writeb: Unhandled ESP command (a2)", which you've mentioned on the QEMU mailing list. Both give a "Data Access Error" before they get to the prompt.

Here's an SS-20 example (note that I didn't really want the net to be passed through, only to show up as a device):
qemu/sparc-softmmu/qemu-system-sparc -name r2d2 -M SS-20 -bios ./ss20_v2.25_rom -drive file=/internal/virt/r2d2.qcow2,if=scsi -net nic,macaddr=52:54:00:02:01:01 -nographic
Warning: vlan 0 is not connected to host network
ESP ERROR: esp_mem_writeb: Unhandled ESP command (a2)

Power-ON Reset

SMCC SPARCstation 10/20 UP/MP POST version VRV3.45 (09/11/95)


CPU_#0 TI, STP1021PGA(1.x) 1Mb External cache

CPU_#1 ******* NOT installed *******
CPU_#2 ******* NOT installed *******
CPU_#3 ******* NOT installed *******

<<< CPU_00000000 on MBus Slot_00000000 >>> IS RUNNING (MID = 00000008)



$$$$$ WARNING : No Keyboard Detected! $$$$$
MMU ICACHE_TLB bit pattern Test
Case 0000000f: I_TLB mis-matched exp=55555000 obs=00000000 xor= 55555000 entry # 0x00000000
Available Memory 0x08000000
Allocating SRMMU Context Table
Context Table allocated, Available Memory 0x07fc0000
Setting SRMMU Context Register
Context Table allocated, Available Memory 0x07fc0000
Setting SRMMU Context Table Pointer Register
RAMsize allocated, Available Memory 0x07fb0000
Allocating SRMMU Level 1 Table
Level 1 Table allocated, Available Memory 0x07fafc00
Mapping RAM @ 0xffef0000
RAM mapped, Available Memory 0x07fafa00
Mapping ROM @ 0xffd00000
ROM mapped, Available Memory 0x07faf800
Mapping ROM @ 0x00000000
ROM mapped, Available Memory 0x07faf000
ttya initialized
Cpu #0 Data Access Error
ok

test scsi and test net say device not found. test-all hangs on floppy, naturally. probe-scsi and probe-scsi-all report nothing.

Do you happen to have those "unclean" patches handy to make OBP happier? I already applied your wrong but helpful performance patch and your tiny scsi patch that gets rid of the 36 error on probe-scsi (although I haven't gotten far enough for either of them to be helpful). Other patches I found on the list are indeed already applied in GIT.

Unknown said...

No, I run qemu 0.10.5 on FreeBSD. The port for FreeBSD includes lots of patches.

SS-20 invocation:

qemu-system-sparc \
-M SS-20 -m 128 -nographic \
-bios sparc/ss20_v2.25_rom \
-hda sol.qcow2 \
-cdrom sol-9-905hw-ga-sparc-dvd.iso \
-boot d

gives:

ESP ERROR: esp_mem_writeb: Unhandled ESP command (a2)

Power-ON Reset
SMCC SPARCstation 10/20 UP/MP POST version VRV3.45 (09/11/95)


CPU_#0 TI, STP1021PGA(1.x) 1Mb External cache

CPU_#1 ******* NOT installed *******
CPU_#2 ******* NOT installed *******
CPU_#3 ******* NOT installed *******

<<< CPU_00000000 on MBus Slot_00000000 >>> IS RUNNING (MID = 00000008)



$$$$$ WARNING : No Keyboard Detected! $$$$$
MMU ICACHE_TLB bit pattern Test
Case 0000000f: I_TLB mis-matched exp=55555000 obs=00000000 xor= 55555000 entry # 0x00000000
Available Memory 0x08000000
Allocating SRMMU Context Table
Context Table allocated, Available Memory 0x07fc0000
Setting SRMMU Context Register
Context Table allocated, Available Memory 0x07fc0000
Setting SRMMU Context Table Pointer Register
RAMsize allocated, Available Memory 0x07fb0000
Allocating SRMMU Level 1 Table
Level 1 Table allocated, Available Memory 0x07fafc00
Mapping RAM @ 0xffef0000
RAM mapped, Available Memory 0x07fafa00
Mapping ROM @ 0xffd00000
ROM mapped, Available Memory 0x07faf800
Mapping ROM @ 0x00000000
ROM mapped, Available Memory 0x07faf000
ttya initialized
Cpu #0 TI,TMS390Z55
Cpu #1 Nothing there
Cpu #2 Nothing there
Cpu #3 Nothing there
Probing Memory Bank #0 Nothing there
Probing Memory Bank #1 Nothing there
Probing Memory Bank #2 Nothing there
Probing Memory Bank #3 Data Access Error
ok

power-off works as well as some Forth commands (i.e. devalias). printenv fails with:

ok printenv
Parameter Name Value Default Value

tpe-link-test? Data Access Error

Very similar results with SS-10
(only no esp_mem_writeb error).

printenv fails the same way, simple forth commands work.

With SS-5 and the ss5.bin image power-off fails with Data Access Error. It works on SS-10 and SS-20.

Btw. I might have an access to oscilloscope and live SS-10 and SS-20. However, I have no clue how to use it (my colleagues may be of some help but I need instructions what we need).

--Marcin

atar said...

Just hoped that you have some OBP version which I haven't seen and which would perform better.

Actually after playing with the POST tests I have an impression, vanilla qemu can also work without my hacks at least with some versions of OBP. A good value for the "-icount" option could do the trick.

Unfortunately I won't have the time to test it till Sunday.

atar said...

the previous was for Brent.

Saper, 10.5 wouldn't work. There are two critical CPU bugs, at least one of them is definitely present in your listing. "Nothing there" vs. "Data Access Error" is caused by a bug I described here and here. My scsi and irq fixes are probably also not in.

You need to use the git version.

Brent said...

I've been trying different -icount values with no luck. They all give varying numbers of "Bad clock read" messages. I get "qemu: fatal: Raised interrupt while not in I/O function" on SS-5 attempts and qemu dies. Boot attempts crash OBP on SS-20 ("qemu: fatal: Trap 0x29 while interrupts disabled, Error state"), and probe-scsi still gives nothing.

Of course, I have no clue as to how to pick an icount value...

atar said...

Brent,

neither do I. :) Actually I asked about "qemu: fatal: Raised interrupt while not in I/O function" in the mailing list just a hour ago. I didn't see "Trap 0x29" though.

I can publish my hacks if you need them, but not before the next Monday.

What are you going to do once you have a working OBP?

Brent said...

Once I have a working OBP, I plan to build up a working Solaris 2.6 image derived from our existing Sun server environment and phase out our old UltraSparc hardware. We still have a couple of ancient applications in particular that only work on Sparc systems, with no source code and no Linux versions. We'd rather be 100% Intel Linux on the server end at this point. We'll never again be able to afford Sparc hardware that's anywhere near the performance of our modern Intel hardware, and it's clearly a dead-end.

I'd love to have OpenSolaris working, which would give me FUSE and access to our new filesystems (GlusterFS) without reexporting via NFS, but that would require Sparc64 emulation to be working sufficiently.

Even better would be if the Sparc user emulation supported SunOS/Solaris so that we could just run our apps and not a whole Solaris environment, but my impression is that would be the hardest of all and is unlikely to happen.

Brent said...

The icount represents 2^n seconds, so the valid values of n seem to be fairly limited (0<=n<=32, say). I've tried the 2 Sparc 5 Roms and the Sparc 20, and I don't think there's an icount value that gets the SS-5 working. SS-20 comes up but can't probe-scsi, test scsi says device not found, and boot dies. The icount value doesn't seem to do much more than influence how quickly this all occurs.

Values of 31 and 32 pretty much hang at some point during initialization. Lower values do not.

Brent said...

Oops, I meant 2^n nanoseconds.

Anonymous said...

Доброго ещё раз. У меня есть SunSPARC CLASSIC X без диска, с фактически неизвестным состоянием железа....
На компе WinXP|Debian Lenny подключён COM порт и через спаянный нуль-модемный кабель к порту A\B в этом самом классике. Собственно вопрос остался прежним. Как слить с машинки образ ПЗУ? А то гугл даёт слишком много ответов, а английский настолько хорошо я ещё не знаю. И мой jid:leonid.ko@jabber.ru

atar said...

ты так и не написал, что она пишет при загрузке. Чтобы слить образ
ПЗУ, надо знать,куда он примаплен.
Предположим, что он примаплен туда же, куда и на SS-5, SS-10, SS-20.
Если в ответ на OBPшное, "ok" сказать

ffd00000 10 dump

система должна показать 16 первых байт ПЗУ. В принципе, можно
использовать dump для считывания всего имджа, но рука устанет нажимать
"Enter" после каждой страницы. Поэтому, пришли мне, что говорит

see dump

,а я пришлю тебе модифицированный dump, который выдаст весь листинг целиком.

Anonymous said...

Да вот к сожалению ничего пока не даёт мне эта машинка. Накопал пару мануалов с sun.com про то как зайти в ПЗУ (СОМ скорость 9600 кбит/сек, проверка 8n1). Но putty молчит, просто молчит. Что уже делать ума не приложу.

atar said...

Кабель точно правильно спаян? Ты точно подключился к порту "A"? Клавиатура к спарку подключена? Индикаторы на ней какие-то горят/мигают?

Brent said...

Hey, Artyom, any word on how those patches (to get Open Boot recognizing the disk and booting) are coming along? I'm eager to tinker. ;-)

They don't have to be pretty; I'm only building the Sparc target, so breakage of other targets won't be a problem.

Anonymous said...

К сожалению правильно/неправильно сказать не смогу :-(
Надо порыть тот мануал по которому паял. Это вроде даже хендбук по фряхе был. :-)
1) Клавиатура (type 6) НЕ была подключена:
1а) putty не реагировал на нажатия клавиш;
1б) достучаться не удалось.
2) Клавиатура подключена.
2а) STOP-d не нажимал;
2б) Мигал CapsLock 1короткий 1 длинный 2 коротких;
2в) Затем мигнули все индикаторы клавы;
2г) Putty позволяет вводить символы, но нет эха;

Michael Kostylev said...

Мануал теперь не нужен. Нужно зачитать, как распаян кабель.

atar said...

Brent,

my patches are in git. Don't know whether they will be included into 0.12 yet, but I guess it doesn't matter for you, as you've managed to apply even the patches which weren't accepted. Now you should be able to boot the Solaris 2.6 kernel using the SS-5 OBP.

There are still two highly experimental patches which I haven't sent, I'll publish them later.

I don't want to publish any non-trivial patch here, because that would create a sandbox with my own users, which I'd have to support alone. And I don't have any profit from qemu, so the efforts wouldn't pay. (I even haven't got a single cent from the ads here, because there weren't a single click)

Instead, I'd like to bring to the vanilla qemu development as many people as possible. Ask on the mailing list as many questions as possible, so the maintainers will see that Solaris/sparc is a requested feature.

On the other hand, if your company would like buying the patches...

Brent said...

Thanks for the patches, but do you think there could still be something missing? The latest git seems to have your 3 new patches (even the floppy tweak), but OpenBoot still doesn't show any signs of seeing the SCSI. probe-scsi and probe-scsi-all still return blank.

I did:
git clone git://git.savannah.nongnu.org/qemu.git
cd qemu
./configure --target-list=sparc-softmmu
make
cd ..
qemu/sparc-softmmu/qemu-system-sparc -name blah -bios ./ss5-170.bin -drive file=image.qcow2,if=scsi -nographic

I don't see any initialization messages related to SCSI/esp/whatever.

Once qemu vanilla is booting Solaris, then I suspect your right that momentum will build up, and it will take on a life of its own.

Alas, we are a university department and real cheapskates, but I'm happy to test!

Anonymous said...

Тыксь. И снова здравствуйте.
Распайка кабеля:
pc db9-db25 A/B
1-1
2-2
3-2
4-6
5-7
6-20
7-5
8-4
9-not used

Michael Kostylev said...

Это плохой, негодный кабель. Если спарк может обойтись без управления потоком (flow control в терминальной программе/настройках порта тоже надо не забыть выключить), то достаточно связать
de9 db25
2 - 2
3 - 3
5 - 7

А вообще, распайка полного кабеля есть много где, например, здесь.

Anonymous said...

Michael Kostylev
> Это плохой, негодный кабель.

Спасибо :-) Будем разбираться. Заранее с наступающим!

Anonymous said...

to Michael Kostylev from leonid.ko
Ах спасибо тебе милый и добрый человек. Оно помогло и вижу я теперь в консольке кучу мне пока непонятных вещей. А именно: http://paste.org.ru/?9w8ww6

Michael Kostylev said...

Наверно, надо попробовать see dump, как предлагал Артем.

Anonymous said...

ffrom leonid.ko

ok see dump
: dump
base @ -rot hex (ffd1df10) dup 0= if
1+
then bounds do
i (ffd1deac) exit? (?leave) 10
+loop base !
;
ok

atar said...

: dumpall
base @ -rot hex ffd1df10 execute dup 0= if
1+
then bounds do
i ffd3f960 execute 10 +loop base !
;

(should see the "]" prompt while typing the function and "ok" after typing the last string)

now the actual dump:

ffd00000 100000 dumpall

atar said...

Sorry, this was a variant for another OBP version. Copy&pasted too much. This one should work:

: dumpall
base @ -rot hex ffd1df10 execute dup 0= if
1+
then bounds do
i ffd1deac execute 10 +loop base !
;

ffd00000 100000 dumpall

VooDoo_UzH_ said...

Доброго времени суток.
Я ну совсем начинающий испытатель qemu, что надо сделать чтобы система нашла клавиатуру и начала устанавливаться ОС. Получается вот так ...
qemu-system-sparc -M SS-5 -m 256 -L /home/iam/qemu-0.12.3/qemu -bios ss5.bin -hda /home/iam/.aqemu/sparc_HDA.img -nographic -cdrom /home/iam/Aqemu/sparc_2.6.iso -boot d

Power-ON Reset

$$$$$ WARNING: No Keyboard Detected! $$$$$
MMU Context Table Reg Test
MMU Context Register Test
MMU TLB Replace Ctrl Reg Tst
MMU Sync Fault Stat Reg Test
MMU Sync Fault Addr Reg Test
MMU TLB RAM NTA Pattern Test
ERROR : Address= 000000fc, exp= 07ffffdc, obs= 00000000, xor= 07ffffdc
initializing TLB
initializing cache

Allocating SRMMU Context Table
Setting SRMMU Context Register
Setting SRMMU Context Table Pointer Register
Allocating SRMMU Level 1 Table
Mapping RAM
Mapping ROM

ttya initialized
Probing Memory Bank #0 32 Megabytes
Probing Memory Bank #1 32 Megabytes
Probing Memory Bank #2 32 Megabytes
Probing Memory Bank #3 32 Megabytes
Probing Memory Bank #4 32 Megabytes
Probing Memory Bank #5 32 Megabytes
Probing Memory Bank #6 32 Megabytes
Probing Memory Bank #7 32 Megabytes
Incorrect configuration checksum;
Setting NVRAM parameters to default values.
Setting diag-switch? NVRAM parameter to true
Probing CPU FMI,MB86904

atar said...

12.3 не содержит всех необходимых исправлений. Из-за упёртости майнейнеров они войдут (может быть) только в 13.0 . Пока что, только git.

PS. А почему в этой ветке?

VooDoo_UzH_ said...

В этой ветке, пардон, потому что не оч хорошо знаю инглиш, а в какой на эту тему надо было написать?

atar said...

В принципе, всё равно в какой - я получаю уведомления по мылу. Но в основном люди жалуются на проблемы с установкой в ветке how-to. Кстати, имеет смысл перевести его на русский? Или и так понятно?

VooDoo_UzH_ said...

В для полноты понимания происходящего, конечно лучше и на русском, чтобы был такой полезный блог.