Qemu 2.6 Solaris 9 Sparc Segmentation Fault

Bug #1588328 reported by Zhen Ning Lim on 2016-06-02
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Undecided
Unassigned

Bug Description

Hi,
I tried the following command to boot Solaris 9 sparc:
qemu-system-sparc -nographic -boot d -hda ./Spark9.disk -m 256 -cdrom sol-9-905hw-ga-sparc-dvd.iso -serial telnet:0.0.0.0:3000,server

It seems there are a few Segmentation Faults, one from the starting of the boot. Another at the beginning of the commandline installation.

Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Configuration device id QEMU version 1 machine id 32
Probing SBus slot 0 offset 0
Probing SBus slot 1 offset 0
Probing SBus slot 2 offset 0
Probing SBus slot 3 offset 0
Probing SBus slot 4 offset 0
Probing SBus slot 5 offset 0
Invalid FCode start byte
CPUs: 1 x FMI,MB86904
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19
  Type 'help' for detailed information
Trying cdrom:d...
Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d

Jumping to entry point 00004000 for type 00000005...
switching to new context:
SunOS Release 5.9 Version Generic_118558-34 32-bit
Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0):
 Corrupt label; wrong magic number

Segmentation Fault
Configuring /dev and /devices
NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, Line #1759 0x00 0x88)
audio may not work correctly until it is stopped and restarted
Segmentation Fault
Using RPC Bootparams for network configuration information.
Skipping interface le0
Searching for configuration file(s)...
Search complete.

....

What type of terminal are you using?
 1) ANSI Standard CRT
 2) DEC VT52
 3) DEC VT100
 4) Heathkit 19
 5) Lear Siegler ADM31
 6) PC Console
 7) Sun Command Tool
 8) Sun Workstation
 9) Televideo 910
 10) Televideo 925
 11) Wyse Model 50
 12) X Terminal Emulator (xterms)
 13) CDE Terminal Emulator (dtterm)
 14) Other
Type the number of your choice and press Return: 3
syslog service starting.
savecore: no dump device configured
Running in command line mode
/sbin/disk0_install[109]: 143 Segmentation Fault
/sbin/run_install[130]: 155 Segmentation Fault

That basically looks like it should work. The only time I've seen random segfaults similar to this is with a corrupted disk, so the first question to ask is whether you've verified the ISO image you are using? Unfortunately this isn't an image I currently have access to, but I can report my Solaris 8 32-bit ISO boots and installs fine with no issues here. Does it make any difference if you remove the -hda part of the command line?

Zhen Ning Lim (limzn88) wrote :

Hi Mark,

I compared the cksum provided and the dvd cksum that i did, seems to match. I did try out Solaris 8 too. It seems to work fine with sun formatted hda disk. I did try removing the -hda for solaris 9, the problem still persist. I shall find if i can get another solaris 9 image source to try out.

If you can verify that the media is correct and you still see problems, I'd be interested to take a look if you are able to provide me a copy of the media for debugging.

Zhen Ning Lim (limzn88) wrote :

Hi Mark,

I have uploaded a copy of it to mega.nz

https://mega.nz/#!94ZVXBra

Thanks for the test case. It appears that this is a regression that occurred somewhere between 2.5 and 2.6 - bisecting now.

Download full text (3.5 KiB)

Can you guys check if the problem persists when qemu is launched with
the -singlestep option?
I think it's in general a good idea always check TCG-related problems
with -singlestep , because it helps to find out whether a bug is in
the optimizer or generator module of TCG.

Artyom

On Tue, Jun 14, 2016 at 11:44 PM, Mark Cave-Ayland
<email address hidden> wrote:
> Thanks for the test case. It appears that this is a regression that
> occurred somewhere between 2.5 and 2.6 - bisecting now.
>
> --
> You received this bug notification because you are a member of qemu-
> devel-ml, which is subscribed to QEMU.
> https://bugs.launchpad.net/bugs/1588328
>
> Title:
> Qemu 2.6 Solaris 9 Sparc Segmentation Fault
>
> Status in QEMU:
> New
>
> Bug description:
> Hi,
> I tried the following command to boot Solaris 9 sparc:
> qemu-system-sparc -nographic -boot d -hda ./Spark9.disk -m 256 -cdrom sol-9-905hw-ga-sparc-dvd.iso -serial telnet:0.0.0.0:3000,server
>
> It seems there are a few Segmentation Faults, one from the starting of
> the boot. Another at the beginning of the commandline installation.
>
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> Configuration device id QEMU version 1 machine id 32
> Probing SBus slot 0 offset 0
> Probing SBus slot 1 offset 0
> Probing SBus slot 2 offset 0
> Probing SBus slot 3 offset 0
> Probing SBus slot 4 offset 0
> Probing SBus slot 5 offset 0
> Invalid FCode start byte
> CPUs: 1 x FMI,MB86904
> UUID: 00000000-0000-0000-0000-000000000000
> Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19
> Type 'help' for detailed information
> Trying cdrom:d...
> Not a bootable ELF image
> Loading a.out image...
> Loaded 7680 bytes
> entry point is 0x4000
> bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d
>
> Jumping to entry point 00004000 for type 00000005...
> switching to new context:
> SunOS Release 5.9 Version Generic_118558-34 32-bit
> Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
> Use is subject to license terms.
> WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0):
> Corrupt label; wrong magic number
>
> Segmentation Fault
> Configuring /dev and /devices
> NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, Line #1759 0x00 0x88)
> audio may not work correctly until it is stopped and restarted
> Segmentation Fault
> Using RPC Bootparams for network configuration information.
> Skipping interface le0
> Searching for configuration file(s)...
> Search complete.
>
> ....
>
> What type of terminal are you using?
> 1) ANSI Standard CRT
> 2) DEC VT52
> 3) DEC VT100
> 4) Heathkit 19
> 5) Lear Siegler ADM31
> 6) PC Console
> 7) Sun Command Tool
> 8) Sun Workstation
> 9) Televideo 910
> 10) Televideo 925
> 11) Wyse Model 50
> 12) X Terminal Emulator (xterms)
> 13) CDE Terminal Emulator (dtterm)
> 14) Other
> Type the number of your choice and press Return: 3
> syslog service starting.
> savecore: no dump device configured
> Running i...

Read more...

On 17/06/16 12:42, Artyom Tarasenko wrote:

> Can you guys check if the problem persists when qemu is launched with
> the -singlestep option?
> I think it's in general a good idea always check TCG-related problems
> with -singlestep , because it helps to find out whether a bug is in
> the optimizer or generator module of TCG.
>
> Artyom

Hi Artyom,

I did manage to bisect this down to a single commit in the end: see
http://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg04039.html
for the commit in question.

ATB,

Mark.

Artyom has located the regression and posted a patch here: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07226.html.

Zhen Ning Lim (limzn88) wrote :

Hi all,

Thanks for the patch. I just tried, it seems to be not able to find the disk when it try to start the installation. :(

...

Please specify the media from which you will install the Solaris Operating
Environment.

Media:

1. CD/DVD
2. Network File System
3. HTTP (Flash archive only)
4. FTP (Flash archive only)
5. Local Tape (Flash archive only)

   Media [1]: 1
Reading disc for Solaris Operating Environment...

The system is being initialized, please wait... /
No Disks found.
Check to make sure disks are cabled and powered up.

I ran all the way through the installer in order to test the patch, so it should be working for you. Is your Spark9.disk labelled? See http://virtuallyfun.superglobalmegacorp.com/2010/10/03/formatting-disks-for-solaris/ for more information on how to do this.

Zhen Ning Lim (limzn88) wrote :
Download full text (4.3 KiB)

Hmm.. strange. I did make a new disk went into the setup, then format the disk. After that, i rebooted and start that installation. But, it seems still there is no disk detected.

 Media [1]: 1
Reading disc for Solaris Operating Environment...

The system is being initialized, please wait... |
No Disks found.
Check to make sure disks are cabled and powered up.

 Press OK to Exit.

   <Press ENTER to continue/

-# format -e
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <drive type unknown>
          /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0
Specify disk (enter its number): 0

AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. Quantum ProDrive 80S
        2. Quantum ProDrive 105S
        3. CDC Wren IV 94171-344
        4. SUN0104
        5. SUN0207
        6. SUN0327
        7. SUN0340
        8. SUN0424
        9. SUN0535
        10. SUN0669
        11. SUN1.0G
        12. SUN1.05
        13. SUN1.3G
        14. SUN2.1G
        15. SUN2.9G
        16. Zip 100
        17. Zip 250
        18. other
Specify disk type (enter its number): 18
Enter number of data cylinders: 24620
Enter number of alternate cylinders[2]:
Enter number of physical cylinders[24622]:
Enter number of heads: 27
Enter physical number of heads[default]: 107
Enter number of data sectors/track: 107
Enter number of physical sectors/track[default]:
Enter rpm of drive[3600]:
Enter format time[default]:
Enter cylinder skew[default]:
Enter track skew[default]:
Enter tracks per zone[default]:
Enter alternate tracks[default]:
Enter alternate sectors[default]:
Enter cache control[default]:
Enter prefetch threshold[default]:
Enter minimum prefetch[default]:
Enter maximum prefetch[default]:
Enter disk type name (remember quotes): Sparc9
selecting c0t0d0
[disk formatted]

FORMAT MENU:
        disk - select a disk
        type - select (define) a disk type
        partition - select (define) a partition table
        current - describe the current disk
        format - format and analyze the disk
        repair - repair a defective sector
        label - write label to the disk
        analyze - surface analysis
        defect - defect list management
        backup - search for backup labels
        verify - read and display labels
        save - save new disk/partition definitions
        inquiry - show vendor, product and revision
        scsi - independent SCSI mode selects
        cache - enable, disable or query SCSI disk cache
        volname - set 8-character volume name
        !<cmd> - execute <cmd>, then return
        quit
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[0]: 1
Ready to label disk, continue?y

format> q

#reboot
Jun 28 23:37:16 rpcbind: rpcbind terminating on signal.
syncing file systems... done
rebooting...
rebooting ()
Configuration device id QEMU version 1 machine id 32
Probing SBus slot 0 offset 0
Probing SBus slot 1 offset 0
Probing SBus slot 2 offset 0
Probing SBus slot 3 offset 0
Probing SBus slot 4 offset 0
Probing SBus slot 5 offset 0
Invalid FCode start byte
CPUs: 1 x FMI,MB86904
UUID:...

Read more...

Okay. Can you confirm which version (or git revision) you've used to apply the patch so I can try and reproduce locally?

Zhen Ning Lim (limzn88) wrote :

May 11 2016. qemu-2.6.0 from http://wiki.qemu.org/Download

Download full text (8.8 KiB)

I've just tried v2.6.0 with the recent ldstub patch applied and it looks from the output above that you're using an incorrect format to put down the disk label. I see the following:

$ ./qemu-system-sparc -cdrom sol-9-905hw-ga-sparc-dvd.iso -hda /home/build/src/qemu/image/sparc32/sol9.qcow2 -boot d -nographic -prom-env 'auto-boot?=false'
Configuration device id QEMU version 1 machine id 32
Probing SBus slot 0 offset 0
Probing SBus slot 1 offset 0
Probing SBus slot 2 offset 0
Probing SBus slot 3 offset 0
Probing SBus slot 4 offset 0
Probing SBus slot 5 offset 0
Invalid FCode start byte
CPUs: 1 x FMI,MB86904
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19
  Type 'help' for detailed information

0 > boot cdrom:d -vs Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d

Jumping to entry point 00004000 for type 00000005...
switching to new context:
Size: 0x4624f+0xdaf5+0x1d6a3 Bytes
SunOS Release 5.9 Version Generic_118558-34 32-bit
Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
...
...
INIT: SINGLE USER MODE
# format
Searching for disks...WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0):
        Corrupt label; wrong magic number

WARNING: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0 (sd0):
        Corrupt label; wrong magic number

done

AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <drive type unknown>
          /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0
Specify disk (enter its number): 0

AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. Quantum ProDrive 80S
        2. Quantum ProDrive 105S
        3. CDC Wren IV 94171-344
        4. SUN0104
        5. SUN0207
        6. SUN0327
        7. SUN0340
        8. SUN0424
        9. SUN0535
        10. SUN0669
        11. SUN1.0G
        12. SUN1.05
        13. SUN1.3G
        14. SUN2.1G
        15. SUN2.9G
        16. Zip 100
        17. Zip 250
        18. other
Specify disk type (enter its number): 18
Enter number of data cylinders: 24620
Enter number of alternate cylinders[2]:
Enter number of physical cylinders[24622]:
Enter number of heads: 27
Enter physical number of heads[default]:
Enter number of data sectors/track: 107
Enter number of physical sectors/track[default]: 107
Enter rpm of drive[3600]:
Enter format time[default]:
Enter cylinder skew[default]:
Enter track skew[default]:
Enter tracks per zone[default]:
Enter alternate tracks[default]:
Enter alternate sectors[default]:
Enter cache control[default]:
Enter prefetch threshold[default]:
Enter minimum prefetch[default]:
Enter maximum prefetch[default]:
Enter disk type name (remember quotes): Sparc9
selecting c0t0d0
[disk formatted]

FORMAT MENU:
        disk - select a disk
        type - select (define) a disk type
        partition - select (define) a partition table
        current - describe the current disk
        format - format and analyze the disk
        repair - repair a d...

Read more...

Zhen Ning Lim (limzn88) wrote :

Hi Mark,

Thanks a lot. Got it working now. When formatting the label, there are 2 options, SMI and EFI. Once I format it with SMI, it seems to be able to find the disk.

Great news! FWIW with newer versions of QEMU, including 2.6.0, the framebuffer emulation is good enough to install and run Solaris (including X) without the -nographic/-serial options if you need it. I've also CCd the relevant patch to qemu-stable so it should appear in 2.6.1 also.

Many thanks for the report!

Changed in qemu:
status: New → Fix Committed
Zhen Ning Lim (limzn88) on 2016-06-30
Changed in qemu:
status: Fix Committed → Fix Released
status: Fix Released → New
Zhen Ning Lim (limzn88) on 2016-06-30
Changed in qemu:
status: New → Fix Committed
Zhen Ning Lim (limzn88) wrote :

Hi Mark,

Thanks for the update. I would definitely be nice to have other than the black screen. Still got a problem though. I managed to install sparc9 but after i removed the cdrom, it fails to boot.

qemu-system-sparc -nographic -monitor null -serial mon:telnet:0.0.0.0:3000,server -hda ./Sparc9.disk -m 256 -net nic,macaddr=52:54:0:12:34:56 -net tap,ifname=tap0,script=no,downscript=no

From an article written by Artyom, i did add

# cat >> /a/etc/system

set scsi_options=0x58

^d

to solaris 2.6.

However, i think i didnt do this with solaris 8, it works fine.

For the solaris 9, it will allow you to either set it to not reboot but once you reached the end of the installation, it will reboot when you press enter. And the Sparc9.disk cannot be booted :(

telnet 0.0.0.0 3000
Trying 0.0.0.0...
Connected to 0.0.0.0.
Escape character is '^]'.
Configuration device id QEMU version 1 machine id 32
Probing SBus slot 0 offset 0
Probing SBus slot 1 offset 0
Probing SBus slot 2 offset 0
Probing SBus slot 3 offset 0
Probing SBus slot 4 offset 0
Probing SBus slot 5 offset 0
Invalid FCode start byte
CPUs: 1 x FMI,MB86904
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19
  Type 'help' for detailed information
Trying disk...
Not a bootable ELF image
Not a bootable a.out image
No valid state has been set by load or init-program

0 >

If you use OpenBIOS then you don't explicitly have to set scsi-options since the value can be overridden via the device tree which is exactly what OpenBIOS does.

Interestingly enough it seems that the default bootloader for Solaris 9 is installed in the slice rather than the root of the disk as per my Solaris 8 installation. Fortunately you can manually boot Solaris 9 from the slice by entering "boot disk:d" at the Forth prompt.

Based upon this it probably makes sense to add "disk:d" to the bootpath used by OpenBIOS - I'll send a patch through to the OpenBIOS mailing list shortly.

Zhen Ning Lim (limzn88) wrote :

Hi Mark,

I have tried boot diisk:d. After this

Not a bootable ELF image
Not a bootable a.out image
No valid state has been set by load or init-program

0 > boot disk:d No valid state has been set by load or init-program
 ok
0 >

Somehow I am getting invalid boot

It works here as per my post above, so I think the problem is still with the disk label. With the above ISO image, I don't get asked for the type of label which makes me think you are using a newer version of Solaris for labelling than you are for installation.

Can you re-label the disk using the exact same image used for the installation and see if that makes a difference?

Hmmm I've just tried a second installation of Solaris 9 with a completely blank disk image and now it appears that "boot disk:a" is correct, i.e. boot from slice a. Not sure yet if this is the correct convention for HDs.

Zhen Ning Lim (limzn88) wrote :

Hi Mark,

We are finally in:)

By the way, how do you figure out which slice its in?

From solaris 8 dvd onwards, i seems to see 2 disk label options: SMI and EFI. Not sure why you didnt see those.

0 > boot disk:a Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@0,0:a

Jumping to entry point 00004000 for type 00000005...
switching to new context:
SunOS Release 5.9 Version Generic_118558-34 32-bit
Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
configuring IPv4 interfaces: le0.
starting DHCP on primary interface le0
Hostname: unknown
The system is coming up. Please wait.
checking ufs filesystems
/dev/rdsk/c0t0d0s7: is logging.
starting rpc services: rpcbind done.
syslog service starting.
syslogd: line 24: WARNING: loghost could not be resolved
Jul 3 14:06:40 unknown sendmail[239]: My unqualified host name (unknown) unknown; sleeping for retry
Jul 3 14:06:40 unknown sendmail[240]: My unqualified host name (unknown) unknown; sleeping for retry
volume management starting.
Creating new rsa public/private host key pair
Creating new dsa public/private host key pair
Jul 3 14:06:54 unknown snmpXdmid: Error in Adding Row for Subscription Table Entry
Jul 3 14:06:55 unknown snmpXdmid: Failed to add filter to SP for Event delivery
The system is ready.

unknown console login: root
Password:
Jul 3 14:07:09 unknown login: ROOT LOGIN /dev/console
Sun Microsystems Inc. SunOS 5.9 Generic May 2002
# ls
bin etc lib opt tmp xfn
cdrom export lost+found platform usr
dev home mnt proc var
devices kernel net sbin vol
# Jul 3 14:07:41 unknown sendmail[240]: unable to qualify my own domain name (unknown) -- using short name
Jul 3 14:07:41 unknown sendmail[240]: [ID 702911 mail.alert] unable to qualify my own domain name (unknown) -- using short name
Jul 3 14:07:46 unknown sendmail[239]: unable to qualify my own domain name (unknown) -- using short name
Jul 3 14:07:46 unknown sendmail[239]: [ID 702911 mail.alert] unable to qualify my own domain name (unknown) -- using short name

Great news! AFAICT it's just convention that the first disk slice is used, and I've also proposed a patch for OpenBIOS to include this in the boot search path in future, hopefully in time for 2.7.

Thomas Huth (th-huth) on 2017-01-19
Changed in qemu:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers