Monday 18 February 2008

Solution: Solaris 10 fails to install on EFI labeled disks

I recently came across the following error when jumpstarting a system that has 4 disks, 2 of which were previously in a zpool and were EFI labeled:
Checking rules.ok file...
awk: division by zero
record number 17
awk: division by zero
record number 15
expr: syntax error
awk: division by zero
record number 17
The error seemed harmless enough in that it didn't affect the installation. Even so, I tracked it down to the /usr/sbin/install.d/chkprobe script in the Solaris 10 mini-root. I opened a case with Sun, and they informed me it was a known issue (BugID 6457349: chkprobe cannot handle disks with EFI labels). Sun provided me with a work-around patch to chkprobe, which produced the following output:
Checking rules.ok file...
c0t8d0 doesn't have a VTOC label
c0t9d0 doesn't have a VTOC label
This was fine on a system that had at least one VTOC labeled disk, as the jumpstart installation could still proceed. When all of the disks are EFI labeled, however, the installation fails with the message:
ERROR: One or more disks are found, but one of the following problems exists:
- Hardware failure
- The disk(s) available on this system cannot be used to install Solaris Software. They do not have a valid label. If you want to use the disk(s) for the install, use format(1M) to label the disk and restart the installation.
Solaris installation program exited.
To solve this you need to run the format -e command and re-label the disks with an SMI (VTOC) label. Note that the "-e" (expert mode) option to format is required; otherwise you won't be given the choice of label types.
# format -e
Searching for disks...done


AVAILABLE DISK SELECTIONS:
0. c0t0d0 <FUJITSU-MAP3367N SUN36G-0401-33.92GB>
/pci@1c,600000/scsi@2/sd@0,0
1. c0t1d0 <FUJITSU-MAP3367N SUN36G-0401-33.92GB>
/pci@1c,600000/scsi@2/sd@1,0
Specify disk (enter its number): 0
selecting c0t0d0
[disk formatted]


FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
inquiry - show vendor, product and revision
scsi - independent SCSI mode selects
cache - enable, disable or query SCSI disk cache
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Warning: This disk has an EFI label. Changing to SMI label will erase all
current partitions.
Continue? y
Auto configuration via format.dat[no]? y
format> quit
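If you have several disks to relabel, the same answers can in principle be piped into format instead of typed. The following is only a rough sketch of that idea, not something Sun provided: the function names relabel_answers and relabel_disk and the RUN variable are made up for illustration, and it assumes format asks exactly the same questions in the same order as in the session above (on a disk where it does not, the answers will land on the wrong prompts). It defaults to a dry run because switching to an SMI label erases all current partitions.

```shell
#!/bin/sh
# Sketch only: replays the answers from the interactive session above.
# Assumption: format asks the same questions, in the same order, on every disk.

# Print the keystrokes used above, one per line:
# label -> 0 (SMI) -> y (continue) -> y (auto configure) -> quit
relabel_answers() {
    printf '%s\n' label 0 y y quit
}

# relabel_disk DISK  (hypothetical helper)
# Dry run unless RUN=1 is set, because relabeling erases all partitions.
relabel_disk() {
    disk=$1
    if [ "${RUN:-0}" = "1" ]; then
        # -e: expert mode (offers the label type choice), -d: disk to select
        relabel_answers | format -e -d "$disk"
    else
        echo "dry run: would relabel $disk with an SMI label"
    fi
}

# Disk names are the two from the session above; adjust for your system.
for d in c0t0d0 c0t1d0; do
    relabel_disk "$d"
done
```

Run it once as-is to check the disk list, then again with RUN=1 in the environment to actually relabel. Afterwards it is worth confirming the result with the verify command from the format menu.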
The fix Sun provided should make it into a future Solaris 10 update, and I suspect that once ZFS boot is released this problem will be resolved for good.

7 comments:

Starflake said...

Hi,

I just wanna thank you for this post. It really saved my day, trying to install Solaris 10 on a SUN Ultra 45.

I even searched SUN's official forums, but I found no answers there. So thank you again!

BR

Lennie

Scott Rowley said...

Been having this same issue with a new jumpstart as well, thanks for the save!

Unknown said...

Same here, and it's not fixed as of Solaris 10 Update 8. And, it's not that easy to troubleshoot Jumpstart with multiple boot servers on different subnets, nor do I do it every day.

Unknown said...

Sweet! I was trying to Jumpstart on (3) V440's today and had this issue. Thanks for the information!

Unknown said...

Thank you so much, you really saved our day.

Best Regards,

MS DP JP YS

Dave said...

Nice, works.

But how do I continue or restart the install?

Matthew Flanagan said...

@David McNeill, it has been a few years since I've done this, but usually when an installation exits and drops you to a shell, you can restart the install after fixing things by running suninstall.

Cheers

Matthew