Thursday, July 13. 2006
The new features in Solaris and its legendary stability have convinced me to give it a try.
It has quite a few differences from Linux, some of them obviously better and others that are quite obnoxious to a seasoned Linux administrator. What follows are some notes on a sample configuration setting up RAID and ZFS on a pair of 80GB disks. I used Solaris 10 06/06, this short howto for the RAID information (with quite a few modifications for Solaris 10), and lots of man page reading.
I expect to follow this up with more notes on Solaris in the following weeks.
First I wanted to try a ZFS (Zettabyte File System) root filesystem. This technology does away with partitioning, slices, and size worries, vastly simplifying storage management. Unfortunately, as I found through both reading and trial and error, it is not a good idea yet. I decided on this configuration instead:
Software RAID volumes:
And a second partition, covering the rest of each disk, for ZFS to manage.
Obviously the ideal settings for you will vary based on your configuration. Note that /usr is left in /, unlike many other configurations I've done: Solaris tends to scatter symlinks into and out of /usr all over the disk, so you really don't want it separate from / unless you enjoy work. Instead, I will toss /usr/local, /opt, and other software storage locations into ZFS as needed. Due to dependencies and separate mounts in /var, it is also a very bad idea to put that in ZFS as yet, so it gets its own partition too. I will most likely move /var/log to ZFS at some point in the future if needed. I also plan to put all /home, /db, etc. storage partitions into ZFS.
To accomplish this, I fdisked as follows:
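(Sizes here are illustrative, not the exact ones from my box; a hypothetical layout for an 80GB disk:)
Partition 1: SOLARIS2, active, roughly 16GB (holds the slices below)
Partition 2: the rest of the disk, roughly 64GB (handed whole to ZFS; this is the p2 used later)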
Then I set up the Solaris partition with these slices:
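(Again hypothetical, with illustrative sizes, but consistent with the commands that follow:)
s0   /        roughly 10GB   (becomes mirror d0)
s1   metadb   about 32MB     (first set of state database replicas)
s3   /var     roughly 4GB    (becomes mirror d1)
s6   metadb   about 32MB     (second set of replicas, spaced away from s1)
s2   backup   the whole Solaris partition (leave it alone)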
Note that the metadb slices store the Solaris Volume Manager software RAID state databases (metadbs) and should be placed in two (or more) somewhat spaced-out spots on the disk so that, ideally, at least one survives a partial disk failure. If you set these up in the Solaris installer, leave their labels blank so they remain marked as unused space.
I set up these partitions as part of the install, using the Solaris 10 installer in console mode. Note that for the ZFS partition you have to select DOS as the partition type and change it later (this is optional; ZFS doesn't care what the partition type is). Once the system is booted, you are ready to set up RAID.
prtvtoc /dev/rdsk/c1d0s2 | fmthard -s - /dev/rdsk/c2d0s2
This copies the slice definitions from the first disk to the second on my sample box; obviously, change the disk device paths to match your hardware. (If you didn't manually set up fdisk partitions on the second disk to match the first, do that first, then return to this step.) The slice tables must match exactly.
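A quick sanity check (read-only, nothing destructive) is to dump the second disk's VTOC and eyeball it against the first:
prtvtoc /dev/rdsk/c2d0s2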
metadb -af -c 2 /dev/dsk/c1d0s1 /dev/dsk/c1d0s6
metadb -af -c 2 /dev/dsk/c2d0s1 /dev/dsk/c2d0s6
This sets up the initial metadbs on the disks. Next, metainit the partitions we want to use. Note that this works fine on your booted, live root filesystem:
metainit -f d10 1 1 /dev/dsk/c1d0s0
metainit -f d20 1 1 /dev/dsk/c2d0s0
metainit d0 -m d10
metainit -f d11 1 1 /dev/dsk/c1d0s3
metainit -f d21 1 1 /dev/dsk/c2d0s3
metainit d1 -m d11
... (rinse, lather, and repeat for each set)
Then, there's a cute little script to set up your root filesystem in vfstab for you:
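(That would be metaroot(1M); assuming the naming above, where d0 is the root mirror, it rewrites the root entries in /etc/vfstab and /etc/system for you:)
metaroot d0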
Follow the model this command provides in your vfstab to add entries for each additional metadevice in your setup.
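For instance, assuming d1 is the /var mirror as above, the matching /etc/vfstab entry would look like:
/dev/md/dsk/d1   /dev/md/rdsk/d1   /var   ufs   1   no   -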
Next, reboot to move your mounted system partitions onto the new RAID metadevices. Afterwards, issue these commands to attach the second drive's submirrors to the first:
metattach d0 d20
metattach d1 d21
...(rinse, lather, and repeat for each set)
You can use metastat -c to watch the progress of the RAID sync. Assuming you want to be able to boot from the second disk if the first fails, install GRUB on it:
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c2d0s0
Your RAID volumes should be good to go. Now you get an example of how nice ZFS makes things. On the live system, issue this to set up your ZFS RAID:
zpool create pool mirror /dev/rdsk/c1d0p2 /dev/rdsk/c2d0p2
Bam, ZFS RAID is now up and working; no more work involved. Now to make my opt partition:
zfs create pool/opt
zfs set mountpoint=/opt pool/opt
Done. It is mounted, live, RAID, and fully ready to use.
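You can verify the mount and see usage at a glance with:
zfs list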
Now my log partition:
zfs create pool/log
zfs set mountpoint=/var/log pool/log
zfs set quota=4G pool/log
With ZFS, a rough guess at size is just fine; you can change the quota up or down at will. No more worrying about whether you allocated enough space.
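For example, growing the log filesystem later is one command, applied live:
zfs set quota=8G pool/log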
One additional note about this configuration: with a two-disk setup, if one of the drives fails, Solaris' software RAID will not boot until you go into single-user mode, use metadb -d to remove the bad disk's metadbs, and restart. This is because Solaris uses a consensus model that requires 50%+1 of the metadbs to agree on which ones are correct. You can turn this behavior off, but I opted not to: it both notifies you on reboot that something is wrong (one of your drives failed) and is quick and easy to deal with.
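For reference, the recovery step from single-user mode looks something like this (assuming c2d0 is the failed disk, following the layout above):
metadb -d /dev/dsk/c2d0s1 /dev/dsk/c2d0s6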
A few comments.
Although it's increasingly common to join /usr and /, it doesn't cause any problems (or any work) to keep them separate. I've been building SunOS/Solaris systems for 15 years, and I always have and still do insist on separation.
I don't recall that partitions need to match exactly for metadisk mirroring, although you get optimal space usage if they do. Maybe this has changed more recently, though.
Your prtvtoc | fmthard approach is exactly the ideal way to match VTOCs in Solaris, but only works if the disks have the same capacity and cylinder count (which usually means being the same model). If that's not your situation you can still approximate by interactive partitioning in format(1M), but (as above) your partitions may not be exactly the same size, and some space may be lost.
Solaris Volume Manager (formerly DiskSuite) is actually quite nice in many respects, although it does require a bit of zombie organ-cranking in the setup commands. But you can easily dash out shell loops for this if you take the right strategy with metadevice naming (for example, using d0-d7 to indicate mirrors for partitions 0-7, and d11 for stripe 1 on SCSI target 1, etc.); see the sketch below. I've used DiskSuite to build a disk factory for churning out prebuilt system boot disks, for example. Just insert disk, metattach, sync, metadetach, unplug, repeat.
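As a sketch of that kind of loop (device names hypothetical; mirror named d<slice>, submirrors d1<slice> and d2<slice>, which also matches the d10/d20/d0 naming in the howto above):
for s in 0 3; do
    metainit -f d1$s 1 1 /dev/dsk/c1d0s$s   # submirror on the first disk
    metainit -f d2$s 1 1 /dev/dsk/c2d0s$s   # submirror on the second disk
    metainit d$s -m d1$s                    # one-way mirror; metattach d2$s later
done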
On metadbs: you need a quorum (minimum 3) of replicas to operate, but you can install multiple replicas in a single partition. I typically put 3 replicas in slice 7 of each disk in my mirrored root volume set, so there are 6 replicas total, and my system remains sane if I pull one of the disks. Be sure not to mirror the metadb slice. :)
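In command form, that scheme would be something like (hypothetical device names):
metadb -af -c 3 /dev/dsk/c0t0d0s7
metadb -a -c 3 /dev/dsk/c0t1d0s7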
But all that aside, yeah: ZFS is really sweet. I'm a dyed-in-the-wool Solaris partisan, but I'm excited to see ZFS becoming available for other platforms, too.
I too used to keep /usr separate from / on Solaris until recently. I discovered that if you transfer a Solaris boot disk from one machine to another and the hardware does not match precisely, you can end up in a situation where you can boot and access the root partition, but /usr is unavailable because the symlinks have not been created in /dev. It then becomes impossible to do a reconfigure boot, and you have to devise clever ways to repopulate your /dev directory to get your system booted. It's not impossible, but it creates work where it's not necessary.
[same coward here]
Oh, and thanks for the writeup. I like reading howtos.
To make an AMD system (in my case a Sun Fire X2100 M2) bootable, I also did the following:
To /etc/system I added this line:
set md_mirror:md_resync_bufsz = 2048
And to /boot/solaris/bootenv.rc (where setprop lines belong) I added:
setprop altbootpath '/pci@0,0/pci-ide@5/ide@1/cmdk@0,0:a'
Make the partitions on both disks active (format -> fdisk).
Solaris is still a strange world to me, and I am learning more and more. This bit was great for setting up RAID on my Sun Fire V480, although I wish I had read the comments about using more than 2 metadbs first, so I could still function with one disk (the V480 only has room for 2 internal drives).
Also, could someone clarify what all the d numbers (d11, d20, etc.) do?
If you're doing a new install of Solaris, you'd be better off using ZFS RAID or mirroring; Solaris can now boot from a ZFS disk.