GEOM: Modular Disk Transformation Framework

Written by Tom Rhodes.

Synopsis

In &os;, the GEOM framework permits access and control of classes, such as Master Boot Records and BSD labels, through the use of providers, the disk device nodes in /dev. By supporting various software RAID configurations, GEOM transparently provides access to the operating system and operating system utilities.

This chapter covers the use of disks under the GEOM framework in &os;, including the major RAID control utilities which use the framework for configuration. This chapter is not a definitive guide to RAID configurations, and only GEOM-supported RAID classifications are discussed.

After reading this chapter, you will know:

- What type of RAID support is available through GEOM.
- How to use the base utilities to configure, maintain, and manipulate the various RAID levels.
- How to mirror, stripe, encrypt, and remotely connect disk devices through GEOM.
- How to troubleshoot disks attached to the GEOM framework.

Before reading this chapter, you should:

- Understand how &os; treats disk devices ().
- Know how to configure and install a new kernel ().

RAID0 - Striping

Written by Tom Rhodes and Murray Stokely.

Striping combines several disk drives into a single volume.
Striping can be performed through the use of hardware
RAID controllers. The
GEOM disk subsystem provides software support
for disk striping, also known as RAID0,
without the need for a RAID disk
controller.

In RAID0, data is split into blocks that
are written across all the drives in the array. As seen in the
following illustration, instead of having to wait on the system
to write 256k to one disk, RAID0 can
simultaneously write 64k to each of the four disks in the array,
offering superior I/O performance. This
performance can be enhanced further by using multiple disk
controllers.

[Illustration: Disk Striping]

Each disk in a RAID0 stripe must be of
the same size, since I/O requests are
interleaved to read or write to multiple disks in
parallel.

RAID0 does not
provide any redundancy. This means that if one disk in the
array fails, all of the data on the disks is lost. If the
data is important, implement a backup strategy that regularly
saves backups to a remote system or device.

The process for creating a software,
GEOM-based RAID0 on a &os;
system using commodity disks is as follows. Once the stripe is
created, refer to &man.gstripe.8; for more information on how
to control an existing stripe.

Creating a Stripe of Unformatted ATA Disks

Load the geom_stripe.ko module:

&prompt.root; kldload geom_stripe

Ensure that a suitable mount point exists. If this
volume will become a root partition, then temporarily use
another mount point such as
/mnt.

Determine the device names for the disks which will
be striped, and create the new stripe device. For example,
to stripe two unused and unpartitioned
ATA disks with device names of
/dev/ad2 and
/dev/ad3:

&prompt.root; gstripe label -v st0 /dev/ad2 /dev/ad3
Metadata value stored on /dev/ad2.
Metadata value stored on /dev/ad3.
Done.

Write a standard label, also known as a partition table, on the new volume and install the default bootstrap code:

&prompt.root; bsdlabel -wB /dev/stripe/st0

This process should create two other devices in
/dev/stripe in addition to
st0. Those include
st0a and st0c. At
this point, a UFS file system can be
created on st0a using
newfs:

&prompt.root; newfs -U /dev/stripe/st0a

Many numbers will glide across the screen, and after a few seconds, the process will be complete. The volume has been created and is ready to be mounted.

To manually mount the created disk stripe:

&prompt.root; mount /dev/stripe/st0a /mnt
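The health of the stripe can be verified at any point with the status subcommand. A quick check, assuming the st0 device created above; output similar to this indicates that all components are attached:

&prompt.root; gstripe status
       Name  Status  Components
 stripe/st0      UP  ad2
                     ad3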
To mount this striped file system automatically during the boot process, place the volume information in /etc/fstab. In this example, a permanent mount point, named stripe, is created:

&prompt.root; mkdir /stripe
&prompt.root; echo "/dev/stripe/st0a /stripe ufs rw 2 2" >> /etc/fstab

The geom_stripe.ko module must also be automatically loaded during system initialization, by adding a line to /boot/loader.conf:

&prompt.root; echo 'geom_stripe_load="YES"' >> /boot/loader.conf

RAID1 - Mirroring

RAID1, or
mirroring, is the technique of writing
the same data to more than one disk drive. Mirrors are usually
used to guard against data loss due to drive failure. Each
drive in a mirror contains an identical copy of the data. When
an individual drive fails, the mirror continues to work,
providing data from the drives that are still functioning. The
computer keeps running, and the administrator has time to
replace the failed drive without user interruption.

Two common situations are illustrated in these examples.
The first creates a mirror out of two new drives and uses it as
a replacement for an existing single drive. The second example
creates a mirror on a single new drive, copies the old drive's
data to it, then inserts the old drive into the mirror. While
this procedure is slightly more complicated, it only requires
one new drive.

Traditionally, the two drives in a mirror are identical in
model and capacity, but &man.gmirror.8; does not require that.
Mirrors created with dissimilar drives will have a capacity
equal to that of the smallest drive in the mirror. Extra space
on larger drives will be unused. Drives inserted into the
mirror later must have at least as much capacity as the smallest
drive already in the mirror.

The mirroring procedures shown here are non-destructive, but as with any major disk operation, make a full backup first.

While &man.dump.8; is used in these procedures to copy file systems, it does not work on file systems with soft updates journaling. See &man.tunefs.8; for information on detecting and disabling soft updates journaling.

Metadata Issues

Many disk systems store metadata at the end of each disk.
Old metadata should be erased before reusing the disk for a
mirror. Most problems are caused by two particular types of
leftover metadata: GPT partition tables and
old metadata from a previous mirror.

GPT metadata can be erased with &man.gpart.8;. This example erases both primary and backup GPT partition tables from disk ada8:

&prompt.root; gpart destroy -F ada8

A disk can be removed from an active mirror and the
metadata erased in one step using &man.gmirror.8;. Here, the
example disk ada8 is removed from the
active mirror gm4:

&prompt.root; gmirror remove gm4 ada8

If the mirror is not running, but old mirror metadata is still on the disk, use gmirror clear to remove it:

&prompt.root; gmirror clear ada8

&man.gmirror.8; stores one block of metadata at the end of
the disk. Because GPT partition schemes
also store metadata at the end of the disk, mirroring entire
GPT disks with &man.gmirror.8; is not
recommended. MBR partitioning is used here
because it only stores a partition table at the start of the
disk and does not conflict with the mirror metadata.

Creating a Mirror with Two New Disks

In this example, &os; has already been installed on a
single disk, ada0. Two new disks,
ada1 and ada2, have
been connected to the system. A new mirror will be created on
these two disks and used to replace the old single
disk.

The geom_mirror.ko kernel module must either be built into the kernel or loaded at boot- or run-time. Manually load the kernel module now:

&prompt.root; gmirror load

Create the mirror with the two new drives:

&prompt.root; gmirror label -v gm0 /dev/ada1 /dev/ada2

gm0 is a user-chosen device name assigned to the new mirror. After the mirror has been started, this device name appears in /dev/mirror/.
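The state of the new mirror can be confirmed at any time. A quick check, assuming the gm0 name chosen above; output similar to this shows both disks attached:

&prompt.root; gmirror status
      Name    Status  Components
mirror/gm0  COMPLETE  ada1 (ACTIVE)
                      ada2 (ACTIVE)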
MBR and bsdlabel partition tables can now be created on the mirror with &man.gpart.8;. This example uses a traditional file system layout, with partitions for /, swap, /var, /tmp, and /usr. A single / and a swap partition will also work.

Partitions on the mirror do not have to be the same size as those on the existing disk, but they must be large enough to hold all the data already present on ada0.

&prompt.root; gpart create -s MBR mirror/gm0
&prompt.root; gpart add -t freebsd -a 4k mirror/gm0
&prompt.root; gpart show mirror/gm0
=> 63 156301423 mirror/gm0 MBR (74G)
63 63 - free - (31k)
126 156301299 1 freebsd (74G)
156301425 61 - free - (30k)

&prompt.root; gpart create -s BSD mirror/gm0s1
&prompt.root; gpart add -t freebsd-ufs -a 4k -s 2g mirror/gm0s1
&prompt.root; gpart add -t freebsd-swap -a 4k -s 4g mirror/gm0s1
&prompt.root; gpart add -t freebsd-ufs -a 4k -s 2g mirror/gm0s1
&prompt.root; gpart add -t freebsd-ufs -a 4k -s 1g mirror/gm0s1
&prompt.root; gpart add -t freebsd-ufs -a 4k mirror/gm0s1
&prompt.root; gpart show mirror/gm0s1
=> 0 156301299 mirror/gm0s1 BSD (74G)
0 2 - free - (1.0k)
2 4194304 1 freebsd-ufs (2.0G)
4194306 8388608 2 freebsd-swap (4.0G)
12582914 4194304 4 freebsd-ufs (2.0G)
16777218 2097152 5 freebsd-ufs (1.0G)
18874370 137426928 6 freebsd-ufs (65G)
156301298 1 - free - (512B)

Make the mirror bootable by installing bootcode in the
MBR and bsdlabel and setting the active
slice:

&prompt.root; gpart bootcode -b /boot/mbr mirror/gm0
&prompt.root; gpart set -a active -i 1 mirror/gm0
&prompt.root; gpart bootcode -b /boot/boot mirror/gm0s1

Format the file systems on the new mirror, enabling soft updates:

&prompt.root; newfs -U /dev/mirror/gm0s1a
&prompt.root; newfs -U /dev/mirror/gm0s1d
&prompt.root; newfs -U /dev/mirror/gm0s1e
&prompt.root; newfs -U /dev/mirror/gm0s1f

File systems from the original ada0 disk can now be copied onto the mirror with &man.dump.8; and &man.restore.8;.

&prompt.root; mount /dev/mirror/gm0s1a /mnt
&prompt.root; dump -C16 -b64 -0aL -f - / | (cd /mnt && restore -rf -)
&prompt.root; mount /dev/mirror/gm0s1d /mnt/var
&prompt.root; mount /dev/mirror/gm0s1e /mnt/tmp
&prompt.root; mount /dev/mirror/gm0s1f /mnt/usr
&prompt.root; dump -C16 -b64 -0aL -f - /var | (cd /mnt/var && restore -rf -)
&prompt.root; dump -C16 -b64 -0aL -f - /tmp | (cd /mnt/tmp && restore -rf -)
&prompt.root; dump -C16 -b64 -0aL -f - /usr | (cd /mnt/usr && restore -rf -)

Edit /mnt/etc/fstab to point to the new mirror file systems:

# Device              Mountpoint  FStype  Options  Dump  Pass#
/dev/mirror/gm0s1a    /           ufs     rw       1     1
/dev/mirror/gm0s1b    none        swap    sw       0     0
/dev/mirror/gm0s1d    /var        ufs     rw       2     2
/dev/mirror/gm0s1e    /tmp        ufs     rw       2     2
/dev/mirror/gm0s1f    /usr        ufs     rw       2     2

If the geom_mirror.ko kernel module
has not been built into the kernel,
/mnt/boot/loader.conf is edited to load
the module at boot:

geom_mirror_load="YES"

Reboot the system to test the new mirror and verify that
all data has been copied. The BIOS will
see the mirror as two individual drives rather than a mirror.
Because the drives are identical, it does not matter which is
selected to boot. See the Troubleshooting section below if there are problems booting. Powering down and disconnecting the original ada0 disk will allow it to be kept as an offline backup.

In use, the mirror will behave just like the original single drive.

Creating a Mirror with an Existing Drive

In this example, &os; has already been installed on a
single disk, ada0. A new disk,
ada1, has been connected to the system.
A one-disk mirror will be created on the new disk, the
existing system copied onto it, and then the old disk will be
inserted into the mirror. This slightly complex procedure is
required because gmirror needs to put a
512-byte block of metadata at the end of each disk, and the
existing ada0 has usually had all of its
space already allocated.

Load the geom_mirror.ko kernel module:

&prompt.root; gmirror load

Check the media size of the original disk with diskinfo:

&prompt.root; diskinfo -v ada0 | head -n3
/dev/ada0
512 # sectorsize
1000204821504 # mediasize in bytes (931G)

Create a mirror on the new disk. To make certain that the
mirror capacity is not any larger than the original
ada0 drive, &man.gnop.8; is used to
create a fake drive of the exact same size. This drive does
not store any data, but is used only to limit the size of the
mirror. When &man.gmirror.8; creates the mirror, it will
restrict the capacity to the size of
gzero.nop, even if the new
ada1 drive has more space. Note that the
1000204821504 in the second line is
equal to ada0's media size as shown by
diskinfo above.

&prompt.root; geom zero load
&prompt.root; gnop create -s 1000204821504 gzero
&prompt.root; gmirror label -v gm0 gzero.nop ada1
&prompt.root; gmirror forget gm0

Since gzero.nop does not store any
data, the mirror does not see it as connected. The mirror is
told to forget unconnected components, removing
references to gzero.nop. The result is a
mirror device containing only a single disk,
ada1.

After creating gm0, view the
partition table on ada0. This output is
from a 1 TB drive. If there is some unallocated space at
the end of the drive, the contents may be copied directly from
ada0 to the new mirror.

However, if the output shows that all of the space on the
disk is allocated, as in the following listing, there is no
space available for the 512-byte mirror metadata at the end of
the disk.

&prompt.root; gpart show ada0
=> 63 1953525105 ada0 MBR (931G)
   63 1953525105 1 freebsd [active] (931G)

In this case, the partition table must be edited to reduce
the capacity by one sector on mirror/gm0.
The procedure will be explained later.

In either case, partition tables on the primary disk should first be copied using gpart backup and gpart restore:

&prompt.root; gpart backup ada0 > table.ada0
&prompt.root; gpart backup ada0s1 > table.ada0s1

These commands create two files,
table.ada0 and
table.ada0s1. This example is from a
1 TB drive:

&prompt.root; cat table.ada0
MBR 4
1 freebsd 63 1953525105 [active]

&prompt.root; cat table.ada0s1
BSD 8
1 freebsd-ufs 0 4194304
2 freebsd-swap 4194304 33554432
4 freebsd-ufs 37748736 50331648
5 freebsd-ufs 88080384 41943040
6 freebsd-ufs 130023424 838860800
7 freebsd-ufs 968884224 984640881

If no free space is shown at the end of the disk, the size
of both the slice and the last partition must be reduced by
one sector. Edit the two files, reducing the size of both the
slice and last partition by one. These are the last numbers
in each listing.

&prompt.root; cat table.ada0
MBR 4
1 freebsd 63 1953525104 [active]

&prompt.root; cat table.ada0s1
BSD 8
1 freebsd-ufs 0 4194304
2 freebsd-swap 4194304 33554432
4 freebsd-ufs 37748736 50331648
5 freebsd-ufs 88080384 41943040
6 freebsd-ufs 130023424 838860800
7 freebsd-ufs 968884224 984640880

If at least one sector was unallocated at the end of the disk, these two files can be used without modification.

Now restore the partition table into mirror/gm0:

&prompt.root; gpart restore mirror/gm0 < table.ada0
&prompt.root; gpart restore mirror/gm0s1 < table.ada0s1

Check the partition table with
gpart show. This example has
gm0s1a for /,
gm0s1d for /var,
gm0s1e for /usr,
gm0s1f for /data1,
and gm0s1g for
/data2.

&prompt.root; gpart show mirror/gm0
=> 63 1953525104 mirror/gm0 MBR (931G)
63 1953525042 1 freebsd [active] (931G)
1953525105 62 - free - (31k)
&prompt.root; gpart show mirror/gm0s1
=> 0 1953525042 mirror/gm0s1 BSD (931G)
0 2097152 1 freebsd-ufs (1.0G)
2097152 16777216 2 freebsd-swap (8.0G)
18874368 41943040 4 freebsd-ufs (20G)
60817408 20971520 5 freebsd-ufs (10G)
81788928 629145600 6 freebsd-ufs (300G)
710934528 1242590514 7 freebsd-ufs (592G)
1953525042 63 - free - (31k)

Both the slice and the last partition must have at least one free block at the end of the disk.

Create file systems on these new partitions. The number of partitions will vary to match the original disk, ada0.

&prompt.root; newfs -U /dev/mirror/gm0s1a
&prompt.root; newfs -U /dev/mirror/gm0s1d
&prompt.root; newfs -U /dev/mirror/gm0s1e
&prompt.root; newfs -U /dev/mirror/gm0s1f
&prompt.root; newfs -U /dev/mirror/gm0s1g

Make the mirror bootable by installing bootcode in the MBR and bsdlabel and setting the active slice:

&prompt.root; gpart bootcode -b /boot/mbr mirror/gm0
&prompt.root; gpart set -a active -i 1 mirror/gm0
&prompt.root; gpart bootcode -b /boot/boot mirror/gm0s1

Adjust /etc/fstab to use the new partitions on the mirror. Back up this file first by copying it to /etc/fstab.orig.

&prompt.root; cp /etc/fstab /etc/fstab.orig

Edit /etc/fstab, replacing /dev/ada0 with mirror/gm0.

# Device              Mountpoint  FStype  Options  Dump  Pass#
/dev/mirror/gm0s1a    /           ufs     rw       1     1
/dev/mirror/gm0s1b    none        swap    sw       0     0
/dev/mirror/gm0s1d    /var        ufs     rw       2     2
/dev/mirror/gm0s1e    /usr        ufs     rw       2     2
/dev/mirror/gm0s1f    /data1      ufs     rw       2     2
/dev/mirror/gm0s1g    /data2      ufs     rw       2     2

If the geom_mirror.ko kernel module
has not been built into the kernel, edit
/boot/loader.conf to load it at
boot:

geom_mirror_load="YES"

File systems from the original disk can now be copied onto the mirror with &man.dump.8; and &man.restore.8;. Each file system dumped with dump -L will create a snapshot first, which can take some time.

&prompt.root; mount /dev/mirror/gm0s1a /mnt
&prompt.root; dump -C16 -b64 -0aL -f - / | (cd /mnt && restore -rf -)
&prompt.root; mount /dev/mirror/gm0s1d /mnt/var
&prompt.root; mount /dev/mirror/gm0s1e /mnt/usr
&prompt.root; mount /dev/mirror/gm0s1f /mnt/data1
&prompt.root; mount /dev/mirror/gm0s1g /mnt/data2
&prompt.root; dump -C16 -b64 -0aL -f - /usr | (cd /mnt/usr && restore -rf -)
&prompt.root; dump -C16 -b64 -0aL -f - /var | (cd /mnt/var && restore -rf -)
&prompt.root; dump -C16 -b64 -0aL -f - /data1 | (cd /mnt/data1 && restore -rf -)
&prompt.root; dump -C16 -b64 -0aL -f - /data2 | (cd /mnt/data2 && restore -rf -)

Restart the system, booting from
ada1. If everything is working, the
system will boot from mirror/gm0, which
now contains the same data as ada0 had
previously. See the Troubleshooting section below if there are problems booting.

At this point, the mirror still consists of only the single ada1 disk.

After booting from mirror/gm0 successfully, the final step is inserting ada0 into the mirror.

When ada0 is inserted into the
mirror, its former contents will be overwritten by data from
the mirror. Make certain that
mirror/gm0 has the same contents as
ada0 before adding
ada0 to the mirror. If the contents
previously copied by &man.dump.8; and &man.restore.8; are
not identical to what was on ada0,
revert /etc/fstab to mount the file
systems on ada0, reboot, and start the
whole procedure again.

&prompt.root; gmirror insert gm0 ada0
GEOM_MIRROR: Device gm0: rebuilding provider ada0

Synchronization between the two disks will start immediately. Use gmirror status to view the progress.

&prompt.root; gmirror status
Name Status Components
mirror/gm0 DEGRADED ada1 (ACTIVE)
ada0 (SYNCHRONIZING, 64%)

After a while, synchronization will finish.

GEOM_MIRROR: Device gm0: rebuilding provider ada0 finished.
&prompt.root; gmirror status
Name Status Components
mirror/gm0 COMPLETE ada1 (ACTIVE)
ada0 (ACTIVE)

mirror/gm0 now consists of the two disks ada0 and ada1, and the contents are automatically synchronized with each other. In use, mirror/gm0 will behave just like the original single drive.

Troubleshooting

If the system no longer boots, BIOS
settings may have to be changed to boot from one of the new
mirrored drives. Either mirror drive can be used for booting,
as they contain identical data.

If the boot stops with this message, something is wrong with the mirror device:

Mounting from ufs:/dev/mirror/gm0s1a failed with error 19.
Loader variables:
vfs.root.mountfrom=ufs:/dev/mirror/gm0s1a
vfs.root.mountfrom.options=rw
Manual root filesystem specification:
<fstype>:<device> [options]
Mount <device> using filesystem <fstype>
and with the specified (optional) option list.
eg. ufs:/dev/da0s1a
zfs:tank
cd9660:/dev/acd0 ro
(which is equivalent to: mount -t cd9660 -o ro /dev/acd0 /)
? List valid disk boot devices
. Yield 1 second (for background tasks)
<empty line> Abort manual input
mountroot>

Forgetting to load the geom_mirror.ko
module in /boot/loader.conf can cause
this problem. To fix it, boot from a &os;
installation media and choose Shell at the
first prompt. Then load the mirror module and mount the
mirror device:

&prompt.root; gmirror load
&prompt.root; mount /dev/mirror/gm0s1a /mnt

Edit /mnt/boot/loader.conf, adding a line to load the mirror module:

geom_mirror_load="YES"

Save the file and reboot.

Other problems that cause error 19
require more effort to fix. Although the system should boot
from ada0, another prompt to select a
shell will appear if /etc/fstab is
incorrect. Enter ufs:/dev/ada0s1a at the
boot loader prompt and press Enter. Undo the
edits in /etc/fstab then mount the file
systems from the original disk (ada0)
instead of the mirror. Reboot the system and try the
procedure again.

Enter full pathname of shell or RETURN for /bin/sh:
&prompt.root; cp /etc/fstab.orig /etc/fstab
&prompt.root; reboot

Recovering from Disk Failure

The benefit of disk mirroring is that an individual disk
can fail without causing the mirror to lose any data. In the
above example, if ada0 fails, the mirror
will continue to work, providing data from the remaining
working drive, ada1.

To replace the failed drive, shut down the system and
physically replace the failed drive with a new drive of equal
or greater capacity. Manufacturers use somewhat arbitrary
values when rating drives in gigabytes, and the only way to
really be sure is to compare the total count of sectors shown
by diskinfo -v. A drive with larger
capacity than the mirror will work, although the extra space
on the new drive will not be used.

After the computer is powered back up, the mirror will be running in a degraded mode with only one drive. The mirror is told to forget drives that are not currently connected:

&prompt.root; gmirror forget gm0

Any old metadata should be cleared from the replacement
disk using the instructions in the Metadata Issues section above. Then the replacement disk, ada4 for this example, is inserted into the mirror:

&prompt.root; gmirror insert gm0 /dev/ada4

Resynchronization begins when the new drive is inserted
into the mirror. This process of copying mirror data to a new
drive can take a while. Performance of the mirror will be
greatly reduced during the copy, so inserting new drives is
best done when there is low demand on the computer.

Progress can be monitored with gmirror
status, which shows drives that are being
synchronized and the percentage of completion. During
resynchronization, the status will be
DEGRADED, changing to
COMPLETE when the process is
finished.

RAID3 - Byte-level Striping with Dedicated Parity

Written by Mark Gladman and Daniel Gerzo. Based on documentation by Tom Rhodes and Murray Stokely.

RAID3 is a method used to combine several
disk drives into a single volume with a dedicated parity disk.
In a RAID3 system, data is split up into a
number of bytes that are written across all the drives in the
array except for one disk which acts as a dedicated parity disk.
This means that disk reads from a RAID3
implementation access all disks in the array. Performance can
be enhanced by using multiple disk controllers. The
RAID3 array provides a fault tolerance of 1
drive, while providing a capacity of 1 - 1/n times the total
capacity of all drives in the array, where n is the number of
hard drives in the array. Such a configuration is mostly
suitable for storing data of larger sizes such as multimedia
files.
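As a worked example of the capacity formula, an array of three 2 TB drives (n = 3) yields (1 - 1/3) × 6 TB = 4 TB of usable space, with the equivalent of one drive holding parity.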
At least 3 physical hard drives are required to build a RAID3 array. Each disk must be of the same size, since I/O requests are interleaved to read or write to multiple disks in parallel. Also, due to the nature of RAID3, the number of drives must be equal to 3, 5, 9, 17, and so on, or 2^n + 1.

This section demonstrates how to create a software RAID3 on a &os; system.

While it is theoretically possible to boot from a RAID3 array on &os;, that configuration is uncommon and is not advised.

Creating a Dedicated RAID3 Array

In &os;, support for RAID3 is
implemented by the &man.graid3.8; GEOM
class. Creating a dedicated RAID3 array on
&os; requires the following steps.

First, load the geom_raid3.ko kernel module by issuing one of the following commands:

&prompt.root; graid3 load

or:

&prompt.root; kldload geom_raid3

Ensure that a suitable mount point exists. This command creates a new directory to use as the mount point:

&prompt.root; mkdir /multimedia

Determine the device names for the disks which will be
added to the array, and create the new
RAID3 device. The final device listed
will act as the dedicated parity disk. This example uses
three unpartitioned ATA drives:
ada1 and
ada2 for
data, and
ada3 for
parity.

&prompt.root; graid3 label -v gr0 /dev/ada1 /dev/ada2 /dev/ada3
Metadata value stored on /dev/ada1.
Metadata value stored on /dev/ada2.
Metadata value stored on /dev/ada3.
Done.

Partition the newly created gr0 device and put a UFS file system on it:

&prompt.root; gpart create -s GPT /dev/raid3/gr0
&prompt.root; gpart add -t freebsd-ufs /dev/raid3/gr0
&prompt.root; newfs -j /dev/raid3/gr0p1

Many numbers will glide across the screen, and after a bit of time, the process will be complete. The volume has been created and is ready to be mounted:

&prompt.root; mount /dev/raid3/gr0p1 /multimedia

The RAID3 array is now ready to
use.

Additional configuration is needed to retain this setup across system reboots.

The geom_raid3.ko module must be loaded before the array can be mounted. To automatically load the kernel module during system initialization, add the following line to /boot/loader.conf:

geom_raid3_load="YES"

The following volume information must be added to /etc/fstab in order to automatically mount the array's file system during the system boot process:

/dev/raid3/gr0p1 /multimedia ufs rw 2 2
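The health of the array can be checked with the status subcommand. A quick look, assuming the gr0 array created above; output similar to this shows all components attached:

&prompt.root; graid3 status
      Name    Status  Components
 raid3/gr0  COMPLETE  ada1
                      ada2
                      ada3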
Software RAID Devices

Originally contributed by Warren Block.

Some motherboards and expansion cards add some simple hardware, usually just a ROM, that allows the computer to boot from a RAID array. After booting, access to the RAID array is handled by software running on the computer's main processor. This hardware-assisted software RAID gives RAID arrays that are not dependent on any particular operating system, and which are functional even before an operating system is loaded.

Several levels of RAID are supported,
depending on the hardware in use. See &man.graid.8; for a
complete list.

&man.graid.8; requires the geom_raid.ko kernel module, which is included in the GENERIC kernel starting with &os; 9.1. If needed, it can be loaded manually with graid load.

Creating an Array

Software RAID devices often have a menu
that can be entered by pressing special keys when the computer
is booting. The menu can be used to create and delete
RAID arrays. &man.graid.8; can also create
arrays directly from the command line.

graid label is used to create a new
array. The motherboard used for this example has an Intel
software RAID chipset, so the Intel
metadata format is specified. The new array is given a label
of gm0, it is a mirror
(RAID1), and uses drives
ada0 and
ada1.

Some space on the drives will be overwritten when they are made into a new array. Back up existing data first!

&prompt.root; graid label Intel gm0 RAID1 ada0 ada1
GEOM_RAID: Intel-a29ea104: Array Intel-a29ea104 created.
GEOM_RAID: Intel-a29ea104: Disk ada0 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:0-ada0 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Array started.
GEOM_RAID: Intel-a29ea104: Volume gm0 state changed from STARTING to OPTIMAL.
Intel-a29ea104 created
GEOM_RAID: Intel-a29ea104: Provider raid/r0 for volume gm0 created.

A status check shows the new mirror is ready for use:

&prompt.root; graid status
Name Status Components
raid/r0 OPTIMAL ada0 (ACTIVE (ACTIVE))
ada1 (ACTIVE (ACTIVE))

The array device appears in
/dev/raid/. The first array is called
r0. Additional arrays, if present, will
be r1, r2, and so
on.

The BIOS menu on some of these devices
can create arrays with special characters in their names. To
avoid problems with those special characters, arrays are given
simple numbered names like r0. To show
the actual labels, like gm0 in the
example above, use &man.sysctl.8;:

&prompt.root; sysctl kern.geom.raid.name_format=1

Multiple Volumes

Some software RAID devices support
more than one volume on an array.
Volumes work like partitions, allowing space on the physical
drives to be split and used in different ways. For example,
Intel software RAID devices support two
volumes. This example creates a 40 G mirror for safely
storing the operating system, followed by a 20 G
RAID0 (stripe) volume for fast temporary
storage:

&prompt.root; graid label -S 40G Intel gm0 RAID1 ada0 ada1
&prompt.root; graid add -S 20G gm0 RAID0

Volumes appear as additional
rX entries
in /dev/raid/. An array with two volumes
will show r0 and
r1.

See &man.graid.8; for the number of volumes supported by different software RAID devices.

Converting a Single Drive to a Mirror

Under certain specific conditions, it is possible to
convert an existing single drive to a &man.graid.8; array
without reformatting. To avoid data loss during the
conversion, the existing drive must meet these minimum
requirements:

The drive must be partitioned with the
MBR partitioning scheme.
GPT or other partitioning schemes with
metadata at the end of the drive will be overwritten and
corrupted by the &man.graid.8; metadata.

There must be enough unpartitioned and unused space at
the end of the drive to hold the &man.graid.8; metadata.
This metadata varies in size, but the largest occupies
64 M, so at least that much free space is
recommended.
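Whether enough unallocated space is present can be verified with &man.gpart.8;; a quick check, where the free entry at the end of the drive should be at least 64 M:

&prompt.root; gpart show ada0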
If the drive meets these requirements, start by making a full backup. Then create a single-drive mirror with that drive:

&prompt.root; graid label Intel gm0 RAID1 ada0 NONE

&man.graid.8; metadata was written to the end of the drive in the unused space. A second drive can now be inserted into the mirror:

&prompt.root; graid insert raid/r0 ada1

Data from the original drive will immediately begin to be
copied to the second drive. The mirror will operate in
degraded status until the copy is complete.

Inserting New Drives into the Array

Drives can be inserted into an array as replacements for
drives that have failed or are missing. If there are no
failed or missing drives, the new drive becomes a spare. For
example, inserting a new drive into a working two-drive mirror
results in a two-drive mirror with one spare drive, not a
three-drive mirror.

In the example mirror array, data immediately begins to be copied to the newly-inserted drive. Any existing information on the new drive will be overwritten.

&prompt.root; graid insert raid/r0 ada1
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NONE to NEW.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NEW to REBUILD.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 rebuild start at 0.

Removing Drives from the Array

Individual drives can be permanently removed from an array and their metadata erased:

&prompt.root; graid remove raid/r0 ada1
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from ACTIVE to OFFLINE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-[unknown] state changed from ACTIVE to NONE.
GEOM_RAID: Intel-a29ea104: Volume gm0 state changed from OPTIMAL to DEGRADED.

Stopping the Array

An array can be stopped without removing metadata from the drives. The array will be restarted when the system is booted.

&prompt.root; graid stop raid/r0

Checking Array Status

Array status can be checked at any time. After a drive
was added to the mirror in the example above, data is being
copied from the original drive to the new drive:

&prompt.root; graid status
Name Status Components
raid/r0 DEGRADED ada0 (ACTIVE (ACTIVE))
ada1 (ACTIVE (REBUILD 28%))

Some types of arrays, like RAID0 or CONCAT, may not be shown in the status report if disks have failed. To see these partially-failed arrays, add -ga:

&prompt.root; graid status -ga
Name Status Components
Intel-e2d07d9a BROKEN ada6 (ACTIVE (ACTIVE))

Deleting Arrays

Arrays are destroyed by deleting all of the volumes from
them. When the last volume present is deleted, the array is
stopped and metadata is removed from the drives:

&prompt.root; graid delete raid/r0

Deleting Unexpected Arrays

Drives may unexpectedly contain &man.graid.8; metadata,
either from previous use or manufacturer testing.
&man.graid.8; will detect these drives and create an array,
interfering with access to the individual drive. To remove
the unwanted metadata:

Boot the system. At the boot menu, select 2 for the loader prompt. Enter:

OK set kern.geom.raid.enable=0
OK boot

The system will boot with &man.graid.8; disabled.

Back up all data on the affected drive.

As a workaround, &man.graid.8; array detection can be disabled by adding

kern.geom.raid.enable=0

to /boot/loader.conf.

To permanently remove the &man.graid.8; metadata
from the affected drive, boot a &os; installation
CD-ROM or memory stick, and select
Shell. Use status
to find the name of the array, typically
raid/r0:

&prompt.root; graid status
Name Status Components
raid/r0 OPTIMAL ada0 (ACTIVE (ACTIVE))
ada1 (ACTIVE (ACTIVE))

Delete the volume by name:

&prompt.root; graid delete raid/r0

If there is more than one volume shown, repeat the
process for each volume. After the last volume has been deleted, the array will be destroyed.

Reboot and verify data, restoring from backup if
necessary. After the metadata has been removed, the
kern.geom.raid.enable=0 entry in
/boot/loader.conf can also be
removed.

GEOM Gate Network

GEOM provides a simple mechanism for
providing remote access to devices such as disks,
CDs, and file systems through the use of the
GEOM Gate network daemon,
ggated. The system with the device
runs the server daemon which handles requests made by clients
using ggatec. The devices should not
contain any sensitive data as the connection between the client
and the server is not encrypted.

Similar to NFS, which is discussed in
, ggated
is configured using an exports file. This file specifies which
systems are permitted to access the exported resources and what
level of access they are offered. For example, to give the
client 192.168.1.5
read and write access to the fourth slice on the first
SCSI disk, create
/etc/gg.exports with this line:

192.168.1.5 RW /dev/da0s4d

Before exporting the device, ensure it is not currently mounted. Then, start ggated:

&prompt.root; ggated

Several options are available for specifying an alternate listening port or changing the default location of the exports file. Refer to &man.ggated.8; for details.
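For instance, to listen on an alternate port and read a different exports file (both values here are purely illustrative), the port and exports file can be given on the command line:

&prompt.root; ggated -p 3080 /etc/gg2.exports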
To access the exported device on the client machine, first use ggatec to specify the IP address of the server and the device name of the exported device. If successful, this command will display a ggate device name to mount. Mount that specified device name on a free mount point. This example connects to the /dev/da0s4d partition on 192.168.1.1, then mounts /dev/ggate0 on /mnt:

&prompt.root; ggatec create -o rw 192.168.1.1 /dev/da0s4d
ggate0
&prompt.root; mount /dev/ggate0 /mnt

The device on the server may now be accessed through
/mnt on the client. For more details about
ggatec and a few usage examples, refer to
&man.ggatec.8;.

The mount will fail if the device is currently mounted on
either the server or any other client on the network. If
simultaneous access is needed to network resources, use
NFS instead.

When the device is no longer needed, unmount it with umount so that the resource is available to other clients.
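The client's ggate device can then be released. A short sequence, assuming the ggate0 device (unit 0) created above:

&prompt.root; umount /mnt
&prompt.root; ggatec destroy -u 0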
Labeling Disk Devices

During system initialization, the &os; kernel creates device nodes as devices are found. This method of probing for
devices raises some issues. For instance, what if a new disk
device is added via USB? It is likely that
a flash device may be handed the device name of
da0 and the original
da0 shifted to
da1. This will cause issues mounting
file systems if they are listed in
/etc/fstab which may also prevent the
system from booting.

One solution is to chain SCSI devices
in order so a new device added to the SCSI
card will be issued unused device numbers. But what about
USB devices which may replace the primary
SCSI disk? This happens because
USB devices are usually probed before the
SCSI card. One solution is to only insert
these devices after the system has been booted. Another method
is to use only a single ATA drive and never
list the SCSI devices in
/etc/fstab.

A better solution is to use glabel to
label the disk devices and use the labels in
/etc/fstab. Because
glabel stores the label in the last sector of
a given provider, the label will remain persistent across
reboots. By using this label as a device, the file system may
always be mounted regardless of what device node it is accessed
through.

glabel can create both transient and
permanent labels. Only permanent labels are consistent across
reboots. Refer to &man.glabel.8; for more information on the
differences between labels.

Label Types and Examples

Permanent labels can be a generic or a file system label.
Permanent file system labels can be created with
&man.tunefs.8; or &man.newfs.8;. These types of labels are
created in a sub-directory of /dev, and
will be named according to the file system type. For example,
UFS2 file system labels will be created in
/dev/ufs. Generic permanent labels can
be created with glabel label. These are
not file system specific and will be created in
/dev/label.

Temporary labels are destroyed at the next reboot. These
labels are created in /dev/label and are
suited to experimentation. A temporary label can be created
using glabel create.
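As a brief illustration, this creates a temporary label named test on a hypothetical spare disk da4; the label appears as /dev/label/test and disappears at the next reboot:

&prompt.root; glabel create test /dev/da4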
To create a permanent label for a UFS2 file system without destroying any data, issue the following command:

&prompt.root; tunefs -L home /dev/da3

A label should now exist in /dev/ufs which may be added to /etc/fstab:

/dev/ufs/home /home ufs rw 2 2

The file system must not be mounted while attempting to run tunefs.

Now the file system may be mounted:

&prompt.root; mount /home

From this point on, so long as the
geom_label.ko kernel module is loaded at
boot with /boot/loader.conf or the
GEOM_LABEL kernel option is present,
the device node may change without any ill effect on the
system.

File systems may also be created with a default label by using the -L flag with newfs. Refer to &man.newfs.8; for more information.
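A one-line sketch, assuming a new file system on the hypothetical partition /dev/ada1s1a; the label data will appear as /dev/ufs/data:

&prompt.root; newfs -L data /dev/ada1s1a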
The following command can be used to destroy the label:

&prompt.root; glabel destroy home

The following example shows how to label the partitions of a boot disk.

Labeling Partitions on the Boot Disk

By permanently labeling the partitions on the boot disk,
the system should be able to continue to boot normally, even
if the disk is moved to another controller or transferred to
a different system. For this example, it is assumed that a
single ATA disk is used, which is
currently recognized by the system as
ad0. It is also assumed that the
standard &os; partition scheme is used, with
/,
/var,
/usr and
/tmp, as
well as a swap partition.

Reboot the system, and at the &man.loader.8; prompt, press 4 to boot into single user mode. Then enter the following commands:

&prompt.root; glabel label rootfs /dev/ad0s1a
GEOM_LABEL: Label for provider /dev/ad0s1a is label/rootfs
&prompt.root; glabel label var /dev/ad0s1d
GEOM_LABEL: Label for provider /dev/ad0s1d is label/var
&prompt.root; glabel label usr /dev/ad0s1f
GEOM_LABEL: Label for provider /dev/ad0s1f is label/usr
&prompt.root; glabel label tmp /dev/ad0s1e
GEOM_LABEL: Label for provider /dev/ad0s1e is label/tmp
&prompt.root; glabel label swap /dev/ad0s1b
GEOM_LABEL: Label for provider /dev/ad0s1b is label/swap
&prompt.root; exit

The system will continue with multi-user boot. After
the boot completes, edit /etc/fstab and
replace the conventional device names with their respective labels. The final /etc/fstab will look like this:

# Device           Mountpoint  FStype  Options  Dump  Pass#
/dev/label/swap    none        swap    sw       0     0
/dev/label/rootfs  /           ufs     rw       1     1
/dev/label/tmp     /tmp        ufs     rw       2     2
/dev/label/usr     /usr        ufs     rw       2     2
/dev/label/var     /var        ufs     rw       2     2

The system can now be rebooted. If everything went
well, it will come up normally and mount
will show:

&prompt.root; mount
/dev/label/rootfs on / (ufs, local)
devfs on /dev (devfs, local)
/dev/label/tmp on /tmp (ufs, local, soft-updates)
/dev/label/usr on /usr (ufs, local, soft-updates)
/dev/label/var on /var (ufs, local, soft-updates)

The &man.glabel.8; class
supports a label type for UFS file
systems, based on the unique file system id,
ufsid. These labels may be found in
/dev/ufsid and are
created automatically during system startup. It is possible
to use ufsid labels to mount partitions
using /etc/fstab. Use glabel
status to receive a list of file systems and their
corresponding ufsid labels:

&prompt.user; glabel status
Name Status Components
ufsid/486b6fc38d330916 N/A ad4s1d
ufsid/486b6fc16926168e N/A ad4s1f

In the above example, ad4s1d
represents /var,
while ad4s1f represents
/usr.
Using the ufsid values shown, these
partitions may now be mounted with the following entries in
/etc/fstab:

/dev/ufsid/486b6fc38d330916 /var ufs rw 2 2
/dev/ufsid/486b6fc16926168e /usr ufs rw 2 2

Any partitions with ufsid labels can be
mounted in this way, eliminating the need to manually create
permanent labels, while still enjoying the benefits of device
name independent mounting.

UFS Journaling Through GEOM

Support for journals on
UFS file systems is available on &os;. The
implementation is provided through the GEOM
subsystem and is configured using gjournal.
Unlike other file system journaling implementations, the
gjournal method is block based and not
implemented as part of the file system. It is a
GEOM extension.

Journaling stores a log of file system transactions, such as
changes that make up a complete disk write operation, before
meta-data and file writes are committed to the disk. This
transaction log can later be replayed to redo file system
transactions, preventing file system inconsistencies.

This method provides another mechanism to protect against
data loss and inconsistencies of the file system. Unlike Soft
Updates, which tracks and enforces meta-data updates, and
snapshots, which create an image of the file system, a log is
stored in disk space specifically for this task. For better
performance, the journal may be stored on another disk. In this
configuration, the journal provider or storage device should be
listed after the device to enable journaling on.
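For example, to keep the data on one disk and its journal on a second disk (hypothetical devices da4 and da5), list the journal provider second:

&prompt.root; gjournal label da4 da5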
The GENERIC kernel provides support for gjournal. To automatically load the geom_journal.ko kernel module at boot time, add the following line to /boot/loader.conf:

geom_journal_load="YES"

If a custom kernel is used, ensure the following line is in the kernel configuration file:

options GEOM_JOURNAL

Once the module is loaded, a journal can be created on a new
file system using the following steps. In this example,
da4 is a new SCSI
disk:

&prompt.root; gjournal load
&prompt.root; gjournal label /dev/da4

This will load the module and create a /dev/da4.journal device node on /dev/da4.

A UFS file system may now be created on
the journaled device, then mounted on an existing mount
point:

&prompt.root; newfs -O 2 -J /dev/da4.journal
&prompt.root; mount /dev/da4.journal /mnt

In the case of several slices, a journal will be created
for each individual slice. For instance, if
ad4s1 and ad4s2 are
both slices, then gjournal will create
ad4s1.journal and
ad4s2.journal.

Journaling may also be enabled on current file systems by
using tunefs. However,
always make a backup before attempting to
alter an existing file system. In most cases,
gjournal will fail if it is unable to create
the journal, but this does not protect against data loss
incurred as a result of misusing tunefs.
Refer to &man.gjournal.8; and &man.tunefs.8; for more
information about these commands.
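A rough sketch of that procedure, assuming an unmounted file system on the hypothetical partition ada1s1d: the journal is created on the provider first, and the journaling flag is then set on the resulting .journal device before mounting it:

&prompt.root; umount /dev/ada1s1d
&prompt.root; gjournal label ada1s1d
&prompt.root; tunefs -J enable -n disable ada1s1d.journal
&prompt.root; mount /dev/ada1s1d.journal /mnt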
It is possible to journal the boot disk of a &os; system. Refer to the article Implementing UFS
Journaling on a Desktop PC for detailed
instructions.