Replace a failed disk in VIOS
This article describes two common scenarios for replacing a failing local disk in a VIOS (Virtual I/O Server).
Scenario 1: Failed disk in a VIO server that is used by VIO client(s)
The failing disk contains LVs used for the rootvg of VIO clients. That rootvg is mirrored to another disk presented by a second VIO server.
Procedure to replace the failing disk on the VIO server:
On the VIOS, as the padmin user:
Record the information that will be needed later to recreate the devices:
$ lsdev -virtual
To get the volume group in which the failed disk participates:
$ lspv
To get the list of logical volumes on the disk:
$ lspv -lv hdiskX
To get info about the logical volumes, e.g. their size (number of LPs):
$ lsvg
$ lsvg -lv vgname
To get info about LVs, VTD names, vhost numbers and virtual clients:
$ lsmap -all
On the client(s): Identify the affected disk(s), i.e. those backed by LVs on the bad disk on the VIOS:
# lscfg -vl hdiskX     (repeat for all virtual SCSI disks)
hdisk1 U9117.MMA.999999-V2-C12-T1-L8200000000000000 Virtual SCSI Disk Drive
Take note of the following fields in the location code:
V# – LPAR ID (this should be the LPAR ID of the affected VIOS)
C# – slot number
L# – LUN ID
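When several disks must be checked, the location code can be decoded in the shell. A minimal sketch, assuming the hyphen-separated format shown in the lscfg output above (the awk field positions are an assumption based on that sample string):

```shell
# Decode a virtual SCSI location code into its V (LPAR ID), C (slot)
# and L (LUN) fields. The sample string matches the lscfg output above.
loc="U9117.MMA.999999-V2-C12-T1-L8200000000000000"
lpar=$(echo "$loc" | awk -F- '{sub(/^V/,"",$2); print $2}')
slot=$(echo "$loc" | awk -F- '{sub(/^C/,"",$3); print $3}')
lun=$(echo "$loc" | awk -F- '{sub(/^L/,"",$5); print $5}')
echo "LPAR=$lpar slot=$slot LUN=$lun"
```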
The affected disk may be listed as removed or missing, depending on the failure:
# lsvg -p rootvg
Remove the bad disk from the mirror and from rootvg:
# unmirrorvg rootvg hdiskX
# reducevg rootvg hdiskX
# rmdev -dl hdiskX
On the VIOS:
Remove all VTDs and LVs that reside on the failed disk, either in one step:
$ rmvdev -vtd vtd_name -rmlv
or separately:
$ rmdev -dev vtd_name
$ rmlv lv_name
Check that all logical volumes have been removed from the bad disk:
$ lspv -lv hdiskX
Remove the disk from the respective volume group:
$ reducevg vgname hdiskX
Note: If the volume group consists of only one disk, then the whole VG will need to be removed from the ODM. In that case use the following commands:
$ deactivatevg vgname
$ exportvg vgname
Replace the failed disk:
$ diagmenu
-> select “Task Selection”
-> select “Hot Plug Task”
-> select “SCSI and SCSI RAID Hot Plug Manager”
-> select “Replace/Remove a Device Attached to an SCSI Hot Swap Enclosure”
Configure the new disk:
$ cfgdev
Add the new disk to the volume group, or recreate the VG in case it was removed:
$ extendvg vgname hdiskX
or
$ mkvg -vg vgname hdiskX
Recreate the LVs with the same names and sizes that were recorded at the beginning:
$ mklv -lv lv_name vgname size
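When many client LVs lived on the failed disk, this step can be scripted. A minimal sketch, assuming the LV names and sizes (in LPs) were saved earlier to a file; the file contents, its path and the VG name clientvg are all hypothetical, and the commands are only echoed so the list can be reviewed before running it:

```shell
# Replay "mklv" for every LV recorded earlier as "name numLPs" pairs.
# File path, LV names and VG name (clientvg) are hypothetical examples.
cat > /tmp/lvlist.txt <<'EOF'
lpar1_rootvg 80
lpar2_rootvg 80
EOF
cmds=$(while read lvname lps; do
    echo "mklv -lv $lvname clientvg $lps"
done < /tmp/lvlist.txt)
echo "$cmds"
```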
Recreate the VTDs:
$ mkvdev -vdev lv_name -vadapter vhostX -dev vtd_name
On the client(s):
Discover the new disk(s) and rebuild the mirror:
# cfgmgr
# extendvg rootvg hdiskX
# mirrorvg rootvg
Build the boot image on both mirrored disks (just in case):
# bosboot -ad /dev/hdiskX
# bosboot -ad /dev/hdiskY
Set the bootlist:
# bootlist -m normal <list names of both hdisks>
# bootlist -m normal -o
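The bosboot and bootlist steps above can be combined in a small loop. A minimal sketch, assuming the two mirror disks are named hdisk0 and hdisk1 (substitute your own disk names); the commands are echoed rather than executed:

```shell
# Rebuild the boot image on each mirror disk, then set the bootlist.
# hdisk0/hdisk1 are placeholder names for the two rootvg mirror disks.
disks="hdisk0 hdisk1"
out=$(for d in $disks; do
    echo "bosboot -ad /dev/$d"
done)
echo "$out"
echo "bootlist -m normal $disks"
```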
Scenario 2: Bad disk in rootvg of the VIO server
Usually rootvg utilizes some kind of disk protection; most often it consists of disks which are LVM mirrored. To replace a mirrored hdisk in the rootvg of a VIO server you can use VIOS commands or root AIX commands (to become root, use the oem_setup_env command). In this example we will use VIOS commands, since this is the recommended way of managing a VIOS.
Break the mirror:
$ unmirrorios hdiskX     (where hdiskX is the bad disk)
Check if any LVs remained on the bad disk:
$ lspv -lv hdiskX
If there are any (e.g. lg_dumplv, the dump device), migrate them to the other disk or remove them (a dump device can be recreated later):
$ migratepv -lv lv_name hdiskX hdiskY
or
$ rmlv -f lv_name
Take the failed disk out of rootvg:
$ reducevg rootvg hdiskX
Use the “Hot Plug” procedure to replace the failed disk:
$ diagmenu
-> select “Task Selection”
-> select “Hot Plug Task”
-> select “SCSI and SCSI RAID Hot Plug Manager”
-> select “Replace/Remove a Device Attached to an SCSI Hot Swap Enclosure”
Configure the new disk:
$ cfgdev
Verify that the new disk came back with the same number as the previous one, then rebuild the mirror:
$ lspv
$ extendvg rootvg hdiskX
$ mirrorios -defer
Note that if you do not use the -defer option, your VIO server will be rebooted after mirroring completes.
Check the bootlist to ensure that both disks are included as boot devices:
$ bootlist -mode normal -ls
hdisk0 blv=hd5
hdisk1 blv=hd5
Use the command below to include both disks if they do not show up in the bootlist:
$ bootlist -mode normal hdisk0 hdisk1
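A quick scripted version of this check: assuming the output of `bootlist -mode normal -ls` has been captured into a variable (the sample text below simply mirrors the listing above), verify that both disks appear:

```shell
# Verify both mirror disks are present in saved bootlist output.
# The sample output and disk names are taken from the listing above.
bl="hdisk0 blv=hd5
hdisk1 blv=hd5"
missing=0
for d in hdisk0 hdisk1; do
    echo "$bl" | grep -q "^$d " || { echo "$d missing from bootlist"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "bootlist OK"
```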