ZFS Disk Replacement
The process of replacing mirrored zfs disks is fairly simple. The changes are done by zpool attach and detach.
zpool detach <pool> <disk-id>
zpool attach <pool> <disk-id-to-mirror> <disk-id-mirrored-to>
The heavy lifting is done by zfs itself.
process
PREP
- use the pdf article link to print this before going down
- If possible pre wipe and check the disks on a separate linux machine (note: /dev/sdf is an placeholder for the disk mounted on that system)
``` root@homebox:~# wipefs -af --backup /dev/sdf /dev/sdf: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54 /dev/sdf: 8 bytes were erased at offset 0x222ee64e00 (gpt): 45 46 49 20 50 41 52 54 /dev/sdf: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa /dev/sdc: calling ioctl to re-read partition table: Success root@homebox:~# fdisk /dev/sdf .... Command (m for help): g
Created a new GPT disklabel (GUID: EBC5A0C9-E871-544F-A8EA-E31FCA655F9C).
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
root@homebox:~# badblocks /dev/sdf
....
```
- insure that you can ssh into the box
On site
The following assumes you have escalated to root privileges (sudo bash), in this case we are replacing /dev/sdc and /dev/sdd in the pool named 'level'
- check for the correct disk.
The following should cause the disk to light up\
(
C when you have identified the disk. Careful with the if/of here) .
root@bs2020:~# dd if=/dev/sdc of=/dev/null
- find the disk in the pool.
``` root@bs2020:~# zpool status pool: devel state: ONLINE scan: resilvered 9.95G in 0h4m with 0 errors on Sat Nov 10 22:00:41 2018 config:
NAME STATE READ WRITE CKSUM
devel ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
scsi-35000c50054fee503 ONLINE 0 0 0
scsi-35000c5005501b45b ONLINE 0 0 0
errors: No known data errors
...
root@bs2020:~# ls -ls /dev/disk/by-id/|grep scsi|grep -v "\-part"
0 lrwxrwxrwx 1 root root 9 Nov 10 21:22 scsi-350000395a8336d34 -> ../../sde
0 lrwxrwxrwx 1 root root 9 Nov 10 21:22 scsi-35000c50054fee503 -> ../../sdd
0 lrwxrwxrwx 1 root root 9 Nov 10 21:56 scsi-35000c5005501b45b -> ../../sdc
0 lrwxrwxrwx 1 root root 9 Nov 10 21:22 scsi-35000cca00b33a264 -> ../../sdf
0 lrwxrwxrwx 1 root root 9 Nov 10 21:22 scsi-3600508e00000000069cf3977618f1408 -> ../../sdg
root@bs2020:~#
``` We notice above that the disk we are looking for is scsi-35000c5005501b45b
- detach the disk from the pool.
``` root@bs2020:~# zpool detach devel scsi-35000c5005501b45b root@bs2020:~# zpool status pool: devel state: ONLINE scan: resilvered 9.95G in 0h4m with 0 errors on Sat Nov 10 22:00:41 2018 config:
NAME STATE READ WRITE CKSUM
devel ONLINE 0 0 0
scsi-35000c50054fee503 ONLINE 0 0 0
errors: No known data errors
pool: infra
...
root@bs2020:~#
```
- even if expanding the disk size insure that auto expand is off.
root@bs2020:~# zpool set autoexpand=off devel
-
Swap out the old disk with the new one.
-
find the new disk's id.
root@bs2020:~# partprobe
root@bs2020:~# ls -ls /dev/disk/by-id/|grep sdc
0 lrwxrwxrwx 1 root root 9 Nov 10 21:56 scsi-xxxxxxxxxxxxxxx -> ../../sdc
...
0 lrwxrwxrwx 1 root root 9 Nov 10 21:56 xxx-xxxxxxxxxxxxxxx -> ../../sdc
-
If the drive id does not change reboot the server
-
attach the new disk to the zfs pool (scsi-xxxxxxxxxxxxxxxx is the new id from the above step)
root@bs2020:~# zpool attach devel scsi-35000c50054fee503 scsi-xxxxxxxxxxxxxx
- wait for pool to resliver
```
root@bs2020:~# zpool status
pool: devel
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Sat Nov 10 21:56:04 2018
8.54G scanned out of 9.95G at 35.5M/s, 0h0m to go
8.54G resilvered, 85.85% done
config:
NAME STATE READ WRITE CKSUM
devel ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
scsi-35000c50054fee503 ONLINE 0 0 0
scsi-35000c5005501b45b ONLINE 0 0 0 (resilvering)
errors: No known data errors
pool: infra
...
root@bs2020:~# zpool status
.... repeat until finished reslivering ....
root@bs2020:~# zpool status
pool: devel
state: ONLINE
scan: scrub repaired 0B in 0h4m with 0 errors on Sat Nov 10 21:58:04 2018
config:
NAME STATE READ WRITE CKSUM
devel ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
scsi-35000cca00b33a264 ONLINE 0 0 0
scsi-350000395a8336d34 ONLINE 0 0 0
errors: No known data errors
...
```
- if expanding disk check for new size and if not expand it
# zfs list and check for larger disk pool
- repeat process for disk in bay below (we already know its old id from above).
root@bs2020:~# dd if=/dev/sdd of=/dev/null
root@bs2020:~# zpool detach devel scsi-35000c50054fee503
... swap disks ...
root@bs2020:~# partprobe
root@bs2020:~# ls -ls /dev/disk/by-id/|grep sdd
0 lrwxrwxrwx 1 root root 9 Nov 10 21:56 scsi-yyyyyyyyyyyyyyyy-> ../../sdd
... reboot if necessary ...
root@bs2020:~# wipefs -a /dev/sdd
...
root@bs2020:~# fdisk /dev/sdd
...
root@bs2020:~# zpool attach devel scsi-xxxxxxxxxxxxxx scsi-yyyyyyyyyyyyyyy
... wait for resliver...
* use the process below to grow disks to new size
# zpool set autoexpand=on devel
# zpool online -e devel scsi-xxxxxxxxxxxxxxxxxxxx
# zpool online -e devel scsi-yyyyyyyyyyyyyyyyyyy
# zpool set autoexpand=off devel
references
- https://tomasz.korwel.net/2014/01/03/growing-zfs-pool/
- https://jsosic.wordpress.com/2013/01/01/expanding-zfs-zpool-raid/
- https://serverfault.com/questions/5336/how-do-i-make-linux-recognize-a-new-sata-dev-sda-drive-i-hot-swapped-in-without