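All of the Proxmox servers I have set up use a ZFS mirror for the system disk. A while ago SMART reported bad sectors on one of the drives, so I swapped in a new one. These notes record the procedure.
First, check the state of the ZFS pool: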
root@g1-pve2:~# zpool status -v
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 0h41m with 0 errors on Sun Dec 8 01:05:19 2019
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0

errors: No known data errors
The pool reports no errors and both disks are still present. SMART flagged sdb as the failing disk. After pulling it out, the status looks like this:
root@g1-pve2:~# zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            sda2    ONLINE       0     0     0
            sdb2    FAULTED      0     0     0
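Reading the status by eye works for a two-disk mirror, but the same check can be scripted. The following is only a sketch: `unhealthy_devices` is a helper name I made up, and it assumes the `NAME STATE READ WRITE CKSUM` table layout shown in the output above.

```shell
# unhealthy_devices: read `zpool status` text on stdin and print
# "<name> <state>" for every entry in the config table that is not ONLINE.
# Hypothetical helper; assumes the table layout shown above.
unhealthy_devices() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }  # table header
        in_config && NF == 0          { in_config = 0 }        # blank line ends the table
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}
```

Used as `zpool status rpool | unhealthy_devices`, this would list rpool, mirror-0 and sdb2 for the degraded pool above, since the pool and the mirror inherit the degraded state from the faulted member.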
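sdb2 now shows as FAULTED because I hot-unplugged the disk. If the disk is removed while the machine is powered off, I believe the pool shows a long numeric ID instead. Note down the device to be replaced (sdb2, or that ID) because it is needed later. Then insert the new disk and find its device name: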
ls -lh /dev/disk/by-id
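After inserting the new disk, match the serial number (SN) printed on the drive label against the output of the command above to find its device name. In my case it was sdb again, probably because I hot-swapped the disk. Next, copy the partition table from the healthy disk sda to the new disk and give the new disk fresh random GUIDs.
Warning! The commands below are destructive: swapping the two device names would overwrite the partition table of the healthy disk.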
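Matching the serial number by eye can also be done with a small lookup over the by-id symlinks. This is a rough sketch, not a tested tool: `find_disk_by_serial` is a hypothetical helper, and it assumes the serial appears in the symlink name and that partition links end in `-partN`, which is how the entries under /dev/disk/by-id usually look.

```shell
# find_disk_by_serial SERIAL [DIR]
# Print the device node(s) whose by-id symlink name contains SERIAL.
# Hypothetical helper; DIR defaults to /dev/disk/by-id.
find_disk_by_serial() (
    serial=$1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*"$serial"*; do
        [ -e "$link" ] || continue                       # glob matched nothing
        case $link in *-part[0-9]*) continue ;; esac     # skip partition links
        readlink -f "$link"                              # resolve to e.g. /dev/sdb
    done | sort -u
)
```

For example, `find_disk_by_serial <serial from the label>` should print a single path such as /dev/sdb. If it prints nothing or more than one path, fall back to reading the `ls -lh` output manually.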
# sgdisk -R /dev/<new disk> /dev/<healthy disk>
# sgdisk -G /dev/<new disk>
sgdisk -R /dev/sdb /dev/sda
sgdisk -G /dev/sdb
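Because a hot-swapped disk can reuse the old device name, it may be worth confirming that the target disk is not an active pool member before running the sgdisk commands above. A minimal sketch, assuming sdX-style names where partitions are the disk name plus a number (NVMe names such as nvme0n1p2 would need a different pattern); `disk_is_safe_target` is a name I invented for illustration:

```shell
# disk_is_safe_target DISK   (DISK like "sdb"; `zpool status` text on stdin)
# Exit status 0 only if no pool member on DISK is currently ONLINE.
# Hypothetical helper; assumes sdX naming.
disk_is_safe_target() {
    awk -v disk="$1" '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        in_config && NF == 0          { in_config = 0 }
        in_config && $2 == "ONLINE" && $1 ~ ("^" disk "[0-9]*$") { busy = 1 }
        END { exit (busy ? 1 : 0) }
    '
}
```

A guarded invocation could then look like `zpool status rpool | disk_is_safe_target sdb && sgdisk -R /dev/sdb /dev/sda`, which skips the copy if sdb still holds an ONLINE member.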
Then copy the partition data:
# dd if=/dev/<healthy disk partition> of=/dev/<new disk partition> bs=512
dd if=/dev/sda1 of=/dev/sdb1 bs=512
dd if=/dev/sda9 of=/dev/sdb9 bs=512
Then put the new partition back into the zpool and install the GRUB bootloader:
# zpool replace rpool <old device name or ID> /dev/<new device>
zpool replace rpool sdb2 /dev/sdb2
grub-install /dev/sdb
Finally, wait for the resilver to finish, then reboot:
root@g1-pve2:~# zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Dec 8 01:20:19 2019
        91.9M scanned out of 1.87T at 7.66M/s, 71h0m to go
        22.6M resilvered, 0.00% done
config:

        NAME             STATE     READ WRITE CKSUM
        rpool            DEGRADED     0     0     0
          mirror-0       DEGRADED     0     0     0
            sda2         ONLINE       0     0     0
            replacing-1  UNAVAIL      0     0     0
              sdb2       FAULTED      0     0     0  was /dev/sdb2
              sdb2       ONLINE       0     0     0  (resilvering)

errors: No known data errors
Once every state in the output above is ONLINE, it is safe to reboot.
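With an estimate of 71 hours, polling by hand gets tedious. The wait could be scripted roughly as follows; both function names are my own, and the sketch assumes the `state:` line format shown in the output above.

```shell
# pool_state: print the value of the "state:" line from `zpool status`
# text on stdin (for example ONLINE or DEGRADED). Hypothetical helper.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# wait_until_online POOL [SECONDS]
# Poll `zpool status POOL` until the pool state is ONLINE.
# Note: this loops forever if the pool never recovers.
wait_until_online() {
    pool=$1
    interval=${2:-60}
    until [ "$(zpool status "$pool" | pool_state)" = "ONLINE" ]; do
        sleep "$interval"
    done
}
```

Running `wait_until_online rpool 300` would check every five minutes and return once the pool is back to ONLINE. It is still worth looking at the full `zpool status -v` output afterwards to confirm there are no data errors.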
References:
https://edmondscommerce.github.io/replacing-failed-drive-in-zfs-zpool-on-proxmox/
https://www.oxcrag.net/2018/09/02/replacing-zfs-system-drives-in-proxmox/