Replacing a ZFS System Disk on Proxmox

By | January 2, 2020

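All the Proxmox servers I set up earlier use a ZFS mirror for their system disks. A while ago SMART reported bad sectors on one of the drives, so I swapped in a new disk. These are my notes on the procedure.
First, check the state of the ZFS pool: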

root@g1-pve2:~# zpool status -v
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 0h41m with 0 errors on Sun Dec  8 01:05:19 2019
config:

	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sda2    ONLINE       0     0     0
	    sdb2    ONLINE       0     0     0

errors: No known data errors

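The pool reports no errors and both disks are still present. We already know from SMART that sdb is the failing disk. After pulling it out, check the status again: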

root@g1-pve2:~# zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
    invalid.  Sufficient replicas exist for the pool to continue
    functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	rpool       DEGRADED     0     0     0
	  mirror-0  DEGRADED     0     0     0
	    sda2    ONLINE       0     0     0
	    sdb2    FAULTED      0     0     0
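The device that needs replacing can also be picked out of this output mechanically. The following is only a minimal sketch: `list_unhealthy` is a hypothetical helper name (not part of ZFS), and it assumes the table layout shown above. It reads `zpool status` text on stdin and prints every entry whose state is not ONLINE.

```shell
# Print the name and state of every pool member that is not ONLINE.
# Reads `zpool status` text on stdin, so it can be tried on saved output.
list_unhealthy() {
    awk '
        # The device table starts after the NAME/STATE header row
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # A blank line or the "errors:" footer ends the table
        in_table && (NF == 0 || $1 == "errors:") { in_table = 0 }
        # Inside the table, column 1 is the device and column 2 its state
        in_table && $2 != "ONLINE" { print $1, $2 }
    '
}

# On a live system (assumes the pool is named rpool, as in this post):
#   zpool status rpool | list_unhealthy
```

Fed the degraded output above, it lists rpool, mirror-0 and sdb2. The leaf device (sdb2 here) is the one to pass to zpool replace later.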

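sdb2 now shows as FAULTED. Our disk was hot-unplugged; it seems that after a cold removal the device is shown as a long numeric ID instead. Note down the device to be replaced (sdb2, or that ID) for later use. Then insert the new disk and find its device name: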

ls -lh /dev/disk/by-id
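Matching the serial number by eye is easy to get wrong on a host with many disks. As a rough sketch, the lookup can be narrowed to entries whose name contains the serial. `find_disk_by_serial` is a hypothetical helper, and the optional second argument (search directory) exists only so the function can be tried outside /dev.

```shell
# Resolve /dev/disk/by-id entries whose name contains a serial number.
# $1: serial printed on the drive label
# $2: directory to search (defaults to /dev/disk/by-id)
find_disk_by_serial() {
    serial=$1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*"$serial"*; do
        [ -e "$link" ] || continue                      # glob matched nothing
        case $link in *-part[0-9]*) continue ;; esac    # skip partition entries
        # Show the by-id name and the device node it points to
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

# Example (the serial is a placeholder, use the one on your drive label):
#   find_disk_by_serial WD-XXXXXXXX
```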

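After inserting the new disk, match the serial number (SN) printed on the drive label against the output of the command above to find its device name. In my case it is sdb again, probably because the disk was hot-swapped. Then copy the partition table from the healthy disk sda and assign new random GUIDs to the new disk.
Warning! The commands below are dangerous: sgdisk -R takes the target (new) disk first and the source (healthy) disk last, and swapping them overwrites the healthy disk's partition table.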

#sgdisk -R /dev/NEW_DISK /dev/HEALTHY_DISK
#sgdisk -G /dev/NEW_DISK
sgdisk -R /dev/sdb /dev/sda
sgdisk -G /dev/sdb
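Because the argument order of sgdisk -R is easy to get backwards, one cautious approach is to print the commands and read them before running anything. The sketch below does only that. `plan_table_copy` is a hypothetical helper, and it deliberately never touches a disk itself.

```shell
# Print (not run) the partition-table commands after basic sanity checks.
# $1: healthy source disk, $2: new target disk (whole disks, e.g. /dev/sda)
plan_table_copy() {
    src=$1
    dst=$2
    if [ -z "$src" ] || [ -z "$dst" ]; then
        echo "usage: plan_table_copy HEALTHY_DISK NEW_DISK" >&2
        return 2
    fi
    if [ "$src" = "$dst" ]; then
        echo "refusing: source and target are the same disk" >&2
        return 1
    fi
    # sgdisk -R takes the TARGET first and the SOURCE last
    echo "sgdisk -R $dst $src"
    # Randomise the GUIDs on the target so the two disks do not clash
    echo "sgdisk -G $dst"
}

plan_table_copy /dev/sda /dev/sdb
```

With the device names from this post, the printed lines are the same two commands shown above. They can then be pasted once they have been checked against the serial-number lookup.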

Then copy the data of the small non-ZFS partitions (partition 2 is rebuilt by ZFS itself in the next step):

#dd if=/dev/HEALTHY_PARTITION of=/dev/NEW_PARTITION bs=512
dd if=/dev/sda1 of=/dev/sdb1 bs=512
dd if=/dev/sda9 of=/dev/sdb9 bs=512
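A quick way to confirm the copies is a byte-for-byte comparison with cmp. This is only a sketch: `verify_copy` is a hypothetical helper, and it assumes source and target partitions have the same size, which should hold after the partition table was replicated.

```shell
# Compare two partitions (or files) byte for byte and report the result.
# Returns non-zero when they differ or cannot be read.
verify_copy() {
    if cmp -s "$1" "$2"; then
        echo "match: $1 $2"
    else
        echo "DIFFER: $1 $2"
        return 1
    fi
}

# With the device names used in this post:
#   verify_copy /dev/sda1 /dev/sdb1
#   verify_copy /dev/sda9 /dev/sdb9
```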

Then add the new disk back into the zpool and install the GRUB bootloader on it:

#zpool replace rpool OLD_DEVICE_OR_ID /dev/NEW_DEVICE
zpool replace rpool sdb2 /dev/sdb2
grub-install /dev/sdb

Finally, wait for the resilver to finish, then reboot:

root@g1-pve2:~# zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Dec  8 01:20:19 2019
    91.9M scanned out of 1.87T at 7.66M/s, 71h0m to go
    22.6M resilvered, 0.00% done
config:

	NAME               STATE     READ WRITE CKSUM
	rpool              DEGRADED     0     0     0
	  mirror-0         DEGRADED     0     0     0
	    sda2           ONLINE       0     0     0
	    replacing-1    UNAVAIL      0     0     0
	      sdb2         FAULTED      0     0     0  was /dev/sdb2
	      sdb2         ONLINE       0     0     0  (resilvering)

errors: No known data errors
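The "71h0m to go" figure can be sanity-checked from the other numbers on the scan line. The sketch below assumes that zpool's T and M suffixes are binary units (TiB and MiB). `resilver_eta` is a hypothetical helper, and the estimate only holds while the scan rate stays constant, which it rarely does this early in a resilver.

```shell
# Estimate remaining resilver time.
# $1: total size in TiB, $2: amount already scanned in MiB, $3: rate in MiB/s
resilver_eta() {
    awk -v total_t="$1" -v done_m="$2" -v rate="$3" 'BEGIN {
        remaining_m = total_t * 1024 * 1024 - done_m   # TiB -> MiB, minus progress
        secs = remaining_m / rate
        printf "%dh%dm\n", int(secs / 3600), int(secs % 3600 / 60)
    }'
}

# Values taken from the scan line above
resilver_eta 1.87 91.9 7.66
```

The result lands within a few minutes of what zpool reports; the small difference comes from the rounded figures in the status line.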

Once every state above shows ONLINE, you can reboot.
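Rather than re-running zpool status by hand for hours, the check can be scripted. This is a minimal sketch assuming a single pool in the status text. `pool_is_healthy` is a hypothetical helper that succeeds only when the pool state is ONLINE and no resilver is in progress.

```shell
# Succeed only when the status text on stdin shows state ONLINE
# and no resilver in progress. Assumes output for a single pool.
pool_is_healthy() {
    awk '
        $1 == "state:" { state = $2 }
        /resilver in progress/ { busy = 1 }
        END { exit !(state == "ONLINE" && !busy) }
    '
}

# Poll once a minute until the resilver is done (pool name as in this post):
#   until zpool status rpool | pool_is_healthy; do sleep 60; done
```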

References:
https://edmondscommerce.github.io/replacing-failed-drive-in-zfs-zpool-on-proxmox/
https://www.oxcrag.net/2018/09/02/replacing-zfs-system-drives-in-proxmox/
