SnapRAID Change Data Disk
Overview
This guide covers how to replace a working drive in a functioning SnapRAID array with another working drive. The new drive can be the same size as or larger than the old drive, as long as it's not larger than the largest parity volume.
Before starting this process, it is advised to test the new drive. Badblocks is a good drive testing utility for Linux.
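The size rule above can be checked with a few lines of shell before any hardware is moved. This is only a rough sketch: size_ok is a hypothetical helper name, and it assumes all three sizes are given in bytes, for example as printed by lsblk -bdno SIZE. Partition size and usable filesystem size differ slightly, so treat the result as a sanity check only.

```shell
# size_ok OLD NEW PARITY
# Hypothetical helper: succeeds when NEW is at least as large as OLD
# and no larger than PARITY. All three sizes are in bytes.
size_ok() {
    old=$1
    new=$2
    parity=$3
    [ "$new" -ge "$old" ] && [ "$new" -le "$parity" ]
}

# Example usage (device names are assumptions, adjust to your system):
#   size_ok "$(lsblk -bdno SIZE /dev/sdc1)" \
#           "$(lsblk -bdno SIZE /dev/sdb1)" \
#           "$(lsblk -bdno SIZE /dev/sdd1)" && echo "size is fine"
```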
Synopsis
- Make sure nothing is writing to the array.
- Run the SnapRAID commands diff and sync until the diff command results in 0 changes.
- Identify the drive in the array to be replaced.
- Copy all the files, including hidden files from the old drive to the new drive.
- Remove the old drive from the SnapRAID config file.
- Add the new drive to the SnapRAID config file.
- Run the SnapRAID command diff. It should identify that a drive has changed, but no data should have changed.
- Return to business as usual with the array.
SnapRAID
I run SnapRAID as a plugin on an OpenMediaVault virtual machine running in Proxmox. The drives used for storage are connected to two LSI MegaRAID cards, both of which are passed through to the OpenMediaVault VM. In addition I use MergerFS through the OMV Union Filesystems plugin to pool all the disks together into a single volume. This setup requires a few more steps before we can change a SnapRAID data drive.
Note that SnapRAID also has the ability to pool drives, but it's not as robust as MergerFS.
Prepare
-
Attach the new drive to an external USB dock that is accessible from the Proxmox server. Most of the instructions here are done on the CLI; UI steps will be noted as such.
-
Identify the new drive using lsblk. The example below shows sdb as a 3.7TB disk with no associated partitions. A brand new disk would look this way, but if you’re re-using a drive it could have partitions.
lsblk -o NAME,PATH,FSTYPE,MOUNTPOINT,LABEL,UUID,SIZE
NAME   PATH      FSTYPE MOUNTPOINT LABEL        UUID                                 SIZE
sda    /dev/sda                                                                      2.8T
└─sda1 /dev/sda1 ext4              201907d001   cdee8e5f-54b3-4f1f-afeb-4d414b7e94aa 2.8T
sdb    /dev/sdb                                                                      3.7T
sdc    /dev/sdc                                                                      2.8T
└─sdc1 /dev/sdc1 ext4              201907d002   0cdf8cd4-5c49-460a-a65a-6e340b53a2a4 2.8T
sdd    /dev/sdd                                                                      3.7T
└─sdd1 /dev/sdd1 ext4              202003d003   d5a72cf7-8b04-4475-9860-9c48735b3f23 3.7T
There might also be one or more drives with many partitions. Take note of these drives and do not touch them in any way! These drives are usually used by Proxmox, and are easier to manage through the UI.
-
Use Badblocks to verify the disk is free of errors. The command below is nondestructive; replace the n flag with w to do a destructive write test. This can take hours on small drives and days on large drives. The 4TB refurbished enterprise drives I use take about 2 days to finish.
sudo badblocks -nsv /dev/sdb >badsectors.log
-
Set up partitions on the drive using fdisk.
sudo fdisk /dev/sdb
The output might look like the example below, which shows an 8 GiB device. Note that a DOS (MBR) partition table cannot address more than 2 TiB, so on a larger drive, such as the 3.7TB one used in this guide, create a GPT partition table with the g command before adding the partition.
The steps to create a new partition begin at line 8.
Welcome to fdisk (util-linux 2.33.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xe96324a1.

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-16777215, default 2048):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-16777215, default 16777215):

Created a new partition 1 of type 'Linux' and of size 8 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
- Line 8 (first Command prompt): The command n creates a new partition.
- Line 12 (Select prompt): The command p makes it a primary partition.
- Lines 13 - 15 (partition number, first sector, last sector): Can be left blank as the default values are fine.
- Line 19 (second Command prompt): The command w writes the changes to disk.
-
Format the drive using mkfs. Ext4 is a good enough option for most cases. The L option gives the partition a label, newdatadrive, for easy identification later.
sudo mkfs.ext4 -L newdatadrive /dev/sdb1
mke2fs 1.44.5 (15-Dec-2018)
Discarding device blocks: done
Creating filesystem with 2096896 4k blocks and 524288 inodes
Filesystem UUID: f4adcf58-380b-4c6c-b991-feb47e02b15a
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
-
Use lsblk to verify the new partition.
lsblk -o NAME,PATH,FSTYPE,MOUNTPOINT,LABEL,UUID,SIZE
NAME   PATH      FSTYPE MOUNTPOINT LABEL        UUID                                 SIZE
sda    /dev/sda                                                                      2.8T
└─sda1 /dev/sda1 ext4              201907d001   cdee8e5f-54b3-4f1f-afeb-4d414b7e94aa 2.8T
sdb    /dev/sdb                                                                      3.7T
└─sdb1 /dev/sdb1 ext4              newdatadrive f4adcf58-380b-4c6c-b991-feb47e02b15a 3.7T
sdc    /dev/sdc                                                                      2.8T
└─sdc1 /dev/sdc1 ext4              201907d002   0cdf8cd4-5c49-460a-a65a-6e340b53a2a4 2.8T
sdd    /dev/sdd                                                                      3.7T
└─sdd1 /dev/sdd1 ext4              202003d003   d5a72cf7-8b04-4475-9860-9c48735b3f23 3.7T
-
We need to access this new volume later. To mount it, first create a directory to mount it to.
sudo mkdir /mnt/sdb/
-
Mount the new volume using mount.
sudo mount /dev/sdb1 /mnt/sdb
After this point NOTHING should be writing to the array.
All important data should be backed up.
Triple check every command!
!! DATA LOSS MAY OCCUR !!
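The identification step above picks out the new disk by eye. As a rough cross-check, the sketch below lists disks that have no partitions at all. It assumes its input is the output of lsblk -rno NAME,TYPE,PKNAME (raw format, no header); unpartitioned_disks is a hypothetical helper name, not part of any tool. A re-used drive that still carries old partitions will not show up here.

```shell
# unpartitioned_disks
# Hypothetical helper: reads "NAME TYPE [PKNAME]" lines on stdin and
# prints every disk that is not the parent of any partition.
unpartitioned_disks() {
    awk '
        $2 == "disk" { disks[$1] = 1 }      # remember every whole disk
        $2 == "part" { has_part[$3] = 1 }   # remember parents of partitions
        END {
            for (d in disks)
                if (!(d in has_part))
                    print d
        }
    ' | sort
}

# Example usage on the Proxmox host:
#   lsblk -rno NAME,TYPE,PKNAME | unpartitioned_disks
```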
Open Media Vault
The steps below are done in the Open Media Vault CLI through ssh.
-
Run a SnapRAID diff command on the array.
sudo snapraid diff
Example of diff results.
Loading state from /srv/dev-disk-by-label-201907d001/snapraid.content...
Comparing...
...
...
...
  125315 equal
      10 added
       7 removed
       0 updated
       0 moved
       0 copied
       0 restored
There are differences!
-
Next run the SnapRAID sync command, then repeat the diff and sync (steps 9 and 10) until the diff returns no differences.
sudo snapraid sync
Example of sync results.
Self test...
Loading state from /srv/dev-disk-by-label-201907d001/snapraid.content...
Scanning disk 201907d001...
Scanning disk 201907d002...
Using 2235 MiB of memory for the file-system.
Initializing...
Resizing...
Saving state to /srv/dev-disk-by-label-201907d001/snapraid.content...
Saving state to /srv/dev-disk-by-label-201907d002/snapraid.content...
Verifying /srv/dev-disk-by-label-201907d001/snapraid.content...
Verifying /srv/dev-disk-by-label-201907d002/snapraid.content...
Verified /srv/dev-disk-by-label-201907d001/snapraid.content in 14 seconds
Verified /srv/dev-disk-by-label-201907d002/snapraid.content in 15 seconds
Syncing...
Using 96 MiB of memory for 32 cached blocks.
100% completed, 113784 MB accessed in 0:07 0:00 ETA

  201907d001  0% |
  201907d002  0% |
      parity  2% | *
        raid  1% | *
        hash  5% | ***
       sched  7% | ****
        misc  0% |
                 |_______________________________________________________
                            wait time (total, less is better)

Everything OK
Saving state to /srv/dev-disk-by-label-201907d001/snapraid.content...
Saving state to /srv/dev-disk-by-label-201907d002/snapraid.content...
Verifying /srv/dev-disk-by-label-201907d001/snapraid.content...
Verifying /srv/dev-disk-by-label-201907d002/snapraid.content...
Verified /srv/dev-disk-by-label-201907d001/snapraid.content in 15 seconds
Verified /srv/dev-disk-by-label-201907d002/snapraid.content in 16 seconds
-
Use lsblk with a few additional options to find the UUID, label, and mount points of the disks in the SnapRAID array. You want to identify the disk you want to remove within OMV, and relate that to the physical disk in Proxmox. Setting a good label when formatting the drive (step 5) generally makes the identification easier.
lsblk -o NAME,PATH,FSTYPE,MOUNTPOINT,LABEL,UUID,SIZE
NAME   PATH      FSTYPE MOUNTPOINT                        LABEL      UUID                                 SIZE
sda    /dev/sda                                                                                           16G
├─sda1 /dev/sda1 vfat   /boot/efi                                    52A1-DABE                            512M
├─sda2 /dev/sda2 ext4   /                                            b919071a-3a7a-4349-8bb3-6d04a2654964 14.5G
└─sda3 /dev/sda3 swap   [SWAP]                                       5d40e0e9-c5a5-4277-bf6a-d11a5daeb693 1020M
vda    /dev/vda                                                                                           2.8T
└─vda1 /dev/vda1 ext4   /srv/dev-disk-by-label-201907d001 201907d001 c089c2df-df90-487e-a380-acf581c2a734 2.8T
vdb    /dev/vdb                                                                                           2.8T
└─vdb1 /dev/vdb1 ext4   /srv/dev-disk-by-label-201907d002 201907d002 30ae1493-c449-4ec1-9825-752f88f2962e 2.8T
vdc    /dev/vdc                                                                                           3.7T
└─vdc1 /dev/vdc1 ext4   /srv/dev-disk-by-label-202003d003 202003d003 95fbd539-383b-4fe2-88da-81f65c17c82f 3.7T
-
Take note of the details of the disk to be removed, vdb. Note that the 3rd disk, vdc, of size 3.7T, is the parity drive. The existing data drives, vda and vdb, are smaller at 2.8T. The new 3.7T drive doesn’t appear here because it's not attached to the OMV VM yet. Some of this info can be found using the OMV UI too.
Name: vdb
Partition: vdb1
Partition Label: 201907d002
Path: /dev/vdb
Mount Point: /srv/dev-disk-by-label-201907d002
UUID: 30ae1493-c449-4ec1-9825-752f88f2962e
Size: 2.8T
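The diff and sync steps at the start of this section can be scripted so they repeat until the array is clean. The sketch below is an illustration under assumptions: it parses the summary counts in the same format as the example diff output above, and count_changes is a hypothetical helper name. Everything except the equal count is treated as a pending change.

```shell
# count_changes
# Hypothetical helper: reads `snapraid diff` output on stdin and prints
# the sum of all summary counts other than "equal".
count_changes() {
    awk '
        $1 ~ /^[0-9]+$/ &&
        $2 ~ /^(added|removed|updated|moved|copied|restored)$/ { total += $1 }
        END { print total + 0 }
    '
}

# Example usage: keep syncing until diff reports nothing to do.
#   until [ "$(sudo snapraid diff | count_changes)" -eq 0 ]; do
#       sudo snapraid sync
#   done
```

Parsing the printed summary keeps the sketch tied to the output shown in this guide; check the SnapRAID manual before relying on exit codes instead.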
Proxmox
This section takes place on the Proxmox host CLI.
-
Use the details from the previous step and the lsblk command below to identify vdb on the Proxmox node. If you set a good label when formatting the drive (step 5), this should be much easier. In the previous step vdb’s partition label is 201907d002, which belongs to sdc1 in the output below.
lsblk -o NAME,PATH,FSTYPE,MOUNTPOINT,LABEL,UUID,SIZE
NAME   PATH      FSTYPE MOUNTPOINT LABEL        UUID                                 SIZE
sda    /dev/sda                                                                      2.8T
└─sda1 /dev/sda1 ext4              201907d001   cdee8e5f-54b3-4f1f-afeb-4d414b7e94aa 2.8T
sdb    /dev/sdb                                                                      3.7T
└─sdb1 /dev/sdb1 ext4              newdatadrive f4adcf58-380b-4c6c-b991-feb47e02b15a 3.7T
sdc    /dev/sdc                                                                      2.8T
└─sdc1 /dev/sdc1 ext4              201907d002   0cdf8cd4-5c49-460a-a65a-6e340b53a2a4 2.8T
sdd    /dev/sdd                                                                      3.7T
└─sdd1 /dev/sdd1 ext4              202003d003   d5a72cf7-8b04-4475-9860-9c48735b3f23 3.7T
-
Mount the disk to be removed. This is similar to steps 7 and 8: create a directory, then mount the drive to it.
sudo mkdir /mnt/sdc
sudo mount /dev/sdc1 /mnt/sdc
-
Use rsync to copy all the files from the old data disk, sdc, to the new data disk, sdb. The n flag does a dry run and doesn’t actually copy anything. Remove it when you’re sure everything is good. The trailing / in /mnt/sdc/ is intentional; without it the command would create an sdc directory inside /mnt/sdb. The a flag (archive mode) copies everything recursively, including hidden files, and preserves metadata such as permissions, ownership, and timestamps. This will take a while, so sit back with a hot drink.
sudo rsync -avn /mnt/sdc/ /mnt/sdb
sending incremental file list
./
file.1
file.2
file.3
file.4
file.5
folder/
folder/file.1.1
folder/file.1.2

sent 266 bytes  received 44 bytes  620.00 bytes/sec
total size is 0  speedup is 0.00 (DRY RUN)
SnapRAID
Back to the Open Media Vault CLI to run the SnapRAID command below.
-
Run a diff on the SnapRAID array to verify nothing was written. This is similar to step 9.
sudo snapraid diff
-
If there are differences, double check the mounted drives to ensure you’re copying from and to the correct disks. This step should not be necessary, but if it is, slow down and recheck the previous steps. You might have written to the wrong disk, and if so you will need to check the SnapRAID docs to restore the written files.
Open Media Vault
The steps below are done on the Open Media Vault web GUI.
- Log in to the OMV UI and go to the SnapRAID section.
- Find the old drive in the Drives tab and remove it from the array. Apply the changes to OMV.
- Move to the Union Filesystems section.
- Remove the old drive from the union filesystem it belongs to. Deselect it from the list. Save and apply changes.
- Move to the File Systems section in OMV.
- Make sure the drive is not referenced, then unmount it. Apply the changes.
Replace
-
If hot swap is not supported, safely turn off the Proxmox node.
-
Carefully remove the physical drive, and replace it with the new data drive.
-
Restart the Proxmox node, then get the model and serial number of the new disk using lsblk. Device names can change after a swap or reboot, so identify the disk by its label, newdatadrive (sdb in the example output below).
lsblk -o name,label,model,serial
NAME   LABEL        MODEL                    SERIAL
sda                 WDC_WD4000FYYZ-05UL1B0   WD-WCC124631734
└─sda1 201907d001
sdb                 Hitachi_HUS724040ALE640  PK23654PAG7BP8T
└─sdb1 newdatadrive
sdc                 WDC_WD40EFRX-68N32N0     WD-WCC7K45XJ4LJ
└─sdc1 201907d002
sdd                 Hitachi_HUS724040ALE640  PK13241PAG6RZBS
└─sdd1 202003d003
-
Find the disk by its ID in /dev/disk/by-id/ using the Model and Serial.
ls /dev/disk/by-id/
ata-Hitachi_HUS724040ALE640_PK23654PAG7BP8T
ata-Hitachi_HUS724040ALE640_PK13241PAG6RZBS
ata-WDC_WD4000FYYZ-05UL1B0_WD-WCC124631734
ata-WDC_WD40EFRX-68N32N0_WD-WCC7K45XJ4LJ
-
Make sure the OMV VM is turned off, then pass the disk through to the OMV VM. In the command below, 100 is the VM ID and -virtio7 is the virtio slot the disk is attached to. Attaching another disk would require a different, unused virtio slot, such as -virtio8.
sudo qm set 100 -virtio7 /dev/disk/by-id/ata-Hitachi_HUS724040ALE640_PK23654PAG7BP8T
-
Restart and log back into the OMV VM, then go to the filesystems section.
-
Move back to the Union Filesystems section and edit the existing file system. Select to add the new drive in the list. Save and apply changes.
-
Move to the SnapRAID section and go to the Drive tab. Select the new drive in the list of data drives. Apply the changes.
-
Run a snapraid diff. SnapRAID should detect the disk change, but it shouldn’t find any differences. If it returns anything but equal, restored, or copied, check the directory structure on the new drive.
sudo snapraid diff
-
Run a snapraid check to verify the files on the new disk.
sudo snapraid check -a -d newdatadrive
-
Run a snapraid sync to finish up.
sudo snapraid sync
That’s it! At this point write access can be given back to the SnapRAID array.
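As an alternative to matching the model and serial by eye in the /dev/disk/by-id/ step above, the symlinks in that directory can be resolved back to the device they point at. The sketch below is only an illustration: by_id_for is a hypothetical helper name, and the optional second argument exists just so the function can be tried against a test directory. It relies on readlink -f, which is available on Debian-based systems such as Proxmox.

```shell
# by_id_for DEVICE [DIR]
# Hypothetical helper: prints every symlink in DIR (default
# /dev/disk/by-id) that resolves to DEVICE.
by_id_for() {
    dev=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Compare the fully resolved targets of both paths.
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
        fi
    done
    return 0
}

# Example usage on the Proxmox host (device name is an assumption):
#   by_id_for /dev/sdb
```

A disk usually has more than one entry in that directory (for example an ata- and a wwn- link), so expect several lines and pick the one that contains the model and serial.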