
How Do I Restore a Disk After It Is Mistakenly Detached from a Storage Pool?

A storage pool is a custom resource (nodelocalvolumes) created by Everest. It is not recommended that you manually perform any operations on this type of resource under normal circumstances. Everest scans for idle disks every minute and verifies that the disks added to the storage pool are functioning properly.

Everest uses LVM to manage storage pools. Local PVs and local ephemeral volumes (EVs) each correspond to a volume group (VG) in LVM.

  • VG used by the local PVs: vg-everest-localvolume-persistent
  • VG used by the local EVs: vg-everest-localvolume-ephemeral

This section describes how to restore a local PV. To restore a local EV, use the corresponding VG.
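
If you are not sure which VG the affected storage pool uses, you can log in to the node and list the Everest VGs. The following is only a quick check that assumes the default VG names listed above:

vgs --noheadings -o vg_name | grep everest-localvolume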

The guide in this section is only intended for restoring a storage pool that became unavailable because a disk was accidentally detached from it. Once the storage pool is restored, you can import PVs or EVs again, but the original data cannot be recovered.

Symptom

A disk in a storage pool is detached by mistake, and the storage pool on the node becomes unavailable.

Fault Locating

Run the following kubectl command to check the nodelocalvolumes resource of the affected node (192.168.1.137 in this example):

kubectl get nodelocalvolumes.localvolume.everest.io -n kube-system 192.168.1.137 -o yaml

The message in status shows that the device /dev/vde is lost, and the phase is Unavailable.

...
status:
  lastUpdateTime: "2024-07-16T07:13:55Z"
  message: the device 6eef2886-f5ad-4b3f-a:/dev/vde is lost
  phase: Unavailable
  totalDisks:
  - capacity: 100Gi
    name: /dev/vdb
    uuid: 3511993c-61e6-4faa-a
  usedDisks:
  - totalSize: 102392Mi
    type: persistent
    usedSize: 10Gi
    volume-group: vg-everest-localvolume-persistent
    volumes:
    - capacity: 50Gi
      name: /dev/vdd
      pv-uuid: Jo9Uur-evVi-RLWM-yaln-J6rz-3QCo-aHpvwB
      uuid: 40b8b92b-5852-4a97-9
    - capacity: 50Gi
      name: /dev/vde
      pv-uuid: ZxA9kY-5C28-96Z9-ZjOE-dCrc-yTgp-DOhUHo
      uuid: 6eef2886-f5ad-4b3f-a
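
If you are not sure which node is affected, you can also list all nodelocalvolumes resources with their phases and messages. The custom-columns query below is only an example; the column names are arbitrary:

kubectl get nodelocalvolumes.localvolume.everest.io -n kube-system \
  -o custom-columns=NODE:.metadata.name,PHASE:.status.phase,MESSAGE:.status.message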

Solution

  1. Restore the nodelocalvolumes resource.

    kubectl edit nodelocalvolumes.localvolume.everest.io -n kube-system 192.168.1.137

    In the editor, remove the lost disk from the volumes of type persistent in spec, and delete the status field from the resource.
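
    If you want to keep a copy of the original resource for later reference, you can export it before editing. The backup file name below is only an example:

    kubectl get nodelocalvolumes.localvolume.everest.io -n kube-system 192.168.1.137 -o yaml > nodelocalvolume-192.168.1.137-backup.yaml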

  2. Remove the corresponding PV from the VG.

    LVM stores VG metadata on the disks that belong to the VG.
    • If a VG spans multiple disks and some of them are missing, LVM commands report the missing PVs.
    • If the only disk in a VG is removed, the VG no longer appears in the vgdisplay output. If vgdisplay shows neither vg-everest-localvolume-persistent nor vg-everest-localvolume-ephemeral, skip this step.
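
    Before removing anything, you can check which PVs LVM still expects in the VG. A detached disk is typically reported with a warning that the device with the corresponding UUID cannot be found:

    pvs -o pv_name,vg_name,pv_uuid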

    Run the following command to remove all lost PVs from the VG. The VG name for local PVs is vg-everest-localvolume-persistent. If you are restoring a local EV, use vg-everest-localvolume-ephemeral instead.

    vgreduce --removemissing vg-everest-localvolume-persistent

    Information similar to the following is displayed:

      WARNING: Couldn't find device with uuid ZxA9kY-5C28-96Z9-ZjOE-dCrc-yTgp-DOhUHo.
      WARNING: VG vg-everest-localvolume-persistent is missing PV ZxA9kY-5C28-96Z9-ZjOE-dCrc-yTgp-DOhUHo (last written to /dev/vde).
      WARNING: Couldn't find device with uuid ZxA9kY-5C28-96Z9-ZjOE-dCrc-yTgp-DOhUHo.
      Wrote out consistent volume group vg-everest-localvolume-persistent.

    Run vgdisplay again and check that the command output is normal.

    vgdisplay vg-everest-localvolume-persistent

    Information similar to the following is displayed:

    --- Volume group ---
      VG Name               vg-everest-localvolume-persistent
      System ID             
      Format                lvm2
      Metadata Areas        1
      Metadata Sequence No  4
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                1
      Open LV               1
      Max PV                0
      Cur PV                1
      Act PV                1
      VG Size               <50.00 GiB
      PE Size               4.00 MiB
      Total PE              12799
      Alloc PE / Size       2560 / 10.00 GiB
      Free  PE / Size       10239 / <40.00 GiB
      VG UUID               31LHdA-yZPV-M7JX-ttwK-aynz-IyxY-usp22p
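
    Optionally, you can also list the logical volumes that remain in the VG to confirm that the existing local PV is still present. The output depends on your environment:

    lvs vg-everest-localvolume-persistent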

  3. Restart everest-csi-driver on the corresponding node.

    1. Check the pod name of everest-csi-driver.
      kubectl get pod -A -owide
      Information similar to the following is displayed:
      NAMESPACE     NAME                                      READY   STATUS             RESTARTS     AGE     IP               NODE            NOMINATED NODE   READINESS GATES
      kube-system   everest-csi-driver-7clbg                  1/1     Running            0            5d4h    192.168.1.38     192.168.1.38    <none>           <none>
      kube-system   everest-csi-driver-jvj9f                  1/1     Running            0            5d4h    192.168.1.137    192.168.1.137   <none>           <none>
    2. Delete the pod on the node whose IP address is 192.168.1.137.
      kubectl delete pod -n kube-system everest-csi-driver-jvj9f
    3. Verify that the nodelocalvolumes status becomes normal, as shown below, and then continue to import PVs.
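      For example, the following jsonpath query prints the current phase of the storage pool on this node. After the pod restarts, the phase should no longer be Unavailable:
      kubectl get nodelocalvolumes.localvolume.everest.io -n kube-system 192.168.1.137 -o jsonpath='{.status.phase}{"\n"}'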

  4. Restart the node.

    Sometimes, even after resolving the issue, CCE Node Problem Detector may still show the node as unavailable due to abnormal metrics. If this happens, restarting the node can solve the problem.