Backup and Restore CR issues
You might encounter the following common issues with Backup and Restore custom resources (CRs):
- Backup CR cannot retrieve volume
- Backup CR status remains in progress
- Backup CR status remains in the PartiallyFailed phase
Troubleshooting issue where backup CR cannot retrieve volume
If the persistent volume (PV) and the snapshot locations are in different regions, the Backup custom resource (CR) displays the following error message:
InvalidVolume.NotFound: The volume ‘vol-xxxx’ does not exist.
- Edit the value of the spec.snapshotLocations.velero.config.region key in the DataProtectionApplication manifest so that the snapshot location is in the same region as the PV.
- Create a new Backup CR.
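As a rough illustration of the region edit above, the following sketch builds a JSON Patch for the DataProtectionApplication and wraps an `oc patch` call around it. The function names, the assumption that the snapshot location to fix is the first entry in `spec.snapshotLocations` (index 0), and the use of the `openshift-adp` namespace are illustrative choices, not part of the documented procedure. Adjust them to match your manifest.

```shell
# Hypothetical helper: print a JSON Patch that sets the region of the
# FIRST snapshot location (index 0 is an assumption).
region_patch() {
  # $1 = region that the PV lives in
  printf '[{"op":"replace","path":"/spec/snapshotLocations/0/velero/config/region","value":"%s"}]' "$1"
}

# Hypothetical helper: apply the patch to a DataProtectionApplication.
# Requires cluster access; it is defined here but not called.
apply_region_patch() {
  # $1 = DataProtectionApplication name, $2 = region
  oc -n openshift-adp patch dataprotectionapplication "$1" \
    --type=json -p "$(region_patch "$2")"
}
```

Printing the patch with `region_patch` first lets you review the exact change before running `apply_region_patch` against the cluster.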
Troubleshooting issue where backup CR status remains in progress
If a backup is interrupted, it cannot be resumed. The status of the Backup custom resource (CR) remains in the InProgress phase and does not complete.
- Retrieve the details of the Backup CR by running the following command:
  $ oc -n {namespace} exec deployment/velero -c velero -- ./velero \
    backup describe <backup>
- Delete the Backup CR by running the following command:
  $ oc delete backups.velero.io <backup> -n openshift-adp
  You do not need to clean up the backup location because an in-progress Backup CR has not uploaded files to object storage.
- Create a new Backup CR.
- View the Velero backup details by running the following command:
  $ velero backup describe <backup_name> --details
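To check whether a backup is still reported as InProgress, you could filter the describe output for its phase. The sketch below assumes that the output contains a line of the form `Phase:  <phase>`. The function names are hypothetical, and the `openshift-adp` namespace is taken from the delete command above.

```shell
# Hypothetical helper: read `velero backup describe` output on stdin
# and print the value of the first "Phase:" line, if there is one.
backup_phase() {
  awk '$1 == "Phase:" { print $2; exit }'
}

# Hypothetical wrapper: describe a backup through the velero deployment
# and print only its phase. Requires cluster access; not called here.
describe_phase() {
  # $1 = backup name
  oc -n openshift-adp exec deployment/velero -c velero -- \
    ./velero backup describe "$1" | backup_phase
}
```

If `describe_phase` keeps returning `InProgress` long after the backup was interrupted, that matches the situation described in this section.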
Troubleshooting issue where backup CR status remains partially failed
The status of a Backup CR that does not use Restic remains in the PartiallyFailed phase and does not complete. A snapshot of the associated PVC is not created.
If the backup is based on a CSI snapshot class and the VolumeSnapshotClass is missing the velero.io/csi-volumesnapshot-class label, the CSI snapshot plugin fails to create a snapshot. As a result, the Velero pod logs an error similar to the following message:
time="2023-02-17T16:33:13Z" level=error msg="Error backing up item" backup=openshift-adp/user1-backup-check5 error="error executing custom action (groupResource=persistentvolumeclaims, namespace=busy1, name=pvc1-user1): rpc error: code = Unknown desc = failed to get volumesnapshotclass for storageclass ocs-storagecluster-ceph-rbd: failed to get volumesnapshotclass for provisioner openshift-storage.rbd.csi.ceph.com, ensure that the desired volumesnapshot class has the velero.io/csi-volumesnapshot-class label" logSource="/remote-source/velero/app/pkg/backup/backup.go:417" name=busybox-79799557b5-vprq
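To find out which storage classes are affected, you could scan the Velero logs for this error. The following sketch is one possible filter; the function name is hypothetical, and it assumes that the log lines follow the same wording as the example message above.

```shell
# Hypothetical helper: read Velero log lines on stdin and print each
# storage class named in a "missing label" error, one per line.
missing_label_storageclasses() {
  grep -F 'ensure that the desired volumesnapshot class has the velero.io/csi-volumesnapshot-class label' \
    | sed -n 's/.*failed to get volumesnapshotclass for storageclass \([^:]*\):.*/\1/p' \
    | sort -u
}

# Example usage (requires cluster access):
#   oc -n openshift-adp logs deployment/velero -c velero | missing_label_storageclasses
```

Each storage class that is printed needs a matching VolumeSnapshotClass that carries the label.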
- Delete the Backup CR by running the following command:
  $ oc delete backups.velero.io <backup> -n openshift-adp
- If required, clean up the stored data on the BackupStorageLocation resource to free up space.
- Apply the velero.io/csi-volumesnapshot-class=true label to the VolumeSnapshotClass object by running the following command:
  $ oc label volumesnapshotclass/<snapclass_name> velero.io/csi-volumesnapshot-class=true
- Create a new Backup CR.
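Each procedure in this section ends with creating a new Backup CR. The following is a minimal sketch of one way to generate such a manifest from the shell. The function name and the choice to back up a single namespace through `spec.includedNamespaces` are assumptions for illustration. A real backup might need additional fields, such as a storage location or snapshot settings.

```shell
# Hypothetical helper: print a minimal Backup manifest.
# $1 = name for the new Backup CR, $2 = namespace to back up
backup_manifest() {
  cat <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: $1
  namespace: openshift-adp
spec:
  includedNamespaces:
  - $2
EOF
}

# Example usage (requires cluster access):
#   backup_manifest my-backup my-app | oc apply -f -
```

Generating the manifest under a new name each time avoids a clash with the deleted Backup CR.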