Merge branch 'main' of https://github.com/qiuming-best/velero into restic-repo-tmp
commit 752b28166f

Makefile (1 line changed)
@@ -125,6 +125,7 @@ all-containers: container-builder-env
@$(MAKE) --no-print-directory container BIN=velero-restic-restore-helper

local: build-dirs
# Add DEBUG=1 to enable debug locally
GOOS=$(GOOS) \
GOARCH=$(GOARCH) \
VERSION=$(VERSION) \
ROADMAP.md (43 lines changed)
@@ -1,42 +1 @@
## Velero Roadmap

### About this document

This document provides a link to the [Velero Project boards](https://github.com/vmware-tanzu/velero/projects) that serve as the up-to-date description of items that are in the release pipeline. The release boards have separate swim lanes based on prioritization. Most items are gathered from the community or include a feedback loop with the community. This should serve as a reference point for Velero users and contributors to understand where the project is heading, and help determine if a contribution could be conflicting with a longer-term plan.

### How to help?

Discussion on the roadmap can take place in threads under [Issues](https://github.com/vmware-tanzu/velero/issues) or in [community meetings](https://velero.io/community/). Please open and comment on an issue if you want to provide suggestions, use cases, or feedback on an item in the roadmap. Please review the roadmap to avoid potential duplicated effort.

### How to add an item to the roadmap?

One of the most important aspects of any open source community is the concept of proposals. Large changes to the codebase and/or new features should be preceded by a [proposal](https://github.com/vmware-tanzu/velero/blob/main/GOVERNANCE.md#proposal-process) in our repo.
For smaller enhancements, you can open an issue to track that initiative or feature request.
We work with and rely on community feedback to focus our efforts to improve Velero and maintain a healthy roadmap.

### Current Roadmap

The following table includes the current roadmap for Velero. If you have any questions or would like to contribute to Velero, please attend a [community meeting](https://velero.io/community/) to discuss with our team. If you don't know where to start, we are always looking for contributors who will help us reduce technical, automation, and documentation debt.
Please take the timelines and dates as proposals and goals. Priorities and requirements change based on community feedback, roadblocks encountered, community contributions, etc. If you depend on a specific item, we encourage you to attend community meetings to get updated status information, or help us deliver that feature by contributing to Velero.

`Last Updated: October 2021`

#### 1.8.0 Roadmap (to be delivered January/February 2022)

|Issue|Description|Timeline|Notes|
|---|---|---|---|
|[4108](https://github.com/vmware-tanzu/velero/issues/4108), [4109](https://github.com/vmware-tanzu/velero/issues/4109)|Solution for CSI - Azure and AWS|2022 H1|Currently, Velero plugins for AWS and Azure cannot back up persistent volumes that were provisioned using the CSI driver. This will fix that.|
|[3229](https://github.com/vmware-tanzu/velero/issues/3229), [4112](https://github.com/vmware-tanzu/velero/issues/4112)|Moving data mover functionality from the Velero Plugin for vSphere into Velero proper|2022 H1|This work is a precursor to decoupling the Astrolabe snapshotting infrastructure.|
|[3533](https://github.com/vmware-tanzu/velero/issues/3533)|Upload progress monitoring|2022 H1|Finishing up the work done in the 1.7 timeframe. The data mover work depends on this.|
|[1975](https://github.com/vmware-tanzu/velero/issues/1975)|Test dual-stack mode|2022 H1|We already tested IPv6, but we want to confirm that dual-stack mode works as well.|
|[2082](https://github.com/vmware-tanzu/velero/issues/2082)|Delete Backup CRs on removing target location|2022 H1||
|[3516](https://github.com/vmware-tanzu/velero/issues/3516)|Restore issue with MutatingWebhookConfiguration v1beta1 API version|2022 H1||
|[2308](https://github.com/vmware-tanzu/velero/issues/2308)|Restoring a nodePort service that has nodePort preservation always fails if the service already exists in the namespace|2022 H1||
|[4115](https://github.com/vmware-tanzu/velero/issues/4115)|Support for multiple sets of credentials for VolumeSnapshotLocations|2022 H1||
|[1980](https://github.com/vmware-tanzu/velero/issues/1980)|Velero triggers backup immediately for scheduled backups|2022 H1||
|[4067](https://github.com/vmware-tanzu/velero/issues/4067)|Pre and post backup and restore hooks|2022 H1||
|[3742](https://github.com/vmware-tanzu/velero/issues/3742)|Carvel packaging for Velero for vSphere|2022 H1|AWS and Azure have been completed already.|
|[3285](https://github.com/vmware-tanzu/velero/issues/3285)|Design doc for Velero plugin versioning|2022 H1||
|[4231](https://github.com/vmware-tanzu/velero/issues/4231)|Technical health (prioritizing giving developers confidence and saving developers time)|2022 H1|More automated tests (especially the pre-release manual tests) and more automation of the running of tests.|
|[4110](https://github.com/vmware-tanzu/velero/issues/4110)|Solution for CSI - GCP|2022 H1|Currently, the Velero plugin for GCP cannot back up persistent volumes that were provisioned using the CSI driver. This will fix that.|
|[3742](https://github.com/vmware-tanzu/velero/issues/3742)|Carvel packaging for Velero for restic|2022 H1|AWS and Azure have been completed already.|
|[3454](https://github.com/vmware-tanzu/velero/issues/3454), [4134](https://github.com/vmware-tanzu/velero/issues/4134), [4135](https://github.com/vmware-tanzu/velero/issues/4135)|Kubebuilder tech debt|2022 H1||
|[4111](https://github.com/vmware-tanzu/velero/issues/4111)|Ignore items returned by ItemSnapshotter.AlsoHandles during backup|2022 H1|This will enable backup of complex objects, because we can then tell Velero to ignore things that were already backed up when Velero was previously called recursively.|

Other work may make it into the 1.8 release, but this is the work that will be prioritized first.

# Please go to the [Velero Wiki](https://github.com/vmware-tanzu/velero/wiki/) to see our latest roadmap, archived roadmaps, and roadmap guidance.
Tiltfile (2 lines changed)
@@ -103,7 +103,7 @@ local_resource(

local_resource(
    "restic_binary",
    cmd = 'cd ' + '.' + ';mkdir -p _tiltbuild/restic; BIN=velero GOOS=linux GOARCH=amd64 RESTIC_VERSION=0.12.0 OUTPUT_DIR=_tiltbuild/restic ./hack/download-restic.sh',
    cmd = 'cd ' + '.' + ';mkdir -p _tiltbuild/restic; BIN=velero GOOS=linux GOARCH=amd64 RESTIC_VERSION=0.13.1 OUTPUT_DIR=_tiltbuild/restic ./hack/download-restic.sh',
)

# Note: we need a distro with a bash shell to exec into the Velero container
@@ -0,0 +1 @@
Convert the Pod Volume Restore resource/controller to the Kubebuilder framework
@@ -0,0 +1 @@
Mark in-progress backups/restores as failed during reconcile to avoid them hanging in the in-progress status
@@ -0,0 +1 @@
Modify CSI VolumeSnapshot metric related code.
@@ -0,0 +1 @@
Refactor the backup deletion controller based on Kubebuilder
@@ -0,0 +1 @@
Remove VolumeSnapshots created during backup when the CSI feature is enabled.
@@ -0,0 +1 @@
Add ClusterClasses to the restore priority list
@@ -0,0 +1 @@
Delete orphan CSI snapshots in the backup sync controller
@@ -0,0 +1 @@
Wait for VolumeSnapshots to become ready in parallel.
@@ -372,6 +372,10 @@ spec:
format: date-time
nullable: true
type: string
failureReason:
description: FailureReason is an error that caused the entire backup
to fail.
type: string
formatVersion:
description: FormatVersion is the backup format version, including
major, minor, and patch version.
@@ -16,7 +16,16 @@ spec:
singular: deletebackuprequest
scope: Namespaced
versions:
- name: v1
- additionalPrinterColumns:
- description: The name of the backup to be deleted
jsonPath: .spec.backupName
name: BackupName
type: string
- description: The status of the deletion request
jsonPath: .status.phase
name: Status
type: string
name: v1
schema:
openAPIV3Schema:
description: DeleteBackupRequest is a request to delete one or more backups.
@@ -63,6 +72,8 @@ spec:
type: object
served: true
storage: true
subresources:
status: {}
status:
acceptedNames:
kind: ""
@@ -16,7 +16,37 @@ spec:
singular: podvolumerestore
scope: Namespaced
versions:
- name: v1
- additionalPrinterColumns:
- description: Namespace of the pod containing the volume to be restored
jsonPath: .spec.pod.namespace
name: Namespace
type: string
- description: Name of the pod containing the volume to be restored
jsonPath: .spec.pod.name
name: Pod
type: string
- description: Name of the volume to be restored
jsonPath: .spec.volume
name: Volume
type: string
- description: Pod Volume Restore status such as New/InProgress
jsonPath: .status.phase
name: Status
type: string
- description: Pod Volume Restore status such as New/InProgress
format: int64
jsonPath: .status.progress.totalBytes
name: TotalBytes
type: integer
- description: Pod Volume Restore status such as New/InProgress
format: int64
jsonPath: .status.progress.bytesDone
name: BytesDone
type: integer
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
name: v1
schema:
openAPIV3Schema:
properties:
@@ -136,6 +166,8 @@ spec:
type: object
served: true
storage: true
subresources:
status: {}
status:
acceptedNames:
kind: ""
File diff suppressed because one or more lines are too long
@@ -6,12 +6,31 @@ metadata:
creationTimestamp: null
name: velero-perms
rules:
- apiGroups:
- ""
resources:
- persistentvolumeclaims
verbs:
- get
- apiGroups:
- ""
resources:
- persistentvolumes
verbs:
- get
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- velero.io
resources:
- backups
verbs:
- create
- delete
- apiGroups:
- velero.io
resources:
@@ -32,6 +51,26 @@ rules:
- get
- patch
- update
- apiGroups:
- velero.io
resources:
- deletebackuprequests
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- velero.io
resources:
- deletebackuprequests/status
verbs:
- get
- patch
- update
- apiGroups:
- velero.io
resources:
@@ -72,6 +111,26 @@ rules:
- get
- patch
- update
- apiGroups:
- velero.io
resources:
- podvolumerestores
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- velero.io
resources:
- podvolumerestores/status
verbs:
- get
- patch
- update
- apiGroups:
- velero.io
resources:
@@ -0,0 +1,262 @@
# Add support for `ExistingResourcePolicy` to restore API

## Abstract
Velero currently does not support any restore policy for Kubernetes resources that are already present in-cluster. Velero skips over the restore of a resource if it already exists in the namespace/cluster, irrespective of whether the resource in the restore is the same as or different from the one present in the cluster. It is desired that Velero give the user the option to decide whether or not the resource in the backup should overwrite the one present in the cluster.

## Background
As of today, Velero will skip over the restoration of resources that already exist in the cluster. The current workflow followed by Velero is (using a backed-up `service` as an example; a sketch of this flow follows the list):
- Velero attempts to restore the `service`
- Fetches the `service` from the cluster
- If the `service` exists then:
  - Checks whether the `service` instance in the cluster is equal to the `service` instance present in the backup
    - If not equal, skips the restore of the `service` and adds a restore warning (except for [ServiceAccount objects](https://github.com/vmware-tanzu/velero/blob/574baeb3c920f97b47985ec3957debdc70bcd5f8/pkg/restore/restore.go#L1246))
    - If equal, skips the restore of the `service` and notes in the logs that it was skipped
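
For orientation, here is a minimal sketch of the comparison step described above. The `existingResourceAction` helper and the spec-only `DeepEqual` comparison are simplifications invented for this sketch; they are not the actual code in `pkg/restore/restore.go`:

```
package restore

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// existingResourceAction sketches what Velero currently does once the object from
// the backup has been compared with the live object (inCluster is nil when the
// object does not exist in the cluster). Illustrative only.
func existingResourceAction(fromBackup, inCluster *unstructured.Unstructured) string {
	if inCluster == nil {
		return "create" // not present in the cluster, so restore it
	}
	if equality.Semantic.DeepEqual(fromBackup.Object["spec"], inCluster.Object["spec"]) {
		// Unchanged: skipped, only noted in the restore log.
		return "skip (unchanged)"
	}
	// Changed: skipped with a restore warning (ServiceAccounts get special handling).
	return fmt.Sprintf("skip %q with warning (exists and differs)", fromBackup.GetName())
}
```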

It is desired to add the ability to specify whether or not to overwrite the instance of the `service` in the cluster with the one present in the backup during the restore process.

Related issue: https://github.com/vmware-tanzu/velero/issues/4066

## Goals
- Add support for `ExistingResourcePolicy` to restore API for Kubernetes resources.

## Non-Goals
- Change existing restore workflow for `ServiceAccount` objects
- Add support for `ExistingResourcePolicy` as `recreate` for Kubernetes resources. (Future scope feature)

## Unrelated Proposals (functionality completely different from the one proposed in this design)
- Add support for `ExistingResourcePolicy` to restore API for non-Kubernetes resources.
- Add support for `ExistingResourcePolicy` to restore API for `PersistentVolume` data.

### Use-cases/Scenarios

### A. Production Cluster - Backup Cluster:
Let's say you have a Backup Cluster that is identical to the Production Cluster. After some operations/usage/time, the Production Cluster changes: there might be new deployments, and some secrets might have been updated. This means the Backup Cluster will no longer be identical to the Production Cluster. In order to keep the Backup Cluster up to date/identical to the Production Cluster with respect to Kubernetes resources (except PV data), we would like to use Velero to schedule new backups, which would in turn help us update the Backup Cluster via Velero restore.

Reference: https://github.com/vmware-tanzu/velero/issues/4066#issuecomment-954320686

### B. Help identify resource delta:
Here, delta resources means resources that were restored from a previous backup but are no longer in the latest backup. Let's follow a sequence of steps to understand this scenario:
- Consider that there are 2 clusters; Cluster A has 3 resources: P1, P2 and P3.
- Create a Backup1 from Cluster A which has P1, P2 and P3.
- Perform a restore on a new Cluster B using Backup1.
- Now, let's say that in Cluster A resource P1 gets deleted and resource P2 gets updated.
- Create a new Backup2 with the new state of Cluster A; keep in mind that Backup1 has P1, P2 and P3 while Backup2 has P2' and P3.
- So the delta here is (|Cluster B - Backup2|): delete P1 and update P2.
- At restore time we would want the restore to help us identify this resource delta.

Reference: https://github.com/vmware-tanzu/velero/pull/4613#issuecomment-1027260446

## High-Level Design
### Approach 1: Add a new spec field `existingResourcePolicy` to the Restore API
In this approach we do *not* change existing Velero behavior: if the resource to restore is equal to the one in the cluster, nothing is done, following current Velero behavior. For resources that already exist in the cluster and are not equal to the resource in the backup (other than ServiceAccounts), we add a new optional spec field `existingResourcePolicy`, which can have the following values:
1. `none`: This is the existing behavior; if Velero encounters a resource that already exists in the cluster, we simply skip restoration.
2. `update`: This option would provide the following behavior:
   - Unchanged resources: Velero would update the backup/restore labels on the unchanged resources; if the label patch fails, Velero adds a restore error.
   - Changed resources: Velero will first try to patch the changed resource. Now, if the patch:
     - succeeds: the in-cluster resource gets updated with the labels as well as the resource diff
     - fails: Velero adds a restore warning and tries to update just the backup/restore labels on the resource; if the label patch also fails, then we add a restore error.
3. `recreate`: If the resource already exists, then Velero will delete it and recreate the resource.

*Note:* The `recreate` option is a non-goal for this enhancement proposal, but it is considered future scope.
Another thing to highlight is that Velero will not delete any resources under any of the policy options proposed in this design, but Velero will patch the resources under the `update` policy option.

Example:
A. The following Restore will execute the `existingResourcePolicy` restore type `none` for the `services` and `deployments` present in the `velero-protection` namespace.

```
Kind: Restore

…

includeNamespaces: velero-protection
includeResources:
  - services
  - deployments
existingResourcePolicy: none
```

B. The following Restore will execute the `existingResourcePolicy` restore type `update` for the `secrets` and `daemonsets` present in the `gdpr-application` namespace.
```
Kind: Restore

…

includeNamespaces: gdpr-application
includeResources:
  - secrets
  - daemonsets
existingResourcePolicy: update
```

### Approach 2: Add a new spec field `existingResourcePolicyConfig` to the Restore API
In this approach we give the user the ability to specify which resources are to be included for a particular kind of force-update behaviour; essentially a more granular approach wherein the user is able to specify a resource:behaviour mapping. It would look like:

`existingResourcePolicyConfig`:
- `patch:`
  - `includedResources:` []string
- `recreate:`
  - `includedResources:` []string

*Note:*
- There is no `none` behaviour in this approach, as that would conform to the current/default Velero restore behaviour.
- The `recreate` option is a non-goal for this enhancement proposal, but it is considered future scope.

Example:
A. The following Restore will execute the restore type `patch` and apply the `existingResourcePolicyConfig` for `secrets` and `daemonsets` present in the `inventory-app` namespace.
```
Kind: Restore
…
includeNamespaces: inventory-app
existingResourcePolicyConfig:
  patch:
    includedResources:
    - secrets
    - daemonsets
```

### Approach 3: Combination of Approach 1 and Approach 2

Now, this approach is somewhat a combination of the aforementioned approaches. Here we propose the addition of two spec fields to the Restore API, `existingResourceDefaultPolicy` and `existingResourcePolicyOverrides`. As the names suggest, the idea is that `existingResourceDefaultPolicy` would describe the default Velero behaviour for this restore, and `existingResourcePolicyOverrides` would override the default policy explicitly for some resources.

Example:
A. The following Restore will execute the restore type `patch` as the `existingResourceDefaultPolicy` but will override the default policy for `secrets` using the `existingResourcePolicyOverrides` spec as `none`.
```
Kind: Restore
…
includeNamespaces: inventory-app
existingResourceDefaultPolicy: patch
existingResourcePolicyOverrides:
  none:
    includedResources:
    - secrets
```

## Detailed Design
### Approach 1: Add a new spec field `existingResourcePolicy` to the Restore API
The `existingResourcePolicy` spec field will be a `PolicyType` field.

Restore API:
```
type RestoreSpec struct {
  .
  .
  .
  // ExistingResourcePolicy specifies the restore behaviour for the Kubernetes resource to be restored
  // +optional
  ExistingResourcePolicy PolicyType
}
```
PolicyType:
```
type PolicyType string
const PolicyTypeNone PolicyType = "none"
const PolicyTypePatch PolicyType = "update"
```
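
For illustration, a minimal sketch of how the chosen policy could be consulted once the restore flow has determined that the target resource already exists in the cluster. The `resolveExistingResource` helper, the `decision` type, and the `changed` flag are assumptions made for this sketch, not part of the proposal or of `pkg/restore/restore.go`:

```
// PolicyType mirrors the proposed API type above.
type PolicyType string

const (
	PolicyTypeNone  PolicyType = "none"
	PolicyTypePatch PolicyType = "update"
)

// decision captures what the restore loop would do with an object that already
// exists in the cluster. Illustrative only.
type decision string

const (
	decisionSkip       decision = "skip"              // leave the in-cluster object alone
	decisionPatchDiff  decision = "patch diff+labels" // bring the object in line with the backup
	decisionPatchLabel decision = "patch labels only" // refresh velero.io backup/restore labels
)

// resolveExistingResource sketches the Approach 1 behaviour: "none" keeps today's
// skip semantics, while "update" refreshes labels on unchanged resources and
// attempts a full patch on changed resources (falling back to a label-only patch
// plus a restore warning when the full patch fails).
func resolveExistingResource(policy PolicyType, changed bool) decision {
	switch policy {
	case PolicyTypePatch:
		if !changed {
			return decisionPatchLabel
		}
		return decisionPatchDiff
	default: // "" or PolicyTypeNone
		return decisionSkip
	}
}
```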

### Approach 2: Add a new spec field `existingResourcePolicyConfig` to the Restore API
The `existingResourcePolicyConfig` will be a spec field of type `PolicyConfiguration` which gets added to the Restore API.

Restore API:
```
type RestoreSpec struct {
  .
  .
  .
  // ExistingResourcePolicyConfig specifies the restore behaviour for a particular/list of Kubernetes resource(s) to be restored
  // +optional
  ExistingResourcePolicyConfig []PolicyConfiguration
}
```

PolicyConfiguration:
```
type PolicyConfiguration struct {
  PolicyTypeMapping map[PolicyType]ResourceList
}
```

PolicyType:
```
type PolicyType string
const PolicyTypePatch PolicyType = "patch"
const PolicyTypeRecreate PolicyType = "recreate"
```

ResourceList:
```
type ResourceList struct {
  IncludedResources []string
}
```
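
For illustration only, the following sketch shows how a restore could resolve the policy for a given resource (for example `secrets`) from the `PolicyTypeMapping` above. The `policyFor` helper is an assumption of this sketch, not part of the proposal:

```
// Types mirror the proposed Approach 2 API.
type PolicyType string

type ResourceList struct {
	IncludedResources []string
}

type PolicyConfiguration struct {
	PolicyTypeMapping map[PolicyType]ResourceList
}

// policyFor returns the first policy whose IncludedResources names the given
// resource, or the empty string when no explicit policy applies (i.e. the
// current/default Velero restore behaviour is kept).
func policyFor(configs []PolicyConfiguration, resource string) PolicyType {
	for _, cfg := range configs {
		for policy, list := range cfg.PolicyTypeMapping {
			for _, r := range list.IncludedResources {
				if r == resource {
					return policy
				}
			}
		}
	}
	return ""
}
```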

### Approach 3: Combination of Approach 1 and Approach 2

Restore API:
```
type RestoreSpec struct {
  .
  .
  .
  // ExistingResourceDefaultPolicy specifies the default restore behaviour for the Kubernetes resource to be restored
  // +optional
  existingResourceDefaultPolicy PolicyType

  // ExistingResourcePolicyOverrides specifies the restore behaviour for a particular/list of Kubernetes resource(s) to be restored
  // +optional
  existingResourcePolicyOverrides []PolicyConfiguration
}
```

PolicyType:
```
type PolicyType string
const PolicyTypeNone PolicyType = "none"
const PolicyTypePatch PolicyType = "patch"
const PolicyTypeRecreate PolicyType = "recreate"
```
PolicyConfiguration:
```
type PolicyConfiguration struct {
  PolicyTypeMapping map[PolicyType]ResourceList
}
```
ResourceList:
```
type ResourceList struct {
  IncludedResources []string
}
```

The restore workflow changes will be done [here](https://github.com/vmware-tanzu/velero/blob/b40bbda2d62af2f35d1406b9af4d387d4b396839/pkg/restore/restore.go#L1245).

### CLI changes for Approach 1
We would introduce a new CLI flag called `existing-resource-policy` of string type. This flag would be used to accept the policy from the user. The velero restore command would look somewhat like this:
```
velero create restore <restore_name> --existing-resource-policy=update
```

Help message: `Restore Policy to be used during the restore workflow, can be - none, update`

The CLI changes will go in `pkg/cmd/cli/restore/create.go`.

We would also add validation which checks for invalid policy values provided to this flag; a sketch follows.
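
As a rough sketch (the option struct, method names, and error text below are assumptions, not the final implementation), the flag wiring and validation could look like this:

```
package restore

import (
	"fmt"

	"github.com/spf13/pflag"
)

// CreateOptions is trimmed down to the one field relevant to this proposal; the
// real struct in pkg/cmd/cli/restore/create.go carries many more options.
type CreateOptions struct {
	ExistingResourcePolicy string
}

// BindFlags registers the proposed flag on `velero restore create`.
func (o *CreateOptions) BindFlags(flags *pflag.FlagSet) {
	flags.StringVar(&o.ExistingResourcePolicy, "existing-resource-policy", "",
		"Restore Policy to be used during the restore workflow, can be - none, update")
}

// Validate rejects anything other than the supported policy values.
func (o *CreateOptions) Validate() error {
	switch o.ExistingResourcePolicy {
	case "", "none", "update":
		return nil
	default:
		return fmt.Errorf("invalid existing-resource-policy %q, accepted values are: none, update", o.ExistingResourcePolicy)
	}
}
```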

The restore describer (`pkg/cmd/util/output/restore_describer.go`) will also be updated to reflect the policy.

### Implementation Decision
We have decided to go ahead with the implementation of Approach 1 because:
- It is easier to implement
- It is easier to scale, leaves room for improvement, and keeps the door open to expanding to Approach 3
- It also provides an option to preserve the existing Velero restore workflow
@@ -0,0 +1,138 @@
# Ensure support for backing up resources based on multiple labels
## Abstract
As of today, Velero supports filtering of resources based on a single label selector per backup. It is desired that Velero support backing up resources based on multiple labels (OR logic).

**Note:** This solution is required because Kubernetes label selectors only allow AND logic of labels.

## Background
Currently, Velero's Backup/Restore API has a spec field `LabelSelector` which helps in filtering resources based on a **single** label value per backup/restore request. For instance, if the user specifies the `Backup.Spec.LabelSelector` as `data-protection-app: true`, Velero will grab all the resources that possess this label and perform the backup operation on them. The `LabelSelector` field does not accept more than one label, and thus if the user wants to back up resources carrying any one label from a set of labels (label1 OR label2 OR label3), then the user needs to create multiple backups, one per label rule. It would be really useful if the Velero Backup API could respect a set of labels (OR rule) for a single backup request.

Related Issue: https://github.com/vmware-tanzu/velero/issues/1508
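
To make the AND-vs-OR point concrete, here is a small illustrative snippet using the upstream apimachinery helpers. The selector values come from the use case below, the extra `team=compliance` label is made up for the demonstration, and error handling is elided:

```
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// A single selector with two matchLabels is an AND: the object below carries
	// only one of the two keys, so it does not match.
	sel, _ := metav1.LabelSelectorAsSelector(&metav1.LabelSelector{
		MatchLabels: map[string]string{"app": "gdpr", "team": "compliance"},
	})
	objLabels := labels.Set{"app": "gdpr"}
	fmt.Println(sel.Matches(objLabels)) // false: "team=compliance" is missing

	// OR semantics have to be built on top, by checking a list of selectors.
	orSelectors := []*metav1.LabelSelector{
		{MatchLabels: map[string]string{"app": "gdpr"}},
		{MatchLabels: map[string]string{"app": "wpa"}},
	}
	matched := false
	for _, ls := range orSelectors {
		s, _ := metav1.LabelSelectorAsSelector(ls)
		if s.Matches(objLabels) {
			matched = true
			break
		}
	}
	fmt.Println(matched) // true: "app=gdpr" matches one selector in the set
}
```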

## Goals
- Enable support for backing up resources based on multiple labels (OR logic) in a single backup config.
- Enable support for restoring resources based on multiple labels (OR logic) in a single restore config.

## Use Case/Scenario
Let's say that, as a Velero user, you want to take a backup of secrets, but these secrets do not all have one single consistent label on them. We want to back up secrets having any one label from `app=gdpr`, `app=wpa` and `app=ccpa`. Today we would have to create 3 backups, one per label rule. This can become cumbersome at scale.

## High-Level Design
### Addition of `OrLabelSelectors` spec to Velero Backup/Restore API
For Velero to back up resources if they carry any one label from a set of labels, we would like to add a new spec field `OrLabelSelectors` which would enable the user to specify them. The Velero backup would look somewhat like this:

```
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: backup-101
  namespace: openshift-adp
spec:
  includedNamespaces:
  - test
  storageLocation: velero-sample-1
  ttl: 720h0m0s
  orLabelSelectors:
  - matchLabels:
      app=gdpr
  - matchLabels:
      app=wpa
  - matchLabels:
      app=ccpa
```

**Note:** This approach will **not** change any current behavior related to the Backup API spec `LabelSelector`. Rather, we propose that the label in the `LabelSelector` spec and the labels in `OrLabelSelectors` be treated as different Velero functionalities. Both fields will be treated as separate Velero Backup API specs. If `LabelSelector` (singular) is present, then just match that label. If `OrLabelSelectors` is present, then match any label in the set specified by the user. For the backup case, if both `LabelSelector` and `OrLabelSelectors` are specified (we do not anticipate this as a real-world use case), then `OrLabelSelectors` will take precedence; `LabelSelector` will only be used for filtering when `OrLabelSelectors` is not specified by the user. This helps keep the behaviour of both specs independent and does not confuse users. This way we preserve the existing Velero behaviour and implement the new functionality in a much cleaner way.
For instance, let's take a look at the following cases:

1. Only `LabelSelector` specified: Velero will create a backup with resources matching the label `app=gdpr`
```
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: backup-101
  namespace: openshift-adp
spec:
  includedNamespaces:
  - test
  storageLocation: velero-sample-1
  ttl: 720h0m0s
  labelSelector:
  - matchLabels:
      app=gdpr
```
2. Only `OrLabelSelectors` specified: Velero will create a backup with resources matching any label from the set `{app=gdpr, app=wpa, app=ccpa}`
```
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: backup-101
  namespace: openshift-adp
spec:
  includedNamespaces:
  - test
  storageLocation: velero-sample-1
  ttl: 720h0m0s
  orLabelSelectors:
  - matchLabels:
      app=gdpr
  - matchLabels:
      app=wpa
  - matchLabels:
      app=ccpa
```

A similar implementation will be done for the Restore API as well.

## Detailed Design
With the introduction of `OrLabelSelectors`, the BackupSpec and RestoreSpec will look like:

BackupSpec:
```
type BackupSpec struct {
  [...]
  // OrLabelSelectors is a set of []metav1.LabelSelector to filter with
  // when adding individual objects to the backup. Resources matching any one
  // label from the set of labels will be added to the backup. If empty
  // or nil, all objects are included. Optional.
  // +optional
  OrLabelSelectors []*metav1.LabelSelector
  [...]
}
```

RestoreSpec:
```
type RestoreSpec struct {
  [...]
  // OrLabelSelectors is a set of []metav1.LabelSelector to filter with
  // when restoring objects from the backup. Resources matching any one
  // label from the set of labels will be restored from the backup. If empty
  // or nil, all objects are included from the backup. Optional.
  // +optional
  OrLabelSelectors []*metav1.LabelSelector
  [...]
}
```

The logic to collect resources to be backed up for a particular backup will be updated in `backup/item_collector.go` around [here](https://github.com/vmware-tanzu/velero/blob/574baeb3c920f97b47985ec3957debdc70bcd5f8/pkg/backup/item_collector.go#L294).

And for filtering the resources to be restored, the changes will go [here](https://github.com/vmware-tanzu/velero/blob/d1063bda7e513150fd9ae09c3c3c8b1115cb1965/pkg/restore/restore.go#L1769). A sketch of the matching rule follows.
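
A compact sketch of the matching rule described in this document; the `shouldInclude` helper name and signature are assumptions for illustration, not the actual item collector or restore code. `OrLabelSelectors` decides inclusion on its own when it is set (matching any one selector is enough); otherwise the single `LabelSelector` is used, and with neither set everything is included:

```
package backup

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// shouldInclude applies the precedence described in this design to one object's labels.
func shouldInclude(objLabels map[string]string, labelSelector *metav1.LabelSelector, orLabelSelectors []*metav1.LabelSelector) (bool, error) {
	lbls := labels.Set(objLabels)
	if len(orLabelSelectors) > 0 {
		// OrLabelSelectors takes precedence: match any one selector in the set.
		for _, ls := range orLabelSelectors {
			s, err := metav1.LabelSelectorAsSelector(ls)
			if err != nil {
				return false, err
			}
			if s.Matches(lbls) {
				return true, nil
			}
		}
		return false, nil
	}
	if labelSelector != nil {
		// Existing behaviour: a single selector, unchanged by this proposal.
		s, err := metav1.LabelSelectorAsSelector(labelSelector)
		if err != nil {
			return false, err
		}
		return s.Matches(lbls), nil
	}
	return true, nil // neither field set: include everything
}
```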

**Note:**
- This feature will not be exposed via the Velero CLI.
go.mod (2 lines changed)
@@ -10,6 +10,7 @@ require (
github.com/Azure/go-autorest/autorest v0.11.21
github.com/Azure/go-autorest/autorest/azure/auth v0.5.8
github.com/Azure/go-autorest/autorest/to v0.3.0
github.com/apex/log v1.9.0
github.com/aws/aws-sdk-go v1.28.2
github.com/bombsimon/logrusr v1.1.0
github.com/evanphx/json-patch v4.11.0+incompatible
@@ -37,6 +38,7 @@ require (
golang.org/x/mod v0.4.2
golang.org/x/net v0.0.0-20210520170846-37e1c6afe023
golang.org/x/oauth2 v0.0.0-20210819190943-2bc19b11175f
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c
google.golang.org/api v0.56.0
google.golang.org/grpc v1.40.0
k8s.io/api v0.22.2
go.sum (28 lines changed)
|
@ -112,14 +112,21 @@ github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf/go.mod h1:ybxpYRF
|
|||
github.com/alecthomas/units v0.0.0-20190717042225-c3de453c63f4/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0=
|
||||
github.com/alecthomas/units v0.0.0-20190924025748-f65c72e2690d/go.mod h1:rBZYJk541a8SKzHPHnH3zbiI+7dagKZ0cgpgrD7Fyho=
|
||||
github.com/antihax/optional v1.0.0/go.mod h1:uupD/76wgC+ih3iEmQUL+0Ugr19nfwCT1kdvxnR2qWY=
|
||||
github.com/apex/log v1.9.0 h1:FHtw/xuaM8AgmvDDTI9fiwoAL25Sq2cxojnZICUU8l0=
|
||||
github.com/apex/log v1.9.0/go.mod h1:m82fZlWIuiWzWP04XCTXmnX0xRkYYbCdYn8jbJeLBEA=
|
||||
github.com/apex/logs v1.0.0/go.mod h1:XzxuLZ5myVHDy9SAmYpamKKRNApGj54PfYLcFrXqDwo=
|
||||
github.com/aphistic/golf v0.0.0-20180712155816-02c07f170c5a/go.mod h1:3NqKYiepwy8kCu4PNA+aP7WUV72eXWJeP9/r3/K9aLE=
|
||||
github.com/aphistic/sweet v0.2.0/go.mod h1:fWDlIh/isSE9n6EPsRmC0det+whmX6dJid3stzu0Xys=
|
||||
github.com/armon/circbuf v0.0.0-20150827004946-bbbad097214e/go.mod h1:3U/XgcO3hCbHZ8TKRvWD2dDTCfh9M9ya+I9JpbB7O8o=
|
||||
github.com/armon/consul-api v0.0.0-20180202201655-eb2c6b5be1b6/go.mod h1:grANhF5doyWs3UAsr3K4I6qtAmlQcZDesFNEHPZAzj8=
|
||||
github.com/armon/go-metrics v0.0.0-20180917152333-f0300d1749da/go.mod h1:Q73ZrmVTwzkszR9V5SSuryQ31EELlFMUz1kKyl939pY=
|
||||
github.com/armon/go-radix v0.0.0-20180808171621-7fddfc383310/go.mod h1:ufUuZ+zHj4x4TnLV4JWEpy2hxWSpsRywHrMgIH9cCH8=
|
||||
github.com/armon/go-radix v1.0.0/go.mod h1:ufUuZ+zHj4x4TnLV4JWEpy2hxWSpsRywHrMgIH9cCH8=
|
||||
github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a/go.mod h1:lB+ZfQJz7igIIfQNfa7Ml4HSf2uFQQRzpGGRXenZAgY=
|
||||
github.com/aws/aws-sdk-go v1.20.6/go.mod h1:KmX6BPdI08NWTb3/sm4ZGu5ShLoqVDhKgpiN924inxo=
|
||||
github.com/aws/aws-sdk-go v1.28.2 h1:j5IXG9CdyLfcVfICqo1PXVv+rua+QQHbkXuvuU/JF+8=
|
||||
github.com/aws/aws-sdk-go v1.28.2/go.mod h1:KmX6BPdI08NWTb3/sm4ZGu5ShLoqVDhKgpiN924inxo=
|
||||
github.com/aybabtme/rgbterm v0.0.0-20170906152045-cc83f3b3ce59/go.mod h1:q/89r3U2H7sSsE2t6Kca0lfwTK8JdoNGS/yzM/4iH5I=
|
||||
github.com/benbjohnson/clock v1.0.3/go.mod h1:bGMdMPoPVvcYyt1gHDf4J2KE153Yf9BuiUKYMaxlTDM=
|
||||
github.com/benbjohnson/clock v1.1.0 h1:Q92kusRqC1XV2MjkWETPvjJVqKetz1OzxZB7mHJLju8=
|
||||
github.com/benbjohnson/clock v1.1.0/go.mod h1:J11/hYXuz8f4ySSvYwY0FKfm+ezbsZBKZxNJlLklBHA=
|
||||
|
@ -424,6 +431,7 @@ github.com/joho/godotenv v1.3.0/go.mod h1:7hK45KPybAkOC6peb+G5yklZfMxEjkZhHbwpqx
|
|||
github.com/jonboulle/clockwork v0.1.0/go.mod h1:Ii8DK3G1RaLaWxj9trq07+26W01tbo22gdxWY5EU2bo=
|
||||
github.com/jonboulle/clockwork v0.2.2/go.mod h1:Pkfl5aHPm1nk2H9h0bjmnJD/BcgbGXUBGnn1kMkgxc8=
|
||||
github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
|
||||
github.com/jpillora/backoff v0.0.0-20180909062703-3050d21c67d7/go.mod h1:2iMrUgbbvHEiQClaW2NsSzMyGHqN+rDFqY705q49KG0=
|
||||
github.com/jpillora/backoff v1.0.0/go.mod h1:J/6gKK9jxlEcS3zixgDgUAsiuZ7yrSoa/FX5e0EB2j4=
|
||||
github.com/json-iterator/go v1.1.6/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCVDaaPEHmU=
|
||||
github.com/json-iterator/go v1.1.7/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4=
|
||||
|
@ -465,6 +473,8 @@ github.com/mailru/easyjson v0.0.0-20190626092158-b2ccc519800e/go.mod h1:C1wdFJiN
|
|||
github.com/mailru/easyjson v0.7.0/go.mod h1:KAzv3t3aY1NaHWoQz1+4F1ccyAH66Jk7yos7ldAVICs=
|
||||
github.com/mailru/easyjson v0.7.6/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
|
||||
github.com/mattn/go-colorable v0.0.9/go.mod h1:9vuHe8Xs5qXnSaW/c/ABM9alt+Vo+STaOChaDxuIBZU=
|
||||
github.com/mattn/go-colorable v0.1.1/go.mod h1:FuOcm+DKB9mbwrcAfNl7/TZVBZ6rcnceauSikq3lYCQ=
|
||||
github.com/mattn/go-colorable v0.1.2/go.mod h1:U0ppj6V5qS13XJ6of8GYAs25YV2eR4EVcfRqFIhoBtE=
|
||||
github.com/mattn/go-colorable v0.1.4/go.mod h1:U0ppj6V5qS13XJ6of8GYAs25YV2eR4EVcfRqFIhoBtE=
|
||||
github.com/mattn/go-colorable v0.1.6/go.mod h1:u6P/XSegPjTcexA+o6vUJrdnUu04hMope9wVRipJSqc=
|
||||
github.com/mattn/go-colorable v0.1.9 h1:sqDoxXbdeALODt0DAeJCVp38ps9ZogZEAXjus69YV3U=
|
||||
|
@ -473,6 +483,7 @@ github.com/mattn/go-ieproxy v0.0.1 h1:qiyop7gCflfhwCzGyeT0gro3sF9AIg9HU98JORTkqf
|
|||
github.com/mattn/go-ieproxy v0.0.1/go.mod h1:pYabZ6IHcRpFh7vIaLfK7rdcWgFEb3SFJ6/gNWuh88E=
|
||||
github.com/mattn/go-isatty v0.0.3/go.mod h1:M+lRXTBqGeGNdLjl/ufCoiOlB5xdOkqRJdNxMWT7Zi4=
|
||||
github.com/mattn/go-isatty v0.0.4/go.mod h1:M+lRXTBqGeGNdLjl/ufCoiOlB5xdOkqRJdNxMWT7Zi4=
|
||||
github.com/mattn/go-isatty v0.0.5/go.mod h1:Iq45c/XA43vh69/j3iqttzPXn0bhXyGjM0Hdxcsrc5s=
|
||||
github.com/mattn/go-isatty v0.0.8/go.mod h1:Iq45c/XA43vh69/j3iqttzPXn0bhXyGjM0Hdxcsrc5s=
|
||||
github.com/mattn/go-isatty v0.0.10/go.mod h1:qgIWMr58cqv1PHHyhnkY9lrL7etaEgOFcMEpPG5Rm84=
|
||||
github.com/mattn/go-isatty v0.0.11/go.mod h1:PhnuNfih5lzO57/f3n+odYbM4JtupLOxQOAqxQCu2WE=
|
||||
|
@ -485,6 +496,7 @@ github.com/mattn/go-runewidth v0.0.13/go.mod h1:Jdepj2loyihRzMpdS35Xk/zdY8IAYHsh
|
|||
github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0=
|
||||
github.com/matttproud/golang_protobuf_extensions v1.0.2-0.20181231171920-c182affec369 h1:I0XW9+e1XWDxdcEniV4rQAIOPUGDq67JSCiRCgGCZLI=
|
||||
github.com/matttproud/golang_protobuf_extensions v1.0.2-0.20181231171920-c182affec369/go.mod h1:BSXmuO+STAnVfrANrmjBb36TMTDstsz7MSK+HVaYKv4=
|
||||
github.com/mgutz/ansi v0.0.0-20170206155736-9520e82c474b/go.mod h1:01TrycV0kFyexm33Z7vhZRXopbI8J3TDReVlkTgMUxE=
|
||||
github.com/miekg/dns v1.0.14/go.mod h1:W1PPwlIAgtquWBMBEV9nkV9Cazfe8ScdGz/Lj7v3Nrg=
|
||||
github.com/miekg/dns v1.1.26/go.mod h1:bPDLeHnStXmXAq1m/Ch/hvfNHr14JKNPMBo3VZKjuso=
|
||||
github.com/mitchellh/cli v1.0.0/go.mod h1:hNIlj7HEI86fIcpObd7a0FcrxTWetlwJDGcceTlRvqc=
|
||||
|
@ -536,6 +548,7 @@ github.com/onsi/ginkgo v1.16.4/go.mod h1:dX+/inL/fNMqNlz0e9LfyB9TswhZpCVdJM/Z6Vv
|
|||
github.com/onsi/ginkgo v1.16.5 h1:8xi0RTUf59SOSfEtZMvwTvXYMzG4gV23XVHOZiXNtnE=
|
||||
github.com/onsi/ginkgo v1.16.5/go.mod h1:+E8gABHa3K6zRBolWtd+ROzc/U5bkGt0FwiG042wbpU=
|
||||
github.com/onsi/gomega v0.0.0-20170829124025-dcabb60a477c/go.mod h1:C1qb7wdrVGGVU+Z6iS04AVkA3Q65CEZX59MT0QO5uiA=
|
||||
github.com/onsi/gomega v1.5.0/go.mod h1:ex+gbHU/CVuBBDIJjb2X0qEXbFg53c61hWP/1CpauHY=
|
||||
github.com/onsi/gomega v1.7.0/go.mod h1:ex+gbHU/CVuBBDIJjb2X0qEXbFg53c61hWP/1CpauHY=
|
||||
github.com/onsi/gomega v1.7.1/go.mod h1:XdKZgCCFLUoM/7CFJVPcG8C1xQ1AJ0vpAezJrB7JYyY=
|
||||
github.com/onsi/gomega v1.10.1/go.mod h1:iN09h71vgCQne3DLsj+A5owkum+a2tYe+TOCB1ybHNo=
|
||||
|
@ -588,6 +601,7 @@ github.com/rivo/uniseg v0.2.0/go.mod h1:J6wj4VEh+S6ZtnVlnTBMWIodfgj8LQOQFoIToxlJ
|
|||
github.com/robfig/cron v1.1.0 h1:jk4/Hud3TTdcrJgUOBgsqrZBarcxl6ADIjSC2iniwLY=
|
||||
github.com/robfig/cron v1.1.0/go.mod h1:JGuDeoQd7Z6yL4zQhZ3OPEVHB7fL6Ka6skscFHfmt2k=
|
||||
github.com/rogpeppe/fastuuid v0.0.0-20150106093220-6724a57986af/go.mod h1:XWv6SoW27p1b0cqNHllgS5HIMJraePCO15w5zCzIWYg=
|
||||
github.com/rogpeppe/fastuuid v1.1.0/go.mod h1:jVj6XXZzXRy/MSR5jhDC/2q6DgLz+nrA6LYCDYWNEvQ=
|
||||
github.com/rogpeppe/fastuuid v1.2.0/go.mod h1:jVj6XXZzXRy/MSR5jhDC/2q6DgLz+nrA6LYCDYWNEvQ=
|
||||
github.com/rogpeppe/go-internal v1.3.0/go.mod h1:M8bDsm7K2OlrFYOpmOWEs/qY81heoFRclV5y23lUDJ4=
|
||||
github.com/russross/blackfriday v1.5.2/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR/rfWxYHBV53g=
|
||||
|
@ -595,6 +609,7 @@ github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQD
|
|||
github.com/ryanuber/columnize v0.0.0-20160712163229-9b3edd62028f/go.mod h1:sm1tb6uqfes/u+d4ooFouqFdy9/2g9QGwK3SQygK0Ts=
|
||||
github.com/sagikazarmark/crypt v0.1.0/go.mod h1:B/mN0msZuINBtQ1zZLEQcegFJJf9vnYIR88KRMEuODE=
|
||||
github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529/go.mod h1:DxrIzT+xaE7yg65j358z/aeFdxmN0P9QXhEzd20vsDc=
|
||||
github.com/sergi/go-diff v1.0.0/go.mod h1:0CfEIISq7TuYL3j771MWULgwwjU+GofnZX9QAmXWZgo=
|
||||
github.com/sergi/go-diff v1.1.0/go.mod h1:STckp+ISIX8hZLjrqAeVduY0gWCT9IjLuqbuNXdaHfM=
|
||||
github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc=
|
||||
github.com/sirupsen/logrus v1.2.0/go.mod h1:LxeOpSwHxABJmUn/MG1IvRgCAasNZTLOkJPxbbu5VWo=
|
||||
|
@ -604,7 +619,10 @@ github.com/sirupsen/logrus v1.7.0/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic
|
|||
github.com/sirupsen/logrus v1.8.1 h1:dJKuHgqk1NNQlqoA6BTlM1Wf9DOH3NBjQyu0h9+AZZE=
|
||||
github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
|
||||
github.com/smartystreets/assertions v0.0.0-20180927180507-b2de0cb4f26d/go.mod h1:OnSkiWE9lh6wB0YB77sQom3nweQdgAjqCqsofrRNTgc=
|
||||
github.com/smartystreets/assertions v1.0.0/go.mod h1:kHHU4qYBaI3q23Pp3VPrmWhuIUrLW/7eUrw0BU5VaoM=
|
||||
github.com/smartystreets/go-aws-auth v0.0.0-20180515143844-0c1422d1fdb9/go.mod h1:SnhjPscd9TpLiy1LpzGSKh3bXCfxxXuqd9xmQJy3slM=
|
||||
github.com/smartystreets/goconvey v1.6.4/go.mod h1:syvi0/a8iFYH4r/RixwvyeAJjdLS9QV7WQ/tjFTllLA=
|
||||
github.com/smartystreets/gunit v1.0.0/go.mod h1:qwPWnhz6pn0NnRBP++URONOVyNkPyr4SauJk4cUOwJs=
|
||||
github.com/soheilhy/cmux v0.1.4/go.mod h1:IM3LyeVVIOuxMH7sFAkER9+bJ4dT7Ms6E4xg4kGIyLM=
|
||||
github.com/soheilhy/cmux v0.1.5/go.mod h1:T7TcVDs9LWfQgPlPsdngu6I6QIoyIFZDDC6sNE1GqG0=
|
||||
github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
|
||||
|
@ -644,6 +662,13 @@ github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/
|
|||
github.com/stretchr/testify v1.7.0 h1:nwc3DEeHmmLAfoZucVR881uASk0Mfjw8xYJ99tb5CcY=
|
||||
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
|
||||
github.com/subosito/gotenv v1.2.0/go.mod h1:N0PQaV/YGNqwC0u51sEeR/aUtSLEXKX9iv69rRypqCw=
|
||||
github.com/tj/assert v0.0.0-20171129193455-018094318fb0/go.mod h1:mZ9/Rh9oLWpLLDRpvE+3b7gP/C2YyLFYxNmcLnPTMe0=
|
||||
github.com/tj/assert v0.0.3 h1:Df/BlaZ20mq6kuai7f5z2TvPFiwC3xaWJSDQNiIS3Rk=
|
||||
github.com/tj/assert v0.0.3/go.mod h1:Ne6X72Q+TB1AteidzQncjw9PabbMp4PBMZ1k+vd1Pvk=
|
||||
github.com/tj/go-buffer v1.1.0/go.mod h1:iyiJpfFcR2B9sXu7KvjbT9fpM4mOelRSDTbntVj52Uc=
|
||||
github.com/tj/go-elastic v0.0.0-20171221160941-36157cbbebc2/go.mod h1:WjeM0Oo1eNAjXGDx2yma7uG2XoyRZTq1uv3M/o7imD0=
|
||||
github.com/tj/go-kinesis v0.0.0-20171128231115-08b17f58cb1b/go.mod h1:/yhzCV0xPfx6jb1bBgRFjl5lytqVqZXEaeqWP8lTEao=
|
||||
github.com/tj/go-spin v1.1.0/go.mod h1:Mg1mzmePZm4dva8Qz60H2lHwmJ2loum4VIrLgVnKwh4=
|
||||
github.com/tmc/grpc-websocket-proxy v0.0.0-20170815181823-89b8d40f7ca8/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U=
|
||||
github.com/tmc/grpc-websocket-proxy v0.0.0-20190109142713-0ad062ec5ee5/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U=
|
||||
github.com/tmc/grpc-websocket-proxy v0.0.0-20201229170055-e5319fda7802/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U=
|
||||
|
@ -712,6 +737,7 @@ go.uber.org/zap v1.19.0/go.mod h1:xg/QME4nWcxGxrpdeYfq7UvYrLh66cuVKdrbD1XF/NI=
|
|||
golang.org/x/crypto v0.0.0-20180904163835-0709b304e793/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4=
|
||||
golang.org/x/crypto v0.0.0-20181029021203-45a5f77698d3/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4=
|
||||
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
|
||||
golang.org/x/crypto v0.0.0-20190426145343-a29dc8fdc734/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
|
||||
golang.org/x/crypto v0.0.0-20190510104115-cbcb75029529/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
|
||||
golang.org/x/crypto v0.0.0-20190605123033-f99c8df09eb5/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
|
||||
golang.org/x/crypto v0.0.0-20190611184440-5c40567a22f8/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
|
||||
|
@ -840,6 +866,7 @@ golang.org/x/sync v0.0.0-20200317015054-43a5402ce75a/go.mod h1:RxMgew5VJxzue5/jJ
|
|||
golang.org/x/sync v0.0.0-20200625203802-6e8e738ad208/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
|
||||
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
|
||||
golang.org/x/sync v0.0.0-20201207232520-09787c993a3a/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
|
||||
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c h1:5KslGYwFpkhGh+Q16bwMP3cOontH8FOep7tGV86Y7SQ=
|
||||
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
|
||||
golang.org/x/sys v0.0.0-20180823144017-11551d06cbcc/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
|
||||
golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
|
||||
|
@ -1183,6 +1210,7 @@ gopkg.in/yaml.v2 v2.3.0/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
|
|||
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
|
||||
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
|
||||
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
gopkg.in/yaml.v3 v3.0.0-20200605160147-a5ece683394c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
gopkg.in/yaml.v3 v3.0.0-20200615113413-eeeca48fe776/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b h1:h8qDotaEPuJATrMmW04NCwg7v22aHH28wwpauUhK9Oo=
|
||||
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
|
|
|
@@ -292,6 +292,10 @@ type BackupStatus struct {
// +optional
VolumeSnapshotsCompleted int `json:"volumeSnapshotsCompleted,omitempty"`

// FailureReason is an error that caused the entire backup to fail.
// +optional
FailureReason string `json:"failureReason,omitempty"`

// Warnings is a count of all warning messages that were generated during
// execution of the backup. The actual warnings are in the backup's log
// file in object storage.
@@ -50,8 +50,15 @@ type DeleteBackupRequestStatus struct {
Errors []string `json:"errors,omitempty"`
}

// TODO(2.0) After converting all resources to use the runtime-controller client, the genclient and k8s:deepcopy markers will no longer be needed and should be removed.
// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +kubebuilder:object:root=true
// +kubebuilder:object:generate=true
// +kubebuilder:storageversion
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="BackupName",type="string",JSONPath=".spec.backupName",description="The name of the backup to be deleted"
// +kubebuilder:printcolumn:name="Status",type="string",JSONPath=".status.phase",description="The status of the deletion request"

// DeleteBackupRequest is a request to delete one or more backups.
type DeleteBackupRequest struct {
@@ -68,6 +75,7 @@ type DeleteBackupRequest struct {
}

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +kubebuilder:object:root=true

// DeleteBackupRequestList is a list of DeleteBackupRequests.
type DeleteBackupRequestList struct {
@@ -81,8 +81,20 @@ type PodVolumeRestoreStatus struct {
Progress PodVolumeOperationProgress `json:"progress,omitempty"`
}

// TODO(2.0) After converting all resources to use the runtime-controller client, the genclient and k8s:deepcopy markers will no longer be needed and should be removed.
// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +kubebuilder:object:generate=true
// +kubebuilder:object:root=true
// +kubebuilder:storageversion
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="Namespace",type="string",JSONPath=".spec.pod.namespace",description="Namespace of the pod containing the volume to be restored"
// +kubebuilder:printcolumn:name="Pod",type="string",JSONPath=".spec.pod.name",description="Name of the pod containing the volume to be restored"
// +kubebuilder:printcolumn:name="Volume",type="string",JSONPath=".spec.volume",description="Name of the volume to be restored"
// +kubebuilder:printcolumn:name="Status",type="string",JSONPath=".status.phase",description="Pod Volume Restore status such as New/InProgress"
// +kubebuilder:printcolumn:name="TotalBytes",type="integer",format="int64",JSONPath=".status.progress.totalBytes",description="Pod Volume Restore status such as New/InProgress"
// +kubebuilder:printcolumn:name="BytesDone",type="integer",format="int64",JSONPath=".status.progress.bytesDone",description="Pod Volume Restore status such as New/InProgress"
// +kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp"

type PodVolumeRestore struct {
metav1.TypeMeta `json:",inline"`
@@ -98,6 +110,8 @@ type PodVolumeRestore struct {
}

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +kubebuilder:object:generate=true
// +kubebuilder:object:root=true

// PodVolumeRestoreList is a list of PodVolumeRestores.
type PodVolumeRestoreList struct {
@ -30,27 +30,23 @@ import (
|
|||
v1 "k8s.io/api/core/v1"
|
||||
storagev1api "k8s.io/api/storage/v1"
|
||||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
|
||||
"k8s.io/apimachinery/pkg/fields"
|
||||
"k8s.io/apimachinery/pkg/runtime"
|
||||
"k8s.io/apimachinery/pkg/util/clock"
|
||||
"k8s.io/apimachinery/pkg/util/sets"
|
||||
kubeinformers "k8s.io/client-go/informers"
|
||||
corev1informers "k8s.io/client-go/informers/core/v1"
|
||||
"k8s.io/client-go/kubernetes"
|
||||
"k8s.io/client-go/tools/cache"
|
||||
ctrl "sigs.k8s.io/controller-runtime"
|
||||
"sigs.k8s.io/controller-runtime/pkg/cache"
|
||||
"sigs.k8s.io/controller-runtime/pkg/log/zap"
|
||||
"sigs.k8s.io/controller-runtime/pkg/manager"
|
||||
|
||||
"github.com/vmware-tanzu/velero/internal/credentials"
|
||||
"github.com/vmware-tanzu/velero/internal/util/managercontroller"
|
||||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
|
||||
"github.com/vmware-tanzu/velero/pkg/buildinfo"
|
||||
"github.com/vmware-tanzu/velero/pkg/client"
|
||||
"github.com/vmware-tanzu/velero/pkg/cmd"
|
||||
"github.com/vmware-tanzu/velero/pkg/cmd/util/signals"
|
||||
"github.com/vmware-tanzu/velero/pkg/controller"
|
||||
clientset "github.com/vmware-tanzu/velero/pkg/generated/clientset/versioned"
|
||||
informers "github.com/vmware-tanzu/velero/pkg/generated/informers/externalversions"
|
||||
"github.com/vmware-tanzu/velero/pkg/metrics"
|
||||
"github.com/vmware-tanzu/velero/pkg/restic"
|
||||
"github.com/vmware-tanzu/velero/pkg/util/filesystem"
|
||||
|
@ -101,45 +97,17 @@ func NewServerCommand(f client.Factory) *cobra.Command {
|
|||
}
|
||||
|
||||
type resticServer struct {
|
||||
kubeClient kubernetes.Interface
|
||||
veleroClient clientset.Interface
|
||||
veleroInformerFactory informers.SharedInformerFactory
|
||||
kubeInformerFactory kubeinformers.SharedInformerFactory
|
||||
podInformer cache.SharedIndexInformer
|
||||
logger logrus.FieldLogger
|
||||
ctx context.Context
|
||||
cancelFunc context.CancelFunc
|
||||
fileSystem filesystem.Interface
|
||||
mgr manager.Manager
|
||||
metrics *metrics.ServerMetrics
|
||||
metricsAddress string
|
||||
namespace string
|
||||
logger logrus.FieldLogger
|
||||
ctx context.Context
|
||||
cancelFunc context.CancelFunc
|
||||
fileSystem filesystem.Interface
|
||||
mgr manager.Manager
|
||||
metrics *metrics.ServerMetrics
|
||||
metricsAddress string
|
||||
namespace string
|
||||
}
|
||||
|
||||
func newResticServer(logger logrus.FieldLogger, factory client.Factory, metricAddress string) (*resticServer, error) {
|
||||
|
||||
kubeClient, err := factory.KubeClient()
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
veleroClient, err := factory.Client()
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
// use a stand-alone pod informer because we want to use a field selector to
|
||||
// filter to only pods scheduled on this node.
|
||||
podInformer := corev1informers.NewFilteredPodInformer(
|
||||
kubeClient,
|
||||
metav1.NamespaceAll,
|
||||
0,
|
||||
cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc},
|
||||
func(opts *metav1.ListOptions) {
|
||||
opts.FieldSelector = fmt.Sprintf("spec.nodeName=%s", os.Getenv("NODE_NAME"))
|
||||
},
|
||||
)
|
||||
|
||||
ctx, cancelFunc := context.WithCancel(context.Background())
|
||||
|
||||
clientConfig, err := factory.ClientConfig()
|
||||
|
@ -152,29 +120,40 @@ func newResticServer(logger logrus.FieldLogger, factory client.Factory, metricAd
|
|||
velerov1api.AddToScheme(scheme)
|
||||
v1.AddToScheme(scheme)
|
||||
storagev1api.AddToScheme(scheme)
|
||||
|
||||
// use a field selector to filter to only pods scheduled on this node.
|
||||
cacheOption := cache.Options{
|
||||
SelectorsByObject: cache.SelectorsByObject{
|
||||
&v1.Pod{}: {
|
||||
Field: fields.Set{"spec.nodeName": os.Getenv("NODE_NAME")}.AsSelector(),
|
||||
},
|
||||
},
|
||||
}
|
||||
mgr, err := ctrl.NewManager(clientConfig, ctrl.Options{
|
||||
Scheme: scheme,
|
||||
Scheme: scheme,
|
||||
NewCache: cache.BuilderWithOptions(cacheOption),
|
||||
})
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
s := &resticServer{
|
||||
kubeClient: kubeClient,
|
||||
veleroClient: veleroClient,
|
||||
veleroInformerFactory: informers.NewFilteredSharedInformerFactory(veleroClient, 0, factory.Namespace(), nil),
|
||||
kubeInformerFactory: kubeinformers.NewSharedInformerFactory(kubeClient, 0),
|
||||
podInformer: podInformer,
|
||||
logger: logger,
|
||||
ctx: ctx,
|
||||
cancelFunc: cancelFunc,
|
||||
fileSystem: filesystem.NewFileSystem(),
|
||||
mgr: mgr,
|
||||
metricsAddress: metricAddress,
|
||||
namespace: factory.Namespace(),
|
||||
logger: logger,
|
||||
ctx: ctx,
|
||||
cancelFunc: cancelFunc,
|
||||
fileSystem: filesystem.NewFileSystem(),
|
||||
mgr: mgr,
|
||||
metricsAddress: metricAddress,
|
||||
namespace: factory.Namespace(),
|
||||
}
|
||||
|
||||
if err := s.validatePodVolumesHostPath(); err != nil {
|
||||
// the cache isn't initialized yet when "validatePodVolumesHostPath" is called, the client returned by the manager cannot
|
||||
// be used, so we need the kube client here
|
||||
client, err := factory.KubeClient()
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if err := s.validatePodVolumesHostPath(client); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
|
@ -218,38 +197,14 @@ func (s *resticServer) run() {
|
|||
FileSystem: filesystem.NewFileSystem(),
|
||||
ResticExec: restic.BackupExec{},
|
||||
Log: s.logger,
|
||||
|
||||
PvLister: s.kubeInformerFactory.Core().V1().PersistentVolumes().Lister(),
|
||||
PvcLister: s.kubeInformerFactory.Core().V1().PersistentVolumeClaims().Lister(),
|
||||
}
|
||||
if err := pvbReconciler.SetupWithManager(s.mgr); err != nil {
|
||||
s.logger.Fatal(err, "unable to create controller", "controller", controller.PodVolumeBackup)
|
||||
}
|
||||
|
||||
restoreController := controller.NewPodVolumeRestoreController(
|
||||
s.logger,
|
||||
s.veleroInformerFactory.Velero().V1().PodVolumeRestores(),
|
||||
s.veleroClient.VeleroV1(),
|
||||
s.podInformer,
|
||||
s.kubeInformerFactory.Core().V1().PersistentVolumeClaims(),
|
||||
s.kubeInformerFactory.Core().V1().PersistentVolumes(),
|
||||
s.mgr.GetClient(),
|
||||
os.Getenv("NODE_NAME"),
|
||||
credentialFileStore,
|
||||
)
|
||||
|
||||
go s.veleroInformerFactory.Start(s.ctx.Done())
|
||||
go s.kubeInformerFactory.Start(s.ctx.Done())
|
||||
go s.podInformer.Run(s.ctx.Done())
|
||||
|
||||
// TODO(2.0): presuming all controllers and resources are converted to runtime-controller
|
||||
// by v2.0, the block from this line and including the `s.mgr.Start() will be
|
||||
// deprecated, since the manager auto-starts all the caches. Until then, we need to start the
|
||||
// cache for them manually.
|
||||
|
||||
// Adding the controllers to the manager will register them as a (runtime-controller) runnable,
|
||||
// so the manager will ensure the cache is started and ready before all controller are started
|
||||
s.mgr.Add(managercontroller.Runnable(restoreController, 1))
|
||||
if err = controller.NewPodVolumeRestoreReconciler(s.logger, s.mgr.GetClient(), credentialFileStore).SetupWithManager(s.mgr); err != nil {
|
||||
s.logger.WithError(err).Fatal("Unable to create the pod volume restore controller")
|
||||
}
|
||||
|
||||
s.logger.Info("Controllers starting...")
|
||||
|
||||
|
@ -260,7 +215,7 @@ func (s *resticServer) run() {
|
|||
|
||||
// validatePodVolumesHostPath validates that the pod volumes path contains a
|
||||
// directory for each Pod running on this node
|
||||
func (s *resticServer) validatePodVolumesHostPath() error {
|
||||
func (s *resticServer) validatePodVolumesHostPath(client kubernetes.Interface) error {
|
||||
files, err := s.fileSystem.ReadDir("/host_pods/")
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "could not read pod volumes host path")
|
||||
|
@ -274,7 +229,7 @@ func (s *resticServer) validatePodVolumesHostPath() error {
|
|||
}
|
||||
}
|
||||
|
||||
pods, err := s.kubeClient.CoreV1().Pods("").List(s.ctx, metav1.ListOptions{FieldSelector: fmt.Sprintf("spec.nodeName=%s,status.phase=Running", os.Getenv("NODE_NAME"))})
|
||||
pods, err := client.CoreV1().Pods("").List(s.ctx, metav1.ListOptions{FieldSelector: fmt.Sprintf("spec.nodeName=%s,status.phase=Running", os.Getenv("NODE_NAME"))})
|
||||
if err != nil {
|
||||
return errors.WithStack(err)
|
||||
}
|
||||
|
|
|
@ -95,12 +95,11 @@ func Test_validatePodVolumesHostPath(t *testing.T) {
|
|||
}
|
||||
|
||||
s := &resticServer{
|
||||
kubeClient: kubeClient,
|
||||
logger: testutil.NewLogger(),
|
||||
fileSystem: fs,
|
||||
}
|
||||
|
||||
err := s.validatePodVolumesHostPath()
|
||||
err := s.validatePodVolumesHostPath(kubeClient)
|
||||
if tt.wantErr {
|
||||
assert.Error(t, err)
|
||||
} else {
|
||||
|
|
|
@ -302,6 +302,7 @@ func newServer(f client.Factory, config serverConfig, logger *logrus.Logger) (*s
|
|||
scheme := runtime.NewScheme()
|
||||
velerov1api.AddToScheme(scheme)
|
||||
corev1api.AddToScheme(scheme)
|
||||
snapshotv1api.AddToScheme(scheme)
|
||||
|
||||
ctrl.SetLogger(logrusr.NewLogger(logger))
|
||||
|
||||
|
@ -476,6 +477,7 @@ func (s *server) veleroResourcesExist() error {
|
|||
// have restic restores run before controllers adopt the pods.
|
||||
// - Replica sets go before deployments/other controllers so they can be explicitly
|
||||
// restored and be adopted by controllers.
|
||||
// - CAPI ClusterClasses go before Clusters.
|
||||
// - CAPI Clusters come before ClusterResourceSets because failing to do so means the CAPI controller-manager will panic.
|
||||
// Both Clusters and ClusterResourceSets need to come before ClusterResourceSetBinding in order to properly restore workload clusters.
|
||||
// See https://github.com/kubernetes-sigs/cluster-api/issues/4105
|
||||
|
@ -498,6 +500,7 @@ var defaultRestorePriorities = []string{
|
|||
// to ensure that we prioritize restoring from "apps" too, since this is how they're stored
|
||||
// in the backup.
|
||||
"replicasets.apps",
|
||||
"clusterclasses.cluster.x-k8s.io",
|
||||
"clusters.cluster.x-k8s.io",
|
||||
"clusterresourcesets.addons.cluster.x-k8s.io",
|
||||
}
|
||||
|
@ -597,6 +600,7 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
|
|||
s.mgr.GetClient(),
|
||||
s.veleroClient.VeleroV1(),
|
||||
s.sharedInformerFactory.Velero().V1().Backups().Lister(),
|
||||
csiVSLister,
|
||||
s.config.backupSyncPeriod,
|
||||
s.namespace,
|
||||
s.csiSnapshotClient,
|
||||
|
@ -646,6 +650,7 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
|
|||
s.metrics,
|
||||
s.config.formatFlag.Parse(),
|
||||
csiVSLister,
|
||||
s.csiSnapshotClient,
|
||||
csiVSCLister,
|
||||
csiVSClassLister,
|
||||
backupStoreGetter,
|
||||
|
@ -672,34 +677,6 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
|
|||
}
|
||||
}
|
||||
|
||||
deletionControllerRunInfo := func() controllerRunInfo {
|
||||
deletionController := controller.NewBackupDeletionController(
|
||||
s.logger,
|
||||
s.sharedInformerFactory.Velero().V1().DeleteBackupRequests(),
|
||||
s.veleroClient.VeleroV1(), // deleteBackupRequestClient
|
||||
s.veleroClient.VeleroV1(), // backupClient
|
||||
s.sharedInformerFactory.Velero().V1().Restores().Lister(),
|
||||
s.veleroClient.VeleroV1(), // restoreClient
|
||||
backupTracker,
|
||||
s.resticManager,
|
||||
s.sharedInformerFactory.Velero().V1().PodVolumeBackups().Lister(),
|
||||
s.mgr.GetClient(),
|
||||
s.sharedInformerFactory.Velero().V1().VolumeSnapshotLocations().Lister(),
|
||||
csiVSLister,
|
||||
csiVSCLister,
|
||||
s.csiSnapshotClient,
|
||||
newPluginManager,
|
||||
backupStoreGetter,
|
||||
s.metrics,
|
||||
s.discoveryHelper,
|
||||
)
|
||||
|
||||
return controllerRunInfo{
|
||||
controller: deletionController,
|
||||
numWorkers: defaultControllerWorkers,
|
||||
}
|
||||
}
|
||||
|
||||
restoreControllerRunInfo := func() controllerRunInfo {
|
||||
restorer, err := restore.NewKubernetesRestorer(
|
||||
s.veleroClient.VeleroV1(),
|
||||
|
@ -743,7 +720,6 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
|
|||
controller.BackupSync: backupSyncControllerRunInfo,
|
||||
controller.Backup: backupControllerRunInfo,
|
||||
controller.GarbageCollection: gcControllerRunInfo,
|
||||
controller.BackupDeletion: deletionControllerRunInfo,
|
||||
controller.Restore: restoreControllerRunInfo,
|
||||
}
|
||||
// Note: all runtime type controllers that can be disabled are grouped separately, below:
|
||||
|
@ -820,6 +796,19 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
|
|||
s.logger.Fatal(err, "unable to create controller", "controller", controller.ResticRepo)
|
||||
}
|
||||
|
||||
if err := controller.NewBackupDeletionReconciler(
|
||||
s.logger,
|
||||
s.mgr.GetClient(),
|
||||
backupTracker,
|
||||
s.resticManager,
|
||||
s.metrics,
|
||||
s.discoveryHelper,
|
||||
newPluginManager,
|
||||
backupStoreGetter,
|
||||
).SetupWithManager(s.mgr); err != nil {
|
||||
s.logger.Fatal(err, "unable to create controller", "controller", controller.BackupDeletion)
|
||||
}
|
||||
|
||||
if _, ok := enabledRuntimeControllers[controller.ServerStatusRequest]; ok {
|
||||
r := controller.ServerStatusRequestReconciler{
|
||||
Scheme: s.mgr.GetScheme(),
|
||||
|
@ -864,7 +853,6 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
|
|||
if err := s.mgr.Start(s.ctx); err != nil {
|
||||
s.logger.Fatal("Problem starting manager", err)
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
|
|
|
@ -25,11 +25,15 @@ import (
|
|||
"io"
|
||||
"io/ioutil"
|
||||
"os"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/apex/log"
|
||||
jsonpatch "github.com/evanphx/json-patch"
|
||||
"github.com/pkg/errors"
|
||||
"github.com/sirupsen/logrus"
|
||||
"golang.org/x/sync/errgroup"
|
||||
v1 "k8s.io/api/core/v1"
|
||||
apierrors "k8s.io/apimachinery/pkg/api/errors"
|
||||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
|
||||
"k8s.io/apimachinery/pkg/labels"
|
||||
|
@ -37,11 +41,13 @@ import (
|
|||
"k8s.io/apimachinery/pkg/util/clock"
|
||||
kerrors "k8s.io/apimachinery/pkg/util/errors"
|
||||
"k8s.io/apimachinery/pkg/util/sets"
|
||||
"k8s.io/apimachinery/pkg/util/wait"
|
||||
"k8s.io/client-go/tools/cache"
|
||||
|
||||
"github.com/vmware-tanzu/velero/pkg/util/csi"
|
||||
|
||||
snapshotv1api "github.com/kubernetes-csi/external-snapshotter/client/v4/apis/volumesnapshot/v1"
|
||||
snapshotterClientSet "github.com/kubernetes-csi/external-snapshotter/client/v4/clientset/versioned"
|
||||
snapshotv1listers "github.com/kubernetes-csi/external-snapshotter/client/v4/listers/volumesnapshot/v1"
|
||||
|
||||
"github.com/vmware-tanzu/velero/internal/storage"
|
||||
|
@ -64,6 +70,7 @@ import (
|
|||
"github.com/vmware-tanzu/velero/pkg/util/logging"
|
||||
"github.com/vmware-tanzu/velero/pkg/volume"
|
||||
|
||||
"sigs.k8s.io/cluster-api/util/patch"
|
||||
kbclient "sigs.k8s.io/controller-runtime/pkg/client"
|
||||
)
|
||||
|
||||
|
@ -87,6 +94,7 @@ type backupController struct {
|
|||
backupStoreGetter persistence.ObjectBackupStoreGetter
|
||||
formatFlag logging.Format
|
||||
volumeSnapshotLister snapshotv1listers.VolumeSnapshotLister
|
||||
volumeSnapshotClient *snapshotterClientSet.Clientset
|
||||
volumeSnapshotContentLister snapshotv1listers.VolumeSnapshotContentLister
|
||||
volumeSnapshotClassLister snapshotv1listers.VolumeSnapshotClassLister
|
||||
}
|
||||
|
@ -109,6 +117,7 @@ func NewBackupController(
|
|||
metrics *metrics.ServerMetrics,
|
||||
formatFlag logging.Format,
|
||||
volumeSnapshotLister snapshotv1listers.VolumeSnapshotLister,
|
||||
volumeSnapshotClient *snapshotterClientSet.Clientset,
|
||||
volumeSnapshotContentLister snapshotv1listers.VolumeSnapshotContentLister,
|
||||
volumesnapshotClassLister snapshotv1listers.VolumeSnapshotClassLister,
|
||||
backupStoreGetter persistence.ObjectBackupStoreGetter,
|
||||
|
@ -132,6 +141,7 @@ func NewBackupController(
|
|||
metrics: metrics,
|
||||
formatFlag: formatFlag,
|
||||
volumeSnapshotLister: volumeSnapshotLister,
|
||||
volumeSnapshotClient: volumeSnapshotClient,
|
||||
volumeSnapshotContentLister: volumeSnapshotContentLister,
|
||||
volumeSnapshotClassLister: volumesnapshotClassLister,
|
||||
backupStoreGetter: backupStoreGetter,
|
||||
|
@ -147,13 +157,12 @@ func NewBackupController(
|
|||
backup := obj.(*velerov1api.Backup)
|
||||
|
||||
switch backup.Status.Phase {
|
||||
case "", velerov1api.BackupPhaseNew:
|
||||
// only process new backups
|
||||
case "", velerov1api.BackupPhaseNew, velerov1api.BackupPhaseInProgress:
|
||||
default:
|
||||
c.logger.WithFields(logrus.Fields{
|
||||
"backup": kubeutil.NamespaceAndName(backup),
|
||||
"phase": backup.Status.Phase,
|
||||
}).Debug("Backup is not new, skipping")
|
||||
}).Debug("Backup is not new or in-progress, skipping")
|
||||
return
|
||||
}
|
||||
|
||||
|
@ -241,7 +250,22 @@ func (c *backupController) processBackup(key string) error {
|
|||
// this key (even though it was a no-op).
|
||||
switch original.Status.Phase {
|
||||
case "", velerov1api.BackupPhaseNew:
|
||||
// only process new backups
|
||||
case velerov1api.BackupPhaseInProgress:
|
||||
// A backup may remain in-progress forever if
// 1) the controller restarts while the backup is being processed, or
// 2) the update of the backup from InProgress to Completed/Failed fails.
// In either case we mark such a backup as Failed so it doesn't stay InProgress indefinitely.
|
||||
updated := original.DeepCopy()
|
||||
updated.Status.Phase = velerov1api.BackupPhaseFailed
|
||||
updated.Status.FailureReason = fmt.Sprintf("got a Backup with unexpected status %q, this may be due to a restart of the controller during the backing up, mark it as %q",
|
||||
velerov1api.BackupPhaseInProgress, updated.Status.Phase)
|
||||
updated.Status.CompletionTimestamp = &metav1.Time{Time: c.clock.Now()}
|
||||
_, err = patchBackup(original, updated, c.client)
|
||||
if err != nil {
|
||||
return errors.Wrapf(err, "error updating Backup status to %s", updated.Status.Phase)
|
||||
}
|
||||
log.Warn(updated.Status.FailureReason)
|
||||
return nil
|
||||
default:
|
||||
return nil
|
||||
}
|
||||
|
@ -261,6 +285,7 @@ func (c *backupController) processBackup(key string) error {
|
|||
if err != nil {
|
||||
return errors.Wrapf(err, "error updating Backup status to %s", request.Status.Phase)
|
||||
}
|
||||
|
||||
// store ref to just-updated item for creating patch
|
||||
original = updatedBackup
|
||||
request.Backup = updatedBackup.DeepCopy()
|
||||
|
@ -287,6 +312,7 @@ func (c *backupController) processBackup(key string) error {
|
|||
// result in the backup being Failed.
|
||||
log.WithError(err).Error("backup failed")
|
||||
request.Status.Phase = velerov1api.BackupPhaseFailed
|
||||
request.Status.FailureReason = err.Error()
|
||||
}
|
||||
|
||||
switch request.Status.Phase {
|
||||
|
@ -621,6 +647,13 @@ func (c *backupController) runBackup(backup *pkgbackup.Request) error {
|
|||
if err != nil {
|
||||
backupLog.Error(err)
|
||||
}
|
||||
|
||||
err = c.checkVolumeSnapshotReadyToUse(context.Background(), volumeSnapshots)
|
||||
if err != nil {
|
||||
backupLog.Errorf("fail to wait VolumeSnapshot change to Ready: %s", err.Error())
|
||||
}
|
||||
|
||||
backup.CSISnapshots = volumeSnapshots
|
||||
}
|
||||
|
||||
if c.volumeSnapshotContentLister != nil {
|
||||
|
@ -646,6 +679,10 @@ func (c *backupController) runBackup(backup *pkgbackup.Request) error {
|
|||
backupLog.Error(err)
|
||||
}
|
||||
}
|
||||
|
||||
// Delete the VolumeSnapshots created in the backup, when CSI feature is enabled.
|
||||
c.deleteVolumeSnapshot(volumeSnapshots, volumeSnapshotContents, *backup, backupLog)
|
||||
|
||||
}
|
||||
|
||||
// Mark completion timestamp before serializing and uploading.
|
||||
|
@ -661,7 +698,7 @@ func (c *backupController) runBackup(backup *pkgbackup.Request) error {
|
|||
|
||||
backup.Status.CSIVolumeSnapshotsAttempted = len(backup.CSISnapshots)
|
||||
for _, vs := range backup.CSISnapshots {
|
||||
if *vs.Status.ReadyToUse {
|
||||
if vs.Status != nil && boolptr.IsSetToTrue(vs.Status.ReadyToUse) {
|
||||
backup.Status.CSIVolumeSnapshotsCompleted++
|
||||
}
|
||||
}
|
||||
|
@ -847,3 +884,170 @@ func encodeToJSONGzip(data interface{}, desc string) (*bytes.Buffer, []error) {
|
|||
|
||||
return buf, nil
|
||||
}
|
||||
|
||||
// Waiting for a VolumeSnapshot's ReadyToUse to become true is time consuming, so the waiting is parallelized
// with goroutines here instead of being done in the CSI plugin, because BackupItemActions cannot easily run in
// parallel yet. Once parallel BackupItemActions are implemented, this logic should move to the CSI plugin,
// as in https://github.com/vmware-tanzu/velero-plugin-for-csi/pull/100
|
||||
func (c *backupController) checkVolumeSnapshotReadyToUse(ctx context.Context, volumesnapshots []*snapshotv1api.VolumeSnapshot) error {
|
||||
eg, _ := errgroup.WithContext(ctx)
|
||||
timeout := 10 * time.Minute
|
||||
interval := 5 * time.Second
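// Note: wait.PollImmediate runs the condition function immediately and then every `interval` until
// `timeout` elapses; returning (false, nil) keeps polling, while a non-nil error stops the wait early.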
|
||||
|
||||
for _, vs := range volumesnapshots {
|
||||
volumeSnapshot := vs
|
||||
eg.Go(func() error {
|
||||
err := wait.PollImmediate(interval, timeout, func() (bool, error) {
|
||||
tmpVS, err := c.volumeSnapshotClient.SnapshotV1().VolumeSnapshots(volumeSnapshot.Namespace).Get(ctx, volumeSnapshot.Name, metav1.GetOptions{})
|
||||
if err != nil {
|
||||
return false, errors.Wrapf(err, fmt.Sprintf("failed to get volumesnapshot %s/%s", volumeSnapshot.Namespace, volumeSnapshot.Name))
|
||||
}
|
||||
if tmpVS.Status == nil || tmpVS.Status.BoundVolumeSnapshotContentName == nil || !boolptr.IsSetToTrue(tmpVS.Status.ReadyToUse) {
|
||||
log.Infof("Waiting for CSI driver to reconcile volumesnapshot %s/%s. Retrying in %ds", volumeSnapshot.Namespace, volumeSnapshot.Name, interval/time.Second)
|
||||
return false, nil
|
||||
}
|
||||
|
||||
return true, nil
|
||||
})
|
||||
if err == wait.ErrWaitTimeout {
|
||||
log.Errorf("Timed out awaiting reconciliation of volumesnapshot %s/%s", volumeSnapshot.Namespace, volumeSnapshot.Name)
|
||||
}
|
||||
return err
|
||||
})
|
||||
}
|
||||
return eg.Wait()
|
||||
}
|
||||
|
||||
// deleteVolumeSnapshot deletes the VolumeSnapshots created during the backup.
// This prevents a later namespace deletion in the cluster from cascading to the VolumeSnapshots,
// which would delete the snapshots on the cloud provider and leave the backup unable to restore the PV.
// If the DeletionPolicy is Retain, the VolumeSnapshot can simply be deleted. If it is Delete, the
// DeletionPolicy is changed to Retain before deleting the VS, and changed back to Delete afterwards.
|
||||
func (c *backupController) deleteVolumeSnapshot(volumeSnapshots []*snapshotv1api.VolumeSnapshot,
|
||||
volumeSnapshotContents []*snapshotv1api.VolumeSnapshotContent,
|
||||
backup pkgbackup.Request, logger logrus.FieldLogger) {
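// Each VolumeSnapshot is deleted in its own goroutine; the WaitGroup makes this function block until
// every deletion (including any DeletionPolicy patch and VolumeSnapshotContent re-creation) has finished.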
|
||||
var wg sync.WaitGroup
|
||||
vscMap := make(map[string]*snapshotv1api.VolumeSnapshotContent)
|
||||
for _, vsc := range volumeSnapshotContents {
|
||||
vscMap[vsc.Name] = vsc
|
||||
}
|
||||
|
||||
for _, vs := range volumeSnapshots {
|
||||
wg.Add(1)
|
||||
go func(vs *snapshotv1api.VolumeSnapshot) {
|
||||
defer wg.Done()
|
||||
var vsc *snapshotv1api.VolumeSnapshotContent
|
||||
modifyVSCFlag := false
|
||||
if vs.Status.BoundVolumeSnapshotContentName != nil &&
|
||||
len(*vs.Status.BoundVolumeSnapshotContentName) > 0 {
|
||||
vsc = vscMap[*vs.Status.BoundVolumeSnapshotContentName]
|
||||
if vsc.Spec.DeletionPolicy == snapshotv1api.VolumeSnapshotContentDelete {
|
||||
modifyVSCFlag = true
|
||||
}
|
||||
}
|
||||
|
||||
// Change the VolumeSnapshotContent's DeletionPolicy to Retain before deleting the VolumeSnapshot,
// because with DeletionPolicy set to Delete, deleting the VolumeSnapshot would also delete the
// VolumeSnapshotContent, and Velero needs the VSC to clean up the snapshot on the cloud provider
// during backup deletion.
|
||||
if modifyVSCFlag {
|
||||
logger.Debugf("Patching VolumeSnapshotContent %s", vsc.Name)
|
||||
_, err := c.patchVolumeSnapshotContent(vsc, func(req *snapshotv1api.VolumeSnapshotContent) {
|
||||
req.Spec.DeletionPolicy = snapshotv1api.VolumeSnapshotContentRetain
|
||||
})
|
||||
if err != nil {
|
||||
logger.Errorf("fail to modify VolumeSnapshotContent %s DeletionPolicy to Retain: %s", vsc.Name, err.Error())
|
||||
return
|
||||
}
|
||||
|
||||
defer func() {
|
||||
logger.Debugf("Start to recreate VolumeSnapshotContent %s", vsc.Name)
|
||||
err := c.recreateVolumeSnapshotContent(vsc)
|
||||
if err != nil {
|
||||
logger.Errorf("fail to recreate VolumeSnapshotContent %s: %s", vsc.Name, err.Error())
|
||||
}
|
||||
}()
|
||||
}
|
||||
|
||||
// Delete VolumeSnapshot from cluster
|
||||
logger.Debugf("Deleting VolumeSnapshotContent %s", vsc.Name)
|
||||
err := c.volumeSnapshotClient.SnapshotV1().VolumeSnapshots(vs.Namespace).Delete(context.TODO(), vs.Name, metav1.DeleteOptions{})
|
||||
if err != nil {
|
||||
logger.Errorf("fail to delete VolumeSnapshot %s/%s: %s", vs.Namespace, vs.Name, err.Error())
|
||||
}
|
||||
}(vs)
|
||||
}
|
||||
|
||||
wg.Wait()
|
||||
}
|
||||
|
||||
func (c *backupController) patchVolumeSnapshotContent(req *snapshotv1api.VolumeSnapshotContent, mutate func(*snapshotv1api.VolumeSnapshotContent)) (*snapshotv1api.VolumeSnapshotContent, error) {
|
||||
patchHelper, err := patch.NewHelper(req, c.kbClient)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "fail to get patch helper.")
|
||||
}
|
||||
|
||||
// Mutate
|
||||
mutate(req)
|
||||
|
||||
if err := patchHelper.Patch(context.TODO(), req); err != nil {
|
||||
return nil, errors.Wrapf(err, "fail to patch VolumeSnapshotContent %s", req.Name)
|
||||
}
|
||||
|
||||
return req, nil
|
||||
}
|
||||
|
||||
// recreateVolumeSnapshotContent deletes and then re-creates a VolumeSnapshotContent,
// because some fields in the VolumeSnapshotContent spec are immutable, e.g. VolumeSnapshotRef
// and Source. Source is updated so the csi-controller treats the VSC as statically provisioned with a VS.
// Setting the VolumeSnapshotRef's UID to nil lets the csi-controller detect that the related VS is gone,
// so the VSC can be deleted.
|
||||
func (c *backupController) recreateVolumeSnapshotContent(vsc *snapshotv1api.VolumeSnapshotContent) error {
|
||||
timeout := 1 * time.Minute
|
||||
interval := 1 * time.Second
|
||||
|
||||
err := c.volumeSnapshotClient.SnapshotV1().VolumeSnapshotContents().Delete(context.TODO(), vsc.Name, metav1.DeleteOptions{})
|
||||
if err != nil {
|
||||
return errors.Wrapf(err, "fail to delete VolumeSnapshotContent: %s", vsc.Name)
|
||||
}
|
||||
|
||||
// Wait until the VolumeSnapshotContent is confirmed deleted before re-creating it.
|
||||
err = wait.PollImmediate(interval, timeout, func() (bool, error) {
|
||||
_, err := c.volumeSnapshotClient.SnapshotV1().VolumeSnapshotContents().Get(context.TODO(), vsc.Name, metav1.GetOptions{})
|
||||
if err != nil {
|
||||
if apierrors.IsNotFound(err) {
|
||||
return true, nil
|
||||
}
|
||||
return false, errors.Wrapf(err, fmt.Sprintf("failed to get VolumeSnapshotContent %s", vsc.Name))
|
||||
}
|
||||
return false, nil
|
||||
})
|
||||
if err != nil {
|
||||
return errors.Wrapf(err, "fail to retrieve VolumeSnapshotContent %s info", vsc.Name)
|
||||
}
|
||||
|
||||
// Make the VolumeSnapshotContent static
|
||||
vsc.Spec.Source = snapshotv1api.VolumeSnapshotContentSource{
|
||||
SnapshotHandle: vsc.Status.SnapshotHandle,
|
||||
}
|
||||
// Set VolumeSnapshotRef to a non-existent VolumeSnapshot, because the VolumeSnapshotContent
// validation webhook checks that name and namespace are not empty.
// external-snapshotter needs Source to point at the snapshot handle and the VolumeSnapshot
// reference's UID to be empty in order to consider the VolumeSnapshotContent deletable.
|
||||
vsc.Spec.VolumeSnapshotRef = v1.ObjectReference{
|
||||
APIVersion: snapshotv1api.SchemeGroupVersion.String(),
|
||||
Kind: "VolumeSnapshot",
|
||||
Namespace: "ns-" + string(vsc.UID),
|
||||
Name: "name-" + string(vsc.UID),
|
||||
}
|
||||
// Revert DeletionPolicy to Delete
|
||||
vsc.Spec.DeletionPolicy = snapshotv1api.VolumeSnapshotContentDelete
|
||||
// ResourceVersion shouldn't exist for new creation.
|
||||
vsc.ResourceVersion = ""
|
||||
_, err = c.volumeSnapshotClient.SnapshotV1().VolumeSnapshotContents().Create(context.TODO(), vsc, metav1.CreateOptions{})
|
||||
if err != nil {
|
||||
return errors.Wrapf(err, "fail to create VolumeSnapshotContent %s", vsc.Name)
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
|
|
@ -94,11 +94,6 @@ func TestProcessBackupNonProcessedItems(t *testing.T) {
|
|||
key: "velero/backup-1",
|
||||
backup: defaultBackup().Phase(velerov1api.BackupPhaseFailedValidation).Result(),
|
||||
},
|
||||
{
|
||||
name: "InProgress backup is not processed",
|
||||
key: "velero/backup-1",
|
||||
backup: defaultBackup().Phase(velerov1api.BackupPhaseInProgress).Result(),
|
||||
},
|
||||
{
|
||||
name: "Completed backup is not processed",
|
||||
key: "velero/backup-1",
|
||||
|
@ -140,6 +135,28 @@ func TestProcessBackupNonProcessedItems(t *testing.T) {
|
|||
}
|
||||
}
|
||||
|
||||
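// TestMarkInProgressBackupAsFailed verifies that a backup found in the InProgress phase
// (e.g. left behind by a controller restart) is marked as Failed by processBackup.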
func TestMarkInProgressBackupAsFailed(t *testing.T) {
|
||||
backup := defaultBackup().Phase(velerov1api.BackupPhaseInProgress).Result()
|
||||
clientset := fake.NewSimpleClientset(backup)
|
||||
sharedInformers := informers.NewSharedInformerFactory(clientset, 0)
|
||||
logger := logging.DefaultLogger(logrus.DebugLevel, logging.FormatText)
|
||||
|
||||
c := &backupController{
|
||||
genericController: newGenericController("backup-test", logger),
|
||||
client: clientset.VeleroV1(),
|
||||
lister: sharedInformers.Velero().V1().Backups().Lister(),
|
||||
clock: &clock.RealClock{},
|
||||
}
|
||||
require.NoError(t, sharedInformers.Velero().V1().Backups().Informer().GetStore().Add(backup))
|
||||
|
||||
err := c.processBackup(fmt.Sprintf("%s/%s", backup.Namespace, backup.Name))
|
||||
require.Nil(t, err)
|
||||
|
||||
res, err := clientset.VeleroV1().Backups(backup.Namespace).Get(context.TODO(), backup.Name, metav1.GetOptions{})
|
||||
require.NoError(t, err)
|
||||
assert.Equal(t, velerov1api.BackupPhaseFailed, res.Status.Phase)
|
||||
}
|
||||
|
||||
func TestProcessBackupValidationFailures(t *testing.T) {
|
||||
defaultBackupLocation := builder.ForBackupStorageLocation("velero", "loc-1").Result()
|
||||
|
||||
|
@ -729,6 +746,7 @@ func TestProcessBackupCompletions(t *testing.T) {
|
|||
},
|
||||
Status: velerov1api.BackupStatus{
|
||||
Phase: velerov1api.BackupPhaseFailed,
|
||||
FailureReason: "backup already exists in object storage",
|
||||
Version: 1,
|
||||
FormatVersion: "1.1.0",
|
||||
StartTimestamp: &timestamp,
|
||||
|
@ -766,6 +784,7 @@ func TestProcessBackupCompletions(t *testing.T) {
|
|||
},
|
||||
Status: velerov1api.BackupStatus{
|
||||
Phase: velerov1api.BackupPhaseFailed,
|
||||
FailureReason: "error checking if backup already exists in object storage: Backup already exists in object storage",
|
||||
Version: 1,
|
||||
FormatVersion: "1.1.0",
|
||||
StartTimestamp: &timestamp,
|
||||
|
|
|
@ -23,25 +23,19 @@ import (
|
|||
"time"
|
||||
|
||||
jsonpatch "github.com/evanphx/json-patch"
|
||||
snapshotterClientSet "github.com/kubernetes-csi/external-snapshotter/client/v4/clientset/versioned"
|
||||
snapshotv1listers "github.com/kubernetes-csi/external-snapshotter/client/v4/listers/volumesnapshot/v1"
|
||||
"github.com/pkg/errors"
|
||||
"github.com/sirupsen/logrus"
|
||||
apierrors "k8s.io/apimachinery/pkg/api/errors"
|
||||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
|
||||
"k8s.io/apimachinery/pkg/labels"
|
||||
"k8s.io/apimachinery/pkg/types"
|
||||
"k8s.io/apimachinery/pkg/util/clock"
|
||||
kubeerrs "k8s.io/apimachinery/pkg/util/errors"
|
||||
"k8s.io/client-go/tools/cache"
|
||||
"sigs.k8s.io/cluster-api/util/patch"
|
||||
ctrl "sigs.k8s.io/controller-runtime"
|
||||
|
||||
"github.com/vmware-tanzu/velero/internal/delete"
|
||||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
|
||||
pkgbackup "github.com/vmware-tanzu/velero/pkg/backup"
|
||||
"github.com/vmware-tanzu/velero/pkg/discovery"
|
||||
velerov1client "github.com/vmware-tanzu/velero/pkg/generated/clientset/versioned/typed/velero/v1"
|
||||
velerov1informers "github.com/vmware-tanzu/velero/pkg/generated/informers/externalversions/velero/v1"
|
||||
velerov1listers "github.com/vmware-tanzu/velero/pkg/generated/listers/velero/v1"
|
||||
"github.com/vmware-tanzu/velero/pkg/label"
|
||||
"github.com/vmware-tanzu/velero/pkg/metrics"
|
||||
"github.com/vmware-tanzu/velero/pkg/persistence"
|
||||
|
@ -54,250 +48,202 @@ import (
|
|||
"sigs.k8s.io/controller-runtime/pkg/client"
|
||||
)
|
||||
|
||||
const resticTimeout = time.Minute
|
||||
const (
|
||||
resticTimeout = time.Minute
|
||||
deleteBackupRequestMaxAge = 24 * time.Hour
|
||||
)
|
||||
|
||||
type backupDeletionController struct {
|
||||
*genericController
|
||||
|
||||
deleteBackupRequestClient velerov1client.DeleteBackupRequestsGetter
|
||||
deleteBackupRequestLister velerov1listers.DeleteBackupRequestLister
|
||||
backupClient velerov1client.BackupsGetter
|
||||
restoreLister velerov1listers.RestoreLister
|
||||
restoreClient velerov1client.RestoresGetter
|
||||
backupTracker BackupTracker
|
||||
resticMgr restic.RepositoryManager
|
||||
podvolumeBackupLister velerov1listers.PodVolumeBackupLister
|
||||
kbClient client.Client
|
||||
snapshotLocationLister velerov1listers.VolumeSnapshotLocationLister
|
||||
csiSnapshotLister snapshotv1listers.VolumeSnapshotLister
|
||||
csiSnapshotContentLister snapshotv1listers.VolumeSnapshotContentLister
|
||||
csiSnapshotClient *snapshotterClientSet.Clientset
|
||||
processRequestFunc func(*velerov1api.DeleteBackupRequest) error
|
||||
clock clock.Clock
|
||||
newPluginManager func(logrus.FieldLogger) clientmgmt.Manager
|
||||
backupStoreGetter persistence.ObjectBackupStoreGetter
|
||||
metrics *metrics.ServerMetrics
|
||||
helper discovery.Helper
|
||||
type backupDeletionReconciler struct {
|
||||
client.Client
|
||||
logger logrus.FieldLogger
|
||||
backupTracker BackupTracker
|
||||
resticMgr restic.RepositoryManager
|
||||
metrics *metrics.ServerMetrics
|
||||
clock clock.Clock
|
||||
discoveryHelper discovery.Helper
|
||||
newPluginManager func(logrus.FieldLogger) clientmgmt.Manager
|
||||
backupStoreGetter persistence.ObjectBackupStoreGetter
|
||||
}
|
||||
|
||||
// NewBackupDeletionController creates a new backup deletion controller.
|
||||
func NewBackupDeletionController(
|
||||
// NewBackupDeletionReconciler creates a new backup deletion reconciler.
|
||||
func NewBackupDeletionReconciler(
|
||||
logger logrus.FieldLogger,
|
||||
deleteBackupRequestInformer velerov1informers.DeleteBackupRequestInformer,
|
||||
deleteBackupRequestClient velerov1client.DeleteBackupRequestsGetter,
|
||||
backupClient velerov1client.BackupsGetter,
|
||||
restoreLister velerov1listers.RestoreLister,
|
||||
restoreClient velerov1client.RestoresGetter,
|
||||
client client.Client,
|
||||
backupTracker BackupTracker,
|
||||
resticMgr restic.RepositoryManager,
|
||||
podvolumeBackupLister velerov1listers.PodVolumeBackupLister,
|
||||
kbClient client.Client,
|
||||
snapshotLocationLister velerov1listers.VolumeSnapshotLocationLister,
|
||||
csiSnapshotLister snapshotv1listers.VolumeSnapshotLister,
|
||||
csiSnapshotContentLister snapshotv1listers.VolumeSnapshotContentLister,
|
||||
csiSnapshotClient *snapshotterClientSet.Clientset,
|
||||
newPluginManager func(logrus.FieldLogger) clientmgmt.Manager,
|
||||
backupStoreGetter persistence.ObjectBackupStoreGetter,
|
||||
metrics *metrics.ServerMetrics,
|
||||
helper discovery.Helper,
|
||||
) Interface {
|
||||
c := &backupDeletionController{
|
||||
genericController: newGenericController(BackupDeletion, logger),
|
||||
deleteBackupRequestClient: deleteBackupRequestClient,
|
||||
deleteBackupRequestLister: deleteBackupRequestInformer.Lister(),
|
||||
backupClient: backupClient,
|
||||
restoreLister: restoreLister,
|
||||
restoreClient: restoreClient,
|
||||
backupTracker: backupTracker,
|
||||
resticMgr: resticMgr,
|
||||
podvolumeBackupLister: podvolumeBackupLister,
|
||||
kbClient: kbClient,
|
||||
snapshotLocationLister: snapshotLocationLister,
|
||||
csiSnapshotLister: csiSnapshotLister,
|
||||
csiSnapshotContentLister: csiSnapshotContentLister,
|
||||
csiSnapshotClient: csiSnapshotClient,
|
||||
metrics: metrics,
|
||||
helper: helper,
|
||||
// use variables to refer to these functions so they can be
|
||||
// replaced with fakes for testing.
|
||||
newPluginManager func(logrus.FieldLogger) clientmgmt.Manager,
|
||||
backupStoreGetter persistence.ObjectBackupStoreGetter,
|
||||
) *backupDeletionReconciler {
|
||||
return &backupDeletionReconciler{
|
||||
Client: client,
|
||||
logger: logger,
|
||||
backupTracker: backupTracker,
|
||||
resticMgr: resticMgr,
|
||||
metrics: metrics,
|
||||
clock: clock.RealClock{},
|
||||
discoveryHelper: helper,
|
||||
newPluginManager: newPluginManager,
|
||||
backupStoreGetter: backupStoreGetter,
|
||||
|
||||
clock: &clock.RealClock{},
|
||||
}
|
||||
|
||||
c.syncHandler = c.processQueueItem
|
||||
c.processRequestFunc = c.processRequest
|
||||
|
||||
deleteBackupRequestInformer.Informer().AddEventHandler(
|
||||
cache.ResourceEventHandlerFuncs{
|
||||
AddFunc: c.enqueue,
|
||||
},
|
||||
)
|
||||
|
||||
c.resyncPeriod = time.Hour
|
||||
c.resyncFunc = c.deleteExpiredRequests
|
||||
|
||||
return c
|
||||
}
|
||||
|
||||
func (c *backupDeletionController) processQueueItem(key string) error {
|
||||
log := c.logger.WithField("key", key)
|
||||
log.Debug("Running processItem")
|
||||
|
||||
ns, name, err := cache.SplitMetaNamespaceKey(key)
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "error splitting queue key")
|
||||
}
|
||||
|
||||
req, err := c.deleteBackupRequestLister.DeleteBackupRequests(ns).Get(name)
|
||||
if apierrors.IsNotFound(err) {
|
||||
log.Debug("Unable to find DeleteBackupRequest")
|
||||
return nil
|
||||
}
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "error getting DeleteBackupRequest")
|
||||
}
|
||||
|
||||
switch req.Status.Phase {
|
||||
case velerov1api.DeleteBackupRequestPhaseProcessed:
|
||||
// Don't do anything because it's already been processed
|
||||
default:
|
||||
// Don't mutate the shared cache
|
||||
reqCopy := req.DeepCopy()
|
||||
return c.processRequestFunc(reqCopy)
|
||||
}
|
||||
|
||||
return nil
|
||||
func (r *backupDeletionReconciler) SetupWithManager(mgr ctrl.Manager) error {
|
||||
// Make sure the expired requests can be deleted eventually
|
||||
s := kube.NewPeriodicalEnqueueSource(r.logger, mgr.GetClient(), &velerov1api.DeleteBackupRequestList{}, time.Hour)
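// The periodical source re-enqueues every DeleteBackupRequest on an hourly resync, so processed requests
// older than deleteBackupRequestMaxAge are eventually cleaned up even when no new events arrive (which is
// presumably why the Watches call below can pass a nil event handler).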
|
||||
return ctrl.NewControllerManagedBy(mgr).
|
||||
For(&velerov1api.DeleteBackupRequest{}).
|
||||
Watches(s, nil).
|
||||
Complete(r)
|
||||
}
|
||||
|
||||
func (c *backupDeletionController) processRequest(req *velerov1api.DeleteBackupRequest) error {
|
||||
log := c.logger.WithFields(logrus.Fields{
|
||||
"namespace": req.Namespace,
|
||||
"name": req.Name,
|
||||
"backup": req.Spec.BackupName,
|
||||
// +kubebuilder:rbac:groups=velero.io,resources=deletebackuprequests,verbs=get;list;watch;create;update;patch;delete
|
||||
// +kubebuilder:rbac:groups=velero.io,resources=deletebackuprequests/status,verbs=get;update;patch
|
||||
// +kubebuilder:rbac:groups=velero.io,resources=backups,verbs=delete
|
||||
|
||||
func (r *backupDeletionReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
|
||||
log := r.logger.WithFields(logrus.Fields{
|
||||
"controller": BackupDeletion,
|
||||
"deletebackuprequest": req.String(),
|
||||
})
|
||||
log.Debug("Getting deletebackuprequest")
|
||||
dbr := &velerov1api.DeleteBackupRequest{}
|
||||
if err := r.Get(ctx, req.NamespacedName, dbr); err != nil {
|
||||
if apierrors.IsNotFound(err) {
|
||||
log.Debug("Unable to find the deletebackuprequest")
|
||||
return ctrl.Result{}, nil
|
||||
}
|
||||
log.WithError(err).Error("Error getting deletebackuprequest")
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
var err error
|
||||
// Since we use the reconciler along with the PeriodicalEnqueueSource, there may be reconciliation triggered by
|
||||
// stale requests.
|
||||
if dbr.Status.Phase == velerov1api.DeleteBackupRequestPhaseProcessed {
|
||||
age := r.clock.Now().Sub(dbr.CreationTimestamp.Time)
|
||||
if age >= deleteBackupRequestMaxAge { // delete the expired request
|
||||
log.Debug("The request is expired, deleting it.")
|
||||
if err := r.Delete(ctx, dbr); err != nil {
|
||||
log.WithError(err).Error("Error deleting DeleteBackupRequest")
|
||||
}
|
||||
} else {
|
||||
log.Info("The request has been processed, skip.")
|
||||
}
|
||||
return ctrl.Result{}, nil
|
||||
}
|
||||
|
||||
// Make sure we have the backup name
|
||||
if req.Spec.BackupName == "" {
|
||||
_, err = c.patchDeleteBackupRequest(req, func(r *velerov1api.DeleteBackupRequest) {
|
||||
r.Status.Phase = velerov1api.DeleteBackupRequestPhaseProcessed
|
||||
r.Status.Errors = []string{"spec.backupName is required"}
|
||||
if dbr.Spec.BackupName == "" {
|
||||
_, err := r.patchDeleteBackupRequest(ctx, dbr, func(res *velerov1api.DeleteBackupRequest) {
|
||||
res.Status.Phase = velerov1api.DeleteBackupRequestPhaseProcessed
|
||||
res.Status.Errors = []string{"spec.backupName is required"}
|
||||
})
|
||||
return err
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
log = log.WithField("backup", dbr.Spec.BackupName)
|
||||
|
||||
// Remove any existing deletion requests for this backup so we only have
|
||||
// one at a time
|
||||
if errs := c.deleteExistingDeletionRequests(req, log); errs != nil {
|
||||
return kubeerrs.NewAggregate(errs)
|
||||
if errs := r.deleteExistingDeletionRequests(ctx, dbr, log); errs != nil {
|
||||
return ctrl.Result{}, kubeerrs.NewAggregate(errs)
|
||||
}
|
||||
|
||||
// Don't allow deleting an in-progress backup
|
||||
if c.backupTracker.Contains(req.Namespace, req.Spec.BackupName) {
|
||||
_, err = c.patchDeleteBackupRequest(req, func(r *velerov1api.DeleteBackupRequest) {
|
||||
if r.backupTracker.Contains(dbr.Namespace, dbr.Spec.BackupName) {
|
||||
_, err := r.patchDeleteBackupRequest(ctx, dbr, func(r *velerov1api.DeleteBackupRequest) {
|
||||
r.Status.Phase = velerov1api.DeleteBackupRequestPhaseProcessed
|
||||
r.Status.Errors = []string{"backup is still in progress"}
|
||||
})
|
||||
|
||||
return err
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
// Get the backup we're trying to delete
|
||||
backup, err := c.backupClient.Backups(req.Namespace).Get(context.TODO(), req.Spec.BackupName, metav1.GetOptions{})
|
||||
if apierrors.IsNotFound(err) {
|
||||
backup := &velerov1api.Backup{}
|
||||
if err := r.Get(ctx, types.NamespacedName{
|
||||
Namespace: dbr.Namespace,
|
||||
Name: dbr.Spec.BackupName,
|
||||
}, backup); apierrors.IsNotFound(err) {
|
||||
// Couldn't find backup - update status to Processed and record the not-found error
|
||||
req, err = c.patchDeleteBackupRequest(req, func(r *velerov1api.DeleteBackupRequest) {
|
||||
_, err = r.patchDeleteBackupRequest(ctx, dbr, func(r *velerov1api.DeleteBackupRequest) {
|
||||
r.Status.Phase = velerov1api.DeleteBackupRequestPhaseProcessed
|
||||
r.Status.Errors = []string{"backup not found"}
|
||||
})
|
||||
|
||||
return err
|
||||
}
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "error getting backup")
|
||||
return ctrl.Result{}, err
|
||||
} else if err != nil {
|
||||
return ctrl.Result{}, errors.Wrap(err, "error getting backup")
|
||||
}
|
||||
|
||||
// Don't allow deleting backups in read-only storage locations
|
||||
location := &velerov1api.BackupStorageLocation{}
|
||||
if err := c.kbClient.Get(context.Background(), client.ObjectKey{
|
||||
if err := r.Get(context.Background(), client.ObjectKey{
|
||||
Namespace: backup.Namespace,
|
||||
Name: backup.Spec.StorageLocation,
|
||||
}, location); err != nil {
|
||||
if apierrors.IsNotFound(err) {
|
||||
_, err := c.patchDeleteBackupRequest(req, func(r *velerov1api.DeleteBackupRequest) {
|
||||
_, err := r.patchDeleteBackupRequest(ctx, dbr, func(r *velerov1api.DeleteBackupRequest) {
|
||||
r.Status.Phase = velerov1api.DeleteBackupRequestPhaseProcessed
|
||||
r.Status.Errors = append(r.Status.Errors, fmt.Sprintf("backup storage location %s not found", backup.Spec.StorageLocation))
|
||||
})
|
||||
return err
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
return errors.Wrap(err, "error getting backup storage location")
|
||||
return ctrl.Result{}, errors.Wrap(err, "error getting backup storage location")
|
||||
}
|
||||
|
||||
if location.Spec.AccessMode == velerov1api.BackupStorageLocationAccessModeReadOnly {
|
||||
_, err := c.patchDeleteBackupRequest(req, func(r *velerov1api.DeleteBackupRequest) {
|
||||
_, err := r.patchDeleteBackupRequest(ctx, dbr, func(r *velerov1api.DeleteBackupRequest) {
|
||||
r.Status.Phase = velerov1api.DeleteBackupRequestPhaseProcessed
|
||||
r.Status.Errors = append(r.Status.Errors, fmt.Sprintf("cannot delete backup because backup storage location %s is currently in read-only mode", location.Name))
|
||||
})
|
||||
return err
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
// if the request object has no labels defined, initialise an empty map since
|
||||
// we will be updating labels
|
||||
if req.Labels == nil {
|
||||
req.Labels = map[string]string{}
|
||||
if dbr.Labels == nil {
|
||||
dbr.Labels = map[string]string{}
|
||||
}
|
||||
|
||||
// Update status to InProgress and set backup-name label if needed
|
||||
req, err = c.patchDeleteBackupRequest(req, func(r *velerov1api.DeleteBackupRequest) {
|
||||
// Update status to InProgress and set backup-name and backup-uid label if needed
|
||||
dbr, err := r.patchDeleteBackupRequest(ctx, dbr, func(r *velerov1api.DeleteBackupRequest) {
|
||||
r.Status.Phase = velerov1api.DeleteBackupRequestPhaseInProgress
|
||||
|
||||
if req.Labels[velerov1api.BackupNameLabel] == "" {
|
||||
req.Labels[velerov1api.BackupNameLabel] = label.GetValidName(req.Spec.BackupName)
|
||||
if r.Labels[velerov1api.BackupNameLabel] == "" {
|
||||
r.Labels[velerov1api.BackupNameLabel] = label.GetValidName(dbr.Spec.BackupName)
|
||||
}
|
||||
|
||||
if r.Labels[velerov1api.BackupUIDLabel] == "" {
|
||||
r.Labels[velerov1api.BackupUIDLabel] = string(backup.UID)
|
||||
}
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
// Set backup-uid label if needed
|
||||
if req.Labels[velerov1api.BackupUIDLabel] == "" {
|
||||
req, err = c.patchDeleteBackupRequest(req, func(r *velerov1api.DeleteBackupRequest) {
|
||||
req.Labels[velerov1api.BackupUIDLabel] = string(backup.UID)
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
// Set backup status to Deleting
|
||||
backup, err = c.patchBackup(backup, func(b *velerov1api.Backup) {
|
||||
backup, err = r.patchBackup(ctx, backup, func(b *velerov1api.Backup) {
|
||||
b.Status.Phase = velerov1api.BackupPhaseDeleting
|
||||
})
|
||||
if err != nil {
|
||||
log.WithError(errors.WithStack(err)).Error("Error setting backup phase to deleting")
|
||||
return err
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
backupScheduleName := backup.GetLabels()[velerov1api.ScheduleNameLabel]
|
||||
c.metrics.RegisterBackupDeletionAttempt(backupScheduleName)
|
||||
r.metrics.RegisterBackupDeletionAttempt(backupScheduleName)
|
||||
|
||||
var errs []string
|
||||
|
||||
pluginManager := c.newPluginManager(log)
|
||||
pluginManager := r.newPluginManager(log)
|
||||
defer pluginManager.CleanupClients()
|
||||
|
||||
backupStore, err := c.backupStoreGetter.Get(location, pluginManager, log)
|
||||
backupStore, err := r.backupStoreGetter.Get(location, pluginManager, log)
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "error getting the backup store")
|
||||
return ctrl.Result{}, errors.Wrap(err, "error getting the backup store")
|
||||
}
|
||||
|
||||
actions, err := pluginManager.GetDeleteItemActions()
|
||||
log.Debugf("%d actions before invoking actions", len(actions))
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "error getting delete item actions")
|
||||
return ctrl.Result{}, errors.Wrap(err, "error getting delete item actions")
|
||||
}
|
||||
// don't defer CleanupClients here, since it was already called above.
|
||||
|
||||
|
@ -308,13 +254,13 @@ func (c *backupDeletionController) processRequest(req *velerov1api.DeleteBackupR
|
|||
if err != nil {
|
||||
log.WithError(err).Errorf("Unable to download tarball for backup %s, skipping associated DeleteItemAction plugins", backup.Name)
|
||||
} else {
|
||||
defer closeAndRemoveFile(backupFile, c.logger)
|
||||
defer closeAndRemoveFile(backupFile, r.logger)
|
||||
ctx := &delete.Context{
|
||||
Backup: backup,
|
||||
BackupReader: backupFile,
|
||||
Actions: actions,
|
||||
Log: c.logger,
|
||||
DiscoveryHelper: c.helper,
|
||||
Log: r.logger,
|
||||
DiscoveryHelper: r.discoveryHelper,
|
||||
Filesystem: filesystem.NewFileSystem(),
|
||||
}
|
||||
|
||||
|
@ -322,11 +268,13 @@ func (c *backupDeletionController) processRequest(req *velerov1api.DeleteBackupR
|
|||
// but what do we do with the error returned? We can't just swallow it as that may lead to dangling resources.
|
||||
err = delete.InvokeDeleteActions(ctx)
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "error invoking delete item actions")
|
||||
return ctrl.Result{}, errors.Wrap(err, "error invoking delete item actions")
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
var errs []string
|
||||
|
||||
if backupStore != nil {
|
||||
log.Info("Removing PV snapshots")
|
||||
|
||||
|
@ -340,7 +288,7 @@ func (c *backupDeletionController) processRequest(req *velerov1api.DeleteBackupR
|
|||
|
||||
volumeSnapshotter, ok := volumeSnapshotters[snapshot.Spec.Location]
|
||||
if !ok {
|
||||
if volumeSnapshotter, err = volumeSnapshotterForSnapshotLocation(backup.Namespace, snapshot.Spec.Location, c.snapshotLocationLister, pluginManager); err != nil {
|
||||
if volumeSnapshotter, err = volumeSnapshottersForVSL(ctx, backup.Namespace, snapshot.Spec.Location, r.Client, pluginManager); err != nil {
|
||||
errs = append(errs, err.Error())
|
||||
continue
|
||||
}
|
||||
|
@ -353,9 +301,8 @@ func (c *backupDeletionController) processRequest(req *velerov1api.DeleteBackupR
|
|||
}
|
||||
}
|
||||
}
|
||||
|
||||
log.Info("Removing restic snapshots")
|
||||
if deleteErrs := c.deleteResticSnapshots(backup); len(deleteErrs) > 0 {
|
||||
if deleteErrs := r.deleteResticSnapshots(ctx, backup); len(deleteErrs) > 0 {
|
||||
for _, err := range deleteErrs {
|
||||
errs = append(errs, err.Error())
|
||||
}
|
||||
|
@ -369,15 +316,19 @@ func (c *backupDeletionController) processRequest(req *velerov1api.DeleteBackupR
|
|||
}
|
||||
|
||||
log.Info("Removing restores")
|
||||
if restores, err := c.restoreLister.Restores(backup.Namespace).List(labels.Everything()); err != nil {
|
||||
restoreList := &velerov1api.RestoreList{}
|
||||
selector := labels.Everything()
|
||||
if err := r.List(ctx, restoreList, &client.ListOptions{
|
||||
Namespace: backup.Namespace,
|
||||
LabelSelector: selector,
|
||||
}); err != nil {
|
||||
log.WithError(errors.WithStack(err)).Error("Error listing restore API objects")
|
||||
} else {
|
||||
for _, restore := range restores {
|
||||
for _, restore := range restoreList.Items {
|
||||
if restore.Spec.BackupName != backup.Name {
|
||||
continue
|
||||
}
|
||||
|
||||
restoreLog := log.WithField("restore", kube.NamespaceAndName(restore))
|
||||
restoreLog := log.WithField("restore", kube.NamespaceAndName(&restore))
|
||||
|
||||
restoreLog.Info("Deleting restore log/results from backup storage")
|
||||
if err := backupStore.DeleteRestore(restore.Name); err != nil {
|
||||
|
@ -387,202 +338,160 @@ func (c *backupDeletionController) processRequest(req *velerov1api.DeleteBackupR
|
|||
}
|
||||
|
||||
restoreLog.Info("Deleting restore referencing backup")
|
||||
if err := c.restoreClient.Restores(restore.Namespace).Delete(context.TODO(), restore.Name, metav1.DeleteOptions{}); err != nil {
|
||||
errs = append(errs, errors.Wrapf(err, "error deleting restore %s", kube.NamespaceAndName(restore)).Error())
|
||||
if err := r.Delete(ctx, &restore); err != nil {
|
||||
errs = append(errs, errors.Wrapf(err, "error deleting restore %s", kube.NamespaceAndName(&restore)).Error())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if len(errs) == 0 {
|
||||
// Only try to delete the backup object from kube if everything preceding went smoothly
|
||||
err = c.backupClient.Backups(backup.Namespace).Delete(context.TODO(), backup.Name, metav1.DeleteOptions{})
|
||||
if err != nil {
|
||||
if err := r.Delete(ctx, backup); err != nil {
|
||||
errs = append(errs, errors.Wrapf(err, "error deleting backup %s", kube.NamespaceAndName(backup)).Error())
|
||||
}
|
||||
}
|
||||
|
||||
if len(errs) == 0 {
|
||||
c.metrics.RegisterBackupDeletionSuccess(backupScheduleName)
|
||||
r.metrics.RegisterBackupDeletionSuccess(backupScheduleName)
|
||||
} else {
|
||||
c.metrics.RegisterBackupDeletionFailed(backupScheduleName)
|
||||
r.metrics.RegisterBackupDeletionFailed(backupScheduleName)
|
||||
}
|
||||
|
||||
// Update status to processed and record errors
|
||||
req, err = c.patchDeleteBackupRequest(req, func(r *velerov1api.DeleteBackupRequest) {
|
||||
if _, err := r.patchDeleteBackupRequest(ctx, dbr, func(r *velerov1api.DeleteBackupRequest) {
|
||||
r.Status.Phase = velerov1api.DeleteBackupRequestPhaseProcessed
|
||||
r.Status.Errors = errs
|
||||
})
|
||||
if err != nil {
|
||||
return err
|
||||
}); err != nil {
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
// Everything deleted correctly, so we can delete all DeleteBackupRequests for this backup
|
||||
if len(errs) == 0 {
|
||||
listOptions := pkgbackup.NewDeleteBackupRequestListOptions(backup.Name, string(backup.UID))
|
||||
err = c.deleteBackupRequestClient.DeleteBackupRequests(req.Namespace).DeleteCollection(context.TODO(), metav1.DeleteOptions{}, listOptions)
|
||||
labelSelector, err := labels.Parse(fmt.Sprintf("%s=%s,%s=%s", velerov1api.BackupNameLabel, label.GetValidName(backup.Name), velerov1api.BackupUIDLabel, backup.UID))
|
||||
if err != nil {
|
||||
// Should not be here
|
||||
r.logger.WithError(err).WithField("backup", kube.NamespaceAndName(backup)).Error("error creating label selector for the backup for deleting DeleteBackupRequests")
|
||||
return ctrl.Result{}, nil
|
||||
}
|
||||
alldbr := &velerov1api.DeleteBackupRequest{}
|
||||
err = r.DeleteAllOf(ctx, alldbr, client.MatchingLabelsSelector{
|
||||
Selector: labelSelector,
|
||||
}, client.InNamespace(dbr.Namespace))
|
||||
if err != nil {
|
||||
// If this errors, all we can do is log it.
|
||||
c.logger.WithField("backup", kube.NamespaceAndName(backup)).Error("error deleting all associated DeleteBackupRequests after successfully deleting the backup")
|
||||
r.logger.WithError(err).WithField("backup", kube.NamespaceAndName(backup)).Error("error deleting all associated DeleteBackupRequests after successfully deleting the backup")
|
||||
}
|
||||
}
|
||||
log.Infof("Reconciliation done")
|
||||
|
||||
return nil
|
||||
return ctrl.Result{}, nil
|
||||
}
|
||||
|
||||
func volumeSnapshotterForSnapshotLocation(
|
||||
namespace, snapshotLocationName string,
|
||||
snapshotLocationLister velerov1listers.VolumeSnapshotLocationLister,
|
||||
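// volumeSnapshottersForVSL looks up the named VolumeSnapshotLocation and returns an initialized
// VolumeSnapshotter for its provider.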
func volumeSnapshottersForVSL(
|
||||
ctx context.Context,
|
||||
namespace, vslName string,
|
||||
client client.Client,
|
||||
pluginManager clientmgmt.Manager,
|
||||
) (velero.VolumeSnapshotter, error) {
|
||||
snapshotLocation, err := snapshotLocationLister.VolumeSnapshotLocations(namespace).Get(snapshotLocationName)
|
||||
vsl := &velerov1api.VolumeSnapshotLocation{}
|
||||
if err := client.Get(ctx, types.NamespacedName{
|
||||
Namespace: namespace,
|
||||
Name: vslName,
|
||||
}, vsl); err != nil {
|
||||
return nil, errors.Wrapf(err, "error getting volume snapshot location %s", vslName)
|
||||
}
|
||||
volumeSnapshotter, err := pluginManager.GetVolumeSnapshotter(vsl.Spec.Provider)
|
||||
if err != nil {
|
||||
return nil, errors.Wrapf(err, "error getting volume snapshot location %s", snapshotLocationName)
|
||||
return nil, errors.Wrapf(err, "error getting volume snapshotter for provider %s", vsl.Spec.Provider)
|
||||
}
|
||||
|
||||
volumeSnapshotter, err := pluginManager.GetVolumeSnapshotter(snapshotLocation.Spec.Provider)
|
||||
if err != nil {
|
||||
return nil, errors.Wrapf(err, "error getting volume snapshotter for provider %s", snapshotLocation.Spec.Provider)
|
||||
}
|
||||
|
||||
if err = volumeSnapshotter.Init(snapshotLocation.Spec.Config); err != nil {
|
||||
return nil, errors.Wrapf(err, "error initializing volume snapshotter for volume snapshot location %s", snapshotLocationName)
|
||||
if err = volumeSnapshotter.Init(vsl.Spec.Config); err != nil {
|
||||
return nil, errors.Wrapf(err, "error initializing volume snapshotter for volume snapshot location %s", vslName)
|
||||
}
|
||||
|
||||
return volumeSnapshotter, nil
|
||||
}
|
||||
|
||||
func (c *backupDeletionController) deleteExistingDeletionRequests(req *velerov1api.DeleteBackupRequest, log logrus.FieldLogger) []error {
|
||||
func (r *backupDeletionReconciler) deleteExistingDeletionRequests(ctx context.Context, req *velerov1api.DeleteBackupRequest, log logrus.FieldLogger) []error {
|
||||
log.Info("Removing existing deletion requests for backup")
|
||||
dbrList := &velerov1api.DeleteBackupRequestList{}
|
||||
selector := label.NewSelectorForBackup(req.Spec.BackupName)
|
||||
dbrs, err := c.deleteBackupRequestLister.DeleteBackupRequests(req.Namespace).List(selector)
|
||||
if err != nil {
|
||||
if err := r.List(ctx, dbrList, &client.ListOptions{
|
||||
Namespace: req.Namespace,
|
||||
LabelSelector: selector,
|
||||
}); err != nil {
|
||||
return []error{errors.Wrap(err, "error listing existing DeleteBackupRequests for backup")}
|
||||
}
|
||||
|
||||
var errs []error
|
||||
for _, dbr := range dbrs {
|
||||
for _, dbr := range dbrList.Items {
|
||||
if dbr.Name == req.Name {
|
||||
continue
|
||||
}
|
||||
|
||||
if err := c.deleteBackupRequestClient.DeleteBackupRequests(req.Namespace).Delete(context.TODO(), dbr.Name, metav1.DeleteOptions{}); err != nil {
|
||||
if err := r.Delete(ctx, &dbr); err != nil {
|
||||
errs = append(errs, errors.WithStack(err))
|
||||
} else {
|
||||
log.Infof("deletion request '%s' removed.", dbr.Name)
|
||||
}
|
||||
}
|
||||
|
||||
return errs
|
||||
}
|
||||
|
||||
func (c *backupDeletionController) deleteResticSnapshots(backup *velerov1api.Backup) []error {
|
||||
if c.resticMgr == nil {
|
||||
func (r *backupDeletionReconciler) deleteResticSnapshots(ctx context.Context, backup *velerov1api.Backup) []error {
|
||||
if r.resticMgr == nil {
|
||||
return nil
|
||||
}
|
||||
|
||||
snapshots, err := restic.GetSnapshotsInBackup(backup, c.podvolumeBackupLister)
|
||||
snapshots, err := restic.GetSnapshotsInBackup(ctx, backup, r.Client)
|
||||
if err != nil {
|
||||
return []error{err}
|
||||
}
|
||||
|
||||
ctx, cancelFunc := context.WithTimeout(context.Background(), resticTimeout)
|
||||
ctx2, cancelFunc := context.WithTimeout(ctx, resticTimeout)
|
||||
defer cancelFunc()
|
||||
|
||||
var errs []error
|
||||
for _, snapshot := range snapshots {
|
||||
if err := c.resticMgr.Forget(ctx, snapshot); err != nil {
|
||||
if err := r.resticMgr.Forget(ctx2, snapshot); err != nil {
|
||||
errs = append(errs, err)
|
||||
}
|
||||
}
|
||||
|
||||
return errs
|
||||
}
|
||||
|
||||
const deleteBackupRequestMaxAge = 24 * time.Hour
|
||||
|
||||
func (c *backupDeletionController) deleteExpiredRequests() {
|
||||
c.logger.Info("Checking for expired DeleteBackupRequests")
|
||||
defer c.logger.Info("Done checking for expired DeleteBackupRequests")
|
||||
|
||||
// Our shared informer factory filters on a single namespace, so asking for all is ok here.
|
||||
requests, err := c.deleteBackupRequestLister.List(labels.Everything())
|
||||
func (r *backupDeletionReconciler) patchDeleteBackupRequest(ctx context.Context, req *velerov1api.DeleteBackupRequest, mutate func(*velerov1api.DeleteBackupRequest)) (*velerov1api.DeleteBackupRequest, error) {
|
||||
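// patch.NewHelper (from cluster-api) records the object's current state; after the mutation below,
// patchHelper.Patch computes the diff against that recorded state and applies it.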
patchHelper, err := patch.NewHelper(req, r.Client)
|
||||
if err != nil {
|
||||
c.logger.WithError(err).Error("unable to check for expired DeleteBackupRequests")
|
||||
return
|
||||
return nil, errors.Wrap(err, "unable to get the patch helper")
|
||||
}
|
||||
|
||||
now := c.clock.Now()
|
||||
|
||||
for _, req := range requests {
|
||||
if req.Status.Phase != velerov1api.DeleteBackupRequestPhaseProcessed {
|
||||
continue
|
||||
}
|
||||
|
||||
age := now.Sub(req.CreationTimestamp.Time)
|
||||
if age >= deleteBackupRequestMaxAge {
|
||||
reqLog := c.logger.WithFields(logrus.Fields{"namespace": req.Namespace, "name": req.Name})
|
||||
reqLog.Info("Deleting expired DeleteBackupRequest")
|
||||
|
||||
err = c.deleteBackupRequestClient.DeleteBackupRequests(req.Namespace).Delete(context.TODO(), req.Name, metav1.DeleteOptions{})
|
||||
if err != nil {
|
||||
reqLog.WithError(err).Error("Error deleting DeleteBackupRequest")
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (c *backupDeletionController) patchDeleteBackupRequest(req *velerov1api.DeleteBackupRequest, mutate func(*velerov1api.DeleteBackupRequest)) (*velerov1api.DeleteBackupRequest, error) {
|
||||
// Record original json
|
||||
oldData, err := json.Marshal(req)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error marshalling original DeleteBackupRequest")
|
||||
}
|
||||
|
||||
// Mutate
|
||||
mutate(req)
|
||||
|
||||
// Record new json
|
||||
newData, err := json.Marshal(req)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error marshalling updated DeleteBackupRequest")
|
||||
if err := patchHelper.Patch(ctx, req); err != nil {
|
||||
return nil, errors.Wrap(err, "error patching the deletebackuprquest")
|
||||
}
|
||||
|
||||
patchBytes, err := jsonpatch.CreateMergePatch(oldData, newData)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error creating json merge patch for DeleteBackupRequest")
|
||||
}
|
||||
|
||||
req, err = c.deleteBackupRequestClient.DeleteBackupRequests(req.Namespace).Patch(context.TODO(), req.Name, types.MergePatchType, patchBytes, metav1.PatchOptions{})
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error patching DeleteBackupRequest")
|
||||
}
|
||||
|
||||
return req, nil
|
||||
}
|
||||
|
||||
func (c *backupDeletionController) patchBackup(backup *velerov1api.Backup, mutate func(*velerov1api.Backup)) (*velerov1api.Backup, error) {
|
||||
func (r *backupDeletionReconciler) patchBackup(ctx context.Context, backup *velerov1api.Backup, mutate func(*velerov1api.Backup)) (*velerov1api.Backup, error) {
|
||||
//TODO: The patchHelper can't be used here because the `backup/xxx/status` subresource does not exist until the backup resource is refactored
|
||||
|
||||
// Record original json
|
||||
oldData, err := json.Marshal(backup)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error marshalling original Backup")
|
||||
}
|
||||
|
||||
// Mutate
|
||||
mutate(backup)
|
||||
|
||||
// Record new json
|
||||
newData, err := json.Marshal(backup)
|
||||
newBackup := backup.DeepCopy()
|
||||
mutate(newBackup)
|
||||
newData, err := json.Marshal(newBackup)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error marshalling updated Backup")
|
||||
}
|
||||
|
||||
patchBytes, err := jsonpatch.CreateMergePatch(oldData, newData)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error creating json merge patch for Backup")
|
||||
}
|
||||
|
||||
backup, err = c.backupClient.Backups(backup.Namespace).Patch(context.TODO(), backup.Name, types.MergePatchType, patchBytes, metav1.PatchOptions{})
|
||||
if err != nil {
|
||||
if err := r.Client.Patch(ctx, backup, client.RawPatch(types.MergePatchType, patchBytes)); err != nil {
|
||||
return nil, errors.Wrap(err, "error patching Backup")
|
||||
}
|
||||
|
||||
return backup, nil
|
||||
}
|
||||
|
|
File diff suppressed because it is too large
|
@ -20,7 +20,9 @@ import (
|
|||
"context"
|
||||
"time"
|
||||
|
||||
snapshotv1api "github.com/kubernetes-csi/external-snapshotter/client/v4/apis/volumesnapshot/v1"
|
||||
snapshotterClientSet "github.com/kubernetes-csi/external-snapshotter/client/v4/clientset/versioned"
|
||||
snapshotv1listers "github.com/kubernetes-csi/external-snapshotter/client/v4/listers/volumesnapshot/v1"
|
||||
"github.com/pkg/errors"
|
||||
"github.com/sirupsen/logrus"
|
||||
kuberrs "k8s.io/apimachinery/pkg/api/errors"
|
||||
|
@ -29,6 +31,8 @@ import (
|
|||
"k8s.io/apimachinery/pkg/util/sets"
|
||||
"k8s.io/client-go/kubernetes"
|
||||
|
||||
"github.com/vmware-tanzu/velero/pkg/util/kube"
|
||||
|
||||
"github.com/vmware-tanzu/velero/internal/storage"
|
||||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
|
||||
"github.com/vmware-tanzu/velero/pkg/features"
|
||||
|
@ -48,6 +52,7 @@ type backupSyncController struct {
|
|||
kbClient client.Client
|
||||
podVolumeBackupClient velerov1client.PodVolumeBackupsGetter
|
||||
backupLister velerov1listers.BackupLister
|
||||
csiVSLister snapshotv1listers.VolumeSnapshotLister
|
||||
csiSnapshotClient *snapshotterClientSet.Clientset
|
||||
kubeClient kubernetes.Interface
|
||||
namespace string
|
||||
|
@ -62,6 +67,7 @@ func NewBackupSyncController(
|
|||
kbClient client.Client,
|
||||
podVolumeBackupClient velerov1client.PodVolumeBackupsGetter,
|
||||
backupLister velerov1listers.BackupLister,
|
||||
csiVSLister snapshotv1listers.VolumeSnapshotLister,
|
||||
syncPeriod time.Duration,
|
||||
namespace string,
|
||||
csiSnapshotClient *snapshotterClientSet.Clientset,
|
||||
|
@ -85,6 +91,7 @@ func NewBackupSyncController(
|
|||
defaultBackupLocation: defaultBackupLocation,
|
||||
defaultBackupSyncPeriod: syncPeriod,
|
||||
backupLister: backupLister,
|
||||
csiVSLister: csiVSLister,
|
||||
csiSnapshotClient: csiSnapshotClient,
|
||||
kubeClient: kubeClient,
|
||||
|
||||
|
@ -358,11 +365,34 @@ func (c *backupSyncController) deleteOrphanedBackups(locationName string, backup
|
|||
if backup.Status.Phase != velerov1api.BackupPhaseCompleted || backupStoreBackups.Has(backup.Name) {
|
||||
continue
|
||||
}
|
||||
|
||||
if err := c.backupClient.Backups(backup.Namespace).Delete(context.TODO(), backup.Name, metav1.DeleteOptions{}); err != nil {
|
||||
log.WithError(errors.WithStack(err)).Error("Error deleting orphaned backup from cluster")
|
||||
} else {
|
||||
log.Debug("Deleted orphaned backup from cluster")
|
||||
c.deleteCSISnapshotsByBackup(backup.Name, log)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
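// deleteCSISnapshotsByBackup removes the CSI VolumeSnapshot and VolumeSnapshotContent
// objects labeled with the given backup name; it is a no-op when the CSI feature flag is disabled.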
func (c *backupSyncController) deleteCSISnapshotsByBackup(backupName string, log logrus.FieldLogger) {
|
||||
if !features.IsEnabled(velerov1api.CSIFeatureFlag) {
|
||||
return
|
||||
}
|
||||
m := client.MatchingLabels{velerov1api.BackupNameLabel: label.GetValidName(backupName)}
|
||||
if vsList, err := c.csiVSLister.List(label.NewSelectorForBackup(label.GetValidName(backupName))); err != nil {
|
||||
log.WithError(err).Warnf("Failed to list volumesnapshots for backup %s; skipping deletion", backupName)
|
||||
} else {
|
||||
for _, vs := range vsList {
|
||||
name := kube.NamespaceAndName(vs.GetObjectMeta())
|
||||
log.Debugf("Deleting volumesnapshot %s", name)
|
||||
if err := c.kbClient.Delete(context.TODO(), vs); err != nil {
|
||||
log.WithError(err).Warnf("Failed to delete volumesnapshot %s", name)
|
||||
}
|
||||
}
|
||||
}
|
||||
vsc := &snapshotv1api.VolumeSnapshotContent{}
|
||||
log.Debugf("Deleting volumesnapshotcontents for backup: %s", backupName)
|
||||
if err := c.kbClient.DeleteAllOf(context.TODO(), vsc, m); err != nil {
|
||||
log.WithError(err).Warnf("Failed to delete volumesnapshotcontents for backup: %s", backupName)
|
||||
}
|
||||
}
|
||||
|
|
|
@ -344,6 +344,7 @@ func TestBackupSyncControllerRun(t *testing.T) {
|
|||
fakeClient,
|
||||
client.VeleroV1(),
|
||||
sharedInformers.Velero().V1().Backups().Lister(),
|
||||
nil, // csiVSLister
|
||||
time.Duration(0),
|
||||
test.namespace,
|
||||
nil, // csiSnapshotClient
|
||||
|
@ -565,6 +566,7 @@ func TestDeleteOrphanedBackups(t *testing.T) {
|
|||
fakeClient,
|
||||
client.VeleroV1(),
|
||||
sharedInformers.Velero().V1().Backups().Lister(),
|
||||
nil, // csiVSLister
|
||||
time.Duration(0),
|
||||
test.namespace,
|
||||
nil, // csiSnapshotClient
|
||||
|
@ -659,6 +661,7 @@ func TestStorageLabelsInDeleteOrphanedBackups(t *testing.T) {
|
|||
fakeClient,
|
||||
client.VeleroV1(),
|
||||
sharedInformers.Velero().V1().Backups().Lister(),
|
||||
nil, // csiVSLister
|
||||
time.Duration(0),
|
||||
test.namespace,
|
||||
nil, // csiSnapshotClient
|
||||
|
|
|
@ -30,7 +30,6 @@ import (
|
|||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
|
||||
"k8s.io/apimachinery/pkg/runtime"
|
||||
"k8s.io/apimachinery/pkg/util/clock"
|
||||
corev1listers "k8s.io/client-go/listers/core/v1"
|
||||
"sigs.k8s.io/cluster-api/util/patch"
|
||||
ctrl "sigs.k8s.io/controller-runtime"
|
||||
"sigs.k8s.io/controller-runtime/pkg/client"
|
||||
|
@ -60,9 +59,6 @@ type PodVolumeBackupReconciler struct {
|
|||
FileSystem filesystem.Interface
|
||||
ResticExec BackupExecuter
|
||||
Log logrus.FieldLogger
|
||||
|
||||
PvLister corev1listers.PersistentVolumeLister
|
||||
PvcLister corev1listers.PersistentVolumeClaimLister
|
||||
}
|
||||
|
||||
// +kubebuilder:rbac:groups=velero.io,resources=podvolumebackups,verbs=get;list;watch;create;update;patch;delete
|
||||
|
@ -302,7 +298,7 @@ type resticDetails struct {
|
|||
}
|
||||
|
||||
func (r *PodVolumeBackupReconciler) buildResticCommand(ctx context.Context, log *logrus.Entry, pvb *velerov1api.PodVolumeBackup, pod *corev1.Pod, details *resticDetails) (*restic.Command, error) {
|
||||
volDir, err := kube.GetVolumeDirectory(ctx, log, pod, pvb.Spec.Volume, r.Client)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "getting volume directory name")
|
||||
}
|
||||
|
|
|
@ -18,13 +18,11 @@ package controller
|
|||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io/ioutil"
|
||||
"os"
|
||||
"path/filepath"
|
||||
|
||||
jsonpatch "github.com/evanphx/json-patch"
|
||||
"github.com/pkg/errors"
|
||||
"github.com/sirupsen/logrus"
|
||||
corev1api "k8s.io/api/core/v1"
|
||||
|
@ -33,191 +31,176 @@ import (
|
|||
"k8s.io/apimachinery/pkg/labels"
|
||||
"k8s.io/apimachinery/pkg/types"
|
||||
"k8s.io/apimachinery/pkg/util/clock"
|
||||
corev1informers "k8s.io/client-go/informers/core/v1"
|
||||
corev1listers "k8s.io/client-go/listers/core/v1"
|
||||
"k8s.io/client-go/tools/cache"
|
||||
k8scache "sigs.k8s.io/controller-runtime/pkg/cache"
|
||||
"sigs.k8s.io/cluster-api/util/patch"
|
||||
ctrl "sigs.k8s.io/controller-runtime"
|
||||
"sigs.k8s.io/controller-runtime/pkg/client"
|
||||
"sigs.k8s.io/controller-runtime/pkg/handler"
|
||||
"sigs.k8s.io/controller-runtime/pkg/reconcile"
|
||||
"sigs.k8s.io/controller-runtime/pkg/source"
|
||||
|
||||
"github.com/vmware-tanzu/velero/internal/credentials"
|
||||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
|
||||
velerov1client "github.com/vmware-tanzu/velero/pkg/generated/clientset/versioned/typed/velero/v1"
|
||||
informers "github.com/vmware-tanzu/velero/pkg/generated/informers/externalversions/velero/v1"
|
||||
listers "github.com/vmware-tanzu/velero/pkg/generated/listers/velero/v1"
|
||||
"github.com/vmware-tanzu/velero/pkg/restic"
|
||||
"github.com/vmware-tanzu/velero/pkg/util/boolptr"
|
||||
"github.com/vmware-tanzu/velero/pkg/util/filesystem"
|
||||
"github.com/vmware-tanzu/velero/pkg/util/kube"
|
||||
)
|
||||
|
||||
type podVolumeRestoreController struct {
|
||||
*genericController
|
||||
|
||||
podVolumeRestoreClient velerov1client.PodVolumeRestoresGetter
|
||||
podVolumeRestoreLister listers.PodVolumeRestoreLister
|
||||
podLister corev1listers.PodLister
|
||||
pvcLister corev1listers.PersistentVolumeClaimLister
|
||||
pvLister corev1listers.PersistentVolumeLister
|
||||
backupLocationInformer k8scache.Informer
|
||||
kbClient client.Client
|
||||
nodeName string
|
||||
credentialsFileStore credentials.FileStore
|
||||
|
||||
processRestoreFunc func(*velerov1api.PodVolumeRestore) error
|
||||
fileSystem filesystem.Interface
|
||||
clock clock.Clock
|
||||
func NewPodVolumeRestoreReconciler(logger logrus.FieldLogger, client client.Client, credentialsFileStore credentials.FileStore) *PodVolumeRestoreReconciler {
|
||||
return &PodVolumeRestoreReconciler{
|
||||
Client: client,
|
||||
logger: logger.WithField("controller", "PodVolumeRestore"),
|
||||
credentialsFileStore: credentialsFileStore,
|
||||
fileSystem: filesystem.NewFileSystem(),
|
||||
clock: &clock.RealClock{},
|
||||
}
|
||||
}
|
||||
|
||||
// NewPodVolumeRestoreController creates a new pod volume restore controller.
|
||||
func NewPodVolumeRestoreController(
|
||||
logger logrus.FieldLogger,
|
||||
podVolumeRestoreInformer informers.PodVolumeRestoreInformer,
|
||||
podVolumeRestoreClient velerov1client.PodVolumeRestoresGetter,
|
||||
podInformer cache.SharedIndexInformer,
|
||||
pvcInformer corev1informers.PersistentVolumeClaimInformer,
|
||||
pvInformer corev1informers.PersistentVolumeInformer,
|
||||
kbClient client.Client,
|
||||
nodeName string,
|
||||
credentialsFileStore credentials.FileStore,
|
||||
) Interface {
|
||||
c := &podVolumeRestoreController{
|
||||
genericController: newGenericController(PodVolumeRestore, logger),
|
||||
podVolumeRestoreClient: podVolumeRestoreClient,
|
||||
podVolumeRestoreLister: podVolumeRestoreInformer.Lister(),
|
||||
podLister: corev1listers.NewPodLister(podInformer.GetIndexer()),
|
||||
pvcLister: pvcInformer.Lister(),
|
||||
pvLister: pvInformer.Lister(),
|
||||
kbClient: kbClient,
|
||||
nodeName: nodeName,
|
||||
credentialsFileStore: credentialsFileStore,
|
||||
|
||||
fileSystem: filesystem.NewFileSystem(),
|
||||
clock: &clock.RealClock{},
|
||||
}
|
||||
|
||||
c.syncHandler = c.processQueueItem
|
||||
c.cacheSyncWaiters = append(
|
||||
c.cacheSyncWaiters,
|
||||
podVolumeRestoreInformer.Informer().HasSynced,
|
||||
podInformer.HasSynced,
|
||||
pvcInformer.Informer().HasSynced,
|
||||
)
|
||||
c.processRestoreFunc = c.processRestore
|
||||
|
||||
podVolumeRestoreInformer.Informer().AddEventHandler(
|
||||
cache.ResourceEventHandlerFuncs{
|
||||
AddFunc: c.pvrHandler,
|
||||
UpdateFunc: func(_, obj interface{}) {
|
||||
c.pvrHandler(obj)
|
||||
},
|
||||
},
|
||||
)
|
||||
|
||||
podInformer.AddEventHandler(
|
||||
cache.ResourceEventHandlerFuncs{
|
||||
AddFunc: c.podHandler,
|
||||
UpdateFunc: func(_, obj interface{}) {
|
||||
c.podHandler(obj)
|
||||
},
|
||||
},
|
||||
)
|
||||
|
||||
return c
|
||||
type PodVolumeRestoreReconciler struct {
|
||||
client.Client
|
||||
logger logrus.FieldLogger
|
||||
credentialsFileStore credentials.FileStore
|
||||
fileSystem filesystem.Interface
|
||||
clock clock.Clock
|
||||
}
|
||||
|
||||
func (c *podVolumeRestoreController) pvrHandler(obj interface{}) {
|
||||
pvr := obj.(*velerov1api.PodVolumeRestore)
|
||||
log := loggerForPodVolumeRestore(c.logger, pvr)
|
||||
// +kubebuilder:rbac:groups=velero.io,resources=podvolumerestores,verbs=get;list;watch;create;update;patch;delete
|
||||
// +kubebuilder:rbac:groups=velero.io,resources=podvolumerestores/status,verbs=get;update;patch
|
||||
// +kubebuilder:rbac:groups="",resources=pods,verbs=get
|
||||
// +kubebuilder:rbac:groups="",resources=persistentvolumes,verbs=get
|
||||
// +kubebuilder:rbac:groups="",resources=persistentvolumerclaims,verbs=get
|
||||
|
||||
if !isPVRNew(pvr) {
|
||||
log.Debugf("Restore is not new, not enqueuing")
|
||||
return
|
||||
}
|
||||
func (c *PodVolumeRestoreReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
|
||||
log := c.logger.WithField("PodVolumeRestore", req.NamespacedName.String())
|
||||
|
||||
pod, err := c.podLister.Pods(pvr.Spec.Pod.Namespace).Get(pvr.Spec.Pod.Name)
|
||||
if apierrors.IsNotFound(err) {
|
||||
log.WithError(err).Debugf("Restore's pod %s/%s not found, not enqueueing.", pvr.Spec.Pod.Namespace, pvr.Spec.Pod.Name)
|
||||
return
|
||||
}
|
||||
if err != nil {
|
||||
log.WithError(err).Errorf("Unable to get restore's pod %s/%s, not enqueueing.", pvr.Spec.Pod.Namespace, pvr.Spec.Pod.Name)
|
||||
return
|
||||
}
|
||||
|
||||
if !isPodOnNode(pod, c.nodeName) {
|
||||
log.Debugf("Restore's pod is not on this node, not enqueuing")
|
||||
return
|
||||
}
|
||||
|
||||
if !isResticInitContainerRunning(pod) {
|
||||
log.Debug("Restore's pod is not running restic-wait init container, not enqueuing")
|
||||
return
|
||||
}
|
||||
|
||||
resticInitContainerIndex := getResticInitContainerIndex(pod)
|
||||
if resticInitContainerIndex > 0 {
|
||||
log.Warnf(`Init containers before the %s container may cause issues
|
||||
if they interfere with volumes being restored: %s index %d`, restic.InitContainer, restic.InitContainer, resticInitContainerIndex)
|
||||
}
|
||||
|
||||
log.Debug("Enqueueing")
|
||||
c.enqueue(obj)
|
||||
}
|
||||
|
||||
func (c *podVolumeRestoreController) podHandler(obj interface{}) {
|
||||
pod := obj.(*corev1api.Pod)
|
||||
log := c.logger.WithField("key", kube.NamespaceAndName(pod))
|
||||
|
||||
// the pod should always be for this node since the podInformer is filtered
|
||||
// based on node, so this is just a failsafe.
|
||||
if !isPodOnNode(pod, c.nodeName) {
|
||||
return
|
||||
}
|
||||
|
||||
if !isResticInitContainerRunning(pod) {
|
||||
log.Debug("Pod is not running restic-wait init container, not enqueuing restores for pod")
|
||||
return
|
||||
}
|
||||
|
||||
resticInitContainerIndex := getResticInitContainerIndex(pod)
|
||||
if resticInitContainerIndex > 0 {
|
||||
log.Warnf(`Init containers before the %s container may cause issues
|
||||
if they interfere with volumes being restored: %s index %d`, restic.InitContainer, restic.InitContainer, resticInitContainerIndex)
|
||||
}
|
||||
|
||||
selector := labels.Set(map[string]string{
|
||||
velerov1api.PodUIDLabel: string(pod.UID),
|
||||
}).AsSelector()
|
||||
|
||||
pvrs, err := c.podVolumeRestoreLister.List(selector)
|
||||
if err != nil {
|
||||
log.WithError(err).Error("Unable to list pod volume restores")
|
||||
return
|
||||
}
|
||||
|
||||
if len(pvrs) == 0 {
|
||||
return
|
||||
}
|
||||
|
||||
for _, pvr := range pvrs {
|
||||
log := loggerForPodVolumeRestore(log, pvr)
|
||||
if !isPVRNew(pvr) {
|
||||
log.Debug("Restore is not new, not enqueuing")
|
||||
continue
|
||||
pvr := &velerov1api.PodVolumeRestore{}
|
||||
if err := c.Get(ctx, types.NamespacedName{Namespace: req.Namespace, Name: req.Name}, pvr); err != nil {
|
||||
if apierrors.IsNotFound(err) {
|
||||
log.Warn("PodVolumeRestore not found, skip")
|
||||
return ctrl.Result{}, nil
|
||||
}
|
||||
log.Debug("Enqueuing")
|
||||
c.enqueue(pvr)
|
||||
log.WithError(err).Error("Unable to get the PodVolumeRestore")
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
log = log.WithField("pod", fmt.Sprintf("%s/%s", pvr.Spec.Pod.Namespace, pvr.Spec.Pod.Name))
|
||||
if len(pvr.OwnerReferences) == 1 {
|
||||
log = log.WithField("restore", fmt.Sprintf("%s/%s", pvr.Namespace, pvr.OwnerReferences[0].Name))
|
||||
}
|
||||
|
||||
shouldProcess, pod, err := c.shouldProcess(ctx, log, pvr)
|
||||
if err != nil {
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
if !shouldProcess {
|
||||
return ctrl.Result{}, nil
|
||||
}
|
||||
|
||||
resticInitContainerIndex := getResticInitContainerIndex(pod)
|
||||
if resticInitContainerIndex > 0 {
|
||||
log.Warnf(`Init containers before the %s container may cause issues
|
||||
if they interfere with volumes being restored: %s index %d`, restic.InitContainer, restic.InitContainer, resticInitContainerIndex)
|
||||
}
|
||||
|
||||
patchHelper, err := patch.NewHelper(pvr, c.Client)
|
||||
if err != nil {
|
||||
log.WithError(err).Error("Unable to create the patch helper")
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
log.Info("Restore starting")
|
||||
pvr.Status.Phase = velerov1api.PodVolumeRestorePhaseInProgress
|
||||
pvr.Status.StartTimestamp = &metav1.Time{Time: c.clock.Now()}
|
||||
if err = patchHelper.Patch(ctx, pvr); err != nil {
|
||||
log.WithError(err).Error("Unable to update status to in progress")
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
if err = c.processRestore(ctx, pvr, pod, log); err != nil {
|
||||
pvr.Status.Phase = velerov1api.PodVolumeRestorePhaseFailed
|
||||
pvr.Status.Message = err.Error()
|
||||
pvr.Status.CompletionTimestamp = &metav1.Time{Time: c.clock.Now()}
|
||||
if e := patchHelper.Patch(ctx, pvr); e != nil {
|
||||
log.WithError(err).Error("Unable to update status to failed")
|
||||
}
|
||||
|
||||
log.WithError(err).Error("Unable to process the PodVolumeRestore")
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
|
||||
pvr.Status.Phase = velerov1api.PodVolumeRestorePhaseCompleted
|
||||
pvr.Status.CompletionTimestamp = &metav1.Time{Time: c.clock.Now()}
|
||||
if err = patchHelper.Patch(ctx, pvr); err != nil {
|
||||
log.WithError(err).Error("Unable to update status to completed")
|
||||
return ctrl.Result{}, err
|
||||
}
|
||||
log.Info("Restore completed")
|
||||
return ctrl.Result{}, nil
|
||||
}
|
||||
|
||||
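// shouldProcess returns whether the PodVolumeRestore should be handled now: the PVR must
// still be new, its target pod must be retrievable on this node, and the restic-wait init
// container must be running. The pod is returned for reuse by the caller.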
func (c *PodVolumeRestoreReconciler) shouldProcess(ctx context.Context, log logrus.FieldLogger, pvr *velerov1api.PodVolumeRestore) (bool, *corev1api.Pod, error) {
|
||||
if !isPVRNew(pvr) {
|
||||
log.Debug("PodVolumeRestore is not new, skip")
|
||||
return false, nil, nil
|
||||
}
|
||||
|
||||
// we filter the pods during the initialization of cache, if we can get a pod here, the pod must be in the same node with the controller
|
||||
// so we don't need to compare the node anymore
|
||||
pod := &corev1api.Pod{}
|
||||
if err := c.Get(ctx, types.NamespacedName{Namespace: pvr.Spec.Pod.Namespace, Name: pvr.Spec.Pod.Name}, pod); err != nil {
|
||||
if apierrors.IsNotFound(err) {
|
||||
log.WithError(err).Debug("Pod not found on this node, skip")
|
||||
return false, nil, nil
|
||||
}
|
||||
log.WithError(err).Error("Unable to get pod")
|
||||
return false, nil, err
|
||||
}
|
||||
|
||||
if !isResticInitContainerRunning(pod) {
|
||||
log.Debug("Pod is not running restic-wait init container, skip")
|
||||
return false, nil, nil
|
||||
}
|
||||
|
||||
return true, pod, nil
|
||||
}
|
||||
|
||||
func (c *PodVolumeRestoreReconciler) SetupWithManager(mgr ctrl.Manager) error {
|
||||
mgr.GetConfig()
|
||||
|
||||
// The pod may not be scheduled yet at the point when its PVRs are initially reconciled.
|
||||
// By watching the pods, we can trigger the PVR reconciliation again once the pod is finally scheduled on the node.
|
||||
return ctrl.NewControllerManagedBy(mgr).
|
||||
For(&velerov1api.PodVolumeRestore{}).
|
||||
Watches(&source.Kind{Type: &corev1api.Pod{}}, handler.EnqueueRequestsFromMapFunc(c.findVolumeRestoresForPod)).
|
||||
Complete(c)
|
||||
}
|
||||
|
||||
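// findVolumeRestoresForPod maps a pod event to reconcile requests for every PodVolumeRestore
// labeled with that pod's UID, so PVRs are re-reconciled once their pod is scheduled.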
func (c *PodVolumeRestoreReconciler) findVolumeRestoresForPod(pod client.Object) []reconcile.Request {
|
||||
list := &velerov1api.PodVolumeRestoreList{}
|
||||
options := &client.ListOptions{
|
||||
LabelSelector: labels.Set(map[string]string{
|
||||
velerov1api.PodUIDLabel: string(pod.GetUID()),
|
||||
}).AsSelector(),
|
||||
}
|
||||
if err := c.List(context.TODO(), list, options); err != nil {
|
||||
c.logger.WithField("pod", fmt.Sprintf("%s/%s", pod.GetNamespace(), pod.GetName())).WithError(err).
|
||||
Error("unable to list PodVolumeRestores")
|
||||
return []reconcile.Request{}
|
||||
}
|
||||
requests := make([]reconcile.Request, len(list.Items))
|
||||
for i, item := range list.Items {
|
||||
requests[i] = reconcile.Request{
|
||||
NamespacedName: types.NamespacedName{
|
||||
Namespace: item.GetNamespace(),
|
||||
Name: item.GetName(),
|
||||
},
|
||||
}
|
||||
}
|
||||
return requests
|
||||
}
|
||||
|
||||
func isPVRNew(pvr *velerov1api.PodVolumeRestore) bool {
|
||||
return pvr.Status.Phase == "" || pvr.Status.Phase == velerov1api.PodVolumeRestorePhaseNew
|
||||
}
|
||||
|
||||
func isPodOnNode(pod *corev1api.Pod, node string) bool {
|
||||
return pod.Spec.NodeName == node
|
||||
}
|
||||
|
||||
func isResticInitContainerRunning(pod *corev1api.Pod) bool {
|
||||
// Restic wait container can be anywhere in the list of init containers, but must be running.
|
||||
i := getResticInitContainerIndex(pod)
|
||||
|
@ -237,92 +220,6 @@ func getResticInitContainerIndex(pod *corev1api.Pod) int {
|
|||
return -1
|
||||
}
|
||||
|
||||
func (c *podVolumeRestoreController) processQueueItem(key string) error {
|
||||
log := c.logger.WithField("key", key)
|
||||
log.Debug("Running processQueueItem")
|
||||
|
||||
ns, name, err := cache.SplitMetaNamespaceKey(key)
|
||||
if err != nil {
|
||||
log.WithError(errors.WithStack(err)).Error("error splitting queue key")
|
||||
return nil
|
||||
}
|
||||
|
||||
req, err := c.podVolumeRestoreLister.PodVolumeRestores(ns).Get(name)
|
||||
if apierrors.IsNotFound(err) {
|
||||
log.Debug("Unable to find PodVolumeRestore")
|
||||
return nil
|
||||
}
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "error getting PodVolumeRestore")
|
||||
}
|
||||
|
||||
// Don't mutate the shared cache
|
||||
reqCopy := req.DeepCopy()
|
||||
return c.processRestoreFunc(reqCopy)
|
||||
}
|
||||
|
||||
func loggerForPodVolumeRestore(baseLogger logrus.FieldLogger, req *velerov1api.PodVolumeRestore) logrus.FieldLogger {
|
||||
log := baseLogger.WithFields(logrus.Fields{
|
||||
"namespace": req.Namespace,
|
||||
"name": req.Name,
|
||||
})
|
||||
|
||||
if len(req.OwnerReferences) == 1 {
|
||||
log = log.WithField("restore", fmt.Sprintf("%s/%s", req.Namespace, req.OwnerReferences[0].Name))
|
||||
}
|
||||
|
||||
return log
|
||||
}
|
||||
|
||||
func (c *podVolumeRestoreController) processRestore(req *velerov1api.PodVolumeRestore) error {
|
||||
log := loggerForPodVolumeRestore(c.logger, req)
|
||||
|
||||
log.Info("Restore starting")
|
||||
|
||||
var err error
|
||||
|
||||
// update status to InProgress
|
||||
req, err = c.patchPodVolumeRestore(req, func(r *velerov1api.PodVolumeRestore) {
|
||||
r.Status.Phase = velerov1api.PodVolumeRestorePhaseInProgress
|
||||
r.Status.StartTimestamp = &metav1.Time{Time: c.clock.Now()}
|
||||
})
|
||||
if err != nil {
|
||||
log.WithError(err).Error("Error setting PodVolumeRestore startTimestamp and phase to InProgress")
|
||||
return errors.WithStack(err)
|
||||
}
|
||||
|
||||
pod, err := c.podLister.Pods(req.Spec.Pod.Namespace).Get(req.Spec.Pod.Name)
|
||||
if err != nil {
|
||||
log.WithError(err).Errorf("Error getting pod %s/%s", req.Spec.Pod.Namespace, req.Spec.Pod.Name)
|
||||
return c.failRestore(req, errors.Wrap(err, "error getting pod").Error(), log)
|
||||
}
|
||||
|
||||
volumeDir, err := kube.GetVolumeDirectory(log, pod, req.Spec.Volume, c.pvcLister, c.pvLister, c.kbClient)
|
||||
if err != nil {
|
||||
log.WithError(err).Error("Error getting volume directory name")
|
||||
return c.failRestore(req, errors.Wrap(err, "error getting volume directory name").Error(), log)
|
||||
}
|
||||
|
||||
// execute the restore process
|
||||
if err := c.restorePodVolume(req, volumeDir, log); err != nil {
|
||||
log.WithError(err).Error("Error restoring volume")
|
||||
return c.failRestore(req, errors.Wrap(err, "error restoring volume").Error(), log)
|
||||
}
|
||||
|
||||
// update status to Completed
|
||||
if _, err = c.patchPodVolumeRestore(req, func(r *velerov1api.PodVolumeRestore) {
|
||||
r.Status.Phase = velerov1api.PodVolumeRestorePhaseCompleted
|
||||
r.Status.CompletionTimestamp = &metav1.Time{Time: c.clock.Now()}
|
||||
}); err != nil {
|
||||
log.WithError(err).Error("Error setting PodVolumeRestore completionTimestamp and phase to Completed")
|
||||
return err
|
||||
}
|
||||
|
||||
log.Info("Restore completed")
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
func singlePathMatch(path string) (string, error) {
|
||||
matches, err := filepath.Glob(path)
|
||||
if err != nil {
|
||||
|
@ -336,7 +233,12 @@ func singlePathMatch(path string) (string, error) {
|
|||
return matches[0], nil
|
||||
}
|
||||
|
||||
func (c *podVolumeRestoreController) restorePodVolume(req *velerov1api.PodVolumeRestore, volumeDir string, log logrus.FieldLogger) error {
|
||||
func (c *PodVolumeRestoreReconciler) processRestore(ctx context.Context, req *velerov1api.PodVolumeRestore, pod *corev1api.Pod, log logrus.FieldLogger) error {
|
||||
volumeDir, err := kube.GetVolumeDirectory(ctx, log, pod, req.Spec.Volume, c.Client)
|
||||
if err != nil {
|
||||
return errors.Wrap(err, "error getting volume directory name")
|
||||
}
|
||||
|
||||
// Get the full path of the new volume's directory as mounted in the daemonset pod, which
|
||||
// will look like: /host_pods/<new-pod-uid>/volumes/<volume-plugin-name>/<volume-dir>
|
||||
volumePath, err := singlePathMatch(fmt.Sprintf("/host_pods/%s/volumes/*/%s", string(req.Spec.Pod.UID), volumeDir))
|
||||
|
@ -346,8 +248,7 @@ func (c *podVolumeRestoreController) restorePodVolume(req *velerov1api.PodVolume
|
|||
|
||||
credsFile, err := c.credentialsFileStore.Path(restic.RepoKeySelector())
|
||||
if err != nil {
|
||||
log.WithError(err).Error("Error creating temp restic credentials file")
|
||||
return c.failRestore(req, errors.Wrap(err, "error creating temp restic credentials file").Error(), log)
|
||||
return errors.Wrap(err, "error creating temp restic credentials file")
|
||||
}
|
||||
// ignore error since there's nothing we can do and it's a temp file.
|
||||
defer os.Remove(credsFile)
|
||||
|
@ -360,11 +261,11 @@ func (c *podVolumeRestoreController) restorePodVolume(req *velerov1api.PodVolume
|
|||
)
|
||||
|
||||
backupLocation := &velerov1api.BackupStorageLocation{}
|
||||
if err := c.kbClient.Get(context.Background(), client.ObjectKey{
|
||||
if err := c.Get(ctx, client.ObjectKey{
|
||||
Namespace: req.Namespace,
|
||||
Name: req.Spec.BackupStorageLocation,
|
||||
}, backupLocation); err != nil {
|
||||
return c.failRestore(req, errors.Wrap(err, "error getting backup storage location").Error(), log)
|
||||
return errors.Wrap(err, "error getting backup storage location")
|
||||
}
|
||||
|
||||
// if there's a caCert on the ObjectStorage, write it to disk so that it can be passed to restic
|
||||
|
@ -381,7 +282,7 @@ func (c *podVolumeRestoreController) restorePodVolume(req *velerov1api.PodVolume
|
|||
|
||||
env, err := restic.CmdEnv(backupLocation, c.credentialsFileStore)
|
||||
if err != nil {
|
||||
return c.failRestore(req, errors.Wrap(err, "error setting restic cmd env").Error(), log)
|
||||
return errors.Wrap(err, "error setting restic cmd env")
|
||||
}
|
||||
resticCmd.Env = env
|
||||
|
||||
|
@ -432,55 +333,18 @@ func (c *podVolumeRestoreController) restorePodVolume(req *velerov1api.PodVolume
|
|||
return nil
|
||||
}
|
||||
|
||||
func (c *podVolumeRestoreController) patchPodVolumeRestore(req *velerov1api.PodVolumeRestore, mutate func(*velerov1api.PodVolumeRestore)) (*velerov1api.PodVolumeRestore, error) {
|
||||
// Record original json
|
||||
oldData, err := json.Marshal(req)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error marshalling original PodVolumeRestore")
|
||||
}
|
||||
|
||||
// Mutate
|
||||
mutate(req)
|
||||
|
||||
// Record new json
|
||||
newData, err := json.Marshal(req)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error marshalling updated PodVolumeRestore")
|
||||
}
|
||||
|
||||
patchBytes, err := jsonpatch.CreateMergePatch(oldData, newData)
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error creating json merge patch for PodVolumeRestore")
|
||||
}
|
||||
|
||||
req, err = c.podVolumeRestoreClient.PodVolumeRestores(req.Namespace).Patch(context.TODO(), req.Name, types.MergePatchType, patchBytes, metav1.PatchOptions{})
|
||||
if err != nil {
|
||||
return nil, errors.Wrap(err, "error patching PodVolumeRestore")
|
||||
}
|
||||
|
||||
return req, nil
|
||||
}
|
||||
|
||||
func (c *podVolumeRestoreController) failRestore(req *velerov1api.PodVolumeRestore, msg string, log logrus.FieldLogger) error {
|
||||
if _, err := c.patchPodVolumeRestore(req, func(pvr *velerov1api.PodVolumeRestore) {
|
||||
pvr.Status.Phase = velerov1api.PodVolumeRestorePhaseFailed
|
||||
pvr.Status.Message = msg
|
||||
pvr.Status.CompletionTimestamp = &metav1.Time{Time: c.clock.Now()}
|
||||
}); err != nil {
|
||||
log.WithError(err).Error("Error setting PodVolumeRestore phase to Failed")
|
||||
return err
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// updateRestoreProgressFunc returns a func that takes progress info and patches
|
||||
// the PVR with the new progress
|
||||
func (c *PodVolumeRestoreReconciler) updateRestoreProgressFunc(req *velerov1api.PodVolumeRestore, log logrus.FieldLogger) func(velerov1api.PodVolumeOperationProgress) {
	return func(progress velerov1api.PodVolumeOperationProgress) {
		helper, err := patch.NewHelper(req, c.Client)
		if err != nil {
			log.WithError(err).Error("Unable to create the patch helper")
			return
		}
		req.Status.Progress = progress
		if err = helper.Patch(context.Background(), req); err != nil {
			log.WithError(err).Error("Unable to update PodVolumeRestore progress")
		}
	}
}
|
||||
|
|
|
@ -17,64 +17,64 @@ limitations under the License.
|
|||
package controller
|
||||
|
||||
import (
|
||||
"context"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/stretchr/testify/assert"
|
||||
"github.com/stretchr/testify/require"
|
||||
|
||||
"k8s.io/apimachinery/pkg/runtime"
|
||||
|
||||
"github.com/sirupsen/logrus"
|
||||
|
||||
"sigs.k8s.io/controller-runtime/pkg/client/fake"
|
||||
|
||||
"github.com/stretchr/testify/assert"
|
||||
corev1api "k8s.io/api/core/v1"
|
||||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
|
||||
"k8s.io/apimachinery/pkg/types"
|
||||
"k8s.io/apimachinery/pkg/util/sets"
|
||||
corev1listers "k8s.io/client-go/listers/core/v1"
|
||||
"k8s.io/client-go/tools/cache"
|
||||
|
||||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
|
||||
velerofake "github.com/vmware-tanzu/velero/pkg/generated/clientset/versioned/fake"
|
||||
veleroinformers "github.com/vmware-tanzu/velero/pkg/generated/informers/externalversions"
|
||||
velerov1listers "github.com/vmware-tanzu/velero/pkg/generated/listers/velero/v1"
|
||||
"github.com/vmware-tanzu/velero/pkg/restic"
|
||||
velerotest "github.com/vmware-tanzu/velero/pkg/test"
|
||||
)
|
||||
|
||||
func TestPVRHandler(t *testing.T) {
|
||||
func TestShouldProcess(t *testing.T) {
|
||||
controllerNode := "foo"
|
||||
|
||||
tests := []struct {
|
||||
name string
|
||||
obj *velerov1api.PodVolumeRestore
|
||||
pod *corev1api.Pod
|
||||
shouldEnqueue bool
|
||||
name string
|
||||
obj *velerov1api.PodVolumeRestore
|
||||
pod *corev1api.Pod
|
||||
shouldProcessed bool
|
||||
}{
|
||||
{
|
||||
name: "InProgress phase pvr should not be enqueued",
|
||||
name: "InProgress phase pvr should not be processed",
|
||||
obj: &velerov1api.PodVolumeRestore{
|
||||
Status: velerov1api.PodVolumeRestoreStatus{
|
||||
Phase: velerov1api.PodVolumeRestorePhaseInProgress,
|
||||
},
|
||||
},
|
||||
shouldEnqueue: false,
|
||||
shouldProcessed: false,
|
||||
},
|
||||
{
|
||||
name: "Completed phase pvr should not be enqueued",
|
||||
name: "Completed phase pvr should not be processed",
|
||||
obj: &velerov1api.PodVolumeRestore{
|
||||
Status: velerov1api.PodVolumeRestoreStatus{
|
||||
Phase: velerov1api.PodVolumeRestorePhaseCompleted,
|
||||
},
|
||||
},
|
||||
shouldEnqueue: false,
|
||||
shouldProcessed: false,
|
||||
},
|
||||
{
|
||||
name: "Failed phase pvr should not be enqueued",
|
||||
name: "Failed phase pvr should not be processed",
|
||||
obj: &velerov1api.PodVolumeRestore{
|
||||
Status: velerov1api.PodVolumeRestoreStatus{
|
||||
Phase: velerov1api.PodVolumeRestorePhaseFailed,
|
||||
},
|
||||
},
|
||||
shouldEnqueue: false,
|
||||
shouldProcessed: false,
|
||||
},
|
||||
{
|
||||
name: "Unable to get pvr's pod should not be enqueued",
|
||||
name: "Unable to get pvr's pod should not be processed",
|
||||
obj: &velerov1api.PodVolumeRestore{
|
||||
Spec: velerov1api.PodVolumeRestoreSpec{
|
||||
Pod: corev1api.ObjectReference{
|
||||
|
@ -86,50 +86,10 @@ func TestPVRHandler(t *testing.T) {
|
|||
Phase: "",
|
||||
},
|
||||
},
|
||||
shouldEnqueue: false,
|
||||
shouldProcessed: false,
|
||||
},
|
||||
{
|
||||
name: "Empty phase pvr with pod not on node running init container should not be enqueued",
|
||||
obj: &velerov1api.PodVolumeRestore{
|
||||
Spec: velerov1api.PodVolumeRestoreSpec{
|
||||
Pod: corev1api.ObjectReference{
|
||||
Namespace: "ns-1",
|
||||
Name: "pod-1",
|
||||
},
|
||||
},
|
||||
Status: velerov1api.PodVolumeRestoreStatus{
|
||||
Phase: "",
|
||||
},
|
||||
},
|
||||
pod: &corev1api.Pod{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pod-1",
|
||||
},
|
||||
Spec: corev1api.PodSpec{
|
||||
NodeName: "some-other-node",
|
||||
InitContainers: []corev1api.Container{
|
||||
{
|
||||
Name: restic.InitContainer,
|
||||
},
|
||||
},
|
||||
},
|
||||
Status: corev1api.PodStatus{
|
||||
InitContainerStatuses: []corev1api.ContainerStatus{
|
||||
{
|
||||
State: corev1api.ContainerState{
|
||||
Running: &corev1api.ContainerStateRunning{
|
||||
StartedAt: metav1.Time{Time: time.Now()},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
shouldEnqueue: false,
|
||||
},
|
||||
{
|
||||
name: "Empty phase pvr with pod on node not running init container should not be enqueued",
|
||||
name: "Empty phase pvr with pod on node not running init container should not be processed",
|
||||
obj: &velerov1api.PodVolumeRestore{
|
||||
Spec: velerov1api.PodVolumeRestoreSpec{
|
||||
Pod: corev1api.ObjectReference{
|
||||
|
@ -162,7 +122,7 @@ func TestPVRHandler(t *testing.T) {
|
|||
},
|
||||
},
|
||||
},
|
||||
shouldEnqueue: false,
|
||||
shouldProcessed: false,
|
||||
},
|
||||
{
|
||||
name: "Empty phase pvr with pod on node running init container should be enqueued",
|
||||
|
@ -202,220 +162,23 @@ func TestPVRHandler(t *testing.T) {
|
|||
},
|
||||
},
|
||||
},
|
||||
shouldEnqueue: true,
|
||||
shouldProcessed: true,
|
||||
},
|
||||
}
|
||||
|
||||
for _, test := range tests {
|
||||
t.Run(test.name, func(t *testing.T) {
|
||||
var (
|
||||
podInformer = cache.NewSharedIndexInformer(nil, new(corev1api.Pod), 0, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc})
|
||||
c = &podVolumeRestoreController{
|
||||
genericController: newGenericController(PodVolumeRestore, velerotest.NewLogger()),
|
||||
podLister: corev1listers.NewPodLister(podInformer.GetIndexer()),
|
||||
nodeName: controllerNode,
|
||||
}
|
||||
)
|
||||
|
||||
builder := fake.NewClientBuilder()
|
||||
if test.pod != nil {
|
||||
require.NoError(t, podInformer.GetStore().Add(test.pod))
|
||||
builder.WithObjects(test.pod)
|
||||
}
|
||||
c := &PodVolumeRestoreReconciler{
|
||||
logger: logrus.New(),
|
||||
Client: builder.Build(),
|
||||
}
|
||||
|
||||
c.pvrHandler(test.obj)
|
||||
|
||||
if !test.shouldEnqueue {
|
||||
assert.Equal(t, 0, c.queue.Len())
|
||||
return
|
||||
}
|
||||
|
||||
require.Equal(t, 1, c.queue.Len())
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestPodHandler(t *testing.T) {
|
||||
controllerNode := "foo"
|
||||
|
||||
tests := []struct {
|
||||
name string
|
||||
pod *corev1api.Pod
|
||||
podVolumeRestores []*velerov1api.PodVolumeRestore
|
||||
expectedEnqueues sets.String
|
||||
}{
|
||||
{
|
||||
name: "pod on controller node running restic init container with multiple PVRs has new ones enqueued",
|
||||
pod: &corev1api.Pod{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pod-1",
|
||||
UID: types.UID("uid"),
|
||||
},
|
||||
Spec: corev1api.PodSpec{
|
||||
NodeName: controllerNode,
|
||||
InitContainers: []corev1api.Container{
|
||||
{
|
||||
Name: restic.InitContainer,
|
||||
},
|
||||
},
|
||||
},
|
||||
Status: corev1api.PodStatus{
|
||||
InitContainerStatuses: []corev1api.ContainerStatus{
|
||||
{
|
||||
State: corev1api.ContainerState{
|
||||
Running: &corev1api.ContainerStateRunning{StartedAt: metav1.Time{Time: time.Now()}},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
podVolumeRestores: []*velerov1api.PodVolumeRestore{
|
||||
{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pvr-1",
|
||||
Labels: map[string]string{
|
||||
velerov1api.PodUIDLabel: "uid",
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pvr-2",
|
||||
Labels: map[string]string{
|
||||
velerov1api.PodUIDLabel: "uid",
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pvr-3",
|
||||
Labels: map[string]string{
|
||||
velerov1api.PodUIDLabel: "uid",
|
||||
},
|
||||
},
|
||||
Status: velerov1api.PodVolumeRestoreStatus{
|
||||
Phase: velerov1api.PodVolumeRestorePhaseInProgress,
|
||||
},
|
||||
},
|
||||
{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pvr-4",
|
||||
Labels: map[string]string{
|
||||
velerov1api.PodUIDLabel: "some-other-pod",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
expectedEnqueues: sets.NewString("ns-1/pvr-1", "ns-1/pvr-2"),
|
||||
},
|
||||
{
|
||||
name: "pod on controller node not running restic init container doesn't have PVRs enqueued",
|
||||
pod: &corev1api.Pod{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pod-1",
|
||||
UID: types.UID("uid"),
|
||||
},
|
||||
Spec: corev1api.PodSpec{
|
||||
NodeName: controllerNode,
|
||||
InitContainers: []corev1api.Container{
|
||||
{
|
||||
Name: restic.InitContainer,
|
||||
},
|
||||
},
|
||||
},
|
||||
Status: corev1api.PodStatus{
|
||||
InitContainerStatuses: []corev1api.ContainerStatus{
|
||||
{
|
||||
State: corev1api.ContainerState{},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
podVolumeRestores: []*velerov1api.PodVolumeRestore{
|
||||
{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pvr-1",
|
||||
Labels: map[string]string{
|
||||
velerov1api.PodUIDLabel: "uid",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
name: "pod not running on controller node doesn't have PVRs enqueued",
|
||||
pod: &corev1api.Pod{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pod-1",
|
||||
UID: types.UID("uid"),
|
||||
},
|
||||
Spec: corev1api.PodSpec{
|
||||
NodeName: "some-other-node",
|
||||
InitContainers: []corev1api.Container{
|
||||
{
|
||||
Name: restic.InitContainer,
|
||||
},
|
||||
},
|
||||
},
|
||||
Status: corev1api.PodStatus{
|
||||
InitContainerStatuses: []corev1api.ContainerStatus{
|
||||
{
|
||||
State: corev1api.ContainerState{
|
||||
Running: &corev1api.ContainerStateRunning{StartedAt: metav1.Time{Time: time.Now()}},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
podVolumeRestores: []*velerov1api.PodVolumeRestore{
|
||||
{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Namespace: "ns-1",
|
||||
Name: "pvr-1",
|
||||
Labels: map[string]string{
|
||||
velerov1api.PodUIDLabel: "uid",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
for _, test := range tests {
|
||||
t.Run(test.name, func(t *testing.T) {
|
||||
var (
|
||||
client = velerofake.NewSimpleClientset()
|
||||
informers = veleroinformers.NewSharedInformerFactory(client, 0)
|
||||
pvrInformer = informers.Velero().V1().PodVolumeRestores()
|
||||
c = &podVolumeRestoreController{
|
||||
genericController: newGenericController(PodVolumeRestore, velerotest.NewLogger()),
|
||||
podVolumeRestoreLister: velerov1listers.NewPodVolumeRestoreLister(pvrInformer.Informer().GetIndexer()),
|
||||
nodeName: controllerNode,
|
||||
}
|
||||
)
|
||||
|
||||
if len(test.podVolumeRestores) > 0 {
|
||||
for _, pvr := range test.podVolumeRestores {
|
||||
require.NoError(t, pvrInformer.Informer().GetStore().Add(pvr))
|
||||
}
|
||||
}
|
||||
|
||||
c.podHandler(test.pod)
|
||||
|
||||
require.Equal(t, len(test.expectedEnqueues), c.queue.Len())
|
||||
|
||||
itemCount := c.queue.Len()
|
||||
|
||||
for i := 0; i < itemCount; i++ {
|
||||
item, _ := c.queue.Get()
|
||||
assert.True(t, test.expectedEnqueues.Has(item.(string)))
|
||||
}
|
||||
shouldProcess, _, _ := c.shouldProcess(context.Background(), c.logger, test.obj)
|
||||
require.Equal(t, test.shouldProcessed, shouldProcess)
|
||||
})
|
||||
}
|
||||
}
|
||||
|
@ -437,17 +200,6 @@ func TestIsPVRNew(t *testing.T) {
|
|||
}
|
||||
}
|
||||
|
||||
func TestIsPodOnNode(t *testing.T) {
|
||||
pod := &corev1api.Pod{}
|
||||
assert.False(t, isPodOnNode(pod, "bar"))
|
||||
|
||||
pod.Spec.NodeName = "foo"
|
||||
assert.False(t, isPodOnNode(pod, "bar"))
|
||||
|
||||
pod.Spec.NodeName = "bar"
|
||||
assert.True(t, isPodOnNode(pod, "bar"))
|
||||
}
|
||||
|
||||
func TestIsResticContainerRunning(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
|
@ -720,3 +472,44 @@ func TestGetResticInitContainerIndex(t *testing.T) {
|
|||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestFindVolumeRestoresForPod(t *testing.T) {
|
||||
pod := &corev1api.Pod{}
|
||||
pod.UID = "uid"
|
||||
|
||||
scheme := runtime.NewScheme()
|
||||
scheme.AddKnownTypes(velerov1api.SchemeGroupVersion, &velerov1api.PodVolumeRestore{}, &velerov1api.PodVolumeRestoreList{})
|
||||
clientBuilder := fake.NewClientBuilder().WithScheme(scheme)
|
||||
|
||||
// no matching PVR
|
||||
reconciler := &PodVolumeRestoreReconciler{
|
||||
Client: clientBuilder.Build(),
|
||||
logger: logrus.New(),
|
||||
}
|
||||
requests := reconciler.findVolumeRestoresForPod(pod)
|
||||
assert.Len(t, requests, 0)
|
||||
|
||||
// contain one matching PVR
|
||||
reconciler.Client = clientBuilder.WithLists(&velerov1api.PodVolumeRestoreList{
|
||||
Items: []velerov1api.PodVolumeRestore{
|
||||
{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Name: "pvr1",
|
||||
Labels: map[string]string{
|
||||
velerov1api.PodUIDLabel: string(pod.GetUID()),
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
ObjectMeta: metav1.ObjectMeta{
|
||||
Name: "pvr2",
|
||||
Labels: map[string]string{
|
||||
velerov1api.PodUIDLabel: "non-matching-uid",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
}).Build()
|
||||
requests = reconciler.findVolumeRestoresForPod(pod)
|
||||
assert.Len(t, requests, 1)
|
||||
}
|
||||
|
|
|
@ -144,13 +144,12 @@ func NewRestoreController(
|
|||
restore := obj.(*api.Restore)
|
||||
|
||||
switch restore.Status.Phase {
case "", api.RestorePhaseNew, api.RestorePhaseInProgress:
	// only process new or in-progress restores
default:
	c.logger.WithFields(logrus.Fields{
		"restore": kubeutil.NamespaceAndName(restore),
		"phase":   restore.Status.Phase,
	}).Debug("Restore is not new or in-progress, skipping")
|
||||
return
|
||||
}
|
||||
|
||||
|
@ -202,7 +201,21 @@ func (c *restoreController) processQueueItem(key string) error {
|
|||
// is ("" | New)
|
||||
switch restore.Status.Phase {
|
||||
case "", api.RestorePhaseNew:
|
||||
// only process new restores
|
||||
case api.RestorePhaseInProgress:
|
||||
// A restore can remain InProgress indefinitely if:
// 1) the controller restarts while the restore is being processed, or
// 2) the InProgress status is never successfully updated to Completed or Failed.
// Mark such restores as Failed so they are not silently stuck.
|
||||
updated := restore.DeepCopy()
|
||||
updated.Status.Phase = api.RestorePhaseFailed
|
||||
updated.Status.FailureReason = fmt.Sprintf("got a Restore with unexpected status %q; this may be due to a restart of the controller during the restore; marking it as %q",
|
||||
api.RestorePhaseInProgress, updated.Status.Phase)
|
||||
_, err = patchRestore(restore, updated, c.restoreClient)
|
||||
if err != nil {
|
||||
return errors.Wrapf(err, "error updating Restore status to %s", updated.Status.Phase)
|
||||
}
|
||||
log.Warn(updated.Status.FailureReason)
|
||||
return nil
|
||||
default:
|
||||
return nil
|
||||
}
|
||||
|
|
|
@ -20,6 +20,7 @@ import (
|
|||
"bytes"
|
||||
"context"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"io/ioutil"
|
||||
"testing"
|
||||
"time"
|
||||
|
@ -170,11 +171,6 @@ func TestProcessQueueItemSkips(t *testing.T) {
|
|||
restoreKey: "foo/bar",
|
||||
expectError: true,
|
||||
},
|
||||
{
|
||||
name: "restore with phase InProgress does not get processed",
|
||||
restoreKey: "foo/bar",
|
||||
restore: builder.ForRestore("foo", "bar").Phase(velerov1api.RestorePhaseInProgress).Result(),
|
||||
},
|
||||
{
|
||||
name: "restore with phase Completed does not get processed",
|
||||
restoreKey: "foo/bar",
|
||||
|
@ -226,6 +222,31 @@ func TestProcessQueueItemSkips(t *testing.T) {
|
|||
}
|
||||
}
|
||||
|
||||
func TestMarkInProgressRestoreAsFailed(t *testing.T) {
|
||||
var (
|
||||
restore = builder.ForRestore("velero", "bar").Phase(velerov1api.RestorePhaseInProgress).Result()
|
||||
client = fake.NewSimpleClientset(restore)
|
||||
sharedInformers = informers.NewSharedInformerFactory(client, 0)
|
||||
logger = velerotest.NewLogger()
|
||||
)
|
||||
|
||||
c := restoreController{
|
||||
genericController: newGenericController("restore-test", logger),
|
||||
restoreClient: client.VeleroV1(),
|
||||
restoreLister: sharedInformers.Velero().V1().Restores().Lister(),
|
||||
}
|
||||
|
||||
err := sharedInformers.Velero().V1().Restores().Informer().GetStore().Add(restore)
|
||||
require.Nil(t, err)
|
||||
|
||||
err = c.processQueueItem(fmt.Sprintf("%s/%s", restore.Namespace, restore.Name))
|
||||
require.Nil(t, err)
|
||||
|
||||
res, err := c.restoreClient.Restores(restore.Namespace).Get(context.Background(), restore.Name, metav1.GetOptions{})
|
||||
require.Nil(t, err)
|
||||
assert.Equal(t, velerov1api.RestorePhaseFailed, res.Status.Phase)
|
||||
}
|
||||
|
||||
func TestProcessQueueItem(t *testing.T) {
|
||||
|
||||
defaultStorageLocation := builder.ForBackupStorageLocation("velero", "default").Provider("myCloud").Bucket("bucket").Result()
|
||||
|
|
|
@ -410,6 +410,15 @@ func (m *ServerMetrics) InitSchedule(scheduleName string) {
|
|||
if c, ok := m.metrics[volumeSnapshotFailureTotal].(*prometheus.CounterVec); ok {
|
||||
c.WithLabelValues(scheduleName).Add(0)
|
||||
}
|
||||
if c, ok := m.metrics[csiSnapshotAttemptTotal].(*prometheus.CounterVec); ok {
|
||||
c.WithLabelValues(scheduleName, "").Add(0)
|
||||
}
|
||||
if c, ok := m.metrics[csiSnapshotSuccessTotal].(*prometheus.CounterVec); ok {
|
||||
c.WithLabelValues(scheduleName, "").Add(0)
|
||||
}
|
||||
if c, ok := m.metrics[csiSnapshotFailureTotal].(*prometheus.CounterVec); ok {
|
||||
c.WithLabelValues(scheduleName, "").Add(0)
|
||||
}
|
||||
}
|
||||
|
||||
// InitSchedule initializes counter metrics for a node.
|
||||
|
|
|
@ -17,6 +17,7 @@ limitations under the License.
|
|||
package restic
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"os"
|
||||
"strings"
|
||||
|
@ -26,10 +27,10 @@ import (
|
|||
corev1api "k8s.io/api/core/v1"
|
||||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
|
||||
"k8s.io/apimachinery/pkg/labels"
|
||||
"sigs.k8s.io/controller-runtime/pkg/client"
|
||||
|
||||
"github.com/vmware-tanzu/velero/internal/credentials"
|
||||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
|
||||
velerov1listers "github.com/vmware-tanzu/velero/pkg/generated/listers/velero/v1"
|
||||
"github.com/vmware-tanzu/velero/pkg/label"
|
||||
"github.com/vmware-tanzu/velero/pkg/util/filesystem"
|
||||
)
|
||||
|
@ -242,18 +243,21 @@ type SnapshotIdentifier struct {
|
|||
|
||||
// GetSnapshotsInBackup returns a list of all restic snapshot ids associated with
|
||||
// a given Velero backup.
|
||||
func GetSnapshotsInBackup(ctx context.Context, backup *velerov1api.Backup, kbClient client.Client) ([]SnapshotIdentifier, error) {
|
||||
podVolumeBackups := &velerov1api.PodVolumeBackupList{}
|
||||
options := &client.ListOptions{
|
||||
LabelSelector: labels.Set(map[string]string{
|
||||
velerov1api.BackupNameLabel: label.GetValidName(backup.Name),
|
||||
}).AsSelector(),
|
||||
}
|
||||
|
||||
err := kbClient.List(ctx, podVolumeBackups, options)
|
||||
if err != nil {
|
||||
return nil, errors.WithStack(err)
|
||||
}
|
||||
|
||||
var res []SnapshotIdentifier
|
||||
for _, item := range podVolumeBackups.Items {
|
||||
if item.Status.SnapshotID == "" {
|
||||
continue
|
||||
}
|
||||
|
|
|
@ -17,6 +17,7 @@ limitations under the License.
|
|||
package restic
|
||||
|
||||
import (
|
||||
"context"
|
||||
"os"
|
||||
"sort"
|
||||
"testing"
|
||||
|
@ -28,8 +29,6 @@ import (
|
|||
|
||||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
|
||||
"github.com/vmware-tanzu/velero/pkg/builder"
|
||||
"github.com/vmware-tanzu/velero/pkg/generated/clientset/versioned/fake"
|
||||
informers "github.com/vmware-tanzu/velero/pkg/generated/informers/externalversions"
|
||||
velerotest "github.com/vmware-tanzu/velero/pkg/test"
|
||||
)
|
||||
|
||||
|
@ -369,10 +368,8 @@ func TestGetSnapshotsInBackup(t *testing.T) {
|
|||
for _, test := range tests {
|
||||
t.Run(test.name, func(t *testing.T) {
|
||||
var (
|
||||
client = fake.NewSimpleClientset()
|
||||
sharedInformers = informers.NewSharedInformerFactory(client, 0)
|
||||
pvbInformer = sharedInformers.Velero().V1().PodVolumeBackups()
|
||||
veleroBackup = &velerov1api.Backup{}
|
||||
clientBuilder = velerotest.NewFakeControllerRuntimeClientBuilder(t)
|
||||
veleroBackup = &velerov1api.Backup{}
|
||||
)
|
||||
|
||||
veleroBackup.Name = "backup-1"
|
||||
|
@ -380,12 +377,11 @@ func TestGetSnapshotsInBackup(t *testing.T) {
|
|||
if test.longBackupNameEnabled {
|
||||
veleroBackup.Name = "the-really-long-backup-name-that-is-much-more-than-63-characters"
|
||||
}
|
||||
clientBuilder.WithLists(&velerov1api.PodVolumeBackupList{
|
||||
Items: test.podVolumeBackups,
|
||||
})
|
||||
|
||||
for _, pvb := range test.podVolumeBackups {
|
||||
require.NoError(t, pvbInformer.Informer().GetStore().Add(pvb.DeepCopy()))
|
||||
}
|
||||
|
||||
res, err := GetSnapshotsInBackup(veleroBackup, pvbInformer.Lister())
|
||||
res, err := GetSnapshotsInBackup(context.TODO(), veleroBackup, clientBuilder.Build())
|
||||
assert.NoError(t, err)
|
||||
|
||||
// sort to ensure good compare of slices
|
||||
|
|
|
@ -28,6 +28,15 @@ import (
|
|||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
|
||||
)
|
||||
|
||||
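// NewFakeControllerRuntimeClientBuilder returns a fake controller-runtime client builder
// whose scheme has the Velero and core v1 APIs registered, for use in unit tests.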
func NewFakeControllerRuntimeClientBuilder(t *testing.T) *k8sfake.ClientBuilder {
|
||||
scheme := runtime.NewScheme()
|
||||
err := velerov1api.AddToScheme(scheme)
|
||||
require.NoError(t, err)
|
||||
err = corev1api.AddToScheme(scheme)
|
||||
require.NoError(t, err)
|
||||
return k8sfake.NewClientBuilder().WithScheme(scheme)
|
||||
}
|
||||
|
||||
func NewFakeControllerRuntimeClient(t *testing.T, initObjs ...runtime.Object) client.Client {
|
||||
scheme := runtime.NewScheme()
|
||||
err := velerov1api.AddToScheme(scheme)
|
||||
|
|
|
@ -37,6 +37,10 @@ func ResetVolumeSnapshotContent(snapCont *snapshotv1api.VolumeSnapshotContent) e
|
|||
return fmt.Errorf("the volumesnapshotcontent '%s' does not have snapshothandle set", snapCont.Name)
|
||||
}
|
||||
|
||||
// Set the VolumeSnapshotRef to a non-existent object to bypass the validation webhook.
snapCont.Spec.VolumeSnapshotRef = corev1.ObjectReference{
	Namespace: fmt.Sprintf("ns-%s", snapCont.UID),
	Name:      fmt.Sprintf("name-%s", snapCont.UID),
}
|
||||
return nil
|
||||
}
|
||||
|
|
|
@ -21,8 +21,6 @@ import (
|
|||
"fmt"
|
||||
"time"
|
||||
|
||||
"sigs.k8s.io/controller-runtime/pkg/client"
|
||||
|
||||
"github.com/pkg/errors"
|
||||
"github.com/sirupsen/logrus"
|
||||
corev1api "k8s.io/api/core/v1"
|
||||
|
@ -35,7 +33,7 @@ import (
|
|||
"k8s.io/apimachinery/pkg/runtime"
|
||||
"k8s.io/apimachinery/pkg/util/wait"
|
||||
corev1client "k8s.io/client-go/kubernetes/typed/core/v1"
|
||||
corev1listers "k8s.io/client-go/listers/core/v1"
|
||||
"sigs.k8s.io/controller-runtime/pkg/client"
|
||||
)
|
||||
|
||||
// These annotations are taken from the Kubernetes persistent volume/persistent volume claim controller.
|
||||
|
@ -117,8 +115,7 @@ func EnsureNamespaceExistsAndIsReady(namespace *corev1api.Namespace, client core
|
|||
// GetVolumeDirectory gets the name of the directory on the host, under /var/lib/kubelet/pods/<podUID>/volumes/,
|
||||
// where the specified volume lives.
|
||||
// For volumes with a CSIVolumeSource, append "/mount" to the directory name.
|
||||
func GetVolumeDirectory(ctx context.Context, log logrus.FieldLogger, pod *corev1api.Pod, volumeName string, cli client.Client) (string, error) {
|
||||
var volume *corev1api.Volume
|
||||
|
||||
for _, item := range pod.Spec.Volumes {
|
||||
|
@ -142,18 +139,20 @@ func GetVolumeDirectory(log logrus.FieldLogger, pod *corev1api.Pod, volumeName s
|
|||
}
|
||||
|
||||
// Most common case is that we have a PVC VolumeSource, and we need to check the PV it points to for a CSI source.
pvc := &corev1api.PersistentVolumeClaim{}
err := cli.Get(ctx, client.ObjectKey{Namespace: pod.Namespace, Name: volume.VolumeSource.PersistentVolumeClaim.ClaimName}, pvc)
|
||||
if err != nil {
|
||||
return "", errors.WithStack(err)
|
||||
}
|
||||
|
||||
pv := &corev1api.PersistentVolume{}
err = cli.Get(ctx, client.ObjectKey{Name: pvc.Spec.VolumeName}, pv)
|
||||
if err != nil {
|
||||
return "", errors.WithStack(err)
|
||||
}
|
||||
|
||||
// PV's been created with a CSI source.
|
||||
isProvisionedByCSI, err := isProvisionedByCSI(log, pv, cli)
|
||||
if err != nil {
|
||||
return "", errors.WithStack(err)
|
||||
}
|
||||
|
|
|
@ -17,6 +17,7 @@ limitations under the License.
|
|||
package kube
|
||||
|
||||
import (
|
||||
"context"
|
||||
"encoding/json"
|
||||
"testing"
|
||||
"time"
|
||||
|
@ -33,7 +34,6 @@ import (
|
|||
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
|
||||
"k8s.io/apimachinery/pkg/runtime"
|
||||
"k8s.io/apimachinery/pkg/runtime/schema"
|
||||
kubeinformers "k8s.io/client-go/informers"
|
||||
"sigs.k8s.io/controller-runtime/pkg/client/fake"
|
||||
|
||||
"github.com/vmware-tanzu/velero/pkg/builder"
|
||||
|
@@ -202,22 +202,18 @@ func TestGetVolumeDirectorySuccess(t *testing.T) {
	csiDriver := storagev1api.CSIDriver{
		ObjectMeta: metav1.ObjectMeta{Name: "csi.test.com"},
	}
	kbClient := fake.NewClientBuilder().WithLists(&storagev1api.CSIDriverList{Items: []storagev1api.CSIDriver{csiDriver}}).Build()
	for _, tc := range tests {
		h := newHarness(t)

		pvcInformer := kubeinformers.NewSharedInformerFactoryWithOptions(h.KubeClient, 0, kubeinformers.WithNamespace("ns-1")).Core().V1().PersistentVolumeClaims()
		pvInformer := kubeinformers.NewSharedInformerFactory(h.KubeClient, 0).Core().V1().PersistentVolumes()
		clientBuilder := fake.NewClientBuilder().WithLists(&storagev1api.CSIDriverList{Items: []storagev1api.CSIDriver{csiDriver}})

		if tc.pvc != nil {
			require.NoError(t, pvcInformer.Informer().GetStore().Add(tc.pvc))
			clientBuilder = clientBuilder.WithObjects(tc.pvc)
		}
		if tc.pv != nil {
			require.NoError(t, pvInformer.Informer().GetStore().Add(tc.pv))
			clientBuilder = clientBuilder.WithObjects(tc.pv)
		}

		// Function under test
		dir, err := GetVolumeDirectory(logrus.StandardLogger(), tc.pod, tc.pod.Spec.Volumes[0].Name, pvcInformer.Lister(), pvInformer.Lister(), kbClient)
		dir, err := GetVolumeDirectory(context.Background(), logrus.StandardLogger(), tc.pod, tc.pod.Spec.Volumes[0].Name, clientBuilder.Build())

		require.NoError(t, err)
		assert.Equal(t, tc.want, dir)
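The test now seeds a fake controller-runtime client instead of informer stores. As a rough, standalone illustration of that pattern (not Velero's actual harness; the object names and explicit scheme registration are assumptions), a lookup against a seeded fake client looks like this:

```go
package kube_test

import (
	"context"
	"testing"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/fake"
)

// TestFakeClientLookup is a hypothetical example of the fake-client pattern:
// seed the client with the objects a test case needs, then resolve a PVC's
// bound PV by name, the same shape of lookup GetVolumeDirectory now performs.
func TestFakeClientLookup(t *testing.T) {
	scheme := runtime.NewScheme()
	if err := clientgoscheme.AddToScheme(scheme); err != nil {
		t.Fatal(err)
	}

	pvc := &corev1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{Namespace: "ns-1", Name: "my-pvc"},
		Spec:       corev1.PersistentVolumeClaimSpec{VolumeName: "pv-1"},
	}
	pv := &corev1.PersistentVolume{ObjectMeta: metav1.ObjectMeta{Name: "pv-1"}}

	cli := fake.NewClientBuilder().WithScheme(scheme).WithObjects(pvc, pv).Build()

	got := &corev1.PersistentVolume{}
	key := client.ObjectKey{Name: pvc.Spec.VolumeName}
	if err := cli.Get(context.Background(), key, got); err != nil {
		t.Fatalf("expected PV to be found: %v", err)
	}
}
```

Seeding objects up front with `WithObjects` mirrors the `clientBuilder.WithObjects(tc.pvc)` / `WithObjects(tc.pv)` calls in the diff above.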
@@ -144,5 +144,7 @@ status:
  warnings: 2
  # Number of errors that were logged by the backup.
  errors: 0
  # An error that caused the entire backup to fail.
  failureReason: ""

```
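The `warnings`, `errors`, and `failureReason` fields above are the quickest way to judge a backup's health. For readers consuming them from Go rather than YAML, a small hypothetical helper could look like the following; it assumes the Go Backup API type mirrors these YAML fields (`Status.Warnings`, `Status.Errors`, `Status.FailureReason`), which should be verified against the `velerov1api` package version in use.

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
)

// summarize reports a backup's outcome from its status fields. Field names are
// assumed to match the documented YAML (warnings, errors, failureReason).
func summarize(b *velerov1api.Backup) string {
	if b.Status.FailureReason != "" {
		return fmt.Sprintf("backup %s failed: %s", b.Name, b.Status.FailureReason)
	}
	return fmt.Sprintf("backup %s finished with %d errors and %d warnings",
		b.Name, b.Status.Errors, b.Status.Warnings)
}

func main() {
	b := &velerov1api.Backup{
		ObjectMeta: metav1.ObjectMeta{Name: "example-backup"},
		Status:     velerov1api.BackupStatus{Warnings: 2, Errors: 0},
	}
	fmt.Println(summarize(b))
}
```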
@@ -77,8 +77,15 @@ If there is possibility the schedule will be disable to not create backup anymor

## Kubernetes API Pagination

By default, Velero will paginate the LIST API call for each resource type in the Kubernetes API when collecting items into a backup. The `--client-page-size` flag for the Velero server configures the size of each page.

Depending on the cluster's scale, tuning the page size can improve backup performance. You can experiment with higher values, noting their impact on the relevant `apiserver_request_duration_seconds_*` metrics from the Kubernetes apiserver.

Pagination can be entirely disabled by setting `--client-page-size` to `0`. This will request all items in a single unpaginated LIST call.

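The flag above tunes server-side behavior, but it can help to see what a paged LIST looks like at the client level. The sketch below uses client-go's `pager` helper to list one resource type in pages of 500; it is a generic illustration under the assumption of a local kubeconfig, not Velero's actual backup-collection code.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/pager"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Page through Secrets across all namespaces, 500 items per LIST call,
	// analogous to running the server with --client-page-size=500.
	p := pager.New(pager.SimplePageFunc(func(opts metav1.ListOptions) (runtime.Object, error) {
		return clientset.CoreV1().Secrets("").List(context.TODO(), opts)
	}))
	p.PageSize = 500

	count := 0
	err = p.EachListItem(context.TODO(), metav1.ListOptions{}, func(obj runtime.Object) error {
		count++ // process each item as it streams in
		return nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("items collected:", count)
}
```

Setting `PageSize` to `0` in this sketch leaves the LIST unlimited, which matches the documented effect of `--client-page-size=0`.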
## Deleting Backups

Use the following commands to delete Velero backups and data:

* `kubectl delete backup <backupName> -n <veleroNamespace>` will delete the backup custom resource only and will not delete any associated data from object/block storage
* `velero backup delete <backupName>` will delete the backup resource including all data in object/block storage
@@ -129,13 +129,22 @@ These are the steps to update the Velero Homebrew version.
- Run `export HOMEBREW_GITHUB_API_TOKEN=your_token_here` on your command line to make sure that `brew` can work on GitHub on your behalf.
- Run `hack/release-tools/brew-update.sh`. This script will download the necessary files, do the checks, and invoke the brew helper to submit the PR, which will open in your browser.
- Update Windows Chocolatey version. From a Windows computer, follow the step-by-step instructions to [create the Windows Chocolatey package for Velero CLI](https://github.com/adamrushuk/velero-choco/blob/main/README.md)
-

## Plugins

To release plugins maintained by the Velero team, follow the [plugin release instructions](plugin-release-instructions.md).

After the plugin images are built, be sure to update any [e2e tests][3] that use these plugins.

## Helm Chart (GA only)

### Steps
- Update the CRDs under helm chart folder `crds` according to the current Velero GA version, and add the labels for the helm chart CRDs. For example: https://github.com/vmware-tanzu/helm-charts/pull/248.
- Bump the Chart version `version` on the `Chart.yaml`.
- Bump the Velero version `appVersion` on the `Chart.yaml` file and `tag` on the `values.yaml` file.
- Bump the plugin version on the `values.yaml` if needed.
- Update the _upgrade_ instruction and related tag on the `README.md` file.

## How to write and release a blog post
What to include in a release blog:
* Thank all contributors for their involvement in the release.