* Use pod namespace from backup when matching PVBs
In #3051, we introduced an additional check to ensure that a PVB matched
a particular pod by checking both the name and the namespace of the pod.
This caused an issue when using a namespace mapping on restore. In the
case where a namespace mapping is being used, the check for whether a
PVB matches a particular pod will fail as the PVB was created for the
original pod namespace and is not aware of the new namespace mapping
being used. This resulted in PVRs not being created for pods that were
being restored into new namespaces. The restic init containers were
still created to wait on the volume restore, so the restored pods
blocked indefinitely, waiting for a volume restore that was never
scheduled.
To fix this, we use the original namespace of the pod from the backup to
match the PVB to the pod being restored, not the new namespace where
the pod is being restored into.
Fixes #3467.
Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
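The matching rule from the commit above can be sketched roughly as follows. This is only an illustration under simplified assumptions: the `podVolumeBackup` and `pod` types and the `isPVBMatchPod` helper are hypothetical stand-ins, not necessarily the names or signatures used in the Velero code.

```go
package main

import "fmt"

// podVolumeBackup is a simplified, hypothetical stand-in for a PVB. It records
// the pod it was created for, using the namespace the pod had at backup time.
type podVolumeBackup struct {
	PodName      string
	PodNamespace string
}

// pod is a simplified stand-in for the pod being restored.
type pod struct {
	Name      string
	Namespace string // may already be the remapped (target) namespace
}

// isPVBMatchPod reports whether pvb was created for p. The caller passes the
// pod's namespace as recorded in the backup, because the PVB knows nothing
// about any namespace mapping applied on restore.
func isPVBMatchPod(pvb podVolumeBackup, p pod, backupNamespace string) bool {
	return pvb.PodName == p.Name && pvb.PodNamespace == backupNamespace
}

func main() {
	pvb := podVolumeBackup{PodName: "web-0", PodNamespace: "prod"}
	restored := pod{Name: "web-0", Namespace: "prod-copy"}

	// Comparing against the remapped namespace misses the PVB.
	fmt.Println(isPVBMatchPod(pvb, restored, restored.Namespace)) // false
	// Comparing against the namespace from the backup finds it.
	fmt.Println(isPVBMatchPod(pvb, restored, "prod")) // true
}
```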
* Explain why the namespace mapping can't be used
Signed-off-by: Bridget McErlean <bmcerlean@vmware.com>
* Restore API group version by priority
Signed-off-by: F. Gold <fgold@vmware.com>
* Add changelog
Signed-off-by: F. Gold <fgold@vmware.com>
* Correct spelling
Signed-off-by: F. Gold <fgold@vmware.com>
* Refactor userResourceGroupVersionPriorities(...) to accept config map, adjust unit test
Signed-off-by: F. Gold <fgold@vmware.com>
* Move some unit tests into e2e
Signed-off-by: F. Gold <fgold@vmware.com>
* Add three e2e tests using Testify Suites
Summary of changes
Makefile - add testify e2e test target
go.sum - changed with go mod tidy
pkg/install/install.go - increased polling timeout
test/e2e/restore_priority_group_test.go - deleted
test/e2e/restore_test.go - deleted
test/e2e/velero_utils.go - made restic optional in velero install
test/e2e_testify/Makefile - makefile for testify e2e tests
test/e2e_testify/README.md - example command for running tests
test/e2e_testify/common_test.go - helper functions
test/e2e_testify/e2e_suite_test.go - prepare for tests and run
test/e2e_testify/restore_priority_apigv_test.go - test cases
Signed-off-by: F. Gold <fgold@vmware.com>
* Make changes per @nrb code review
Signed-off-by: F. Gold <fgold@vmware.com>
* Wait for pods in e2e tests
Signed-off-by: F. Gold <fgold@vmware.com>
* Remove testify suites e2e scaffolding moved to PR #3354
Signed-off-by: F. Gold <fgold@vmware.com>
* Make changes per @brito-rafa and Velero maintainers code reviews
- Made changes suggested by @brito-rafa in GitHub.
- We had a code review meeting with @carlisia, @dsu-igeek, @zubron, and @nrb,
  and made changes based on their suggestions:
- pull in logic from 'meetsAPIGVResotreReqs()' to restore.go.
- add TODO to remove APIGroupVersionFeatureFlag check
- have feature flag and backup version format checks in separate `if` statements.
- rename variables to be sourceGVs, targetGVs, and userGVs.
Signed-off-by: F. Gold <fgold@vmware.com>
* Convert Testify Suites e2e tests to existing Ginkgo framework
Signed-off-by: F. Gold <fgold@vmware.com>
* Made changes per @zubron PR review
Signed-off-by: F. Gold <fgold@vmware.com>
* Run go mod tidy after resolving go.sum merge conflict
Signed-off-by: F. Gold <fgold@vmware.com>
* Add feature documentation to velero.io site
Signed-off-by: F. Gold <fgold@vmware.com>
* Add config map e2e test; rename e2e test file and name
Signed-off-by: F. Gold <fgold@vmware.com>
* Update go.{mod,sum} files
Signed-off-by: F. Gold <fgold@vmware.com>
* Move CRDs and CRs to testdata folder
Signed-off-by: F. Gold <fgold@vmware.com>
* Fix typos in cert-manager to pass codespell CICD check
Signed-off-by: F. Gold <fgold@vmware.com>
* Make changes per @nrb code review round 2
- make checkAndReadDir function private
- add info level messages when priorities 1-3 API group versions cannot be used
Signed-off-by: F. Gold <fgold@vmware.com>
* Make user config map rules less strict
Signed-off-by: F. Gold <fgold@vmware.com>
* Update e2e test image version in example
Signed-off-by: F. Gold <fgold@vmware.com>
* Update case A music-system controller code
Signed-off-by: F. Gold <fgold@vmware.com>
* Documentation updates
Signed-off-by: F. Gold <fgold@vmware.com>
* Update migration case documentation
Signed-off-by: F. Gold <fgold@vmware.com>
* -> Preserve nodePort support when restoring via "--preserve-nodeports" flag
Signed-off-by: Yusuf Güngör <yusuf.gungor@hepsiburada.com>
* -> Added changelog.
Signed-off-by: Yusuf Güngör <yusuf.gungor@hepsiburada.com>
* -> Unit test added.
-> Using boolptr.IsSetToTrue for bool ptr check.
Signed-off-by: Yusuf Güngör <yusuf.gungor@hepsiburada.com>
* -> Other restore errors log level changed from info to error.
-> Documentation updated about Velero nodePort restore logic and preservation of them.
Signed-off-by: Yusuf Güngör <yusuf.gungor@hepsiburada.com>
Co-authored-by: Yusuf Güngör <yusuf.gungor@hepsiburada.com>
By running the following command:
codespell -S .git,*.png,*.jpg,*.woff,*.ttf,*.gif,*.ico -L \
iam,aks,ist,bridget,ue
Signed-off-by: Mateusz Gozdek <mgozdekof@gmail.com>
* Only remove the UID from a PV's claimRef
The UID is the only part of a claimRef that might prevent it from being
rebound correctly on a restore. The namespace and name within the
claimRef should be preserved in order to ensure that the PV is claimed
by the correct PVC on restore.
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Remap PVs claimRef.namespace on relevant restores
When remapping namespaces, any included PV whose claimRef points at a
remapped namespace needs that claimRef updated to the new namespace
name so that the PV is bound to the correct PVC.
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
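The two claimRef commits above can be illustrated together with a small sketch. It assumes the PV is handled as unstructured content (nested maps) and that the namespace mapping is a plain map from original to target names; the `resetClaimRef` helper is a hypothetical name used only for this example.

```go
package main

import "fmt"

// resetClaimRef clears only the uid of a PV's spec.claimRef and, when the
// claim's namespace is remapped, points the claimRef at the new namespace.
// The claimRef name is always preserved. resetClaimRef is a hypothetical
// helper for illustration.
func resetClaimRef(pv map[string]interface{}, namespaceMapping map[string]string) {
	spec, ok := pv["spec"].(map[string]interface{})
	if !ok {
		return
	}
	claimRef, ok := spec["claimRef"].(map[string]interface{})
	if !ok {
		return // no claimRef: nothing to adjust
	}

	// The uid is the only field that might prevent rebinding on restore.
	delete(claimRef, "uid")

	// Follow the namespace mapping so the PV binds to the PVC in the
	// target namespace.
	if ns, ok := claimRef["namespace"].(string); ok {
		if target, mapped := namespaceMapping[ns]; mapped {
			claimRef["namespace"] = target
		}
	}
}

func main() {
	pv := map[string]interface{}{
		"spec": map[string]interface{}{
			"claimRef": map[string]interface{}{
				"name": "data", "namespace": "prod", "uid": "1234",
			},
		},
	}
	resetClaimRef(pv, map[string]string{"prod": "prod-copy"})
	fmt.Println(pv["spec"].(map[string]interface{})["claimRef"])
}
```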
* Update tests and ensure claimRef namespace remaps
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Remove lowercased uid field from unstructured PV
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Fix issues that prevented PVs from being restored
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Add changelog
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Dynamically reprovision volumes without snapshots
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Update test for lower case uid field
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Remove stray debugging print statement
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Fix typo, remove extra code, add tests.
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
* Exec hooks in restored pods
Signed-off-by: Andrew Reed <andrew@replicated.com>
* WaitExecHookHandler implements ItemHookHandler
This required adding a context.Context argument to the ItemHookHandler
interface which is unused by the DefaultItemHookHandler implementation.
It also means passing nil for the []ResourceHook argument since that
holds BackupResourceHook.
Signed-off-by: Andrew Reed <andrew@replicated.com>
* WaitExecHookHandler unit tests
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Changelog and go fmt
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Fix double import
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Default to first container in pod
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Use constants for hook error modes in tests
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Revert to separate WaitExecHookHandler interface
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Negative tests for invalid timeout annotations
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Rename NamedExecRestoreHook PodExecRestoreHook
Also make field names more descriptive.
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Cleanup test names
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Separate maxHookWait and add unit tests
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Comment on maxWait <= 0
Also log at info level when the container is not running for hooks to execute in.
Also add context error to hooks not executed errors.
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Remove log about default for invalid timeout
There is no default wait or exec timeout.
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Linting
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Fix log message and rename controller to podWatcher
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Comment on exactly-once semantics for handler
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Fix logging and comments
Use field logger for pod in handler.
Add comment about pod changes in unit tests.
Use kube util NamespaceAndName in messages.
Signed-off-by: Andrew Reed <andrew@replicated.com>
* Fix maxHookWait
Signed-off-by: Andrew Reed <andrew@replicated.com>
* fix: rename the PV if VolumeSnapshotter has modified the PV name
When VolumeSnapshotter sets the PV name via SetVolumeID and the PV is
not present in the cluster, Velero does not rename the PV. This leaves
the PVC in the Lost state, because the PVC points to the old PV name
while the PV object has been renamed by VolumeSnapshotter.
Signed-off-by: Pawan <pawan@mayadata.io>
* adding a test case for pv rename
Signed-off-by: Pawan <pawan@mayadata.io>
* k8s 1.18 import wip
backup, cmd, controller, generated, restic, restore, serverstatusrequest, test and util
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* go mod tidy
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* add changelog file
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* go fmt
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* update code-generator and controller-gen in CI
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* checkout proper code-generator version, regen
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* fix remaining calls
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* regenerate CRDs with ./hack/update-generated-crd-code.sh
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* use existing context in restic and server
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* fix test cases by resetting resource version
also use main library go context, not golang.org/x/net/context, in pkg/restore/restore.go
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* clarify changelog message
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* use github.com/kubernetes-csi/external-snapshotter/v2@v2.2.0-rc1
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* run 'go mod tidy' to remove old external-snapshotter version
Signed-off-by: Andrew Lavery <laverya@umich.edu>
* Wait for CRDs to be available and ready
When restoring CRDs, we should wait for the definition to be ready and
available before moving on to restoring specific CRs.
While the CRDs are often ready by the time we get to restoring a CR,
there is a race condition where the CRD isn't ready.
This change waits on each CRD at restore time.
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Migrate logic from NewUUID function into the pvRenamer function.
PR #2133 switched to a new NewUUID function that returns an error, but
the invocation of that function needs to happen within the pvRenamer
closure. Because the new function returns an error, the pvRenamer must
return that error as well, so its signature needs to change and callers
need to check the returned error.
Signed-off-by: John Naulty <johnnaulty@bitgo.com>
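The signature change described above might look roughly like the sketch below. It avoids third-party imports, so `newUUID` is a hypothetical stand-in built on crypto/rand rather than the UUID library's constructor, and the `pvRenamerFunc` type, `defaultPVRenamer` name, and "velero-clone-" prefix are illustrative assumptions.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUID is a hypothetical stand-in for a UUID constructor that can fail.
// It builds a random (version 4) UUID from crypto/rand.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// pvRenamerFunc maps an original PV name to a new one. It returns an error
// because generating the new name can fail.
type pvRenamerFunc func(oldName string) (string, error)

// defaultPVRenamer generates the UUID inside the renamer and propagates any
// failure to the caller. The prefix is an assumption for this example.
func defaultPVRenamer(oldName string) (string, error) {
	id, err := newUUID()
	if err != nil {
		return "", fmt.Errorf("generating new name for PV %q: %w", oldName, err)
	}
	return "velero-clone-" + id, nil
}

func main() {
	var renamer pvRenamerFunc = defaultPVRenamer

	newName, err := renamer("pvc-1234")
	if err != nil {
		fmt.Println("rename failed:", err)
		return
	}
	fmt.Println(newName)
}
```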
satori/go.uuid has a known issue with random UUID generation.
gofrs/uuid is still maintained and has fixed the random UUID generation
issue present in satori/go.uuid.
Signed-off-by: John Naulty <johnnaulty@bitgo.com>
* update import paths to github.com/vmware-tanzu/...
Signed-off-by: Steve Kriss <krisss@vmware.com>
* update other GH org refs to vmware-tanzu
Signed-off-by: Steve Kriss <krisss@vmware.com>
* site and docs: update GH org to vmware-tanzu
Signed-off-by: Steve Kriss <krisss@vmware.com>
* update travis badge links on docs readmes
Signed-off-by: Steve Kriss <krisss@vmware.com>
* rename PV during restore when cloning a namespace
Signed-off-by: Steve Kriss <krisss@vmware.com>
* rename func and vars, switch to if..else
Signed-off-by: Steve Kriss <krisss@vmware.com>
* make pv renamer func configurable for testing purposes
Signed-off-by: Steve Kriss <krisss@vmware.com>
* add unit test cases
Signed-off-by: Steve Kriss <krisss@vmware.com>
* changelog
Signed-off-by: Steve Kriss <krisss@vmware.com>
* address review feedback
Signed-off-by: Steve Kriss <krisss@vmware.com>
* address review feedback
Signed-off-by: Steve Kriss <krisss@vmware.com>
Velero should handle cases when the label length exceeds 63 characters.
- if the length of the backup/restore name is <= 63 characters, use it as the value of the label
- if it's > 63 characters, take the SHA256 hash of the name. The value of
the label will be the first 57 characters of the backup/restore name
plus the first six characters of the SHA256 hash.
Fixes heptio#1021
Signed-off-by: Anshul Chandra <anshulc@vmware.com>
* Adds support for allowing a RestoreItemAction to skip item restore
This allows a RestoreItemAction plugin to signal to velero that
the returned item should be skipped rather than restored to the
cluster.
To support this, a boolean SkipRestore attribute is added to
RestoreItemActionExecuteOutput. If restore.restoreResource finds
this set to true, any remaining actions on this item are skipped,
and restore on this item is skipped. Execution continues with
the next item of this resource type.
To signal this for a particular item, the RestoreItemAction's
Execute method should call WithoutRestore() on the
RestoreItemActionExecuteOutput before returning it.
Signed-off-by: Scott Seago <sseago@redhat.com>
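The skip flow described above can be sketched with local stand-in types. `SkipRestore`, `RestoreItemActionExecuteOutput`, and `WithoutRestore()` are named in the commit message; everything else here (the `item` and `action` types, the two example actions, `applyActions`, and the return type of `WithoutRestore`) is a simplified assumption rather than the plugin API itself.

```go
package main

import "fmt"

// item is a simplified stand-in for an unstructured Kubernetes object.
type item map[string]string

// RestoreItemActionExecuteOutput mirrors the output type named above, reduced
// to the fields needed for this sketch.
type RestoreItemActionExecuteOutput struct {
	UpdatedItem item
	SkipRestore bool
}

// WithoutRestore marks the item as one that should not be restored.
func (o *RestoreItemActionExecuteOutput) WithoutRestore() *RestoreItemActionExecuteOutput {
	o.SkipRestore = true
	return o
}

// action is a hypothetical, reduced form of a RestoreItemAction's Execute method.
type action func(in item) *RestoreItemActionExecuteOutput

// skipTemporary asks for items marked temporary to be skipped.
func skipTemporary(in item) *RestoreItemActionExecuteOutput {
	out := &RestoreItemActionExecuteOutput{UpdatedItem: in}
	if in["temporary"] == "true" {
		return out.WithoutRestore()
	}
	return out
}

// markRestored records that the item passed through the action chain.
func markRestored(in item) *RestoreItemActionExecuteOutput {
	in["restored"] = "true"
	return &RestoreItemActionExecuteOutput{UpdatedItem: in}
}

// applyActions runs the actions in order. As soon as one sets SkipRestore,
// the remaining actions are skipped and false is returned.
func applyActions(in item, actions []action) (item, bool) {
	for _, act := range actions {
		out := act(in)
		if out.SkipRestore {
			return nil, false
		}
		in = out.UpdatedItem
	}
	return in, true
}

func main() {
	actions := []action{skipTemporary, markRestored}
	items := []item{
		{"name": "config", "temporary": "false"},
		{"name": "scratch", "temporary": "true"},
	}
	for _, it := range items {
		name := it["name"]
		if _, ok := applyActions(it, actions); !ok {
			fmt.Println("skipping", name)
			continue // execution continues with the next item
		}
		fmt.Println("restoring", name)
	}
}
```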
* Autogenerated code to support SkipRestore
Signed-off-by: Scott Seago <sseago@redhat.com>
* Added changelog for #1336
Signed-off-by: Scott Seago <sseago@redhat.com>