* add restore item action to change PVC/PV storage class name
Signed-off-by: Steve Kriss <krisss@vmware.com>
* code review
Signed-off-by: Steve Kriss <krisss@vmware.com>
* change existing plugin names to lowercase/hyphenated
Signed-off-by: Steve Kriss <krisss@vmware.com>
* add validation for new storage class name
Signed-off-by: Steve Kriss <krisss@vmware.com>
* add test cases
Signed-off-by: Steve Kriss <krisss@vmware.com>
* changelog
Signed-off-by: Steve Kriss <krisss@vmware.com>
* fix imports
Signed-off-by: Steve Kriss <krisss@vmware.com>
* update plugin names to be more consistent
Signed-off-by: Steve Kriss <krisss@vmware.com>
* update unit tests to use pkg/test object constructors
Signed-off-by: Steve Kriss <krisss@vmware.com>
CSI volumes are mounted one level deeper than "native" kubernetes
volumes, and this needs to be appended for proper restic support.
Fixes#1313.
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Velero should handle cases when the label length exceeds 63 characters.
- if the length of the backup/restore name is <= 63 characters, use it as the value of the label
- if it's > 63 characters, take the SHA256 hash of the name. the value of
the label will be the first 57 characters of the backup/restore name
plus the first six characters of the SHA256 hash.
Fixes heptio#1021
Signed-off-by: Anshul Chandra <anshulc@vmware.com>
* Adds support for allowing a RestoreItemAction to skip item restore
This allows a RestoreItemAction plugin to signal to velero that
the returned item should be skipped rather than restored to the
cluster.
To support this, a boolean SkipRestore attribute is added to
RestoreItemActionExecuteOutput. If restore.restoreResource finds
this set to true, any remaining actions on this item are skipped,
and restore on this item is skipped. Execution continues with
the next item of this resource type.
To signal this for a particular item, the RestoreItemAction's
Execute method should call WithoutRestore() on the
RestoreItemActionExecuteOutput before returning it.
Signed-off-by: Scott Seago <sseago@redhat.com>
* Autogenerated code to support SkipRestore
Signed-off-by: Scott Seago <sseago@redhat.com>
* Added changelog for #1336
Signed-off-by: Scott Seago <sseago@redhat.com>
Original error was:
```
ark -n <redacted> restore logs <redacted>
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=nodes logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=events logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=events.events.k8s.io logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=backups.ark.heptio.com logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=restores.ark.heptio.com logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Starting restore of backup backup/<redacted>" logSource="pkg/restore/restore.go:342"
time="2019-03-06T18:31:06Z" level=info msg="error unzipping and extracting: open /tmp/604421455/resources/rolebindings.rbac.authorization.k8s.io/namespaces/<redacted>/<redacted>: too many open files" logSource="pkg/restore/restore.go:346"
```
Downloading the directory from s3 and untarring it I found 1036 files. The ulimit -n output says 1024. This is our team's best guess at a root cause. But the code fixed in the PR definitely is holding all the files open until the method closes: https://blog.learngoprogramming.com/gotchas-of-defer-in-go-1-8d070894cb01
Please note my go code abilities are not great and I did not test this. I just edited the file in github. All I did was remove the defer and put fileClose after the copy is done. Theoretically this should only hold one file open at a time now. Let me know if you want me to do any further steps.
Thank you,
-Asaf
Signed-off-by: Asaf Erlich <aerlich@groupon.com>
If a PV already exists, wait for it, it's associated PVC, and associated
namespace to be deleted before attempting to restore it.
If a namespace already exists, wait for it to be deleted before
attempting to restore it.
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Instead of only removing the default token from a service account when
it already exists in the cluster, always remove it. If the service
account already exists, continue to do the merging logic.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
We were checking for nil, but were getting back an empty
*unstructured.Unstructured{} instead, along with a NotFound error.
Change the logic to check for the NotFound error instead of a nil
object.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
This is to better reflect the intent of the user when node ports are
specified explicitly (as opposed to being assigned by Kubernetes). The
`last-applied-configuration` annotation added by `kubectl apply` is one
such indicator we are now leveraging.
We still default to omitting the node ports when the annotation is
missing.
Signed-off-by: Timo Reimann <ttr314@googlemail.com>
Refactor plugin management:
- support multiple plugins per executable
- support restarting a plugin process in the event it terminates
- simplify plugin lifecycle management by using separate managers for
each scope (server vs backup vs restore)
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Mirror pods are pods created from static manifest files on a node.
They're mirrored to the apiserver so they're visible when querying the
apiserver for a list of pods, but it's not possible to send a pod
containing the mirror pod annotation to the apiserver and have it be
created successfully. Instead of trying to do this, log a message that
we're skipping restoring the pod because it's a mirror pod.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
If a PV has a reclaim policy of Delete and we didn't create a snapshot
of it, don't restore the PV, as doing so would create a PV whose
underlying volume is incorrect.
Also "reset" any PVCs bound to the PV so they'll be dynamically
provisioned when restored.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
All properties from a backup will be merged into the ServiceAccount
except for the default token secret.
Signed-off-by: Nolan Brubaker <nolan@heptio.com>
Completed jobs and pods may be useful in the backup for auditing
purposes, but don't recreate them when restoring.
Signed-off-by: Nolan Brubaker <nolan@heptio.com>
When restoring resources that raise an already exists error, check their
equality before logging a message on the restore. If they're the same
except for some metadata, don't generate a message.
The restore process was modified so that if an object had an empty
namespace string, no namespace key is created on the object. This was to
avoid manipulating the copy of the current cluster's object by adding
the target namespace.
There are some cases right now that are known to not be equal via this
method:
- The `default` ServiceAccount in a namespace will not match, primarily
because of differing default tokens. These will be handled in their own
patch
- IP addresses for Services are recorded in the backup object, but are
either not present on the cluster object, or different. An issue for
this already exists at https://github.com/heptio/ark/issues/354
- Endpoints have differing values for `renewTime`. This may be
insubstantial, but isn't currently handled by the resetMetadataAndStatus
function.
- PersistentVolume objects do not match on spec fields, such as
claimRef and cloud provider persistent disk info
Signed-off-by: Nolan Brubaker <nolan@heptio.com>