Make sure a DeleteBackupRequest has its Spec.BackupName filled in. If
not, record an error in the status and mark the request as processed.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Always request DeleteBackupRequests for a given backup so we can show
failed deletion attempts if you try to delete a backup that has PV
snapshots when Ark doesn't have a persistentVolumeProvider configured.
When creating a DeleteBackupRequest, include a label for the UID so we
can match based on name and UID when associated DeleteBackupRequests
with a given backup.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
We ran into a lot of problems using a finalizer on the backup to allow
the Ark server to clean up all associated backup data when deleting a
backup.
Users also found it less than desirable that deleting the heptio-ark
namespace resulted in all the backup data being deleted.
This removes the finalizer and replaces it with an explicit
DeleteBackupRequest that is created as a means of requesting the
deletion of a backup and all its associated data. This is what `ark
backup delete` does.
If you use kubectl to delete a backup or to delete the heptio-ark
namespace, this no longer deletes associated backups. Additionally, as
long as the heptio-ark namespace still exists, the Ark server's
BackupSyncController will continually sync backups into the heptio-ark
namespace from object storage.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
This commit adds limitranges to defaultResourcePriorities as
suggested in #385.
This is done so that pods are not restored before the LimitRange
objects, because that would lead to pods not honoring the requests
and limits set in LimitRange objects.
Fixes#385
Signed-off-by: Shubham <shubham@linux.com>
Instead of requiring the Ark admin to specify a "location" in the azure
persistentVolumeProvider config (meaning only a single location is
supported), get info about the disk (for its location) when creating a
snapshot, and get info about the snapshot (for its location) when
creating a disk from a snapshot.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
If you want to test changes to the ark server without having to rebuild
and redeploy the ark container, this change allows you to do something
like this (assuming you've created your cloud credentials file):
AWS_SHARED_CREDENTIALS_FILE=credentials-minio ark server -n heptio-ark
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Add --force and --confirm to `ark backup delete` to support forced
backup deletion. This forcibly removes the Ark GC finalizer (if it's
present) from a backup and will orphan any resources associated with the
backup, such as backup tarballs in object storage, persistent volume
snapshots, and restores for the backup.
If a backup has a deletion timestamp, display `Deleting` in `ark backup
describe` and `ark backup get`.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Always try to create the config directory when saving the client config
in case it doesn't exist.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
When running `ark <resource> create`, a request is sent to the server,
but the status is not immediately known. Inform the user that a request
was sent and provide a way to get more information on it.
Signed-off-by: Nolan Brubaker <nolan@heptio.com>
When creating a restore based on a backup that doesn't exist, the
restore should be marked as invalid and the error clearly communicated
so the user understands why the restore wasn't made.
Previously, the restore was left as in progress with an error attached.
Since restores are CRDs and must be updated via a controller, there's
currently not a way to give the client immediate errors.
Signed-off-by: Nolan Brubaker <nolan@heptio.com>
Add the ability for the Ark server to run in any namespace.
Add `ark client config get/set` for manipulating the new client
configuration file in $HOME/.config/ark/config.json. This holds client
defaults, such as the Ark server's namespace (to avoid having to specify
the --namespace flag all the time).
Add a --namespace flag to all client commands.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
If a backup item action errors, log the error as soon as it occurs, so
it's clear when the error happened. Also include information about the
groupResource, namespace, and name of the item in the error.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
When running a backup, try to do as much as possible, collecting errors
along the way, and return an aggregate at the end. This way, if a backup
fails for most reasons, we'll be able to upload the backup log file to
object storage, which wasn't happening before.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
The main Ark code was hard-coding specific support for AWS, GCE, and
Azure volume snapshots and restores, and anything else was considered
unsupported.
Add GetVolumeID and SetVolumeID to the BlockStore interface, to allow
block store plugins to handle volume snapshots and restores.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
The log location hook was matching github.com/heptio/ark and stripping
off that + 1 more char. This meant that
github.com/heptio/ark-plugin-example/foo.go was being listed as
plugin-example/foo.go instead of
github.com/heptio/ark-plugin-example/foo.go.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
If backing up specific namespaces with "auto" cluster resources, the
actual namespace objects themselves were not being included in the
backup. Restore would create them but any labels or metadata would be
lost.
Instead handle the special case of namespace as a cluster level resource
we may still need, even if excluding most cluster level resources.
Signed-off-by: Devan Goodwin <dgoodwin@redhat.com>
If you have a large number of warnings and/or errors, the restore
object's size can exceed the maximum allowed by etcd. Move them to
object storage, and add a new describe command to fetch and display them
on the fly.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Deleting the clusterIP field when the service should be headless will
cause it to be assigned a new IP on restore; instead it should retain
the headless state after restoration.
Fixes#168
Signed-off-by: Nolan Brubaker <nolan@heptio.com>
We only need them if we've got unstructured/unknown data and we want to
convert it to typed objects.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Previously the directory structure separated resources depending on
whether or not they were cluster or namespace scoped. All cluster
resources were restored first, then all namespace resources. Priority
did not apply across both and you could not order any namespace
resources before any cluster resources.
This restructure sorts firstly on resource type.
resources/serviceaccounts/namespaces/ns1.json
resources/nodes/cluster/node1.json
This will break old backups as the format is no longer consistent as
announced on the Google group.
Signed-off-by: Devan Goodwin <dgoodwin@redhat.com>
- Read PV's AZ info from fault-domain label of the PV object for snapshotting.
- Store PV's AZ info in the VolumeInfo.
- Add tests for reading the label from the PV object.
- Remove availability zone validation in AWS and GCP BlockStorageAdaptor.
- Add volumeAZ as a parameter to methods in the BlockStorageAdapter interface.
- Get AZ from VolumeInfo when restoring PV snapshot.
- Remove references to PV availability zone in docs.
Signed-off-by: Ashish Amarnath <ashish.amarnath@gmail.com>
- Introduced a blacklist of resources that are non-restorable. The
goal being that the backup can still include these resources for
logging/auditing purposes but they are explicitly added to
ExcludedResources in the RestorController's "defaulting" logic
to ensure that if someone were to explicitly ask for nodes
that they would be expressly denied.
Signed-off-by: Justin Nauman <justin.r.nauman@gmail.com>
Fix 2 issues with config change detection:
- Objects received via Get() don't have kind and apiVersion set, while
those from Watch() do, leading to false positives.
- Compare the unmodified config (prior to applying defaults) to the
updated one from Watch().
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
- Adding in support for a new `download` subcommand of backup
- Adjusted signing to allow for multiple types
- Adding in git sha version during build more granular version
debugging
Signed-off-by: Justin Nauman <justin.r.nauman@gmail.com>
Delete all objects in backup "dir" when deleting a backup, instead of
hard-coding individual file names/types. This way, we'll be able to
delete log files and anything else we add without having to update our
deletion code.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Due to a change in glog, if you don't call flag.Parse, every log line
prints out 'ERROR: logging before flag.Parse'. This works around that
change. Ultimately we need to switch to a different logging library.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
- Per discussion, there is no reason to deal
with the complexity of backwards compatibility
with the Namespace attribute on the Restore
domain.
- Also noticed there was an error on the
validation of the BackupController where
the message would actually just be the index
of the error and not the contents of the message
itself.
Signed-off-by: Justin Nauman <justin.r.nauman@gmail.com>
- Adding in additional test to ensure *Namespaces attributes
don't directly conflict logically with one another
- Additional PR changes around naming/typos
Signed-off-by: Justin Nauman <justin.r.nauman@gmail.com>
- Introduces similar Include/Exclude declaration on the Restore
resource and cli flags
- Kept support for legacy Namespaces attribute until it could be
deprecating. Defining both IncludeNamespaces and Namespaces results
in a validation error and the Restore will not be processed (shouldn't
be able to occur)
Signed-off-by: Justin Nauman <justin.r.nauman@gmail.com>
- Changed the default kubeconfig loading to utilize
the client-go's loader strategy. This allows users
to use the Ark client without having to explicitly
define a KUBECONFIG env var or argument.
This more closely resemebles how Kubectl works and users
are probably more used to while preserving the
current rules.
Signed-off-by: Justin Nauman <justin.r.nauman@gmail.com>
- Adding in paging support for the S3 and Snapshot
AWS integration.
As a testing note, you can add in a a MaxKeys to the S3
request as an easy way to ensure that paging is working
properly without having to creation over 1k backups.
Signed-off-by: Justin Nauman <justin.r.nauman@gmail.com>
Prepping to roll back to the same version of testify that client-go
uses, and that does not have NoErrorf.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>