diff --git a/README.md b/README.md
index d82bf91c7..c787cd7f0 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,10 @@ Ark consists of:
 * A server that runs on your cluster
 * A command-line client that runs locally
 
+## More information
+
+[The documentation][29] provides detailed information about building from source, architecture, extending Ark, and more.
+
 ## Getting started
 
 The following example sets up the Ark server and client, then backs up and restores a sample application.
@@ -153,13 +157,9 @@ kubectl delete -f examples/minio/
 kubectl delete -f examples/nginx-app/base.yaml
 ```
 
-## More information
-
-[The documentation][29] provides detailed information about building from source, architecture, extending Ark, and more.
-
 ## Troubleshooting
 
-If you encounter any problems that the documentation does not address, review the [troubleshooting][30] page, [file an issue][4], or talk to us on the [Kubernetes Slack team][25] channel `#ark-dr`.
+If you encounter issues, review the [troubleshooting docs][30], [file an issue][4], or talk to us on the [Kubernetes Slack team][25] channel `#ark-dr`.
 
 ## Contributing
 
diff --git a/docs/debugging-deletes.md b/docs/debugging-deletes.md
new file mode 100644
index 000000000..b5fa1e6e1
--- /dev/null
+++ b/docs/debugging-deletes.md
@@ -0,0 +1,47 @@
+# Ark version 0.7.0 and later: issue with deleting namespaces and backups
+
+Version 0.7.0 introduced the ability to delete backups. However, you may encounter an issue if you try to
+delete the `heptio-ark` namespace. The namespace can get stuck in a terminating state, and you cannot delete your backups.
+To fix:
+
+1. If you don't have it, [install `jq`][0].
+
+1. Run:
+
+    ```bash
+    bash <(kubectl -n heptio-ark get backup -o json | jq -c -r $'.items[] | "kubectl -n heptio-ark patch backup/" + .metadata.name + " -p \'" + (({metadata: {finalizers: ( (.metadata.finalizers // []) - ["gc.ark.heptio.com"]), resourceVersion: .metadata.resourceVersion}}) | tostring) + "\' --type=merge"')
+    ```
+
+This command retrieves a list of backups, then generates and runs another list of commands that look like:
+
+```
+kubectl -n heptio-ark patch backup/my-backup -p '{"metadata":{"finalizers":[],"resourceVersion":"461343"}}' --type=merge
+kubectl -n heptio-ark patch backup/some-other-backup -p '{"metadata":{"finalizers":[],"resourceVersion":"461718"}}' --type=merge
+```
+
+If you encounter errors that tell you patching backups is not allowed, the Ark
+CustomResourceDefinitions (CRDs) might have been deleted. To fix, recreate the CRDs using
+`examples/common/00-prereqs.yaml`, then follow the steps above.
+
+## Mitigate the issue in Ark version 0.7.1 and later
+
+In Ark version 0.7.1, the default configuration runs the Ark server in a different namespace from the namespace
+for backups, schedules, restores, and the Ark config. We strongly recommend that you keep this configuration.
+This approach can help prevent issues with deletes.
+
+## For the curious: why the error occurs
+
+The Ark team added the ability to delete backups by adding a **finalizer** to each
+backup. When you request the deletion of an object that has at least one finalizer, Kubernetes sets
+the object's deletion timestamp, which indicates that the object is marked for deletion. However, it does
+not immediately delete the object. Instead, the object is deleted only when it no longer has
+any finalizers. This means that something -- in this case, Ark -- must process the backup and then
+remove the Ark finalizer from it.
+
+Ark versions earlier than v0.7.1 place the Ark server pod in the same namespace as backups, restores,
+schedules, and the Ark config. If you try to delete the namespace with `kubectl delete
+namespace/heptio-ark`, the Ark server pod might be deleted before the backups, because
+the order of deletions is arbitrary. If this happens, the remaining backups are stuck in a
+deleting state, because the Ark server pod no longer exists to remove their finalizers.
+
+[0]: https://stedolan.github.io/jq/
\ No newline at end of file
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index e3a905109..cd9c954c9 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -1,38 +1,12 @@
 # Troubleshooting
 
-## heptio-ark namespace stuck terminating / unable to delete backups
+These tips can help you troubleshoot known issues. If they don't help, you can [file an issue][4], or talk to us on the [Kubernetes Slack team][25] channel `#ark-dr`.
 
-Ark v0.7.0 added the ability to delete backups by adding what is called an Ark "finalizer" to each
-backup. When you request the deletion of an object that has at least one finalizer, Kubernetes sets
-the object's "deletion timestamp" (indicating the object has been marked for deletion), but it does
-not immediately delete the object. Instead, Kubernetes only deletes the object when it no longer has
-any finalizers. This means that something (Ark, in this case) must process the backup and then
-remove the Ark finalizer from it.
+* [Delete namespaces and backups][0]
 
-Ark versions before v0.7.1 place the Ark server pod in the same namespace as backups, restores,
-schedules, and the Ark config. If you try to delete the namespace (`kubectl delete
-namespace/heptio-ark`), it's possible that the Ark server pod is deleted before the backups, because
-the order of deletions is arbitrary. If this happens, the remaining bacukps will be "stuck"
-deleting, because the Ark server pod no longer exists to remove their finalizers.
+* [Debug restores][1]
 
-With v0.7.1, we strongly encourage you to run the Ark server pod in a different namespace than the
-one used for backups, schedules, restores, and the Ark config. This is the default configuration as
-of v0.7.1.
-
-If you encounter this problem, here is how to fix it. First, make sure you have `jq` installed. Then
-run:
-
-```
-bash <(kubectl -n heptio-ark get backup -o json | jq -c -r $'.items[] | "kubectl -n heptio-ark patch backup/" + .metadata.name + " -p \'" + (({metadata: {finalizers: ( (.metadata.finalizers // []) - ["gc.ark.heptio.com"]), resourceVersion: .metadata.resourceVersion}}) | tostring) + "\' --type=merge"')
-```
-
-This retrieves a list of backups and uses it to generate and run a list of commands that look like:
-
-```
-kubectl -n heptio-ark patch backup/my-backup -p '{"metadata":{"finalizers":[],"resourceVersion":"461343"}}' --type=merge
-kubectl -n heptio-ark patch backup/some-other-backup -p '{"metadata":{"finalizers":[],"resourceVersion":"461718"}}' --type=merge
-```
-
-If you receive errors that patching backups is not allowed, it's possible that the Ark
-CustomResourceDefinitions (CRDs) were deleted. You'll need to recreate them (they're in
-`examples/common/00-prereqs.yaml`), then follow the steps above.
+[0]: /docs/debugging-deletes.md
+[1]: /docs/debugging-restores.md
+[4]: https://github.com/heptio/ark/issues
+[25]: http://slack.kubernetes.io/
\ No newline at end of file
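
Review note on `docs/debugging-deletes.md`: the jq filter is dense, so here is a minimal sketch of the command it emits for each backup. `print_patch_cmd` is a hypothetical helper, not part of Ark or of this change; it only prints a command and never calls `kubectl`. It also assumes the Ark finalizer is the backup's only finalizer, so it writes an empty list, whereas the jq filter subtracts just `gc.ark.heptio.com` and keeps any others.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Ark): print the merge-patch command for one backup.
# Assumption: the backup has no finalizers other than gc.ark.heptio.com,
# so the patched finalizer list is empty.
print_patch_cmd() {
  local name="$1" resource_version="$2"
  printf "kubectl -n heptio-ark patch backup/%s -p '%s' --type=merge\n" \
    "$name" \
    "{\"metadata\":{\"finalizers\":[],\"resourceVersion\":\"${resource_version}\"}}"
}

# Example values taken from the sample output in the doc.
print_patch_cmd my-backup 461343
```

Redirecting the generated commands to a file and reading them before running the file with `bash` is one way to check the patches first.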