Design doc for upload progress monitoring (#3416)
Change to add new plugin SnapshotItemAction, added started/updated fields to UploadProgress Updated SnapshotItemAction, added additional tasks Signed-off-by: Dave Smith-Uchida <dsmithuchida@vmware.com>pull/3762/head
parent
c3f08af872
commit
3b3d228507
Binary file not shown.
Binary file not shown.
After Width: | Height: | Size: 81 KiB |
|
@ -0,0 +1,311 @@
|
|||
# Upload Progress Monitoring
|
||||
|
||||
Volume snapshotter plug-in are used by Velero to take snapshots of persistent volume contents.
|
||||
Depending on the underlying storage system, those snapshots may be available to use immediately,
|
||||
they may be uploaded to stable storage internally by the plug-in or they may need to be uploaded after
|
||||
the snapshot has been taken. We would like for Velero to continue on to the next part of the backup as quickly
|
||||
as possible but we would also like the backup to not be marked as complete until it is a usable backup. We'd also
|
||||
eventually like to bring the control of upload under the control of Velero and allow the user to make decisions
|
||||
about the ultimate destination of backup data independent of the storage system they're using.
|
||||
|
||||
|
||||
|
||||
## Examples
|
||||
AWS - AWS snapshots return quickly, but are then uploaded in the background and cannot be used until EBS moves
|
||||
the data into S3 internally.
|
||||
|
||||
vSphere - The vSphere plugin takes a local snapshot and then the vSphere plugin uploads the data to S3. The local
|
||||
snapshot is usable before the upload completes.
|
||||
|
||||
Restic - Does not go through the volume snapshot path. Restic backups will block Velero progress until completed.
|
||||
|
||||
## Goals
|
||||
|
||||
- Enable monitoring of operations that continue after snapshotting operations have completed
|
||||
- Keep non-usable backups (upload/persistence has not finished) from appearing as completed
|
||||
- Minimize change to volume snapshot and BackupItemAction plug-ins
|
||||
|
||||
## Non-goals
|
||||
- Unification of BackupItemActions and VolumeSnapshotters
|
||||
|
||||
## Models
|
||||
|
||||
### Internal configuration and management
|
||||
In this model, movement of the snapshot to stable storage is under the control of the snapshot
|
||||
plug-in. Decisions about where and when the snapshot gets moved to stable storage are not
|
||||
directly controlled by Velero. This is the model for the current VolumeSnapshot plugins.
|
||||
|
||||
### Velero controlled management
|
||||
In this model, the snapshot is moved to external storage under the control of Velero. This
|
||||
enables Velero to move data between storage systems. This also allows backup partners to use
|
||||
Velero to snapshot data and then move the data into their backup repository.
|
||||
|
||||
## Backup phases
|
||||
|
||||
Velero currently has backup phases "InProgress" and "Completed". The backup moves to the Completed
|
||||
phase when all of the volume snapshots have completed and the Kubernetes metadata has been written
|
||||
into the object store. However, the actual data movement may be happening in the background
|
||||
after the backup has been marked "Completed". The backup is not actually a stable backup until
|
||||
the data has been persisted properly. In some cases (e.g. AWS) the backup cannot be restored from
|
||||
until the snapshots have been persisted.
|
||||
|
||||
Once the snapshots have been taken, however, it is possible for additional backups to be made without
|
||||
interference. Waiting until all data has been moved before starting the next backup will
|
||||
slow the progress of the system without adding any actual benefit to the user.
|
||||
|
||||
A new backup phase, "Uploading" will be introduced. When a backup has entered this phase, Velero
|
||||
is free to start another backup. The backup will remain in the "Uploading" phase until all data
|
||||
has been successfully moved to persistent storage. The backup will not fail once it reaches
|
||||
this phase, it will continuously retry moving the data. If the backup is deleted (cancelled), the plug-ins will
|
||||
attempt to delete the snapshots and stop the data movement - this may not be possible with all
|
||||
storage systems.
|
||||
|
||||
### State progression
|
||||
|
||||
![image](UploadFSM.png)
|
||||
### New
|
||||
When a backup request is initially created, it is in the "New" phase.
|
||||
|
||||
The next state is either "InProgress" or "FailedValidation"
|
||||
|
||||
### FailedValidation
|
||||
If the backup request is incorrectly formed, it goes to the "FailedValidation" phase and terminates
|
||||
|
||||
### InProgress
|
||||
When work on the backup begins, it moves to the "InProgress" phase. It remains in the "InProgress"
|
||||
phase until all pre/post execution hooks have been executed, all snapshots have been taken and the
|
||||
Kubernetes metadata and backup info is safely written to the object store plug-in.
|
||||
|
||||
In the current implementation, Restic backups will move data during the "InProgress" phase.
|
||||
In the future, it may be possible to combine a snapshot with a Restic (or equivalent) backup which
|
||||
would allow for data movement to be handled in the "Uploading" phase,
|
||||
|
||||
The next phase is either "Completed", "Uploading", "Failed" or "PartiallyFailed". Backups which
|
||||
would have a final phase of "Completed" or "PartiallyFailed" may move to the "Uploading" state.
|
||||
A backup which will be marked "Failed" will go directly to
|
||||
the "Failed" phase. Uploads may continue in the background for snapshots that were taken by a "Failed"
|
||||
backup, but no progress will not be monitored or updated. When a "Failed" backup is deleted, all snapshots
|
||||
will be deleted and at that point any uploads still in progress should be aborted.
|
||||
|
||||
### Uploading (new)
|
||||
The "Uploading" phase signifies that the main part of the backup, including snapshotting has completed successfully
|
||||
and and uploading is continuing. In the event of an error during uploading, the phase will change to
|
||||
UploadingPartialFailure. On success, the phase changes to Completed. The backup cannot be
|
||||
restored from when it is in the Uploading state.
|
||||
|
||||
### UploadingPartialFailure (new)
|
||||
The "UploadingPartialFailure" phase signifies that the main part of the backup, including snapshotting has completed,
|
||||
but there were partial failures either during the main part or during the uploading. The backup cannot be
|
||||
restored from when it is in the UploadingPartialFailure state.
|
||||
|
||||
### Failed
|
||||
When a backup has had fatal errors it is marked as "Failed" This backup cannot be restored from.
|
||||
|
||||
### Completed
|
||||
The "Completed" phase signifies that the backup has completed, all data has been transferred to stable storage
|
||||
and the backup is ready to be used in a restore. When the Completed phase has been reached it is safe
|
||||
to remove any of the items that were backed up.
|
||||
|
||||
### PartiallyFailed
|
||||
The "PartiallyFailed" phase signifies that the backup has completed and at least part of the backup is usable.
|
||||
Restoration from a PartiallyFailed backup will not result in a complete restoration but pieces may be available.
|
||||
|
||||
## Workflow
|
||||
|
||||
When a BackupAction is executed, any SnapshotItemAction or VolumeSnapshot plugins will return snapshot IDs.
|
||||
The plugin should be able to provide status on
|
||||
the progress for the snapshot and handle cancellation of the upload if the snapshot is deleted.
|
||||
If the plugin is restarted, the snapshot ID should remain valid.
|
||||
|
||||
When all snapshots have been taken and Kubernetes resources have been persisted to the ObjectStorePlugin
|
||||
the backup will either have fatal errors or will be at least partially usable.
|
||||
|
||||
If the backup has fatal errors it will move to the "Failed" state and finish. If a backup fails, the upload will not be
|
||||
cancelled but it will not be monitored either. For backups in any phase, all snapshots will be deleted when the backup
|
||||
is deleted. Plugins will cancel any data movement and
|
||||
remove snapshots and other associated resources when the VolumeSnapshotter DeleteSnapshot method or
|
||||
DeleteItemAction Execute method is called.
|
||||
|
||||
Velero will poll the plugins for status on the snapshots when the backup exits the "InProgress" phase and
|
||||
has no fatal errors.
|
||||
|
||||
If any snapshots are not complete, the backup will move to either Uploading or UploadingPartialFailure or Failed.
|
||||
|
||||
Post-snapshot operations may take a long time and Velero and its plugins may be restarted during
|
||||
this time. Once a backup has moved into the Uploading or UploadingPartialFailure phase, another
|
||||
backup may be started.
|
||||
|
||||
While in the Uploading or UploadingPartialFailure phase, the snapshots and backup items will be periodically polled.
|
||||
When all of the snapshots and backup items have reported success, the backup will move to the Completed or
|
||||
PartiallyFailed phase, depending on whether the backup was in the Uploading or UploadingPartialFailure phase.
|
||||
|
||||
The Backup resources will not be written to object storage until the backup has entered a final phase:
|
||||
Completed, Failed or PartialFailure
|
||||
## Reconciliation of InProgress backups
|
||||
|
||||
InProgress backups will not have a `velero-backup.json` present in the object store. During reconciliation, backups which
|
||||
do not have a `velero-backup.json` object in the object store will be ignored.
|
||||
|
||||
## Plug-in API changes
|
||||
|
||||
### UploadProgress struct
|
||||
|
||||
type UploadProgress struct {
|
||||
completed bool // True when the operation has completed, either successfully or with a failure
|
||||
err error // Set when the operation has failed
|
||||
itemsCompleted, itemsToComplete int64 // The number of items that have been completed and the items to complete
|
||||
// For a disk, an item would be a byte and itemsToComplete would be the
|
||||
// total size to transfer (may be less than the size of a volume if
|
||||
// performing an incremental) and itemsCompleted is the number of bytes
|
||||
// transferred. On successful completion, itemsCompleted and itemsToComplete
|
||||
// should be the same
|
||||
started, updated time.Time // When the upload was started and when the last update was seen. Not all
|
||||
// systems retain when the upload was begun, return Time 0 (time.Unix(0, 0))
|
||||
// if unknown.
|
||||
}
|
||||
|
||||
### VolumeSnapshotter changes
|
||||
|
||||
A new method will be added to the VolumeSnapshotter interface (details depending on plug-in versioning spec)
|
||||
|
||||
UploadProgress(snapshotID string) (UploadProgress, error)
|
||||
|
||||
UploadProgress will report the current status of a snapshot upload. This should be callable at any time after the snapshot
|
||||
has been taken. In the event a plug-in is restarted, if the snapshotID continues to be valid it should be possible to
|
||||
retrieve the progress.
|
||||
|
||||
`error` is set if there is an issue retrieving progress. If the snapshot is has encountered an error during the upload,
|
||||
the error should be return in UploadProgress and error should be nil.
|
||||
|
||||
### SnapshotItemAction plug-in
|
||||
|
||||
Currently CSI snapshots and the Velero Plug-in for vSphere are implemented as BackupItemAction plugins. The majority of
|
||||
BackupItemAction plugins do not take snapshots or upload data so rather than modify BackupItemAction we introduce a new
|
||||
plug-ins, SnapshotItemAction. SnapshotItemAction will be used in place of BackupItemAction for
|
||||
the CSI snapshots and the Velero Plug-in for vSphere and will return a snapshot ID in addition to the item itself.
|
||||
|
||||
The SnapshotItemAction plugin identifier as well as the Item and Snapshot ID will be stored in the
|
||||
`<backup-name>-itemsnapshots.json.gz`. When checking for progress, this info will be used to select the appropriate
|
||||
SnapshotItemAction plugin to query for progress.
|
||||
|
||||
_NotApplicable_ should only be returned if the SnapshotItemAction plugin should not be handling the item. If the
|
||||
SnapshotItemAction plugin should handle the item but, for example, the item/snapshot ID cannot be found to report progress, a
|
||||
UploadProgress struct with the error set appropriately (in this case _NotFound_) should be returned.
|
||||
|
||||
// SnapshotItemAction is an actor that snapshots an individual item being backed up (it may also do other
|
||||
operations on the item that is returned).
|
||||
|
||||
type SnapshotItemAction interface {
|
||||
// AppliesTo returns information about which resources this action should be invoked for.
|
||||
// A BackupItemAction's Execute function will only be invoked on items that match the returned
|
||||
// selector. A zero-valued ResourceSelector matches all resources.
|
||||
AppliesTo() (ResourceSelector, error)
|
||||
|
||||
// Execute allows the ItemAction to perform arbitrary logic with the item being backed up,
|
||||
// including mutating the item itself prior to backup. The item (unmodified or modified)
|
||||
// should be returned, along with an optional slice of ResourceIdentifiers specifying
|
||||
// additional related items that should be backed up.
|
||||
Execute(item runtime.Unstructured, backup *api.Backup) (runtime.Unstructured, snapshotID string,
|
||||
[]ResourceIdentifier, error)
|
||||
|
||||
// Progress
|
||||
Progress(input *SnapshotItemProgressInput) (UploadProgress, error)
|
||||
}
|
||||
|
||||
// SnapshotItemProgressInput contains the input parameters for the SnapshotItemAction's Progress function.
|
||||
type SnapshotItemProgressInput struct {
|
||||
// Item is the item that was stored in the backup
|
||||
Item runtime.Unstructured
|
||||
// SnapshotID is the snapshot ID returned by SnapshotItemAction
|
||||
SnapshotID string
|
||||
// Backup is the representation of the restore resource processed by Velero.
|
||||
Backup *velerov1api.Backup
|
||||
}
|
||||
|
||||
|
||||
## Changes in Velero backup format
|
||||
|
||||
No changes to the existing format are introduced by this change. A `<backup-name>-itemsnapshots.json.gz` file will be
|
||||
added that contains the items and snapshot IDs returned by ItemSnapshotAction. Also, the creation of the
|
||||
`velero-backup.json` object will not occur until the backup moves to one of the terminal phases (_Completed_,
|
||||
_PartiallyFailed_, or _Failed_). Reconciliation should ignore backups that do not have a `velero-backup.json` object.
|
||||
|
||||
The cluster that is creating the backup will have the Backup resource present and will be able to manage the backup
|
||||
before the backup completes.
|
||||
|
||||
If the Backup resource is removed (e.g. Velero is uninstalled) before a backup completes and writes its
|
||||
`velero-backup.json` object, the other objects in the object store for the backup will be effectively orphaned. This
|
||||
can currently happen but the current window is much smaller.
|
||||
|
||||
### `<backup-name>-itemsnapshots.json.gz`
|
||||
The itemsnapshots file is similar to the existing `<backup-name>-itemsnapshots.json.gz` Each snapshot taken via
|
||||
SnapshotItemAction will have a JSON record in the file. Exact format TBD.
|
||||
|
||||
## CSI snapshots
|
||||
|
||||
For systems such as EBS, a snapshot is not available until the storage system has transferred the snapshot to
|
||||
stable storage. CSI snapshots expose the _readyToUse_ state that, in the case of EBS, indicates that the snapshot
|
||||
has been transferred to durable storage and is ready to be used. The CSI BackupItemProgress.Progress method will
|
||||
poll that field and when completed, return completion.
|
||||
|
||||
## vSphere plug-in
|
||||
|
||||
The vSphere Plug-in for Velero uploads snapshots to S3 in the background. This is also a BackupItemAction plug-in,
|
||||
it will check the status of the Upload records for the snapshot and return progress.
|
||||
|
||||
## Backup workflow changes
|
||||
|
||||
The backup workflow remains the same until we get to the point where the `velero-backup.json` object is written.
|
||||
At this point, we will queue the backup to a finalization go-routine. The next backup may then begin. The finalization
|
||||
routine will run across all of the volume snapshots and call the _UploadProgress_ method on each of them. It will
|
||||
then run across all items and call _BackupItemProgress.Progress_ for any that match with a BackupItemProgress.
|
||||
|
||||
If all snapshots and backup items have finished uploading (either successfully or failed), the backup will be completed
|
||||
and the backup will move to the appropriate terminal phase and upload the `velero-backup.json` object to the object store
|
||||
and the backup will be complete.
|
||||
|
||||
If any of the snapshots or backup items are still being processed, the phase of the backup will be set to the appropriate
|
||||
phase (_Uploading_ or _UploadingPartialFailure_). In the event of any of the upload progress checks return an error, the
|
||||
phase will move to _UploadingPartialFailure_. The backup will then be requeued and will be rechecked again after some
|
||||
time has passed.
|
||||
|
||||
## Restart workflow
|
||||
On restart, the Velero server will scan all Backup resources. Any Backup resources which are in the _InProgress_ phase
|
||||
will be moved to the _Failed_ phase. Any Backup resources in the _Oploading_ or _OploadingPartialFailure_ phase will
|
||||
be treated as if they have been requeued and progress checked and the backup will be requeued or moved to a terminal
|
||||
phase as appropriate.
|
||||
|
||||
# Implementation tasks
|
||||
|
||||
VolumeSnapshotter new plugin APIs
|
||||
BackupItemProgress new plugin interface
|
||||
New backup phases
|
||||
Defer uploading `velero-backup.json`
|
||||
AWS EBS plug-in UploadProgress implementation
|
||||
Upload monitoring
|
||||
Implementation of `<backup-name>-itemsnapshots.json.gz` file
|
||||
Restart logic
|
||||
Change in reconciliation logic to ignore backups that have not completed
|
||||
CSI plug-in BackupItemProgress implementation
|
||||
vSphere plug-in BackupItemProgress implementation (vSphere plug-in team)
|
||||
|
||||
# Future Fragile/Durable snapshot tracking
|
||||
Futures are here for reference, they may change radically when actually implemented.
|
||||
|
||||
Some storage systems have the ability to provide different levels of protection for snapshots. These are termed "Fragile"
|
||||
and "Durable". Currently, Velero expects snapshots to be Durable (they should be able to survive the destruction of the
|
||||
cluster and the storage it is using). In the future we would like the ability to take advantage of snapshots that are
|
||||
Fragile. For example, vSphere snapshots are Fragile (they reside in the same datastore as the virtual disk). The Velero
|
||||
Plug-in for vSphere uses a vSphere local/fragile snapshot to get a consistent snapshot, then uploads the data to S3 to
|
||||
make it Durable. In the current design, upload progress will not be complete until the snapshot is ready to use and
|
||||
Durable. It is possible, however, to restore data from a vSphere snapshot before it has been made Durable, and this is a
|
||||
capability we'd like to expose in the future. Other storage systems implement this functionality as well. We will be moving
|
||||
the control of the data movement from the vSphere plug-in into Velero.
|
||||
|
||||
Some storage system, such as EBS, are only capable of creating Durable snapshots. There is no usable intermediate Fragile stage.
|
||||
|
||||
For a Velero backup, users should be able to specify whether they want a Durable backup or a Fragile backup (Fragile backups
|
||||
may consume less resources, be quicker to restore from and are suitable for things like backing up a cluster before upgrading
|
||||
software). We can introduce three snapshot states - Creating, Fragile and Durable. A snapshot would be created with a
|
||||
desired state, Fragile or Durable. When the snapshot reaches the desired or higher state (e.g. request was for Fragile but
|
||||
snapshot went to Durable as on EBS), then the snapshot would be completed.
|
Loading…
Reference in New Issue