---
approvers:
- enisoc
- erictune
- foxish
- janetkuo
- kow3ns
- smarterclayton
title: StatefulSets
---
{% capture overview %}
**StatefulSets are a beta feature in 1.7. This feature replaces the
PetSets feature from 1.4. Users of PetSets are referred to the 1.5
[Upgrade Guide](/docs/tasks/manage-stateful-set/upgrade-pet-set-to-stateful-set/)
for further information on how to upgrade existing PetSets to StatefulSets.**

A StatefulSet is a Controller that provides a unique identity to its Pods. It provides
guarantees about the ordering of deployment and scaling.
{% endcapture %}
{% capture body %}
## Using StatefulSets

StatefulSets are valuable for applications that require one or more of the
following:

* Stable, unique network identifiers.
* Stable, persistent storage.
* Ordered, graceful deployment and scaling.
* Ordered, graceful deletion and termination.
* Ordered, automated rolling updates.

In the above, stable is synonymous with persistence across Pod (re)scheduling.
If an application doesn't require any stable identifiers or ordered deployment,
deletion, or scaling, you should deploy your application with a controller that
provides a set of stateless replicas. Controllers such as
[Deployment](/docs/concepts/workloads/controllers/deployment/) or
[ReplicaSet](/docs/concepts/workloads/controllers/replicaset/) may be better suited to your stateless needs.
## Limitations

* StatefulSet is a beta resource, not available in any Kubernetes release prior to 1.5.
* As with all alpha/beta resources, you can disable StatefulSet through the `--runtime-config` option passed to the apiserver.
* The storage for a given Pod must either be provisioned by a [PersistentVolume Provisioner](http://releases.k8s.io/{{page.githubbranch}}/examples/persistent-volume-provisioning/README.md) based on the requested `storage class`, or pre-provisioned by an admin.
* Deleting and/or scaling a StatefulSet down will *not* delete the volumes associated with the StatefulSet. This is done to ensure data safety, which is generally more valuable than an automatic purge of all related StatefulSet resources.
* StatefulSets currently require a [Headless Service](/docs/concepts/services-networking/service/#headless-services) to be responsible for the network identity of the Pods. You are responsible for creating this Service.
## Components

The example below demonstrates the components of a StatefulSet.

* A Headless Service, named nginx, is used to control the network domain.
* The StatefulSet, named web, has a Spec that indicates that 3 replicas of the nginx container will be launched in unique Pods.
* The volumeClaimTemplates will provide stable storage using [PersistentVolumes](/docs/concepts/storage/volumes/) provisioned by a
  PersistentVolume Provisioner.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: gcr.io/google_containers/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: my-storage-class
      resources:
        requests:
          storage: 1Gi
```

## Pod Identity

StatefulSet Pods have a unique identity that consists of an ordinal, a
stable network identity, and stable storage. The identity sticks to the Pod,
regardless of which node it's (re)scheduled on.

### Ordinal Index

For a StatefulSet with N replicas, each Pod in the StatefulSet will be
assigned an integer ordinal, in the range [0,N), that is unique over the Set.

### Stable Network ID

Each Pod in a StatefulSet derives its hostname from the name of the StatefulSet
and the ordinal of the Pod. The pattern for the constructed hostname
is `$(statefulset name)-$(ordinal)`. The example above will create three Pods
named `web-0`, `web-1`, and `web-2`.
A StatefulSet can use a [Headless Service](/docs/concepts/services-networking/service/#headless-services)
to control the domain of its Pods. The domain managed by this Service takes the form:
`$(service name).$(namespace).svc.cluster.local`, where "cluster.local"
is the [cluster domain](http://releases.k8s.io/{{page.githubbranch}}/cluster/addons/dns/README.md).
As each Pod is created, it gets a matching DNS subdomain, taking the form:
`$(podname).$(governing service domain)`, where the governing service is defined
by the `serviceName` field on the StatefulSet.

Here are some examples of choices for Cluster Domain, Service name, and
StatefulSet name, and how they affect the DNS names for the StatefulSet's Pods.

Cluster Domain | Service (ns/name) | StatefulSet (ns/name)  | StatefulSet Domain  | Pod DNS | Pod Hostname |
-------------- | ----------------- | ----------------- | -------------- | ------- | ------------ |
 cluster.local | default/nginx     | default/web       | nginx.default.svc.cluster.local | web-{0..N-1}.nginx.default.svc.cluster.local | web-{0..N-1} |
 cluster.local | foo/nginx         | foo/web           | nginx.foo.svc.cluster.local     | web-{0..N-1}.nginx.foo.svc.cluster.local     | web-{0..N-1} |
 kube.local    | foo/nginx         | foo/web           | nginx.foo.svc.kube.local        | web-{0..N-1}.nginx.foo.svc.kube.local        | web-{0..N-1} |

Note that Cluster Domain will be set to `cluster.local` unless
[otherwise configured](http://releases.k8s.io/{{page.githubbranch}}/cluster/addons/dns/README.md).

### Stable Storage

Kubernetes creates one [PersistentVolume](/docs/concepts/storage/volumes/) for each
VolumeClaimTemplate. In the nginx example above, each Pod will receive a single PersistentVolume
with a StorageClass of `my-storage-class` and 1 GiB of provisioned storage. If no StorageClass
is specified, then the default StorageClass will be used. When a Pod is (re)scheduled
onto a node, its `volumeMounts` mount the PersistentVolumes associated with its
PersistentVolume Claims. Note that the PersistentVolumes associated with the
Pods' PersistentVolume Claims are not deleted when the Pods or the StatefulSet are deleted.
This must be done manually.
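
As a rough illustration, the claim generated for Pod `web-0` from the `www` template in the nginx example might look something like the sketch below. The name `www-web-0` assumes the claim is named after the template and the Pod; the object in a real cluster may carry additional fields that are not shown here.

```yaml
# Illustrative sketch of the PersistentVolumeClaim backing Pod web-0.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: www-web-0              # assumed pattern: <template name>-<Pod name>
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: my-storage-class
  resources:
    requests:
      storage: 1Gi
```

Because a claim like this outlives its Pod, a rescheduled `web-0` can reattach to the same storage.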

## Deployment and Scaling Guarantees

* For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.
* When Pods are being deleted, they are terminated in reverse order, from {N-1..0}.
* Before a scaling operation is applied to a Pod, all of its predecessors must be Running and Ready.
* Before a Pod is terminated, all of its successors must be completely shut down.

The StatefulSet should not specify a `pod.Spec.TerminationGracePeriodSeconds` of 0. This practice is unsafe and strongly discouraged. For further explanation, please refer to [force deleting StatefulSet Pods](/docs/tasks/run-application/force-delete-stateful-set-pod/).

When the nginx example above is created, three Pods will be deployed in the order
web-0, web-1, web-2. web-1 will not be deployed before web-0 is
[Running and Ready](/docs/user-guide/pod-states), and web-2 will not be deployed until
web-1 is Running and Ready. If web-0 should fail, after web-1 is Running and Ready, but before
web-2 is launched, web-2 will not be launched until web-0 is successfully relaunched and
becomes Running and Ready.

If a user were to scale the deployed example by patching the StatefulSet such that
`replicas=1`, web-2 would be terminated first. web-1 would not be terminated until web-2
is fully shut down and deleted. If web-0 were to fail after web-2 has been terminated and
is completely shut down, but prior to web-1's termination, web-1 would not be terminated
until web-0 is Running and Ready.

### Pod Management Policies

In Kubernetes 1.7 and later, StatefulSet allows you to relax its ordering guarantees while
preserving its uniqueness and identity guarantees via its `.spec.podManagementPolicy` field.

#### OrderedReady Pod Management

`OrderedReady` pod management is the default for StatefulSets. It implements the behavior
described [above](#deployment-and-scaling-guarantees).

#### Parallel Pod Management

`Parallel` pod management tells the StatefulSet controller to launch or
terminate all Pods in parallel, and to not wait for Pods to become Running
and Ready or completely terminated prior to launching or terminating another
Pod.
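
As a minimal sketch, the `web` StatefulSet from the nginx example might be created with parallel pod management as shown below. Only the `podManagementPolicy` line differs from that example; the `template` and `volumeClaimTemplates` sections are elided and assumed to be unchanged, so this fragment is not a complete manifest on its own.

```yaml
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  podManagementPolicy: Parallel   # omit this field to get the default, OrderedReady
  # template and volumeClaimTemplates: same as in the nginx example
```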

## Update Strategies

In Kubernetes 1.7 and later, StatefulSet's `.spec.updateStrategy` field allows you to configure
and disable automated rolling updates for containers, labels, resource request/limits, and
annotations for the Pods in a StatefulSet.

### On Delete

The `OnDelete` update strategy implements the legacy (1.6 and prior) behavior. It is the default
strategy when `.spec.updateStrategy` is left unspecified. When a StatefulSet's
`.spec.updateStrategy.type` is set to `OnDelete`, the StatefulSet controller will not automatically
update the Pods in a StatefulSet. Users must manually delete Pods to cause the controller to
create new Pods that reflect modifications made to a StatefulSet's `.spec.template`.
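
For illustration, the fragment below sets this strategy explicitly. It is a partial spec meant to be merged into a StatefulSet such as the nginx example, not a complete manifest.

```yaml
# Partial StatefulSet spec: Pods are only replaced when a user deletes them.
spec:
  updateStrategy:
    type: OnDelete
```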

### Rolling Updates

The `RollingUpdate` update strategy implements automated, rolling updates for the Pods in a
StatefulSet. When a StatefulSet's `.spec.updateStrategy.type` is set to `RollingUpdate`, the
StatefulSet controller will delete and recreate each Pod in the StatefulSet. It will proceed
in the same order as Pod termination (from the largest ordinal to the smallest), updating
each Pod one at a time. It will wait until an updated Pod is Running and Ready prior to
updating its predecessor.
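
A minimal sketch of enabling this strategy is shown below, again as a partial spec rather than a complete manifest. Applied to the three-replica nginx example, a change to `.spec.template` would then be rolled out to web-2 first, followed by web-1 and web-0.

```yaml
# Partial StatefulSet spec: the controller replaces Pods automatically,
# one at a time, from the largest ordinal to the smallest.
spec:
  updateStrategy:
    type: RollingUpdate
```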

#### Partitions

The `RollingUpdate` update strategy can be partitioned, by specifying a
`.spec.updateStrategy.rollingUpdate.partition`. If a partition is specified, all Pods with an
ordinal that is greater than or equal to the partition will be updated when the StatefulSet's
`.spec.template` is updated. All Pods with an ordinal that is less than the partition will not
be updated, and, even if they are deleted, they will be recreated at the previous version. If a
StatefulSet's `.spec.updateStrategy.rollingUpdate.partition` is greater than its `.spec.replicas`,
updates to its `.spec.template` will not be propagated to its Pods.

In most cases you will not need to use a partition, but partitions are useful if you want to stage an
update, roll out a canary, or perform a phased roll out.
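
As a sketch of a canary-style roll out, the partial spec below assumes the three-replica nginx example and a partition value of 2, chosen here purely for illustration. With these values, only web-2 (ordinal 2) would pick up a change to `.spec.template`, while web-0 and web-1 would stay at the previous version.

```yaml
# Partial StatefulSet spec: only Pods with ordinal >= 2 are updated.
spec:
  replicas: 3
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2   # illustrative value; web-0 and web-1 are left untouched
```

Lowering the partition afterwards would extend the update to the remaining Pods, since every ordinal at or above the new value becomes eligible.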
{% endcapture %}
{% capture whatsnext %}
* Follow an example of [deploying a stateful application](/docs/tutorials/stateful-application/basic-stateful-set).
{% endcapture %}
{% include templates/concept.md %}