|
|
|
@ -1,70 +1,492 @@
|
|
|
|
|
---
|
|
|
|
|
approvers:
|
|
|
|
|
- pweil-
|
|
|
|
|
- tallclair
|
|
|
|
|
title: Pod Security Policies
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
Objects of type `PodSecurityPolicy` govern the ability
|
|
|
|
|
to make requests on a pod that affect the `SecurityContext` that will be
|
|
|
|
|
applied to a pod and container.
|
|
|
|
|
{% include feature-state-beta.md %}
|
|
|
|
|
|
|
|
|
|
See [PodSecurityPolicy proposal](https://git.k8s.io/community/contributors/design-proposals/auth/pod-security-policy.md) for more information.
|
|
|
|
|
Pod Security Policies enable fine-grained authorization of pod creation and
|
|
|
|
|
updates.
|
|
|
|
|
|
|
|
|
|
* TOC
|
|
|
|
|
{:toc}
|
|
|
|
|
|
|
|
|
|
## What is a Pod Security Policy?
|
|
|
|
|
|
|
|
|
|
A _Pod Security Policy_ is a cluster-level resource that controls the
|
|
|
|
|
actions that a pod can perform and what it has the ability to access. The
|
|
|
|
|
`PodSecurityPolicy` objects define a set of conditions that a pod must
|
|
|
|
|
run with in order to be accepted into the system. They allow an
|
|
|
|
|
A _Pod Security Policy_ is a cluster-level resource that controls security
|
|
|
|
|
sensitive aspects of the pod specification. The `PodSecurityPolicy` objects
|
|
|
|
|
define a set of conditions that a pod must run with in order to be accepted into
|
|
|
|
|
the system, as well as defaults for the related fields. They allow an
|
|
|
|
|
administrator to control the following:
|
|
|
|
|
|
|
|
|
|
| Control Aspect | Field Name |
|
|
|
|
|
| ---------------------------------------------------------------------- | ------------------------------------------- |
|
|
|
|
|
| Control Aspect | Field Names |
|
|
|
|
|
| ----------------------------------------------------| ------------------------------------------- |
|
|
|
|
|
| Running of privileged containers | `privileged` |
|
|
|
|
|
| Default set of capabilities that will be added to a container | `defaultAddCapabilities` |
|
|
|
|
|
| Capabilities that will be dropped from a container | `requiredDropCapabilities` |
|
|
|
|
|
| Capabilities a container can request to be added | `allowedCapabilities` |
|
|
|
|
|
| Controlling the usage of volume types | [`volumes`](#controlling-volumes) |
|
|
|
|
|
| The use of host networking | [`hostNetwork`](#host-network) |
|
|
|
|
|
| The use of host ports | `hostPorts` |
|
|
|
|
|
| The use of host's PID namespace | `hostPID` |
|
|
|
|
|
| The use of host's IPC namespace | `hostIPC` |
|
|
|
|
|
| Usage of the root namespaces | [`hostPID`, `hostIPC`](#host-namespaces) |
|
|
|
|
|
| Usage of host networking and ports | [`hostNetwork`, `hostPorts`](#host-namespaces) |
|
|
|
|
|
| Usage of volume types | [`volumes`](#volumes-and-file-systems) |
|
|
|
|
|
| Usage of the host filesystem | [`allowedHostPaths`](#volumes-and-file-systems) |
|
|
|
|
|
| Allocating an FSGroup that owns the pod's volumes | [`fsGroup`](#volumes-and-file-systems) |
|
|
|
|
|
| Requiring the use of a read only root file system | [`readOnlyRootFilesystem`](#volumes-and-file-systems) |
|
|
|
|
|
| The user and group IDs of the container | [`runAsUser`, `supplementalGroups`](#users-and-groups) |
|
|
|
|
|
| Restricting escalation to root privileges | [`allowPrivilegeEscalation`, `defaultAllowPrivilegeEscalation`](#privilege-escalation) |
|
|
|
|
|
| Linux capabilities | [`defaultAddCapabilities`, `requiredDropCapabilities`, `allowedCapabilities`](#capabilities) |
|
|
|
|
|
| The SELinux context of the container | [`seLinux`](#selinux) |
|
|
|
|
|
| The user ID | [`runAsUser`](#runasuser) |
|
|
|
|
|
| Configuring allowable supplemental groups | [`supplementalGroups`](#supplementalgroups) |
|
|
|
|
|
| Allocating an FSGroup that owns the pod's volumes | [`fsGroup`](#fsgroup) |
|
|
|
|
|
| Requiring the use of a read only root file system | `readOnlyRootFilesystem` |
|
|
|
|
|
| Running of a container that allow privilege escalation from its parent | [`allowPrivilegeEscalation`](#allowprivilegeescalation) |
|
|
|
|
|
| Control whether a process can gain more privileges than its parent process | [`defaultAllowPrivilegeEscalation`](#defaultallowprivilegeescalation) |
|
|
|
|
|
| Whitelist of allowed host paths | [`allowedHostPaths`](#allowedhostpaths) |
|
|
|
|
|
|
|
|
|
|
_Pod Security Policies_ are comprised of settings and strategies that
|
|
|
|
|
control the security features a pod has access to. These settings fall
|
|
|
|
|
into three categories:
|
|
|
|
|
|
|
|
|
|
- *Controlled by a Boolean*: Fields of this type default to the most
|
|
|
|
|
restrictive value.
|
|
|
|
|
- *Controlled by an allowable set*: Fields of this type are checked
|
|
|
|
|
against the set to ensure their values are allowed.
|
|
|
|
|
- *Controlled by a strategy*: Items that have a strategy to provide
|
|
|
|
|
a mechanism to generate the value and a mechanism to ensure that a
|
|
|
|
|
specified value falls into the set of allowable values.
|
|
|
|
|
| The AppArmor profile used by containers | [annotations](#apparmor) |
|
|
|
|
|
| The seccomp profile used by containers | [annotations](#seccomp) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Strategies
|
|
|
|
|
## Enabling Pod Security Policies
|
|
|
|
|
|
|
|
|
|
### RunAsUser
|
|
|
|
|
Pod security policy control is implemented as an optional (but recommended)
|
|
|
|
|
[admission
|
|
|
|
|
controller](/docs/admin/admission-controllers/#podsecuritypolicy). PodSecurityPolicies
|
|
|
|
|
are enforced by [enabling the admission
|
|
|
|
|
controller](/docs/admin/admission-controllers/#how-do-i-turn-on-an-admission-control-plug-in),
|
|
|
|
|
but doing so without authorizing any policies **will prevent any pods from being
|
|
|
|
|
created** in the cluster.
|
|
|
|
|
|
|
|
|
|
- *MustRunAs* - Requires a `range` to be configured. Uses the first value
|
|
|
|
|
of the range as the default. Validates against the configured range.
|
|
|
|
|
Since the pod security policy API (`extensions/v1beta1/podsecuritypolicy`) is
|
|
|
|
|
enabled independently of the admission controller, for existing clusters it is
|
|
|
|
|
recommended that policies are added and authorized before enabling the admission
|
|
|
|
|
controller.
|
|
|
|
|
|
|
|
|
|
## Authorizing Policies
|
|
|
|
|
|
|
|
|
|
When a PodSecurityPolicy resource is created, it does nothing. In order to use
|
|
|
|
|
it, the requesting user or target pod's [service
|
|
|
|
|
account](/docs/tasks/configure-pod-container/configure-service-account/) must be
|
|
|
|
|
authorized to use the policy, by allowing the `use` verb on the policy.
|
|
|
|
|
|
|
|
|
|
Most Kubernetes pods are not created directly by users. Instead, they are
|
|
|
|
|
typically created indirectly as part of a
|
|
|
|
|
[Deployment](/docs/concepts/workloads/controllers/deployment/),
|
|
|
|
|
[ReplicaSet](/docs/concepts/workloads/controllers/replicaset/), or other
|
|
|
|
|
templated controller via the controller manager. Granting the controller access
|
|
|
|
|
to the policy would grant access for *all* pods created by that the controller,
|
|
|
|
|
so the preferred method for authorizing policies is to grant access to the
|
|
|
|
|
pod's service account (see [example](#run-another-pod)).
|
|
|
|
|
|
|
|
|
|
### Via RBAC
|
|
|
|
|
|
|
|
|
|
[RBAC](/docs/admin/authorization/rbac/) is a standard Kubernetes authorization
|
|
|
|
|
mode, and can easily be used to authorize use of policies.
|
|
|
|
|
|
|
|
|
|
First, a `Role` or `ClusterRole` needs to grant access to `use` the desired
|
|
|
|
|
policies. The rules to grant access look like this:
|
|
|
|
|
|
|
|
|
|
```yaml
|
|
|
|
|
kind: ClusterRole
|
|
|
|
|
apiVersion: rbac.authorization.k8s.io/v1
|
|
|
|
|
metadata:
|
|
|
|
|
name: <role name>
|
|
|
|
|
rules:
|
|
|
|
|
- apiGroups: ['extensions']
|
|
|
|
|
resources: ['podsecuritypolicies']
|
|
|
|
|
verbs: ['use']
|
|
|
|
|
resourceNames:
|
|
|
|
|
- <list of policies to authorize>
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Then the `(Cluster)Role` is bound to the authorized user(s):
|
|
|
|
|
|
|
|
|
|
```yaml
|
|
|
|
|
kind: ClusterRoleBinding
|
|
|
|
|
apiVersion: rbac.authorization.k8s.io/v1
|
|
|
|
|
metadata:
|
|
|
|
|
name: <binding name>
|
|
|
|
|
roleRef:
|
|
|
|
|
kind: ClusterRole
|
|
|
|
|
name: <role name>
|
|
|
|
|
apiGroup: rbac.authorization.k8s.io
|
|
|
|
|
subjects:
|
|
|
|
|
# Authorize specific service accounts:
|
|
|
|
|
- kind: ServiceAccount
|
|
|
|
|
name: <authorized service account name>
|
|
|
|
|
namespace: <authorized pod namespace>
|
|
|
|
|
# Authorize specific users (not recommended):
|
|
|
|
|
- kind: User
|
|
|
|
|
apiGroup: rbac.authorization.k8s.io
|
|
|
|
|
name: <authorized user name>
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
If a `RoleBinding` (not a `ClusterRoleBinding`) is used, it will only grant
|
|
|
|
|
usage for pods being run in the same namespace as the binding. This can be
|
|
|
|
|
paired with system groups to grant access to all pods run in the namespace:
|
|
|
|
|
```yaml
|
|
|
|
|
# Authorize all service accounts in a namespace:
|
|
|
|
|
- kind: Group
|
|
|
|
|
apiGroup: rbac.authorization.k8s.io
|
|
|
|
|
name: system:serviceaccounts
|
|
|
|
|
# Or equivalently, all authenticated users in a namespace:
|
|
|
|
|
- kind: Group
|
|
|
|
|
apiGroup: rbac.authorization.k8s.io
|
|
|
|
|
name: system:authenticated
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
For more examples of RBAC bindings, see [Role Binding
|
|
|
|
|
Examples](docs/admin/authorization/rbac/#role-binding-examples). For a complete
|
|
|
|
|
example of authorizing a PodSecurityPolicy, see
|
|
|
|
|
[below](#example).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Troubleshooting
|
|
|
|
|
|
|
|
|
|
- The [Controller Manager](/docs/admin/kube-controller-manager/) must be run
|
|
|
|
|
against [the secured API port](/docs/admin/accessing-the-api/), and must not
|
|
|
|
|
have superuser permissions. Otherwise requests would bypass authentication and
|
|
|
|
|
authorization modules, all PodSecurityPolicy objects would be allowed, and users
|
|
|
|
|
would be able to create privileged containers. For more details on configuring
|
|
|
|
|
Controller Manager authorization, see [Controller
|
|
|
|
|
Roles](docs/admin/authorization/rbac/#controller-roles).
|
|
|
|
|
|
|
|
|
|
## Policy Order
|
|
|
|
|
|
|
|
|
|
In addition to restricting pod creation and update, pod security policies can
|
|
|
|
|
also be used to provide default values for many of the fields that it
|
|
|
|
|
controls. When multiple policies are available, the pod security policy
|
|
|
|
|
controller selects policies in the following order:
|
|
|
|
|
|
|
|
|
|
1. If any policies successfully validate the pod without altering it, they are
|
|
|
|
|
used.
|
|
|
|
|
2. Otherwise, the first valid policy in alphabetical order is used.
|
|
|
|
|
|
|
|
|
|
## Example
|
|
|
|
|
|
|
|
|
|
_This example assumes you have a running cluster with the PodSecurityPolicy
|
|
|
|
|
admission controller enabled and you have cluster admin privileges._
|
|
|
|
|
|
|
|
|
|
### Set up
|
|
|
|
|
|
|
|
|
|
Set up a namespace and a service account to act as for this example. We'll use
|
|
|
|
|
this service account to mock a non-admin user.
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl create namespace psp-example
|
|
|
|
|
$ kubectl create serviceaccount -n psp-example fake-user
|
|
|
|
|
$ kubectl create rolebinding -n psp-example fake-editor --clusterrole=edit --serviceaccount=psp-example:fake-user
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
To make it clear which user we're acting as and save some typing, create 2
|
|
|
|
|
aliases:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ alias kubectl-admin='kubectl -n psp-example'
|
|
|
|
|
$ alias kubectl-user='kubectl --as=system:serviceaccount:psp-example:fake-user -n psp-example'
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
### Create a policy and a pod
|
|
|
|
|
|
|
|
|
|
Define the example PodSecurityPolicy object in a file. This is a policy that
|
|
|
|
|
simply prevents the creation of privileged pods.
|
|
|
|
|
|
|
|
|
|
{% include code.html language="yaml" file="example-psp.yaml" ghlink="/docs/concepts/policy/example-psp.yaml" %}
|
|
|
|
|
|
|
|
|
|
And create it with kubectl:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-admin create -f example-psp.yaml
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Now, as the unprivileged user, try to create a simple pod:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-user create -f- <<EOF
|
|
|
|
|
apiVersion: v1
|
|
|
|
|
kind: Pod
|
|
|
|
|
metadata:
|
|
|
|
|
name: pause
|
|
|
|
|
spec:
|
|
|
|
|
containers:
|
|
|
|
|
- name: pause
|
|
|
|
|
image: gcr.io/google-containers/pause
|
|
|
|
|
EOF
|
|
|
|
|
Error from server (Forbidden): error when creating "STDIN": pods "pause" is forbidden: unable to validate against any pod security policy: []
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
**What happened?** Although the PodSecurityPolicy was created, neither the
|
|
|
|
|
pod's service account nor `fake-user` have permission to use the new policy:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-user auth can-i use podsecuritypolicy/example
|
|
|
|
|
no
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Create the rolebinding to grant `fake-user` the `use` verb on the example
|
|
|
|
|
policy:
|
|
|
|
|
|
|
|
|
|
_Note: This is not the recommended way! See the [next section](#run-another-pod)
|
|
|
|
|
for the preferred approach._
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-admin create role psp:unprivileged \
|
|
|
|
|
--verb=use \
|
|
|
|
|
--resource=podsecuritypolicy \
|
|
|
|
|
--resource-name=example
|
|
|
|
|
role "psp:unprivileged" created
|
|
|
|
|
$ kubectl-admin create rolebinding fake-user:psp:unprivileged \
|
|
|
|
|
--role=psp:unprivileged \
|
|
|
|
|
--serviceaccount=psp-example:fake-user
|
|
|
|
|
rolebinding "fake-user:psp:unprivileged" created
|
|
|
|
|
$ kubectl-user auth can-i use podsecuritypolicy/example
|
|
|
|
|
yes
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Now retry creating the pod:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-user create -f- <<EOF
|
|
|
|
|
apiVersion: v1
|
|
|
|
|
kind: Pod
|
|
|
|
|
metadata:
|
|
|
|
|
name: pause
|
|
|
|
|
spec:
|
|
|
|
|
containers:
|
|
|
|
|
- name: pause
|
|
|
|
|
image: gcr.io/google-containers/pause
|
|
|
|
|
EOF
|
|
|
|
|
pod "pause" created
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
It works as expected! But any attempts to create a privileged pod should still
|
|
|
|
|
be denied:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-user create -f- <<EOF
|
|
|
|
|
apiVersion: v1
|
|
|
|
|
kind: Pod
|
|
|
|
|
metadata:
|
|
|
|
|
name: privileged
|
|
|
|
|
spec:
|
|
|
|
|
containers:
|
|
|
|
|
- name: pause
|
|
|
|
|
image: gcr.io/google-containers/pause
|
|
|
|
|
securityContext:
|
|
|
|
|
privileged: true
|
|
|
|
|
EOF
|
|
|
|
|
Error from server (Forbidden): error when creating "STDIN": pods "privileged" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Delete the pod before moving on:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-user delete pause
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
### Run another pod
|
|
|
|
|
|
|
|
|
|
Let's try that again, slightly differently:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-user run pause --image=gcr.io/google-containers/pause
|
|
|
|
|
deployment "pause" created
|
|
|
|
|
$ kubectl-user get pods
|
|
|
|
|
No resources found.
|
|
|
|
|
$ kubectl-user get events | head -n 2
|
|
|
|
|
LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
|
|
|
|
|
1m 2m 15 pause-7774d79b5 ReplicaSet Warning FailedCreate replicaset-controller Error creating: pods "pause-7774d79b5-" is forbidden: no providers available to validate pod request
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
**What happened?** We already bound the `psp:unprivileged` role for our `fake-user`,
|
|
|
|
|
why are we getting the error `Error creating: pods "pause-7774d79b5-" is
|
|
|
|
|
forbidden: no providers available to validate pod request`? The answer lies in
|
|
|
|
|
the source - `replicaset-controller`. Fake-user successfully created the
|
|
|
|
|
deployment (which successfully created a replicaset), but when the replicaset
|
|
|
|
|
went to create the pod it was not authorized to use the example
|
|
|
|
|
podsecuritypolicy.
|
|
|
|
|
|
|
|
|
|
In order to fix this, bind the `psp:unprivileged` role to the pod's service
|
|
|
|
|
account instead. In this case (since we didn't specify it) the service account
|
|
|
|
|
is `default`:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-admin create rolebinding default:psp:unprivileged \
|
|
|
|
|
--role=psp:unprivileged \
|
|
|
|
|
--serviceaccount=psp-example:default
|
|
|
|
|
rolebinding "default:psp:unprivileged" created
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Now if you give it a minute to retry, the replicaset-controller should
|
|
|
|
|
eventually succeed in creating the pod:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-user get pods --watch
|
|
|
|
|
NAME READY STATUS RESTARTS AGE
|
|
|
|
|
pause-7774d79b5-qrgcb 0/1 Pending 0 1s
|
|
|
|
|
pause-7774d79b5-qrgcb 0/1 Pending 0 1s
|
|
|
|
|
pause-7774d79b5-qrgcb 0/1 ContainerCreating 0 1s
|
|
|
|
|
pause-7774d79b5-qrgcb 1/1 Running 0 2s
|
|
|
|
|
^C
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
### Clean up
|
|
|
|
|
|
|
|
|
|
Delete the namespace to clean up most of the example resources:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-admin delete ns psp-example
|
|
|
|
|
namespace "psp-example" deleted
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Note that `PodSecurityPolicy` resources are not namespaced, and must be cleaned
|
|
|
|
|
up separately:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl-admin delete psp example
|
|
|
|
|
podsecuritypolicy "example" deleted
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
### Example Policies
|
|
|
|
|
|
|
|
|
|
This is the least restricted policy you can create, equivalent to not using the
|
|
|
|
|
pod security policy admission controller:
|
|
|
|
|
|
|
|
|
|
{% include code.html language="yaml" file="privileged-psp.yaml" ghlink="/docs/concepts/policy/privileged-psp.yaml" %}
|
|
|
|
|
|
|
|
|
|
This is an example of a restrictive policy that requires users to run as an
|
|
|
|
|
unprivileged user, blocks possible escalations to root, and requires use of
|
|
|
|
|
several security mechanisms.
|
|
|
|
|
|
|
|
|
|
{% include code.html language="yaml" file="restricted-psp.yaml" ghlink="/docs/concepts/policy/restricted-psp.yaml" %}
|
|
|
|
|
|
|
|
|
|
## Policy Reference
|
|
|
|
|
|
|
|
|
|
### Host namespaces
|
|
|
|
|
|
|
|
|
|
**HostPID** - Controls whether the pod containers can share the host process ID
|
|
|
|
|
namespace. Note that when paired with ptrace this can be used to escalate
|
|
|
|
|
privileges outside of the container (ptrace is forbidden by default).
|
|
|
|
|
|
|
|
|
|
**HostIPC** - Controls whether the pod containers can share the host IPC
|
|
|
|
|
namespace.
|
|
|
|
|
|
|
|
|
|
**HostNetwork** - Controls whether the pod may use the node network
|
|
|
|
|
namespace. Doing so gives the pod access to the loopback device, services
|
|
|
|
|
listening on localhost, and could be used to snoop on network activity of other
|
|
|
|
|
pods on the same node.
|
|
|
|
|
|
|
|
|
|
**HostPorts** - Provides a whitelist of ranges of allowable ports in the host
|
|
|
|
|
network namespace. Defined as a list of `HostPortRange`, with `min`(inclusive)
|
|
|
|
|
and `max`(inclusive). Defaults to no allowed host ports.
|
|
|
|
|
|
|
|
|
|
**AllowedHostPaths** - See [Volumes and file systems](#volumes-and-file-systems).
|
|
|
|
|
|
|
|
|
|
### Volumes and file systems
|
|
|
|
|
|
|
|
|
|
**Volumes** - Provides a whitelist of allowed volume types. The allowable values
|
|
|
|
|
correspond to the volume sources that are defined when creating a volume. For
|
|
|
|
|
the complete list of volume types, see [Types of
|
|
|
|
|
Volumes](/docs/concepts/storage/volumes/#types-of-volumes). Additionally, `*`
|
|
|
|
|
may be used to allow all volume types.
|
|
|
|
|
|
|
|
|
|
The **recommended minimum set** of allowed volumes for new PSPs are:
|
|
|
|
|
|
|
|
|
|
- configMap
|
|
|
|
|
- downwardAPI
|
|
|
|
|
- emptyDir
|
|
|
|
|
- persistentVolumeClaim
|
|
|
|
|
- secret
|
|
|
|
|
- projected
|
|
|
|
|
|
|
|
|
|
**FSGroup** - Controls the supplemental group applied to some volumes.
|
|
|
|
|
|
|
|
|
|
- *MustRunAs* - Requires at least one `range` to be specified. Uses the
|
|
|
|
|
minimum value of the first range as the default. Validates against all ranges.
|
|
|
|
|
- *RunAsAny* - No default provided. Allows any `fsGroup` ID to be specified.
|
|
|
|
|
|
|
|
|
|
**AllowedHostPaths** - This specifies a whitelist of host paths that are allowed
|
|
|
|
|
to be used by hostPath volumes. An empty list means there is no restriction on
|
|
|
|
|
host paths used. This is defined as a list of objects with a single `pathPrefix`
|
|
|
|
|
field, which allows hostPath volumes to mount a path that begins with an
|
|
|
|
|
allowed prefix. For example:
|
|
|
|
|
|
|
|
|
|
```yaml
|
|
|
|
|
allowedHostPaths:
|
|
|
|
|
# This allows "/foo", "/foo/", "/foo/bar" etc., but
|
|
|
|
|
# disallows "/fool", "/etc/foo" etc.
|
|
|
|
|
# "/foo/../" is never valid.
|
|
|
|
|
- pathPrefix: "/foo"
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
_Note: There are many ways a container with unrestricted access to the host
|
|
|
|
|
filesystem can escalate privileges, including reading data from other
|
|
|
|
|
containers, and abusing the credentials of system services, such as Kubelet._
|
|
|
|
|
|
|
|
|
|
**ReadOnlyRootFilesystem** - Requires that containers must run with a read-only
|
|
|
|
|
root filesystem (i.e. no writeable layer).
|
|
|
|
|
|
|
|
|
|
### Users and groups
|
|
|
|
|
|
|
|
|
|
**RunAsUser** - Controls the what user ID containers run as.
|
|
|
|
|
|
|
|
|
|
- *MustRunAs* - Requires at least one `range` to be specified. Uses the
|
|
|
|
|
minimum value of the first range as the default. Validates against all ranges.
|
|
|
|
|
- *MustRunAsNonRoot* - Requires that the pod be submitted with a non-zero
|
|
|
|
|
`runAsUser` or have the `USER` directive defined in the image. No default
|
|
|
|
|
provided.
|
|
|
|
|
`runAsUser` or have the `USER` directive defined (using a numeric UID) in the
|
|
|
|
|
image. No default provided. Setting `allowPrivilegeEscalation=false` is strongly
|
|
|
|
|
recommended with this strategy.
|
|
|
|
|
- *RunAsAny* - No default provided. Allows any `runAsUser` to be specified.
|
|
|
|
|
|
|
|
|
|
**SupplementalGroups** - Controls which group IDs containers add.
|
|
|
|
|
|
|
|
|
|
- *MustRunAs* - Requires at least one `range` to be specified. Uses the
|
|
|
|
|
minimum value of the first range as the default. Validates against all ranges.
|
|
|
|
|
- *RunAsAny* - No default provided. Allows any `supplementalGroups` to be
|
|
|
|
|
specified.
|
|
|
|
|
|
|
|
|
|
### Privilege Escalation
|
|
|
|
|
|
|
|
|
|
These options control the `allowPrivilegeEscalation` container option. This bool
|
|
|
|
|
directly controls whether the
|
|
|
|
|
[`no_new_privs`](https://www.kernel.org/doc/Documentation/prctl/no_new_privs.txt)
|
|
|
|
|
flag gets set on the container process. This flag will prevent `setuid` binaries
|
|
|
|
|
from changing the effective user ID, and prevent files from enabling extra
|
|
|
|
|
capabilities (e.g. it will prevent the use of the `ping` tool). This behavior is
|
|
|
|
|
required to effectively enforce `MustRunAsNonRoot`.
|
|
|
|
|
|
|
|
|
|
It defaults to `nil`. The default behavior of `nil` allows privilege escalation
|
|
|
|
|
so as to not break setuid binaries. Setting it to `false` ensures that no child
|
|
|
|
|
process of a container can gain more privileges than its parent.
|
|
|
|
|
|
|
|
|
|
**AllowPrivilegeEscalation** - Gates whether or not a user is allowed to set the
|
|
|
|
|
security context of a container to `allowPrivilegeEscalation=true`. This
|
|
|
|
|
defaults to allowed. When set to false, the container's
|
|
|
|
|
`allowPrivilegeEscalation` is defaulted to false.
|
|
|
|
|
|
|
|
|
|
**DefaultAllowPrivilegeEscalation** - Sets the default for the
|
|
|
|
|
`allowPrivilegeEscalation` option. The default behavior without this is to allow
|
|
|
|
|
privilege escalation so as to not break setuid binaries. If that behavior is not
|
|
|
|
|
desired, this field can be used to default to disallow, while still permitting
|
|
|
|
|
pods to request `allowPrivilegeEscalation` explicitly.
|
|
|
|
|
|
|
|
|
|
### Capabilities
|
|
|
|
|
|
|
|
|
|
Linux capabilities provide a finer grained breakdown of the privileges
|
|
|
|
|
traditionally associated with the superuser. Some of these capabilities can be
|
|
|
|
|
used to escalate privileges or for container breakout, and may be restricted by
|
|
|
|
|
the PodSecurityPolicy. For more details on Linux capabilities, see
|
|
|
|
|
[capabilities(7)](http://man7.org/linux/man-pages/man7/capabilities.7.html).
|
|
|
|
|
|
|
|
|
|
The following fields take a list of capabilities, specified as the capability
|
|
|
|
|
name in ALL_CAPS without the `CAP_` prefix.
|
|
|
|
|
|
|
|
|
|
**AllowedCapabilities** - Provides a whitelist of capabilities that may be added
|
|
|
|
|
to a container. The default set of capabilities are implicitly allowed. The
|
|
|
|
|
empty set means that no additional capabilities may be added beyond the default
|
|
|
|
|
set. `*` can be used to allow all capabilities.
|
|
|
|
|
|
|
|
|
|
**RequiredDropCapabilities** - The capabilities which must be dropped from
|
|
|
|
|
containers. These capabilities are removed from the default set, and must not be
|
|
|
|
|
added. Capabilities listed in `RequiredDropCapabilities` must not be included in
|
|
|
|
|
`AllowedCapabilities` or `DefaultAddCapabilities`.
|
|
|
|
|
|
|
|
|
|
**DefaultAddCapabilities** - The capabilities which are added to containers by
|
|
|
|
|
default, in addition to the runtime defaults. See the [Docker
|
|
|
|
|
documentation](https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities)
|
|
|
|
|
for the default list of capabilities when using the Docker runtime.
|
|
|
|
|
|
|
|
|
|
### SELinux
|
|
|
|
|
|
|
|
|
|
- *MustRunAs* - Requires `seLinuxOptions` to be configured if not using
|
|
|
|
@ -73,193 +495,29 @@ pre-allocated values. Uses `seLinuxOptions` as the default. Validates against
|
|
|
|
|
- *RunAsAny* - No default provided. Allows any `seLinuxOptions` to be
|
|
|
|
|
specified.
|
|
|
|
|
|
|
|
|
|
### SupplementalGroups
|
|
|
|
|
### AppArmor
|
|
|
|
|
|
|
|
|
|
- *MustRunAs* - Requires at least one range to be specified. Uses the
|
|
|
|
|
minimum value of the first range as the default. Validates against all ranges.
|
|
|
|
|
- *RunAsAny* - No default provided. Allows any `supplementalGroups` to be
|
|
|
|
|
specified.
|
|
|
|
|
Controlled via annotations on the PodSecurityPolicy. Refer to the [AppArmor
|
|
|
|
|
documentation](/docs/tutorials/clusters/apparmor/#podsecuritypolicy-annotations).
|
|
|
|
|
|
|
|
|
|
### FSGroup
|
|
|
|
|
### Seccomp
|
|
|
|
|
|
|
|
|
|
- *MustRunAs* - Requires at least one range to be specified. Uses the
|
|
|
|
|
minimum value of the first range as the default. Validates against the
|
|
|
|
|
first ID in the first range.
|
|
|
|
|
- *RunAsAny* - No default provided. Allows any `fsGroup` ID to be specified.
|
|
|
|
|
The use of seccomp profiles in pods can be controlled via annotations on the
|
|
|
|
|
PodSecurityPolicy. Seccomp is an alpha feature in Kubernetes.
|
|
|
|
|
|
|
|
|
|
### Controlling Volumes
|
|
|
|
|
**seccomp.security.alpha.kubernetes.io/defaultProfileName** - Annotation that
|
|
|
|
|
specifies the default seccomp profile to apply to containers. Possible values
|
|
|
|
|
are:
|
|
|
|
|
|
|
|
|
|
The usage of specific volume types can be controlled by setting the
|
|
|
|
|
volumes field of the PSP. The allowable values of this field correspond
|
|
|
|
|
to the volume sources that are defined when creating a volume:
|
|
|
|
|
- `unconfined` - Seccomp is not applied to the container processes (this is the
|
|
|
|
|
default in Kubernetes), if no alternative is provided.
|
|
|
|
|
- `docker/default` - The Docker default seccomp profile is used.
|
|
|
|
|
- `localhost/<path>` - Specify a profile as a file on the node located at
|
|
|
|
|
`<seccomp_root>/<path>`, where `<seccomp_root>` is defined via the
|
|
|
|
|
`--seccomp-profile-root` flag on the Kubelet.
|
|
|
|
|
|
|
|
|
|
1. azureFile
|
|
|
|
|
1. azureDisk
|
|
|
|
|
1. flocker
|
|
|
|
|
1. flexVolume
|
|
|
|
|
1. hostPath
|
|
|
|
|
1. emptyDir
|
|
|
|
|
1. gcePersistentDisk
|
|
|
|
|
1. awsElasticBlockStore
|
|
|
|
|
1. gitRepo
|
|
|
|
|
1. secret
|
|
|
|
|
1. nfs
|
|
|
|
|
1. iscsi
|
|
|
|
|
1. glusterfs
|
|
|
|
|
1. persistentVolumeClaim
|
|
|
|
|
1. rbd
|
|
|
|
|
1. cinder
|
|
|
|
|
1. cephFS
|
|
|
|
|
1. downwardAPI
|
|
|
|
|
1. fc
|
|
|
|
|
1. configMap
|
|
|
|
|
1. vsphereVolume
|
|
|
|
|
1. quobyte
|
|
|
|
|
1. photonPersistentDisk
|
|
|
|
|
1. projected
|
|
|
|
|
1. portworxVolume
|
|
|
|
|
1. scaleIO
|
|
|
|
|
1. storageos
|
|
|
|
|
1. \* (allow all volumes)
|
|
|
|
|
|
|
|
|
|
The recommended minimum set of allowed volumes for new PSPs are
|
|
|
|
|
configMap, downwardAPI, emptyDir, persistentVolumeClaim, secret, and projected.
|
|
|
|
|
|
|
|
|
|
### Host Network
|
|
|
|
|
- *HostPorts*, default `empty`. List of `HostPortRange`, defined by `min`(inclusive) and `max`(inclusive), which define the allowed host ports.
|
|
|
|
|
|
|
|
|
|
### AllowPrivilegeEscalation
|
|
|
|
|
|
|
|
|
|
Gates whether or not a user is allowed to set the security context of a container
|
|
|
|
|
to `allowPrivilegeEscalation=true`. This field defaults to `false`.
|
|
|
|
|
|
|
|
|
|
### DefaultAllowPrivilegeEscalation
|
|
|
|
|
|
|
|
|
|
Sets the default for the security context `AllowPrivilegeEscalation` of a container.
|
|
|
|
|
This bool directly controls whether the `no_new_privs` flag gets set on the
|
|
|
|
|
container process. It defaults to `nil`. The default behavior of `nil`
|
|
|
|
|
allows privilege escalation so as to not break setuid binaries. Setting it to `false`
|
|
|
|
|
ensures that no child process of a container can gain more privileges than
|
|
|
|
|
its parent.
|
|
|
|
|
|
|
|
|
|
### AllowedHostPaths
|
|
|
|
|
|
|
|
|
|
This specifies a whitelist of host paths that are allowed to be used by Pods.
|
|
|
|
|
An empty list means there is no restriction on host paths used.
|
|
|
|
|
Each item in the list must specify a string value named `pathPrefix` that
|
|
|
|
|
defines a host path to match. The value cannot be "`*`" though.
|
|
|
|
|
An example is shown below:
|
|
|
|
|
|
|
|
|
|
```yaml
|
|
|
|
|
apiVersion: extensions/v1beta1
|
|
|
|
|
kind: PodSecurityPolicy
|
|
|
|
|
metadata:
|
|
|
|
|
name: custom-paths
|
|
|
|
|
spec:
|
|
|
|
|
allowedHostPaths:
|
|
|
|
|
# This allows "/foo", "/foo/", "/foo/bar" etc., but
|
|
|
|
|
# disallows "/fool", "/etc/foo" etc.
|
|
|
|
|
- pathPrefix: "/foo"
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## Admission
|
|
|
|
|
|
|
|
|
|
[_Admission control_ with `PodSecurityPolicy`](/docs/admin/admission-controllers/#podsecuritypolicy)
|
|
|
|
|
allows for control over the creation and modification of resources based on the
|
|
|
|
|
capabilities allowed in the cluster.
|
|
|
|
|
|
|
|
|
|
Admission uses the following approach to create the final security context for
|
|
|
|
|
the pod:
|
|
|
|
|
|
|
|
|
|
1. Retrieve all PSPs available for use.
|
|
|
|
|
1. Generate field values for security context settings that were not specified
|
|
|
|
|
on the request.
|
|
|
|
|
1. Validate the final settings against the available policies.
|
|
|
|
|
|
|
|
|
|
If a matching policy is found, then the pod is accepted. If the
|
|
|
|
|
request cannot be matched to a PSP, the pod is rejected.
|
|
|
|
|
|
|
|
|
|
A pod must validate every field against the PSP.
|
|
|
|
|
|
|
|
|
|
## Creating a Pod Security Policy
|
|
|
|
|
|
|
|
|
|
Here is an example Pod Security Policy. It has permissive settings for
|
|
|
|
|
all fields
|
|
|
|
|
|
|
|
|
|
{% include code.html language="yaml" file="psp.yaml" ghlink="/docs/concepts/policy/psp.yaml" %}
|
|
|
|
|
|
|
|
|
|
Create the policy by downloading the example file and then running this command:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl create -f ./psp.yaml
|
|
|
|
|
podsecuritypolicy "permissive" created
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## Getting a list of Pod Security Policies
|
|
|
|
|
|
|
|
|
|
To get a list of existing policies, use `kubectl get`:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl get psp
|
|
|
|
|
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
|
|
|
|
|
permissive false [] RunAsAny RunAsAny RunAsAny RunAsAny false [*]
|
|
|
|
|
privileged true [] RunAsAny RunAsAny RunAsAny RunAsAny false [*]
|
|
|
|
|
restricted false [] RunAsAny MustRunAsNonRoot RunAsAny RunAsAny false [emptyDir secret downwardAPI configMap persistentVolumeClaim projected]
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## Editing a Pod Security Policy
|
|
|
|
|
|
|
|
|
|
To modify policy interactively, use `kubectl edit`:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl edit psp permissive
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
This command will open a default text editor where you will be able to modify policy.
|
|
|
|
|
|
|
|
|
|
## Deleting a Pod Security Policy
|
|
|
|
|
|
|
|
|
|
Once you don't need a policy anymore, simply delete it with `kubectl`:
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
|
$ kubectl delete psp permissive
|
|
|
|
|
podsecuritypolicy "permissive" deleted
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## Enabling Pod Security Policies
|
|
|
|
|
|
|
|
|
|
In order to use Pod Security Policies in your cluster you must ensure the
|
|
|
|
|
following
|
|
|
|
|
|
|
|
|
|
1. You have enabled the API type `extensions/v1beta1/podsecuritypolicy` (only for versions prior 1.6)
|
|
|
|
|
1. [You have enabled the admission control plug-in `PodSecurityPolicy`](/docs/admin/admission-controllers/#how-do-i-turn-on-an-admission-control-plug-in)
|
|
|
|
|
1. You have defined your policies
|
|
|
|
|
|
|
|
|
|
## Working With RBAC
|
|
|
|
|
|
|
|
|
|
In Kubernetes 1.5 and newer, you can use PodSecurityPolicy to control access to
|
|
|
|
|
privileged containers based on user role and groups. Access to different
|
|
|
|
|
PodSecurityPolicy objects can be controlled via authorization.
|
|
|
|
|
|
|
|
|
|
Note that [Controller Manager](/docs/admin/kube-controller-manager/) must be run
|
|
|
|
|
against [the secured API port](/docs/admin/accessing-the-api/), and must not
|
|
|
|
|
have superuser permissions. Otherwise requests would bypass authentication and
|
|
|
|
|
authorization modules, all PodSecurityPolicy objects would be allowed,
|
|
|
|
|
and user will be able to create privileged containers.
|
|
|
|
|
|
|
|
|
|
PodSecurityPolicy authorization uses the union of all policies available to the
|
|
|
|
|
user creating the pod and
|
|
|
|
|
[the service account specified on the pod](/docs/tasks/configure-pod-container/configure-service-account/).
|
|
|
|
|
|
|
|
|
|
Access to given PSP policies for a user will be effective only when creating
|
|
|
|
|
Pods directly.
|
|
|
|
|
|
|
|
|
|
For pods created on behalf of a user, in most cases by Controller Manager,
|
|
|
|
|
access should be given to the service account specified on the pod spec
|
|
|
|
|
template. Examples of resources that create pods on behalf of a user are
|
|
|
|
|
Deployments, ReplicaSets, etc.
|
|
|
|
|
|
|
|
|
|
For more details, see the
|
|
|
|
|
[PodSecurityPolicy RBAC example](https://git.k8s.io/examples/staging/podsecuritypolicy/rbac/README.md)
|
|
|
|
|
of applying PodSecurityPolicy to control access to privileged containers based
|
|
|
|
|
on role and groups when deploying Pods directly.
|
|
|
|
|
**seccomp.security.alpha.kubernetes.io/allowedProfileNames** - Annotation that
|
|
|
|
|
specifies which values are allowed for the pod seccomp annotations. Specified as
|
|
|
|
|
a comma-delimited list of allowed values. Possible values are those listed
|
|
|
|
|
above, plus `*` to allow all profiles. Absence of this annotation means that the
|
|
|
|
|
default cannot be changed.
|
|
|
|
|