Merge pull request #23042 from sftim/20200809_update_audit_task

Update cluster auditing task page

commit cb7c4b6431
@@ -9,10 +9,11 @@ title: Auditing
 <!-- overview -->
 
-Kubernetes auditing provides a security-relevant chronological set of records documenting
-the sequence of activities that have affected system by individual users, administrators
-or other components of the system. It allows cluster administrator to
-answer the following questions:
+Kubernetes _auditing_ provides a security-relevant, chronological set of records documenting
+the sequence of actions in a cluster. The cluster audits the activities generated by users,
+by applications that use the Kubernetes API, and by the control plane itself.
+
+Auditing allows cluster administrators to answer the following questions:
 
 - what happened?
 - when did it happen?
@@ -32,7 +33,7 @@ a certain policy and written to a backend. The policy determines what's recorded
 and the backends persist the records. The current backend implementations
 include log files and webhooks.
 
-Each request can be recorded with an associated "stage". The known stages are:
+Each request can be recorded with an associated _stage_. The defined stages are:
 
 - `RequestReceived` - The stage for events generated as soon as the audit
   handler receives the request, and before it is delegated down the handler
@@ -45,19 +46,23 @@ Each request can be recorded with an associated "stage". The known stages are:
 - `Panic` - Events generated when a panic occurred.
 
 {{< note >}}
-The audit logging feature increases the memory consumption of the API server
-because some context required for auditing is stored for each request.
-Additionally, memory consumption depends on the audit logging configuration.
 Audit events are different from the
 [Event](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#event-v1-core)
 API object.
 {{< /note >}}
 
-## Audit Policy
+The audit logging feature increases the memory consumption of the API server
+because some context required for auditing is stored for each request.
+Memory consumption depends on the audit logging configuration.
+
+## Audit policy
 
 Audit policy defines rules about what events should be recorded and what data
 they should include. The audit policy object structure is defined in the
 [`audit.k8s.io` API group](https://github.com/kubernetes/kubernetes/blob/{{< param "githubbranch" >}}/staging/src/k8s.io/apiserver/pkg/apis/audit/v1/types.go).
 When an event is processed, it's
 compared against the list of rules in order. The first matching rule sets the
-"audit level" of the event. The known audit levels are:
+_audit level_ of the event. The defined audit levels are:
 
 - `None` - don't log events that match this rule.
 - `Metadata` - log request metadata (requesting user, timestamp, resource,
@@ -86,26 +91,27 @@ rules:
 - level: Metadata
 ```
 
-The audit profile used by GCE should be used as reference by admins constructing their own audit profiles. You can check the
+If you're crafting your own audit profile, you can use the audit profile for Google Container-Optimized OS as a starting point. You can check the
 [configure-helper.sh](https://github.com/kubernetes/kubernetes/blob/{{< param "githubbranch" >}}/cluster/gce/gci/configure-helper.sh)
-script, which generates the audit policy file. You can see most of the audit policy file by looking directly at the script.
+script, which generates an audit policy file. You can see most of the audit policy file by looking directly at the script.
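To make the rule matching described above concrete, here is a minimal, hypothetical policy file (not the Container-Optimized OS profile). The rule order matters, because the first matching rule sets the event's audit level:

```yaml
# Illustrative policy only; adapt the rules to your own cluster.
apiVersion: audit.k8s.io/v1
kind: Policy
# Don't generate audit events while the request is still being received.
omitStages:
  - "RequestReceived"
rules:
  # First match wins: record Secret access at Metadata level only,
  # so Secret payloads never end up in the audit log.
  - level: Metadata
    resources:
      - group: ""   # core API group
        resources: ["secrets"]
  # Catch-all rule: log request and response bodies for everything else.
  - level: RequestResponse
```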
 
 ## Audit backends
 
 Audit backends persist audit events to external storage.
 Out of the box, the kube-apiserver provides two backends:
 
-- Log backend, which writes events to a disk
-- Webhook backend, which sends events to an external API
+- Log backend, which writes events into the filesystem
+- Webhook backend, which sends events to an external HTTP API
 
-In all cases, audit events structure is defined by the API in the
-`audit.k8s.io` API group. The current version of the API is
+In all cases, audit events follow a structure defined by the Kubernetes API in the
+`audit.k8s.io` API group. For Kubernetes {{< param "fullversion" >}}, that
+API is at version
 [`v1`](https://github.com/kubernetes/kubernetes/blob/{{< param "githubbranch" >}}/staging/src/k8s.io/apiserver/pkg/apis/audit/v1/types.go).
 
 {{< note >}}
 In case of patches, the request body is a JSON array with patch operations, not a JSON object
 with an appropriate Kubernetes API object. For example, the following request body is a valid patch
-request to `/apis/batch/v1/namespaces/some-namespace/jobs/some-job-name`.
+request to `/apis/batch/v1/namespaces/some-namespace/jobs/some-job-name`:
 
 ```json
 [
@@ -125,8 +131,8 @@ request to `/apis/batch/v1/namespaces/some-namespace/jobs/some-job-name`.
 
 ### Log backend
 
-Log backend writes audit events to a file in JSON format. You can configure
-log audit backend using the following `kube-apiserver` flags:
+The log backend writes audit events to a file in [JSONlines](https://jsonlines.org/) format.
+You can configure the log audit backend using the following `kube-apiserver` flags:
 
 - `--audit-log-path` specifies the log file path that log backend uses to write
   audit events. Not specifying this flag disables log backend. `-` means standard out
@@ -134,15 +140,16 @@ log audit backend using the following `kube-apiserver` flags:
 - `--audit-log-maxbackup` defines the maximum number of audit log files to retain
 - `--audit-log-maxsize` defines the maximum size in megabytes of the audit log file before it gets rotated
 
-In case kube-apiserver is configured as a Pod, remember to mount the hostPath to the location of the policy file and log file. For example,
-`
---audit-policy-file=/etc/kubernetes/audit-policy.yaml
---audit-log-path=/var/log/audit.log
-`
+If your cluster's control plane runs the kube-apiserver as a Pod, remember to mount the `hostPath`
+to the location of the policy file and log file, so that audit records are persisted. For example:
+```shell
+--audit-policy-file=/etc/kubernetes/audit-policy.yaml \
+--audit-log-path=/var/log/audit.log
+```
 then mount the volumes:
 
-```
+```yaml
 ...
 volumeMounts:
   - mountPath: /etc/kubernetes/audit-policy.yaml
     name: audit
@@ -151,9 +158,10 @@ volumeMounts:
     name: audit-log
     readOnly: false
 ```
-finally the hostPath:
+
+and finally configure the `hostPath`:
 
-```
+```yaml
 ...
 - name: audit
   hostPath:
     path: /etc/kubernetes/audit-policy.yaml
@@ -163,19 +171,19 @@ finally the hostPath:
   hostPath:
     path: /var/log/audit.log
     type: FileOrCreate
 
 
 ```
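For orientation, each line the log backend writes is one JSON-serialized `audit.k8s.io/v1` Event. An abridged record, with invented values and shown pretty-printed here (on disk it occupies a single line), might look like:

```json
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "00000000-0000-0000-0000-000000000000",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/pods",
  "verb": "list",
  "user": {
    "username": "system:serviceaccount:default:example",
    "groups": ["system:serviceaccounts", "system:authenticated"]
  },
  "responseStatus": { "code": 200 },
  "requestReceivedTimestamp": "2020-08-09T10:00:00.000000Z",
  "stageTimestamp": "2020-08-09T10:00:00.050000Z"
}
```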
 
 
 ### Webhook backend
 
-Webhook backend sends audit events to a remote API, which is assumed to be the
-same API as `kube-apiserver` exposes. You can configure webhook
-audit backend using the following kube-apiserver flags:
+The webhook audit backend sends audit events to a remote web API, which is assumed to
+be a form of the Kubernetes API, including means of authentication. You can configure
+a webhook audit backend using the following kube-apiserver flags:
 
 - `--audit-webhook-config-file` specifies the path to a file with a webhook
-  configuration. Webhook configuration is effectively a
+  configuration. The webhook configuration is effectively a specialized
   [kubeconfig](/docs/tasks/access-application-cluster/configure-access-multiple-clusters).
 - `--audit-webhook-initial-backoff` specifies the amount of time to wait after the first failed
   request before retrying. Subsequent requests are retried with exponential backoff.
@@ -183,7 +191,7 @@ audit backend using the following kube-apiserver flags:
 The webhook config file uses the kubeconfig format to specify the remote address of
 the service and credentials used to connect to it.
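As a sketch, a webhook configuration in that kubeconfig format could look like the following; the server URL and certificate paths are placeholders, not values from this page:

```yaml
# Hypothetical webhook configuration; replace server and cert paths.
apiVersion: v1
kind: Config
clusters:
  - name: audit-webhook
    cluster:
      server: https://audit.example.com/events   # remote audit API
      certificate-authority: /etc/kubernetes/pki/audit-ca.crt
users:
  - name: kube-apiserver
    user:
      client-certificate: /etc/kubernetes/pki/audit-client.crt
      client-key: /etc/kubernetes/pki/audit-client.key
contexts:
  - name: default
    context:
      cluster: audit-webhook
      user: kube-apiserver
current-context: default
```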
 
-### Batching
+## Event batching {#batching}
 
 Both log and webhook backends support batching. Using webhook as an example, here's the list of
 available flags. To get the same flag for log backend, replace `webhook` with `log` in the flag
@@ -193,9 +201,10 @@ throttling is enabled in `webhook` and disabled in `log`.
 - `--audit-webhook-mode` defines the buffering strategy. One of the following:
   - `batch` - buffer events and asynchronously process them in batches. This is the default.
   - `blocking` - block API server responses on processing each individual event.
-  - `blocking-strict` - Same as blocking, but when there is a failure during audit logging at RequestReceived stage, the whole request to apiserver will fail.
+  - `blocking-strict` - Same as blocking, but when there is a failure during audit logging at the
+    RequestReceived stage, the whole request to the kube-apiserver fails.
 
-The following flags are used only in the `batch` mode.
+The following flags are used only in the `batch` mode:
 
 - `--audit-webhook-batch-buffer-size` defines the number of events to buffer before batching.
   If the rate of incoming events overflows the buffer, events are dropped.
@@ -207,16 +216,16 @@ The following flags are used only in the `batch` mode.
 - `--audit-webhook-batch-throttle-burst` defines the maximum number of batches generated at the same
   moment if the allowed QPS was underutilized previously.
 
-#### Parameter tuning
+## Parameter tuning
 
-Parameters should be set to accommodate the load on the apiserver.
+Parameters should be set to accommodate the load on the API server.
 
 For example, if kube-apiserver receives 100 requests each second, and each request is audited only
-on `ResponseStarted` and `ResponseComplete` stages, you should account for ~200 audit
+on `ResponseStarted` and `ResponseComplete` stages, you should account for ≅200 audit
 events being generated each second. Assuming that there are up to 100 events in a batch,
-you should set throttling level at least 2 QPS. Assuming that the backend can take up to
-5 seconds to write events, you should set the buffer size to hold up to 5 seconds of events, i.e.
-10 batches, i.e. 1000 events.
+you should set the throttling level to at least 2 queries per second. Assuming that the backend can take up to
+5 seconds to write events, you should set the buffer size to hold up to 5 seconds of events;
+that is: 10 batches, or 1000 events.
 
 In most cases, however, the default parameters should be sufficient and you don't have to worry about
 setting them manually. You can look at the following Prometheus metrics exposed by kube-apiserver
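The worked example above would correspond to flag settings along these lines (illustrative values for the webhook backend, not recommendations):

```shell
--audit-webhook-mode=batch
# Up to 100 events per batch, at least 2 batches per second:
--audit-webhook-batch-max-size=100
--audit-webhook-batch-throttle-qps=2
# Buffer about 5 seconds of events (10 batches = 1000 events):
--audit-webhook-batch-buffer-size=1000
```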
@@ -226,192 +235,18 @@ and in the logs to monitor the state of the auditing subsystem.
 - `apiserver_audit_error_total` metric contains the total number of events dropped due to an error
   during exporting.
 
-### Truncate
+### Log entry truncation {#truncate}
 
-Both log and webhook backends support truncating. As an example, the following is the list of flags
-available for the log backend:
+Both log and webhook backends support limiting the size of events that are logged.
+As an example, the following is the list of flags available for the log backend:
 
-- `audit-log-truncate-enabled` whether event and batch truncating is enabled.
-- `audit-log-truncate-max-batch-size` maximum size in bytes of the batch sent to the underlying backend.
-- `audit-log-truncate-max-event-size` maximum size in bytes of the audit event sent to the underlying backend.
-
-By default truncate is disabled in both `webhook` and `log`, a cluster administrator should set `audit-log-truncate-enabled` or `audit-webhook-truncate-enabled` to enable the feature.
-
-## Setup for multiple API servers
-
-If you're extending the Kubernetes API with the [aggregation
-layer](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/),
-you can also set up audit logging for the aggregated apiserver. To do this,
-pass the configuration options in the same format as described above to the
-aggregated apiserver and set up the log ingesting pipeline to pick up audit
-logs. Different apiservers can have different audit configurations and
-different audit policies.
-
-## Log Collector Examples
-
-### Use fluentd to collect and distribute audit events from log file
-
-[Fluentd](https://www.fluentd.org/) is an open source data collector for a unified logging layer.
-In this example, we will use fluentd to split audit events by different namespaces.
-
-{{< note >}}
-The `fluent-plugin-forest` and `fluent-plugin-rewrite-tag-filter` are plugins for fluentd.
-You can get details about plugin installation from
-[fluentd plugin-management](https://docs.fluentd.org/v1.0/articles/plugin-management).
-{{< /note >}}
-
-1. Install [`fluentd`](https://docs.fluentd.org/v1.0/articles/quickstart#step-1:-installing-fluentd),
-   `fluent-plugin-forest` and `fluent-plugin-rewrite-tag-filter` on the kube-apiserver node
-
-1. Create a config file for fluentd
-
-   ```
-   cat <<'EOF' > /etc/fluentd/config
-   # fluentd conf runs in the same host with kube-apiserver
-   <source>
-     @type tail
-     # audit log path of kube-apiserver
-     path /var/log/kube-audit
-     pos_file /var/log/audit.pos
-     format json
-     time_key time
-     time_format %Y-%m-%dT%H:%M:%S.%N%z
-     tag audit
-   </source>
-
-   <filter audit>
-     # https://github.com/fluent/fluent-plugin-rewrite-tag-filter/issues/13
-     @type record_transformer
-     enable_ruby
-     <record>
-       namespace ${record["objectRef"].nil? ? "none":(record["objectRef"]["namespace"].nil? ? "none":record["objectRef"]["namespace"])}
-     </record>
-   </filter>
-
-   <match audit>
-     # route audit according to namespace element in context
-     @type rewrite_tag_filter
-     <rule>
-       key namespace
-       pattern /^(.+)/
-       tag ${tag}.$1
-     </rule>
-   </match>
-
-   <filter audit.**>
-     @type record_transformer
-     remove_keys namespace
-   </filter>
-
-   <match audit.**>
-     @type forest
-     subtype file
-     remove_prefix audit
-     <template>
-       time_slice_format %Y%m%d%H
-       compress gz
-       path /var/log/audit-${tag}.*.log
-       format json
-       include_time_key true
-     </template>
-   </match>
-   EOF
-   ```
-
-1. Start fluentd
-
-   ```shell
-   fluentd -c /etc/fluentd/config -vv
-   ```
-
-1. Start kube-apiserver with the following options:
-
-   ```shell
-   --audit-policy-file=/etc/kubernetes/audit-policy.yaml --audit-log-path=/var/log/kube-audit --audit-log-format=json
-   ```
-
-1. Check audits for different namespaces in `/var/log/audit-*.log`
-
-### Use logstash to collect and distribute audit events from webhook backend
-
-[Logstash](https://www.elastic.co/products/logstash)
-is an open source, server-side data processing tool. In this example,
-we will use logstash to collect audit events from the webhook backend, and save events of
-different users into different files.
-
-1. Install [logstash](https://www.elastic.co/guide/en/logstash/current/installing-logstash.html)
-
-1. Create a config file for logstash
-
-   ```
-   cat <<EOF > /etc/logstash/config
-   input{
-     http{
-       #TODO, figure out a way to use kubeconfig file to authenticate to logstash
-       #https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http.html#plugins-inputs-http-ssl
-       port=>8888
-     }
-   }
-   filter{
-     split{
-       # Webhook audit backend sends several events together with EventList
-       # split each event here.
-       field=>[items]
-       # We only need event subelement, remove others.
-       remove_field=>[headers, metadata, apiVersion, "@timestamp", kind, "@version", host]
-     }
-     mutate{
-       rename => {items=>event}
-     }
-   }
-   output{
-     file{
-       # Audit events from different users will be saved into different files.
-       path=>"/var/log/kube-audit-%{[event][user][username]}/audit"
-     }
-   }
-   EOF
-   ```
-
-1. Start logstash
-
-   ```shell
-   bin/logstash -f /etc/logstash/config --path.settings /etc/logstash/
-   ```
-
-1. Create a [kubeconfig file](/docs/tasks/access-application-cluster/configure-access-multiple-clusters/) for the kube-apiserver webhook audit backend
-
-       cat <<EOF > /etc/kubernetes/audit-webhook-kubeconfig
-       apiVersion: v1
-       kind: Config
-       clusters:
-       - cluster:
-           server: http://<ip_of_logstash>:8888
-         name: logstash
-       contexts:
-       - context:
-           cluster: logstash
-           user: ""
-         name: default-context
-       current-context: default-context
-       preferences: {}
-       users: []
-       EOF
-
-1. Start kube-apiserver with the following options:
-
-   ```shell
-   --audit-policy-file=/etc/kubernetes/audit-policy.yaml --audit-webhook-config-file=/etc/kubernetes/audit-webhook-kubeconfig
-   ```
-
-1. Check audits in the logstash node's directories `/var/log/kube-audit-*/audit`
-
-Note that in addition to the file output plugin, logstash has a variety of outputs that
-let users route data where they want. For example, users can emit audit events to the elasticsearch
-plugin, which supports full-text search and analytics.
+- `audit-log-truncate-enabled` whether event and batch truncating is enabled.
+- `audit-log-truncate-max-batch-size` maximum size in bytes of the batch sent to the underlying backend.
+- `audit-log-truncate-max-event-size` maximum size in bytes of the audit event sent to the underlying backend.
+
+By default truncate is disabled in both `webhook` and `log`; a cluster administrator should set
+`audit-log-truncate-enabled` or `audit-webhook-truncate-enabled` to enable the feature.
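For example, enabling truncation for the log backend could look like this (the size limits are illustrative, not recommendations):

```shell
--audit-log-truncate-enabled=true
# Limit individual audit events to 100 KiB:
--audit-log-truncate-max-event-size=102400
# Limit batches written to the backend to 10 MiB:
--audit-log-truncate-max-batch-size=10485760
```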
 
 ## {{% heading "whatsnext" %}}
 
-Learn about [Mutating webhook auditing annotations](/docs/reference/access-authn-authz/extensible-admission-controllers/#mutating-webhook-auditing-annotations).
+* Learn about [Mutating webhook auditing annotations](/docs/reference/access-authn-authz/extensible-admission-controllers/#mutating-webhook-auditing-annotations).