Merge pull request #23042 from sftim/20200809_update_audit_task

Update cluster auditing task page
Kubernetes Prow Robot 2020-11-20 08:18:50 -08:00 committed by GitHub
commit cb7c4b6431


---
title: Auditing
---
<!-- overview -->
Kubernetes _auditing_ provides a security-relevant, chronological set of records documenting
the sequence of actions in a cluster. The cluster audits the activities generated by users,
by applications that use the Kubernetes API, and by the control plane itself.
Auditing allows cluster administrators to answer the following questions:
- what happened?
- when did it happen?
- who initiated it?
- on what did it happen?
- where was it observed?
- from where was it initiated?
- to where was it going?

Audit records begin their lifecycle inside the kube-apiserver
component. Each request on each stage of its execution generates an audit
event, which is then pre-processed according to a certain policy and written
to a backend. The policy determines what's recorded
and the backends persist the records. The current backend implementations
include log files and webhooks.
Each request can be recorded with an associated _stage_. The defined stages are:
- `RequestReceived` - The stage for events generated as soon as the audit
handler receives the request, and before it is delegated down the handler
chain.
- `ResponseStarted` - Once the response headers are sent, but before the
  response body is sent. This stage is only generated for long-running requests
  (e.g. watch).
- `ResponseComplete` - The response body has been completed and no more bytes
  will be sent.
- `Panic` - Events generated when a panic occurred.
{{< note >}}
Audit events are different from the
[Event](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#event-v1-core)
API object.
{{< /note >}}
The audit logging feature increases the memory consumption of the API server
because some context required for auditing is stored for each request.
Memory consumption depends on the audit logging configuration.
## Audit policy
Audit policy defines rules about what events should be recorded and what data
they should include. The audit policy object structure is defined in the
[`audit.k8s.io` API group](https://github.com/kubernetes/kubernetes/blob/{{< param "githubbranch" >}}/staging/src/k8s.io/apiserver/pkg/apis/audit/v1/types.go).
When an event is processed, it's
compared against the list of rules in order. The first matching rule sets the
"audit level" of the event. The known audit levels are:
_audit level_ of the event. The defined audit levels are:
- `None` - don't log events that match this rule.
- `Metadata` - log request metadata (requesting user, timestamp, resource,
  verb, etc.) but not request or response body.
- `Request` - log event metadata and request body but not response body.
  This does not apply for non-resource requests.
- `RequestResponse` - log event metadata, request and response bodies.
  This does not apply for non-resource requests.

You can pass a file with the policy to kube-apiserver using the
`--audit-policy-file` flag. If the flag is omitted, no events are logged.
Note that the `rules` field **must** be provided in the audit policy file.
A policy with no (0) rules is treated as illegal.

You can use a minimal audit policy file to log all requests at the `Metadata` level:

```yaml
# Log all requests at the Metadata level.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
```
If you're crafting your own audit profile, you can use the audit profile for Google Container-Optimized OS as a starting point. You can check the
[configure-helper.sh](https://github.com/kubernetes/kubernetes/blob/{{< param "githubbranch" >}}/cluster/gce/gci/configure-helper.sh)
script, which generates an audit policy file. You can see most of the audit policy file by looking directly at the script.
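
For illustration, here is a sketch of a small custom policy. The field names come
from the `audit.k8s.io/v1` `Policy` API; the rule choices themselves are only an
example, not a recommended profile. Rules are evaluated in order, and the first
match wins:

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Don't generate audit events for any request in the RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Keep Secret data out of the audit log: record metadata only.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Record Pod requests with full request and response bodies.
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods"]
  # Catch-all: record metadata for everything else.
  - level: Metadata
```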
## Audit backends
Audit backends persist audit events to external storage.
Out of the box, the kube-apiserver provides two backends:
- Log backend, which writes events into the filesystem
- Webhook backend, which sends events to an external HTTP API
In all cases, audit events follow a structure defined by the Kubernetes API in the
`audit.k8s.io` API group. For Kubernetes {{< param "fullversion" >}}, that
API is at version
[`v1`](https://github.com/kubernetes/kubernetes/blob/{{< param "githubbranch" >}}/staging/src/k8s.io/apiserver/pkg/apis/audit/v1/types.go).
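
As an illustration, a single `Metadata`-level audit event might look like the
sketch below (rendered as YAML for readability; the log backend writes one JSON
object per line, and the user, object, and timestamps here are made up):

```yaml
apiVersion: audit.k8s.io/v1
kind: Event
level: Metadata
auditID: 7e2a6d3b-0000-0000-0000-example       # illustrative ID
stage: ResponseComplete
requestURI: /api/v1/namespaces/default/pods/nginx
verb: get
user:
  username: admin@example.com                  # illustrative user
  groups: ["system:authenticated"]
sourceIPs: ["10.0.0.5"]
objectRef:
  resource: pods
  namespace: default
  name: nginx
  apiVersion: v1
responseStatus:
  code: 200
requestReceivedTimestamp: "2020-11-20T16:18:50.123456Z"
stageTimestamp: "2020-11-20T16:18:50.135792Z"
```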
{{< note >}}
In case of patches, the request body is a JSON array of patch operations, not a JSON object
with an appropriate Kubernetes API object. For example, the following request body is a valid patch
request to `/apis/batch/v1/namespaces/some-namespace/jobs/some-job-name`:
```json
[
  {
    "op": "replace",
    "path": "/spec/parallelism",
    "value": 3
  },
  {
    "op": "remove",
    "path": "/spec/template/spec/containers/0/terminationMessagePolicy"
  }
]
```
{{< /note >}}
### Log backend
The log backend writes audit events to a file in [JSONlines](https://jsonlines.org/) format.
You can configure the log audit backend using the following `kube-apiserver` flags:
- `--audit-log-path` specifies the log file path that log backend uses to write
audit events. Not specifying this flag disables log backend. `-` means standard out
- `--audit-log-maxage` defines the maximum number of days to retain old audit log files
- `--audit-log-maxbackup` defines the maximum number of audit log files to retain
- `--audit-log-maxsize` defines the maximum size in megabytes of the audit log file before it gets rotated
If your cluster's control plane runs the kube-apiserver as a Pod, remember to mount the `hostPath`
to the location of the policy file and log file, so that audit records are persisted. For example:
```shell
--audit-policy-file=/etc/kubernetes/audit-policy.yaml \
--audit-log-path=/var/log/audit.log
```
then mount the volumes:
```yaml
...
volumeMounts:
  - mountPath: /etc/kubernetes/audit-policy.yaml
    name: audit
    readOnly: true
  - mountPath: /var/log/audit.log
    name: audit-log
    readOnly: false
```
and finally configure the `hostPath`:
```yaml
...
- name: audit
  hostPath:
    path: /etc/kubernetes/audit-policy.yaml
    type: File

- name: audit-log
  hostPath:
    path: /var/log/audit.log
    type: FileOrCreate
```
### Webhook backend
The webhook audit backend sends audit events to a remote web API, which is assumed to
be a form of the Kubernetes API, including means of authentication. You can configure
a webhook audit backend using the following kube-apiserver flags:
- `--audit-webhook-config-file` specifies the path to a file with a webhook
configuration. The webhook configuration is effectively a specialized
[kubeconfig](/docs/tasks/access-application-cluster/configure-access-multiple-clusters).
- `--audit-webhook-initial-backoff` specifies the amount of time to wait after the first failed
request before retrying. Subsequent requests are retried with exponential backoff.
The webhook config file uses the kubeconfig format to specify the remote address of
the service and credentials used to connect to it.
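
A minimal webhook configuration might look like the following sketch; the server
URL and certificate paths are placeholders, not defaults:

```yaml
apiVersion: v1
kind: Config
clusters:
  - name: audit-webhook
    cluster:
      server: https://audit.example.com:9443/events   # placeholder remote audit service
      certificate-authority: /path/to/ca.crt          # CA for the remote service
users:
  - name: kube-apiserver
    user:
      client-certificate: /path/to/client.crt
      client-key: /path/to/client.key
contexts:
  - name: audit-webhook
    context:
      cluster: audit-webhook
      user: kube-apiserver
current-context: audit-webhook
```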
## Event batching {#batching}
Both log and webhook backends support batching. Using webhook as an example,
here's the list of available flags. To get the same flag for log backend,
replace `webhook` with `log` in the flag name. By default, batching is enabled
in `webhook` and disabled in `log`. Similarly, by default throttling is enabled
in `webhook` and disabled in `log`.
- `--audit-webhook-mode` defines the buffering strategy. One of the following:
- `batch` - buffer events and asynchronously process them in batches. This is the default.
- `blocking` - block API server responses on processing each individual event.
- `blocking-strict` - Same as blocking, but when there is a failure during audit logging at the
RequestReceived stage, the whole request to the kube-apiserver fails.
The following flags are used only in the `batch` mode:
- `--audit-webhook-batch-buffer-size` defines the number of events to buffer before batching.
If the rate of incoming events overflows the buffer, events are dropped.
- `--audit-webhook-batch-max-size` defines the maximum number of events in one batch.
- `--audit-webhook-batch-max-wait` defines the maximum amount of time to wait before
  unconditionally batching events in the queue.
- `--audit-webhook-batch-throttle-enable` defines whether batching throttling is enabled.
- `--audit-webhook-batch-throttle-qps` defines the maximum average number of batches
  generated per second.
- `--audit-webhook-batch-throttle-burst` defines the maximum number of batches generated at the same
moment if the allowed QPS was underutilized previously.
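
As a sketch, these flags are passed to the kube-apiserver like any others; on a
control plane that runs the kube-apiserver as a static Pod, that means adding
them to the Pod's `command` (the values below are illustrative, not
recommendations):

```yaml
# Fragment of a kube-apiserver Pod spec (illustrative values)
command:
  - kube-apiserver
  - --audit-webhook-config-file=/etc/kubernetes/audit-webhook-kubeconfig
  - --audit-webhook-mode=batch
  - --audit-webhook-batch-max-size=100   # at most 100 events per batch
  - --audit-webhook-batch-max-wait=1s    # flush a partial batch after one second
```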
## Parameter tuning
Parameters should be set to accommodate the load on the API server.
For example, if kube-apiserver receives 100 requests each second, and each request is audited only
on `ResponseStarted` and `ResponseComplete` stages, you should account for 200 audit
events being generated each second. Assuming that there are up to 100 events in a batch,
you should set the throttling level to at least 2 queries per second. Assuming that the backend can take up to
5 seconds to write events, you should set the buffer size to hold up to 5 seconds of events;
that is: 10 batches, or 1000 events.
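
Under the assumptions of that worked example, the matching flag values would be
roughly the following (a sketch derived from the arithmetic above, not tuned
recommendations):

```yaml
# Illustrative values for the worked example above
- --audit-webhook-batch-max-size=100        # up to 100 events per batch
- --audit-webhook-batch-throttle-qps=2      # 200 events/s ÷ 100 events per batch
- --audit-webhook-batch-buffer-size=1000    # 5 s of events at 200 events/s
```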
In most cases, however, the default parameters should be sufficient and you don't have to worry about
setting them manually. You can look at the following Prometheus metrics exposed by kube-apiserver
and in the logs to monitor the state of the auditing subsystem.

- `apiserver_audit_event_total` metric contains the total number of audit events exported.
- `apiserver_audit_error_total` metric contains the total number of events dropped due to an error
during exporting.
### Log entry truncation {#truncate}
Both log and webhook backends support limiting the size of events that are logged.
As an example, the following is the list of flags available for the log backend:
- `audit-log-truncate-enabled` whether event and batch truncating is enabled.
- `audit-log-truncate-max-batch-size` maximum size in bytes of the batch sent to the underlying backend.
- `audit-log-truncate-max-event-size` maximum size in bytes of the audit event sent to the underlying backend.
By default truncate is disabled in both `webhook` and `log`; a cluster administrator should set
`audit-log-truncate-enabled` or `audit-webhook-truncate-enabled` to enable the feature.
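
For example, a cluster administrator could enable truncation for the log backend
by adding flags like the following to the kube-apiserver invocation (the sizes
here are illustrative, not necessarily the defaults):

```yaml
# Illustrative kube-apiserver flags enabling truncation for the log backend
- --audit-log-truncate-enabled=true
- --audit-log-truncate-max-event-size=102400     # cap on a single audit event, in bytes
- --audit-log-truncate-max-batch-size=10485760   # cap on a batch sent to the backend, in bytes
```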
## Setup for multiple API servers
If you're extending the Kubernetes API with the [aggregation
layer](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/),
you can also set up audit logging for the aggregated apiserver. To do this,
pass the configuration options in the same format as described above to the
aggregated apiserver and set up the log ingesting pipeline to pick up audit
logs. Different apiservers can have different audit configurations and
different audit policies.
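
For example, since an aggregated apiserver accepts the same audit flags in the
same format, a sketch of its container args might look like this (the binary
name and paths are hypothetical):

```yaml
# Hypothetical aggregated API server with its own audit configuration
command:
  - /usr/local/bin/sample-apiserver            # hypothetical extension API server binary
  - --audit-policy-file=/etc/sample-apiserver/audit-policy.yaml
  - --audit-log-path=-                         # "-" writes audit events to standard out
```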
## Log collector examples
### Use fluentd to collect and distribute audit events from the log file
[Fluentd](https://www.fluentd.org/) is an open source data collector for a unified logging layer.
In this example, we will use fluentd to split audit events by different namespaces.
{{< note >}}
The `fluent-plugin-forest` and `fluent-plugin-rewrite-tag-filter` are plugins for fluentd.
You can get details about plugin installation from
[fluentd plugin-management](https://docs.fluentd.org/v1.0/articles/plugin-management).
{{< /note >}}
1. Install [`fluentd`](https://docs.fluentd.org/v1.0/articles/quickstart#step-1:-installing-fluentd),
   `fluent-plugin-forest` and `fluent-plugin-rewrite-tag-filter` on the kube-apiserver node.
1. Create a config file for fluentd
```shell
cat <<'EOF' > /etc/fluentd/config
# fluentd conf runs in the same host with kube-apiserver
<source>
@type tail
# audit log path of kube-apiserver
path /var/log/kube-audit
pos_file /var/log/audit.pos
format json
time_key time
time_format %Y-%m-%dT%H:%M:%S.%N%z
tag audit
</source>
<filter audit>
#https://github.com/fluent/fluent-plugin-rewrite-tag-filter/issues/13
@type record_transformer
enable_ruby
<record>
namespace ${record["objectRef"].nil? ? "none":(record["objectRef"]["namespace"].nil? ? "none":record["objectRef"]["namespace"])}
</record>
</filter>
<match audit>
# route audit according to namespace element in context
@type rewrite_tag_filter
<rule>
key namespace
pattern /^(.+)/
tag ${tag}.$1
</rule>
</match>
<filter audit.**>
@type record_transformer
remove_keys namespace
</filter>
<match audit.**>
@type forest
subtype file
remove_prefix audit
<template>
time_slice_format %Y%m%d%H
compress gz
path /var/log/audit-${tag}.*.log
format json
include_time_key true
</template>
</match>
EOF
```
1. Start fluentd
```shell
fluentd -c /etc/fluentd/config -vv
```
1. Start kube-apiserver with the following options:
```shell
--audit-policy-file=/etc/kubernetes/audit-policy.yaml --audit-log-path=/var/log/kube-audit --audit-log-format=json
```
1. Check audits for different namespaces in `/var/log/audit-*.log`
### Use logstash to collect and distribute audit events from the webhook backend
[Logstash](https://www.elastic.co/products/logstash)
is an open source, server-side data processing tool. In this example,
we will use logstash to collect audit events from the webhook backend, and save events of
different users into different files.
1. Install [logstash](https://www.elastic.co/guide/en/logstash/current/installing-logstash.html)
1. Create a config file for logstash
```shell
cat <<EOF > /etc/logstash/config
input{
http{
#TODO, figure out a way to use kubeconfig file to authenticate to logstash
#https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http.html#plugins-inputs-http-ssl
port=>8888
}
}
filter{
split{
# Webhook audit backend sends several events together with EventList
# split each event here.
field=>[items]
# We only need event subelement, remove others.
remove_field=>[headers, metadata, apiVersion, "@timestamp", kind, "@version", host]
}
mutate{
rename => {items=>event}
}
}
output{
file{
# Audit events from different users will be saved into different files.
path=>"/var/log/kube-audit-%{[event][user][username]}/audit"
}
}
EOF
```
1. Start logstash
```shell
bin/logstash -f /etc/logstash/config --path.settings /etc/logstash/
```
1. Create a [kubeconfig file](/docs/tasks/access-application-cluster/configure-access-multiple-clusters/) for the kube-apiserver webhook audit backend
```shell
cat <<EOF > /etc/kubernetes/audit-webhook-kubeconfig
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: http://<ip_of_logstash>:8888
  name: logstash
contexts:
- context:
    cluster: logstash
    user: ""
  name: default-context
current-context: default-context
preferences: {}
users: []
EOF
```
1. Start kube-apiserver with the following options:
```shell
--audit-policy-file=/etc/kubernetes/audit-policy.yaml --audit-webhook-config-file=/etc/kubernetes/audit-webhook-kubeconfig
```
1. Check audits in the logstash node's directories `/var/log/kube-audit-*/audit`
Note that in addition to the file output plugin, logstash has a variety of outputs that
let users route data where they want. For example, users can emit audit events to the
elasticsearch plugin, which supports full-text search and analytics.
## {{% heading "whatsnext" %}}
* Learn about [Mutating webhook auditing annotations](/docs/reference/access-authn-authz/extensible-admission-controllers/#mutating-webhook-auditing-annotations).