Refactor stackdriver logging page

pull/2142/head
Mik Vyatskov 2017-01-05 15:28:50 +01:00
parent 455a14a77d
commit fa59f3779c
2 changed files with 130 additions and 140 deletions


@@ -7,26 +7,26 @@ title: Logging Overview
Application and systems logs can help you understand what is happening inside your cluster. The logs are particularly useful for debugging problems and monitoring cluster activity. Most modern applications have some kind of logging mechanism; as such, most container engines are likewise designed to support some kind of logging. The easiest and most embraced logging method for containerized applications is to write to the standard output and standard error streams.
However, the native functionality provided by a container engine or runtime is usually not enough for a complete logging solution. For example, if a container crashes, a pod is evicted, or a node dies, you'll usually still want to access your application's logs. As such, logs should have a separate storage and lifecycle independent of nodes, pods, or containers. This concept is called _cluster-level-logging_. Cluster-level logging requires a separate backend to store, analyze, and query logs. Kubernetes provides no native storage solution for log data, but you can integrate many existing logging solutions into your Kubernetes cluster.
This document includes:
* A basic demonstration of logging in Kubernetes using the standard output stream
* A detailed description of the node logging architecture in Kubernetes
* Guidance for implementing cluster-level logging in Kubernetes
The guidance for cluster-level logging assumes that a logging backend is present inside or outside of your cluster. If you're not interested in having cluster-level logging, you might still find the description of how logs are stored and handled on the node to be useful.
## Basic logging in Kubernetes
In this section, you can see an example of basic logging in Kubernetes that outputs data to the standard output stream. This demonstration uses a [pod specification](/docs/user-guide/logging/counter-pod.yaml) with a container that writes some text to standard output once per second.
{% include code.html language="yaml" file="counter-pod.yaml" ghlink="/docs/user-guide/counter-pod.yaml" %}
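Roughly, the pod specification looks like the following sketch (the image and the exact counting loop here are assumptions; the included `counter-pod.yaml` above is authoritative):
```shell
# A sketch of counter-pod.yaml; the image and counting loop are assumptions.
cat <<'EOF' > counter-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: ubuntu:14.04
    args: [bash, -c, 'for ((i = 0; ; i++)); do echo "$i: $(date)"; sleep 1; done']
EOF
```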
To run this pod, use the following command:
```shell
$ kubectl create -f http://k8s.io/docs/user-guide/counter-pod.yaml
pod "counter" created
```
@@ -43,17 +43,17 @@ $ kubectl logs counter
...
```
You can use `kubectl logs` to retrieve logs from a previous instantiation of a container with the `--previous` flag, in case the container has crashed. If your pod has multiple containers, you should specify which container's logs you want to access by appending a container name to the command. See the [`kubectl logs` documentation](/docs/user-guide/kubectl/kubectl_logs/) for more details.
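For example (the pod and container names in the last command are hypothetical):
```shell
# Logs from the current instance of the counter pod's single container:
kubectl logs counter

# Logs from the previous, crashed instance of that container:
kubectl logs counter --previous

# For a pod with several containers, name the container explicitly
# (the pod and container names here are hypothetical):
kubectl logs my-pod -c my-sidecar
```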
## Logging at the node level
![Node level logging](/images/docs/user-guide/logging/logging-node-level.png)
Everything a containerized application writes to `stdout` and `stderr` is handled and redirected somewhere by a container engine. For example, the Docker container engine redirects those two streams to [a logging driver](https://docs.docker.com/engine/admin/logging/overview), which is configured in Kubernetes to write to a file in json format.
**Note:** The Docker json logging driver treats each line as a separate message. When using the Docker logging driver, there is no direct support for multi-line messages. You need to handle multi-line messages at the logging agent level or higher.
By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted, along with their logs.
An important consideration in node-level logging is implementing log rotation, so that logs don't consume all available storage on the node. Kubernetes uses the [`logrotate`](http://www.linuxcommand.org/man_pages/logrotate8.html) tool to implement log rotation.
@@ -61,31 +61,45 @@ Kubernetes performs log rotation daily, or if the log file grows beyond 10MB in
The Kubernetes logging configuration differs depending on the node type. For example, you can find detailed information for GCI in the corresponding [configure helper](https://github.com/kubernetes/kubernetes/blob/{{page.githubbranch}}/cluster/gce/gci/configure-helper.sh#L96).
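For illustration, a `logrotate` stanza for container log files might look like the following sketch; the log path, rotation count, and options here are assumptions rather than the exact settings of any particular node image:
```shell
# A sketch of a logrotate configuration for container log files.
# The path and options are assumptions, not exact Kubernetes defaults.
cat <<'EOF' > /etc/logrotate.d/docker-containers
/var/lib/docker/containers/*/*.log {
    rotate 5
    copytruncate
    missingok
    notifempty
    compress
    maxsize 10M
    daily
}
EOF
```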
When you run [`kubectl logs`](/docs/user-guide/kubectl/kubectl_logs), as in the basic logging example, the kubelet on the node handles the request and reads directly from the log file, returning the contents in the response. Note that `kubectl logs` **only returns the last rotation**; if you need older log data and cluster-level logging is not enabled, you must extract prior rotations manually.
### System component logs
There are two types of system components: those that run in a container and those
that do not run in a container. For example:
* The Kubernetes scheduler and kube-proxy run in a container.
* The kubelet and container runtime, for example Docker, do not run in containers.
On machines with systemd, the kubelet and container runtime write to journald. If
systemd is not present, they write to `.log` files in the `/var/log` directory.
System components inside containers always write to the `/var/log` directory,
bypassing the default logging mechanism. They use the [glog](https://godoc.org/github.com/golang/glog)
logging library. You can find the conventions for logging severity for those
components in the [development docs on logging](https://github.com/kubernetes/community/blob/master/contributors/devel/logging.md).
Similarly to the container logs, system component logs in the `/var/log`
directory are rotated daily and based on the log size. However,
system component logs have a higher size retention: by default,
they can store up to 100MB.
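For example, you might inspect these logs on a node as follows (the exact unit and file names vary by deployment):
```shell
# On nodes with systemd, the kubelet and the container runtime log to journald:
journalctl -u kubelet
journalctl -u docker

# Without systemd, they write .log files under /var/log (file names are typical examples):
tail -f /var/log/kubelet.log

# Containerized system components, such as kube-proxy, also write under /var/log:
tail -f /var/log/kube-proxy.log
```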
## Cluster-level logging architectures
While Kubernetes does not provide a native solution for cluster-level logging, there are several common approaches you can consider:
* Use a node-level logging agent that runs on every node.
* Include a dedicated sidecar container for logging in an application pod.
* Push logs directly to a backend from within an application.
### Using a node logging agent
![Using a node level logging agent](/images/docs/user-guide/logging/logging-with-node-agent.png)
You can implement cluster-level logging by including a _node-level logging agent_ on each node. The logging agent is a dedicated tool that exposes logs or pushes logs to a backend. Commonly, the logging agent is a container that has access to a directory with log files from all of the application containers on that node.
Because the logging agent must run on every node, it's common to implement it as either a DaemonSet replica, a manifest pod, or a dedicated native process on the node. However, the latter two approaches are deprecated and highly discouraged.
Using a node-level logging agent is the most common and encouraged approach for a Kubernetes cluster, because it creates only one agent per node, and it doesn't require any changes to the applications running on the node. However, node-level logging _only works for applications' standard output and standard error_.
Kubernetes doesn't specify a logging agent, but two optional logging agents are packaged with the Kubernetes release: [Stackdriver Logging](/docs/user-guide/logging/stackdriver) for use with Google Cloud Platform, and [Elasticsearch](/docs/user-guide/logging/elasticsearch). You can find more information and instructions in the dedicated documents. Both use [fluentd](http://www.fluentd.org/) with custom configuration as an agent on the node.
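For illustration only, a minimal sketch of a fluentd-based node agent deployed as a DaemonSet might look like the following; the image, host paths, and API version are assumptions, so refer to the packaged add-ons above for the actual manifests:
```shell
# A minimal sketch of a node-level logging agent running as a DaemonSet.
# The image and host paths are assumptions; see the packaged add-ons for real manifests.
kubectl create -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-logging-agent
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        app: fluentd-logging-agent
    spec:
      containers:
      - name: fluentd
        image: example.com/fluentd-logging-agent:1.0  # hypothetical agent image
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
EOF
```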
@@ -93,9 +107,9 @@ Kubernetes doesn't specify a logging agent, but two optional logging agents are
![Using a sidecar container with the logging agent](/images/docs/user-guide/logging/logging-with-sidecar.png)
You can implement cluster-level logging by including a dedicated logging agent for each application on your cluster. You can include this logging agent as a _sidecar container_ in the pod spec for each application; the sidecar container should contain only the logging agent.
The concrete implementation of the logging agent, the interface between the agent and the application, and the interface between the logging agent and the logs backend are completely up to you. For an example implementation, see the [fluentd sidecar container](https://github.com/kubernetes/contrib/tree/b70447aa59ea14468f4cd349760e45b6a0a9b15d/logging/fluentd-sidecar-gcp) for the Stackdriver logging backend.
**Note:** Using a sidecar container for logging may lead to significant resource consumption.
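For illustration, here is a minimal sketch of the sidecar pattern, with an application container and a logging-agent sidecar sharing an `emptyDir` volume; the images and the agent's configuration are hypothetical:
```shell
# A minimal sketch of the sidecar pattern: the application writes a log file to a
# shared emptyDir volume, and a logging-agent sidecar reads it from there.
# The agent image and its configuration are hypothetical.
kubectl create -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar
spec:
  containers:
  - name: app
    image: ubuntu:14.04
    args: [bash, -c, 'while true; do echo "$(date) INFO app log line" >> /var/log/app/app.log; sleep 1; done']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  - name: logging-agent
    image: example.com/logging-agent:latest  # hypothetical agent image
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
      readOnly: true
  volumes:
  - name: app-logs
    emptyDir: {}
EOF
```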


@@ -5,44 +5,42 @@ assignees:
title: Logging with Stackdriver Logging
---
Before reading this page, it's highly recommended to familiarize yourself with the [overview of logging in Kubernetes](/docs/user-guide/logging/overview).
This article assumes that you have created a Kubernetes cluster with cluster-level logging support for sending logs to Stackdriver Logging. You can do this either by selecting the **Enable Stackdriver Logging** checkbox in the create cluster dialogue in [GKE](https://cloud.google.com/container-engine/), or by setting the `KUBE_LOGGING_DESTINATION` flag to `gcp` when manually starting a cluster using `kube-up.sh`.
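For example, a minimal sketch of the second option, run from the root of a Kubernetes release:
```shell
# Bring up a cluster with logs shipped to Stackdriver Logging; other settings
# are left at whatever defaults your environment already uses.
KUBE_LOGGING_DESTINATION=gcp ./cluster/kube-up.sh
```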
The following guide describes gathering a container's standard output and standard error. To gather logs written by an application to a file, you can use [a sidecar approach](https://github.com/kubernetes/contrib/blob/master/logging/fluentd-sidecar-gcp/README.md).
## Overview
After creation, you can discover logging agent pods in the `kube-system` namespace,
one per node, by running the following command:
```shell
$ kubectl get pods --namespace=kube-system
NAME                      READY     STATUS    RESTARTS   AGE
...
fluentd-gcp-v1.30-50gnc   1/1       Running   0          5d
fluentd-gcp-v1.30-v255c   1/1       Running   0          5d
fluentd-gcp-v1.30-f02l5   1/1       Running   0          5d
...
```
To understand how logging with Stackdriver works, consider the following
synthetic log generator pod specification [counter-pod.yaml](/docs/user-guide/logging/counter-pod.yaml):
{% include code.html language="yaml" file="counter-pod.yaml" ghlink="/docs/user-guide/counter-pod.yaml" %}
This pod specification has one container that runs a bash script
that writes out the value of a counter and the date once per
second, and runs indefinitely. Let's create this pod in the default namespace.
```shell
$ kubectl create -f counter-pod.yaml
pod "counter" created
```
You can observe the running pod:
```shell
$ kubectl get pods
@@ -50,123 +48,101 @@ NAME READY STATUS RESTARTS AGE
counter 1/1 Running 0 5m
```
For a short period of time you can observe the `Pending` pod status, because the kubelet has to download the container image first. When the pod status changes to `Running`, you can use the `kubectl logs` command to view the output of this counter pod.
```shell
$ kubectl logs counter
0: Mon Jan 1 00:00:00 UTC 2001
1: Mon Jan 1 00:00:01 UTC 2001
2: Mon Jan 1 00:00:02 UTC 2001
...
```
As described in the logging overview, this command fetches log entries
from the container log file. If the container is killed and then restarted by
Kubernetes, you can still access logs from the previous container. However,
if the pod is evicted from the node, log files are lost. Let's demonstrate this
by deleting the currently running counter container:
```shell
$ kubectl delete pod counter
pod "counter" deleted
```
and then recreating it:
```shell
$ kubectl create -f counter-pod.yaml
pod "counter" created
```
After some time, you can access logs from the counter pod again:
```shell
$ kubectl logs counter
0: Mon Jan 1 00:01:00 UTC 2001
1: Mon Jan 1 00:01:01 UTC 2001
2: Mon Jan 1 00:01:02 UTC 2001
...
```
As expected, only recent log lines are present. However, for a real-world application you will likely want to be able to access logs from all containers, especially for debugging purposes. This is where the previously enabled Stackdriver Logging can help.
## Viewing logs
The Stackdriver Logging agent attaches metadata to each log entry, which you can use later in queries to select only the messages you're interested in: for example, the messages from a particular pod.
The most important pieces of metadata are the resource type and log name.
The resource type of a container log is `container`, which is named
`GKE Containers` in the UI (even if the Kubernetes cluster is not on GKE).
The log name is the name of the container, so that if you have a pod with
two containers, named `container_1` and `container_2` in the spec, their logs
will have log names `container_1` and `container_2` respectively.
System components have the resource type `compute`, which is named `GCE VM Instance` in the interface. Log names for system components are fixed. For a GKE node, every log entry from a system component has one of the following log names:
* docker
* kubelet
* kube-proxy
You can learn more about viewing logs on [the dedicated Stackdriver page](https://cloud.google.com/logging/docs/view/logs_viewer).
One way to view logs is to use the [`gcloud logging`](https://cloud.google.com/logging/docs/api/gcloud-logging) command line interface from the [Google Cloud SDK](https://cloud.google.com/sdk/).
It uses Stackdriver Logging [filtering syntax](https://cloud.google.com/logging/docs/view/advanced_filters)
to query specific logs. For example, you can run the following command:
```shell
$ gcloud beta logging read 'logName="projects/$YOUR_PROJECT_ID/logs/count"' --format json | jq '.[].textPayload'
...
"2: Mon Jan 1 00:01:02 UTC 2001\n"
"1: Mon Jan 1 00:01:01 UTC 2001\n"
"0: Mon Jan 1 00:01:00 UTC 2001\n"
...
"2: Mon Jan 1 00:00:02 UTC 2001\n"
"1: Mon Jan 1 00:00:01 UTC 2001\n"
"0: Mon Jan 1 00:00:00 UTC 2001\n"
```
As you can see, it outputs messages for the count container from both
the first and second runs, despite the fact that the kubelet already deleted
the logs for the first container.
### Exporting logs
You can export logs to [Google Cloud Storage](https://cloud.google.com/storage/)
or to [BigQuery](https://cloud.google.com/bigquery/) to run further
analysis. Stackdriver Logging offers the concept of sinks, where you can
specify the destination of log entries. More information is available on
the Stackdriver [Exporting Logs page](https://cloud.google.com/logging/docs/export/configure_export_v2).
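For example, a sketch of creating a sink that streams the counter's log entries to a Cloud Storage bucket (the sink name, bucket, and filter are hypothetical placeholders):
```shell
# A sketch: stream the counter's log entries to a Cloud Storage bucket.
# The sink name, bucket, and filter below are hypothetical placeholders.
gcloud beta logging sinks create counter-sink \
    storage.googleapis.com/my-logs-bucket \
    --log-filter='logName="projects/$YOUR_PROJECT_ID/logs/count"'
```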