---
reviewers:
- mikedanese
- thockin
title: Troubleshoot Applications
content_template: templates/concept
---
{{% capture overview %}}
This guide is to help users debug applications that are deployed into Kubernetes and not behaving correctly.
This is *not* a guide for people who want to debug their cluster. For that you should check out
[this guide](/docs/admin/cluster-troubleshooting).
{{% /capture %}}
{{% capture body %}}
## Diagnosing the problem

The first step in troubleshooting is triage. What is the problem? Is it your Pods, your Replication Controllers, or your Services?

* [Debugging Pods](#debugging-pods)
* [Debugging Replication Controllers](#debugging-replication-controllers)
* [Debugging Services](#debugging-services)

### Debugging Pods

The first step in debugging a Pod is taking a look at it. Check the current state of the Pod and recent events with the following command:
```shell
kubectl describe pods ${POD_NAME}
```
Look at the state of the containers in the pod. Are they all `Running`? Have there been recent restarts?

Continue debugging depending on the state of the pods.
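For a compact view of the Pod's phase, restart counts, and per-container state, here is a minimal sketch (standard `kubectl` output and JSONPath; substitute your Pod's name for `${POD_NAME}`):

```shell
# Phase, readiness, restart count, and the node the Pod is scheduled on
kubectl get pod ${POD_NAME} -o wide

# Per-container state, if you need more detail than `kubectl describe` shows
kubectl get pod ${POD_NAME} -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.state}{"\n"}{end}'
```
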
#### My pod stays pending
If a Pod is stuck in `Pending` it means that it cannot be scheduled onto a node. Generally this is because
there are insufficient resources of one type or another that prevent scheduling. Look at the output of the
`kubectl describe ...` command above. There should be messages from the scheduler about why it cannot schedule
your pod. Reasons include:
* **You don't have enough resources**: You may have exhausted the supply of CPU or Memory in your cluster. In this case
you need to delete Pods, adjust resource requests, or add new nodes to your cluster. See the [Compute Resources document](/docs/user-guide/compute-resources/#my-pods-are-pending-with-event-message-failedscheduling) for more information, and see the sketch after this list for a quick way to check available capacity.
* **You are using `hostPort`**: When you bind a Pod to a `hostPort`, there are a limited number of places that pod can be
scheduled. In most cases, `hostPort` is unnecessary; try using a Service object to expose your Pod. If you do require
`hostPort`, then you can only schedule as many Pods as there are nodes in your Kubernetes cluster.
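
For a quick look at how much capacity is left and which Pods the scheduler is holding back, a minimal sketch (the `grep` pattern assumes the default English `kubectl describe` output):

```shell
# Show allocatable capacity and currently requested resources on each node
kubectl describe nodes | grep -A 5 "Allocated resources"

# List all Pending pods across namespaces to see what the scheduler is rejecting
kubectl get pods --all-namespaces --field-selector=status.phase=Pending
```
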
#### My pod stays waiting
If a Pod is stuck in the `Waiting` state, then it has been scheduled to a worker node, but it can't run on that machine.
Again, the information from `kubectl describe ...` should be informative. The most common cause of `Waiting` pods is a failure to pull the image. There are three things to check:
* Make sure that you have the name of the image correct (the sketch after this list shows how to print the image names the Pod actually requests).
* Have you pushed the image to the repository?
* Run a manual `docker pull <image>` on your machine to see if the image can be pulled.
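
To see exactly which image names the kubelet will try to pull, one minimal check (a standard JSONPath query; substitute your Pod's name for `${POD_NAME}`):

```shell
# Print each container's name and the image it requests, exactly as written in the Pod spec
kubectl get pod ${POD_NAME} -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.image}{"\n"}{end}'
```
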
#### My pod is crashing or otherwise unhealthy

First, take a look at the logs of the current container:
```shell
kubectl logs ${POD_NAME} ${CONTAINER_NAME}
```

If your container has previously crashed, you can access the previous container's crash log with:

```shell
kubectl logs --previous ${POD_NAME} ${CONTAINER_NAME}
```

Alternatively, you can run commands inside that container with `exec`:

```shell
kubectl exec ${POD_NAME} -c ${CONTAINER_NAME} -- ${CMD} ${ARG1} ${ARG2} ... ${ARGN}
```
2018-08-20 20:44:48 +00:00
{{< note >}}
`-c ${CONTAINER_NAME}` is optional. You can omit it for Pods that only contain a single container.
{{< /note >}}
As an example, to look at the logs from a running Cassandra pod, you might run

```shell
kubectl exec cassandra -- cat /var/log/cassandra/system.log
```
If none of these approaches work, you can find the host machine that the pod is running on and SSH into that host,
but this should generally not be necessary given tools in the Kubernetes API. Therefore, if you find yourself needing to SSH into a machine, please file a
feature request on GitHub describing your use case and why these tools are insufficient.
#### My pod is running but not doing what I told it to do
If your pod is not behaving as you expected, it may be that there was an error in your
pod description (e.g. `mypod.yaml` file on your local machine), and that the error
was silently ignored when you created the pod. Often a section of the pod description
is nested incorrectly, or a key name is typed incorrectly, and so the key is ignored.
For example, if you misspelled `command` as `commnd` then the pod will be created but
will not use the command line you intended it to use.
The first thing to do is to delete your pod and try creating it again with the `--validate` option.
For example, run `kubectl apply --validate -f mypod.yaml`.
If you misspelled `command` as `commnd`, then you will see an error like this:
```shell
I0805 10:43:25.129850 46757 schema.go:126] unknown field: commnd
I0805 10:43:25.129973 46757 schema.go:129] this may be a false alarm, see https://github.com/kubernetes/kubernetes/issues/6842
pods/mypod
```
<!-- TODO: Now that #11914 is merged, this advice may need to be updated -->
The next thing to check is whether the pod on the apiserver
matches the pod you meant to create (e.g. in a yaml file on your local machine).
For example, run `kubectl get pods/mypod -o yaml > mypod-on-apiserver.yaml` and then
manually compare the original pod description, `mypod.yaml`, with the one you got
back from the apiserver, `mypod-on-apiserver.yaml`. There will typically be some
lines on the "apiserver" version that are not on the original version. This is
expected. However, if there are lines on the original that are not on the apiserver
version, then this may indicate a problem with your pod spec.
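
A minimal sketch of that comparison, using the file names from this example:

```shell
# Fetch the Pod as the apiserver sees it
kubectl get pods/mypod -o yaml > mypod-on-apiserver.yaml

# Lines present only locally (marked `<` by diff) may point to fields that were ignored
diff mypod.yaml mypod-on-apiserver.yaml
```
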
### Debugging Replication Controllers

Replication controllers are fairly straightforward. They can either create Pods or they can't. If they can't
create pods, then please refer to the [instructions above](#debugging-pods) to debug your pods.

You can also use `kubectl describe rc ${CONTROLLER_NAME}` to introspect events related to the replication
controller.
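
For a quick check of whether the controller is keeping up with its desired replica count, a minimal sketch (substitute your controller's name for `${CONTROLLER_NAME}`):

```shell
# DESIRED vs CURRENT shows whether the controller is able to create its pods
kubectl get rc ${CONTROLLER_NAME}

# Recent events often explain why pod creation is failing
kubectl describe rc ${CONTROLLER_NAME}
```
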
### Debugging Services

Services provide load balancing across a set of pods. There are several common problems that can make Services
not work properly. The following instructions should help debug Service problems.

First, verify that there are endpoints for the service. For every Service object, the apiserver makes an `endpoints` resource available.
You can view this resource with:
```shell
kubectl get endpoints ${SERVICE_NAME}
```
Make sure that the endpoints match up with the number of pods that you expect to be members of your service.
For example, if your Service is for an nginx container with 3 replicas, you would expect to see three different
IP addresses in the Service's endpoints.
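
With three healthy nginx replicas behind the service, the output might look something like this (the Service name and IP addresses here are purely illustrative):

```shell
NAME       ENDPOINTS                                      AGE
my-nginx   10.244.0.10:80,10.244.1.12:80,10.244.2.9:80   3d
```
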
#### My service is missing endpoints

If you are missing endpoints, try listing pods using the labels that the Service uses. Imagine that you have
a Service where the labels are:

```yaml
...
spec:
  selector:
    name: nginx
    type: frontend
```
You can use:
```shell
kubectl get pods --selector=name=nginx,type=frontend
```
to list pods that match this selector. Verify that the list matches the Pods that you expect to provide your Service.
If the list of pods matches expectations, but your endpoints are still empty, it's possible that you don't
have the right ports exposed. If your Service specifies a named `targetPort`, but the selected Pods don't
declare a container port with that name, those Pods won't be added to the endpoints list.

Verify that the pod's `containerPort` matches up with the Service's `targetPort`.
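
A minimal sketch for comparing the two (standard JSONPath queries; `${SERVICE_NAME}` and the label selector are the ones from this example):

```shell
# The port(s) the Service forwards traffic to
kubectl get service ${SERVICE_NAME} -o jsonpath='{.spec.ports[*].targetPort}'

# The port(s) the selected Pods actually declare
kubectl get pods --selector=name=nginx,type=frontend \
  -o jsonpath='{.items[*].spec.containers[*].ports[*].containerPort}'
```
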
#### Network traffic is not forwarded

If you can connect to the service, but the connection is immediately dropped, and there are endpoints
in the endpoints list, it's likely that the proxy can't contact your pods.

There are three things to check:

* Are your pods working correctly? Look for restart count, and [debug pods](#debugging-pods).
* Can you connect to your pods directly? Get the IP address for the Pod, and try to connect directly to that IP (see the sketch after this list).
* Is your application serving on the port that you configured? Kubernetes doesn't do port remapping, so if your application serves on 8080, the `containerPort` field needs to be 8080.
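
A minimal sketch of the direct-connection check, assuming an HTTP application on port 8080 (the `busybox` image and the port are just examples):

```shell
# Find the Pod's IP address
kubectl get pod ${POD_NAME} -o wide

# From a throwaway debugging pod, try to reach that IP directly
kubectl run -it --rm debug --image=busybox --restart=Never -- \
  wget -qO- http://<POD_IP>:8080/
```
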
{{% /capture %}}
{{% capture whatsnext %}}
If none of the above solves your problem, follow the instructions in the [Debugging Services document](/docs/user-guide/debugging-services) to make sure that your `Service` is running, has `Endpoints`, and your `Pods` are actually serving; that you have DNS working and iptables rules installed; and that kube-proxy does not seem to be misbehaving.
You may also visit the [troubleshooting document](/docs/troubleshooting/) for more information.
{{% /capture %}}