---
title: Troubleshooting kubeadm
---
{{% capture overview %}}
As with any program, you might run into an error while installing or running kubeadm. This page lists
common failure scenarios and provides steps to help you understand and, where possible,
fix the problem.

If your problem is not listed below, follow these steps:

- If you think your problem is a bug with kubeadm:
  - Go to [github.com/kubernetes/kubeadm](https://github.com/kubernetes/kubeadm/issues) and search for existing issues.
  - If no issue exists, please [open one](https://github.com/kubernetes/kubeadm/issues/new) and follow the issue template.

- If you are unsure about how kubeadm or Kubernetes works and would like
  support, ask on Slack in #kubeadm, or open a question on StackOverflow. Please include
  relevant tags like `#kubernetes` and `#kubeadm` so folks can help you.

If your cluster is in an error state, you may have a configuration problem if you see Pod statuses like `RunContainerError`,
`CrashLoopBackOff` or `Error`. If this is the case, please read below.
{{% /capture %}}
#### `ebtables` or some similar executable not found during installation
If you see the following warnings while running `kubeadm init`:
```
[preflight] WARNING: ebtables not found in system path
[preflight] WARNING: ethtool not found in system path
```
Then you may be missing `ebtables`, `ethtool` or a similar executable on your Linux machine. You can install them with the following commands:

- For Ubuntu/Debian users, run `apt install ebtables ethtool`.
- For CentOS/Fedora users, run `yum install ebtables ethtool`.
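
Before running `kubeadm init`, you can check which of these executables are missing. The following is a minimal sketch; the helper name `missing_tools` is made up for this example and is not part of kubeadm.

```bash
# Print the name of every given executable that is not found in PATH.
# missing_tools is a hypothetical helper used only for illustration.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Lists whichever of the two tools from the preflight warnings is absent.
missing_tools ebtables ethtool
```

If the command prints nothing, both executables are already on the system path.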
#### kubeadm blocks waiting for control plane during installation
If you notice that `kubeadm init` hangs after printing out the following line:
```
[apiclient] Created API client, waiting for the control plane to become ready
```
This may be caused by a number of problems. The most common are:

- Network connection problems. Check that your machine has full network connectivity before continuing.
- The default cgroup driver configuration for the kubelet differs from that used by Docker.
Check the system log file (for example, `/var/log/messages`) or examine the output from `journalctl -u kubelet`. If you see something like the following:
```shell
error: failed to run Kubelet: failed to create kubelet:
misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
```
There are two common ways to fix the cgroup driver problem:

1. Install Docker again following the instructions in
[Installing Docker](/docs/setup/independent/install-kubeadm/#installing-docker).
1. Change the kubelet config to match the Docker cgroup driver manually. Refer to
[Configure cgroup driver used by kubelet on Master Node](/docs/setup/independent/install-kubeadm/#configure-cgroup-driver-used-by-kubelet-on-master-node)
for detailed instructions.

The `kubectl describe pod` or `kubectl logs` commands can help you diagnose errors. For example:
```bash
kubectl -n ${NAMESPACE} describe pod ${POD_NAME}
kubectl -n ${NAMESPACE} logs ${POD_NAME} -c ${CONTAINER_NAME}
```
- Control plane Docker containers are crashlooping or hanging. You can check this by running `docker ps` and investigating each container by running `docker logs`.
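
For the cgroup driver case, the two settings can be compared directly instead of waiting for the kubelet error. This is a rough sketch under a few assumptions: the kubelet flag is set as `--cgroup-driver=...` in the kubeadm drop-in file `/etc/systemd/system/kubelet.service.d/10-kubeadm.conf`, and Docker is running on the node. The helper names `kubelet_cgroup_driver` and `compare_drivers` are invented for this example.

```bash
# Extract the value of --cgroup-driver from text read on stdin.
# (Hypothetical helper; prints nothing if the flag is not present.)
kubelet_cgroup_driver() {
  sed -n 's/.*--cgroup-driver=\([A-Za-z]*\).*/\1/p' | head -n 1
}

# Report whether the kubelet driver ($1) and the Docker driver ($2) agree.
compare_drivers() {
  if [ "$1" = "$2" ]; then
    echo "match: $1"
  else
    echo "mismatch: kubelet=$1 docker=$2"
  fi
}

# Usage on a node (requires Docker and the kubeadm drop-in file):
#   compare_drivers \
#     "$(kubelet_cgroup_driver < /etc/systemd/system/kubelet.service.d/10-kubeadm.conf)" \
#     "$(docker info --format '{{.CgroupDriver}}')"
```

If the kubelet value comes back empty, the flag is probably set somewhere else on your system, so treat a reported mismatch as a hint to look further and not as a final diagnosis.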
#### kubeadm blocks when removing managed containers
The following could happen if Docker halts and does not remove any Kubernetes-managed containers:
```bash
sudo kubeadm reset
[preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers
(block)
```
A possible solution is to restart the Docker service and then re-run `kubeadm reset`:
```bash
sudo systemctl restart docker.service
sudo kubeadm reset
```
#### Pods in `RunContainerError`, `CrashLoopBackOff` or `Error` state
Right after `kubeadm init` there should not be any such Pods. If there are Pods in
such a state _right after_ `kubeadm init`, please open an issue in the kubeadm repo.
`kube-dns` should be in the `Pending` state until you have deployed the network solution.

However, if you see Pods in the `RunContainerError`, `CrashLoopBackOff` or `Error` state
after deploying the network solution and nothing happens to `kube-dns`, it's very
likely that the Pod Network solution that you installed is somehow broken. You
might have to grant it more RBAC privileges or use a newer version. Please file
an issue in the Pod Network provider's issue tracker and get the issue triaged there.
#### `kube-dns` is stuck in the `Pending` state
This is **expected** and part of the design. kubeadm is network provider-agnostic, so the admin
should [install the pod network solution](/docs/concepts/cluster-administration/addons/)
of choice. You have to install a Pod Network
before `kube-dns` can be fully deployed. Hence the `Pending` state before the network is set up.
#### `HostPort` services do not work
The availability of `HostPort` and `HostIP` functionality depends on your Pod Network
provider. Please contact the author of the Pod Network solution to find out whether
`HostPort` and `HostIP` functionality is available.
Verified HostPort CNI providers:
- Calico
- Canal
- Flannel
For more information, read the [CNI portmap documentation](https://github.com/containernetworking/plugins/blob/master/plugins/meta/portmap/README.md).

If your network provider does not support the portmap CNI plugin, you may need to use the [NodePort feature of
services](/docs/concepts/services-networking/service/#type-nodeport) or use `HostNetwork=true`.
#### Pods are not accessible via their Service IP
Many network add-ons do not yet enable [hairpin mode](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/#a-pod-cannot-reach-itself-via-service-ip)
which allows pods to access themselves via their Service IP if they don't know about their podIP. This is an issue
related to [CNI](https://github.com/containernetworking/cni/issues/476). Please contact the provider of your network
add-on to find out whether it supports hairpin mode.

If you are using VirtualBox (directly or via Vagrant), you will need to
ensure that `hostname -i` returns a routable IP address (i.e. one on the
second network interface, not the first one). By default, it doesn't do this
and kubelet ends up using the first non-loopback network interface, which is
usually NATed. Workaround: modify `/etc/hosts`. Take a look at this
`Vagrantfile` ([ubuntu-vagrantfile](https://github.com/errordeveloper/k8s-playground/blob/22dd39dfc06111235620e6c4404a96ae146f26fd/Vagrantfile#L11)) for how this can be achieved.
#### TLS certificate errors
The following error indicates a possible certificate mismatch.
```
# kubectl get po
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
```
Verify that the `$HOME/.kube/config` file contains a valid certificate, and regenerate a certificate if necessary.
Another workaround is to overwrite the default `kubeconfig` for the "admin" user:
```
mv $HOME/.kube $HOME/.kube.bak
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
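
Before replacing the kubeconfig, it can help to confirm that its embedded CA really differs from the cluster CA. The sketch below assumes a default kubeadm layout (cluster CA at `/etc/kubernetes/pki/ca.crt`), a kubeconfig that embeds the CA as `certificate-authority-data`, and that `openssl` and a `base64` supporting `-d` are installed. The helper name `kubeconfig_ca` is hypothetical.

```bash
# Decode the first certificate-authority-data entry from a kubeconfig read on stdin.
# (Hypothetical helper; assumes the CA is embedded inline, as in admin.conf.)
kubeconfig_ca() {
  awk '/certificate-authority-data:/ { print $2; exit }' | base64 -d
}

# Usage on the master node; the two fingerprints should be identical:
#   kubeconfig_ca < "$HOME/.kube/config" | openssl x509 -noout -fingerprint
#   sudo openssl x509 -noout -fingerprint -in /etc/kubernetes/pki/ca.crt
```

Differing fingerprints suggest the kubeconfig was generated for another cluster or for an earlier `kubeadm init` run.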
### Default NIC when using flannel as the pod network in Vagrant

The following error might indicate that something is wrong in the pod network:
```
Error from server (NotFound): the server could not find the requested resource
```
If you're using flannel as the pod network inside Vagrant, then you will have to specify the default interface name for flannel.

Vagrant typically assigns two interfaces to all VMs. The first, for which all hosts are assigned the IP address `10.0.2.15`, is for external traffic that gets NATed.

This may lead to problems with flannel. By default, flannel selects the first interface on a host. This leads to all hosts thinking they have the same public IP address. To prevent this issue, pass the `--iface eth1` flag to flannel so that the second interface is chosen.
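
If you are not sure which interface name to pass to `--iface`, you can list the interfaces whose IPv4 address is not the shared NAT address. This is a small sketch that filters the output of `ip -o -4 addr show`; the helper name `non_nat_ifaces` is invented for this example, and it assumes the NAT address is `10.0.2.15` as described above.

```bash
# Read `ip -o -4 addr show` output on stdin and print the names of interfaces
# that are neither loopback nor bound to the Vagrant NAT address 10.0.2.15.
# (Hypothetical helper used only for illustration.)
non_nat_ifaces() {
  awk '$3 == "inet" && $2 != "lo" && $4 !~ /^10\.0\.2\.15\// { print $2 }'
}

# Usage inside the VM:
#   ip -o -4 addr show | non_nat_ifaces
```

On a typical two-interface Vagrant VM this should leave only the second interface, which is the one flannel needs.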
### Routing errors
In some situations `kubectl logs` and `kubectl run` commands may return with the following errors despite an otherwise apparently correctly working cluster:
```
Error from server: Get https://10.19.0.41:10250/containerLogs/default/mysql-ddc65b868-glc5m/mysql: dial tcp 10.19.0.41:10250: getsockopt: no route to host
```
This is due to Kubernetes using an IP that cannot communicate with other IPs on the seemingly same subnet, possibly by policy of the machine provider. As an example, Digital Ocean assigns a public IP to `eth0` as well as a private one to be used internally as an anchor for their floating IP feature, yet `kubelet` will pick the latter as the node's `InternalIP` instead of the public one.

Use `ip addr show` to check for this scenario instead of `ifconfig`, because `ifconfig` will not display the offending alias IP address. Alternatively, an API endpoint specific to Digital Ocean lets you query the anchor IP from the droplet:
```
curl http://169.254.169.254/metadata/v1/interfaces/public/0/anchor_ipv4/address
```
The workaround is to tell `kubelet` which IP to use with the `--node-ip` flag. When using Digital Ocean, it can be the public one (assigned to `eth0`) or the private one (assigned to `eth1`) should you want to use the optional private network. For example:
```
IFACE=eth0  # change to eth1 for DO's private network
# Take the first global-scope IPv4 address on the interface and strip the prefix length.
DROPLET_IP_ADDRESS=$(ip -o -4 addr show dev "$IFACE" scope global | awk '{ split($4, a, "/"); print a[1]; exit }')
echo "$DROPLET_IP_ADDRESS"  # check this, just in case
# Append the flag to the kubeadm drop-in for the kubelet unit (run as root).
echo "Environment=\"KUBELET_EXTRA_ARGS=--node-ip=$DROPLET_IP_ADDRESS\"" >> /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
```
Please note that this assumes `KUBELET_EXTRA_ARGS` hasn't already been set in the unit file.

Then restart `kubelet`:
```
systemctl daemon-reload
systemctl restart kubelet
```