Commit Graph

7174 Commits (k3s-v1.14.1)

Author SHA1 Message Date
k8s-ci-robot 5401f9458b
Merge pull request #67877 from tianshapjq/podprefix-used
use podPrefix as it's defined
2018-09-25 13:29:17 -07:00
David Eads c76f8f194c make sure that log includes user information 2018-09-25 14:10:09 -04:00
k8s-ci-robot a8e8e891f2
Merge pull request #68289 from denkensk/fix-simple-pkg-kubelet
Simple code fixed in in pkg/kubelet
2018-09-25 06:14:03 -07:00
k8s-ci-robot c16691037d
Merge pull request #68255 from leakingtapan/golint-fix-ebs
Fix golint for pkg/volume/aws_ebs
2018-09-25 06:13:33 -07:00
Mayank Kumar ef976f1f70 add validation for etc resolve parsing 2018-09-24 21:01:59 -07:00
FengyunPan2 34a8b1fd9f Add helpful log for checking cgrop path
Currently I just get 'xxx cgroup does not exist', but I don't know
which path has missed. Let's add log for it.
2018-09-25 10:10:12 +08:00
k8s-ci-robot 8346631860
Merge pull request #68053 from Pingan2017/rmifblock
clean up unneeded else block
2018-09-24 17:17:29 -07:00
Benjamin Elder 8b56eb8588 hack/update-gofmt.sh 2018-09-24 12:21:29 -07:00
Benjamin Elder f828c6f662 hack/update-bazel.sh 2018-09-24 12:03:24 -07:00
Benjamin Elder 088cf3c37b find & replace version import 2018-09-24 12:03:24 -07:00
k8s-ci-robot 170dcc2ea0
Merge pull request #68754 from bradhoekstra/optional-service-env-variables
kubelet: Make service environment variables optional
2018-09-24 10:59:32 -07:00
Renaud Gaubert 79056292aa Update pluginwatcher doc 2018-09-24 15:11:21 +02:00
Cheng Pan 000e30086b fix golint for pkg/volume/aws_ebs 2018-09-22 05:56:05 +00:00
Brad Hoekstra 69551689d5 Fix spelling 2018-09-22 00:07:08 -04:00
Brad Hoekstra 42da186b62 Address review comments 2018-09-21 20:06:32 -04:00
Brad Hoekstra c4ec40eca8 Update comment to reflect the new logic 2018-09-21 16:26:37 -04:00
Renaud Gaubert 63436ab4a3 Renamed pluginwatcher README to README.md 2018-09-21 16:25:33 +02:00
FengyunPan2 6af9e97fa5 Configure resource-only container with memory limit
Fixed: #68928
The docker memory limit should base on the memory capacity of
machine. Currently CgroupManager specify wrong memory limit.
2018-09-21 17:50:54 +08:00
Krzysztof Jastrzebski ad330f7dbe Start synchronizing pods after network is ready. 2018-09-21 10:12:49 +02:00
k8s-ci-robot fb50b3cb32
Merge pull request #67793 from fisherxu/use_ctx
Refactor grpc dial with dialcontext
2018-09-20 20:35:36 -07:00
Krzysztof Jastrzebski 3b21995c95 Process only CPU and memory stats when Kubelete stats API is called with
only_cpu_and_memory parameter. Before all stats were processed and
removed before returning.
2018-09-20 12:35:56 +02:00
Pingan2017 5de6ada98f fix a small typo 2018-09-20 16:04:12 +08:00
k8s-ci-robot 3429b9aca4
Merge pull request #62544 from astefanutti/56297
Init Kubelet runtime cache before dependent stats provider
2018-09-19 08:38:16 -07:00
Davanum Srinivas 02489f8988
Avoid setting Masked/ReadOnly paths when pod is privileged
In the recent PR on adding ProcMount, we introduced a regression when
pods are privileged. This shows up in 18.06 docker with kubeadm in the
kube-proxy container.

The kube-proxy container is privilged, but we end up setting the
`/proc/sys` to Read-Only which causes failures when running kube-proxy
as a pod. This shows up as a failure when using sysctl to set various
network things.

Change-Id: Ic61c4c9c961843a4e064e783fab0b54350762a8d
2018-09-18 17:46:16 -04:00
Brad Hoekstra e8366c8e99 Fix to inject KUBERNETES_ env vars when enableServiceLinks is
false and the pod is in the master namespace.
2018-09-17 16:28:49 -04:00
Brad Hoekstra ac8799a80d kubelet: Make service environment variables optional 2018-09-17 16:27:36 -04:00
Pingan2017 158552ff35 fix golint failures - /pkg/kubelet/images 2018-09-17 10:52:25 +08:00
k8s-ci-robot fb79943553
Merge pull request #67951 from liggitt/remove-deprecated-flags
Remove deprecated feature flags
2018-09-15 14:50:11 -07:00
Pingan2017 2f2c4ebc14 del internalError 2018-09-13 11:25:26 +08:00
k8s-ci-robot 9b8b6571a2
Merge pull request #68521 from yujuhong/nil-client
kubelet: skip initializing/using the RuntimeClass in standalone mode
2018-09-12 15:05:12 -07:00
k8s-ci-robot 37ef6eeb6d
Merge pull request #68431 from dashpole/cadvisor_godep_update
Update cAdvisor godeps to v0.31.0
2018-09-12 15:04:53 -07:00
Jon Friesen b971c3e200 Fix golint for pkg/probe
This change adds comments to exported things and renames the tcp,
http, and exec probe interfaces to just be Prober within their
namespace.

Issue #68026
2018-09-12 14:18:16 -07:00
Yu-Ju Hong a1f7ae7ab3 kubelet: skip initializing/using the RuntimeClass in standalone mode
In standalone mode, kubelet will not be configured to talk to an
apiserver. The RuntimeClass manager should be disabled in this case.
2018-09-11 13:21:53 -07:00
k8s-ci-robot 25cbd1c753
Merge pull request #67781 from dashpole/fix_priority_tests
Fix priority tests
2018-09-10 12:48:05 -07:00
David Ashpole 788196e45b update cadvisor to v0.31.0 2018-09-10 10:31:56 -07:00
knight a578c707c3 refactor kubelet/network/dns 2018-09-10 17:32:28 +08:00
Kubernetes Submit Queue 60ec6bf359
Merge pull request #64867 from dixudx/missing_container_ready_ltt
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

add missing LastTransitionTime of ContainerReady condition

**What this PR does / why we need it**:
add missing LastTransitionTime of ContainerReady condition

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
xref #64646

**Special notes for your reviewer**:
/cc freehan yujuhong

**Release note**:

```release-note
add missing LastTransitionTime of ContainerReady condition
```
2018-09-08 17:22:30 -07:00
fisherxu 89f3fa3d62 use dailcontext 2018-09-08 16:07:38 +08:00
David Ashpole 90f58c1157 critical pod test should not rely on feature gate set in framework; non-critical pods are always preemptable 2018-09-07 17:43:42 -07:00
Clayton Coleman 7e398dc31f
Remove dependency on docker daemon for core credential types
We are removing dependencies on docker types where possible in the core
libraries. credentialprovider is generic to Docker and uses a public API
(the config file format) that must remain stable. Create an equivalent type
and use a type cast (which would error if we ever change the type) in the
dockershim. We already perform a transformation like this for CRI and so
we aren't changing much.
2018-09-07 16:36:14 -04:00
Kubernetes Submit Queue a6eb49f0dc
Merge pull request #68195 from luxas/consolidate_componentconfig_code_standards
Automatic merge from submit-queue (batch tested with PRs 67950, 68195). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Consolidate componentconfig code standards

**What this PR does / why we need it**:

This PR fixes a bunch of very small misalignments in ComponentConfig packages:
 - Add sane comments to all functions/variables in componentconfig `register.go` files
 - Make the `register.go` files of componentconfig pkgs follow the same pattern and not differ from each other like they do today.
 - Register the `openapi-gen` tag in all `doc.go` files where the pkg contains _external_ types.
 - Add the `groupName` tag where missing
 - Fix cases where `addKnownTypes` was registered twice in the `SchemeBuilder`
 - Add `Readme` and `OWNERS` files to `Godeps` directories if missing.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:


**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/assign @sttts @thockin
2018-09-07 11:19:40 -07:00
David Ashpole 137c6d638e remove feature gate from kubelet defaulting 2018-09-06 18:17:09 -07:00
Kubernetes Submit Queue 4bb3712a75
Merge pull request #68119 from WanLinghao/token_controller_cachekey_fix
Automatic merge from submit-queue (batch tested with PRs 68119, 68191). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

fix token controller keyFunc bug

Currently, token manager use keyFunc like: `fmt.Sprintf("%q/%q/%#v", name, namespace, tr.Spec)`.
Since tr.Spec contains point fields, new token request would not reuse the cache at all.
This patch fix this, also adds unit test.

```release-note
NONE
```
2018-09-06 16:20:36 -07:00
Krzysztof Jastrzebski 138a3c7172 Add "only_cpu_and_memory" GET parameter to /stats/summary http handler in kubelet. If parameter is true then only cpu and memory will be present in response. The parameter will be used by Metric Server to avoid sending/decoding unneeded data. 2018-09-06 21:49:00 +02:00
WanLinghao 794e665d7b Currently, token manager use keyFunc like: `fmt.Sprintf("%q/%q/%#v", name, namespace, tr.Spec)`.
Since tr.Spec contains point fields, new token request would not reuse
the cache at all.  This patch fix this, also adds unit test.

Signed-off-by: Mike Danese <mikedanese@google.com>
2018-09-06 09:03:26 -07:00
Renaud Gaubert 8dd1d27c03 Updated the device manager pluginwatcher handler 2018-09-06 15:34:46 +02:00
Renaud Gaubert 78b55eb5bf Updated the CSI pluginwatcher handler 2018-09-06 15:34:46 +02:00
Renaud Gaubert 29d225e90c Update pluginwatcher tests 2018-09-06 14:44:03 +02:00
Renaud Gaubert 4d18aa63cd Refactor pluginwatcher to use the new API 2018-09-06 14:42:21 +02:00
Renaud Gaubert 2eb91e89c0 Update the plugin watcher interface 2018-09-06 14:42:21 +02:00
Lucas Käldström 83d53ea1c2
Standardize componentconfig code/comment patterns 2018-09-06 13:42:02 +03:00
Kubernetes Submit Queue 4bc9e94fee
Merge pull request #67690 from feiskyer/iptables-cross
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Kubelet: only sync iptables on linux

**What this PR does / why we need it**:

Iptables is only supported on Linux, kubelet should only sync NAT rules on Linux.

Without this PR, Kubelet on Windows would logs following errors on each `syncNetworkUtil()`:

```
kubelet.err.log:4692:E0711 22:03:42.103939    2872 kubelet_network.go:102] Failed to ensure that nat chain KUBE-MARK-DROP exists: error creating chain "KUBE-MARK-DROP": executable file
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65713

**Special notes for your reviewer**:

**Release note**:

```release-note
Kubelet now only sync iptables on Linux.
```
2018-09-05 22:55:15 -07:00
wangqingcan 6506e0c51a Simple code and typo fixed in in kubelet 2018-09-06 09:12:39 +08:00
Kubernetes Submit Queue 0df5d8d205
Merge pull request #67909 from tallclair/runtimeclass-kubelet
Automatic merge from submit-queue (batch tested with PRs 68161, 68023, 67909, 67955, 67731). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Dynamic RuntimeClass implementation

**What this PR does / why we need it**:

Implement RuntimeClass using the dynamic client to break the dependency on https://github.com/kubernetes/kubernetes/pull/67791

Once (if) https://github.com/kubernetes/kubernetes/pull/67791 merges, I will migrate to the typed client.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
For https://github.com/kubernetes/features/issues/585

**Release note**:
Covered by #67737
```release-note
NONE
```

/sig node
/kind feature
/priority important-soon
/milestone v1.12
2018-09-05 14:51:47 -07:00
Kubernetes Submit Queue 70a0089ae6
Merge pull request #68200 from RenaudWasTaken/pluginwatcher-beta
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

KubeletPluginsWatcher feature is beta in 1.12 release

*What this PR does / why we need it:*
Graduates DevicePlugins feature to beta.

*Which issue(s) this PR fixes:*
Related but does not fix: https://github.com/kubernetes/features/issues/595 as well as https://github.com/kubernetes/kubernetes/issues/65773

*Special notes for your reviewer:*
Includes upgrading the gRPC pluginwatcher API to beta. Based on the [device plugin model](https://github.com/kubernetes/kubernetes/pull/59588).

*Depends on https://github.com/kubernetes/kubernetes/pull/64621 being merged* 

Release note:

```release-note
KubeletPluginsWatcher feature graduates to beta.
```

/sig node
/sig storage

/cc @vladimirvivien @sbezverk @vikaschoudhary16 @saad-ali @vishh @jiayingz
2018-09-05 13:18:39 -07:00
wangqingcan b0c308f082 Simple code and typo fixed in in pkg/kubelet 2018-09-05 21:51:32 +08:00
Kubernetes Submit Queue 743e4fba63
Merge pull request #67709 from feiskyer/inodes-clean
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

 Kubelet: only apply default hard evictions of nodefs.inodesFree on Linux

**What this PR does / why we need it**:

Kubelet sets default hard evictions of `nodefs.inodesFree ` for all platforms today. This will cause errors on Windows and a lot `no observation found for eviction signal nodefs.inodesFree` errors will be logs for kubelet.

```
kubelet.err.log:4961:W0711 22:21:12.378789    2872 helpers.go:808] eviction manager: no observation found for eviction signal nodefs.inodesFree
kubelet.err.log:4967:W0711 22:21:30.411371    2872 helpers.go:808] eviction manager: no observation found for eviction signal nodefs.inodesFree
kubelet.err.log:4974:W0711 22:21:48.446456    2872 helpers.go:808] eviction manager: no observation found for eviction signal nodefs.inodesFree
kubelet.err.log:4978:W0711 22:22:06.482441    2872 helpers.go:808] eviction manager: no observation found for eviction signal nodefs.inodesFree
```

This PR updates the default hard eviction value and only apply nodefs.inodesFree on Linux.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66088

**Special notes for your reviewer**:

**Release note**:

```release-note
Kubelet only applies default hard evictions of nodefs.inodesFree on Linux
```
2018-09-04 23:08:30 -07:00
Kubernetes Submit Queue 8f906fefae
Merge pull request #66427 from feiskyer/win-pods-stats
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Add kubelet stats for windows system container "pods"

**What this PR does / why we need it**:

This PR adds kubelet stats for windows system container "pods". Without this, kubelet will always logs error: 

```
kubelet.err.log:4832:E0711 22:12:49.241358    2872 helpers.go:735] eviction manager: failed to construct signal: "allocatableMemory.available" error: system container "pods" not found
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66087

**Special notes for your reviewer**:

/sig windows
/sig node

**Release note**:

```release-note
Add kubelet stats for windows system container "pods"
```
2018-09-04 21:59:49 -07:00
Pengfei Ni 376b45cb64 Fix unit tests for Windows
* TestMakeBlockVolume is moved to Linux only.
* TestMakeMounts are running on both Linux and Windows
2018-09-05 10:22:53 +08:00
Pengfei Ni aeea967149 Kubelet: only sync iptables on linux 2018-09-05 10:22:48 +08:00
Tim Allclair 63f3bc1b7e
Implement RuntimeClass support for the Kubelet & CRI 2018-09-04 13:45:11 -07:00
Renaud Gaubert 44dd0672b6 Add pluginwatcher generated files 2018-09-04 20:22:59 +02:00
Renaud Gaubert f8e80e45e7 Create pkg/kubelet/apis/pluginregistration/v1beta1 directory 2018-09-04 20:22:59 +02:00
Pengfei Ni 8255318b96 Kubelet: do not report used inodes on Windows 2018-09-03 16:42:33 +08:00
Pengfei Ni e1fdaa177f Kubelet: only apply default hard evictions of nodefs.inodesFree on Linux 2018-09-03 16:42:30 +08:00
Lucas Käldström 8b6a7ee075
autogenerated go code, godeps, bazel and gofmt 2018-09-02 14:38:59 +03:00
Lucas Käldström 15760506c2
Move the kubelet's external types to k8s.io/kubelet 2018-09-02 14:19:38 +03:00
Lucas Käldström 0707b1274f
Automated package reference rename 2018-09-02 14:15:38 +03:00
Sandor Szücs 588d2808b7
fix #51135 make CFS quota period configurable, adds a cli flag and config option to kubelet to be able to set cpu.cfs_period and defaults to 100ms as before.
It requires to enable feature gate CustomCPUCFSQuotaPeriod.

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
2018-09-01 20:19:59 +02:00
Kubernetes Submit Queue 33cca5251c
Merge pull request #67255 from bertinatto/promote_mount_propagation
Automatic merge from submit-queue (batch tested with PRs 65251, 67255, 67224, 67297, 68105). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Promote mount propagation to GA

**What this PR does / why we need it**:

This PR promotes mount propagation to GA.

Website PR: https://github.com/kubernetes/website/pull/9823

**Release note**:

```release-note
Mount propagation has promoted to GA. The `MountPropagation` feature gate is deprecated and will be removed in 1.13.
```
2018-08-31 19:25:30 -07:00
Kubernetes Submit Queue 85300f4f5d
Merge pull request #67803 from saad-ali/csiClusterReg3
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

CSI Cluster Registry and Node Info CRDs

**What this PR does / why we need it**:
Introduces the new `CSIDriver` and `CSINodeInfo` API Object as proposed in https://github.com/kubernetes/community/pull/2514 and https://github.com/kubernetes/community/pull/2034

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/features/issues/594

**Special notes for your reviewer**:
Per the discussion in https://groups.google.com/d/msg/kubernetes-sig-storage-wg-csi/x5CchIP9qiI/D_TyOrn2CwAJ the API is being added to the staging directory of the `kubernetes/kubernetes` repo because the consumers will be attach/detach controller and possibly kubelet, but it will be installed as a CRD (because we want to move in the direction where the API server is Kubernetes agnostic, and all Kubernetes specific types are installed).

**Release note**:

```release-note
Introduce CSI Cluster Registration mechanism to ease CSI plugin discovery and allow CSI drivers to customize Kubernetes' interaction with them.
```

CC @jsafrane
2018-08-31 16:46:41 -07:00
Kubernetes Submit Queue 39004e852b
Merge pull request #64283 from jessfraz/ProcMountType
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Add a ProcMount option to the SecurityContext & AllowedProcMountTypes to PodSecurityPolicy

So there is a bit of a chicken and egg problem here in that the CRI runtimes will need to implement this for there to be any sort of e2e testing.

**What this PR does / why we need it**: This PR implements design proposal https://github.com/kubernetes/community/pull/1934. This adds a ProcMount option to the SecurityContext and AllowedProcMountTypes to PodSecurityPolicy

Relies on https://github.com/google/cadvisor/pull/1967

**Release note**:

```release-note
ProcMount added to SecurityContext and AllowedProcMounts added to PodSecurityPolicy to allow paths in the container's /proc to not be masked.
```

cc @Random-Liu @mrunalp
2018-08-31 16:46:33 -07:00
Jan Safranek 7d673cb8f0 Pass new CSI API Client and informer to Volume Plugins 2018-08-31 12:25:59 -07:00
Fabio Bertinatto b87a57a111 Promote mount propagation to GA 2018-08-31 10:04:51 +02:00
Kubernetes Submit Queue c1e37a5f16
Merge pull request #66056 from mikedanese/fixhang
Automatic merge from submit-queue (batch tested with PRs 67349, 66056). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

wait until apiserver connection before starting kubelet tls bootstrap

I wonder if this helps with sometimes slow network programming

cc @mwielgus @awly
2018-08-30 20:16:32 -07:00
Jess Frazelle 1a4cf7a36e
make update
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 18:24:23 -04:00
Mike Danese 2cf1c75e07 wait until apiserver connection before starting kubelet tls bootstrap 2018-08-30 11:37:05 -07:00
Jess Frazelle 20cc40a5dc
ProcMount: add dockershim support
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:40:06 -04:00
Jess Frazelle 31ffd9f881
vendor: update docker cadvisor winterm
This vendor change was purely for the changes in docker to allow for
setting the Masked and Read-only paths.

See: moby/moby#36644

But because of the docker dep update it also needed cadvisor to be
updated and winterm due to changes in pkg/tlsconfig in docker

See: google/cadvisor#1967

Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:40:05 -04:00
Jess Frazelle dbf7186bee
update jsonlog path for updated vendor
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:40:05 -04:00
Jess Frazelle 30dcca6233
ProcMount: add api options and feature gate
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:40:02 -04:00
Jess Frazelle 6b7c39a4f8
pkg/kubelet/apis/cri/runtime: add masked_paths and readonly_paths
generate runtime protobufs

Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
2018-08-30 11:39:18 -04:00
Pingan2017 2f1284bc34 cleanup unneeded if block 2018-08-30 17:18:56 +08:00
Lucas Käldström 844487aea4
autogenerated 2018-08-29 20:21:17 +03:00
Lucas Käldström 994ac98586
Update api violations, golint failures and gofmt 2018-08-29 20:21:09 +03:00
Lucas Käldström 7a840cb4c8
automated: Rename all package references 2018-08-29 19:07:52 +03:00
Lucas Käldström 62bfe29ce4
automated, boring: Rename pkg/kubelet/apis/{kubelet,}config 2018-08-29 18:59:05 +03:00
Kubernetes Submit Queue cd06419973
Merge pull request #67369 from tianshapjq/should-not-eventf-directly
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

should not event directly

**What this PR does / why we need it**:
should not event directly, using recordContainerEvent() to generate ref and deduplicate events instead.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
none
```
2018-08-28 16:18:13 -07:00
Kubernetes Submit Queue a26e1ddacc
Merge pull request #67739 from liggitt/hostname-override
Automatic merge from submit-queue (batch tested with PRs 67739, 65222). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Honor --hostname-override, report compatible hostname addresses with cloud provider

xref #67714

7828e5d made cloud providers authoritative for the addresses reported on Node objects, so that the addresses used by the node (and requested as SANs in serving certs) could be verified via cloud provider metadata.

This had the effect of no longer reporting addresses of type Hostname for Node objects for some cloud providers. Cloud providers that have the instance hostname available in metadata should add a `type: Hostname` address to node status. This is being tracked in #67714

This PR does a couple other things to ease the transition to authoritative cloud providers:
* if `--hostname-override` is set on the kubelet, make the kubelet report that `Hostname` address. if it can't be verified via cloud-provider metadata (for cert approval, etc), the kubelet deployer is responsible for fixing the situation by adjusting the kubelet configuration (as they were in 1.11 and previously)
* if `--hostname-override` is not set, *and* the cloud provider didn't report a Hostname address, *and* the auto-detected hostname matches one of the addresses the cloud provider *did* report, make the kubelet report that as a Hostname address. That lets the addresses remain verifiable via cloud provider metadata, while still including a `Hostname` address whenever possible.

/sig node
/sig cloud-provider

/cc @mikedanese

fyi @hh

```release-note
NONE
```
2018-08-28 12:31:00 -07:00
Jordan Liggitt e309bd3abf
Remove deprecated feature flags 2018-08-28 15:25:46 -04:00
Jordan Liggitt 2857de73ce
Honor --hostname-override, report compatible hostname addresses with cloud provider 2018-08-28 11:21:01 -04:00
Kubernetes Submit Queue 2eb14e3007
Merge pull request #64973 from nokia/k8s-sctp
Automatic merge from submit-queue (batch tested with PRs 67694, 64973, 67902). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

SCTP support implementation for Kubernetes

**What this PR does / why we need it**: This PR adds SCTP support to Kubernetes, including Service, Endpoint, and NetworkPolicy.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #44485

**Special notes for your reviewer**:

**Release note**:

```release-note

SCTP is now supported as additional protocol (alpha) alongside TCP and UDP in Pod, Service, Endpoint, and NetworkPolicy.  

```
2018-08-28 07:21:18 -07:00
Tim Allclair 62d56060b7 Remove unused kubelet dependency 2018-08-27 16:48:12 -07:00
tianshapjq 9daaf12397 use podPrefix as it's defined 2018-08-27 14:32:26 +08:00
Laszlo Janosi cbe94df8c6 gofmt update 2018-08-27 05:59:50 +00:00
Laszlo Janosi e466bdc67e Changes according to the approved KEP. SCTP is supported for HostPort and LoadBalancer. Alpha feature flag SCTPSupport controls the support of SCTP. Kube-proxy config parameter is removed. 2018-08-27 05:58:36 +00:00
Laszlo Janosi a6da2b1472 K8s SCTP support implementation for the first pull request
The requested Service Protocol is checked against the supported protocols of GCE Internal LB. The supported protocols are TCP and UDP.

SCTP is not supported by OpenStack LBaaS. If SCTP is requested in a Service with type=LoadBalancer, the request is rejected. Comment style is also corrected.

SCTP is not allowed for LoadBalancer Service and for HostPort. Kube-proxy can be configured not to start listening on the host port for SCTP: see the new SCTPUserSpaceNode parameter

changed the vendor github.com/nokia/sctp to github.com/ishidawataru/sctp. I.e. from now on we use the upstream version.

netexec.go compilation fixed. Various test cases fixed

SCTP related conformance tests removed. Netexec's pod definition and Dockerfile are updated to expose the new SCTP port(8082)

SCTP related e2e test cases are removed as the e2e test systems do not support SCTP

sctp related firewall config is removed from cluster/gce/util.sh. Variable name sctp_addr is corrected to sctpAddr in pkg/proxy/ipvs/proxier.go

cluster/gce/util.sh is copied from master
2018-08-27 05:56:27 +00:00
Michael Taufen 1b7d06e025 Kubelet creates and manages node leases
This extends the Kubelet to create and periodically update leases in a
new kube-node-lease namespace. Based on [KEP-0009](https://github.com/kubernetes/community/blob/master/keps/sig-node/0009-node-heartbeat.md),
these leases can be used as a node health signal, and will allow us to
reduce the load caused by over-frequent node status reporting.

- add NodeLease feature gate
- add kube-node-lease system namespace for node leases
- add Kubelet option for lease duration
- add Kubelet-internal lease controller to create and update lease
- add e2e test for NodeLease feature
- modify node authorizer and node restriction admission controller
to allow Kubelets access to corresponding leases
2018-08-26 16:03:36 -07:00
Kubernetes Submit Queue 83030032ad
Merge pull request #67425 from Lion-Wei/kubelet-ipv6
Automatic merge from submit-queue (batch tested with PRs 65247, 63633, 67425). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix kubelet iptclient in ipv6 cluster

**What this PR does / why we need it**:
Kubelet uses "iptables" instead of "ip6tables" in an ipv6-only cluster. This causes failed traffic for type: LoadBalancer services (and probably a lot of other problems).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #67398 

**Special notes for your reviewer**:


**Release note**:
```release-note
NONE
```
2018-08-23 14:15:12 -07:00
Kubernetes Submit Queue d67a03183a
Merge pull request #67687 from Lion-Wei/remote-reschrduler
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove rescheduler since scheduling DS pods by default scheduler is moving to beta

**What this PR does / why we need it**:

remove rescheduler since scheduling DS pods by default scheduler is moving to beta

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64725

**Special notes for your reviewer**:

**Release note**:
```release-note
Remove rescheduler since scheduling DS pods by default scheduler is moving to beta.
```
2018-08-23 12:32:17 -07:00
Kubernetes Submit Queue e46203c40d
Merge pull request #67031 from krzysztof-jastrzebski/node_startup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce latency to node ready after CIDR is assigned.

This adds code to execute an immediate runtime and node status update when the Kubelet sees that it has a CIDR, which significantly decreases the latency to node ready.

```release-note
Speed up kubelet start time by executing an immediate runtime and node status update when the Kubelet sees that it has a CIDR.
```
2018-08-23 10:37:30 -07:00
liangwei 67f4be87c0 fix kubelet iptclient in ipv6 cluster 2018-08-23 15:08:51 +08:00
Krzysztof Jastrzebski 7ffa4e17e0 Reduce latency to node ready after CIDR is assigned. 2018-08-22 10:43:58 +02:00
Kubernetes Submit Queue c491d48cde
Merge pull request #67430 from choury/cpumanager
Automatic merge from submit-queue (batch tested with PRs 67430, 67550). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

cpumanager: rollback state if updateContainerCPUSet failed

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63018

If `updateContainerCPUSet`  failed, the container will start failed. We should rollback the state to avoid CPU leak.
**Special notes for your reviewer**:

**Release note**:

```release-note
cpumanager: rollback state if updateContainerCPUSet failed
```
2018-08-21 23:20:58 -07:00
Kubernetes Submit Queue 444373b404
Merge pull request #67599 from neolit123/owners-kubelet
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add labels to kubelet OWNERS files

**What this PR does / why we need it**:

This change makes it possible to automatically add the two labels: `area/kubelet` to PRs that touch the paths in question.

this already exists for kubeadm:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/OWNERS#L17-L19

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
refs https://github.com/kubernetes/community/issues/1808

**Special notes for your reviewer**:
none

**Release note**:

```release-note
NONE
```
/area kubelet
@kubernetes/sig-node-pr-reviews
2018-08-21 21:10:28 -07:00
liangwei 5ea138f4e9 remove rescheduler 2018-08-22 11:49:14 +08:00
Kubernetes Submit Queue 7cd140aa4f
Merge pull request #67518 from tallclair/runtimeclass-cri
Automatic merge from submit-queue (batch tested with PRs 67298, 67518, 67635, 67673). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add RuntimeHandler to the CRI RunPodSandboxRequest

**What this PR does / why we need it**:

Adds the CRI portion of the [RuntimeClass](https://github.com/kubernetes/community/blob/master/keps/sig-node/0014-runtime-class.md#runtime-handler) API.

**Which issue(s) this PR fixes**:
For https://github.com/kubernetes/features/issues/585

**Special notes for your reviewer**:
The Kubernetes API is still blocked on a decision about alpha field usage, see [discussion on sig-architecture](https://groups.google.com/forum/#!topic/kubernetes-sig-architecture/y9FulL9Uq6A). I'd like to start with the CRI piece so we can unblock work on the CRI implementation side to have support ready when Kubernetes support is there.

**Release note**:
```release-note
[CRI] Adds a "runtime_handler" field to RunPodSandboxRequest, for selecting the runtime configuration to run the sandbox with (alpha feature).
```

/sig node
/milestone v1.12
/priority important-soon
/kind api-change
2018-08-21 18:33:04 -07:00
Lubomir I. Ivanov 1a1d236f61 Add labels to kubelet OWNERS files 2018-08-22 00:43:32 +03:00
Kubernetes Submit Queue c94ececccc
Merge pull request #67672 from dims/add-labels-to-owners-files
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add Labels to various OWNERS files

**What this PR does / why we need it**:

Will reduce the burden of manually adding labels. Information pulled
from:
https://github.com/kubernetes/community/blob/master/sigs.yaml

Change-Id: I17e661e37719f0bccf63e41347b628269cef7c8b

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-08-21 14:37:21 -07:00
Kubernetes Submit Queue 473ebb21d1
Merge pull request #67632 from feiskyer/verbose-fix
Automatic merge from submit-queue (batch tested with PRs 67661, 67497, 66523, 67622, 67632). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce verbose logs of node addresses requesting

**What this PR does / why we need it**:

Kubelet build from the master branch is flushing node addresses requesting logs, which is too verbose:

```sh
Aug 16 10:09:40 node-1 kubelet[24217]: I0816 10:09:40.658479   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:09:40 node-1 kubelet[24217]: I0816 10:09:40.666114   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
Aug 16 10:09:50 node-1 kubelet[24217]: I0816 10:09:50.666357   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:09:50 node-1 kubelet[24217]: I0816 10:09:50.674322   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
Aug 16 10:10:01 node-1 kubelet[24217]: I0816 10:10:00.674644   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:10:01 node-1 kubelet[24217]: I0816 10:10:00.682794   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
Aug 16 10:10:10 node-1 kubelet[24217]: I0816 10:10:10.683002   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:10:10 node-1 kubelet[24217]: I0816 10:10:10.689641   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
Aug 16 10:10:20 node-1 kubelet[24217]: I0816 10:10:20.690006   24217 cloud_request_manager.go:97] Requesting node addresses from cloud provider for node "node-1"
Aug 16 10:10:20 node-1 kubelet[24217]: I0816 10:10:20.696545   24217 cloud_request_manager.go:116] Node addresses from cloud provider for node "node-1" collected
```

This PR sets them to level 5.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/cc @ingvagabund
2018-08-21 13:00:13 -07:00
Davanum Srinivas 9b43d97cd4
Add Labels to various OWNERS files
Will reduce the burden of manually adding labels. Information pulled
from:
https://github.com/kubernetes/community/blob/master/sigs.yaml

Change-Id: I17e661e37719f0bccf63e41347b628269cef7c8b
2018-08-21 13:59:08 -04:00
Ismo Puustinen dd3eeb3f46 device manager: don't do operations on nil pointer.
If grpc.DialContext() fails, a nil connection is returned. Check the
error before calling conn.Close().
2018-08-21 15:20:36 +03:00
Kubernetes Submit Queue d017bebf6b
Merge pull request #67145 from jiayingz/reboot-fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fail container start if its requested device plugin resource is unknown.

With the change, Kubelet device manager now checks whether it has cached option state for the requested device plugin resource to make sure the resource is in ready state when we start the container.



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/67107

**Special notes for your reviewer**:

**Release note**:

```release-note
Fail container start if its requested device plugin resource hasn't registered after Kubelet restart.
```
2018-08-21 01:48:54 -07:00
Pengfei Ni 2d82cd811f Reduce verbose logs of node addresses requesting 2018-08-21 13:23:01 +08:00
Tim Allclair e6eb2e7dea Add RuntimeHandler to the CRI RunPodSandboxRequest 2018-08-17 10:56:49 -07:00
choury 36b92b9b29 cpumanager: rollback state if updateContainerCPUSet failed 2018-08-17 18:08:58 +08:00
Kubernetes Submit Queue 4819c65028
Merge pull request #67380 from tianshapjq/nits-in-manager.go
Automatic merge from submit-queue (batch tested with PRs 66209, 67380, 67499, 67437, 67498). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

nits in manager.go

**What this PR does / why we need it**:
just found some nits in the manager.go

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-08-17 03:01:09 -07:00
Kubernetes Submit Queue da3f1a3ea1
Merge pull request #64445 from squeed/more-cni-capabilities
Automatic merge from submit-queue (batch tested with PRs 64445, 67459, 67434). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim/network: pass ipRange CNI capabilities

**What this PR does / why we need it**:
Updates the dynamic (capability args) passed from Kubernetes to the CNI plugin. This means CNI plugin authors can offer more features and / or reduce their dependency on the APIServer.

Currently, we only pass the `portMappings` capability. CNI now supports `bandwidth` for bandwidth limiting and `ipRanges` for preferred IP blocks. This PR adds support for these two new capabilities.

Bandwidth limits are provided - as implemented in kubenet - via the pod annotations `kubernetes.io/ingress-bandwidth` and `kubernetes.io/egress-bandwidth`.

The ipRanges field simply passes the PodCIDR. This does mean that we need to change the NodeReady algorithm. Previously, we would only set NodeNotReady on missing PodCIDR when using Kubenet. Now, if the CNI configuration includes the `ipRanges` capability, we need to do the same.

**Which issue(s) this PR fixes**:
Fixes #64393

**Release note**:

```release-note
The dockershim now sets the "bandwidth" and "ipRanges" CNI capabilities (dynamic parameters). Plugin authors and administrators can now take advantage of this by updating their CNI configuration file. For more information, see the [CNI docs](https://github.com/containernetworking/cni/blob/master/CONVENTIONS.md#dynamic-plugin-specific-fields-capabilities--runtime-configuration)
```
2018-08-15 22:54:07 -07:00
Kubernetes Submit Queue cffa2aed0e
Merge pull request #64601 from hzxuzhonghu/cm-dynamic-loglevel-set
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Other components support set log level dynamically

**What this PR does / why we need it**:

#63777 introduced a way to set glog.logging.verbosity dynamically. 
We should enable this for all other components, which is specially useful in debugging. 


**Release note**:

```release-note
Expose `/debug/flags/v` to allow kubelet dynamically set glog logging level.  If want to change glog level to 3, you only have to send a PUT request like `curl -X PUT http://127.0.0.1:8080/debug/flags/v -d "3"`.
```
2018-08-15 21:32:46 -07:00
Kubernetes Submit Queue b904a3dc48
Merge pull request #67109 from MHBauer/error-typo
Automatic merge from submit-queue (batch tested with PRs 65561, 67109, 67450, 67456, 67402). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

error text refers to wrong stream type

**What this PR does / why we need it**:
clarify error text

**Special notes for your reviewer**:
I think this was a copy and paste error.

**Release note**:
```release-note
NONE
```
2018-08-15 18:15:10 -07:00
Kubernetes Submit Queue 6faf115870
Merge pull request #65561 from k82cn/k8s_65372_1
Automatic merge from submit-queue (batch tested with PRs 65561, 67109, 67450, 67456, 67402). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Compared preemption by priority in Kubelet

Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65372 

**Release note**:
```release-note
None
```
2018-08-15 18:15:06 -07:00
Casey Callendrello 5d9ec20d7e kubelet/dockershim/network: pass ipRange dynamically to the CNI plugin
CNI now supports passing ipRanges dynamically. Pass podCIDR so that
plugins no longer have to look it up.
2018-08-15 17:41:09 +02:00
Kubernetes Submit Queue c5e74d128d
Merge pull request #66884 from NickrenREN/attacher-detacher-refactor
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Attacher/Detacher refactor for local storage

Proposal link: https://github.com/kubernetes/community/pull/2438

**What this PR does / why we need it**:

Attacher/Detacher refactor for the plugins which just need to mount device, but do not need to attach, such as local storage plugin.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

```release-note
Attacher/Detacher refactor for local storage
```

/sig storage
/kind feature
2018-08-15 07:03:48 -07:00
xuzhonghu 815799638b run update all 2018-08-15 17:18:27 +08:00
xuzhonghu c867bf9cab kubelet support dynamically set glog log level --v 2018-08-15 17:18:25 +08:00
Kubernetes Submit Queue c65f65cf6a
Merge pull request #65065 from sjenning/reduce-backoff-logging
Automatic merge from submit-queue (batch tested with PRs 66177, 66185, 67136, 67157, 65065). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: reduce logging for backoff situations

xref https://bugzilla.redhat.com/show_bug.cgi?id=1555057#c6

Pods that are in `ImagePullBackOff` or `CrashLoopBackOff` currently generate a lot of logging at the `glog.Info()` level.  This PR moves some of that logging to `V(3)` and avoids logging in situations where the `SyncPod` only fails because pod are in a BackOff error condition.

@derekwaynecarr @liggitt
2018-08-15 02:09:20 -07:00
Kubernetes Submit Queue fba4cf6f4c
Merge pull request #67334 from fqsghostcloud/indent-error-flow
Automatic merge from submit-queue (batch tested with PRs 67294, 67320, 67335, 67334, 67325). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

indent error flow
2018-08-15 00:07:18 -07:00
Kubernetes Submit Queue b4bfb1847c
Merge pull request #66446 from bertinatto/metrics_volume_manager
Automatic merge from submit-queue (batch tested with PRs 61212, 66369, 66446, 66895, 66969). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add more metrics for Volume Manager

**What this PR does / why we need it**:

This PR adds a few metrics described in the [Metrics Spec](https://docs.google.com/document/d/1Fh0T60T_y888LsRwC51CQHO75b2IZ3A34ZQS71s_F0g/edit#heading=h.ys6pjpbasqdu):

* Number of volumes in ActualStateofWorld and DesiredStateofWorld
* Number of times ReconstructVolume Spec on kubelet failed

**Release note**:

```release-note
NONE
```
2018-08-14 21:18:12 -07:00
Kubernetes Submit Queue 1f86c1cf26
Merge pull request #61212 from charrywanganthony/duplicated_import
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove duplicated import

**Release note**:

```release-note
NONE
```
2018-08-14 20:18:00 -07:00
Kubernetes Submit Queue 99053fbf33
Merge pull request #64877 from AdamDang/patch-11
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Typo fix in returned message: utilites->utilities

Line 250: utilites->utilities
2018-08-14 18:57:50 -07:00
Kubernetes Submit Queue af2f72af47
Merge pull request #66587 from feiskyer/revert-63905
Automatic merge from submit-queue (batch tested with PRs 66491, 66587, 66856, 66657, 66923). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert #63905: Setup dns servers and search domains for Windows Pods

**What this PR does / why we need it**:

From https://github.com/kubernetes/kubernetes/pull/63905#issuecomment-396709775:

> I don't think this change does anything on Windows. On windows, the network endpoint configuration is taken care of completely by CNI. If you would like to pass on the custom dns polices from the pod spec, it should be dynamically going to the cni configuration that gets passed to CNI. From there, it would be passed down to platform and would be taken care of appropriately by HNS.

> etc\resolve.conf is very specific to linux and that should remain linux speicfic implementation. We should be trying to move away from platform specific code in Kubelet.
Docker is not managing the networking here for windows. So it doens't really care about any network settings. So passing it to docker shim's hostconfig also doens;t make sense here.

DNS for Windows containers will be set by CNI plugins.  And this change also introduced two endpoints for sandbox container.  So this PR reverts #63905 .


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

The PR should also be cherry-picked to release-1.11.

Also, https://github.com/kubernetes/kubernetes/issues/66588 is opened to track the process of pushing this to CNI.

**Release note**:

```release-note
Revert #63905: Setup dns servers and search domains for Windows Pods. DNS for Windows containers will be set by CNI plugins.
```

/sig windows
/sig node
/kind bug
2018-08-14 17:55:07 -07:00
tianshapjq 81081dc9e7 nits in manager.go 2018-08-15 08:16:04 +08:00
tianshapjq 27c5ced809 should not event directly 2018-08-14 14:35:47 +08:00
NickrenREN c7e4466873 attacher/detacher refactor 2018-08-14 11:12:41 +08:00
Fabio Bertinatto 376a94e039 Add more metrics for Volume Manager
Specifically:

* Number of volumes in ActualStateofWorld and DesiredStateofWorld
* Number of times ReconstructVolume Spec on kubelet failed
2018-08-13 17:36:36 +02:00
fqsghostcloud 21f9ac0e7e
indent error flow
indent error flow
2018-08-13 17:31:31 +08:00
Yu-Ju Hong 390b158db9 kubelet: plumb context for log requests
This allows kubelets to stop the necessary work when the context has
been canceled (e.g., connection closed), and not leaking a goroutine
and inotify watcher waiting indefinitely.
2018-08-10 17:35:46 -07:00
Kubernetes Submit Queue 57bb26911d
Merge pull request #53042 from chentao1596/support-unit-test-case-for-pod-format
Automatic merge from submit-queue (batch tested with PRs 67177, 53042). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding unit tests to methods of pod's format

What this PR does / why we need it:

Add unit test cases, thank you!
2018-08-08 23:49:06 -07:00
Jiaying Zhang 7b1ae66432 Fail container start if its requested device plugin resource doesn't
have cached option state to make sure the device plugin resource is
in ready state when we start the container.
2018-08-08 13:11:36 -07:00
Morgan Bauer 0b709dcf7d
error text refers to wrong stream type 2018-08-07 18:20:24 -07:00
Kubernetes Submit Queue 60ac433922
Merge pull request #66946 from LinEricYang/unused-variable
Automatic merge from submit-queue (batch tested with PRs 66512, 66946, 66083). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet/cm/cpumanager: Fix unused variable "skipIfPermissionsError"

The variable "skipIfPermissionsError" is not needed even when
permission error happened.
2018-08-06 19:44:04 -07:00
Kubernetes Submit Queue d114692a58
Merge pull request #58058 from tianshapjq/cleanup-useless-var-deviceplugin/types.go
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

clean up useless variables in deviceplugin/types.go

**What this PR does / why we need it**:
some variables is useless for reasons, I think we need a clean up.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note

```NONE
2018-08-06 16:33:54 -07:00
Da K. Ma a75d625cc3 Compared preemption by priority.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-08-04 11:33:07 +08:00
Kubernetes Submit Queue cb1ef9f7e8
Merge pull request #64815 from dixudx/hostname_empty
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

error out empty hostname

**What this PR does / why we need it**:
For linux, the hostname is read from file `/proc/sys/kernel/hostname` directly, which can be overwritten with whitespaces.

Should error out such invalid hostnames.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes kubernetes/kubeadm#835

**Special notes for your reviewer**:
/cc luxas timothysc 

**Release note**:

```release-note
nodes: improve handling of erroneous host names
```
2018-08-03 17:13:32 -07:00
Kubernetes Submit Queue 6a33d1ba10
Merge pull request #66938 from sjenning/avoid-mount-delay
Automatic merge from submit-queue (batch tested with PRs 62901, 66562, 66938, 66927, 66926). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: volumemanager: poll immediate when waiting for volume attachment

Currently, `WaitForAttachAndMount()` introduces a 300ms minimum delay by using `wait.Poll()` rather than `wait.PollImmediate()`.  This wait constitutes >99% of the total processing time for `syncPod()`.  Changing this reduced `syncPod()` processing time for a simple busybox pod with one emptyDir volume from 302ms to 2ms.

@derekwaynecarr @pmorie @smarterclayton @jsafrane 

/sig node
/release-note-none
2018-08-02 19:57:15 -07:00
Lin Yang b7e1f0bf17 kubelet/cm/cpumanager: Fix unused variable "skipIfPermissionsError"
The variable "skipIfPermissionsError" is not needed even when
permission error happened.
2018-08-02 17:24:33 -07:00
Kubernetes Submit Queue 266cf70ac0
Merge pull request #66617 from pravisankar/fix-pod-cgroup-parent
Automatic merge from submit-queue (batch tested with PRs 66190, 66871, 66617, 66293, 66891). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Do not set cgroup parent when --cgroups-per-qos is disabled

When --cgroups-per-qos=false (default is true), kubelet sets pod
container management to podContainerManagerNoop implementation and
GetPodContainerName() returns '/' as cgroup parent (default cgroup root).

(1) In case of 'systemd' cgroup driver, '/' is invalid parent as
docker daemon expects '.slice' suffix and throws this error:
'cgroup-parent for systemd cgroup should be a valid slice named as \"xxx.slice\"'
(5fc12449d8/daemon/daemon_unix.go (L618))
'/' corresponds to '-.slice' (root slice) in systemd but I don't think
we want to assign root slice instead of runtime specific default value.
In case of docker runtime, this will be 'system.slice'
(e2593239d9/daemon/oci_linux.go (L698))

(2) In case of 'cgroupfs' cgroup driver, '/' is valid parent but I don't
think we want to assign root instead of runtime specific default value.
In case of docker runtime, this will be '/docker'
(e2593239d9/daemon/oci_linux.go (L695))

Current fix will not set the cgroup parent when --cgroups-per-qos is disabled.

```release-note
Fix pod launch by kubelet when --cgroups-per-qos=false and --cgroup-driver="systemd"
```
2018-08-02 15:42:16 -07:00
Kubernetes Submit Queue 2f21394859
Merge pull request #66190 from linyouchong/issue-66189
Automatic merge from submit-queue (batch tested with PRs 66190, 66871, 66617, 66293, 66891). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix nil pointer dereference in node_container_manager#enforceExisting

**What this PR does / why we need it**:
fix nil pointer dereference in node_container_manager#enforceExisting

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66189

**Special notes for your reviewer**:
NONE

**Release note**:
```release-note
kubelet: fix nil pointer dereference while enforce-node-allocatable flag is not config properly
```
2018-08-02 15:42:09 -07:00
Seth Jennings 0413850d14 kubelet: volumemanager: poll immediate when waiting for volume attachment 2018-08-02 16:41:15 -05:00
Kubernetes Submit Queue c2536e2b0d
Merge pull request #61159 from linyouchong/linyouchong-20180314
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Skip checking when failSwapOn=false

**What this PR does / why we need it**:
Skip checking when failSwapOn=false

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
NONE
**Release note**:
```
NONE
```
2018-08-02 14:09:39 -07:00
Kubernetes Submit Queue 4a54f3f0d6
Merge pull request #66779 from deads2k/api-05-easy-unit
Automatic merge from submit-queue (batch tested with PRs 66850, 66902, 66779, 66864, 66912). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add methods to apimachinery to easy unit testing

When unit testing, you often want a selective scheme and codec factory.  Rather than writing the vars and the init function and the error handling, you can simply do

`scheme, codecs := testing.SchemeForInstallOrDie(install.Install)`

@kubernetes/sig-api-machinery-misc 
@sttts 

```release-note
NONE
```
2018-08-02 10:03:16 -07:00
Kubernetes Submit Queue 94c2c6c842
Merge pull request #66510 from sjenning/add-image-gc-validation
Automatic merge from submit-queue (batch tested with PRs 65730, 66615, 66684, 66519, 66510). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: add image-gc low/high validation check

Currently, there is no protection against setting the high watermark <= the low watermark for image GC

This PR adds a validation rule for that.

@smarterclayton
2018-08-01 15:52:20 -07:00
Kubernetes Submit Queue 7ac32a4f7a
Merge pull request #61983 from mikedanese/closur
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

volumemanager: remove unneccesary closure

```release-note
NONE
```
2018-08-01 14:26:12 -07:00
David Eads d3bd0eb1d5 make package name match all the import aliases 2018-08-01 15:31:12 -04:00
Kubernetes Submit Queue 0a284c1cde
Merge pull request #66082 from sjenning/fix-is-critical-checks
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

move feature gate checks inside IsCriticalPod

Currently `IsCriticalPod()` calls throughout the code are protected by `utilfeature.DefaultFeatureGate.Enabled(features.ExperimentalCriticalPodAnnotation)`.

However, with Pod Priority, this gate could be disabled which skips the priority check inside IsCriticalPod().

This PR moves the feature gate checking inside `IsCriticalPod()` and handles both situations properly.

@aveshagarwal @ravisantoshgudimetla @derekwaynecarr 
/sig node
/sig scheduling
/king bug
2018-08-01 11:47:08 -07:00
Di Xu b3dfe0c652 nodes: improve handling of erroneous host names 2018-08-01 14:57:25 +08:00
Chao Wang 39a4730db6 remove duplicated import 2018-08-01 13:27:42 +08:00
Mike Danese f3922dff19 volumemanager: remove unneccesary closure 2018-07-31 18:48:15 -07:00
Kubernetes Submit Queue c0bf2e680f
Merge pull request #66270 from Pingan2017/delevent
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

delete unused events

**What this PR does / why we need it**:
 events (HostNetworkNotSupported, UndefinedShaper) is unused since #47058
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-31 12:14:06 -07:00
Kubernetes Submit Queue f2c6473e25
Merge pull request #66718 from ipuustin/cpu-manager-validate-offline
Automatic merge from submit-queue (batch tested with PRs 66623, 66718). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

cpumanager: validate topology in static policy

**What this PR does / why we need it**:

This patch adds a check for the static policy state validation. The check fails if the CPU topology obtained from cadvisor doesn't match with the current topology in the state file.

If the CPU topology has changed in a node, cpumanager static policy might try to assign non-present cores to containers.

For example in my test case, static policy had the default CPU set of `0-1,4-7`. Then kubelet was shut down and CPU 7 was offlined. After restarting the kubelet, CPU manager tries to assign the non-existent CPU 7 to containers which don't have exclusive allocations assigned to them:

    Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6)

This breaks the exclusivity, since the CPUs from the shared pool don't get assigned to non-exclusive containers, meaning that they can execute on the exclusive CPUs.

**Release note**:

```release-note
Added CPU Manager state validation in case of changed CPU topology.
```
2018-07-31 08:05:06 -07:00
Ismo Puustinen 3bb5ca9257 cpumanager: add test for available CPUs in static policy.
Test the cases where the number of CPUs available in the system is
smaller or larger than the number of CPUs known in the state, which
should lead to a panic. This covers both CPU onlining and offlining. The
case where the number of CPUs matches is already covered by the
"non-corrupted state" test.
2018-07-31 10:20:37 +03:00
Kubernetes Submit Queue 2bee858a7b
Merge pull request #66284 from stewart-yu/stewart-sharedtype-move
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move the` k8s.io/kubernetes/pkg/util/pointer` package to` k8s.io/utils/pointer`

**What this PR does / why we need it**:
Move `k8s.io/kubernetes/pkg/util/pointer` to  `shared utils` directory, so that we can use it  easily.
Close #66010 accidentally, and can't reopen it, so the same as #66010 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-30 19:50:36 -07:00
Ismo Puustinen 4f604eb73c cpumanager: validate topology in static policy.
This patch adds a check for the static policy state validation. The
check fails if the CPU topology obtained from cadvisor doesn't match
with the current topology in the state file.

If the CPU topology has changed in a node, cpu manager static policy
might try to assign non-present cores to containers.

For example in my test case, static policy had the default CPU set of
0-1,4-7. Then kubelet was shut down and CPU 7 was offlined. After
restarting the kubelet, CPU manager tries to assign the non-existent CPU
7 to containers which don't have exclusive allocations assigned to them:

 Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6)

This breaks the exclusivity, since the CPUs from the shared pool don't
get assigned to non-exclusive containers, meaning that they can execute
on the exclusive CPUs.
2018-07-30 08:49:13 +03:00
hui luo 7101c17498 While reviewing devicemanager code, found
the caching layer on endpoint is redundant.

Here are the 3 related objects in picture:
devicemanager <-> endpoint <-> plugin

Plugin is the source of truth for devices
and device health status.

devicemanager maintain healthyDevices,
unhealthyDevices, allocatedDevices based on updates
from plugin.

So there is no point for endpoint caching devices,
this patch is removing this caching layer on endpoint,

Also removing the Manager.Devices() since i didn't
find any caller of this other than test, i am adding a
notification channel to facilitate testing,

If we need to get all devices from manager in future,
it just need to return healthyDevices + unhealthyDevices,
we don't have to call endpoint after all.

This patch makes code more readable, data model been simplified.
2018-07-29 21:07:14 -07:00
Kubernetes Submit Queue 8e2a444b6d
Merge pull request #66593 from stewart-yu/stewart-kubelet-commentclean
Automatic merge from submit-queue (batch tested with PRs 66593, 66727, 66558). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove the outdate comments in tryRegisterWithAPIServer

**What this PR does / why we need it**:
some judgement about ExternalID removed in #61877, so remove the outdate comments in tryRegisterWithAPIServer


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-27 18:05:00 -07:00
stewart-yu f1343af5d7 auto-generated file 2018-07-28 07:54:17 +08:00
Kubernetes Submit Queue 32e38b6659
Merge pull request #58755 from vikaschoudhary16/probing-mode
Automatic merge from submit-queue (batch tested with PRs 58755, 66414). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use probe based plugin watcher mechanism in Device Manager

**What this PR does / why we need it**:
Uses this probe based utility in the device plugin manager.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56944 

**Notes For Reviewers**:
Changes are backward compatible and existing device plugins will continue to work. At the same time, any new plugins that has required support for probing model (Identity service implementation), will also work. 


**Release note**
```release-note
Add support kubelet plugin watcher in device manager.
```
/sig node
/area hw-accelerators
/cc /cc @jiayingz @RenaudWasTaken @vishh @ScorpioCPH @sjenning @derekwaynecarr @jeremyeder @lichuqiang @tengqm @saad-ali @chakri-nelluri @ConnorDoyle
2018-07-27 15:20:06 -07:00
Kubernetes Submit Queue 2630d09c84
Merge pull request #66596 from BSWANG/master
Automatic merge from submit-queue (batch tested with PRs 66665, 66707, 66596). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix kubelet npe panic on device plugin return zero container

Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>



**What this PR does / why we need it**:
Fix kubelet panic when device plugin return zero containers. Panic logs like follows:
```
Jul 17 12:50:24 iZwz9bqgzuo4i8qu435zk8Z kubelet[25815]: /workspace/anago-v1.10.4-beta.0.68+5ca598b4ba5abb/src/k8s.io/kubernetes/_output/dockerized/go/src/
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
Jul 17 12:50:24 iZwz9bqgzuo4i8qu435zk8Z kubelet[25815]: /workspace/anago-v1.10.4-beta.0.68+5ca598b4ba5abb/src/k8s.io/kubernetes/_output/dockerized/go/src/
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
Jul 17 12:50:24 iZwz9bqgzuo4i8qu435zk8Z kubelet[25815]: /workspace/anago-v1.10.4-beta.0.68+5ca598b4ba5abb/src/k8s.io/kubernetes/_output/dockerized/go/src/
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
Jul 17 12:50:24 iZwz9bqgzuo4i8qu435zk8Z kubelet[25815]: E0717 12:50:24.726856   25815 runtime.go:66] Observed a panic: "index out of range" (runtime error
: index out of range)
```

**Release note**:

```
NONE
```
2018-07-27 12:57:11 -07:00
stewart-yu 55251c716a update the import file for move util/pointer to k8s.io/utils 2018-07-27 19:47:02 +08:00
Kubernetes Submit Queue ed58d0dfd4
Merge pull request #63955 from k82cn/k8s_63897
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Taint node when initializing node.

Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63897 

**Release note**:
```release-note
If `TaintNodesByCondition` enabled, taint node with `TaintNodeUnschedulable` when
initializing node to avoid race condition.
```
2018-07-26 21:01:16 -07:00
Kubernetes Submit Queue cef2d325ee
Merge pull request #66395 from awly/fix-kubelet-exec-plugin-startup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update http.Transport if it already exists in ExecProvider

**What this PR does / why we need it**:
This unbreaks ExecPlugin. Without the change, we hit this error
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/transport/transport.go#L32

**Release note**:
```release-note
Fix kubelet startup failure when using ExecPlugin in kubeconfig
```
2018-07-26 10:47:05 -07:00
Andrew Lytvynov 3357b5ecf4 Set connrotation dialer via restclient.Config.Dialer
Instead of Transport. This fixes ExecPlugin, which fails if
restclient.Config.Transport is set.
2018-07-25 16:23:57 -07:00
stewart-yu ffbd7b22b3 remove the unnecessary comments in tryRegisterWithAPIServer for externalID removed in PR#61877 2018-07-25 11:23:56 +08:00
bingshen.wbs b1bdd043c4 fix kubelet npe on device plugin return zero container
Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>
2018-07-25 10:15:30 +08:00
Pengfei Ni cfb776dcdd Revert #63905: Setup dns servers and search domains for Windows Pods 2018-07-25 09:58:47 +08:00
Seth Jennings b1ec6da4c7 kubelet: add image-gc low/high validation check 2018-07-23 13:14:31 -05:00
Da K. Ma aac9f1cbaa Taint node when initializing node.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-07-23 12:52:05 +08:00
Lee Verberne 7c558fb7bb Remove kubelet-level docker shared pid flag
The --docker-disable-shared-pid flag has been deprecated since 1.10 and
has been superceded by ShareProcessNamespace in the pod API, which is
scheduled for beta in 1.12.
2018-07-22 16:54:44 +02:00
Kubernetes Submit Queue 53ee0c8652
Merge pull request #65660 from mtaufen/incremental-refactor-kubelet-node-status
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor kubelet node status setters, add test coverage

This internal refactor moves the node status setters to a new package, explicitly injects dependencies to facilitate unit testing, and adds individual unit tests for the setters.

I gave each setter a distinct commit to facilitate review.

Non-goals:
- I intentionally excluded the class of setters that return a "modified" boolean, as I want to think more carefully about how to cleanly handle the behavior, and this PR is already rather large.
- I would like to clean up the status update control loops as well, but that belongs in a separate PR.

```release-note
NONE
```
2018-07-20 12:12:24 -07:00
Ravi Sankar Penta 0282720e29 Do not set cgroup parent when --cgroups-per-qos is disabled
When --cgroups-per-qos=false (default is true), kubelet sets pod
container management to podContainerManagerNoop implementation and
GetPodContainerName() returns '/' as cgroup parent (default cgroup root).

(1) In case of 'systemd' cgroup driver, '/' is invalid parent as
docker daemon expects '.slice' suffix and throws this error:
'cgroup-parent for systemd cgroup should be a valid slice named as \"xxx.slice\"'
(5fc12449d8/daemon/daemon_unix.go (L618))
'/' corresponds to '-.slice' (root slice) in systemd but I don't think
we want to assign root slice instead of runtime specific default value.
In case of docker runtime, this will be 'system.slice'
(e2593239d9/daemon/oci_linux.go (L698))

(2) In case of 'cgroupfs' cgroup driver, '/' is valid parent but I don't
think we want to assign root instead of runtime specific default value.
In case of docker runtime, this will be '/docker'
(e2593239d9/daemon/oci_linux.go (L695))

Current fix will not set the cgroup parent when --cgroups-per-qos is disabled.
2018-07-20 10:25:50 -07:00
Pengfei Ni 4272c0fde6 Add unit tests for windows stats 2018-07-20 13:01:23 +08:00
Kubernetes Submit Queue d2cc34fb07
Merge pull request #65771 from smarterclayton/untyped
Automatic merge from submit-queue (batch tested with PRs 65771, 65849). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add a new conversion path to replace GenericConversionFunc

reflect.Call is very expensive. We currently use a switch block as part of AddGenericConversionFunc to avoid the bulk of top level a->b conversion for our primary types which is hand-written. Instead of having these be handwritten, we should generate them.

The pattern for generating them looks like:

```
scheme.AddConversionFunc(&v1.Type{}, &internal.Type{}, func(a, b interface{}, scope conversion.Scope) error {
  return Convert_v1_Type_to_internal_Type(a.(*v1.Type), b.(*internal.Type), scope)
})
```

which matches AddDefaultObjectFunc (which proved out the approach last year). The
conversion machinery should then do a simple map lookup based on the incoming types and invoke the function.  Like defaulting, it's up to the caller to match the types to arguments, which we do by generating this code.  This bypasses reflect.Call and in the future allows Golang mid-stack inlining to optimize this code.

As part of this change I strengthened registration of custom functions to be generated instead of hand registered, and also strengthened error checking of the generator when it sees a manual conversion to error out.  Since custom functions are automatically used by the generator, we don't really have a case for not registering the functions.

Once this is fully tested out, we can remove the reflection based path and the old registration methods, and all conversion will work from point to point methods (whether generated or custom).

Much of the need for the reflection path has been removed by changes to generation (to omit fields) and changes to Go (to make assigning equivalent structs easy).

```release-note
NONE
```
2018-07-19 09:29:00 -07:00
Pengfei Ni a2fe1ab059 Add stats for system containers "pods" 2018-07-19 22:20:24 +08:00
Kubernetes Submit Queue afcc156806
Merge pull request #66350 from aveshagarwal/master-rhbz-1601378
Automatic merge from submit-queue (batch tested with PRs 66175, 66324, 65828, 65901, 66350). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Start cloudResourceSyncsManager before getNodeAnyWay (initializeModules) to avoid kubelet getting stuck in retrieving node addresses from a cloudprovider.

**What this PR does / why we need it**:
This PR starts cloudResourceSyncsManager before getNodeAnyWay (initializeModules) otherwise kubelet gets stuck in setNodeAddress->kl.cloudResourceSyncManager.NodeAddresses() (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_node_status.go#L470) forever retrieving node addresses from a cloud provider, and due to this cloudResourceSyncsManager will not be started at all.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```

@ingvagabund @derekwaynecarr @sjenning @kubernetes/sig-node-bugs
2018-07-18 16:42:22 -07:00
Avesh Agarwal 6c33ca13e9 Start cloudResourceSyncsManager before getNodeAnyWay (initializeModules)
so that kubelet does not get stuck in retriving node addresses from a cloudprovider.
2018-07-18 15:15:03 -04:00
Clayton Coleman ef561ba8b5
generated: Avoid use of reflect.Call in conversion code paths 2018-07-17 23:02:16 -04:00
Shimin Guo e8cd28ae57 fix a panic due to assignment to nil map 2018-07-17 12:34:20 -07:00
vikaschoudhary16 a5842503eb Use probe based plugin discovery mechanism in device manager 2018-07-17 04:02:31 -04:00
chentao1596 ce3f5002dd Add unit tests for methods of pod's format 2018-07-17 15:37:13 +08:00
chentao1596 9319be121e Change the method name from PodsWithDeletiontimestamps to PodsWithDeletionTimestamps 2018-07-17 15:34:32 +08:00
Pingan2017 45fac6f469 delete unused events 2018-07-17 14:19:50 +08:00
Michael Taufen 0f8976bf84 eliminate unnecessary helper 2018-07-16 09:09:48 -07:00
Michael Taufen 0170826542 port setVolumeLimits to Setter abstraction, add test 2018-07-16 09:09:48 -07:00
Michael Taufen c5a5e21639 port setNodeStatusGoRuntime to Setter abstraction 2018-07-16 09:09:48 -07:00
Michael Taufen 8e217f7102 port setNodeStatusImages to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen b7ec333f01 port setNodeStatusDaemonEndpoints to Setter abstraction 2018-07-16 09:09:47 -07:00
Michael Taufen 59bb21051e port setNodeStatusVersionInfo to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen 596fa89af0 port setNodeStatusMachineInfo to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen aa94a3ba4e lift node-info setters into defaultNodeStatusFuncs
Instead of hiding these behind a helper, we just register them in a
uniform way. We are careful to keep the call-order of the setters the
same, though we can consider re-ordering in a future PR to achieve
fewer appends.
2018-07-16 09:09:47 -07:00
Michael Taufen 2df7e1ad5c port setNodeVolumesInUseStatus to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen 3e03e0611e port setNodeReadyCondition to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen e0b6ae219f port setNodePIDPressureCondition to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen b26e4dfa7f port setNodeDiskPressureCondition to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen f057c9a4ae port setNodeMemoryPressureCondition to Setter abstraction, add test 2018-07-16 09:09:47 -07:00
Michael Taufen c33f321acd port setNodeOODCondition to Setter abstraction 2018-07-16 09:09:47 -07:00
Michael Taufen 15b03b8c0c port setNodeAddress to Setter abstraction, port test
also put cloud_request_manager.go in its own package
2018-07-16 09:09:47 -07:00
Michael Taufen a3cbbbd931 move call to defaultNodeStatusFuncs to after the rest of the Kubelet is constructed 2018-07-16 09:03:13 -07:00
Michael Taufen 08c94e0616 add nodestatus package with Setter abstraction for composable node constructors 2018-07-16 09:03:13 -07:00
Michael Taufen d245e72bae remove incorrect comment referencing removed functionality
The cbr0 configuration behavior this comment references was removed in #34906
2018-07-16 09:03:13 -07:00
linyouchong 6ff285bce3 fix nil pointer dereference in node_container_manager#enforceExistingCgroup 2018-07-14 10:42:42 +08:00
Joonyoung Park e6d02e9410 fix metrics help comment
pod_start_latency_microseconds is not broken down by podname.
2018-07-13 10:26:35 +09:00
Kubernetes Submit Queue 337dfe0a9c
Merge pull request #65594 from liggitt/node-csr-addresses-2
Automatic merge from submit-queue (batch tested with PRs 65052, 65594). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Derive kubelet serving certificate CSR template from node status addresses

xref https://github.com/kubernetes/features/issues/267
fixes #55633

Builds on https://github.com/kubernetes/kubernetes/pull/65587

* Makes the cloud provider authoritative when recording node status addresses
* Makes the node status addresses authoritative for the kube-apiserver determining how to speak to a kubelet (stops paying attention to the hostname label when determining how to reach a kubelet, which was only done to support kubelets < 1.5)
* Updates kubelet certificate rotation to be driven from node status
  * Avoids needing to compute node addresses a second time, and differently, in order to request serving certificates.
  * Allows the kubelet to react to changes in its status addresses by updating its serving certificate
  * Allows the kubelet to be driven by external cloud providers recording node addresses on the node status

test procedure:
```sh
# setup
export FEATURE_GATES=RotateKubeletServerCertificate=true
export KUBELET_FLAGS="--rotate-server-certificates=true --cloud-provider=external"

# cleanup from previous runs
sudo rm -fr /var/lib/kubelet/pki/

# startup
hack/local-up-cluster.sh

# wait for a node to register, verify it didn't set addresses
kubectl get nodes 
kubectl get node/127.0.0.1 -o jsonpath={.status.addresses}

# verify the kubelet server isn't available, and that it didn't populate a serving certificate
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
ls -la /var/lib/kubelet/pki

# set an address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
  -H "Content-Type: application/merge-patch+json" \
  --data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"}]}}'

# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...

# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname, but NOT the IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki

# set an hostname and IP address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
  -H "Content-Type: application/merge-patch+json" \
  --data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"},{"type":"InternalIP","address":"127.0.0.1"}]}}'

# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...

# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname AND IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki
```

```release-note
* kubelets that specify `--cloud-provider` now only report addresses in Node status as determined by the cloud provider
* kubelet serving certificate rotation now reacts to changes in reported node addresses, and will request certificates for addresses set by an external cloud provider
```
2018-07-11 22:25:07 -07:00
Seth Jennings f2a7654978 move feature gate checks inside IsCriticalPod 2018-07-11 16:10:05 -05:00
Kubernetes Submit Queue 0972ce1acc
Merge pull request #65649 from rsc/fix-printf
Automatic merge from submit-queue (batch tested with PRs 66076, 65792, 65649). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubernetes: fix printf format errors

These are all flagged by Go 1.11's
more accurate printf checking in go vet,
which runs as part of go test.

```release-note
NONE
```
2018-07-11 14:09:08 -07:00
jiaxuanzhou 6ac4a8588e fix bug for garbage collection 2018-07-11 09:33:08 +08:00
Russ Cox 2bd91dda64 kubernetes: fix printf format errors
These are all flagged by Go 1.11's
more accurate printf checking in go vet,
which runs as part of go test.

Lubomir I. Ivanov <neolit123@gmail.com>
applied ammend for:
  pkg/cloudprovider/provivers/vsphere/nodemanager.go
2018-07-11 00:10:15 +03:00
Kubernetes Submit Queue 421789328f
Merge pull request #65997 from tallclair/writer
Automatic merge from submit-queue (batch tested with PRs 66030, 65997). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove unused io util writer & volume host GetWriter()

Cleanup unused code.
Fixes https://github.com/kubernetes/kubernetes/issues/16971

**Release note**:
```release-note
NONE
```

/kind cleanup
/sig storage
2018-07-10 12:46:09 -07:00
Jordan Liggitt 7828e5d0f9
Make cloud provider authoritative for node status address reporting 2018-07-10 14:33:48 -04:00
Jordan Liggitt db9d3c2d10
Derive kubelet serving certificate CSR template from node status addresses 2018-07-10 14:33:48 -04:00
Kubernetes Submit Queue 13f9c26fd7
Merge pull request #65902 from wojtek-t/kube_proxy_less_allocations_2
Automatic merge from submit-queue (batch tested with PRs 65902, 65781). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Avoid unnecessary allocations in kube-proxy
2018-07-09 23:07:01 -07:00
Kubernetes Submit Queue 55620e2be6
Merge pull request #65987 from Random-Liu/fix-pod-worker-deadlock
Automatic merge from submit-queue (batch tested with PRs 65987, 65962). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix pod worker deadlock.

Preemption will stuck forever if `killPodNow` timeout once. The sequence is:
* `killPodNow` create the response channel (size 0) and send it to pod worker.
* `killPodNow` timeout and return.
*  Pod worker finishes killing the pod, and tries to send back response via the channel.

However, because the channel size is 0, and the receiver has exited, the pod worker will stuck forever.

In @jingxu97's case, this causes a critical system pod (apiserver) unable to come up, because the csi pod can't be preempted.

I checked the history, and the bug was introduced 2 years ago 6fefb428c1.

I think we should at least cherrypick this to `1.11` since preemption is beta and enabled by default in 1.11.

@kubernetes/sig-node-bugs @derekwaynecarr @dashpole @yujuhong 
Signed-off-by: Lantao Liu <lantaol@google.com>

```release-note
none
```
2018-07-09 16:53:59 -07:00
Tim Allclair b1012b2543
Remove unused io util writer & volume host GetWriter() 2018-07-09 14:09:48 -07:00
Lantao Liu 0f4c739b2c Fix pod worker deadlock.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-07-09 11:45:26 -07:00
Kubernetes Submit Queue f70410959d
Merge pull request #65226 from ingvagabund/store-cloud-provider-latest-node-addresses
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Store the latest cloud provider node addresses

**What this PR does / why we need it**:
Buffer the recently retrieved node address so they can be used as soon as the next node status update is run.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65814

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
2018-07-09 10:47:07 -07:00
Kubernetes Submit Queue e943d09fa3
Merge pull request #63194 from m1093782566/cni-ts
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding traffic shaping support for CNI network driver

**What this PR does / why we need it**:

Adding traffic shaping support for CNI network driver - it's also a sub-task of kubenet deprecation work.

Design document is available here: https://github.com/kubernetes/community/pull/1893

**Which issue(s) this PR fixes**:
Fixes #

**Special notes for your reviewer**:

/cc @freehan @jingax10 @caseydavenport @dcbw 

/sig network
/sig node

**Release note**:

```release-note
Support traffic shaping for CNI network driver
```
2018-07-08 23:54:25 -07:00
liangwei 34d848eb1a add cni bandwidth test 2018-07-09 09:51:33 +08:00
m1093782566 8038a0dfa6 add traffic shaping support for CNI network driver 2018-07-08 22:22:25 +08:00
wojtekt 6e50f39dbd Avoid allocations when parsing iptables 2018-07-08 10:55:19 +02:00
Kubernetes Submit Queue 097f300a4d
Merge pull request #65707 from dims/remove-deprecated-cadvisor-port
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove --cadvisor-port - has been deprecated since v1.10

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56523

**Special notes for your reviewer**:
- Deprecated in https://github.com/kubernetes/kubernetes/pull/59827 (v1.10)
- Disabled in https://github.com/kubernetes/kubernetes/pull/63881 (v1.11)

**Release note**:

```release-note
[action required] The formerly publicly-available cAdvisor web UI that the kubelet started using `--cadvisor-port` is now entirely removed in 1.12. The recommended way to run cAdvisor if you still need it, is via a DaemonSet.
```
2018-07-07 05:28:13 -07:00
Lantao Liu 3193a4a469 Fix RunAsGroup. 2018-07-06 15:42:26 -07:00
choury 8e4b62a74b
Remove duplicate check line
There is a same [line](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/cpumanager/policy_static.go#L81).
2018-07-05 11:07:56 +08:00
Jordan Liggitt b7b4b84afe
Add healthz check to ensure logging is not blocked 2018-07-03 22:27:23 -04:00
Jan Chaloupka 9d9fb4de29 Put all the node address cloud provider retrival complex logic into cloudResourceSyncManager 2018-07-03 20:11:35 +02:00
Davanum Srinivas 5feab86329
Remove --cadvisor-port - has been deprecated since v1.10
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2018-07-02 08:54:14 -04:00
wojtekt e50c0b904f Speed up cluster startup in GCE 2018-07-02 10:22:32 +02:00
Kubernetes Submit Queue b265f7c682
Merge pull request #65582 from dtaniwaki/fix-test-failure-of-truncated-time
Automatic merge from submit-queue (batch tested with PRs 65582, 65480, 65310, 65644, 65645). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix test failure of truncated time

**What this PR does / why we need it**:

The test of `TestFsStoreAssignedModified` in `pkg/kubelet/kubeletconfig/checkpoint/store` fails in my environment like below.

```
$ make test WHAT=./pkg/kubelet/kubeletconfig/checkpoint/store/
Running tests for APIVersion: v1,admissionregistration.k8s.io/v1alpha1,admissionregistration.k8s.io/v1beta1,admission.k8s.io/v1beta1,apps/v1beta1,apps/v1beta2,apps/v1,authentication.k8s.io/v1,authentication.k8s.io/v1beta1,authorization.k8s.io/v1,authorization.k8s.io/v1beta1,autoscaling/v1,autoscaling/v2beta1,batch/v1,batch/v1beta1,batch/v2alpha1,certificates.k8s.io/v1beta1,coordination.k8s.io/v1beta1,extensions/v1beta1,events.k8s.io/v1beta1,imagepolicy.k8s.io/v1alpha1,networking.k8s.io/v1,policy/v1beta1,rbac.authorization.k8s.io/v1,rbac.authorization.k8s.io/v1beta1,rbac.authorization.k8s.io/v1alpha1,scheduling.k8s.io/v1alpha1,scheduling.k8s.io/v1beta1,settings.k8s.io/v1alpha1,storage.k8s.io/v1beta1,storage.k8s.io/v1,storage.k8s.io/v1alpha1,
+++ [0628 22:53:39] Running tests without code coverage
--- FAIL: TestFsStoreAssignedModified (0.00s)
        fsstore_test.go:316: expect "2018-06-28T22:53:43+09:00" but got "2018-06-28T22:53:43+09:00"
FAIL
FAIL    k8s.io/kubernetes/pkg/kubelet/kubeletconfig/checkpoint/store    0.236s
make: *** [test] Error 1
```

My environment is
OS: macOS Sierra Version 10.12.6
File System: Journaled HFS+

The error message confused me because the comparing times looked the same in the error log. If we know certain systems truncate times, I think we can just compare less precise times to avoid confusions in tests.

**Special notes for your reviewer**:
N/A

**Release note**:

```release-note
NONE
```
2018-06-29 20:14:06 -07:00
Daisuke Taniwaki 7d4c85b02c
Fix test failure of truncated time 2018-06-30 01:14:44 +09:00
Kubernetes Submit Queue 93f3249e3c
Merge pull request #65595 from sjenning/feature-gate-lsi-capacity
Automatic merge from submit-queue (batch tested with PRs 60150, 65467, 65487, 65595, 65374). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: feature gate LSI capacity calculation

Currently if `cm.cadvisorInterface.RootFsInfo()` fails, the whole kubelet bails.  If `/var/lib/kubelet` is on a tmpfs or bindmount, this can happen (this is the case for some of our CI envs https://github.com/openshift/origin/issues/19948).

We would be able to workaround this, in the short term, by disabling the LSI feature gate if the capacity calculate was protected by the gate, but currently it isn't.

This PR adds the gate check around setting the ephemeral storage capacity.

@liggitt @derekwaynecarr @dashpole 

It might be a different discussion about whether or not this should be fatal.  If it isn't fatal, seems that it would just prevent pods that had a ephemeral storage request from being scheduled.

/sig node
2018-06-28 19:15:15 -07:00
Kubernetes Submit Queue c57cdc1d35
Merge pull request #65587 from liggitt/node-csr-addresses-1
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert "certs: only append locally discovered addresses when we got none from the cloudprovider"

This reverts commit 7354bbe5ac.

https://github.com/kubernetes/kubernetes/pull/61869 caused a mismatch between the requested CSR and the addresses in node status.

Instead of computing addresses in two places, the cert manager should derive its CSR request from the addresses in node status. This would enable the kubelet to react to address changes, as well as be driven by an external cloud provider.

/cc @mikedanese

```release-note
NONE
```
2018-06-28 17:36:45 -07:00
Kubernetes Submit Queue 44073e6f43
Merge pull request #64660 from figo/master
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add support for plugin directory hierarchy

**What this PR does / why we need it**:

Add hierarchy support for plugin directory, it traverses and 
watch plugin directory and its sub directory recursively.

plugin socket file only need be unique within one directory,
``` 
 plugin socket directory  
    |  
    ---->sub directory 1
    |              |  
    |              ----->  socket1,  socket2 ...
    ----->sub directory 2
                  |
                  ------> socket1, socket2 ...  
```
the design itself allow sub directory be anything,
but in practical, each plugin type could just use one sub directory.

**Which issue(s) this PR fixes**:
Fixes #64003

**Special notes for your reviewer**:

twos bonus changes added as below

1) propose to let pluginWatcher bookkeeping registered plugins,
to make sure plugin name is unique within one plugin type.  
arguably, we could let each handler do the same work, but it requires
every handler repeat the same thing.    
 
2) extract example handler out from test, it is easier to read the code with the
seperation.  


**Release note**:

```release-note
N/A
```

/sig node
/cc @vikaschoudhary16  @jiayingz @RenaudWasTaken @vishh @derekwaynecarr  @saad-ali @vladimirvivien @dchen1107 @yujuhong @tallclair @Random-Liu @anfernee @akutz
2018-06-28 14:53:44 -07:00
Seth Jennings 3234b0fa5b feature gate LSI capacity calculation 2018-06-28 14:01:08 -05:00
Jordan Liggitt f1adf74b4e
Revert "certs: only append locally discovered addresses when we got none from the cloudprovider"
This reverts commit 7354bbe5ac.
2018-06-28 12:36:24 -04:00
Kubernetes Submit Queue 270b675c61
Merge pull request #65513 from tallclair/test-cleanup2
Automatic merge from submit-queue (batch tested with PRs 65453, 65523, 65513, 65560). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cleanup verbose cAdvisor mocking in Kubelet unit tests

These tests had a lot of duplicate code to set up the cAdvisor mock, but weren't really depending on the mock functionality. By moving the tests to use the fake cAdvisor, most of the setup can be cleaned up.

/kind cleanup
/sig node

```release-note
NONE
```
2018-06-27 22:30:12 -07:00
ceshihao 3b9ed9afff pod status should contain ContainerStatuses after eviction 2018-06-28 11:52:08 +08:00
Tim Allclair 5955b839ff
Cleanup verbose cAdvisor mocking in Kubelet unit tests 2018-06-27 11:53:41 -07:00
stewart-yu d5513c6d14 fix wrong output messages about EnforceNodeAllocatable 2018-06-27 15:31:32 +08:00
Kubernetes Submit Queue 991a84758f
Merge pull request #59214 from kdembler/cpumanager-checkpointing
Automatic merge from submit-queue (batch tested with PRs 59214, 65330). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Migrate cpumanager to use checkpointing manager

**What this PR does / why we need it**:
This PR migrates `cpumanager` to use new kubelet level node checkpointing feature (#56040) to decrease code redundancy and improve consistency.

**Which issue(s) this PR fixes**:
Fixes #58339

**Notes**:
At point of submitting PR the most straightforward approach was used - `state_checkpoint` implementation of `State` interface was added. However, with checkpointing implementation there might be no point to keep `State` interface and just use single implementation with checkpoint backend and in case of different backend than filestore needed just supply `cpumanager` with custom `CheckpointManager` implementation.

/kind feature
/sig node
cc @flyingcougar @ConnorDoyle
2018-06-25 18:19:00 -07:00
hui luo d04f596829 Add hierarchy support for plugin directory
it traverses and watch plugin directory and its sub directory recursively,
plugin socket file only need be unique within one directory,

- plugin socket directory
-    |
-    ---->sub directory 1
-    |              |
-    |              ----->  socket1,  socket2 ...
-    ----->sub directory 2
-                  |
-                  ------> socket1, socket2 ...

the design itself allow sub directory be anything,
but in practical, each plugin type could just use one sub directory.

four bonus changes added as below

1. extract example handler out from test, it is easier to read the code
with the seperation.

2. there are two variables here: "Watcher" and "watcher".
"Watcher" is the plugin watcher, and "watcher" is the fsnotify watcher.
so rename the "watcher" to "fsWatcher" to make code easier to
understand.

3. change RegisterCallbackFn() return value order, it is
conventional to return error last, after this change,
the pkg/volume/csi is compliance with golint, so remove it
from hack/.golint_failures

4. refactor errors handling at invokeRegistrationCallbackAtHandler()
to make error message more clear.
2018-06-25 17:32:18 -07:00
Jeff Grafton 23ceebac22 Run hack/update-bazel.sh 2018-06-22 16:22:57 -07:00
Jeff Grafton a725660640 Update to gazelle 0.12.0 and run hack/update-bazel.sh 2018-06-22 16:22:18 -07:00
Jeff Grafton 01f94051c8 Remove the go_default_library_protos filegroups using buildozer 2018-06-22 16:22:18 -07:00
Kubernetes Submit Queue f09a938bcd
Merge pull request #64675 from yue9944882/fix-data-race-cli-file-linux
Automatic merge from submit-queue (batch tested with PRs 61330, 64793, 64675, 65059, 65368). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixes data races for pkg/kubelet/config/file_linux_test.go

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64655

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-22 14:52:37 -07:00
Kubernetes Submit Queue 1ca851baec
Merge pull request #64860 from wgliang/master.kubelet-check-limit
Automatic merge from submit-queue (batch tested with PRs 65290, 65326, 65289, 65334, 64860). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

checkLimitsForResolvConf for the  pod create and update events instead of checking period

**What this PR does / why we need it**:

- Check for the same at pod create and update events instead of checking continuously for every 30 seconds.
- Increase the logging level to 4 or higher since the event is not catastrophic to cluster health .


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64849

**Special notes for your reviewer**:
@ravisantoshgudimetla 

**Release note**:

```release-note
checkLimitsForResolvConf for the  pod create and update events instead of checking period
```
2018-06-22 04:43:16 -07:00
Kubernetes Submit Queue 96c7f3a34a
Merge pull request #64752 from wojtek-t/default_to_watching_managers
Automatic merge from submit-queue (batch tested with PRs 65187, 65206, 65223, 64752, 65238). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Kubelet watches necessary secrets/configmaps instead of periodic polling
2018-06-21 19:48:14 -07:00
Kubernetes Submit Queue 02dba36128
Merge pull request #65019 from mirake/fix-typo-toto
Automatic merge from submit-queue (batch tested with PRs 65265, 64822, 65026, 65019, 65077). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Typo fix: toto -> to
2018-06-21 11:25:16 -07:00
Kubernetes Submit Queue d1f5cb2348
Merge pull request #65050 from sttts/sttts-deepcopy-update
Automatic merge from submit-queue (batch tested with PRs 64895, 64938, 63700, 65050, 64957). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Bump gengo to include uniform pointer deepcopy

This bumps k8s.io/gengo with uniform pointer support in deepcopy-gen.

Fixes https://github.com/kubernetes/code-generator/issues/45.
2018-06-21 04:15:16 -07:00
Kubernetes Submit Queue 332da0a943
Merge pull request #64491 from hzxuzhonghu/kubelet-node-schedule-event-record
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

move oldNodeUnschedulable pkg var to kubelet struct

**What this PR does / why we need it**:

move oldNodeUnschedulable pkg var to kubelet struct


**Release note**:

```release-note
NONE
```
2018-06-20 23:02:52 -07:00
Kubernetes Submit Queue ce09da5653
Merge pull request #64880 from dixudx/manifest_file_not_found
Automatic merge from submit-queue (batch tested with PRs 58690, 64773, 64880, 64915, 64831). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

ignore not found file error when watching manifests

**What this PR does / why we need it**:
An alternative of #63910.

When using vim to create a new file in manifest folder, a temporary file, with an arbitrary number (like 4913) as its name, will be created to check if a directory is writable and see the resulting ACL.

These temporary files will be deleted later, which should by ignored when watching the manifest folder.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55928, #59009, #48219

**Special notes for your reviewer**:
/cc dims luxas yujuhong liggitt tallclair

**Release note**:

```release-note
ignore not found file error when watching manifests
```
2018-06-20 14:21:17 -07:00
Kubernetes Submit Queue aa25539ef6
Merge pull request #64451 from wgliang/master.remove-kubelet
Automatic merge from submit-queue (batch tested with PRs 64688, 64451, 64504, 64506, 56358). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

cleanup some dead kubelet code

**Release note**:

```release-note
NONE
```
2018-06-20 05:48:11 -07:00
Kubernetes Submit Queue a622f1404c
Merge pull request #64672 from mcluseau/wip-remote-grpc-message-size
Automatic merge from submit-queue (batch tested with PRs 65032, 63471, 64104, 64672, 64427). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

pkg: kubelet: remote: increase grpc client default size to 16MiB

**What this PR does / why we need it**:

Increase the gRPC max message size to 16MB in the remote container runtime. I've seen sizes over 8MB in clusters with big (256GB RAM) nodes.

**Release note**:
```release-note
Increase the gRPC max message size to 16MB in the remote container runtime.
```
2018-06-20 04:23:21 -07:00
Kubernetes Submit Queue 381b663b66
Merge pull request #63580 from dixudx/fix_cni_flag_binding
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

bind alpha feature network plugin flags correctly

**What this PR does / why we need it**:
When working #63542, I found the flags, like `--cni-conf-dir` and `cni-bin-dir`, were not correctly bound.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
/cc kubernetes/sig-node-pr-reviews

**Release note**:

```release-note
None
```
2018-06-20 01:26:52 -07:00
Kubernetes Submit Queue 148350d3c4
Merge pull request #64426 from cofyc/remove_unnecessary_fakemounters
Automatic merge from submit-queue (batch tested with PRs 64142, 64426, 62910, 63942, 64548). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Clean up fake mounters.

**What this PR does / why we need it**:

Fixes https://github.com/kubernetes/kubernetes/issues/61502

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

list of fake mounters:

- (keep) pkg/util/mount.FakeMounter
- (removed) pkg/kubelet/cm.fakeMountInterface:
- (inherit from mount.FakeMounter) pkg/util/mount.fakeMounter
- (inherit from mount.FakeMounter) pkg/util/removeall.fakeMounter
- (removed) pkg/volume/host_path.fakeFileTypeChecker

**Release note**:

```release-note
NONE
```
2018-06-20 00:05:10 -07:00
Kubernetes Submit Queue c399c306e2
Merge pull request #59174 from tianshapjq/todo-already-done
Automatic merge from submit-queue (batch tested with PRs 65230, 57355, 59174, 63698, 63659). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

TODO has already been implemented

**What this PR does / why we need it**:
TODO has already been implemented, remove the TODO tag.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note

```NONE
2018-06-19 20:19:17 -07:00
wojtekt 72a0f4d167 Enable watching secret and configmap manager 2018-06-19 22:13:18 +02:00
wojtekt ffb32472bb Kubelet manager configuration 2018-06-19 22:12:55 +02:00
vikaschoudhary16 e8119dc134 Start plugin watcher after initialization of all kubelet components 2018-06-14 01:03:37 -04:00
Andrew Lytvynov 2c0f043957 Re-use private key after failed CSR
If we create a new key on each CSR, if CSR fails the next attempt will
create a new one instead of reusing previous CSR.

If approver/signer don't handle CSRs as quickly as new nodes come up,
they can pile up and approver would keep handling old abandoned CSRs and
Nodes would keep timing out on startup.
2018-06-13 13:12:43 -07:00
Seth Jennings f1551798e4 reduce logging for backoff situations 2018-06-13 13:25:20 -05:00
Dr. Stefan Schimanski 1208437f84 Update generated files 2018-06-13 12:35:13 +02:00
Kubernetes Submit Queue bb7e14429d
Merge pull request #64922 from dcbw/dcbw-dockershim-network-approver
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim/network: add dcbw to OWNERS as an approver

I've been involved with the kubelet network code, including most
of this code, for a couple years and contributed a good number
of PRs for these directories. I've also been a SIG Network
co-lead for couple years.

I've also been on the CNI maintainers team for a couple years.

```release-note
NONE
```
@freehan @thockin @kubernetes/sig-network-pr-reviews
2018-06-12 13:31:15 -07:00
Kubernetes Submit Queue 67ebbc675a
Merge pull request #64862 from feiskyer/win-cni
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert #64189: Fix Windows CNI for the sandbox case

**What this PR does / why we need it**:

This reverts PR #64189, which breaks DNS for Windows containers.

Refer https://github.com/kubernetes/kubernetes/pull/64189#issuecomment-395248704

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64861

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

cc @madhanrm @PatrickLang @alinbalutoiu @dineshgovindasamy
2018-06-12 11:18:01 -07:00
ruicao 95c232ee07 Typo fix: toto -> to 2018-06-12 23:12:39 +08:00
Kubernetes Submit Queue 8e03228c1a
Merge pull request #64643 from dashpole/memcg_poll
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use unix.EpollWait to determine when memcg events are available to be Read

**What this PR does / why we need it**:
This fixes a file descriptor leak introduced in https://github.com/kubernetes/kubernetes/pull/60531 when the `--experimental-kernel-memcg-notification` kubelet flag is enabled.  The root of the issue is that `unix.Read` blocks indefinitely when reading from an event file descriptor and there is nothing to read.  Since we refresh the memcg notifications, these reads accumulate until the memcg threshold is crossed, at which time all reads complete.  However, if the node never comes under memory pressure, the node can run out of file descriptors.

This PR changes the eviction manager to use `unix.EpollWait` to wait, with a 10 second timeout, for events to be available on the eventfd.  We only read from the eventfd when there is an event available to be read, preventing an accumulation of `unix.Read` threads, and allowing the event file descriptors to be reclaimed by the kernel.

This PR also breaks the creation, and updating of the memcg threshold into separate portions, and performs creation before starting the periodic synchronize calls.  It also moves the logic of configuring memory thresholds into memory_threshold_notifier into a separate file.

This also reverts https://github.com/kubernetes/kubernetes/pull/64582, as the underlying leak that caused us to disable it for testing is fixed here.

Fixes #62808

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority critical-urgent
2018-06-11 17:29:19 -07:00
David Ashpole b7deb6d9e0 fix eviction event formatting 2018-06-11 11:38:00 -07:00
David Ashpole 93b6d026d9 fix memcg fd leak 2018-06-11 11:37:50 -07:00
Dan Williams 37792076b4 dockershim/network: add dcbw to OWNERS as an approver
I've been involved with the kubelet network code, including most
of this code, for a couple years and contributed a good number
of PRs for these directories. I've also been a SIG Network
co-lead for couple years.

I've also been on the CNI maintainers team for a couple years.
2018-06-08 10:06:19 -05:00
yue9944882 d467b29c5c remove duplicated cleaning up func 2018-06-08 14:28:19 +08:00
WanLinghao 52140ea1d3 fix a bug of wrong parameters which could cause token projection failure 2018-06-08 12:00:58 +08:00
Kubernetes Submit Queue 38beee65d3
Merge pull request #63905 from feiskyer/win-dns
Automatic merge from submit-queue (batch tested with PRs 63905, 64855). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Setup dns servers and search domains for Windows Pods

**What this PR does / why we need it**:

Kubelet is depending on docker container's ResolvConfPath (e.g. /var/lib/docker/containers/439efe31d70fc17485fb6810730679404bb5a6d721b10035c3784157966c7e17/resolv.conf) to setup dns servers and search domains. While this is ok for Linux containers, ResolvConfPath is always an empty string for windows containers. So that the DNS setting for windows containers is always not set.

This PR setups DNS for Windows sandboxes. In this way, Windows Pods could also use kubernetes dns policies.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #61579

**Special notes for your reviewer**:

Requires Docker EE version >= 17.10.0.

**Release note**:

```release-note
Setup dns servers and search domains for Windows Pods in dockershim. Docker EE version >= 17.10.0 is required for propagating DNS to containers.
```

/cc @PatrickLang @taylorb-microsoft @michmike @JiangtianLi
2018-06-07 11:40:11 -07:00
Di Xu 6d14771fd8 ignore not found file error when watching manifests 2018-06-07 22:02:53 +08:00
AdamDang 37a6fbfd6a
Typo fix in returned message: utilites->utilities
utilites->utilities
2018-06-07 21:49:38 +08:00
Klaudiusz Dembler a9df2acc4b Typo fix 2018-06-07 12:08:48 +02:00
Pengfei Ni d0cd1d17ae Add clarification for Windows DNS setup flow 2018-06-07 16:26:13 +08:00
Di Xu 8285a26589 add missing LastTransitionTime of ContainerReady condition 2018-06-07 14:51:14 +08:00
yue9944882 a221218681 fixes data races 2018-06-07 11:24:35 +08:00
Guoliang Wang 4f9d2047dd checkLimitsForResolvConf for the pod create and update events instead of checking period 2018-06-07 10:14:22 +08:00
Pengfei Ni 10b6f405e1 Revert "Fix Windows CNI for the sandbox case"
This reverts commit 49e762ab3a.
2018-06-07 09:56:13 +08:00
Kubernetes Submit Queue 8013bdb180
Merge pull request #64749 from Random-Liu/fix-standalone-dockershim
Automatic merge from submit-queue (batch tested with PRs 64749, 64797). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix standalone dockershim.

Ref https://github.com/kubernetes-incubator/cri-tools/pull/320#issuecomment-394554484.

This PR fixes a bug that standalone dockershim exits immediately.

This PR:
1) Changes standalone dockershim to wait on `stopCh`, so that it won't exit immediately.
2) Removes `stopCh` from dockershim internal. It doesn't help much for graceful stop, because kubelet will exit immediately anyway. https://github.com/kubernetes/kubernetes/blob/master/cmd/kubelet/app/server.go#L748

@kubernetes/sig-node-pr-reviews @yujuhong @feiskyer 

**Release note**:

```release-note
none
```
2018-06-06 10:08:12 -07:00
Kubernetes Submit Queue f54593b740
Merge pull request #64795 from mikedanese/fixgke
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

auth: standalone kubelets shouldn't start a token manager

fixes https://github.com/kubernetes/kubernetes/issues/64789
2018-06-06 06:58:28 -07:00
Kubernetes Submit Queue f4668d281c
Merge pull request #64800 from dashpole/cadvisor_godep
Automatic merge from submit-queue (batch tested with PRs 63717, 64646, 64792, 64784, 64800). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update cadvisor godeps to v0.30.0

**What this PR does / why we need it**:
cAdvisor godep update corresponding to 1.11

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63204

**Release note**:
```release-note
Use IONice to reduce IO priority of du and find
cAdvisor ContainerReference no longer contains Labels. Use ContainerSpec instead.
Fix a bug where cadvisor failed to discover a sub-cgroup that was created soon after the parent cgroup.
```

/sig node
/kind bug
/priority critical-urgent

/assign @dchen1107
2018-06-06 01:24:26 -07:00
Kubernetes Submit Queue a32e5b6a59
Merge pull request #64784 from jiayingz/status-ready
Automatic merge from submit-queue (batch tested with PRs 63717, 64646, 64792, 64784, 64800). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reconcile extended resource capacity after kubelet restart.

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/64632

**Special notes for your reviewer**:

**Release note**:

```release-note
Kubelet will set extended resource capacity to zero after it restarts. If the extended resource is exported by a device plugin, its capacity will change to a valid value after the device plugin re-connects with the Kubelet. If the extended resource is exported by an external component through direct node status capacity patching, the component should repatch the field after kubelet becomes ready again. During the time gap, pods previously assigned with such resources may fail kubelet admission but their controller should create new pods in response to such failures.
```
2018-06-06 01:24:21 -07:00
Kubernetes Submit Queue 0b8394a1f4
Merge pull request #64646 from freehan/pod-ready-plus2-new
Automatic merge from submit-queue (batch tested with PRs 63717, 64646, 64792, 64784, 64800). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add ContainersReady condition into Pod Status

**Last 3 commits are new**

Follow up PR of: https://github.com/kubernetes/kubernetes/pull/64057 and https://github.com/kubernetes/kubernetes/pull/64344

Have a single PR for adding ContainersReady per https://github.com/kubernetes/kubernetes/pull/64344#issuecomment-394038384

```release-note
Introduce ContainersReady condition in Pod Status
```


/assign yujuhong for review
/assign thockin for the tiny API change
2018-06-06 01:24:14 -07:00
Kubernetes Submit Queue b6f75ac30e
Merge pull request #63717 from ingvagabund/promote-sysctl-annotations-to-fields
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Promote sysctl annotations to fields

#


**What this PR does / why we need it**:

Promoting experimental sysctl feature from annotations to API fields.

**Special notes for your reviewer**:

Following sysctl KEP: https://github.com/kubernetes/community/pull/2093

**Release note**:

```release-note
The Sysctls experimental feature has been promoted to beta (enabled by default via the `Sysctls` feature flag). PodSecurityPolicy and Pod objects now have fields for specifying and controlling sysctls. Alpha sysctl annotations will be ignored by 1.11+ kubelets. All alpha sysctl annotations in existing deployments must be converted to API fields to be effective.
```

**TODO**:

* [x] - Promote sysctl annotation in Pod spec
* [x] - Promote sysctl annotation in PodSecuritySpec spec
* [x] - Feature gate the sysctl
* [x] - Promote from alpha to beta
* [x] - docs PR - https://github.com/kubernetes/website/pull/8804
2018-06-06 00:47:36 -07:00
Kubernetes Submit Queue 0e44d8c40b
Merge pull request #64354 from mtaufen/dkcfg-safe-fields
Automatic merge from submit-queue (batch tested with PRs 64009, 64780, 64354, 64727, 63650). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

per-field dynamic config advice

Dynamic Kubelet config gives cluster admins and k8s-as-a-service providers a lot of flexibility around reconfiguring the Kubelet in live environments. With great power comes great responsibility. These comments intend to provide more nuanced guidance around using dynamic Kubelet config by adding items to consider when changing various fields and pointing out where cluster admins and k8s-as-service providers should maintain extra caution.

@kubernetes/sig-node-pr-reviews PLEASE provide feedback and help fill in the blanks here, I don't have domain expertise in all of these features.

https://github.com/kubernetes/features/issues/281

```release-note
NONE
```
2018-06-05 22:24:46 -07:00
Kubernetes Submit Queue 999b2da440
Merge pull request #64009 from feiskyer/windows-security-context
Automatic merge from submit-queue (batch tested with PRs 64009, 64780, 64354, 64727, 63650). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Kubelet: Add security context for Windows containers

**What this PR does / why we need it**:

This PR adds windows containers to Kubelet CRI and also implements security context setting for docker containers.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

RunAsUser from Kubernetes API only accept int64 today, which is not supported on Windows. It should be changed to intstr for working with both Windows and Linux containers in a separate PR.

**Release note**:

```release-note
Kubelet: Add security context for Windows containers
```

/cc @PatrickLang @taylorb-microsoft @michmike @JiangtianLi @yujuhong @dchen1107
2018-06-05 22:24:38 -07:00
Mike Danese 90ba15ee74 auth: standalone kubelets shouldn't start a token manager 2018-06-05 17:31:26 -07:00
David Ashpole 4afcccd225 disable process scheduler metrics 2018-06-05 17:12:56 -07:00
Seth Jennings 6729add11c sysctls: create feature gate to track promotion 2018-06-06 00:23:11 +02:00
Jan Chaloupka 3cc15363bc Run make update 2018-06-06 00:12:40 +02:00
Lantao Liu bc0264fbae Fix standalone dockershim. 2018-06-05 21:52:08 +00:00
Jiaying Zhang 35efc4f96a Reconcile extended resource capacity after kubelet restart. 2018-06-05 14:38:49 -07:00
Jan Chaloupka ab616a88b9 Promote sysctl annotations to API fields 2018-06-05 23:17:00 +02:00
Kubernetes Submit Queue 0b90c414c5
Merge pull request #64094 from vladimirvivien/block-MapDeviceFunc-Refactor
Automatic merge from submit-queue (batch tested with PRs 64276, 64094, 64719, 64766, 64750). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Delegate map operation to BlockVolumeMapper plugin 

**What this PR does / why we need it**:
This PR refactors the volume controller's operation generator, for block mapping, to delegate core block mounting sequence to the `volume.BlockVolumeMapper` plugin instead of living in the operation generator.  This is to ensure better customization of block volume logic for existing internal volume plugins.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64093

```release-note
NONE
```
/sig storage
2018-06-05 11:35:13 -07:00
Minhan Xia 370268f123 Inject ContainersReady 2018-06-05 11:10:38 -07:00
Minhan Xia 176f34ea07 Generate ContainersReady condition 2018-06-05 11:10:38 -07:00
Minhan Xia 6b08ef575f add ContainersReady condition 2018-06-05 11:10:38 -07:00
Kubernetes Submit Queue c178c7fd65
Merge pull request #62005 from mikedanese/svcacctproj
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

implement ServiceAccountTokenProjection

design here: https://github.com/kubernetes/community/pull/1973

part of https://github.com/kubernetes/kubernetes/pull/61858

```release-note
Add a volume projection that is able to project service account tokens.
```

part of https://github.com/kubernetes/kubernetes/issues/48408

@kubernetes/sig-auth-pr-reviews @kubernetes/sig-storage-pr-reviews
2018-06-05 09:30:56 -07:00
Michael Taufen 5afde17860 document per-field advice for dynamic Kubelet config 2018-06-05 09:27:02 -07:00
Kubernetes Submit Queue 3b6c2472c3
Merge pull request #64709 from gnufied/fix-node-alpha-tests
Automatic merge from submit-queue (batch tested with PRs 64344, 64709, 64717, 63631, 58647). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix panic caused by no cloudprovider in test

We should not panic when no cloudprovider is present

Fixes https://github.com/kubernetes/kubernetes/issues/64704

Also added a test to cover the panic.

/sig storage
/sig node

```release-note
None
```
2018-06-05 02:16:08 -07:00
Kubernetes Submit Queue e64b81342b
Merge pull request #64344 from freehan/pod-ready-plus2
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Teach Kubelet about Pod Ready++

Follow up PR of https://github.com/kubernetes/kubernetes/pull/62306 and https://github.com/kubernetes/kubernetes/pull/64057, **Only the last 3 commits are new.** Will rebase once the previous ones are merged.

ref: https://github.com/kubernetes/community/blob/master/keps/sig-network/0007-pod-ready%2B%2B.md


kind/feature
priority/important-soon
sig/network
sig/node

/assign @yujuhong


```release-note
NONE
```
2018-06-05 01:50:27 -07:00
Kubernetes Submit Queue 84ec43c75b
Merge pull request #64560 from sbezverk/csi_registration
Automatic merge from submit-queue (batch tested with PRs 62266, 64351, 64366, 64235, 64560). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding CSI driver registration with plugin watcher

Adding CSI driver registration bits.  The registration process will leverage driver-registrar side which will open the `registration` socket and will listen for pluginwatcher's GetInfo calls.
 
```release-note
Adding CSI driver registration code.
```
/sig sig-storage
2018-06-04 18:44:23 -07:00
Kubernetes Submit Queue 2cb5c47b12
Merge pull request #64351 from msau42/fix-readonly
Automatic merge from submit-queue (batch tested with PRs 62266, 64351, 64366, 64235, 64560). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Bind mount subpath with same read/write settings as underlying volume

**What this PR does / why we need it**:
https://github.com/kubernetes/kubernetes/pull/63045 broke two scenarios:
* If volumeMount path already exists in container image, container runtime will try to chown the volume
* In SELinux system, we will try to set SELinux labels when starting the container

This fix makes it so that the subpath bind mount will inherit the read/write settings of the underlying volume mount. It does this by using the "bind,remount" mount options when doing the bind mount.

The underlying volume mount is ro when the volumeSource.readOnly flag is set. This is for persistent volume types like PVC, GCE PD, NFS, etc.  When this is set, we won't try to configure SELinux labels.  Also in this mode, subpaths have to already exist in the volume, we cannot make new directories on a read only volume.

When volumeMount.readOnly is set, the container runtime is in charge of making the volume in the container readOnly, but the underlying volume mount on the host can be writable. This can be set for any volume type, and is permanently set for atomic volume types like configmaps, secrets.  In this case, SELinux labels will be applied before the container runtime makes the volume readOnly.  And subpaths don't have to exist.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64120

**Special notes for your reviewer**:

**Release note**:

```release-note
Fixes issue for readOnly subpath mounts for SELinux systems and when the volume mountPath already existed in the container image.
```
2018-06-04 18:44:13 -07:00
Kubernetes Submit Queue 7d83484ec1
Merge pull request #62266 from feiskyer/win-log-stats
Automatic merge from submit-queue (batch tested with PRs 62266, 64351, 64366, 64235, 64560). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add log and fs stats for Windows containers

**What this PR does / why we need it**:

Add log and fs stats for Windows containers.

Without this, kubelet will report errors continuously:

```
Unable to fetch container log stats for path \var\log\pods\2a70ed65-37ae-11e8-8730-000d3a14b1a0\echo: Du not supported for this build.
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60180 #62047

**Special notes for your reviewer**:

**Release note**:

```release-note
Add log and fs stats for Windows containers
```
2018-06-04 18:44:10 -07:00
Pengfei Ni 7ba26ba25c Setup docker options according to windows security context 2018-06-05 09:29:24 +08:00
Pengfei Ni 6da502e016 Setup windows security context in CRI 2018-06-05 09:27:40 +08:00
Pengfei Ni eeec15a7d9 Add security context for Windows containers 2018-06-05 09:27:40 +08:00
Mike Danese 91feb345aa implement service account token projection 2018-06-04 17:22:08 -07:00
Kubernetes Submit Queue 898831ad9d
Merge pull request #64592 from ravisantoshgudimetla/revert-64364-remove-rescheduler
Automatic merge from submit-queue (batch tested with PRs 63453, 64592, 64482, 64618, 64661). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert "Remove rescheduler and corresponding tests from master"

Reverts kubernetes/kubernetes#64364

After discussing with @bsalamat on how DS controllers(ref: https://github.com/kubernetes/kubernetes/pull/63223#discussion_r192277527) cannot create pods if the cluster is at capacity and they have to rely on rescheduler for making some space, we thought it is better to 

- Bring rescheduler back.
- Make rescheduler priority aware.
- If cluster is full and if **only** DS controller is not able to create pods, let rescheduler be run and let it evict some pods which have less priority.
- The DS controller pods will be scheduled now.

So, I am reverting this PR now. Step 2, 3 above are going to be in rescheduler.

/cc @bsalamat @aveshagarwal @k82cn 

Please let me know your thoughts on this. 

```release-note
Revert #64364 to resurrect rescheduler. More info https://github.com/kubernetes/kubernetes/issues/64725 :)
```
2018-06-04 16:56:11 -07:00
Serguei Bezverkhi 1c05ca5575 Adding CSI driver registration 2018-06-04 16:47:24 -04:00
Minhan Xia ac4e015e12 trigger kubelet sync pod on reconciliation 2018-06-04 12:17:04 -07:00
Minhan Xia d46cdbed6c Generate pod ready status with readiness gates 2018-06-04 12:16:56 -07:00
Michelle Au f3f1a04705 Only mount subpath as readonly if specified in volumeMount 2018-06-04 12:05:23 -07:00
Hemant Kumar 32b69193c6 Fix panic caused by no cloudprovider in test
We should not panic when no cloudprovider is present
2018-06-04 14:50:18 -04:00
Mikaël Cluseau b0073097c0 pkg: kubelet: remote: increase grpc client default size to 16MiB 2018-06-04 11:09:30 +11:00
Vladimir Vivien 3569287993 Refactor of GenerateMapDeviceFunc to delegate Map call to volume plugin. 2018-06-03 17:25:37 -04:00
Kubernetes Submit Queue 2b26234003
Merge pull request #64644 from Random-Liu/address-comments-in-#64006
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Address comments in #64006.

Address comments in #64006 

@tallclair @yujuhong 
@kubernetes/sig-node-pr-reviews 
Signed-off-by: Lantao Liu <lantaol@google.com>

**Release note**:

```release-note
none
```
2018-06-03 06:31:26 -07:00
Kubernetes Submit Queue e5686a3668
Merge pull request #64154 from gnufied/impelemnt-volume-count
Automatic merge from submit-queue (batch tested with PRs 64613, 64596, 64573, 64154, 64639). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Implement dynamic volume limits

Implement dynamic volume limits depending on node type.

xref https://github.com/kubernetes/community/pull/2051

```release-note
Add Alpha support for dynamic volume limits based on node type
```
2018-06-02 06:30:19 -07:00
Yecheng Fu 40c3937320 Clean up fake mounters. 2018-06-02 15:55:19 +08:00
Kubernetes Submit Queue 91b9b62ae8
Merge pull request #64189 from alinbalutoiu/master
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix Windows CNI for the sandbox case

**What this PR does / why we need it**:
Windows supports both sandbox and non-sandbox cases. The non-sandbox
case is for Windows Server 2016 and for Windows Server version greater
than 1709 which use Hyper-V containers.

Currently, the CNI on Windows fetches the IP from the containers
within the pods regardless of the mode. This should be done only
in the non-sandbox mode where the IP of the actual container
will be different than the IP of the sandbox container.

In the case where the sandbox container is supported, all the containers
from the same pod will share the network details of the sandbox container.

This patch updates the CNI to fetch the IP from the sandbox container
when this mode is supported.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64188

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-01 20:32:28 -07:00
Di Xu 5ebd40cb6a bind alpha feature network plugin flags correctly 2018-06-02 11:31:01 +08:00
Dan Williams 931f6718b0 kubelet: remove unused parameter from runtime's SyncPod() 2018-06-01 21:55:40 -05:00
Lantao Liu 9677616eaf Address comments in #64006.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-06-01 17:25:56 -07:00
Hemant Kumar 1f9404dfc0 Implement kubelet side changes for writing volume limit to node
Add tests for checking node limits
2018-06-01 19:17:30 -04:00
Kubernetes Submit Queue d2495b8329
Merge pull request #63143 from jsafrane/containerized-subpath
Automatic merge from submit-queue (batch tested with PRs 63348, 63839, 63143, 64447, 64567). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Containerized subpath

**What this PR does / why we need it**:
Containerized kubelet needs a different implementation of `PrepareSafeSubpath` than kubelet running directly on the host.

On the host we safely open the subpath and then bind-mount `/proc/<pidof kubelet>/fd/<descriptor of opened subpath>`.

With kubelet running in a container, `/proc/xxx/fd/yy` on the host contains path that works only inside the container, i.e. `/rootfs/path/to/subpath` and thus any bind-mount on the host fails.

Solution:
- safely open the subpath and gets its device ID and inode number
- blindly bind-mount the subpath to `/var/lib/kubelet/pods/<uid>/volume-subpaths/<name of container>/<id of mount>`. This is potentially unsafe, because user can change the subpath source to a link to a bad place (say `/run/docker.sock`) just before the bind-mount.
- get device ID and inode number of the destination. Typical users can't modify this file, as it lies on /var/lib/kubelet on the host.
- compare these device IDs and inode numbers.

**Which issue(s) this PR fixes**
Fixes #61456

**Special notes for your reviewer**:

The PR contains some refactoring of `doBindSubPath` to extract the common code. New `doNsEnterBindSubPath` is added for the nsenter related parts.

**Release note**:

```release-note
NONE
```
2018-06-01 12:12:19 -07:00
Kubernetes Submit Queue 5710943612
Merge pull request #63839 from wgliang/master.movepkg
Automatic merge from submit-queue (batch tested with PRs 63348, 63839, 63143, 64447, 64567). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move pkg/scheduler/schedulercache -> pkg/scheduler/cache

**What this PR does / why we need it**:
Move pkg/scheduler/schedulercache -> pkg/scheduler/cache

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63813

**Special notes for your reviewer**:

In order to prevent name conflicts still rename the `cache` to `schedulercache`.

**Release note**:

```release-note
NONE
```
2018-06-01 12:12:15 -07:00
vikaschoudhary16 f2eeb087e9 Add feature gate for kubelet plugin watcher 2018-06-01 04:42:30 -04:00
Kubernetes Submit Queue 8d10a8f74f
Merge pull request #64006 from Random-Liu/streaming-auth
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add proxy for container streaming in kubelet for streaming auth.

For https://github.com/kubernetes/kubernetes/issues/36666, option 2 of https://github.com/kubernetes/kubernetes/issues/36666#issuecomment-378440458.

This PR:
1. Removed the `DirectStreamingRuntime`, and changed `IndirectStreamingRuntime` to `StreamingRuntime`. All `DirectStreamingRuntime`s, `dockertools` and `rkt`, were removed.
2. Proxy container streaming in kubelet instead of returning redirect to apiserver. This solves the container runtime authentication issue, which is what we agreed on in https://github.com/kubernetes/kubernetes/issues/36666.

Please note that, this PR replaced the redirect with proxy directly instead of adding a knob to switch between the 2 behaviors. For existing CRI runtimes like containerd and cri-o, they should change to serve container streaming on localhost, so as to make the whole container streaming connection secure.

 If a general authentication mechanism proposed in https://github.com/kubernetes/kubernetes/issues/62747 is ready, we can switch back to redirect, and all code can be found in github history.

Please also note that this added some overhead in kubelet when there are container streaming connections. However, the actual bottleneck is in the apiserver anyway, because it does proxy for all container streaming happens in the cluster. So it seems fine to get security and simplicity with this overhead. @derekwaynecarr @mrunalp Are you ok with this? Or do you prefer a knob?

@yujuhong @timstclair @dchen1107 @mikebrow @feiskyer 
/cc @kubernetes/sig-node-pr-reviews 
**Release note**:

```release-note
Kubelet now proxies container streaming between apiserver and container runtime. The connection between kubelet and apiserver is authenticated. Container runtime should change streaming server to serve on localhost, to make the connection between kubelet and container runtime local.

In this way, the whole container streaming connection is secure. To switch back to the old behavior, set `--redirect-container-streaming=true` flag.
```
2018-05-31 22:45:29 -07:00
RaviSantosh Gudimetla 872addf9e3
Revert "Remove rescheduler and corresponding tests from master" 2018-05-31 22:18:49 -04:00
Lantao Liu 746c32db4c Update bazel. 2018-05-31 15:26:32 -07:00
Lantao Liu 1eb721248b Update unit test. 2018-05-31 15:26:32 -07:00
Lantao Liu 174b6d0e2f Proxy container streaming in kubelet. 2018-05-31 15:26:32 -07:00
Hemant Kumar 179e5d7006 Rename online resizine feature gate 2018-05-31 17:28:12 -04:00
Guoliang Wang 761cf41427 Move pkg/scheduler/schedulercache -> pkg/scheduler/cache 2018-05-31 22:55:34 +08:00
mlmhl ca12c73323 implement kubelet side online file system resize for volume 2018-05-31 17:10:24 +08:00
Kubernetes Submit Queue a762ea1beb
Merge pull request #64364 from ravisantoshgudimetla/remove-rescheduler
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove rescheduler and corresponding tests from master

**What this PR does / why we need it**:
This is to remove rescheduler from master branch as we are promoting priority and preemption to beta.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of #57471

**Special notes for your reviewer**:
/cc @bsalamat @aveshagarwal 
**Release note**:

```release-note
Remove rescheduler from master.
```
2018-05-30 22:20:26 -07:00
Kubernetes Submit Queue 4df4a607cd
Merge pull request #64486 from mtaufen/cleanup-unused-status-message
Automatic merge from submit-queue (batch tested with PRs 64338, 64219, 64486, 64495, 64347). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove unused status per TODO

This should have been deleted in #63221, as it is now unused.

```release-note
NONE
```
2018-05-30 20:17:19 -07:00
Kubernetes Submit Queue 3e127ccbef
Merge pull request #57082 from tianshapjq/small-nit-container/os.go
Automatic merge from submit-queue (batch tested with PRs 57082, 64325, 64016, 64443, 64403). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

small nit in the annotations of pkg/kubelet/container/os.go

**What this PR does / why we need it**:
just a small nit in the annotations of container/os.go, but, it looks quite uncomfortable cause others all get right.
2018-05-30 18:49:10 -07:00
Kubernetes Submit Queue e978c47f5e
Merge pull request #64170 from mtaufen/cap-node-num-images
Automatic merge from submit-queue (batch tested with PRs 61803, 64305, 64170, 64361, 64339). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add a flag to control the cap on images reported in node status

While I normally try to avoid adding flags, this is a short term
scalability fix for v1.11, and there are other long-term solutions in
the works, so we shouldn't commit to this in the v1beta1 Kubelet config.
Flags are our escape hatch here.

```release-note
NONE
```
2018-05-30 17:34:18 -07:00
Kubernetes Submit Queue ea92879fab
Merge pull request #62306 from freehan/pod-status-patch2
Automatic merge from submit-queue (batch tested with PRs 58920, 58327, 60577, 49388, 62306). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use Patch instead of Put to sync pod status

ref: https://github.com/kubernetes/community/blob/master/keps/sig-network/0007-pod-ready%2B%2B.md
```release-note
Use Patch instead of Put to sync pod status
```
2018-05-30 16:09:36 -07:00
Kubernetes Submit Queue 6b2fc7cb75
Merge pull request #49388 from HotelsDotCom/feature/Dynamic-env-in-subpath
Automatic merge from submit-queue (batch tested with PRs 58920, 58327, 60577, 49388, 62306). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Dynamic env in subpath - Fixes Issue 48677

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48677

**Special notes for your reviewer**:

**Release note**:

```release-note
Adds the VolumeSubpathEnvExpansion alpha feature to support environment variable expansion
Sub-paths cannot be mounted with a dynamic volume mount name.
This fix provides environment variable expansion to sub paths
This reduces the need to manage symbolic linking within sidecar init containers to achieve the same goal  
```
2018-05-30 16:09:31 -07:00
Michael Taufen 0539086ff3 add a flag to control the cap on images reported in node status
While I normally try to avoid adding flags, this is a short term
scalability fix for v1.11, and there are other long-term solutions in
the works, so we shouldn't commit to this in the v1beta1 Kubelet config.
Flags are our escape hatch.
2018-05-30 12:54:30 -07:00
Minhan Xia 85e0d05ac7 add utils for pod condition 2018-05-30 11:33:55 -07:00
Minhan Xia 78b86333c1 make update 2018-05-30 11:33:55 -07:00
Minhan Xia cb9ac04777 fix unit tests using Patch in fake client 2018-05-30 11:33:55 -07:00
Minhan Xia 35777c31ea change kubelet status manager to use patch instead of put to update pod status 2018-05-30 11:15:47 -07:00
Kubernetes Submit Queue 4a44cda40a
Merge pull request #63328 from vikaschoudhary16/probe-watcher-duplicate
Automatic merge from submit-queue (batch tested with PRs 63328, 64316, 64444, 64449, 64453). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add probe based mechanism for kubelet plugin discovery

**Which issue(s) this PR fixes**
Fixes #56944 
[Design Doc](https://docs.google.com/document/d/1dtHpGY-gPe9sY7zzMGnm8Ywo09zJfNH-E1KEALFV39s/edit#heading=h.7fe6spexljh6)

**Notes For Reviewers**:
Original PR is https://github.com/kubernetes/kubernetes/pull/59963. But because of too many comments(171) that PR does not open sometimes. Therefore this new PR is created to get the github link working.
 
Related PR is https://github.com/kubernetes/kubernetes/pull/58755 
For review efficiency, separating out of the commits or original PR here. 

```release-note
Add probe based mechanism for kubelet plugin discovery
```
/sig node
/area hw-accelerators
/cc @jiayingz @RenaudWasTaken @vishh @ScorpioCPH @sjenning @derekwaynecarr @jeremyeder @lichuqiang @tengqm @saad-ali @chakri-nelluri @ConnorDoyle @vladimirvivien
2018-05-30 08:42:11 -07:00
Kubernetes Submit Queue 15cd355281
Merge pull request #64213 from dashpole/eviction_event_annotation
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add metadata to kubelet eviction event annotations

**What this PR does / why we need it**:
Add annotations to kubelet eviction events.  Annotations include 
"offending_containers" : comma-seperated list of containers.
"offending_containers_usage": comma-seperated list of usage.
"starved_resource": v1.ResourceName of the starved resource

**Special notes for your reviewer**:
Adding annotations to events required changing the `EventRecorder` interface to add a `AnnotatedEventf` function, which can add annotations to an event.

**Release note**:
```release-note
NONE
```
/assign @dchen1107 
cc @mwielgus @schylek @kgrygiel
2018-05-29 23:37:47 -07:00
xuzhonghu 9492cf368e move oldNodeUnschedulable pkg var to kubelet struct 2018-05-30 14:09:13 +08:00
Michael Taufen 665f166c29 remove unused status per TODO
This should have been deleted in #63221, as it is now unused.
2018-05-29 17:34:00 -07:00
ravisantoshgudimetla aeccffc339 Phase out rescheduler in favor of priority and preemption 2018-05-29 19:52:06 -04:00
Kubernetes Submit Queue c6e0a225f9
Merge pull request #64155 from figo/master
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

improve test: verify kubelet.config.Restore only happen once

**What this PR does / why we need it**:
This patch is to add additional test coverage of pod config restore, 
it verifies that restore can only happen once.

in the second restore attempt, we should expect no error and no channel update.

**Which issue(s) this PR fixes**:

this is a test improvement based on test been added at https://github.com/kubernetes/kubernetes/pull/63553


**Special notes for your reviewer**:

**Release note**:

```release-note
None
```

/sig node
/cc @rphillips @jiayingz @vikaschoudhary16 @anfernee @Random-Liu  @dchen1107  @derekwaynecarr 
@vishh @yujuhong @tallclair
2018-05-29 16:17:28 -07:00
Lantao Liu aeb6cacf01 Remove direct and indirect streaming runtime interface. 2018-05-29 15:08:15 -07:00
Kevin Taylor b2d4426f09 Add dynamic environment variable substitution to subpaths 2018-05-29 17:01:09 +01:00
vikaschoudhary16 3a2e3bcc70 Add probe based mechanism for kubelet plugin discovery 2018-05-29 12:00:37 -04:00
vikaschoudhary16 401bab3642 Auto-generated files 2018-05-29 12:00:37 -04:00
Guoliang Wang 9449a4372e cleanup some dead kubelet code 2018-05-29 22:38:01 +08:00
Kubernetes Submit Queue be43b7cc9d
Merge pull request #64352 from Random-Liu/clean-limit-writer
Automatic merge from submit-queue (batch tested with PRs 64355, 64328, 64352). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove unused limit writer.

All container runtimes are integrated through CRI now. Write limit is handled in https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kuberuntime/logs/logs.go now.

Signed-off-by: Lantao Liu <lantaol@google.com>

@yujuhong @feiskyer @kubernetes/sig-node-pr-reviews 

**Release note**:

```release-note
none
```
2018-05-27 04:08:09 -07:00
Kubernetes Submit Queue 2cb7ab012b
Merge pull request #62984 from feiskyer/klet-validation
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Validate cgroups-per-qos for Windows

**What this PR does / why we need it**:

cgroups-per-qos and enforce-node-allocatable is not supported on Windows, but kubelet allows it on Windows. And then Pods may stuck in terminating state because of it. Refer #61716.

This PR adds validation for them and make kubelet refusing to start in this case.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #61716

**Special notes for your reviewer**:

**Release note**:

```release-note
Fail fast if cgroups-per-qos is set on Windows
```
2018-05-26 03:03:13 -07:00
Lantao Liu 7c17ee25ec Remove unused limit writer.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-05-25 16:55:08 -07:00
Andrew McDermott ca58578b24 Resurrect lost log line 2018-05-24 20:44:12 +01:00
Andrew McDermott 9cbd54018f Remove signal handler registration from pkg/kubelet
The goal of this change is to remove the registration of signal
handling from pkg/kubelet. We now pass in a stop channel.

If you register a signal handler in `main()` to aid in a controlled
and deliberate exit then the handler registered in `pkg/kubelet` often
wins and the process exits immediately. This means all other signal
handler registrations are currently racy if `DockerServer.Start()` is
directly or indirectly invoked.

This change also removes another signal handler registration from
`NewAPIServerCommand()`; a stop channel is now passed to this
function.
2018-05-24 20:44:12 +01:00
Kubernetes Submit Queue 97f4a64fac
Merge pull request #63434 from adfinis-forks/bug_typo_kubelet_volume_stats
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix typo in volume_stats.go

**What this PR does / why we need it**:
While reviewing the implementation details I came across a typo in volume_stats.go
sed/volumeStatsCollecotr/volumeStatsCollector/

**Release note**:

```release-note
NONE
```
2018-05-24 11:44:20 -07:00
David Ashpole 2be67e7dde use memory metrics from the pod cgroup for eviction ranking 2018-05-24 10:59:53 -07:00
Alin-Gheorghe Balutoiu 49e762ab3a Fix Windows CNI for the sandbox case
Windows supports both sandbox and non-sandbox cases. The non-sandbox
case is for Windows Server 2016 and for Windows Server version greater
than 1709 which use Hyper-V containers.

Currently, the CNI on Windows fetches the IP from the containers
within the pods regardless of the mode. This should be done only
in the non-sandbox mode where the IP of the actual container
will be different than the IP of the sandbox container.

In the case where the sandbox container is supported, all the containers
from the same pod will share the network details of the sandbox container.

This patch updates the CNI to fetch the IP from the sandbox container
when this mode is supported.

Signed-off-by: Alin Balutoiu <abalutoiu@cloudbasesolutions.com>
2018-05-24 08:56:30 +02:00
Kubernetes Submit Queue 731eaecfd1
Merge pull request #57527 from mtaufen/kc-metric
Automatic merge from submit-queue (batch tested with PRs 64013, 63896, 64139, 57527, 62102). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add dynamic config metrics

This PR exports config-releated metrics from the Kubelet.
The Guages for active, assigned, and last-known-good config can be used
to identify config versions and produce aggregate counts across several
nodes. The error-reporting Gauge can be used to determine whether a node
is experiencing a config-related error, and to prodouce an aggregate
count of nodes in an error state.

https://github.com/kubernetes/features/issues/281

```release-note
The Kubelet now exports metrics that report the assigned (node_config_assigned), last-known-good (node_config_last_known_good), and active (node_config_active) config sources, and a metric indicating whether the node is experiencing a config-related error (node_config_error). The config source metrics always report the value 1, and carry the node_config_name, node_config_uid, node_config_resource_version, and node_config_kubelet_key labels, which identify the config version. The error metric reports 1 if there is an error, 0 otherwise.
```
2018-05-23 19:44:21 -07:00
David Ashpole fd1f19fc42 add metadata to kubelet eviction event annotations 2018-05-23 16:12:54 -07:00
hui luo bf48d39f39 add test: verify kubelet.config.Restore only happen once 2018-05-23 10:10:40 -07:00
Kubernetes Submit Queue 02818ed092
Merge pull request #63912 from luxas/move_rotatecerts_kubelet
Automatic merge from submit-queue (batch tested with PRs 59851, 64114, 63912, 64156, 64191). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: Move RotateCertificates to the KubeletConfiguration struct

**What this PR does / why we need it**:
Moves `.RotateCertificates` to the `KubeletConfiguration` struct, so it can be configured via the Config file smoothly.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/63878
Fixes https://github.com/kubernetes/kubernetes/issues/61653

**Special notes for your reviewer**:
Pretty similar to https://github.com/kubernetes/kubernetes/pull/62352

**Release note**:

```release-note
The kubelet certificate rotation feature can now be enabled via the `.RotateCertificates` field in the kubelet's config file. The `--rotate-certificates` flag is now deprecated, and will be removed in a future release.
```
@kubernetes/sig-node-pr-reviews @kubernetes/sig-cluster-lifecycle-pr-reviews
2018-05-23 09:06:15 -07:00
Jan Safranek 7e3fb502a8 Change SafeMakeDir to resolve symlinks in mounter implementation
Kubelet should not resolve symlinks outside of mounter interface.
Only mounter interface knows, how to resolve them properly on the host.

As consequence, declaration of SafeMakeDir changes to simplify the
implementation:
from SafeMakeDir(fullPath string, base string, perm os.FileMode)
to   SafeMakeDir(subdirectoryInBase string, base string, perm os.FileMode)
2018-05-23 10:21:20 +02:00
Jan Safranek 74ba0878a1 Enhance ExistsPath check
It should return error when the check fails (e.g. no permissions, symlink link
loop etc.)
2018-05-23 10:21:20 +02:00
Jan Safranek 97b5299cd7 Add GetMode to mounter interface.
Kubelet must not call os.Lstat on raw volume paths when it runs in a container.
Mounter knows where the file really is.
2018-05-23 10:17:59 +02:00
Pengfei Ni 40abe94a40 Validate cgroups-per-qos for windows 2018-05-23 11:12:59 +08:00
Kubernetes Submit Queue 36b1f67617
Merge pull request #64026 from jsafrane/csi-selinux
Automatic merge from submit-queue (batch tested with PRs 63914, 63887, 64116, 64026, 62933). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Enable SELinux relabeling in CSI volumes

**What this PR does / why we need it**:
CSI volume plugin should provide correct information in `GetAttributes` call so kubelet can ask container runtime to relabel the volume. Therefore CSI volume plugin needs to check if a random volume mounted by a CSI driver supports SELinux or not by checking for "seclabel" mount or superblock option.


**Which issue(s) this PR fixes**
Fixes #63965

**Release note**:
```release-note
NONE
```

@saad-ali @vladimirvivien @davidz627 
@cofyc, FYI, I'm changing `struct mountInfo`.
2018-05-22 17:36:18 -07:00
Lucas Käldström 57e74f9928
autogenerated 2018-05-23 00:19:21 +03:00
Lucas Käldström 2590e127f9
kubelet: Move RotateCertificates to the KubeletConfiguration struct 2018-05-23 00:19:11 +03:00
Michael Taufen fd3432ef05 add dynamic config metrics
This PR exports config-releated metrics from the Kubelet.
The Guages for active, assigned, and last-known-good config can be used
to identify config versions and produce aggregate counts across several
nodes. The error-reporting Gauge can be used to determine whether a node
is experiencing a config-related error, and to prodouce an aggregate
count of nodes in an error state.
2018-05-22 14:08:55 -07:00
Kubernetes Submit Queue 6935b755b9
Merge pull request #63553 from rphillips/fixes/checkpoint_logic_on_restore
Automatic merge from submit-queue (batch tested with PRs 63151, 63795, 63553, 64068, 64113). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: fix checkpoint manager logic bug on restore

**What this PR does / why we need it**:
I am testing the new checkpoint logic within the kubelet and ran across a logic bug on API server restores.

Initial PR: https://github.com/kubernetes/kubernetes/pull/56040

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:
/cc @vikaschoudhary16 

**Release note**:
```release-note
NONE
```
2018-05-21 21:41:18 -07:00
Kubernetes Submit Queue 0ea35f4c61
Merge pull request #63795 from wojtek-t/watching_secret_manager
Automatic merge from submit-queue (batch tested with PRs 63151, 63795, 63553, 64068, 64113). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Implement watch-based secret manager

Initial experiments on 5000-node Kubemark show that apiserver is handling those with no real issues.
That said, we shouldn't enable it in prod without much more extensive scalability tests (so most probably not in 1.11), but having that in would enable easier testing.

@liggitt
2018-05-21 21:41:14 -07:00
Kubernetes Submit Queue 9eb0c35668
Merge pull request #63701 from scf0920/branch-1
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix typo: peirodically->periodically

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-05-21 18:42:59 -07:00
Kubernetes Submit Queue 2a989c60ff
Merge pull request #63221 from mtaufen/dkcfg-live-configmap
Automatic merge from submit-queue (batch tested with PRs 63881, 64046, 63409, 63402, 63221). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Kubelet responds to ConfigMap mutations for dynamic Kubelet config

This PR makes dynamic Kubelet config easier to reason about by leaving less room for silent skew scenarios. The new behavior is as follows:
- ConfigMap does not exist: Kubelet reports error status due to missing source
- ConfigMap is created: Kubelet starts using it
- ConfigMap is updated: Kubelet respects the update (but we discourage this pattern, in favor of incrementally migrating to a new ConfigMap)
- ConfigMap is deleted: Kubelet keeps using the config (non-disruptive), but reports error status due to missing source
- ConfigMap is recreated: Kubelet respects any updates (but, again, we discourage this pattern)

This PR also makes a small change to the config checkpoint file tree structure, because ResourceVersion is now taken into account when saving checkpoints. The new structure is as follows:
```
- dir named by --dynamic-config-dir (root for managing dynamic config)
| - meta
  | - assigned (encoded kubeletconfig/v1beta1.SerializedNodeConfigSource object, indicating the assigned config)
  | - last-known-good (encoded kubeletconfig/v1beta1.SerializedNodeConfigSource object, indicating the last-known-good config)
| - checkpoints
  | - uid1 (dir for versions of object identified by uid1)
    | - resourceVersion1 (dir for unpacked files from resourceVersion1)
    | - ...
  | - ...
```


fixes: #61643

```release-note
The dynamic Kubelet config feature will now update config in the event of a ConfigMap mutation, which reduces the chance for silent config skew. Only name, namespace, and kubeletConfigKey may now be set in Node.Spec.ConfigSource.ConfigMap. The least disruptive pattern for config management is still to create a new ConfigMap and incrementally roll out a new Node.Spec.ConfigSource.
```
2018-05-21 17:05:42 -07:00
Kubernetes Submit Queue 6d510f52f2
Merge pull request #63409 from mtaufen/kc-validation-feature-gates
Automatic merge from submit-queue (batch tested with PRs 63881, 64046, 63409, 63402, 63221). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Kubelet config: Validate new config against future feature gates

This fixes an issue with KubeletConfiguration validation, where the             
feature gates set by the new config were not taken into account.                
                                                                                
Also fixes a validation issue with dynamic Kubelet config, where flag           
precedence was not enforced prior to dynamic config validation in the           
controller; this prevented rejection of dynamic configs that don't merge        
well with values set via legacy flags. 

Fixes #63305 

```release-note
NONE
```
2018-05-21 17:05:34 -07:00
Ryan Phillips 6469c8e333 kubelet: fix checkpoint manager logic bug on restore 2018-05-21 16:17:48 -05:00
Michael Taufen b5648c3f61 dynamic Kubelet config reconciles ConfigMap updates 2018-05-21 09:03:58 -07:00
Klaudiusz Dembler 9384937f2f Update bazel 2018-05-21 17:39:51 +02:00
Klaudiusz Dembler de1063bc7d Add compatibility tests 2018-05-21 14:50:31 +02:00
Klaudiusz Dembler 3d09101b6f Add docstrings 2018-05-21 11:40:04 +02:00
Michael Taufen 647e90341c Kubelet config: Validate new config against future feature gates
This fixes an issue with KubeletConfiguration validation, where the
feature gates set by the new config were not taken into account.

Also fixes a validation issue with dynamic Kubelet config, where flag
precedence was not enforced prior to dynamic config validation in the
controller; this prevented rejection of dynamic configs that don't merge
well with values set via legacy flags.
2018-05-20 13:15:59 -07:00
Davanum Srinivas eb6bd67446
Bump grpc max message size for docker service 2018-05-19 16:52:00 -04:00
Kubernetes Submit Queue b9c07d272c
Merge pull request #63963 from wojtek-t/collapse_configmap_manager
Automatic merge from submit-queue (batch tested with PRs 63598, 63913, 63459, 63963, 60464). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor ConfigMapManager

This is a follow up from https://github.com/kubernetes/kubernetes/pull/63857 to unify SecretManager with ConfigMap manager.

This is more-or-less a copy of that PR for ConfigMapManager, with super minor changes in secretManager code.
2018-05-19 06:49:23 -07:00
Kubernetes Submit Queue 4d786a937d
Merge pull request #63977 from runcom/increase-grpc-resp-size
Automatic merge from submit-queue (batch tested with PRs 60012, 63692, 63977, 63960, 64008). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

pkg: kubelet: remote: increase grpc client default size

Signed-off-by: Antonio Murdaca <runcom@redhat.com>



**What this PR does / why we need it**:

when running lots and lots of containers and having tons of images on a given node, we started seeing this in the logs (with docker):

```
Unable to retrieve pods: rpc error: code = ResourceExhausted desc = grpc: received message larger than max (4208374 vs. 4194304)
```

That's because the grpc client is defaulting to a 4MB response size.
This patch increases the resp size to 8MB to avoid such issue.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
increase grpc client default response size
```
2018-05-18 23:35:20 -07:00
Mikhail Mazurskiy 5e8e570dbd
Use Dial with context 2018-05-19 08:14:37 +10:00
Kubernetes Submit Queue 1b950d1e8e
Merge pull request #63337 from vikaschoudhary16/fix-e2e
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix e2e "When checkpoint file is corrupted should complete pod sandbo…

…x clean up"



**What this PR does / why we need it**:
This PR fixes the e2e-node test, "When checkpoint file is corrupted should complete pod sandbox clean up"

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62738
Related #62937

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
/cc @dashpole @derekwaynecarr 
/sig node
2018-05-18 04:12:43 -07:00
Antonio Murdaca 57a2eec677
pkg: kubelet: remote: increase grpc client default size
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2018-05-17 17:32:33 +02:00
Jan Safranek 598ca5accc Add GetSELinuxSupport to mounter. 2018-05-17 13:36:37 +02:00
wojtekt 068844aeb1 WatchingSecretManager 2018-05-17 12:18:14 +02:00
wojtekt 01e58de70c Refactor ConfigMapManager 2018-05-17 11:37:35 +02:00
Kubernetes Submit Queue 8f0bb37fdc
Merge pull request #63857 from wojtek-t/collapse_secret_manager
Automatic merge from submit-queue (batch tested with PRs 63886, 63857, 63824). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor cache based manager

This is support to be no-op refactoring. It will only allow to share code between secret and configmap managers.
2018-05-17 02:08:55 -07:00
Pengfei Ni 7cbee06a00 Add Pod stats for Windows containers 2018-05-17 15:28:46 +08:00
Kubernetes Submit Queue da8e25c63d
Merge pull request #63936 from awly/extract-connwatch
Automatic merge from submit-queue (batch tested with PRs 63865, 57849, 63932, 63930, 63936). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Extract connection rotating dialer into a package

**What this PR does / why we need it**: This will be re-used for exec auth plugin to rotate connections on
credential change.

**Special notes for your reviewer**: this was split from https://github.com/kubernetes/kubernetes/pull/61803 to simplify review

**Release note**:
```release-note
NONE
```
2018-05-17 00:28:30 -07:00
Kubernetes Submit Queue 2accf11f1a
Merge pull request #57849 from dashpole/eviction_test_event
Automatic merge from submit-queue (batch tested with PRs 63865, 57849, 63932, 63930, 63936). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Eviction Node e2e test checks for eviction reason

**What this PR does / why we need it**:
Currently, the eviction test simply ensures that pods are marked `Failed`.  However, this could occur because of an OOM, rather than an eviction.
To ensure that pods are actually being evicted, check for the Reason in the pod status to ensure it is evicted.

**Release note**:
```release-note
NONE
```

cc @kubernetes/sig-node-pr-reviews
2018-05-17 00:28:19 -07:00
Pengfei Ni 196381c039 Add fs status for Windows containers 2018-05-17 14:22:21 +08:00
Kubernetes Submit Queue c7bfc2a14e
Merge pull request #63220 from dashpole/fix_memcg_format
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix formatting for kubelet memcg notification threshold

/kind bug
**What this PR does / why we need it**:
This fixes the following errors (found in [this node_e2e serial test log](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-serial/4118/artifacts/tmp-node-e2e-49baaf8a-cos-stable-63-10032-71-0/kubelet.log)):
`eviction_manager.go:256] eviction manager attempting to integrate with kernel memcg notification api`
`threshold_notifier_linux.go:70] eviction: setting notification threshold to 4828488Ki`
`eviction_manager.go:272] eviction manager: failed to create hard memory threshold notifier: invalid argument`

**Special notes for your reviewer**:
This needs to be cherrypicked back to 1.10.
This regression was added in https://github.com/kubernetes/kubernetes/pull/60531, because the `quantity` being used was changed from a DecimalSI to BinarySI, which changes how it is printed out in the String() method.  To make it more explicit that we want the value, just convert Value() to a string.

**Release note**:
```release-note
Fix memory cgroup notifications, and reduce associated log spam.
```
2018-05-16 15:25:06 -07:00
Andrew Lytvynov 85a61ff3aa Extract connection rotating dialer into a package
This will be re-used for exec auth plugin to rotate connections on
credential change.
2018-05-16 10:30:53 -07:00
wojtekt de37da8532 Refactor cache based manager 2018-05-16 10:59:32 +02:00
weipeng1213 a09a11fcb6 fix typo: writeable->writable 2018-05-16 12:37:17 +08:00
Pengfei Ni 228ba8e909 Setup dns servers and search domains for Windows Pods 2018-05-16 12:18:19 +08:00
Kubernetes Submit Queue 792832bafc
Merge pull request #62242 from feiskyer/pod-cidr
Automatic merge from submit-queue (batch tested with PRs 63314, 63884, 63799, 63521, 62242). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Check CIDR before updating node status

**What this PR does / why we need it**:

Check CIDR before updating node status.  See #62164.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62164

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-05-15 19:55:19 -07:00
Kubernetes Submit Queue 6934c4f599
Merge pull request #63521 from dashpole/allocatable_memcg
Automatic merge from submit-queue (batch tested with PRs 63314, 63884, 63799, 63521, 62242). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add memcg notifications for allocatable cgroup

**What this PR does / why we need it**:
Use memory cgroup notifications to trigger the eviction manager when the allocatable eviction threshold is crossed.  This allows the eviction manager to respond more quickly when the allocatable cgroup's available memory becomes low.  Evictions are preferable to OOMs in the cgroup since the kubelet can enforce its priorities on which pod is killed.

**Which issue(s) this PR fixes**:
Fixes https://github.com/kubernetes/kubernetes/issues/57901

**Special notes for your reviewer**:
This adds the alloctable cgroup from the container manager to the eviction config.

**Release note**:
```release-note
NONE
```
/sig node
/priority important-soon
/kind feature

I would like this to be included in the 1.11 release.
2018-05-15 19:55:15 -07:00
Kubernetes Submit Queue 2fcac6abf2
Merge pull request #63314 from mtaufen/dkcfg-structured-status
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move to a structured status for dynamic kubelet config

This PR updates dynamic Kubelet config to use a structured status, rather than a node condition. This makes the status machine-readable, and thus more useful for config orchestration. 

Fixes: #56896

```release-note
The status of dynamic Kubelet config is now reported via Node.Status.Config, rather than the KubeletConfigOk node condition.
```
2018-05-15 19:41:36 -07:00
Michael Taufen fcc1f8e7b6 Move to a structured status for dynamic Kubelet config
Updates dynamic Kubelet config to use a structured status, rather than a
node condition. This makes the status machine-readable, and thus more
useful for config orchestration.

Fixes: #56896
2018-05-15 11:25:12 -07:00
Klaudiusz Dembler aa325ec2d9 Change JSON letter case in tests 2018-05-15 18:43:48 +02:00
Klaudiusz Dembler 7bb047ec75 Rebase and backward compatibility 2018-05-15 18:34:53 +02:00
Kubernetes Submit Queue b71966acea
Merge pull request #62015 from feiskyer/container-log
Automatic merge from submit-queue (batch tested with PRs 63603, 63557, 62015). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

CRI: update documents for container logpath

**What this PR does / why we need it**:

The container log path has been changed from  `containername_attempt#.log` to `containername/attempt#.log` in #59906. This PR updates CRI documents for it.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
CRI: update documents for container logpath. The container log path has been changed from containername_attempt#.log to containername/attempt#.log 
```
2018-05-15 02:07:44 -07:00
Kubernetes Submit Queue 8220171d8a
Merge pull request #63492 from liggitt/node-heartbeat-close-connections
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

track/close kubelet->API connections on heartbeat failure

xref #48638
xref https://github.com/kubernetes-incubator/kube-aws/issues/598

we're already typically tracking kubelet -> API connections and have the ability to force close them as part of client cert rotation. if we do that tracking unconditionally, we gain the ability to also force close connections on heartbeat failure as well. it's a big hammer (means reestablishing pod watches, etc), but so is having all your pods evicted because you didn't heartbeat.

this intentionally does minimal refactoring/extraction of the cert connection tracking transport in case we want to backport this

* first commit unconditionally sets up the connection-tracking dialer, and moves all the cert management logic inside an if-block that gets skipped if no certificate manager is provided (view with whitespace ignored to see what actually changed)
* second commit plumbs the connection-closing function to the heartbeat loop and calls it on repeated failures

follow-ups:
* consider backporting this to 1.10, 1.9, 1.8
* refactor the connection managing dialer to not be so tightly bound to the client certificate management

/sig node
/sig api-machinery

```release-note
kubelet: fix hangs in updating Node status after network interruptions/changes between the kubelet and API server
```
2018-05-14 16:56:35 -07:00
Klaudiusz Dembler ba8d82c96a
Update error indicating unexistent checkpoint 2018-05-14 09:51:27 +02:00
Klaudiusz Dembler 0b1a73e94b
Make cpuManagerCheckpoint exported 2018-05-14 09:51:27 +02:00
Klaudiusz Dembler cc3fa67bda
Add comments to MockCheckpoint functions and gofmt 2018-05-14 09:51:27 +02:00
Klaudiusz Dembler 0fbd19bc06
Tweaks 2018-05-14 09:51:26 +02:00
Klaudiusz Dembler 3991ed5d2f
Add tests 2018-05-14 09:51:26 +02:00
Klaudiusz Dembler 6bfceed4ab
Migrate cpumanager to use checkpointing manager 2018-05-14 09:45:58 +02:00
Kubernetes Submit Queue 6017f6daef
Merge pull request #63170 from micahhausler/node-ip-fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Report node DNS info with --node-ip

**What this PR does / why we need it**:
This PR adds `ExternalDNS`, `InternalDNS`, and `ExternalIP` info for kubelets with the `--nodeip` flag enabled. 

**Which issue(s) this PR fixes** 
Fixes #63158

**Special notes for your reviewer**:

I added a field to the Kubelet to make IP validation more testable (`validateNodeIP` relies on the `net` package and the IP address of the host that is executing the test.) I also converted the test to use a table so new cases could be added more easily.

**Release Notes**
```release-note
Report node DNS info with --node-ip flag
```

@andrewsykim
@nckturner 

/sig node
/sig network
2018-05-11 15:46:35 -07:00
Kubernetes Submit Queue 204520b029
Merge pull request #63344 from RobertKrawitz/fix-process-kill-algorithm
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Correct kill logic for pod processes

Correct the kill logic for processes in the pod's cgroup.  os.FindProcess() does not check whether the process exists on POSIX systems.
2018-05-11 11:41:19 -07:00
Chengfei Shang 27dcb1f362 fix typo: peirodically->periodically 2018-05-11 14:39:07 +08:00
Kubernetes Submit Queue 7eb88f11d2
Merge pull request #59727 from wgliang/master.time
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

should use time.Since instead of time.Now().Sub

**What this PR does / why we need it**:
should use time.Since instead of time.Now().Sub

**Special notes for your reviewer**:
2018-05-10 20:29:40 -07:00
Kubernetes Submit Queue 321201f672
Merge pull request #63406 from derekwaynecarr/label-pod-cgroups
Automatic merge from submit-queue (batch tested with PRs 60200, 63623, 63406). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Apply pod name and namespace labels for pod cgroup for cadvisor metrics

**What this PR does / why we need it**:
1. Enable Prometheus users to determine usage by pod name and namespace for pod cgroup sandbox.
1. Label cAdvisor metrics for pod cgroups by pod name and namespace.
1. Aligns with kubelet stats summary endpoint pod cpu and memory stats.

**Special notes for your reviewer**:
This provides parity with the summary API enhancements done here:
https://github.com/kubernetes/kubernetes/pull/55969

**Release note**:
```release-note
Apply pod name and namespace labels to pod cgroup in cAdvisor metrics
```
2018-05-10 08:33:11 -07:00
Kubernetes Submit Queue d42df4561a
Merge pull request #61976 from atlassian/ticker-with-stop
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Stop() for Ticker to enable leak-free code

**What this PR does / why we need it**:
I wanted to use the clock package but the `Ticker` without a `Stop()` method is a deal breaker for me.

**Release note**:
```release-note
NONE
```
/kind enhancement
/sig api-machinery
2018-05-09 19:06:56 -07:00
Kubernetes Submit Queue b2fe2a0a6d
Merge pull request #59847 from mtaufen/dkcfg-explicit-keys
Automatic merge from submit-queue (batch tested with PRs 63624, 59847). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

explicit kubelet config key in Node.Spec.ConfigSource.ConfigMap

This makes the Kubelet config key in the ConfigMap an explicit part of
the API, so we can stop using magic key names.
    
As part of this change, we are retiring ConfigMapRef for ConfigMap.


```release-note
You must now specify Node.Spec.ConfigSource.ConfigMap.KubeletConfigKey when using dynamic Kubelet config to tell the Kubelet which key of the ConfigMap identifies its config file.
```
2018-05-09 17:55:13 -07:00
Kubernetes Submit Queue 8de6600a55
Merge pull request #63539 from wojtek-t/refactor_secret_configmap_manager
Automatic merge from submit-queue (batch tested with PRs 63593, 63539). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor cachingSecretManager

I have a POC of watch-based implementation of SecretManager in https://github.com/kubernetes/kubernetes/pull/63461

This is an initial refactoring that would make that change easier.

@yujuhong - if you're fine with this PR, I will do the same for configmaps in the follow up PR.
2018-05-09 14:49:14 -07:00
wojtekt 52713cd837 Rename Add/Delete to *Reference 2018-05-09 18:14:50 +02:00
wojtekt 8e55220523 Refactor cachingSecretManager 2018-05-09 18:09:59 +02:00
Kubernetes Submit Queue ba3176d94c
Merge pull request #58580 from k82cn/k8s_58505
Automatic merge from submit-queue (batch tested with PRs 58580, 63120). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Admit BestEffort if it tolerates memory pressure.

Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #58505 

**Release note**:
```release-note
None
```
2018-05-08 21:45:10 -07:00
David Ashpole a5df208866 eviction test ensures failed pods are evicted 2018-05-08 16:08:35 -07:00
Michael Taufen c41cf55a2c explicit kubelet config key in Node.Spec.ConfigSource.ConfigMap
This makes the Kubelet config key in the ConfigMap an explicit part of
the API, so we can stop using magic key names.

As part of this change, we are retiring ConfigMapRef for ConfigMap.
2018-05-08 15:37:26 -07:00
David Eads c5445d3c56 simplify api registration 2018-05-08 18:33:50 -04:00
Kubernetes Submit Queue f13fa1e3af
Merge pull request #63415 from dashpole/eviction_event
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Clean up kubelet eviction events

**What this PR does / why we need it**:
This makes eviction events better.  
* Exceeding container disk limits no longer says "node was low on X", since the node isn't actually low on a resource.  The container limit was just exceeded.  Same for pods and volumes.
* Eviction message now lists containers which were exceeding their requests.  This is an event from a container evicted while under memory pressure: 
`reason: 'Evicted' The node was low on resource: memory. Container high-priority-memory-hog was using 166088Ki, which exceeds its request of 10Mi.`
* Eviction messages now displays real resources, when they exist.  Rather than `The node was low on resource: nodefs`, it will now show `The node was low on resource: ephemeral-storage`.

This also cleans up eviction code in order to accomplish this.  We previously had a resource for each signal: e.g. `SignalNodeFsAvailable` mapped to the resource`nodefs`, and `nodefs` maps to reclaim functions, and ranking functions.  Now, signals map directly to reclaim and ranking functions, and signals map to real resources: e.g. `SignalNodeFsAvailable` maps to the resource `ephemeral-storage`, which is what we use in events.
This also cleans up duplicated code by reusing the `evictPod` function.  It also removes the unused signal `SignalAllocatableNodeFsAvailable`.

**Release note**:
```release-note
NONE
```

/sig node
/priority important-longterm

/assign @dchen1107 @jingxu97
2018-05-08 10:38:29 -07:00
Kubernetes Submit Queue e6b6e5c4b4
Merge pull request #63045 from msau42/fix-subpath-readonly
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

passthrough readOnly to subpath

**What this PR does / why we need it**:
If a volume is mounted as readonly, or subpath volumeMount is configured as readonly, then the subpath bind mount should be readonly.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62752

**Special notes for your reviewer**:

**Release note**:

```release-note
Fixes issue where subpath readOnly mounts failed
```
2018-05-07 23:36:49 -07:00
Kubernetes Submit Queue af70337f8c
Merge pull request #63083 from mgdevstack/master-make-verify
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixed hack/verify-golint.sh reported errors 

**What this PR does / why we need it**:
Fixes errors reported by `hack/verify-golint.sh` while running `make verify`

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63065 

**Special notes for your reviewer**:
Fixing `hack/verify-gofmt.sh` reported error for `k8s.io/code-generator/cmd/client-gen/generators/client_generator.go` mentioned in #63065 throws error. So didn't modify that file.

**Release note**:

```release-note
NONE
```
/sig-testing-bugs
2018-05-07 18:23:46 -07:00
David Ashpole 2294f09e4e add memcg notifications for allocatable cgroup 2018-05-07 17:15:23 -07:00
Kubernetes Submit Queue 6549342386
Merge pull request #63496 from mtaufen/dkcfg-fs-tests
Automatic merge from submit-queue (batch tested with PRs 63488, 63496). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve test coverage of Kubelet file utils

Improves from 30.9% to 77.8%.

```release-note
NONE
```
2018-05-07 14:18:06 -07:00
Jordan Liggitt 814b065928
Close all kubelet->API connections on heartbeat failure 2018-05-07 15:06:31 -04:00
Jordan Liggitt 52876f77e9
Always track kubelet -> API connections 2018-05-07 15:06:30 -04:00
Derek Carr a09990cd43 Apply pod name and namespace labels for pod cgroup for cadvisor metrics 2018-05-07 14:51:12 -04:00
Michael Taufen e7b42f8a77 Improve test coverage of Kubelet file utils
Improves from 30.9% to 77.8%.
2018-05-07 11:17:21 -07:00
Kubernetes Submit Queue bc1c2de163
Merge pull request #62914 from sjenning/kubelet-unit-flake
Automatic merge from submit-queue (batch tested with PRs 62914, 63431). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: fix flake in TestUpdateExistingNodeStatusTimeout

xref https://github.com/openshift/origin/issues/19443

There are cases where some process, outside the test, attempts to connect to the port we are using to do the test, leading to a attempt count greater than what we expect.

To deal with this, just ensure that we have seen *at least* the number of connection attempts we expect.

@liggitt 

```release-note
NONE
```
2018-05-07 10:44:05 -07:00
David Ashpole db99c20a9a cleanup eviction events 2018-05-04 11:02:25 -07:00
Kubernetes Submit Queue 484f62a568
Merge pull request #63333 from deads2k/api-14-snip
Automatic merge from submit-queue (batch tested with PRs 63421, 63432, 63333). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

update tests to be specific about the versions they are testing

When setting up tests, you want to rely on your own scheme.  This eliminates coupling to floating versions which gives unnecessary flexibility in most cases and prevents testing all the versions you need.

@liggitt  scrubs unnecessary deps.

```release-note
NONE
```
2018-05-04 10:52:10 -07:00
Lukas Grossar 64dee74bb7
Fix typo in volume_stats.go
volumeStatsCollecotr -> volumeStatsCollector
2018-05-04 15:49:07 +02:00
Kubernetes Submit Queue 1929e0d86d
Merge pull request #63298 from dims/kubelet-remove-unused-code
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet - Remove unused code

**What this PR does / why we need it**:

Looks like we have a bunch of unused methods. Let's clean them up

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-05-04 04:20:06 -07:00
Mayank Gaikwad e03a41ddf8 fixed golint error on redundant if 2018-05-04 11:58:30 +05:30
vikaschoudhary16 f06fca9823 Fix e2e "When checkpoint file is corrupted should complete pod sandbox clean up" 2018-05-03 14:27:49 -04:00
Kubernetes Submit Queue 865321c2d6
Merge pull request #61940 from alinbalutoiu/master
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add support for CNI on Windows Server 2016 RTM

**What this PR does / why we need it**:
Windows Server 2016 RTM has limited CNI support. This PR makes it possible for the CNI plugin to be used to setup POD networking on Windows Server 2016 RTM (build number 14393).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #61939

**Special notes for your reviewer**:
The old mode is not supported and tested on Windows Server 2016 RTM. This change allows the CNI plugin to be used on Windows Server 2016 RTM to retrieve the container IP instead of using workarounds (docker inspect).

CNI support has been added for Windows Server 2016 version 1709 (build number 16299), this patch will just allow the same support for older build numbers.

Windows Server 2016 RTM has a longer lifecycle (LTS) than Windows Server 2016 version 1709.
https://support.microsoft.com/en-us/lifecycle/search/19761 vs https://support.microsoft.com/en-us/lifecycle/search/20311


**Release note**:

```release-note
NONE
```
2018-05-03 10:17:03 -07:00
Kubernetes Submit Queue 592c39bccc
Merge pull request #62541 from filbranden/cgroupname1
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use a []string for CgroupName, which is a more accurate internal representation

**What this PR does / why we need it**:

This is purely a refactoring and should bring no essential change in behavior.

It does clarify the cgroup handling code quite a bit.

It is preparation for further changes we might want to do in the cgroup hierarchy. (But it's useful on its own, so even if we don't do any, it should still be considered.)

**Special notes for your reviewer**:

The slice of strings more precisely captures the hierarchic nature of the cgroup paths we use to represent pods and their groupings.

It also ensures we're reducing the chances of passing an incorrect path format to a cgroup driver that requires a different path naming, since now explicit conversions are always needed.

The new constructor `NewCgroupName` starts from an existing `CgroupName`, which enforces a hierarchy where a root is always needed. It also performs checking on the component names to ensure invalid characters ("/" and "_") are not in use.

A `RootCgroupName` for the top of the cgroup hierarchy tree is introduced.

This refactor results in a net reduction of around 30 lines of code,
mainly with the demise of ConvertCgroupNameToSystemd which had fairly
complicated logic in it and was doing just too many things.

There's a small TODO in a helper `updateSystemdCgroupInfo` that was introduced to make this commit possible. That logic really belongs in libcontainer, I'm planning to send a PR there to include it there. (The API already takes a field with that information, only that field is only processed in cgroupfs and not systemd driver, we should fix that.)

Tested: By running the e2e-node tests on both Ubuntu 16.04 (with cgroupfs driver) and CentOS 7 (with systemd driver.)

**NOTE**: I only tested this with dockershim, we should double-check that this works with the CRI endpoints too, both in cgroupfs and systemd modes.

/assign @derekwaynecarr 
/assign @dashpole 
/assign @Random-Liu 

**Release note**:

```release-note
NONE
```
2018-05-03 08:16:45 -07:00
Kubernetes Submit Queue 4f56127582
Merge pull request #63073 from andyxning/refactor_grpc_dial_with_dialcontext
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

refactor device plugin grpc dial with dialcontext

**What this PR does / why we need it**:
Refactor grpc `dial` with `dialContext` as `grpc.WithTimeout` has been deprecated by:
> use DialContext and context.WithTimeout instead.

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-05-03 01:16:34 -07:00
Kubernetes Submit Queue 186dd7beb1
Merge pull request #62903 from cofyc/fixfsgroupcheckinlocal
Automatic merge from submit-queue (batch tested with PRs 62657, 63278, 62903, 63375). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add more volume types in e2e and fix part of them.

**What this PR does / why we need it**:

- Add dir-link/dir-bindmounted/dir-link-bindmounted/bockfs volume types for e2e tests.
- Fix fsGroup related e2e tests partially.
- Return error if we cannot resolve volume path.
  - Because we should not fallback to volume path, if it's a symbolic link, we may get wrong results.

To safely set fsGroup on local volume, we need to implement these two methods correctly for all volume types both on the host and in container:

- get volume path kubelet can access
  - paths on the host and in container are different
- get mount references
  - for directories, we cannot use its mount source (device field) to identify mount references, because directories on same filesystem have same mount source (e.g. tmpfs), we need to check filesystem's major:minor and directory root path on it

Here is current status:

| | (A) volume-path (host) | (B) volume-path (container) | (C) mount-refs (host) | (D) mount-refs (container) |
| --- | --- | --- | --- | --- |
| (1) dir | OK | FAIL | FAIL | FAIL |
| (2) dir-link | OK | FAIL | FAIL | FAIL |
| (3) dir-bindmounted | OK | FAIL | FAIL | FAIL |
| (4) dir-link-bindmounted | OK | FAIL | FAIL | FAIL |
| (5) tmpfs| OK | FAIL | FAIL | FAIL |
| (6) blockfs| OK | FAIL | OK | FAIL |
| (7) block| NOTNEEDED | NOTNEEDED | NOTNEEDED | NOTNEEDED |
| (8) gce-localssd-scsi-fs| NOTTESTED | NOTTESTED | NOTTESTED | NOTTESTED |

- This PR uses `nsenter ... readlink` to resolve path in container as @msau42  @jsafrane [suggested](https://github.com/kubernetes/kubernetes/pull/61489#pullrequestreview-110032850). This fixes B1:B6 and D6, , the rest will be addressed in https://github.com/kubernetes/kubernetes/pull/62102.
- C5:D5 marked `FAIL` because `tmpfs` filesystems can share same mount source, we cannot rely on it to check mount references. e2e tests passes due to we use unique mount source string in tests.
- A7:D7 marked `NOTNEEDED` because we don't set fsGroup on block devices in local plugin. (TODO: Should we set fsGroup on block device?)
- A8:D8 marked `NOTTESTED` because I didn't test it, I leave it to `pull-kubernetes-e2e-gce`. I think it should be same as `blockfs`.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-05-02 20:13:11 -07:00
Kubernetes Submit Queue b5f61ac129
Merge pull request #62657 from matthyx/master
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update all script shebangs to use /usr/bin/env interpreter instead of /bin/interpreter

This is required to support systems where bash doesn't reside in /bin (such as NixOS, or the *BSD family) and allow users to specify a different interpreter version through $PATH manipulation.
https://www.cyberciti.biz/tips/finding-bash-perl-python-portably-using-env.html
```release-note
Use /usr/bin/env in all script shebangs to increase portability.
```
2018-05-02 19:44:32 -07:00
Yecheng Fu 3748197876 Add more volume types in e2e and fix part of them.
- Add dir-link/dir-bindmounted/dir-link-bindmounted/blockfs volume types for e2e
tests.
- Return error if we cannot resolve volume path.
- Add GetFSGroup/GetMountRefs methods for mount.Interface.
- Fix fsGroup related e2e tests partially.
2018-05-02 10:31:42 +08:00
Robert Krawitz 3f3c04d722 WIP: Correct kill logic for cgroup processes 2018-05-01 19:38:12 -04:00
David Eads 94e3d94d67 update tests to be specific about the versions they are testing instead of floating 2018-05-01 13:18:41 -04:00
Kubernetes Submit Queue 8eb7eeef39
Merge pull request #63321 from sjenning/fix-pod-deletor
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

 kubelet: force filterContainerID to empty string when removeAll is true 

fixes https://github.com/kubernetes/kubernetes/issues/57865

alternative to https://github.com/kubernetes/kubernetes/pull/62170

There is a bug in the container deletor where if `removeAll` is `true` in `deleteContainersInPod()` the `filterContainerID` is still used to filter results in `getContainersToDeleteInPod()`.  If the filter container is not found, no containers are returned for deletion.

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pod_container_deletor.go#L74-L77

This is the case for the delayed deletion a pod in `CrashLoopBackoff` as the death of the infra container in response to the `DELETE` is detected by PLEG and triggers an attempt to clean up all containers but uses the infra container id as a filter with `removeAll` set to `true`.  The infra container is immediately deleted and thus can not be found when `getContainersToDeleteInPod()` tries to find it.  Thus the dead app container from the previous restart attempt still exists.

`canBeDeleted()` in the status manager will return `false` until all the pod containers are deleted, delaying the deletion of the pod on the API server.

The removal of the containers is eventually forced by the API server `REMOVE` after the grace period.
2018-05-01 09:50:13 -07:00
Filipe Brandenburger b230fb8ac4 Use a []string for CgroupName, which is a more accurate internal representation
The slice of strings more precisely captures the hierarchic nature of
the cgroup paths we use to represent pods and their groupings.

It also ensures we're reducing the chances of passing an incorrect path
format to a cgroup driver that requires a different path naming, since
now explicit conversions are always needed.

The new constructor NewCgroupName starts from an existing CgroupName,
which enforces a hierarchy where a root is always needed. It also
performs checking on the component names to ensure invalid characters
("/" and "_") are not in use.

A RootCgroupName for the top of the cgroup hierarchy tree is introduced.

This refactor results in a net reduction of around 30 lines of code,
mainly with the demise of ConvertCgroupNameToSystemd which had fairly
complicated logic in it and was doing just too many things.

There's a small TODO in a helper updateSystemdCgroupInfo that was
introduced to make this commit possible. That logic really belongs in
libcontainer, I'm planning to send a PR there to include it there.
(The API already takes a field with that information, only that field is
only processed in cgroupfs and not systemd driver, we should fix that.)

Tested by running the e2e-node tests on both Ubuntu 16.04 (with cgroupfs
driver) and CentOS 7 (with systemd driver.)
2018-05-01 08:29:06 -07:00
Seth Jennings 1fb3b24b63 kubelet: fix warning message to not print pointer addrs 2018-04-30 16:38:43 -05:00
Seth Jennings 2aba27da86 kubelet: force filterContainerID to empty string when removeAll is true 2018-04-30 16:29:17 -05:00
Kubernetes Submit Queue 15cc20630d
Merge pull request #60034 from pohly/device-manager-goroutine
Automatic merge from submit-queue (batch tested with PRs 58474, 60034, 62101, 63198). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

avoid race condition in device manager and plugin startup/shutdown: wait for goroutines

**What this PR does / why we need it**:

Commit 1325c2f worked around issue #59488, but it is still worthwhile to fix the underlying root cause properly.

**Which issue(s) this PR fixes**:
Fixes #59488

**Special notes for your reviewer**:

This is an alternative to PR #59861, which used a different approach. Personally I tend to prefer this one now.

**Release note**:
```release-note
NONE
```

/sig node
/area hw-accelerators
/assign vikaschoudhary16
2018-04-30 13:24:08 -07:00
Davanum Srinivas 4bacd77321 Remove unused code 2018-04-30 14:57:26 -04:00
Lantao Liu 4bb16659ee Make kubelet `ReadLogs` backward compatible.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-04-27 16:03:29 -07:00
Micah Hausler 1a218aaee2 Report node DNS info with --node-ip
```release-note
Report node DNS info with --node-ip flag
```
2018-04-27 13:18:40 -07:00
Kubernetes Submit Queue 284e8182a4
Merge pull request #63160 from sjenning/no-waitlogs-stopped-pod
Automatic merge from submit-queue (batch tested with PRs 63252, 63160). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: logs: do not wait when following terminated container

Currently, a `kubectl logs -f` on a terminated container will output the logs, wait 5 seconds (`stateCheckPeriod`), then return.  The 5 seconds delay should not occur as the container is terminated and unable to generate additional log messages.

This PR puts a check at the beginning of `waitLogs()` to avoid doing the wait when the container is not running.

@derekwaynecarr @smarterclayton
2018-04-27 12:27:05 -07:00
David Eads e2fc5cf259 remove versioning interface 2018-04-27 07:56:42 -04:00
Pengfei Ni 91c6cfed2f Also update CRI to indicate runtimes should not update empty CIDR 2018-04-27 11:14:43 +08:00
Pengfei Ni 335d70a6d1 Check CIDR before updating node status 2018-04-27 11:07:48 +08:00
Kubernetes Submit Queue 7ff38a23f0
Merge pull request #62937 from vikaschoudhary16/fix-dockershim-e2e
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix dockershim e2e

**What this PR does / why we need it**:
Delete checkpoint file when GetCheckpoint fails due to corrupt checkpoint. Earlier, before checkpointmanager, [`GetCheckpoint` in dockershim was deleting corrupt checkpoint file implicitly](https://github.com/kubernetes/kubernetes/pull/56040/files#diff-9a174fa21408b7faeed35309742cc631L116). In checkpointmanager's `GetCheckpoint` this implicit deletion of corrupt checkpoint is not happening. Because of this few e2e tests are failing because these tests are testing this deletion.
Changes are being added to delete checkpoint file if found corrupted. 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62738 


**Special notes for your reviewer**:
No new behavior is being introduced. Implicit deletion of corrupt checkpoint is being done explicitly.

**Release note**:

```release-note
None
```
/cc @dashpole @sjenning @derekwaynecarr
2018-04-26 16:26:14 -07:00
Seth Jennings 5da3a1d514 kubelet: logs: do not wait on following terminated container 2018-04-26 16:53:54 -05:00
Kubernetes Submit Queue a38a02792b
Merge pull request #62662 from wangzhen127/runtime-default
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Change seccomp annotation from "docker/default" to "runtime/default"

**What this PR does / why we need it**:
This PR changes seccomp annotation from "docker/default" to "runtime/default", so that it is can be applied to all kinds of container runtimes. This PR is a followup of [#1963](https://github.com/kubernetes/community/pull/1963).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #39845

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-04-26 14:33:53 -07:00
Michelle Au 6b1947c9ba passthrough readOnly to subpath 2018-04-26 11:18:36 -07:00
David Eads a89291a5de stop duplicating preferred version order 2018-04-26 10:03:36 -04:00
Lantao Liu a321646e74 Add level to remote client glog.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-04-26 01:21:20 -07:00
Kubernetes Submit Queue 9e52d14eb9
Merge pull request #62805 from awly/take-reviews
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add awly as reviewer in several subtrees

```release-note
NONE
```
2018-04-25 21:24:31 -07:00
Kubernetes Submit Queue 97287177ee
Merge pull request #63075 from deads2k/api-05-eliminate-indirection
Automatic merge from submit-queue (batch tested with PRs 62982, 63075, 63067, 62877, 63141). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

eliminate indirection from type registration

Some years back there was a partial attempt to revamp api type registration, but the effort was never completed and this was before we started splitting schemes. With separate schemes, the idea of partial registration no longer makes sense.  This pull starts removing cruft from the registration process and pulls out a layer of indirection that isn't needed.

@kubernetes/sig-api-machinery-pr-reviews 
@lavalamp @cheftako @sttts @smarterclayton 

Rebase cost is fairly high, so I'd like to avoid this lingering.

/assign @sttts 
/assign @cheftako 

```release-note
NONE
```
2018-04-25 11:53:14 -07:00
Kubernetes Submit Queue af5f9bc9bb
Merge pull request #62982 from dixudx/warning_kubelet_remote_sandbox
Automatic merge from submit-queue (batch tested with PRs 62982, 63075, 63067, 62877, 63141). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add warnings on using pod-infra-container-image for remote container runtime

**What this PR does / why we need it**:
We should warn on using `--pod-infra-container-image` to avoid confusions, when users are using remote container runtime.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55676,#62388,#62732

**Special notes for your reviewer**:
/cc @kubernetes/sig-node-pr-reviews 

**Release note**:

```release-note
add warnings on using pod-infra-container-image for remote container runtime
```
2018-04-25 11:53:11 -07:00
David Eads e7fbbe0e3c eliminate indirection from type registration 2018-04-25 09:02:31 -04:00
Andy Xie b01657d0c7 refactor device plugin grpc dial with dialcontext 2018-04-25 18:40:23 +08:00
Kubernetes Submit Queue 046baee847
Merge pull request #63118 from vikaschoudhary16/start-stop-race
Automatic merge from submit-queue (batch tested with PRs 62951, 57460, 63118). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix device plugin re-registration

**What this PR does / why we need it**:
While registering a new endpoint, device manager copies all the devices from the old endpoint for the same resource and then it stops the old endpoint and starts the new endpoint.

There is no sync between stopping the old and starting the new. While stopping the old, manager marks devices(which are copied to new endpoint as well) as "Unhealthy".

In the endpoint.go, when after restart, plugin reports devices healthy, same health state (healthy) is found  in the endpoint database and endpoint module does not update manager database.

Solution in the PR is to mark devices as unhealthy before copying to new endpoint. 


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62773

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
/cc @jiayingz @vishh @RenaudWasTaken @derekwaynecarr
2018-04-25 02:01:56 -07:00
Di Xu b47ab8b2d3 add warnings for docker-only flags 2018-04-25 12:56:53 +08:00
Kubernetes Submit Queue 61892abc94
Merge pull request #62874 from dcbw/dockershim-SetUpPod-cleanup-on-failure
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim/sandbox: clean up pod network even if SetUpPod() failed

If the CNI network plugin completes successfully, but something fails
between that success and dockerhsim's sandbox setup code, plugin resources
may not be cleaned up. A non-trivial amount of code runs after the
plugin itself exits and the CNI driver's SetUpPod() returns, and any error
condition recognized by that code would cause this leakage.

The Kubernetes CRI RunPodSandbox() request does not attempt to clean
up on errors, since it cannot know how much (if any) networking
was actually set up. It depends on the CRI implementation to do
that cleanup for it.

In the dockershim case, a SetUpPod() failure means networkReady is
FALSE for the sandbox, and TearDownPod() will not be called later by
garbage collection even though networking was configured, because
dockershim can't know how far SetUpPod() got.

Concrete examples include if the sandbox's container is somehow
removed during during that time, or another OS error is encountered,
or the plugin returns a malformed result to the CNI driver.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1532965

```release-note
NONE
```
2018-04-24 21:48:01 -07:00
vikaschoudhary16 c846d5fe63 Fix race between stopping old and starting new endpoint 2018-04-24 22:22:39 -04:00
Kubernetes Submit Queue a4271c03cb
Merge pull request #63090 from mtaufen/fix-qosreserved-json-tag
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix qosReserved json tag (lowercase qos, instead of uppercase QOS)

The API conventions specify that json keys should start with a lowercase
character, and if the key starts with an initialism, all characters in
the initialism should be lowercase. See `tlsCipherSuites` as an example.

API Conventions:
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md

>All letters in the acronym should have the same case, using the
>appropriate case for the situation. For example, at the beginning
>of a field name, the acronym should be all lowercase, such as "httpGet".

Follow up to: https://github.com/kubernetes/kubernetes/pull/62925

```release-note
NONE
```

@sjenning @derekwaynecarr
2018-04-24 19:01:20 -07:00
Kubernetes Submit Queue f68d10cfe4
Merge pull request #62853 from tony612/fix-resultRun-reset
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

reset resultRun to 0 on pod restart

**What this PR does / why we need it**:

The resultRun should be reset to 0 on pod restart, so that resultRun on the first failure of the new container will be 1, which is correct. Otherwise, the actual FailureThreshold after restarting will be `FailureThreshold - 1`.

**Which issue(s) this PR fixes**:

This PR is related to https://github.com/kubernetes/kubernetes/issues/53530. https://github.com/kubernetes/kubernetes/pull/46371 fixed that issue but there's still a little problem like what I said above.

**Special notes for your reviewer**:

**Release note**:
```release-note
fix resultRun by resetting it to 0 on pod restart
```
2018-04-24 13:28:25 -07:00
Kubernetes Submit Queue 44b57338d5
Merge pull request #59692 from mtaufen/dkcfg-unpack-configmaps
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

unpack dynamic kubelet config payloads to files

This PR unpacks the downloaded ConfigMap to a set of files on the node.

This enables other config files to ride alongside the
KubeletConfiguration, and the KubeletConfiguration to refer to these
cohabitants with relative paths.

This PR also stops storing dynamic config metadata (e.g. current,
last-known-good config records) in the same directory as config
checkpoints. Instead, it splits the storage into `meta` and
`checkpoints` dirs.

The current store dir structure is as follows:
```
- dir named by --dynamic-config-dir (root for managing dynamic config)
| - meta (dir for metadata, e.g. which config source is currently assigned, last-known-good)
  | - current (a serialized v1 NodeConfigSource object, indicating the assigned config)
  | - last-known-good (a serialized v1 NodeConfigSource object, indicating the last-known-good config)
| - checkpoints (dir for config checkpoints)
  | - uid1 (dir for unpacked config, identified by uid1)
    | - file1
    | - file2
    | - ...
  | - uid2
  | - ...
```

There are some likely changes to the above structure before dynamic config goes beta, such as renaming "current" to "assigned" for clarity, and extending the checkpoint identifier to include a resource version, as part of resolving #61643.

```release-note
NONE
```

/cc @luxas @smarterclayton
2018-04-24 12:01:37 -07:00
Dan Williams 91321ef85b dockershim/sandbox: clean up pod network even if SetUpPod() failed
If the CNI network plugin completes successfully, but something fails
between that success and dockerhsim's sandbox setup code, plugin resources
may not be cleaned up. A non-trivial amount of code runs after the
plugin itself exits and the CNI driver's SetUpPod() returns, and any error
condition recognized by that code would cause this leakage.

The Kubernetes CRI RunPodSandbox() request does not attempt to clean
up on errors, since it cannot know how much (if any) networking
was actually set up. It depends on the CRI implementation to do
that cleanup for it.

In the dockershim case, a SetUpPod() failure means networkReady is
FALSE for the sandbox, and TearDownPod() will not be called later by
garbage collection even though networking was configured, because
dockershim can't know how far SetUpPod() got.

Concrete examples include if the sandbox's container is somehow
removed during during that time, or another OS error is encountered,
or the plugin returns a malformed result to the CNI driver.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1532965
2018-04-24 11:17:49 -05:00
Michael Taufen 23c21b055c Fix qosReserved json tag (lowercase qos, instead of uppercase QOS)
The API conventions specify that json keys should start with a lowercase
character, and if the key starts with an initialism, all characters in
the initialism should be lowercase. See `tlsCipherSuites` as an example.

API Conventions:
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md

>All letters in the acronym should have the same case, using the
>appropriate case for the situation. For example, at the beginning
>of a field name, the acronym should be all lowercase, such as "httpGet".
2018-04-24 09:12:35 -07:00