* iso: Update kernel to 6.6.95 for x86_64
Generated by running `make iso-menuconfig-x86_64` and updating kernel
version to longterm kernel 6.6.95 and kernel headers to 6.6.x, and then
running `make linux-menuconfig-x86_64` to update the linux config.
Additionally update the hyperv-daemons package to use kernel 6.x.
* iso: Update kernel to 6.6.95 for aarch64
Generated by running `make iso-menuconfig-aarch64` and updating kernel
version to longterm kernel 6.6.95 and kernel headers to 6.6.x, and then
running `make linux-menuconfig-aarch64` to update the linux config.
* iso: Enable VirtIO GPU for krunkit driver
The krunkit driver exposes the host GPU via VirtIO GPU, enabling AI
workloads in the guest.
* Updating ISO to v1.36.0-1751445739-20995
---------
Co-authored-by: minikube-bot <minikube-bot@google.com>
* iso: Extract buildroot target
Before we can build the iso, we need to clone and configure buildroot.
This is required to run iso-menuconfig-{arch}.
* iso: Extract iso-prepare-% target
This target prepares for building an iso or running menuconfig. With
this change we can run the {iso,linux}-menuconfig-{x86_64,aarch64}
targets without building the entire iso.
* iso: Fix linux-menuconfig-% target
Previously it worked only after building the entire iso. Now we can run
this target without building the iso or running iso-menuconfig.
On the first run this downloads and builds a lot of packages required
to run the linux-menuconfig target, but it is much faster than building
the entire iso.
* iso: Simplify linux-menuconfig-%
Previously we copied the defconfig manually to the board config file.
This can be done using the special linux-update-defconfig target.
With this change we don't need to keep KERNEL_VERSION in the Makefile,
making future upgrades easier.
* iso: Update buildroot configuration for aarch64
Running `make iso-menuconfig-aarch64` without making any changes updates
the buildroot config. It seems that there were manual changes in the
config which are overwritten when running iso-menuconfig. Remove the
manual changes to make it easier to edit the configuration with kconfig.
* iso: Update buildroot configuration for x86_64
Same as the aarch64 change to make it easier to configure using kconfig.
* iso: Update linux configuration for aarch64
Same as the iso-menuconfig-aarch64 change: run `make
linux-menuconfig-aarch64` and exit without any changes to update the
config. This seems to change the
order, removing manual changes from the config. This will make it easier
to configure using kconfig in the future.
* iso: Update linux configuration for x86_64
Same as the aarch64 changes to make it easier to configure using kconfig
in the future.
* iso: Disable all platforms for aarch64
We run on a qemu virt machine or Apple virtualization, so we don't need
support for all kinds of embedded Arm boards. This reduces the arm64 iso
size from 410 MiB to 392 MiB.
* Updating ISO to v1.36.0-1751221996-20991
* Updating ISO to v1.36.0-1751315722-20991
---------
Co-authored-by: minikube-bot <minikube-bot@google.com>
libkrun's virtio-net driver enables TSO offloading and checksum
offloading by default, so we must use vmnet-helper --enable-tso and
--enable-checksum-offload with krunkit. These options do not work with
vfkit.
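As an illustration, the driver could enable these options only for
krunkit when building the vmnet-helper command line. A minimal sketch,
assuming the helper path and the argument plumbing (both illustrative):

import "os/exec"

// vmnetHelperCommand sketch: libkrun's virtio-net driver assumes TSO
// and checksum offloading, so pass the matching vmnet-helper options
// only for krunkit; vfkit does not support them.
func vmnetHelperCommand(driverName, socketPath string) *exec.Cmd {
	args := []string{"/opt/vmnet-helper/bin/vmnet-helper", "--socket", socketPath}
	if driverName == "krunkit" {
		args = append(args, "--enable-tso", "--enable-checksum-offload")
	}
	return exec.Command("sudo", args...)
}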
* vfkit: Log serial console to file
To make debugging easier, add a virtio-serial device that logs the
serial console to a file:
~/.minikube/machines/NAME/serial.log
To enable logging, we need to enable the console in the kernel command
line, since we still use direct kernel boot.
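A hedged sketch of the idea, following the virtio-serial device syntax
from the vfkit usage docs (variable names and the console device name
are assumptions, not copied from the driver):

import "path/filepath"

// serialLogArgs sketch: log the serial console to a file in the
// machine directory and enable the console on the kernel command line
// (needed because we still boot with --kernel/--initrd at this point).
func serialLogArgs(machineDir, cmdline string) ([]string, string) {
	serialLog := filepath.Join(machineDir, "serial.log")
	args := []string{"--device", "virtio-serial,logFilePath=" + serialLog}
	// "console=hvc0" is an assumption for the virtio console name.
	return args, cmdline + " console=hvc0"
}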
Example log:
% cat /Users/nir/.minikube/machines/vfkit/serial.log
[ 0.896094] cacheinfo: Unable to detect cache hierarchy for CPU 0
[ 0.897186] loop: module loaded
[ 0.897670] virtio_blk virtio2: [vda] 840488 512-byte logical blocks (430 MB/410 MiB)
[ 0.897733] vda: detected capacity change from 0 to 430329856
[ 0.898460] virtio_blk virtio3: [vdb] 40960000 512-byte logical blocks (21.0 GB/19.5 GiB)
[ 0.898533] vdb: detected capacity change from 0 to 20971520000
...
[ 1.794714] systemd[1]: Detected virtualization vm-other.
[ 1.794752] systemd[1]: Detected architecture arm64.
Welcome to Buildroot 2025.02!
[ 1.794944] systemd[1]: Hostname set to <minikube>.
[ 1.795011] systemd[1]: Initializing machine ID from random generator.
...
[ OK ] Started Container Runtime Interface for OCI (CRI-O).
[ OK ] Reached target Multi-User System.
Welcome to minikube
vfkit login: [ 6.681578] systemd-ssh-generator[630]: Binding SSH to AF_UNIX socket /run/ssh-unix-local/socket.
* vfkit: Use EFI bootloader
With the fixed iso, we can simplify the driver by using the EFI
bootloader option[1] instead of the legacy and deprecated --kernel,
--kernel-cmdline, and --initrd options[2].
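A sketch of the switch, following the syntax documented in [1] (the
variable-store path is illustrative):

import "path/filepath"

// bootloaderArgs sketch: replace the deprecated
// --kernel/--kernel-cmdline/--initrd flags with the EFI bootloader
// option; "create" makes vfkit create the variable store if missing.
func bootloaderArgs(machineDir string) []string {
	store := filepath.Join(machineDir, "efistore.nvram")
	return []string{"--bootloader", "efi,variable-store=" + store + ",create"}
}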
Example run:
% minikube start -p vfkit --driver vfkit --container-runtime containerd --network vmnet-shared
😄 [vfkit] minikube v1.36.0 on Darwin 15.5 (arm64)
✨ Using the vfkit driver based on user configuration
👍 Starting "vfkit" primary control-plane node in "vfkit" cluster
🔥 Creating vfkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
📦 Preparing Kubernetes v1.33.1 on containerd 1.7.23 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
▪ Configuring RBAC rules ...
🔗 Configuring bridge CNI (Container Networking Interface) ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟 Enabled addons: default-storageclass, storage-provisioner
🏄 Done! kubectl is now configured to use "vfkit" cluster and "default" namespace by default
Comparing direct kernel boot and --bootloader efi shows that EFI is a
little bit faster and boot time is more consistent.
% hyperfine -r 10 -C "minikube delete" \
"vfkit-efi/out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd --no-kubernetes" \
"vfkit-direct/out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd --no-kubernetes"
Benchmark 1: vfkit-efi/out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd --no-kubernetes
Time (mean ± σ): 10.205 s ± 0.656 s [User: 0.381 s, System: 0.266 s]
Range (min … max): 9.106 s … 11.254 s 10 runs
Benchmark 2: vfkit-direct/out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd --no-kubernetes
Time (mean ± σ): 10.933 s ± 1.616 s [User: 0.402 s, System: 0.406 s]
Range (min … max): 9.155 s … 14.168 s 10 runs
Summary
vfkit-efi/out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd --no-kubernetes ran
1.07 ± 0.17 times faster than vfkit-direct/out/minikube start --driver vfkit --network vmnet-shared --container-runtime containerd --no-kubernetes
[1] https://github.com/crc-org/vfkit/blob/main/doc/usage.md#efi-bootloader
[2] https://github.com/crc-org/vfkit/blob/main/doc/usage.md#deprecated-options
* docs: Update vfkit driver documentation
- Separate vfkit requirements and vmnet-shared requirements
- Update minimal macOS version required for --bootloader efi
- Simplify vfkit upgrade; it is available in brew now
* iso: Disable grub timeout
This speeds up machine boot by about 5 seconds. The timeout may be
helpful for debugging boot issues, but we currently don't have a way to
access the serial console for debugging.
Testing shows a ~5 second speedup:
| driver     | timeout (s) | start time (s) |
|------------|-------------|----------------|
| vfkit[1]   | 5.0         | 24.01          |
| vfkit[1]   | 0.0         | 19.90          |
| qemu       | 5.0         | 29.46          |
| qemu       | 0.0         | 24.28          |
| krunkit[2] | 5.0         | 25.14          |
| krunkit[2] | 0.0         | 20.51          |
[1] Tested with #20833, booting using the iso instead of direct kernel
boot. Direct kernel boot is a little bit faster (e.g. 18.x).
[2] Tested with #20826
* Updating ISO to v1.36.0-1749153077-20895
---------
Co-authored-by: minikube-bot <minikube-bot@google.com>
* iso: Fix console for vfkit/krunkit
The serial console name depends on the driver. We had a setting for
qemu that did not work for vfkit and krunkit, breaking boot from the
minikube iso.
Fixed by using two console= options: one known to work for qemu, and
one for vfkit and krunkit. With this we can use the same iso image with
qemu, vfkit, and krunkit.
This will allow simplifying the vfkit driver. Previously we had to
extract the kernel and initrd and start the VM using the legacy
--kernel, --kernel-cmdline and --initrd options.
I tested this by building the iso with this fix and running with
--iso-url.
Example run with qemu:
% minikube start -p qemu --driver qemu --container-runtime containerd \
--iso-url file://$PWD/minikube-arm64.iso
😄 [qemu] minikube v1.36.0 on Darwin 15.5 (arm64)
✨ Using the qemu2 driver based on user configuration
🌐 Automatically selected the socket_vmnet network
👍 Starting "qemu" primary control-plane node in "qemu" cluster
🔥 Creating qemu2 VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
📦 Preparing Kubernetes v1.33.1 on containerd 1.7.23 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
▪ Configuring RBAC rules ...
🔗 Configuring bridge CNI (Container Networking Interface) ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟 Enabled addons: default-storageclass, storage-provisioner
🏄 Done! kubectl is now configured to use "qemu" cluster and "default" namespace by default
Example run with krunkit:
% minikube start -p krunkit --driver krunkit --container-runtime containerd \
--iso-url file://$PWD/minikube-arm64.iso
😄 [krunkit] minikube v1.36.0 on Darwin 15.5 (arm64)
✨ Using the krunkit (experimental) driver based on user configuration
👍 Starting "krunkit" primary control-plane node in "krunkit" cluster
🔥 Creating krunkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
📦 Preparing Kubernetes v1.33.1 on containerd 1.7.23 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
▪ Configuring RBAC rules ...
🔗 Configuring bridge CNI (Container Networking Interface) ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟 Enabled addons: default-storageclass, storage-provisioner
🏄 Done! kubectl is now configured to use "krunkit" cluster and "default" namespace by default
* Updating ISO to v1.36.0-1749066232-20832
---------
Co-authored-by: minikube-bot <minikube-bot@google.com>
* Fix KVM driver tests timeouts
Rewrite the KVM driver waiting logic for domain start, getting the IP
address, and shutting the domain down. Add more config/state outputs to
aid future debugging.
Bump go/libvirt to v1.11002.0 and set the minimum memory required for
running all tests to 3GB to avoid some really weird system behaviour.
* revert reduction of timelimit for TestCert tests run
* set memory and debug output in TestNoKubernetes tests
* extend kvm waitForStaticIP timeout
* add console log to debug output
* Updating ISO to v1.36.0-1748823857-20852
---------
Co-authored-by: minikube-bot <minikube-bot@google.com>
* fix QF1011: could omit type *os.File from declaration; it will be inferred from the right-hand side
* fix QF1012: Use fmt.Fprintf(x, ...) instead of x.Write(fmt.Sprintf(...))
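For example, the QF1012 rewrite pattern looks like this (illustrative
code, not the exact lines changed; assume w is an io.Writer and name a
string):

// Before: format into a string, convert to bytes, write manually.
w.Write([]byte(fmt.Sprintf("profile %q not found\n", name)))

// After: write the formatted output directly to the io.Writer.
fmt.Fprintf(w, "profile %q not found\n", name)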
* fix QF1001: could apply De Morgan's law
* fix QF1003: could use tagged switch
* fix weakCond: suspicious ; nil check may not be enough, check for len (gocritic)
* fix docStub: silencing go lint doc-comment warnings is unadvised
* fix builtinShadow: shadowing of predeclared identifier: error
* fix importShadow: shadow of imported package
* fix nestingReduce: invert if cond, replace body with , move old body after the statement
* useless-break: useless break in case clause (revive)
* Clear redundant content in the golangci.yaml file
vfkit uses the native virtualization framework, provides the best
performance and all the features needed by minikube, and is well
maintained and used by other projects like podman.
This fixes the automatic driver selection: when we start minikube on a
system with both vfkit and qemu, vfkit is selected:
% minikube start
😄 minikube v1.35.0 on Darwin 15.5 (arm64)
✨ Automatically selected the vfkit driver. Other choices: qemu2, ssh, podman (experimental)
👍 Starting "minikube" primary control-plane node in "minikube" cluster
🔥 Creating vfkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
🐳 Preparing Kubernetes v1.33.1 on Docker 28.0.4 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
▪ Configuring RBAC rules ...
🔗 Configuring bridge CNI (Container Networking Interface) ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟 Enabled addons: default-storageclass, storage-provisioner
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
* Kicbase/ISO: Update cni-plugins from v1.6.2 to v1.7.1
* Updating kicbase image to v0.0.46-1747341282-20771
* Updating ISO to v1.35.0-1747341198-20771
* Update service tests to use discoveryv1.EndpointSlice instead of deprecated core.Endpoints
This PR addresses the deprecation warnings for core.Endpoints which is deprecated in Kubernetes v1.33+ in favor of discoveryv1.EndpointSlice.
Changes:
- Replaced all core.Endpoints references with discoveryv1.EndpointSlice
- Updated mock interfaces and test data structures to use the new API
- Maintained all existing test functionality and assertions
- No behavioral changes - just API modernization
Fixes #20677
* fixed lint issue
- make lint now runs without error
* Fix multiple panics in pkg/minikube/service tests by improving mock client initialization
This commit addresses a series of nil pointer dereference panics in the test suite for pkg/minikube/service, ensuring all tests pass reliably. The changes focus on improving the initialization of mock Kubernetes clients to prevent nil pointer dereferences in the fake client's `Invokes` method. The fixes include:
1. **TestPrintURLsForService/throw_error_without_template Panic**:
- Issue: A panic occurred due to an uninitialized `FakeCoreV1` field in `MockCoreClient`, causing a nil pointer dereference when `Services().Get` was called.
- Fix: Initialized `FakeCoreV1` with `Fake: &testing_fake.Fake{}` in the `TestPrintURLsForService` setup, ensuring the fake client is properly configured.
2. **TestGetServiceURLs/correctly_return_serviceURLs Panic**:
- Issue: A similar panic occurred in `GetServiceURLs` due to the `FakeCoreV1` field not being initialized in `MockCoreClient` returned by `MockClientGetter`.
- Fix: Updated `MockClientGetter.GetCoreClient` to initialize `FakeCoreV1` with `Fake: &testing_fake.Fake{}`, ensuring all tests using `MockClientGetter` have a properly initialized client.
3. **TestDeleteSecret/ok Panic**:
- Issue: A panic occurred in `DeleteSecret` when calling `secrets.Delete` for the `foo` namespace, as `secretsNamespaces` lacked an entry for `foo`, returning a nil interface.
- Fix: Added a `MockSecretInterface` for the `foo` namespace to `secretsNamespaces`, ensuring `client.Secrets("foo")` returns a valid interface. Updated `initializeMockObjects` to verify the `Fake` field for the new entry.
Additional improvements:
- Ensured `initializeMockObjects` consistently initializes `Fake` fields across all mock interfaces (`serviceNamespaces`, `serviceNamespaceOther`, `endpointSliceNamespaces`, and `secretsNamespaces`).
- Verified that all test setups align with mock configurations, preventing similar issues in other tests (e.g., `TestCreateSecret`, `TestWaitAndMaybeOpenService`).
- Confirmed no linting issues with `make lint` and validated all tests pass with `go test -v ./pkg/minikube/service/...`.
* Kicbase/ISO: Update cri-dockerd from v0.3.15 to v0.4.0
* Updating kicbase image to v0.0.46-1747166185-20747
* Updating ISO to v1.35.0-1747160120-20747
* vmnet: Improve --network vmnet-shared validation
Previously we did not check that the helper can run with the
--close-from=4 option, so the command could succeed with an incorrect
sudoers configuration, for example a user with a liberal NOPASSWD rule
but without the closefrom_override option.
When the check failed, we logged an unhelpful message:
libmachine: Failed to run vmnet-helper:
%!w(*exec.ExitError=&{0x14000135e30 [115 117 100 111 58 32 97 ... 101 100 10]})
And we returned a bool, so the caller could not provide a suggestion on
how to resolve the issue.
Fix by:
- Rename vmnet.HelperAvailable to vmnet.ValidateHelper
- Return an error describing the issue, including a reason.Kind that can
be used to provide a suggestion for resolving the issue.
- Include the ExitError.Stderr in the error. This includes helpful
error messages from sudo.
- Add a new reason.NotConfiguredVmnetHelper error
- Improve the log when vmnet.ValidateHelper() succeeds
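A minimal sketch of the new shape (the Error wrapper, helper path, and
signatures are assumptions for illustration, not copied from the code):

import (
	"errors"
	"fmt"
	"os"
	"os/exec"

	"k8s.io/klog/v2"
	"k8s.io/minikube/pkg/minikube/reason"
)

// Error is a hypothetical wrapper carrying a reason.Kind so callers
// can map failures to suggestions.
type Error struct {
	Kind reason.Kind
	Err  error
}

func (e *Error) Error() string { return e.Err.Error() }

// ValidateHelper sketch: verify that vmnet-helper is installed and
// that sudoers allows running it with --close-from without a password.
func ValidateHelper() error {
	const helper = "/opt/vmnet-helper/bin/vmnet-helper"
	if _, err := os.Stat(helper); err != nil {
		return &Error{Kind: reason.NotFoundVmnetHelper, Err: err}
	}
	cmd := exec.Command("sudo", "--non-interactive", "--close-from=4", helper, "--version")
	out, err := cmd.Output()
	if err != nil {
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) {
			// Surface sudo's stderr, e.g. "you are not permitted to
			// use the -C option", instead of a bare exit status.
			err = fmt.Errorf("%w: %s", err, exitErr.Stderr)
		}
		return &Error{Kind: reason.NotConfiguredVmnetHelper, Err: err}
	}
	klog.Infof("vmnet-helper validated: %s", out)
	return nil
}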
Example error flow - vmnet-helper not installed:
% minikube start --driver vfkit --network vmnet-shared
😄 minikube v1.35.0 on Darwin 15.4.1 (arm64)
✨ Using the vfkit (experimental) driver based on user configuration
🙈 Exiting due to NOT_FOUND_VMNET_HELPER: failed to validate vmnet-shared network:
stat /opt/vmnet-helper/bin/vmnet-helper: no such file or directory
💡 Suggestion:
vmnet-helper was not found on the system, resolve by:
Option 1) Installing vmnet-helper:
https://github.com/nirs/vmnet-helper#installation
Option 2) Using the nat network:
minikube start<no value> --driver vfkit --network nat
I resolved the issue by installing vmnet-helper, but I did not configure
the sudoers rule:
% minikube start --driver vfkit --network vmnet-shared
😄 minikube v1.35.0 on Darwin 15.4.1 (arm64)
✨ Using the vfkit (experimental) driver based on user configuration
🙈 Exiting due to NOT_CONFIGURED_VMNET_HELPER: failed to validate vmnet-shared network:
exit status 1: sudo: you are not permitted to use the -C option
💡 Suggestion:
Configure vmnet-helper to run without a password.
Please install a vmnet-helper sudoers rule using these instructions:
https://github.com/nirs/vmnet-helper#granting-permission-to-run-vmnet-helper
After installing the sudoers rule minikube could start.
* vfkit: Use helper --socket instead of --fd
The --fd option avoids the need to manage a bound unix socket, in
particular the limit on unix socket path length. It is also more
secure; only the process inheriting the socket can access the helper.
However it
requires the sudo --close-from= option, which may not work for some
users. We don't understand why it does not work, and debugging it is
hard since users are not happy to share their local sudoers
configuration.
Avoid the trouble by switching to the --socket option. In this case we
pass a unix socket path to the helper and vfkit. The helper creates a
bound unix datagram socket at the specified path and waits until vfkit
connects to the socket. When vfkit connects to the unix socket, the
programs are connected in the same way as when passing file
descriptors.
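To illustrate the rendezvous (a standalone sketch, not driver code):
the helper binds a datagram socket at the agreed path and waits; vfkit
then connects to the same path, and the two ends exchange packets just
as they would over inherited file descriptors.

import "net"

// helperBind sketch: create the bound unix datagram socket that vfkit
// will later connect to. The path must stay under the unix socket
// path limit (104 characters on macOS).
func helperBind(path string) (*net.UnixConn, error) {
	addr := &net.UnixAddr{Name: path, Net: "unixgram"}
	return net.ListenUnixgram("unixgram", addr)
}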
When running minikube we will see 3 new files in the machine directory:
- `vfkit-fb64-7802.sock`: vfkit unix datagram socket
- `vmnet-helper.sock`: vmnet-helper unix datagram socket
- `vmnet-helper.sock.lock`: lockfile for the vmnet-helper socket
The files are deleted when vmnet-helper and vfkit are terminated
gracefully. If they are killed the stale files are replaced on the next
start.
Issues:
- If the path exceeds the limit (104 characters), opening the socket
will fail. We have the same issue with the vfkit management socket.
* vmnet: Fall back to interactive sudo
If the vmnet-helper sudoers rule is not configured or does not work for
the user, maybe because the user has disabled the NOPASSWD option, we
used to fail, recommending configuring the vmnet sudoers rule. This
does not help a user that cannot fix the sudoers configuration.
Since we switched to --socket mode, we can work without a sudoers rule.
If we can interact with the user, we fall back to interactive sudo. The
user can enter a password to start the machine.
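A hedged sketch of the fallback logic (the probing and helper startup
details are assumptions; sudo --validate prompts for and caches
credentials):

import (
	"os"
	"os/exec"
)

// sudoHelper sketch: probe whether sudo works without a password; if
// not and we may interact with the user, let sudo prompt on the
// terminal before starting the helper.
func sudoHelper(interactive bool, helperArgs ...string) (*exec.Cmd, error) {
	probe := exec.Command("sudo", "--non-interactive", "--validate")
	if err := probe.Run(); err != nil {
		if !interactive {
			return nil, err // reported as NOT_CONFIGURED_VMNET_HELPER
		}
		// Interactive fallback: attach sudo to the user's terminal so
		// they can enter a password; credentials are then cached.
		auth := exec.Command("sudo", "--validate")
		auth.Stdin, auth.Stdout, auth.Stderr = os.Stdin, os.Stdout, os.Stderr
		if err := auth.Run(); err != nil {
			return nil, err
		}
	}
	cmd := exec.Command("sudo", helperArgs...)
	return cmd, cmd.Start()
}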
Example run with --interactive=false:
% minikube start --driver vfkit --network vmnet-shared --interactive=false
😄 minikube v1.35.0 on Darwin 15.4.1 (arm64)
✨ Using the vfkit (experimental) driver based on user configuration
🙈 Exiting due to NOT_CONFIGURED_VMNET_HELPER: failed to validate vmnet-shared network:
exit status 1: sudo: a password is required
💡 Suggestion:
Configure vmnet-helper to run without a password.
Please install a vmnet-helper sudoers rule using these instructions:
https://github.com/nirs/vmnet-helper#granting-permission-to-run-vmnet-helper
Example run with --interactive (default):
% minikube start --driver vfkit --network vmnet-shared
😄 minikube v1.35.0 on Darwin 15.4.1 (arm64)
✨ Using the vfkit (experimental) driver based on user configuration
💡 Unable to run vmnet-helper without a password
To configure vmnet-helper to run without a password, please check the documentation:
https://github.com/nirs/vmnet-helper/#granting-permission-to-run-vmnet-helper
Password:
👍 Starting "minikube" primary control-plane node in "minikube" cluster
🔥 Creating vfkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
🐳 Preparing Kubernetes v1.33.0 on Docker 27.4.0 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
▪ Configuring RBAC rules ...
🔗 Configuring bridge CNI (Container Networking Interface) ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟 Enabled addons: storage-provisioner, default-storageclass
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
* vfkit: Remove temporary variable
Remove temporary and unneeded mac variable. It is easier to follow the
code when we use d.MACAddress.
* vfkit: Promote state change to INFO level
System state changes should be more visible to make debugging easier.
* vmnet: Add vmnet package
The package manages the vmnet-helper[1] child process, providing
connection to the vmnet network without running the guest as root.
We will use vmnet-helper for the vfkit driver, which does not have a
way to use a shared network, where guests can access other guests in
the network. We can use it later with the qemu driver as an alternative
to socket_vmnet.
[1] https://github.com/nirs/vmnet-helper
* vfkit: add vmnet-shared network
Add a new network option for vfkit, "vmnet-shared", connecting vfkit to
the vmnet shared network. Clusters using this network can access other
clusters in the same network, similar to socket_vmnet with the QEMU
driver.
If a network is not specified, we default to the "nat" network, keeping
the previous behavior. If the network is "vmnet-shared", the vfkit
driver manages 2 processes: vfkit and vmnet-helper.
Like vfkit, vmnet-helper is started in the background, in a new process
group, so it is not terminated when the minikube process group is
terminated. Since vmnet-helper requires root to start the vmnet
interface, we start it with sudo, creating 2 child processes.
vmnet-helper drops privileges immediately after starting the vmnet
interface, and runs as the user and group running minikube.
Stopping the cluster will stop sudo, which will stop the vmnet-helper
process. Deleting the cluster kills both sudo and vmnet-helper by
killing the process group (sketched below).
This change is not complete, but it is good enough to play with the new
shared network.
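A sketch of the process-group handling described above (flags and
argument plumbing are illustrative):

import (
	"os/exec"
	"syscall"
)

// startVmnetHelper sketch: start "sudo vmnet-helper ..." in a new
// process group so it survives termination of minikube's process
// group; deleting the cluster can then kill sudo and vmnet-helper
// together by signaling the group (negative pid).
func startVmnetHelper(args ...string) (*exec.Cmd, error) {
	cmd := exec.Command("sudo", args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// Later, on delete: syscall.Kill(-cmd.Process.Pid, syscall.SIGTERM)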
Example usage:
1. Install vmnet-helper:
https://github.com/nirs/vmnet-helper?tab=readme-ov-file#installation
2. Setup vmnet-helper sudoers rule:
https://github.com/nirs/vmnet-helper?tab=readme-ov-file#granting-permission-to-run-vmnet-helper
3. Start 2 clusters with vmnet-shared network:
% minikube start -p c1 --driver vfkit --network vmnet-shared
...
% minikube start -p c2 --driver vfkit --network vmnet-shared
...
% minikube ip -p c1
192.168.105.18
% minikube ip -p c2
192.168.105.19
4. Both clusters can access each other:
% minikube -p c1 ssh -- ping -c 3 192.168.105.19
PING 192.168.105.19 (192.168.105.19): 56 data bytes
64 bytes from 192.168.105.19: seq=0 ttl=64 time=0.621 ms
64 bytes from 192.168.105.19: seq=1 ttl=64 time=0.989 ms
64 bytes from 192.168.105.19: seq=2 ttl=64 time=0.490 ms
--- 192.168.105.19 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.490/0.700/0.989 ms
% minikube -p c2 ssh -- ping -c 3 192.168.105.18
PING 192.168.105.18 (192.168.105.18): 56 data bytes
64 bytes from 192.168.105.18: seq=0 ttl=64 time=0.289 ms
64 bytes from 192.168.105.18: seq=1 ttl=64 time=0.798 ms
64 bytes from 192.168.105.18: seq=2 ttl=64 time=0.993 ms
--- 192.168.105.18 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.289/0.693/0.993 ms
* reason: Remove trailing whitespace
Trailing whitespace is removed by some editors or displayed as a
warning. Clean up to make it easier to maintain this file.
* start: Validate vfkit --network option
The vfkit driver now supports the `nat` and `vmnet-shared` network
options. The `nat` option provides the best performance and is always
available, so it is the default network option. The `vmnet-shared`
option provides access between machines with lower performance compared
to `nat`.
If the `vmnet-shared` option is selected, we verify that vmnet-helper
is available. The check ensures that vmnet-helper is installed and the
sudoers configuration allows the current user to run vmnet-helper
without a password.
If validating vmnet-helper fails, we return a new NotFoundVmnetHelper
reason pointing to the vmnet-helper installation docs or recommending
to use `nat`. This is based on how we treat missing socket_vmnet for
the QEMU driver.
* site: Document vfkit network options
* Fix waiting for all kube-system pods having one of specified labels to be Ready
* Add process package
Current code contains multiple implementations for managing a process
using pids, with various issues:
- Some are unsafe, terminating a process by pid without validating that
the pid belongs to the right process. Some use unclear
- Using unclear terms like checkPid() (what does it mean?)
- Some are missing tests
Let's clean up the mess by introducing a process package. The package
provides:
- process.WritePidfile(): write a pid to a file
- process.ReadPidfile(): read a pid from a file
- process.Exists(): tells if a process matching pid and name exists
- process.Terminate(): terminates a process matching pid and name
- process.Kill(): kills a process matching pid and name
The library is tested on linux, darwin, and windows. On windows we don't
have a standard way to terminate a process gracefully, so
process.Terminate() is the same as process.Kill().
I want to use this package in vfkit and the new vmnet package, and
later we can use it for qemu, hyperkit, and other code managing
processes with pids. A usage sketch follows.
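A usage sketch, with signatures assumed from the list above (the import
path of the new package is elided):

import "path/filepath"

// stopVfkit sketch: stop a vfkit process recorded in a pidfile,
// verifying the pid still belongs to a "vfkit" process before
// signaling it.
func stopVfkit(machineDir string) error {
	pidfile := filepath.Join(machineDir, "vfkit.pid")
	pid, err := process.ReadPidfile(pidfile)
	if err != nil {
		return err
	}
	exists, err := process.Exists(pid, "vfkit")
	if err != nil || !exists {
		return err // gone or pid reused: nothing to terminate
	}
	// On windows Terminate behaves like Kill.
	return process.Terminate(pid, "vfkit")
}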
* vfkit: Use process pidfile helpers
- Simplify GetState() using process.ReadPidfile()
- Simplify Start() using process.WritePidfile()
* vfkit: Simpler and more robust GetState()
GetState() had several issues:
- When accessing the vfkit HTTP API, we handled only "running",
"VirtualMachineStateRunning", "stopped", and
"VirtualMachineStateStopped", but there are 10 other possible states,
which we handled as state.None even when vfkit is running and needs to
be stopped. This can lead to wrong handling in the caller.
- When handling "stopped" and "VirtualMachineStateStopped" we returned
state.Stopped, but did not remove the pidfile. This can cause
termination of an unrelated process or reporting a wrong status when
the pid is reused.
- Accessing the HTTP API will fail after we stop or kill vfkit. This
causes GetState() to fail when the process is actually stopped, which
can lead to unnecessary retries and long delays (#20503).
- When returning state.None during Remove(), we tried to do a graceful
shutdown, which does not make sense in the minikube delete flow and is
not consistent with the state.Running handling.
Accessing the vfkit API to check the state does not add much value for
our use case, checking if the vfkit process is running, and it is not
reliable.
Fix all the issues by not using the HTTP API in GetState(), using only
the process state. We still use the API for stopping and killing vfkit
to do a graceful shutdown. This also simplifies Remove(), since we need
to handle only the state.Running state.
With this change we consider vfkit as stopped only when the process does
not exist, which takes about 3 seconds after the state is reported as
"stopped".
Example stop flow:
I0309 18:15:40.260249 18857 main.go:141] libmachine: Stopping "minikube"...
I0309 18:15:40.263225 18857 main.go:141] libmachine: set state: {State:Stop}
I0309 18:15:46.266902 18857 main.go:141] libmachine: Machine "minikube" was stopped.
I0309 18:15:46.267122 18857 stop.go:75] duration metric: took 6.127761459s to stop
Example delete flow:
I0309 17:00:49.483078 18127 out.go:177] * Deleting "minikube" in vfkit ...
I0309 17:00:49.499252 18127 main.go:141] libmachine: set state: {State:HardStop}
I0309 17:00:49.569938 18127 lock.go:35] WriteFile acquiring /Users/nir/.kube/config: ...
I0309 17:00:49.573977 18127 out.go:177] * Removed all traces of the "minikube" cluster.
* vfkit: Use process.Exists()
Previously we did not check the process name when checking a pid from a
pidfile. If the pidfile became stale we would assume that vfkit is
running and try to stop it via the HTTP API, which would never succeed.
Now we detect a stale pidfile and remove it.
If removing the stale pidfile fails, we don't want to fail the
operation since we know that vfkit is not running. We log the failure
to aid debugging of stale pidfiles.
* vfkit: More robust Stop()
If setting the vfkit state to "Stop" fails, we used to return an error.
Retrying the operation may never succeed.
Fix by falling back to terminating vfkit using a signal. This
terminates vfkit immediately, similar to HardStop[1].
We can still fail if the pidfile is corrupted, but this is unlikely and
requires manual cleanup.
In the case where we are sure the vfkit process does not exist, we
remove the pidfile immediately, avoiding a leftover pidfile if the
caller does not call GetState() after Stop().
[1] https://github.com/crc-org/vfkit/issues/284
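A sketch of the fallback (setVMState is a hypothetical stand-in for the
vfkit HTTP API call; imports elided):

func (d *Driver) Stop() error {
	if err := d.setVMState("Stop"); err != nil {
		// Graceful stop failed; fall back to a signal, similar to
		// HardStop.
		klog.Warningf("graceful stop failed, terminating vfkit: %v", err)
		pid, perr := process.ReadPidfile(d.pidfilePath())
		if perr != nil {
			return perr
		}
		return process.Terminate(pid, "vfkit")
	}
	return nil
}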
* vfkit: More robust Kill()
We know that setting the state to `HardStop` typically fails:
I0309 19:19:42.378591 21795 out.go:177] 🔥 Deleting "minikube" in vfkit ...
W0309 19:19:42.397472 21795 delete.go:106] remove failed, will retry: kill: Post "http://_/vm/state": EOF
This may lead to unnecessary retries and delays. Fix by falling back to
sending a SIGKILL signal.
Example delete flow when setting vfkit state fails:
I0309 20:07:41.688259 25540 out.go:177] 🔥 Deleting "minikube" in vfkit ...
I0309 20:07:41.712017 25540 main.go:141] libmachine: Failed to set vfkit state to 'HardStop': Post "http://_/vm/state": EOF
We use `HardStop`, which seems to do a forced shutdown instead of a
graceful shutdown and, to make things worse, always fails with:
I0309 15:00:42.452986 13723 stop.go:66] stop err: Post "http://_/vm/state": EOF
This leads to an unneeded retry and a pointless backup attempt timing
out after 135 seconds because vfkit was already stopped.
With this change we do a graceful shutdown, and the time to stop the
cluster decreased from 135 seconds to 3 seconds (45 times faster).
Example stop log:
I0309 15:34:33.104429 14440 main.go:141] libmachine: Stopping "minikube"...
I0309 15:34:33.105225 14440 main.go:141] libmachine: get state: {State:VirtualMachineStateRunning}
I0309 15:34:33.105799 14440 main.go:141] libmachine: set state: {State:Stop}
I0309 15:34:33.106099 14440 main.go:141] libmachine: get state: {State:VirtualMachineStateRunning}
I0309 15:34:36.109380 14440 main.go:141] libmachine: Machine "minikube" was stopped.
On macOS >= 15 bootpd is likely allowed to receive incoming connections
as built-in software, and it will not be listed in the allowed
applications. In this case we decide that bootpd is blocked and force
the user to try to add and unblock it, which will never succeed.
Fixed using the new --getallowedsigned option. If the option is
enabled, we know that bootpd is not blocked. If the option is not
enabled, or the check fails, we fall back to checking the list.
Tested on macOS 15.3.
* Kicbase/ISO: Update cni-plugins from v1.6.1 to v1.6.2
* Updating kicbase image to v0.0.45-1736763277-20236
* Updating ISO to v1.34.0-1736762773-20236