Compare commits

...

64 Commits

Author SHA1 Message Date
Carlos de Paula a8c6bba80d Fix yaml generation in build script 2024-05-30 17:19:48 -03:00
Yesid Leonardo López Sierra 476c00d129 Update build.sh 2024-05-06 12:25:27 -03:00
Silas 9bda1810a6 validate go version in `make vendor` 2023-07-31 09:52:58 -03:00
Carlos Eduardo aa4f1b7900
Update FUNDING.yml 2022-08-10 23:30:51 -03:00
Renato Caldas 800d2e2b71 Fix Makefile after 'go get' deprecation (fixes #160) 2022-07-14 11:11:02 -03:00
Carlos de Paula 46176b4671 Fix namespace definition reported in #144. Rebuild manifests. 2022-02-08 12:41:51 -03:00
ToMe25 514aa37f9a Add support for more then one suffix domain 2022-02-08 12:25:39 -03:00
Carlos Eduardo c0807b6477
Merge pull request #125 from ToMe25/speedtest_dashboard_fix 2021-11-12 16:49:58 -03:00
ToMe25 b289fa5d4e Fix speedtest grafana dashboard not working on default grafana pod 2021-11-12 16:35:54 +00:00
Carlos Eduardo e5a8f1ab52
Merge pull request #117 from ToMe25/ingress_improvements 2021-11-10 09:35:06 -03:00
Carlos Eduardo 9b70233e4e
Merge pull request #116 from Cian911/master 2021-11-10 09:32:18 -03:00
ToMe25 5397440673 Generate ingress manifests exactly matching the manual changes
This solution isn't perfect, but it works.
2021-10-12 14:23:41 +01:00
ToMe25 3643fdedad Change ingresses from extentions/v1beta1 to networking/v1beta1
This is deprecated too, but not for as long.
I wanted to change it to networking/v1, but ksonnet doesn't support that.
2021-10-11 22:44:30 +01:00
Cian Gallagher 365bee146c
FEATURE: Add speedtest monitor module and custom dashboard. 2021-10-08 19:46:39 +01:00
Carlos Eduardo 8f188464d5
Merge pull request #115 from mimartin12/ingress-patch
Patched ingress to resolve 'unkown spec serviceName'
2021-08-17 17:06:48 -03:00
Micaiah Martin 10a14ac22e Patched ingress to resolve 'unkown spec serviceName' 2021-08-10 09:00:42 -06:00
Carlos Eduardo c04bb08b05
Merge pull request #111 from yosuf/fix-deprecated-ingress-apis
Fix for #110 Upgrade Ingress API versions to networking.k8s.io/v due to deprecation
2021-04-15 18:41:36 -03:00
Carlos Eduardo 9bb111aabc
Merge pull request #108 from fvilers/master
Fix kubelet 1.19 metrics name
2021-04-15 18:41:05 -03:00
Yosuf Haydary b8fb4f2cbf Upgrade Ingress API versions to networking.k8s.io/v due to deprecation
Fixes: Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
Signed-off-by: Yosuf Haydary <yosuf.haydary@rws.nl>
2021-02-14 22:01:29 +01:00
Fabian Vilers be6754470c Fix kubelet 1.19 metrics name
Signed-off-by: Fabian Vilers <fabian.vilers@dev-one.com>
2021-02-01 15:51:03 +01:00
Carlos de Paula a136833aaf Regenerate manifests 2020-12-30 14:07:40 -03:00
Carlos Eduardo 39975951e8
Merge pull request #105 from izewfktvy533zjmn/master
Fixes carlosedp/cluster-monitoring#104
2020-12-30 13:51:20 -03:00
Taichi Soma 680c13fc87 Change manifests/grafana-dashboardDefinitions.yaml 2020-12-11 03:20:27 +09:00
Carlos Eduardo bf9342e95c
Update bug_report.md 2020-10-28 16:39:42 -03:00
Carlos Eduardo eac9776cc6
Merge pull request #99 from jontg/fix/metallb-cluster-role-binding
Bind the metallb-exporter cluster role to prometheus directly
2020-10-27 20:54:40 -03:00
Jón Tómas Grétarsson c89af41371 Bind the metallb-exporter cluster role to prometheus directly 2020-10-13 09:15:52 -07:00
Carlos Eduardo c29680b230
Merge pull request #97 from pascal71/master
Added securityContext.priviledged = true to fix RPi GPU temp issue
2020-10-07 16:13:46 -03:00
pamvdam feefef8121 Added securityContext.priviledged = true to fix RPi GPU temp issue 2020-10-07 18:48:20 +01:00
Carlos Eduardo 5ead7542d1
Merge pull request #96 from NicklasWallgren/topic-20
Fixes carlosedp/cluster-monitoring#20 - Hacktoberfest
2020-10-05 18:26:35 -03:00
Nicklas Wallgren d7c877aa3b Fixes carlosedp/cluster-monitoring#20 2020-10-03 10:09:08 +02:00
Carlos Eduardo 4657fcd62c Update issue templates 2020-08-20 16:58:56 -03:00
Carlos Eduardo 701ef5bad9
Create FUNDING.yml 2020-08-20 16:48:51 -03:00
Carlos Eduardo 0ad298d1b9
Merge pull request #85 from Nashluffy/master 2020-08-17 15:15:55 -03:00
Carlos Eduardo d00fff6e7a
Merge pull request #87 from Nashluffy/patch-1
Added clusterRole, clusterRoleBinding, and SA to metallb module
2020-08-17 15:08:58 -03:00
Nash Luffman 0a3a7424b0
Disable Nginx module by default 2020-08-17 08:06:28 -04:00
Nash Luffman 264569f215
Added clusterRole, clusterRoleBinding, and SA to metallb module 2020-08-16 00:46:50 -04:00
Nash Luffman 3fe1de4029 Fixing commit with a bunch of manifests 2020-08-16 00:36:54 -04:00
Nash Luffman 8537ec4186 Fixing nginx module service 2020-08-16 00:14:16 -04:00
Nash Luffman 9b0050c17d Removing comments 2020-08-15 23:47:13 -04:00
Nash Luffman f7f93e63ea Added nginx module 2020-08-15 23:45:18 -04:00
Carlos de Paula e7c456f380 Restore Prometheus dashboard 2020-08-06 10:45:12 -03:00
Carlos Eduardo d43de654e6
Merge pull request #80 from damacus/79/duplicate-dashboard-ids
[#79] Fix duplicate Grafana Dashboard IDs
2020-08-03 20:21:32 -03:00
Carlos Eduardo 9ad185d8f3
Merge pull request #83 from carlosedp/add-license-1
Create LICENSE
2020-08-03 20:10:28 -03:00
Carlos Eduardo 508600be9d
Create LICENSE 2020-08-03 19:35:27 -03:00
Dan Webb 7d2b7473d8
[#79] Fix duplicate Grafana Dashboard IDs
Fixes the duplicate grafana dashboard UIDs by allowing the DB to generate
the UIDs
2020-07-31 13:14:01 +01:00
Carlos Eduardo bdd892ecce
Merge pull request #77 from naude-r/master
Configure grafana with optional env variables
2020-07-14 13:23:54 -03:00
Roelof Naude c424ae1793 - allow adding environment variables to the grafana deployment. the main use case for
this is to allow plugin download when behind a proxy
2020-07-14 16:26:12 +02:00
Carlos de Paula 301785dc46 Add sample PVs and improve Readme 2020-06-23 18:06:39 -03:00
Carlos de Paula 6b0b8c780a Add Grafana plugins install option
Fixes #67
2020-06-23 17:24:33 -03:00
Carlos de Paula ad1d165158 Add option to configure the StorageClass 2020-06-22 19:38:27 -03:00
Carlos de Paula c2e91b0ea9 Fix make docker target 2020-06-22 19:29:51 -03:00
Carlos de Paula 5c772218d9 Add VSCode config for jsonnet extension 2020-06-22 16:06:50 -03:00
Carlos de Paula 956a160a62 Add Docker Makefile target
With this change, the manifests are built in a Docker container without
the need to install the pre-reqs like Golang, Jsonnet and etc.
2020-06-22 14:07:18 -03:00
Carlos de Paula 9a9e92588c Use a valid name for the TLS secret
Fixes #66.
2020-06-22 13:35:11 -03:00
Carlos de Paula 71e9e55f4c Fix persistence 2020-06-22 09:36:10 -03:00
Carlos de Paula 65d0293701 Update overview dashboard 2020-06-19 14:58:37 -03:00
Carlos de Paula 2ffe9ea31f Bump images and libs. Add params to vars.jsonnet 2020-06-19 09:58:49 -03:00
Carlos de Paula 357a39a627 Add options for pre-created PVs 2020-06-17 15:39:59 -03:00
Carlos de Paula cd17208e4f Fix SMTP relay manifest generation
Fixes #58
2020-06-15 19:46:19 -03:00
Carlos de Paula 33af4a53d1 Fix consolidated Kubernetes dashboard
Fixes #56.
2020-06-15 19:16:47 -03:00
Carlos de Paula b617576d64 Fix TLS secret name 2020-06-15 18:29:18 -03:00
Carlos de Paula 0d8368b29f Improve change_suffix target 2020-06-15 17:52:35 -03:00
Carlos de Paula 0d032eac7f Fix change_suffix and improve Readme
Fix the change_suffix Makefile target allowing the user to just change
the URL suffix of the ingresses for Prometheus, Alertmanager and
Grafana.

This way it's not required to rebuild all manifests from scratch. Also
improved the Readme with this instructions.
2020-06-15 17:06:36 -03:00
Carlos de Paula 40c9318d23 Updated Traefik dashboard 2020-05-27 14:55:06 -03:00
58 changed files with 11005 additions and 31505 deletions

4
.github/FUNDING.yml vendored Normal file
View File

@ -0,0 +1,4 @@
# These are supported funding model platforms
github: carlosedp
patreon: carlosedp

40
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file
View File

@ -0,0 +1,40 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is. Please do not use an Issue to ask for help deploying the stack in the cluster. Troubleshoot your cluster configuration first.
**Troubleshooting**
1. Which kind of Kubernetes cluster are you using? (Kubernetes, K3s, etc)
2. Are all pods in "Running" state? If any is in CrashLoopback or Error, check it's logs.
3. You cluster already works with other applications that have HTTP/HTTPS? If not, first deploy an example NGINX and test it's access thru the created URL.
4. If you enabled persistence, do your cluster already provides persistent storage (PVs) to other applications?
5. Does it provides dynamic storage thru StorageClass?
6. If you deployed the monitoring stack and some targets are not available or showing no metrics in Grafana, make sure you don't have IPTables rules or use a firewall on your nodes before deploying Kubernetes.
**Customizations**
1. Did you customize `vars.jsonnet`? Put the contents below:
```jsonnet
# vars.jsonnet content
```
2. Did you change any other file? Put the contents below:
```jsonnet
# file x.yz content
```
**What did you see when trying to access Grafana and Prometheus web GUI**
A clear and concise description with screenshots from the URLs and logs from the pods.
**Additional context**
Add any other context about the problem here.

1
.gitignore vendored
View File

@ -2,3 +2,4 @@ vendor
auth
server.crt
server.key
.idea

5
.vscode/settings.json vendored Normal file
View File

@ -0,0 +1,5 @@
{
"jsonnet.libPaths": [
"vendor/",
],
}

21
LICENSE Normal file
View File

@ -0,0 +1,21 @@
MIT License
Copyright (c) 2020 Carlos Eduardo
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@ -5,6 +5,13 @@ JB_BINARY := $(GOPATH)/bin/jb
JSONNET_FMT := $(GOPATH)/bin/jsonnetfmt -n 2 --max-blank-lines 2 --string-style s --comment-style s
GO_MAJOR_VERSION = $(shell go version | cut -c 14- | cut -d' ' -f1 | cut -d'.' -f1)
GO_MINOR_VERSION = $(shell go version | cut -c 14- | cut -d' ' -f1 | cut -d'.' -f2)
MINIMUM_SUPPORTED_GO_MAJOR_VERSION = 1
MINIMUM_SUPPORTED_GO_MINOR_VERSION = 18
GO_VERSION_VALIDATION_ERR_MSG = Your golang version, $(GO_MAJOR_VERSION).$(GO_MINOR_VERSION), is not supported, \
please update to at least $(MINIMUM_SUPPORTED_GO_MAJOR_VERSION).$(MINIMUM_SUPPORTED_GO_MINOR_VERSION).
.PHONY: generate vendor fmt manifests help
all: manifests ## Builds the manifests
@ -18,52 +25,63 @@ manifests: $(JSONNET_BIN) ## Builds the manifests
rm -rf manifests
./scripts/build.sh main.jsonnet $(JSONNET_BIN)
update_libs: $(JB_BINARY) ## Updates vendor libs. Require a regeneration of the manifests
docker: ## Builds the manifests in a Docker container to avoid installing pre-requisites (Golang, Jsonnet, etc)
docker run -it --rm -v $(PWD):/work -w /work --rm golang bash -c "make vendor && make"
update_libs: $(JB_BINARY) ## Updates vendor libs. Require a regeneration of the manifests
$(JB_BINARY) update
vendor: $(JB_BINARY) jsonnetfile.json jsonnetfile.lock.json ## Download vendor libs
vendor: validate-go-version $(JB_BINARY) jsonnetfile.json jsonnetfile.lock.json ## Download vendor libs
rm -rf vendor
$(JB_BINARY) install
fmt: ## Formats all jsonnet and libsonnet files (except on vendor dir)
fmt: ## Formats all jsonnet and libsonnet files (except on vendor dir)
@echo "Formatting jsonnet files"
@find . -name 'vendor' -prune -o -name '*.libsonnet' -o -name '*.jsonnet' -print | xargs -n 1 -- $(JSONNET_FMT) -i
@find . -type f \( -iname "*.libsonnet" -or -iname "*.jsonnet" \) -print -or -name "vendor" -prune | xargs -n 1 -- $(JSONNET_FMT) -i
deploy: manifests ## Rebuilds manifests and deploy to configured cluster
deploy: ## Deploy current manifests to configured cluster
echo "Deploying stack setup manifests..."
kubectl apply -f ./manifests/setup/
echo "Will wait 10 seconds to deploy the additional manifests.."
sleep 10
kubectl apply -f ./manifests/
teardown: ## Delete all monitoring stack resources from configured cluster
teardown: ## Delete all monitoring stack resources from configured cluster
kubectl delete -f ./manifests/
kubectl delete -f ./manifests/setup/
tar: manifests ## Generates a .tar.gz from manifests dir
tar: manifests ## Generates a .tar.gz from manifests dir
rm -rf manifests.tar.gz
tar -cfz manifests.tar.gz manifests
$(JB_BINARY): ## Installs jsonnet-bundler utility
$(JB_BINARY): ## Installs jsonnet-bundler utility
@echo "Installing jsonnet-bundler"
@go get -u github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb
@go install github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb@latest
$(JSONNET_BIN): ## Installs jsonnet and jsonnetfmt utility
$(JSONNET_BIN): ## Installs jsonnet and jsonnetfmt utility
@echo "Installing jsonnet"
@go get -u github.com/google/go-jsonnet/cmd/jsonnet
@go get -u github.com/google/go-jsonnet/cmd/jsonnetfmt
@go get -u github.com/brancz/gojsontoyaml
@go install github.com/google/go-jsonnet/cmd/jsonnet@latest
@go install github.com/google/go-jsonnet/cmd/jsonnetfmt@latest
@go install github.com/brancz/gojsontoyaml@latest
update_tools: ## Updates jsonnet, jsonnetfmt and jb utilities
update_tools: ## Updates jsonnet, jsonnetfmt and jb utilities
@echo "Updating jsonnet"
@go get -u github.com/google/go-jsonnet/cmd/jsonnet
@go get -u github.com/google/go-jsonnet/cmd/jsonnetfmt
@go get -u github.com/brancz/gojsontoyaml
@go get -u github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb
@go install github.com/google/go-jsonnet/cmd/jsonnet@latest
@go install github.com/google/go-jsonnet/cmd/jsonnetfmt@latest
@go install github.com/brancz/gojsontoyaml@latest
@go install github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb@latest
change_suffix: ## Changes suffix for the ingress
@perl -p -i -e 's/^(\s*)\-\ host:.*/\1- host: alertmanager.${IP}.nip.io/g' manifests/ingress-alertmanager.yaml manifests/ingress-prometheus.yaml manifests/ingress-grafana.yaml
@echo "Ingress IPs changed to [service].${IP}.nip.io"
${K3S} kubectl apply -f manifests/ingress-alertmanager.yaml
${K3S} kubectl apply -f manifests/ingress-grafana.yaml
${K3S} kubectl apply -f manifests/ingress-prometheus.yaml
change_suffix: ## Changes suffix for the ingress. Pass suffix=[suffixURL] as argument
@echo "Ingress IPs changed to [service].${suffix}"
@echo "Apply to your cluster with:"
@for f in alertmanager prometheus grafana; do \
cat manifests/ingress-$$f.yaml | sed -e "s/\(.*$$f\.\).*/\1${suffix}/" > manifests/ingress-$$f.yaml-tmp; \
mv -f manifests/ingress-$$f.yaml-tmp manifests/ingress-$$f.yaml; \
echo ${K3S} kubectl apply -f manifests/ingress-$$f.yaml; \
done
validate-go-version: ## Validates the installed version of go to allow the `go install` syntax needed for `make vendor`
@if [ $(GO_MAJOR_VERSION) -lt $(MINIMUM_SUPPORTED_GO_MAJOR_VERSION) -o $(GO_MINOR_VERSION) -lt $(MINIMUM_SUPPORTED_GO_MINOR_VERSION) ]; then \
echo '$(GO_VERSION_VALIDATION_ERR_MSG)'; \
exit 1; \
fi

View File

@ -4,7 +4,7 @@ The Prometheus Operator for Kubernetes provides easy monitoring definitions for
This have been tested on a hybrid ARM64 / X84-64 Kubernetes cluster deployed as [this article](https://medium.com/@carlosedp/building-a-hybrid-x86-64-and-arm-kubernetes-cluster-e7f94ff6e51d).
This repository collects Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.
This repository collects Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator. The container images support AMD64, ARM64, ARM and PPC64le architectures.
The content of this project is written in jsonnet and is an extension of the fantastic [kube-prometheus](https://github.com/coreos/prometheus-operator/blob/master/contrib/kube-prometheus) project.
@ -25,7 +25,7 @@ There are additional modules (disabled by default) to monitor other components o
The additional modules are:
* arm-exporter to generate temperature metrics (works on some ARM boards like RaspberryPi)
* ARM-exporter to generate temperature metrics (works on some ARM boards like RaspberryPi)
* MetalLB metrics
* Traefik metrics
* ElasticSearch metrics
@ -36,22 +36,36 @@ There are also options to set the ingress domain suffix and enable persistence f
The ingresses can use TLS with the default self-signed certificate from your Ingress controller by setting `TLSingress` to `true` and use a custom certificate by creating the files `server.crt` and `server.key` and enabling the `UseProvidedCerts` parameter at `vars.jsonnet`.
After changing these parameters, rebuild the manifests with `make`.
Persistence for Prometheus and Grafana can be enabled in the `enablePersistence` section. Setting each to `true`, creates the volume PVCs. If no PV names are defined in `prometheusPV` and `grafanaPV`, the default StorageClass will be used to dynamically create the PVs The sizes can be adjusted in `prometheusSizePV` and `grafanaSizePV`.
If using pre-created persistent volumes (samples in [`samples`](samples)), check permissions on the directories hosting the files. The `UID:GID` for Prometheus is `1000:0` and for Grafana is `472:472`.
Changing these parameters require a rebuild of the manifests with `make` followed by `make deploy`. To avoid installing all pre-requisites like Golang, Jsonnet, Jsonnet-bundler, use the target `make docker` to build in a container.
## Quickstart (non K3s)
The repository already provides a set of compiled manifests to be applied into the cluster. The deployment can be customized thru the jsonnet files.
The repository already provides a set of compiled manifests to be applied into the cluster or the deployment can be customized thru the jsonnet files.
For the ingresses, edit `suffixDomain` to have your cluster URL suffix. This will be your ingresses will be exposed (ex. grafana.yourcluster.domain.com).
If you only need the default features and adjust your cluster URL for the ingress, there is no need to rebuild the manifests(and install all tools). Use the `change_suffix` target with argument `suffix=[suffixURL]` with the URL of your cluster ingress controller. If you have a local cluster, use the nip.io domain resolver passing `your_cluster_ip.nip.io` to the `suffix` argument. After this, just run `make deploy`.
To deploy the stack, run:
```bash
# Update the ingress URLs
make change_suffix suffix=[suffixURL]
# Deploy
make deploy
```
To customize the manifests, edit `vars.jsonnet` and rebuild the manifests.
```bash
$ make vendor
$ make
$ make deploy
# Or manually:
$ make vendor
$ make
$ kubectl apply -f manifests/setup/
$ kubectl apply -f manifests/
@ -63,9 +77,7 @@ If you enable the SMTP relay for Gmail in `vars.jsonnet`, the pod will be in an
## Quickstart on Minikube
You can also test and develop the monitoring stack on Minikube. First install minikube by following the instructions [here](https://kubernetes.io/docs/tasks/tools/install-minikube/) for your platform.
Then, adjust and deploy the stack:
You can also test and develop the monitoring stack on Minikube. First install minikube by following the instructions [here](https://kubernetes.io/docs/tasks/tools/install-minikube/) for your platform. Then, follow the instructions similar to the non-K3s deployment:
```bash
# Start minikube (if not started)
@ -77,9 +89,14 @@ minikube addons enable ingress
# Get the minikube instance IP
minikube ip
# Adjust the "suffixDomain" line in `vars.jsonnet` to the IP of the minikube instance keeping the "nip.io"
# Build and deploy the monitoring stack
# Run the change_suffix target
make change_suffix suffix=[minikubeIP.nip.io]
# or customize additional params on vars.jsonnet and rebuild
make vendor
make
# and deploy the manifests
make deploy
# Get the URLs for the exposed applications and open in your browser
@ -99,6 +116,7 @@ After changing these values to deploy the stack, run:
```bash
$ make vendor
$ make
$ make deploy
# Or manually:
@ -125,7 +143,7 @@ To list the created ingresses, run `kubectl get ingress --all-namespaces`, if yo
## Updating the ingress suffixes
To avoid rebuilding all manifests, there is a make target to update the Ingress URL suffix to a different suffix (using nip.io) to match your host IP. Run `make change_suffix IP="[IP-ADDRESS]"` to change the ingress route IP for Grafana, Prometheus and Alertmanager and reapply the manifests. If you have a K3s cluster, run `make change_suffix IP="[IP-ADDRESS] K3S=k3s`.
To avoid rebuilding all manifests, there is a make target to update the Ingress URL suffix to a different suffix. Run `make change_suffix suffix="[clusterURL]"` to change the ingress route IP for Grafana, Prometheus and Alertmanager and reapply the manifests.
## Customizing
@ -133,7 +151,7 @@ The content of this project consists of a set of jsonnet files making up a libra
### Pre-reqs
The project requires json-bundler and the jsonnet compiler. The Makefile does the heavy-lifting of installing them. You need [Go](https://golang.org/dl/) already installed:
The project requires json-bundler and the jsonnet compiler. The Makefile does the heavy-lifting of installing them. You need [Go](https://golang.org/dl/) (version 1.18 minimum) already installed:
```bash
git clone https://github.com/carlosedp/cluster-monitoring

View File

@ -4,12 +4,13 @@ local vars = import 'vars.jsonnet';
{
_config+:: {
namespace: 'monitoring',
namespace: vars._config.namespace,
urls+:: {
prom_ingress: 'prometheus.' + vars.suffixDomain,
alert_ingress: 'alertmanager.' + vars.suffixDomain,
grafana_ingress: 'grafana.' + vars.suffixDomain,
domains: [vars.suffixDomain] + vars.additionalDomains,
prom_ingress: ['prometheus.' + domain for domain in $._config.urls.domains],
alert_ingress: ['alertmanager.' + domain for domain in $._config.urls.domains],
grafana_ingress: ['grafana.' + domain for domain in $._config.urls.domains],
grafana_ingress_external: 'grafana.' + vars.suffixDomain,
},
@ -58,6 +59,8 @@ local vars = import 'vars.jsonnet';
},
},
},
plugins: vars.grafana.plugins,
env: vars.grafana.env
},
},
//---------------------------------------
@ -69,18 +72,21 @@ local vars = import 'vars.jsonnet';
local pvc = k.core.v1.persistentVolumeClaim,
prometheus+: {
spec+: {
// Here one can use parameters from https://coreos.com/operators/prometheus/docs/latest/api.html#prometheusspec
replicas: $._config.prometheus.replicas,
retention: '15d',
externalUrl: 'http://' + $._config.urls.prom_ingress,
retention: vars.prometheus.retention,
scrapeInterval: vars.prometheus.scrapeInterval,
scrapeTimeout: vars.prometheus.scrapeTimeout,
externalUrl: 'http://' + $._config.urls.prom_ingress[0],
}
+ (if vars.enablePersistence.prometheus then {
storage: {
volumeClaimTemplate:
pvc.new() +
pvc.mixin.spec.withAccessModes('ReadWriteOnce') +
pvc.mixin.spec.resources.withRequests({ storage: '20Gi' }),
// Uncomment below to define a StorageClass name
//+ pvc.mixin.spec.withStorageClassName('nfs-master-ssd'),
pvc.mixin.spec.resources.withRequests({ storage: vars.enablePersistence.prometheusSizePV }) +
(if vars.enablePersistence.prometheusPV != null then pvc.mixin.spec.withVolumeName(vars.enablePersistence.prometheusPV)) +
(if vars.enablePersistence.storageClass != null then pvc.mixin.spec.withStorageClassName(vars.enablePersistence.storageClass)),
},
} else {}),
},
@ -120,7 +126,10 @@ local vars = import 'vars.jsonnet';
pvc.mixin.metadata.withNamespace($._config.namespace) +
pvc.mixin.metadata.withName('grafana-storage') +
pvc.mixin.spec.withAccessModes('ReadWriteOnce') +
pvc.mixin.spec.resources.withRequests({ storage: '2Gi' }),
pvc.mixin.spec.resources.withRequests({ storage: vars.enablePersistence.grafanaSizePV }) +
(if vars.enablePersistence.grafanaPV != null then pvc.mixin.spec.withVolumeName(vars.enablePersistence.grafanaPV)) +
(if vars.enablePersistence.storageClass != null then pvc.mixin.spec.withStorageClassName(vars.enablePersistence.storageClass)),
} else {},
grafanaDashboards+:: $._config.grafanaDashboards,
@ -131,9 +140,9 @@ local vars = import 'vars.jsonnet';
local I = utils.newIngress('alertmanager-main', $._config.namespace, $._config.urls.alert_ingress, '/', 'alertmanager-main', 'web');
if vars.TLSingress then
if vars.UseProvidedCerts then
utils.addIngressTLS(I, 'ingress-TLS-secret')
utils.addIngressTLS(I, $._config.urls.alert_ingress, 'ingress-secret')
else
utils.addIngressTLS(I)
utils.addIngressTLS(I, $._config.urls.alert_ingress)
else
I,
@ -141,9 +150,9 @@ local vars = import 'vars.jsonnet';
local I = utils.newIngress('grafana', $._config.namespace, $._config.urls.grafana_ingress, '/', 'grafana', 'http');
if vars.TLSingress then
if vars.UseProvidedCerts then
utils.addIngressTLS(I, 'ingress-TLS-secret')
utils.addIngressTLS(I, $._config.urls.grafana_ingress, 'ingress-secret')
else
utils.addIngressTLS(I)
utils.addIngressTLS(I, $._config.urls.grafana_ingress)
else
I,
@ -151,9 +160,9 @@ local vars = import 'vars.jsonnet';
local I = utils.newIngress('prometheus-k8s', $._config.namespace, $._config.urls.prom_ingress, '/', 'prometheus-k8s', 'web');
if vars.TLSingress then
if vars.UseProvidedCerts then
utils.addIngressTLS(I, 'ingress-TLS-secret')
utils.addIngressTLS(I, $._config.urls.prom_ingress, 'ingress-secret')
else
utils.addIngressTLS(I)
utils.addIngressTLS(I, $._config.urls.prom_ingress)
else
I,
@ -183,6 +192,6 @@ local vars = import 'vars.jsonnet';
// secret.mixin.metadata.withNamespace($._config.namespace),
} + if vars.UseProvidedCerts then {
secret:
utils.newTLSSecret('ingress-TLS-secret', $._config.namespace, vars.TLSCertificate, vars.TLSKey),
utils.newTLSSecret('ingress-secret', $._config.namespace, vars.TLSCertificate, vars.TLSKey),
} else {},
}

File diff suppressed because it is too large Load Diff

View File

@ -1,16 +1,14 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
"list": [{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}]
},
"description": "A dashboard for the CoreDNS DNS server.",
"editable": true,
@ -19,8 +17,7 @@
"id": 14,
"iteration": 1549319226130,
"links": [],
"panels": [
{
"panels": [{
"aliasColors": {},
"bars": false,
"dashLength": 10,
@ -54,17 +51,14 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "total",
"yaxis": 2
}
],
"seriesOverrides": [{
"alias": "total",
"yaxis": 2
}],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "sum(rate(coredns_dns_request_count_total{instance=~\"$instance\"}[5m])) by (proto)",
"format": "time_series",
"intervalFactor": 2,
@ -99,8 +93,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "pps",
"logBase": 1,
"max": null,
@ -154,8 +147,7 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"seriesOverrides": [{
"alias": "total",
"yaxis": 2
},
@ -167,15 +159,13 @@
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(coredns_dns_request_type_count_total{instance=~\"$instance\"}[5m])) by (type)",
"intervalFactor": 2,
"legendFormat": "{{type}}",
"refId": "A",
"step": 60
}
],
"targets": [{
"expr": "sum(rate(coredns_dns_request_type_count_total{instance=~\"$instance\"}[5m])) by (type)",
"intervalFactor": 2,
"legendFormat": "{{type}}",
"refId": "A",
"step": 60
}],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
@ -194,8 +184,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "pps",
"logBase": 1,
"max": null,
@ -249,17 +238,14 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "total",
"yaxis": 2
}
],
"seriesOverrides": [{
"alias": "total",
"yaxis": 2
}],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "sum(rate(coredns_dns_request_count_total{instance=~\"$instance\"}[5m])) by (zone)",
"intervalFactor": 2,
"legendFormat": "{{zone}}",
@ -292,8 +278,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "pps",
"logBase": 1,
"max": null,
@ -347,17 +332,14 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "total",
"yaxis": 2
}
],
"seriesOverrides": [{
"alias": "total",
"yaxis": 2
}],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "sum(rate(coredns_dns_request_do_count_total{instance=~\"$instance\"}[5m]))",
"intervalFactor": 2,
"legendFormat": "DO",
@ -390,8 +372,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "pps",
"logBase": 1,
"max": null,
@ -445,8 +426,7 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"seriesOverrides": [{
"alias": "tcp:90",
"yaxis": 2
},
@ -462,8 +442,7 @@
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "histogram_quantile(0.99, sum(rate(coredns_dns_request_size_bytes_bucket{instance=~\"$instance\",proto=\"udp\"}[5m])) by (le,proto))",
"intervalFactor": 2,
"legendFormat": "{{proto}}:99 ",
@ -503,8 +482,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "bytes",
"logBase": 1,
"max": null,
@ -558,8 +536,7 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"seriesOverrides": [{
"alias": "tcp:90",
"yaxis": 1
},
@ -575,8 +552,7 @@
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "histogram_quantile(0.99, sum(rate(coredns_dns_request_size_bytes_bucket{instance=~\"$instance\",proto=\"tcp\"}[5m])) by (le,proto))",
"intervalFactor": 2,
"legendFormat": "{{proto}}:99 ",
@ -616,8 +592,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "bytes",
"logBase": 1,
"max": null,
@ -675,15 +650,13 @@
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(coredns_dns_response_rcode_count_total{instance=~\"$instance\"}[5m])) by (rcode)",
"intervalFactor": 2,
"legendFormat": "{{rcode}}",
"refId": "A",
"step": 40
}
],
"targets": [{
"expr": "sum(rate(coredns_dns_response_rcode_count_total{instance=~\"$instance\"}[5m])) by (rcode)",
"intervalFactor": 2,
"legendFormat": "{{rcode}}",
"refId": "A",
"step": 40
}],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
@ -702,8 +675,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "pps",
"logBase": 1,
"max": null,
@ -761,8 +733,7 @@
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "histogram_quantile(0.99, sum(rate(coredns_dns_request_duration_milliseconds_bucket{instance=~\"$instance\"}[5m])) by (le, job))",
"intervalFactor": 2,
"legendFormat": "99%",
@ -802,8 +773,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "ms",
"logBase": 1,
"max": null,
@ -857,8 +827,7 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"seriesOverrides": [{
"alias": "udp:50%",
"yaxis": 1
},
@ -878,8 +847,7 @@
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "histogram_quantile(0.99, sum(rate(coredns_dns_response_size_bytes_bucket{instance=~\"$instance\",proto=\"udp\"}[5m])) by (le,proto)) ",
"intervalFactor": 2,
"legendFormat": "{{proto}}:99%",
@ -920,8 +888,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "bytes",
"logBase": 1,
"max": null,
@ -975,8 +942,7 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"seriesOverrides": [{
"alias": "udp:50%",
"yaxis": 1
},
@ -996,8 +962,7 @@
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "histogram_quantile(0.99, sum(rate(coredns_dns_response_size_bytes_bucket{instance=~\"$instance\",proto=\"tcp\"}[5m])) by (le,proto)) ",
"intervalFactor": 2,
"legendFormat": "{{proto}}:99%",
@ -1038,8 +1003,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "bytes",
"logBase": 1,
"max": null,
@ -1097,15 +1061,13 @@
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(coredns_cache_size{instance=~\"$instance\"}) by (type)",
"intervalFactor": 2,
"legendFormat": "{{type}}",
"refId": "A",
"step": 40
}
],
"targets": [{
"expr": "sum(coredns_cache_size{instance=~\"$instance\"}) by (type)",
"intervalFactor": 2,
"legendFormat": "{{type}}",
"refId": "A",
"step": 40
}],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
@ -1124,8 +1086,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "short",
"logBase": 1,
"max": null,
@ -1179,17 +1140,14 @@
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "misses",
"yaxis": 2
}
],
"seriesOverrides": [{
"alias": "misses",
"yaxis": 2
}],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"targets": [{
"expr": "sum(rate(coredns_cache_hits_total{instance=~\"$instance\"}[5m])) by (type)",
"intervalFactor": 2,
"legendFormat": "hits:{{type}}",
@ -1222,8 +1180,7 @@
"show": true,
"values": []
},
"yaxes": [
{
"yaxes": [{
"format": "pps",
"logBase": 1,
"max": null,
@ -1251,33 +1208,31 @@
"coredns"
],
"templating": {
"list": [
{
"allValue": ".*",
"current": {
"text": "All",
"value": "$__all"
},
"datasource": "prometheus",
"definition": "",
"hide": 0,
"includeAll": true,
"label": "Instance",
"multi": false,
"name": "instance",
"options": [],
"query": "up{job=\"coredns\"}",
"refresh": 1,
"regex": ".*instance=\"(.*?)\".*",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
"list": [{
"allValue": ".*",
"current": {
"text": "All",
"value": "$__all"
},
"datasource": "prometheus",
"definition": "",
"hide": 0,
"includeAll": true,
"label": "Instance",
"multi": false,
"name": "instance",
"options": [],
"query": "up{job=\"coredns\"}",
"refresh": 1,
"regex": ".*instance=\"(.*?)\".*",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
}]
},
"time": {
"from": "now-3h",
@ -1311,6 +1266,5 @@
},
"timezone": "utc",
"title": "CoreDNS",
"uid": "q-QViRumz",
"version": 1
}
}

View File

@ -2252,6 +2252,5 @@
},
"timezone": "utc",
"title": "ElasticSearch Cluster",
"uid": "n_nxrE_mk7",
"version": 4
}
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,352 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"links": [],
"panels": [
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"fieldConfig": {
"defaults": {
"custom": {}
},
"overrides": []
},
"fill": 1,
"fillGradient": 10,
"gridPos": {
"h": 9,
"w": 8,
"x": 0,
"y": 0
},
"hiddenSeries": false,
"id": 4,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": true,
"rightSide": false,
"show": true,
"total": false,
"values": true
},
"lines": true,
"linewidth": 1,
"nullPointMode": "connected",
"options": {
"alertThreshold": true,
"dataLinks": []
},
"percentage": false,
"pluginVersion": "7.0.3",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": true,
"targets": [
{
"expr": "speedtest_ping_latency_milliseconds",
"format": "time_series",
"instant": false,
"interval": "",
"legendFormat": "Ping (ms)",
"refId": "A"
},
{
"expr": "speedtest_jitter_latency_milliseconds",
"format": "time_series",
"instant": false,
"interval": "",
"legendFormat": "Jitter (ms)",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Ping and Jitter (ms)",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:291",
"format": "ms",
"label": "Time",
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"$$hashKey": "object:292",
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"description": "",
"fieldConfig": {
"defaults": {
"custom": {}
},
"overrides": []
},
"fill": 1,
"fillGradient": 10,
"gridPos": {
"h": 9,
"w": 8,
"x": 8,
"y": 0
},
"hiddenSeries": false,
"id": 2,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": true,
"show": true,
"total": false,
"values": true
},
"lines": true,
"linewidth": 1,
"nullPointMode": "connected",
"options": {
"alertThreshold": true,
"dataLinks": []
},
"percentage": false,
"pluginVersion": "7.2.1",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": true,
"targets": [
{
"expr": "speedtest_download_bits_per_second*10^-6",
"format": "time_series",
"instant": false,
"interval": "",
"legendFormat": "Download Speed (Mbits/s)",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Download Speed (Mbits/s)",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:291",
"format": "Mbits",
"label": "Download Speed",
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"$$hashKey": "object:292",
"decimals": null,
"format": "dateTimeAsLocal",
"label": "",
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "prometheus",
"fieldConfig": {
"defaults": {
"custom": {}
},
"overrides": []
},
"fill": 1,
"fillGradient": 10,
"gridPos": {
"h": 9,
"w": 8,
"x": 16,
"y": 0
},
"hiddenSeries": false,
"id": 3,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": true,
"show": true,
"total": false,
"values": true
},
"lines": true,
"linewidth": 1,
"nullPointMode": "connected",
"options": {
"alertThreshold": true,
"dataLinks": []
},
"percentage": false,
"pluginVersion": "7.2.1",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": true,
"targets": [
{
"expr": "speedtest_upload_bits_per_second*10^-6",
"format": "time_series",
"interval": "",
"legendFormat": "Upload Speed (Mbits/s)",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Upload Speed (Mbits/s)",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:291",
"decimals": null,
"format": "Mbits",
"label": "Upload Speed",
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"$$hashKey": "object:292",
"decimals": null,
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
}
],
"refresh": "15m",
"schemaVersion": 25,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Speedtest Dashboard",
"uid": "-fs18ztMz",
"version": 1
}

File diff suppressed because it is too large Load Diff

View File

@ -1,19 +1,19 @@
{
_config+:: {
versions+:: {
prometheus: 'v2.18.1',
alertmanager: 'v0.20.0',
prometheus: 'v2.19.1',
alertmanager: 'v0.21.0',
kubeStateMetrics: '1.9.6',
kubeRbacProxy: 'v0.5.0',
addonResizer: '2.3',
nodeExporter: 'v0.18.1',
prometheusOperator: 'v0.39.0',
prometheusOperator: 'v0.40.0',
prometheusAdapter: 'v0.7.0',
grafana: '7.0.0',
grafana: '7.0.3',
configmapReloader: 'latest',
prometheusConfigReloader: 'v0.39.0',
prometheusConfigReloader: 'v0.40.0',
armExporter: 'latest',
smtpServer: 'v1.0.1',
smtpRelay: 'v1.0.1',
elasticExporter: '1.0.4rc1',
},
@ -30,7 +30,7 @@
configmapReloader: 'carlosedp/configmap-reload',
prometheusConfigReloader: 'carlosedp/prometheus-config-reloader',
armExporter: 'carlosedp/arm_exporter',
smtpServer: 'carlosedp/docker-smtp',
smtpRelay: 'carlosedp/docker-smtp',
elasticExporter: 'carlosedp/elasticsearch-exporter',
},
},

View File

@ -18,7 +18,7 @@
"subdir": "Documentation/etcd-mixin"
}
},
"version": "747ff75c96df87530bcd8b6b02d1160c5500bf4e",
"version": "d8c8f903eee10b8391abaef7758c38b2cd393c55",
"sum": "pk7mLpdUrHuJKkj2vhD6LGMU7P+oYYooBXAeZyZa398="
},
{
@ -28,8 +28,8 @@
"subdir": "jsonnet/kube-prometheus"
}
},
"version": "5a84ac52c7517a420b3bdf3cd251e8abce59a300",
"sum": "cEMmJvhn8dLnLqUVR0ql/XnwY8Jy3HH0YWIQQRaDD0o="
"version": "17989b42aa10b1c6afa07043cb05bcd5ae492284",
"sum": "2FR289B1LGUf5tTN4PXBj5TjRX7okSFxE8uHkSslzDQ="
},
{
"source": {
@ -38,8 +38,8 @@
"subdir": "jsonnet/prometheus-operator"
}
},
"version": "d0a871b710de7b764c05ced98dbd1eb32a681790",
"sum": "cIOKRTNBUOl3a+QsaA/NjClmZAhyVJHlDFReKlXJBAs="
"version": "e31c69f9b5c6555e0f4a5c1f39d0f03182dd6b41",
"sum": "WggWVWZ+CBEUThQCztSaRELbtqdXf9s3OFzf06HbYNA="
},
{
"source": {
@ -48,8 +48,8 @@
"subdir": "grafonnet"
}
},
"version": "5736b62831d779e28a8344646aee1f72b1fa1d90",
"sum": "ch97Uqauz7z+9mkOwzRz6JErxgWcQlfuJEEg+XHEadg="
"version": "8fb95bd89990e493a8534205ee636bfcb8db67bd",
"sum": "tDuuSKE9f4Ew2bjBM33Rs6behLEAzkmKkShSt+jpAak="
},
{
"source": {
@ -58,7 +58,7 @@
"subdir": "grafana-builder"
}
},
"version": "b9cc0f3529833096c043084c04bc7b3562a134c4",
"version": "881db2241f0c5007c3e831caf34b0c645202b4ab",
"sum": "slxrtftVDiTlQK22ertdfrg4Epnq97gdrLI63ftUfaE="
},
{
@ -79,8 +79,8 @@
"subdir": ""
}
},
"version": "2beabb38d3241eb5da5080cbeb648a0cd1e3cbc2",
"sum": "s6t8ntlUHAjnifWx5V1jnBukTLPya7fX7YZVxJ0GcTk="
"version": "b61c5a34051f8f57284a08fe78ad8a45b430252b",
"sum": "7Hx/5eNm7ubLTsdrpk3b2+e/FLR3XOa4HCukmbRUCAY="
},
{
"source": {
@ -89,7 +89,7 @@
"subdir": "lib/promgrafonnet"
}
},
"version": "2beabb38d3241eb5da5080cbeb648a0cd1e3cbc2",
"version": "b61c5a34051f8f57284a08fe78ad8a45b430252b",
"sum": "VhgBM39yv0f4bKv8VfGg4FXkg573evGDRalip9ypKbc="
},
{
@ -99,7 +99,7 @@
"subdir": "jsonnet/kube-state-metrics"
}
},
"version": "cce1e3309ab2f42953933e441cbb20b54d986551",
"version": "d667979ed55ad1c4db44d331b51d646f5b903aa7",
"sum": "cJjGZaLBjcIGrLHZLjRPU9c3KL+ep9rZTb9dbALSKqA="
},
{
@ -109,7 +109,7 @@
"subdir": "jsonnet/kube-state-metrics-mixin"
}
},
"version": "cce1e3309ab2f42953933e441cbb20b54d986551",
"version": "d667979ed55ad1c4db44d331b51d646f5b903aa7",
"sum": "o5avaguRsfFwYFNen00ZEsub1x4i8Z/ZZ2QoEjFMff8="
},
{
@ -119,7 +119,7 @@
"subdir": "docs/node-mixin"
}
},
"version": "b9c96706a7425383902b6143d097cf6d7cfd1960",
"version": "08ce3c6dd430deb51798826701a395e460620d60",
"sum": "3jFV2qsc/GZe2GADswTYqxxP2zGOiANTj73W/VNFGqc="
},
{
@ -129,8 +129,8 @@
"subdir": "documentation/prometheus-mixin"
}
},
"version": "c9565f08aa4dbd53164ec1b75ea401a70feb1506",
"sum": "kRb3XBTe/AALDcaTFfyuiKqzhxtLvihBkVkvJ5cUd/I=",
"version": "74207c04655e1fd93eea0e9a5d2f31b1cbc4d3d0",
"sum": "lEzhZ8gllSfAO4kmXeTwl4W0anapIeFd5GCaCNuDe18=",
"name": "prometheus"
}
],

View File

@ -20,7 +20,7 @@ spec:
- monitoring
topologyKey: kubernetes.io/hostname
weight: 100
image: prom/alertmanager:v0.20.0
image: prom/alertmanager:v0.21.0
nodeSelector:
kubernetes.io/os: linux
replicas: 1
@ -29,4 +29,4 @@ spec:
runAsNonRoot: true
runAsUser: 1000
serviceAccountName: alertmanager-main
version: v0.20.0
version: v0.21.0

File diff suppressed because it is too large Load Diff

View File

@ -17,7 +17,7 @@ spec:
spec:
containers:
- env: []
image: grafana/grafana:7.0.0
image: grafana/grafana:7.0.3
name: grafana
ports:
- containerPort: 3000

View File

@ -1,17 +1,20 @@
apiVersion: extensions/v1beta1
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: alertmanager-main
namespace: monitoring
spec:
rules:
- host: alertmanager.192.168.15.15.nip.io
- host: alertmanager.192.168.1.15.nip.io
http:
paths:
- backend:
serviceName: alertmanager-main
servicePort: web
service:
name: alertmanager-main
port:
name: web
path: /
pathType: Prefix
tls:
- hosts:
- alertmanager.192.168.15.15.nip.io
- alertmanager.192.168.1.15.nip.io

View File

@ -1,17 +1,20 @@
apiVersion: extensions/v1beta1
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: grafana
namespace: monitoring
spec:
rules:
- host: grafana.192.168.15.15.nip.io
- host: grafana.192.168.1.15.nip.io
http:
paths:
- backend:
serviceName: grafana
servicePort: http
service:
name: grafana
port:
name: http
path: /
pathType: Prefix
tls:
- hosts:
- grafana.192.168.15.15.nip.io
- grafana.192.168.1.15.nip.io

View File

@ -1,17 +1,20 @@
apiVersion: extensions/v1beta1
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: prometheus-k8s
namespace: monitoring
spec:
rules:
- host: prometheus.192.168.15.15.nip.io
- host: prometheus.192.168.1.15.nip.io
http:
paths:
- backend:
serviceName: prometheus-k8s
servicePort: web
service:
name: prometheus-k8s
port:
name: web
path: /
pathType: Prefix
tls:
- hosts:
- prometheus.192.168.15.15.nip.io
- prometheus.192.168.1.15.nip.io

View File

@ -1,24 +0,0 @@
{
"apiVersion": "v1",
"kind": "Service",
"metadata": {
"labels": {
"k8s-app": "kube-controller-manager"
},
"name": "kube-controller-manager-prometheus-discovery",
"namespace": "kube-system"
},
"spec": {
"clusterIP": "None",
"ports": [
{
"name": "http-metrics",
"port": 10252,
"targetPort": 10252
}
],
"selector": {
"component": "kube-controller-manager"
}
}
}

View File

@ -4,7 +4,7 @@ metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.39.0
app.kubernetes.io/version: v0.40.0
name: prometheus-operator
namespace: monitoring
spec:
@ -19,4 +19,4 @@ spec:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.39.0
app.kubernetes.io/version: v0.40.0

View File

@ -25,8 +25,8 @@ spec:
- name: alertmanager-main
namespace: monitoring
port: web
externalUrl: http://prometheus.192.168.15.15.nip.io
image: prom/prometheus:v2.18.1
externalUrl: http://prometheus.192.168.1.15.nip.io
image: prom/prometheus:v2.19.1
nodeSelector:
kubernetes.io/os: linux
podMonitorNamespaceSelector: {}
@ -40,6 +40,8 @@ spec:
matchLabels:
prometheus: k8s
role: alert-rules
scrapeInterval: 30s
scrapeTimeout: 30s
securityContext:
fsGroup: 2000
runAsNonRoot: true
@ -47,4 +49,4 @@ spec:
serviceAccountName: prometheus-k8s
serviceMonitorNamespaceSelector: {}
serviceMonitorSelector: {}
version: v2.18.1
version: v2.19.1

View File

@ -708,24 +708,19 @@ spec:
record: node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile
- name: kube-prometheus-node-recording.rules
rules:
- expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait"}[3m])) BY
(instance)
- expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait"}[3m])) BY (instance)
record: instance:node_cpu:rate:sum
- expr: sum((node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_free_bytes{mountpoint="/"}))
BY (instance)
- expr: sum((node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_free_bytes{mountpoint="/"})) BY (instance)
record: instance:node_filesystem_usage:sum
- expr: sum(rate(node_network_receive_bytes_total[3m])) BY (instance)
record: instance:node_network_receive_bytes:rate:sum
- expr: sum(rate(node_network_transmit_bytes_total[3m])) BY (instance)
record: instance:node_network_transmit_bytes:rate:sum
- expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait"}[5m])) WITHOUT
(cpu, mode) / ON(instance) GROUP_LEFT() count(sum(node_cpu_seconds_total)
BY (instance, cpu)) BY (instance)
- expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait"}[5m])) WITHOUT (cpu, mode) / ON(instance) GROUP_LEFT() count(sum(node_cpu_seconds_total) BY (instance, cpu)) BY (instance)
record: instance:node_cpu:ratio
- expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait"}[5m]))
record: cluster:node_cpu:sum_rate5m
- expr: cluster:node_cpu_seconds_total:rate5m / count(sum(node_cpu_seconds_total)
BY (instance, cpu))
- expr: cluster:node_cpu_seconds_total:rate5m / count(sum(node_cpu_seconds_total) BY (instance, cpu))
record: cluster:node_cpu:ratio
- name: kube-prometheus-general.rules
rules:
@ -737,9 +732,7 @@ spec:
rules:
- alert: KubeStateMetricsListErrors
annotations:
message: kube-state-metrics is experiencing errors at an elevated rate in
list operations. This is likely causing it to not be able to expose metrics
about Kubernetes objects correctly or at all.
message: kube-state-metrics is experiencing errors at an elevated rate in list operations. This is likely causing it to not be able to expose metrics about Kubernetes objects correctly or at all.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatemetricslisterrors
expr: |
(sum(rate(kube_state_metrics_list_total{job="kube-state-metrics",result="error"}[5m]))
@ -751,9 +744,7 @@ spec:
severity: critical
- alert: KubeStateMetricsWatchErrors
annotations:
message: kube-state-metrics is experiencing errors at an elevated rate in
watch operations. This is likely causing it to not be able to expose metrics
about Kubernetes objects correctly or at all.
message: kube-state-metrics is experiencing errors at an elevated rate in watch operations. This is likely causing it to not be able to expose metrics about Kubernetes objects correctly or at all.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatemetricswatcherrors
expr: |
(sum(rate(kube_state_metrics_watch_total{job="kube-state-metrics",result="error"}[5m]))
@ -767,9 +758,7 @@ spec:
rules:
- alert: NodeFilesystemSpaceFillingUp
annotations:
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
has only {{ printf "%.2f" $value }}% available space left and is filling
up.
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available space left and is filling up.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodefilesystemspacefillingup
summary: Filesystem is predicted to run out of space within the next 24 hours.
expr: |
@ -785,9 +774,7 @@ spec:
severity: warning
- alert: NodeFilesystemSpaceFillingUp
annotations:
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
has only {{ printf "%.2f" $value }}% available space left and is filling
up fast.
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available space left and is filling up fast.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodefilesystemspacefillingup
summary: Filesystem is predicted to run out of space within the next 4 hours.
expr: |
@ -803,8 +790,7 @@ spec:
severity: critical
- alert: NodeFilesystemAlmostOutOfSpace
annotations:
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
has only {{ printf "%.2f" $value }}% available space left.
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available space left.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodefilesystemalmostoutofspace
summary: Filesystem has less than 5% space left.
expr: |
@ -818,8 +804,7 @@ spec:
severity: warning
- alert: NodeFilesystemAlmostOutOfSpace
annotations:
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
has only {{ printf "%.2f" $value }}% available space left.
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available space left.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodefilesystemalmostoutofspace
summary: Filesystem has less than 3% space left.
expr: |
@ -833,9 +818,7 @@ spec:
severity: critical
- alert: NodeFilesystemFilesFillingUp
annotations:
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
has only {{ printf "%.2f" $value }}% available inodes left and is filling
up.
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available inodes left and is filling up.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodefilesystemfilesfillingup
summary: Filesystem is predicted to run out of inodes within the next 24 hours.
expr: |
@ -851,9 +834,7 @@ spec:
severity: warning
- alert: NodeFilesystemFilesFillingUp
annotations:
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
has only {{ printf "%.2f" $value }}% available inodes left and is filling
up fast.
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available inodes left and is filling up fast.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodefilesystemfilesfillingup
summary: Filesystem is predicted to run out of inodes within the next 4 hours.
expr: |
@ -869,8 +850,7 @@ spec:
severity: critical
- alert: NodeFilesystemAlmostOutOfFiles
annotations:
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
has only {{ printf "%.2f" $value }}% available inodes left.
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available inodes left.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodefilesystemalmostoutoffiles
summary: Filesystem has less than 5% inodes left.
expr: |
@ -884,8 +864,7 @@ spec:
severity: warning
- alert: NodeFilesystemAlmostOutOfFiles
annotations:
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }}
has only {{ printf "%.2f" $value }}% available inodes left.
description: Filesystem on {{ $labels.device }} at {{ $labels.instance }} has only {{ printf "%.2f" $value }}% available inodes left.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodefilesystemalmostoutoffiles
summary: Filesystem has less than 3% inodes left.
expr: |
@ -899,8 +878,7 @@ spec:
severity: critical
- alert: NodeNetworkReceiveErrs
annotations:
description: '{{ $labels.instance }} interface {{ $labels.device }} has encountered
{{ printf "%.0f" $value }} receive errors in the last two minutes.'
description: '{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf "%.0f" $value }} receive errors in the last two minutes.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodenetworkreceiveerrs
summary: Network interface is reporting many receive errors.
expr: |
@ -910,8 +888,7 @@ spec:
severity: warning
- alert: NodeNetworkTransmitErrs
annotations:
description: '{{ $labels.instance }} interface {{ $labels.device }} has encountered
{{ printf "%.0f" $value }} transmit errors in the last two minutes.'
description: '{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf "%.0f" $value }} transmit errors in the last two minutes.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodenetworktransmiterrs
summary: Network interface is reporting many transmit errors.
expr: |
@ -939,8 +916,7 @@ spec:
severity: warning
- alert: NodeClockSkewDetected
annotations:
message: Clock on {{ $labels.instance }} is out of sync by more than 300s.
Ensure NTP is configured correctly on this host.
message: Clock on {{ $labels.instance }} is out of sync by more than 300s. Ensure NTP is configured correctly on this host.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodeclockskewdetected
summary: Clock skew detected.
expr: |
@ -960,8 +936,7 @@ spec:
severity: warning
- alert: NodeClockNotSynchronising
annotations:
message: Clock on {{ $labels.instance }} is not synchronising. Ensure NTP
is configured on this host.
message: Clock on {{ $labels.instance }} is not synchronising. Ensure NTP is configured on this host.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-nodeclocknotsynchronising
summary: Clock not synchronising.
expr: |
@ -973,8 +948,7 @@ spec:
rules:
- alert: KubePodCrashLooping
annotations:
message: Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container
}}) is restarting {{ printf "%.2f" $value }} times / 5 minutes.
message: Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting {{ printf "%.2f" $value }} times / 5 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping
expr: |
rate(kube_pod_container_status_restarts_total{job="kube-state-metrics"}[5m]) * 60 * 5 > 0
@ -983,8 +957,7 @@ spec:
severity: warning
- alert: KubePodNotReady
annotations:
message: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready
state for longer than 15 minutes.
message: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready
expr: |
sum by (namespace, pod) (
@ -999,9 +972,7 @@ spec:
severity: warning
- alert: KubeDeploymentGenerationMismatch
annotations:
message: Deployment generation for {{ $labels.namespace }}/{{ $labels.deployment
}} does not match, this indicates that the Deployment has failed but has
not been rolled back.
message: Deployment generation for {{ $labels.namespace }}/{{ $labels.deployment }} does not match, this indicates that the Deployment has failed but has not been rolled back.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentgenerationmismatch
expr: |
kube_deployment_status_observed_generation{job="kube-state-metrics"}
@ -1012,8 +983,7 @@ spec:
severity: warning
- alert: KubeDeploymentReplicasMismatch
annotations:
message: Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has not
matched the expected number of replicas for longer than 15 minutes.
message: Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has not matched the expected number of replicas for longer than 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch
expr: |
(
@ -1030,8 +1000,7 @@ spec:
severity: warning
- alert: KubeStatefulSetReplicasMismatch
annotations:
message: StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} has
not matched the expected number of replicas for longer than 15 minutes.
message: StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} has not matched the expected number of replicas for longer than 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetreplicasmismatch
expr: |
(
@ -1048,9 +1017,7 @@ spec:
severity: warning
- alert: KubeStatefulSetGenerationMismatch
annotations:
message: StatefulSet generation for {{ $labels.namespace }}/{{ $labels.statefulset
}} does not match, this indicates that the StatefulSet has failed but has
not been rolled back.
message: StatefulSet generation for {{ $labels.namespace }}/{{ $labels.statefulset }} does not match, this indicates that the StatefulSet has failed but has not been rolled back.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetgenerationmismatch
expr: |
kube_statefulset_status_observed_generation{job="kube-state-metrics"}
@ -1061,28 +1028,32 @@ spec:
severity: warning
- alert: KubeStatefulSetUpdateNotRolledOut
annotations:
message: StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} update
has not been rolled out.
message: StatefulSet {{ $labels.namespace }}/{{ $labels.statefulset }} update has not been rolled out.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubestatefulsetupdatenotrolledout
expr: |
max without (revision) (
kube_statefulset_status_current_revision{job="kube-state-metrics"}
unless
kube_statefulset_status_update_revision{job="kube-state-metrics"}
)
*
(
kube_statefulset_replicas{job="kube-state-metrics"}
!=
kube_statefulset_status_replicas_updated{job="kube-state-metrics"}
max without (revision) (
kube_statefulset_status_current_revision{job="kube-state-metrics"}
unless
kube_statefulset_status_update_revision{job="kube-state-metrics"}
)
*
(
kube_statefulset_replicas{job="kube-state-metrics"}
!=
kube_statefulset_status_replicas_updated{job="kube-state-metrics"}
)
) and (
changes(kube_statefulset_status_replicas_updated{job="kube-state-metrics"}[5m])
==
0
)
for: 15m
labels:
severity: warning
- alert: KubeDaemonSetRolloutStuck
annotations:
message: Only {{ $value | humanizePercentage }} of the desired Pods of DaemonSet
{{ $labels.namespace }}/{{ $labels.daemonset }} are scheduled and ready.
message: Only {{ $value | humanizePercentage }} of the desired Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are scheduled and ready.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck
expr: |
kube_daemonset_status_number_ready{job="kube-state-metrics"}
@ -1093,8 +1064,7 @@ spec:
severity: warning
- alert: KubeContainerWaiting
annotations:
message: Pod {{ $labels.namespace }}/{{ $labels.pod }} container {{ $labels.container}}
has been in waiting state for longer than 1 hour.
message: Pod {{ $labels.namespace }}/{{ $labels.pod }} container {{ $labels.container}} has been in waiting state for longer than 1 hour.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontainerwaiting
expr: |
sum by (namespace, pod, container) (kube_pod_container_status_waiting_reason{job="kube-state-metrics"}) > 0
@ -1103,8 +1073,7 @@ spec:
severity: warning
- alert: KubeDaemonSetNotScheduled
annotations:
message: '{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset
}} are not scheduled.'
message: '{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are not scheduled.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetnotscheduled
expr: |
kube_daemonset_status_desired_number_scheduled{job="kube-state-metrics"}
@ -1115,8 +1084,7 @@ spec:
severity: warning
- alert: KubeDaemonSetMisScheduled
annotations:
message: '{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset
}} are running where they are not supposed to run.'
message: '{{ $value }} Pods of DaemonSet {{ $labels.namespace }}/{{ $labels.daemonset }} are running where they are not supposed to run.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetmisscheduled
expr: |
kube_daemonset_status_number_misscheduled{job="kube-state-metrics"} > 0
@ -1125,8 +1093,7 @@ spec:
severity: warning
- alert: KubeCronJobRunning
annotations:
message: CronJob {{ $labels.namespace }}/{{ $labels.cronjob }} is taking more
than 1h to complete.
message: CronJob {{ $labels.namespace }}/{{ $labels.cronjob }} is taking more than 1h to complete.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecronjobrunning
expr: |
time() - kube_cronjob_next_schedule_time{job="kube-state-metrics"} > 3600
@ -1135,8 +1102,7 @@ spec:
severity: warning
- alert: KubeJobCompletion
annotations:
message: Job {{ $labels.namespace }}/{{ $labels.job_name }} is taking more
than one hour to complete.
message: Job {{ $labels.namespace }}/{{ $labels.job_name }} is taking more than one hour to complete.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubejobcompletion
expr: |
kube_job_spec_completions{job="kube-state-metrics"} - kube_job_status_succeeded{job="kube-state-metrics"} > 0
@ -1154,8 +1120,7 @@ spec:
severity: warning
- alert: KubeHpaReplicasMismatch
annotations:
message: HPA {{ $labels.namespace }}/{{ $labels.hpa }} has not matched the
desired number of replicas for longer than 15 minutes.
message: HPA {{ $labels.namespace }}/{{ $labels.hpa }} has not matched the desired number of replicas for longer than 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpareplicasmismatch
expr: |
(kube_hpa_status_desired_replicas{job="kube-state-metrics"}
@ -1168,8 +1133,7 @@ spec:
severity: warning
- alert: KubeHpaMaxedOut
annotations:
message: HPA {{ $labels.namespace }}/{{ $labels.hpa }} has been running at
max replicas for longer than 15 minutes.
message: HPA {{ $labels.namespace }}/{{ $labels.hpa }} has been running at max replicas for longer than 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubehpamaxedout
expr: |
kube_hpa_status_current_replicas{job="kube-state-metrics"}
@ -1182,8 +1146,7 @@ spec:
rules:
- alert: KubeCPUOvercommit
annotations:
message: Cluster has overcommitted CPU resource requests for Pods and cannot
tolerate node failure.
message: Cluster has overcommitted CPU resource requests for Pods and cannot tolerate node failure.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuovercommit
expr: |
sum(namespace:kube_pod_container_resource_requests_cpu_cores:sum{})
@ -1196,8 +1159,7 @@ spec:
severity: warning
- alert: KubeMemoryOvercommit
annotations:
message: Cluster has overcommitted memory resource requests for Pods and cannot
tolerate node failure.
message: Cluster has overcommitted memory resource requests for Pods and cannot tolerate node failure.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryovercommit
expr: |
sum(namespace:kube_pod_container_resource_requests_memory_bytes:sum{})
@ -1236,8 +1198,7 @@ spec:
severity: warning
- alert: KubeQuotaExceeded
annotations:
message: Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage
}} of its {{ $labels.resource }} quota.
message: Namespace {{ $labels.namespace }} is using {{ $value | humanizePercentage }} of its {{ $labels.resource }} quota.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubequotaexceeded
expr: |
kube_resourcequota{job="kube-state-metrics", type="used"}
@ -1249,9 +1210,7 @@ spec:
severity: warning
- alert: CPUThrottlingHigh
annotations:
message: '{{ $value | humanizePercentage }} throttling of CPU in namespace
{{ $labels.namespace }} for container {{ $labels.container }} in pod {{
$labels.pod }}.'
message: '{{ $value | humanizePercentage }} throttling of CPU in namespace {{ $labels.namespace }} for container {{ $labels.container }} in pod {{ $labels.pod }}.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh
expr: |
sum(increase(container_cpu_cfs_throttled_periods_total{container!="", }[5m])) by (container, pod, namespace)
@ -1265,9 +1224,7 @@ spec:
rules:
- alert: KubePersistentVolumeFillingUp
annotations:
message: The PersistentVolume claimed by {{ $labels.persistentvolumeclaim
}} in Namespace {{ $labels.namespace }} is only {{ $value | humanizePercentage
}} free.
message: The PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is only {{ $value | humanizePercentage }} free.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefillingup
expr: |
kubelet_volume_stats_available_bytes{job="kubelet", metrics_path="/metrics"}
@ -1279,9 +1236,7 @@ spec:
severity: critical
- alert: KubePersistentVolumeFillingUp
annotations:
message: Based on recent sampling, the PersistentVolume claimed by {{ $labels.persistentvolumeclaim
}} in Namespace {{ $labels.namespace }} is expected to fill up within four
days. Currently {{ $value | humanizePercentage }} is available.
message: Based on recent sampling, the PersistentVolume claimed by {{ $labels.persistentvolumeclaim }} in Namespace {{ $labels.namespace }} is expected to fill up within four days. Currently {{ $value | humanizePercentage }} is available.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumefillingup
expr: |
(
@ -1296,8 +1251,7 @@ spec:
severity: warning
- alert: KubePersistentVolumeErrors
annotations:
message: The persistent volume {{ $labels.persistentvolume }} has status {{
$labels.phase }}.
message: The persistent volume {{ $labels.persistentvolume }} has status {{ $labels.phase }}.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepersistentvolumeerrors
expr: |
kube_persistentvolume_status_phase{phase=~"Failed|Pending",job="kube-state-metrics"} > 0
@ -1308,8 +1262,7 @@ spec:
rules:
- alert: KubeVersionMismatch
annotations:
message: There are {{ $value }} different semantic versions of Kubernetes
components running.
message: There are {{ $value }} different semantic versions of Kubernetes components running.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeversionmismatch
expr: |
count(count by (gitVersion) (label_replace(kubernetes_build_info{job!~"kube-dns|coredns"},"gitVersion","$1","gitVersion","(v[0-9]*.[0-9]*.[0-9]*).*"))) > 1
@ -1318,8 +1271,7 @@ spec:
severity: warning
- alert: KubeClientErrors
annotations:
message: Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance
}}' is experiencing {{ $value | humanizePercentage }} errors.'
message: Kubernetes API server client '{{ $labels.job }}/{{ $labels.instance }}' is experiencing {{ $value | humanizePercentage }} errors.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclienterrors
expr: |
(sum(rate(rest_client_requests_total{code=~"5.."}[5m])) by (instance, job)
@ -1387,10 +1339,13 @@ spec:
rules:
- alert: KubeAPILatencyHigh
annotations:
message: The API server has an abnormal latency of {{ $value }} seconds for
{{ $labels.verb }} {{ $labels.resource }}.
message: The API server has an abnormal latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapilatencyhigh
expr: |
cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job="apiserver",quantile="0.99"}
>
1
and on (verb,resource)
(
cluster:apiserver_request_duration_seconds:mean5m{job="apiserver"}
>
@ -1402,18 +1357,12 @@ spec:
)
) > on (verb) group_left()
1.2 * avg by (verb) (cluster:apiserver_request_duration_seconds:mean5m{job="apiserver"} >= 0)
and on (verb,resource)
cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{job="apiserver",quantile="0.99"}
>
1
for: 5m
labels:
severity: warning
- alert: KubeAPIErrorsHigh
annotations:
message: API server is returning errors for {{ $value | humanizePercentage
}} of requests for {{ $labels.verb }} {{ $labels.resource }} {{ $labels.subresource
}}.
message: API server is returning errors for {{ $value | humanizePercentage }} of requests for {{ $labels.verb }} {{ $labels.resource }} {{ $labels.subresource }}.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeapierrorshigh
expr: |
sum(rate(apiserver_request_total{job="apiserver",code=~"5.."}[5m])) by (resource,subresource,verb)
@ -1424,8 +1373,7 @@ spec:
severity: warning
- alert: KubeClientCertificateExpiration
annotations:
message: A client certificate used to authenticate to the apiserver is expiring
in less than 7.0 days.
message: A client certificate used to authenticate to the apiserver is expiring in less than 7.0 days.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration
expr: |
apiserver_client_certificate_expiration_seconds_count{job="apiserver"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m]))) < 604800
@ -1433,8 +1381,7 @@ spec:
severity: warning
- alert: KubeClientCertificateExpiration
annotations:
message: A client certificate used to authenticate to the apiserver is expiring
in less than 24.0 hours.
message: A client certificate used to authenticate to the apiserver is expiring in less than 24.0 hours.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration
expr: |
apiserver_client_certificate_expiration_seconds_count{job="apiserver"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job="apiserver"}[5m]))) < 86400
@ -1442,10 +1389,7 @@ spec:
severity: critical
- alert: AggregatedAPIErrors
annotations:
message: An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has
reported errors. The number of errors have increased for it in the past
five minutes. High values indicate that the availability of the service
changes too often.
message: An aggregated API {{ $labels.name }}/{{ $labels.namespace }} has reported errors. The number of errors have increased for it in the past five minutes. High values indicate that the availability of the service changes too often.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapierrors
expr: |
sum by(name, namespace)(increase(aggregator_unavailable_apiservice_count[5m])) > 2
@ -1453,8 +1397,7 @@ spec:
severity: warning
- alert: AggregatedAPIDown
annotations:
message: An aggregated API {{ $labels.name }}/{{ $labels.namespace }} is down.
It has not been available at least for the past five minutes.
message: An aggregated API {{ $labels.name }}/{{ $labels.namespace }} is down. It has not been available at least for the past five minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-aggregatedapidown
expr: |
sum by(name, namespace)(sum_over_time(aggregator_unavailable_apiservice[5m])) > 0
@ -1491,8 +1434,7 @@ spec:
severity: warning
- alert: KubeletTooManyPods
annotations:
message: Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage
}} of its Pod capacity.
message: Kubelet '{{ $labels.node }}' is running at {{ $value | humanizePercentage }} of its Pod capacity.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubelettoomanypods
expr: |
max(max(kubelet_running_pod_count{job="kubelet", metrics_path="/metrics"}) by(instance) * on(instance) group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"}) by(node) / max(kube_node_status_capacity_pods{job="kube-state-metrics"} != 1) by(node) > 0.95
@ -1501,8 +1443,7 @@ spec:
severity: warning
- alert: KubeNodeReadinessFlapping
annotations:
message: The readiness status of node {{ $labels.node }} has changed {{ $value
}} times in the last 15 minutes.
message: The readiness status of node {{ $labels.node }} has changed {{ $value }} times in the last 15 minutes.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubenodereadinessflapping
expr: |
sum(changes(kube_node_status_condition{status="true",condition="Ready"}[15m])) by (node) > 2
@ -1511,8 +1452,7 @@ spec:
severity: warning
- alert: KubeletPlegDurationHigh
annotations:
message: The Kubelet Pod Lifecycle Event Generator has a 99th percentile duration
of {{ $value }} seconds on node {{ $labels.node }}.
message: The Kubelet Pod Lifecycle Event Generator has a 99th percentile duration of {{ $value }} seconds on node {{ $labels.node }}.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletplegdurationhigh
expr: |
node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile{quantile="0.99"} >= 10
@ -1521,8 +1461,7 @@ spec:
severity: warning
- alert: KubeletPodStartUpLatencyHigh
annotations:
message: Kubelet Pod startup 99th percentile latency is {{ $value }} seconds
on node {{ $labels.node }}.
message: Kubelet Pod startup 99th percentile latency is {{ $value }} seconds on node {{ $labels.node }}.
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletpodstartuplatencyhigh
expr: |
histogram_quantile(0.99, sum(rate(kubelet_pod_worker_duration_seconds_bucket{job="kubelet", metrics_path="/metrics"}[5m])) by (instance, le)) * on(instance) group_left(node) kubelet_node_name{job="kubelet", metrics_path="/metrics"} > 60
@ -1564,8 +1503,7 @@ spec:
rules:
- alert: PrometheusBadConfig
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has failed to
reload its configuration.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has failed to reload its configuration.
summary: Failed Prometheus configuration reload.
expr: |
# Without max_over_time, failed scrapes could create false negatives, see
@ -1576,10 +1514,8 @@ spec:
severity: critical
- alert: PrometheusNotificationQueueRunningFull
annotations:
description: Alert notification queue of Prometheus {{$labels.namespace}}/{{$labels.pod}}
is running full.
summary: Prometheus alert notification queue predicted to run full in less
than 30m.
description: Alert notification queue of Prometheus {{$labels.namespace}}/{{$labels.pod}} is running full.
summary: Prometheus alert notification queue predicted to run full in less than 30m.
expr: |
# Without min_over_time, failed scrapes could create false negatives, see
# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.
@ -1593,10 +1529,8 @@ spec:
severity: warning
- alert: PrometheusErrorSendingAlertsToSomeAlertmanagers
annotations:
description: '{{ printf "%.1f" $value }}% errors while sending alerts from
Prometheus {{$labels.namespace}}/{{$labels.pod}} to Alertmanager {{$labels.alertmanager}}.'
summary: Prometheus has encountered more than 1% errors sending alerts to
a specific Alertmanager.
description: '{{ printf "%.1f" $value }}% errors while sending alerts from Prometheus {{$labels.namespace}}/{{$labels.pod}} to Alertmanager {{$labels.alertmanager}}.'
summary: Prometheus has encountered more than 1% errors sending alerts to a specific Alertmanager.
expr: |
(
rate(prometheus_notifications_errors_total{job="prometheus-k8s",namespace="monitoring"}[5m])
@ -1610,8 +1544,7 @@ spec:
severity: warning
- alert: PrometheusErrorSendingAlertsToAnyAlertmanager
annotations:
description: '{{ printf "%.1f" $value }}% minimum errors while sending alerts
from Prometheus {{$labels.namespace}}/{{$labels.pod}} to any Alertmanager.'
description: '{{ printf "%.1f" $value }}% minimum errors while sending alerts from Prometheus {{$labels.namespace}}/{{$labels.pod}} to any Alertmanager.'
summary: Prometheus encounters more than 3% errors sending alerts to any Alertmanager.
expr: |
min without(alertmanager) (
@ -1626,8 +1559,7 @@ spec:
severity: critical
- alert: PrometheusNotConnectedToAlertmanagers
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} is not connected
to any Alertmanagers.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} is not connected to any Alertmanagers.
summary: Prometheus is not connected to any Alertmanagers.
expr: |
# Without max_over_time, failed scrapes could create false negatives, see
@ -1638,8 +1570,7 @@ spec:
severity: warning
- alert: PrometheusTSDBReloadsFailing
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has detected
{{$value | humanize}} reload failures over the last 3h.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has detected {{$value | humanize}} reload failures over the last 3h.
summary: Prometheus has issues reloading blocks from disk.
expr: |
increase(prometheus_tsdb_reloads_failures_total{job="prometheus-k8s",namespace="monitoring"}[3h]) > 0
@ -1648,8 +1579,7 @@ spec:
severity: warning
- alert: PrometheusTSDBCompactionsFailing
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has detected
{{$value | humanize}} compaction failures over the last 3h.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has detected {{$value | humanize}} compaction failures over the last 3h.
summary: Prometheus has issues compacting blocks.
expr: |
increase(prometheus_tsdb_compactions_failed_total{job="prometheus-k8s",namespace="monitoring"}[3h]) > 0
@ -1658,8 +1588,7 @@ spec:
severity: warning
- alert: PrometheusNotIngestingSamples
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} is not ingesting
samples.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} is not ingesting samples.
summary: Prometheus is not ingesting samples.
expr: |
rate(prometheus_tsdb_head_samples_appended_total{job="prometheus-k8s",namespace="monitoring"}[5m]) <= 0
@ -1668,9 +1597,7 @@ spec:
severity: warning
- alert: PrometheusDuplicateTimestamps
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} is dropping
{{ printf "%.4g" $value }} samples/s with different values but duplicated
timestamp.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} is dropping {{ printf "%.4g" $value }} samples/s with different values but duplicated timestamp.
summary: Prometheus is dropping samples with duplicate timestamps.
expr: |
rate(prometheus_target_scrapes_sample_duplicate_timestamp_total{job="prometheus-k8s",namespace="monitoring"}[5m]) > 0
@ -1679,8 +1606,7 @@ spec:
severity: warning
- alert: PrometheusOutOfOrderTimestamps
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} is dropping
{{ printf "%.4g" $value }} samples/s with timestamps arriving out of order.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} is dropping {{ printf "%.4g" $value }} samples/s with timestamps arriving out of order.
summary: Prometheus drops samples with out-of-order timestamps.
expr: |
rate(prometheus_target_scrapes_sample_out_of_order_total{job="prometheus-k8s",namespace="monitoring"}[5m]) > 0
@ -1689,9 +1615,7 @@ spec:
severity: warning
- alert: PrometheusRemoteStorageFailures
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} failed to send
{{ printf "%.1f" $value }}% of the samples to {{ $labels.remote_name}}:{{
$labels.url }}
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} failed to send {{ printf "%.1f" $value }}% of the samples to {{ $labels.remote_name}}:{{ $labels.url }}
summary: Prometheus fails to send samples to remote storage.
expr: |
(
@ -1710,9 +1634,7 @@ spec:
severity: critical
- alert: PrometheusRemoteWriteBehind
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} remote write
is {{ printf "%.1f" $value }}s behind for {{ $labels.remote_name}}:{{ $labels.url
}}.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} remote write is {{ printf "%.1f" $value }}s behind for {{ $labels.remote_name}}:{{ $labels.url }}.
summary: Prometheus remote write is behind.
expr: |
# Without max_over_time, failed scrapes could create false negatives, see
@ -1728,13 +1650,8 @@ spec:
severity: critical
- alert: PrometheusRemoteWriteDesiredShards
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} remote write
desired shards calculation wants to run {{ $value }} shards for queue {{
$labels.remote_name}}:{{ $labels.url }}, which is more than the max of {{
printf `prometheus_remote_storage_shards_max{instance="%s",job="prometheus-k8s",namespace="monitoring"}`
$labels.instance | query | first | value }}.
summary: Prometheus remote write desired shards calculation wants to run more
than configured max shards.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} remote write desired shards calculation wants to run {{ $value }} shards for queue {{ $labels.remote_name}}:{{ $labels.url }}, which is more than the max of {{ printf `prometheus_remote_storage_shards_max{instance="%s",job="prometheus-k8s",namespace="monitoring"}` $labels.instance | query | first | value }}.
summary: Prometheus remote write desired shards calculation wants to run more than configured max shards.
expr: |
# Without max_over_time, failed scrapes could create false negatives, see
# https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details.
@ -1748,8 +1665,7 @@ spec:
severity: warning
- alert: PrometheusRuleFailures
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has failed to
evaluate {{ printf "%.0f" $value }} rules in the last 5m.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has failed to evaluate {{ printf "%.0f" $value }} rules in the last 5m.
summary: Prometheus is failing rule evaluations.
expr: |
increase(prometheus_rule_evaluation_failures_total{job="prometheus-k8s",namespace="monitoring"}[5m]) > 0
@ -1758,8 +1674,7 @@ spec:
severity: critical
- alert: PrometheusMissingRuleEvaluations
annotations:
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has missed {{
printf "%.0f" $value }} rule group evaluations in the last 5m.
description: Prometheus {{$labels.namespace}}/{{$labels.pod}} has missed {{ printf "%.0f" $value }} rule group evaluations in the last 5m.
summary: Prometheus is missing rule evaluations due to slow rule group evaluation.
expr: |
increase(prometheus_rule_group_iterations_missed_total{job="prometheus-k8s",namespace="monitoring"}[5m]) > 0
@ -1770,8 +1685,7 @@ spec:
rules:
- alert: AlertmanagerConfigInconsistent
annotations:
message: The configuration of the instances of the Alertmanager cluster `{{$labels.service}}`
are out of sync.
message: The configuration of the instances of the Alertmanager cluster `{{$labels.service}}` are out of sync.
expr: |
count_values("config_hash", alertmanager_config_hash{job="alertmanager-main",namespace="monitoring"}) BY (service) / ON(service) GROUP_LEFT() label_replace(max(prometheus_operator_spec_replicas{job="prometheus-operator",namespace="monitoring",controller="alertmanager"}) by (name, job, namespace, controller), "service", "alertmanager-$1", "name", "(.*)") != 1
for: 5m
@ -1779,8 +1693,7 @@ spec:
severity: critical
- alert: AlertmanagerFailedReload
annotations:
message: Reloading Alertmanager's configuration has failed for {{ $labels.namespace
}}/{{ $labels.pod}}.
message: Reloading Alertmanager's configuration has failed for {{ $labels.namespace }}/{{ $labels.pod}}.
expr: |
alertmanager_config_last_reload_successful{job="alertmanager-main",namespace="monitoring"} == 0
for: 10m
@ -1800,10 +1713,8 @@ spec:
rules:
- alert: TargetDown
annotations:
message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service
}} targets in {{ $labels.namespace }} namespace are down.'
expr: 100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job,
namespace, service)) > 10
message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service }} targets in {{ $labels.namespace }} namespace are down.'
expr: 100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job, namespace, service)) > 10
for: 10m
labels:
severity: warning
@ -1822,8 +1733,7 @@ spec:
rules:
- alert: NodeNetworkInterfaceFlapping
annotations:
message: Network interface "{{ $labels.device }}" changing it's up status
often on node-exporter {{ $labels.namespace }}/{{ $labels.pod }}"
message: Network interface "{{ $labels.device }}" changing it's up status often on node-exporter {{ $labels.namespace }}/{{ $labels.pod }}"
expr: |
changes(node_network_up{job="node-exporter",device!~"veth.+"}[2m]) > 2
for: 2m
@ -1833,8 +1743,7 @@ spec:
rules:
- alert: PrometheusOperatorReconcileErrors
annotations:
message: Errors while reconciling {{ $labels.controller }} in {{ $labels.namespace
}} Namespace.
message: Errors while reconciling {{ $labels.controller }} in {{ $labels.namespace }} Namespace.
expr: |
rate(prometheus_operator_reconcile_errors_total{job="prometheus-operator",namespace="monitoring"}[5m]) > 0.1
for: 10m

View File

@ -1,291 +0,0 @@
{
"apiVersion": "apiextensions.k8s.io/v1",
"kind": "CustomResourceDefinition",
"metadata": {
"annotations": {
"controller-gen.kubebuilder.io/version": "v0.2.4"
},
"creationTimestamp": null,
"name": "podmonitors.monitoring.coreos.com"
},
"spec": {
"group": "monitoring.coreos.com",
"names": {
"kind": "PodMonitor",
"listKind": "PodMonitorList",
"plural": "podmonitors",
"singular": "podmonitor"
},
"scope": "Namespaced",
"versions": [
{
"name": "v1",
"schema": {
"openAPIV3Schema": {
"description": "PodMonitor defines monitoring for a set of pods.",
"properties": {
"apiVersion": {
"description": "APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources",
"type": "string"
},
"kind": {
"description": "Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds",
"type": "string"
},
"metadata": {
"type": "object"
},
"spec": {
"description": "Specification of desired Pod selection for target discovery by Prometheus.",
"properties": {
"jobLabel": {
"description": "The label to use to retrieve the job name from.",
"type": "string"
},
"namespaceSelector": {
"description": "Selector to select which namespaces the Endpoints objects are discovered from.",
"properties": {
"any": {
"description": "Boolean describing whether all namespaces are selected in contrast to a list restricting them.",
"type": "boolean"
},
"matchNames": {
"description": "List of namespace names.",
"items": {
"type": "string"
},
"type": "array"
}
},
"type": "object"
},
"podMetricsEndpoints": {
"description": "A list of endpoints allowed as part of this PodMonitor.",
"items": {
"description": "PodMetricsEndpoint defines a scrapeable endpoint of a Kubernetes Pod serving Prometheus metrics.",
"properties": {
"honorLabels": {
"description": "HonorLabels chooses the metric's labels on collisions with target labels.",
"type": "boolean"
},
"honorTimestamps": {
"description": "HonorTimestamps controls whether Prometheus respects the timestamps present in scraped data.",
"type": "boolean"
},
"interval": {
"description": "Interval at which metrics should be scraped",
"type": "string"
},
"metricRelabelings": {
"description": "MetricRelabelConfigs to apply to samples before ingestion.",
"items": {
"description": "RelabelConfig allows dynamic rewriting of the label set, being applied to samples before ingestion. It defines `<metric_relabel_configs>`-section of Prometheus configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs",
"properties": {
"action": {
"description": "Action to perform based on regex matching. Default is 'replace'",
"type": "string"
},
"modulus": {
"description": "Modulus to take of the hash of the source label values.",
"format": "int64",
"type": "integer"
},
"regex": {
"description": "Regular expression against which the extracted value is matched. Default is '(.*)'",
"type": "string"
},
"replacement": {
"description": "Replacement value against which a regex replace is performed if the regular expression matches. Regex capture groups are available. Default is '$1'",
"type": "string"
},
"separator": {
"description": "Separator placed between concatenated source label values. default is ';'.",
"type": "string"
},
"sourceLabels": {
"description": "The source labels select values from existing labels. Their content is concatenated using the configured separator and matched against the configured regular expression for the replace, keep, and drop actions.",
"items": {
"type": "string"
},
"type": "array"
},
"targetLabel": {
"description": "Label to which the resulting value is written in a replace action. It is mandatory for replace actions. Regex capture groups are available.",
"type": "string"
}
},
"type": "object"
},
"type": "array"
},
"params": {
"additionalProperties": {
"items": {
"type": "string"
},
"type": "array"
},
"description": "Optional HTTP URL parameters",
"type": "object"
},
"path": {
"description": "HTTP path to scrape for metrics.",
"type": "string"
},
"port": {
"description": "Name of the pod port this endpoint refers to. Mutually exclusive with targetPort.",
"type": "string"
},
"proxyUrl": {
"description": "ProxyURL eg http://proxyserver:2195 Directs scrapes to proxy through this endpoint.",
"type": "string"
},
"relabelings": {
"description": "RelabelConfigs to apply to samples before ingestion. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config",
"items": {
"description": "RelabelConfig allows dynamic rewriting of the label set, being applied to samples before ingestion. It defines `<metric_relabel_configs>`-section of Prometheus configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs",
"properties": {
"action": {
"description": "Action to perform based on regex matching. Default is 'replace'",
"type": "string"
},
"modulus": {
"description": "Modulus to take of the hash of the source label values.",
"format": "int64",
"type": "integer"
},
"regex": {
"description": "Regular expression against which the extracted value is matched. Default is '(.*)'",
"type": "string"
},
"replacement": {
"description": "Replacement value against which a regex replace is performed if the regular expression matches. Regex capture groups are available. Default is '$1'",
"type": "string"
},
"separator": {
"description": "Separator placed between concatenated source label values. default is ';'.",
"type": "string"
},
"sourceLabels": {
"description": "The source labels select values from existing labels. Their content is concatenated using the configured separator and matched against the configured regular expression for the replace, keep, and drop actions.",
"items": {
"type": "string"
},
"type": "array"
},
"targetLabel": {
"description": "Label to which the resulting value is written in a replace action. It is mandatory for replace actions. Regex capture groups are available.",
"type": "string"
}
},
"type": "object"
},
"type": "array"
},
"scheme": {
"description": "HTTP scheme to use for scraping.",
"type": "string"
},
"scrapeTimeout": {
"description": "Timeout after which the scrape is ended",
"type": "string"
},
"targetPort": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
],
"description": "Deprecated: Use 'port' instead.",
"x-kubernetes-int-or-string": true
}
},
"type": "object"
},
"type": "array"
},
"podTargetLabels": {
"description": "PodTargetLabels transfers labels on the Kubernetes Pod onto the target.",
"items": {
"type": "string"
},
"type": "array"
},
"sampleLimit": {
"description": "SampleLimit defines per-scrape limit on number of scraped samples that will be accepted.",
"format": "int64",
"type": "integer"
},
"selector": {
"description": "Selector to select Pod objects.",
"properties": {
"matchExpressions": {
"description": "matchExpressions is a list of label selector requirements. The requirements are ANDed.",
"items": {
"description": "A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.",
"properties": {
"key": {
"description": "key is the label key that the selector applies to.",
"type": "string"
},
"operator": {
"description": "operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.",
"type": "string"
},
"values": {
"description": "values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.",
"items": {
"type": "string"
},
"type": "array"
}
},
"required": [
"key",
"operator"
],
"type": "object"
},
"type": "array"
},
"matchLabels": {
"additionalProperties": {
"type": "string"
},
"description": "matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is \"key\", the operator is \"In\", and the values array contains only \"value\". The requirements are ANDed.",
"type": "object"
}
},
"type": "object"
}
},
"required": [
"podMetricsEndpoints",
"selector"
],
"type": "object"
}
},
"required": [
"spec"
],
"type": "object"
}
},
"served": true,
"storage": true
}
]
},
"status": {
"acceptedNames": {
"kind": "",
"plural": ""
},
"conditions": [ ],
"storedVersions": [ ]
}
}

View File

@ -20,31 +20,24 @@ spec:
description: PodMonitor defines monitoring for a set of pods.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Specification of desired Pod selection for target discovery
by Prometheus.
description: Specification of desired Pod selection for target discovery by Prometheus.
properties:
jobLabel:
description: The label to use to retrieve the job name from.
type: string
namespaceSelector:
description: Selector to select which namespaces the Endpoints objects
are discovered from.
description: Selector to select which namespaces the Endpoints objects are discovered from.
properties:
any:
description: Boolean describing whether all namespaces are selected
in contrast to a list restricting them.
description: Boolean describing whether all namespaces are selected in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
@ -55,63 +48,45 @@ spec:
podMetricsEndpoints:
description: A list of endpoints allowed as part of this PodMonitor.
items:
description: PodMetricsEndpoint defines a scrapeable endpoint of
a Kubernetes Pod serving Prometheus metrics.
description: PodMetricsEndpoint defines a scrapeable endpoint of a Kubernetes Pod serving Prometheus metrics.
properties:
honorLabels:
description: HonorLabels chooses the metric's labels on collisions
with target labels.
description: HonorLabels chooses the metric's labels on collisions with target labels.
type: boolean
honorTimestamps:
description: HonorTimestamps controls whether Prometheus respects
the timestamps present in scraped data.
description: HonorTimestamps controls whether Prometheus respects the timestamps present in scraped data.
type: boolean
interval:
description: Interval at which metrics should be scraped
type: string
metricRelabelings:
description: MetricRelabelConfigs to apply to samples before
ingestion.
description: MetricRelabelConfigs to apply to samples before ingestion.
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
description: 'RelabelConfig allows dynamic rewriting of the label set, being applied to samples before ingestion. It defines `<metric_relabel_configs>`-section of Prometheus configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
description: Action to perform based on regex matching. Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
description: Modulus to take of the hash of the source label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
description: Regular expression against which the extracted value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
description: Replacement value against which a regex replace is performed if the regular expression matches. Regex capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
description: Separator placed between concatenated source label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
description: The source labels select values from existing labels. Their content is concatenated using the configured separator and matched against the configured regular expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
description: Label to which the resulting value is written in a replace action. It is mandatory for replace actions. Regex capture groups are available.
type: string
type: object
type: array
@ -126,56 +101,39 @@ spec:
description: HTTP path to scrape for metrics.
type: string
port:
description: Name of the pod port this endpoint refers to. Mutually
exclusive with targetPort.
description: Name of the pod port this endpoint refers to. Mutually exclusive with targetPort.
type: string
proxyUrl:
description: ProxyURL eg http://proxyserver:2195 Directs scrapes
to proxy through this endpoint.
description: ProxyURL eg http://proxyserver:2195 Directs scrapes to proxy through this endpoint.
type: string
relabelings:
description: 'RelabelConfigs to apply to samples before ingestion.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
description: 'RelabelConfigs to apply to samples before ingestion. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
description: 'RelabelConfig allows dynamic rewriting of the label set, being applied to samples before ingestion. It defines `<metric_relabel_configs>`-section of Prometheus configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
description: Action to perform based on regex matching. Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
description: Modulus to take of the hash of the source label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
description: Regular expression against which the extracted value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
description: Replacement value against which a regex replace is performed if the regular expression matches. Regex capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
description: Separator placed between concatenated source label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
description: The source labels select values from existing labels. Their content is concatenated using the configured separator and matched against the configured regular expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
description: Label to which the resulting value is written in a replace action. It is mandatory for replace actions. Regex capture groups are available.
type: string
type: object
type: array
@ -194,42 +152,30 @@ spec:
type: object
type: array
podTargetLabels:
description: PodTargetLabels transfers labels on the Kubernetes Pod
onto the target.
description: PodTargetLabels transfers labels on the Kubernetes Pod onto the target.
items:
type: string
type: array
sampleLimit:
description: SampleLimit defines per-scrape limit on number of scraped
samples that will be accepted.
description: SampleLimit defines per-scrape limit on number of scraped samples that will be accepted.
format: int64
type: integer
selector:
description: Selector to select Pod objects.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
description: matchExpressions is a list of label selector requirements. The requirements are ANDed.
items:
description: A label selector requirement is a selector that
contains values, a key, and an operator that relates the key
and values.
description: A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
properties:
key:
description: key is the label key that the selector applies
to.
description: key is the label key that the selector applies to.
type: string
operator:
description: operator represents a key's relationship to
a set of values. Valid operators are In, NotIn, Exists
and DoesNotExist.
description: operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
type: string
values:
description: values is an array of string values. If the
operator is In or NotIn, the values array must be non-empty.
If the operator is Exists or DoesNotExist, the values
array must be empty. This array is replaced during a strategic
merge patch.
description: values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
items:
type: string
type: array
@ -241,11 +187,7 @@ spec:
matchLabels:
additionalProperties:
type: string
description: matchLabels is a map of {key,value} pairs. A single
{key,value} in the matchLabels map is equivalent to an element
of matchExpressions, whose key field is "key", the operator
is "In", and the values array contains only "value". The requirements
are ANDed.
description: matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
type: object
required:

View File

@ -1,131 +0,0 @@
{
"apiVersion": "apiextensions.k8s.io/v1",
"kind": "CustomResourceDefinition",
"metadata": {
"annotations": {
"controller-gen.kubebuilder.io/version": "v0.2.4"
},
"creationTimestamp": null,
"name": "prometheusrules.monitoring.coreos.com"
},
"spec": {
"group": "monitoring.coreos.com",
"names": {
"kind": "PrometheusRule",
"listKind": "PrometheusRuleList",
"plural": "prometheusrules",
"singular": "prometheusrule"
},
"scope": "Namespaced",
"versions": [
{
"name": "v1",
"schema": {
"openAPIV3Schema": {
"description": "PrometheusRule defines alerting rules for a Prometheus instance",
"properties": {
"apiVersion": {
"description": "APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources",
"type": "string"
},
"kind": {
"description": "Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds",
"type": "string"
},
"metadata": {
"type": "object"
},
"spec": {
"description": "Specification of desired alerting rule definitions for Prometheus.",
"properties": {
"groups": {
"description": "Content of Prometheus rule file",
"items": {
"description": "RuleGroup is a list of sequentially evaluated recording and alerting rules. Note: PartialResponseStrategy is only used by ThanosRuler and will be ignored by Prometheus instances. Valid values for this field are 'warn' or 'abort'. More info: https://github.com/thanos-io/thanos/blob/master/docs/components/rule.md#partial-response",
"properties": {
"interval": {
"type": "string"
},
"name": {
"type": "string"
},
"partial_response_strategy": {
"type": "string"
},
"rules": {
"items": {
"description": "Rule describes an alerting or recording rule.",
"properties": {
"alert": {
"type": "string"
},
"annotations": {
"additionalProperties": {
"type": "string"
},
"type": "object"
},
"expr": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
],
"x-kubernetes-int-or-string": true
},
"for": {
"type": "string"
},
"labels": {
"additionalProperties": {
"type": "string"
},
"type": "object"
},
"record": {
"type": "string"
}
},
"required": [
"expr"
],
"type": "object"
},
"type": "array"
}
},
"required": [
"name",
"rules"
],
"type": "object"
},
"type": "array"
}
},
"type": "object"
}
},
"required": [
"spec"
],
"type": "object"
}
},
"served": true,
"storage": true
}
]
},
"status": {
"acceptedNames": {
"kind": "",
"plural": ""
},
"conditions": [ ],
"storedVersions": [ ]
}
}

View File

@ -20,14 +20,10 @@ spec:
description: PrometheusRule defines alerting rules for a Prometheus instance
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
@ -37,10 +33,7 @@ spec:
groups:
description: Content of Prometheus rule file
items:
description: 'RuleGroup is a list of sequentially evaluated recording
and alerting rules. Note: PartialResponseStrategy is only used
by ThanosRuler and will be ignored by Prometheus instances. Valid
values for this field are ''warn'' or ''abort''. More info: https://github.com/thanos-io/thanos/blob/master/docs/components/rule.md#partial-response'
description: 'RuleGroup is a list of sequentially evaluated recording and alerting rules. Note: PartialResponseStrategy is only used by ThanosRuler and will be ignored by Prometheus instances. Valid values for this field are ''warn'' or ''abort''. More info: https://github.com/thanos-io/thanos/blob/master/docs/components/rule.md#partial-response'
properties:
interval:
type: string

View File

@ -1,514 +0,0 @@
{
"apiVersion": "apiextensions.k8s.io/v1",
"kind": "CustomResourceDefinition",
"metadata": {
"annotations": {
"controller-gen.kubebuilder.io/version": "v0.2.4"
},
"creationTimestamp": null,
"name": "servicemonitors.monitoring.coreos.com"
},
"spec": {
"group": "monitoring.coreos.com",
"names": {
"kind": "ServiceMonitor",
"listKind": "ServiceMonitorList",
"plural": "servicemonitors",
"singular": "servicemonitor"
},
"scope": "Namespaced",
"versions": [
{
"name": "v1",
"schema": {
"openAPIV3Schema": {
"description": "ServiceMonitor defines monitoring for a set of services.",
"properties": {
"apiVersion": {
"description": "APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources",
"type": "string"
},
"kind": {
"description": "Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds",
"type": "string"
},
"metadata": {
"type": "object"
},
"spec": {
"description": "Specification of desired Service selection for target discovery by Prometheus.",
"properties": {
"endpoints": {
"description": "A list of endpoints allowed as part of this ServiceMonitor.",
"items": {
"description": "Endpoint defines a scrapeable endpoint serving Prometheus metrics.",
"properties": {
"basicAuth": {
"description": "BasicAuth allow an endpoint to authenticate over basic authentication More info: https://prometheus.io/docs/operating/configuration/#endpoints",
"properties": {
"password": {
"description": "The secret in the service monitor namespace that contains the password for authentication.",
"properties": {
"key": {
"description": "The key of the secret to select from. Must be a valid secret key.",
"type": "string"
},
"name": {
"description": "Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?",
"type": "string"
},
"optional": {
"description": "Specify whether the Secret or its key must be defined",
"type": "boolean"
}
},
"required": [
"key"
],
"type": "object"
},
"username": {
"description": "The secret in the service monitor namespace that contains the username for authentication.",
"properties": {
"key": {
"description": "The key of the secret to select from. Must be a valid secret key.",
"type": "string"
},
"name": {
"description": "Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?",
"type": "string"
},
"optional": {
"description": "Specify whether the Secret or its key must be defined",
"type": "boolean"
}
},
"required": [
"key"
],
"type": "object"
}
},
"type": "object"
},
"bearerTokenFile": {
"description": "File to read bearer token for scraping targets.",
"type": "string"
},
"bearerTokenSecret": {
"description": "Secret to mount to read bearer token for scraping targets. The secret needs to be in the same namespace as the service monitor and accessible by the Prometheus Operator.",
"properties": {
"key": {
"description": "The key of the secret to select from. Must be a valid secret key.",
"type": "string"
},
"name": {
"description": "Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?",
"type": "string"
},
"optional": {
"description": "Specify whether the Secret or its key must be defined",
"type": "boolean"
}
},
"required": [
"key"
],
"type": "object"
},
"honorLabels": {
"description": "HonorLabels chooses the metric's labels on collisions with target labels.",
"type": "boolean"
},
"honorTimestamps": {
"description": "HonorTimestamps controls whether Prometheus respects the timestamps present in scraped data.",
"type": "boolean"
},
"interval": {
"description": "Interval at which metrics should be scraped",
"type": "string"
},
"metricRelabelings": {
"description": "MetricRelabelConfigs to apply to samples before ingestion.",
"items": {
"description": "RelabelConfig allows dynamic rewriting of the label set, being applied to samples before ingestion. It defines `<metric_relabel_configs>`-section of Prometheus configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs",
"properties": {
"action": {
"description": "Action to perform based on regex matching. Default is 'replace'",
"type": "string"
},
"modulus": {
"description": "Modulus to take of the hash of the source label values.",
"format": "int64",
"type": "integer"
},
"regex": {
"description": "Regular expression against which the extracted value is matched. Default is '(.*)'",
"type": "string"
},
"replacement": {
"description": "Replacement value against which a regex replace is performed if the regular expression matches. Regex capture groups are available. Default is '$1'",
"type": "string"
},
"separator": {
"description": "Separator placed between concatenated source label values. default is ';'.",
"type": "string"
},
"sourceLabels": {
"description": "The source labels select values from existing labels. Their content is concatenated using the configured separator and matched against the configured regular expression for the replace, keep, and drop actions.",
"items": {
"type": "string"
},
"type": "array"
},
"targetLabel": {
"description": "Label to which the resulting value is written in a replace action. It is mandatory for replace actions. Regex capture groups are available.",
"type": "string"
}
},
"type": "object"
},
"type": "array"
},
"params": {
"additionalProperties": {
"items": {
"type": "string"
},
"type": "array"
},
"description": "Optional HTTP URL parameters",
"type": "object"
},
"path": {
"description": "HTTP path to scrape for metrics.",
"type": "string"
},
"port": {
"description": "Name of the service port this endpoint refers to. Mutually exclusive with targetPort.",
"type": "string"
},
"proxyUrl": {
"description": "ProxyURL eg http://proxyserver:2195 Directs scrapes to proxy through this endpoint.",
"type": "string"
},
"relabelings": {
"description": "RelabelConfigs to apply to samples before scraping. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config",
"items": {
"description": "RelabelConfig allows dynamic rewriting of the label set, being applied to samples before ingestion. It defines `<metric_relabel_configs>`-section of Prometheus configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs",
"properties": {
"action": {
"description": "Action to perform based on regex matching. Default is 'replace'",
"type": "string"
},
"modulus": {
"description": "Modulus to take of the hash of the source label values.",
"format": "int64",
"type": "integer"
},
"regex": {
"description": "Regular expression against which the extracted value is matched. Default is '(.*)'",
"type": "string"
},
"replacement": {
"description": "Replacement value against which a regex replace is performed if the regular expression matches. Regex capture groups are available. Default is '$1'",
"type": "string"
},
"separator": {
"description": "Separator placed between concatenated source label values. default is ';'.",
"type": "string"
},
"sourceLabels": {
"description": "The source labels select values from existing labels. Their content is concatenated using the configured separator and matched against the configured regular expression for the replace, keep, and drop actions.",
"items": {
"type": "string"
},
"type": "array"
},
"targetLabel": {
"description": "Label to which the resulting value is written in a replace action. It is mandatory for replace actions. Regex capture groups are available.",
"type": "string"
}
},
"type": "object"
},
"type": "array"
},
"scheme": {
"description": "HTTP scheme to use for scraping.",
"type": "string"
},
"scrapeTimeout": {
"description": "Timeout after which the scrape is ended",
"type": "string"
},
"targetPort": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
],
"description": "Name or number of the pod port this endpoint refers to. Mutually exclusive with port.",
"x-kubernetes-int-or-string": true
},
"tlsConfig": {
"description": "TLS configuration to use when scraping the endpoint",
"properties": {
"ca": {
"description": "Stuct containing the CA cert to use for the targets.",
"properties": {
"configMap": {
"description": "ConfigMap containing data to use for the targets.",
"properties": {
"key": {
"description": "The key to select.",
"type": "string"
},
"name": {
"description": "Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?",
"type": "string"
},
"optional": {
"description": "Specify whether the ConfigMap or its key must be defined",
"type": "boolean"
}
},
"required": [
"key"
],
"type": "object"
},
"secret": {
"description": "Secret containing data to use for the targets.",
"properties": {
"key": {
"description": "The key of the secret to select from. Must be a valid secret key.",
"type": "string"
},
"name": {
"description": "Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?",
"type": "string"
},
"optional": {
"description": "Specify whether the Secret or its key must be defined",
"type": "boolean"
}
},
"required": [
"key"
],
"type": "object"
}
},
"type": "object"
},
"caFile": {
"description": "Path to the CA cert in the Prometheus container to use for the targets.",
"type": "string"
},
"cert": {
"description": "Struct containing the client cert file for the targets.",
"properties": {
"configMap": {
"description": "ConfigMap containing data to use for the targets.",
"properties": {
"key": {
"description": "The key to select.",
"type": "string"
},
"name": {
"description": "Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?",
"type": "string"
},
"optional": {
"description": "Specify whether the ConfigMap or its key must be defined",
"type": "boolean"
}
},
"required": [
"key"
],
"type": "object"
},
"secret": {
"description": "Secret containing data to use for the targets.",
"properties": {
"key": {
"description": "The key of the secret to select from. Must be a valid secret key.",
"type": "string"
},
"name": {
"description": "Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?",
"type": "string"
},
"optional": {
"description": "Specify whether the Secret or its key must be defined",
"type": "boolean"
}
},
"required": [
"key"
],
"type": "object"
}
},
"type": "object"
},
"certFile": {
"description": "Path to the client cert file in the Prometheus container for the targets.",
"type": "string"
},
"insecureSkipVerify": {
"description": "Disable target certificate validation.",
"type": "boolean"
},
"keyFile": {
"description": "Path to the client key file in the Prometheus container for the targets.",
"type": "string"
},
"keySecret": {
"description": "Secret containing the client key file for the targets.",
"properties": {
"key": {
"description": "The key of the secret to select from. Must be a valid secret key.",
"type": "string"
},
"name": {
"description": "Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?",
"type": "string"
},
"optional": {
"description": "Specify whether the Secret or its key must be defined",
"type": "boolean"
}
},
"required": [
"key"
],
"type": "object"
},
"serverName": {
"description": "Used to verify the hostname for the targets.",
"type": "string"
}
},
"type": "object"
}
},
"type": "object"
},
"type": "array"
},
"jobLabel": {
"description": "The label to use to retrieve the job name from.",
"type": "string"
},
"namespaceSelector": {
"description": "Selector to select which namespaces the Endpoints objects are discovered from.",
"properties": {
"any": {
"description": "Boolean describing whether all namespaces are selected in contrast to a list restricting them.",
"type": "boolean"
},
"matchNames": {
"description": "List of namespace names.",
"items": {
"type": "string"
},
"type": "array"
}
},
"type": "object"
},
"podTargetLabels": {
"description": "PodTargetLabels transfers labels on the Kubernetes Pod onto the target.",
"items": {
"type": "string"
},
"type": "array"
},
"sampleLimit": {
"description": "SampleLimit defines per-scrape limit on number of scraped samples that will be accepted.",
"format": "int64",
"type": "integer"
},
"selector": {
"description": "Selector to select Endpoints objects.",
"properties": {
"matchExpressions": {
"description": "matchExpressions is a list of label selector requirements. The requirements are ANDed.",
"items": {
"description": "A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.",
"properties": {
"key": {
"description": "key is the label key that the selector applies to.",
"type": "string"
},
"operator": {
"description": "operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.",
"type": "string"
},
"values": {
"description": "values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.",
"items": {
"type": "string"
},
"type": "array"
}
},
"required": [
"key",
"operator"
],
"type": "object"
},
"type": "array"
},
"matchLabels": {
"additionalProperties": {
"type": "string"
},
"description": "matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is \"key\", the operator is \"In\", and the values array contains only \"value\". The requirements are ANDed.",
"type": "object"
}
},
"type": "object"
},
"targetLabels": {
"description": "TargetLabels transfers labels on the Kubernetes Service onto the target.",
"items": {
"type": "string"
},
"type": "array"
}
},
"required": [
"endpoints",
"selector"
],
"type": "object"
}
},
"required": [
"spec"
],
"type": "object"
}
},
"served": true,
"storage": true
}
]
},
"status": {
"acceptedNames": {
"kind": "",
"plural": ""
},
"conditions": [ ],
"storedVersions": [ ]
}
}

View File

@ -20,65 +20,50 @@ spec:
description: ServiceMonitor defines monitoring for a set of services.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Specification of desired Service selection for target discovery
by Prometheus.
description: Specification of desired Service selection for target discovery by Prometheus.
properties:
endpoints:
description: A list of endpoints allowed as part of this ServiceMonitor.
items:
description: Endpoint defines a scrapeable endpoint serving Prometheus
metrics.
description: Endpoint defines a scrapeable endpoint serving Prometheus metrics.
properties:
basicAuth:
description: 'BasicAuth allow an endpoint to authenticate over
basic authentication More info: https://prometheus.io/docs/operating/configuration/#endpoints'
description: 'BasicAuth allow an endpoint to authenticate over basic authentication More info: https://prometheus.io/docs/operating/configuration/#endpoints'
properties:
password:
description: The secret in the service monitor namespace
that contains the password for authentication.
description: The secret in the service monitor namespace that contains the password for authentication.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
description: The key of the secret to select from. Must be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
description: Specify whether the Secret or its key must be defined
type: boolean
required:
- key
type: object
username:
description: The secret in the service monitor namespace
that contains the username for authentication.
description: The secret in the service monitor namespace that contains the username for authentication.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
description: The key of the secret to select from. Must be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
description: Specify whether the Secret or its key must be defined
type: boolean
required:
- key
@ -88,79 +73,57 @@ spec:
description: File to read bearer token for scraping targets.
type: string
bearerTokenSecret:
description: Secret to mount to read bearer token for scraping
targets. The secret needs to be in the same namespace as the
service monitor and accessible by the Prometheus Operator.
description: Secret to mount to read bearer token for scraping targets. The secret needs to be in the same namespace as the service monitor and accessible by the Prometheus Operator.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
description: The key of the secret to select from. Must be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
description: Specify whether the Secret or its key must be defined
type: boolean
required:
- key
type: object
honorLabels:
description: HonorLabels chooses the metric's labels on collisions
with target labels.
description: HonorLabels chooses the metric's labels on collisions with target labels.
type: boolean
honorTimestamps:
description: HonorTimestamps controls whether Prometheus respects
the timestamps present in scraped data.
description: HonorTimestamps controls whether Prometheus respects the timestamps present in scraped data.
type: boolean
interval:
description: Interval at which metrics should be scraped
type: string
metricRelabelings:
description: MetricRelabelConfigs to apply to samples before
ingestion.
description: MetricRelabelConfigs to apply to samples before ingestion.
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
description: 'RelabelConfig allows dynamic rewriting of the label set, being applied to samples before ingestion. It defines `<metric_relabel_configs>`-section of Prometheus configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
description: Action to perform based on regex matching. Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
description: Modulus to take of the hash of the source label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
description: Regular expression against which the extracted value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
description: Replacement value against which a regex replace is performed if the regular expression matches. Regex capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
description: Separator placed between concatenated source label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
description: The source labels select values from existing labels. Their content is concatenated using the configured separator and matched against the configured regular expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
description: Label to which the resulting value is written in a replace action. It is mandatory for replace actions. Regex capture groups are available.
type: string
type: object
type: array
@ -175,56 +138,39 @@ spec:
description: HTTP path to scrape for metrics.
type: string
port:
description: Name of the service port this endpoint refers to.
Mutually exclusive with targetPort.
description: Name of the service port this endpoint refers to. Mutually exclusive with targetPort.
type: string
proxyUrl:
description: ProxyURL eg http://proxyserver:2195 Directs scrapes
to proxy through this endpoint.
description: ProxyURL eg http://proxyserver:2195 Directs scrapes to proxy through this endpoint.
type: string
relabelings:
description: 'RelabelConfigs to apply to samples before scraping.
More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
description: 'RelabelConfigs to apply to samples before scraping. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config'
items:
description: 'RelabelConfig allows dynamic rewriting of the
label set, being applied to samples before ingestion. It
defines `<metric_relabel_configs>`-section of Prometheus
configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
description: 'RelabelConfig allows dynamic rewriting of the label set, being applied to samples before ingestion. It defines `<metric_relabel_configs>`-section of Prometheus configuration. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs'
properties:
action:
description: Action to perform based on regex matching.
Default is 'replace'
description: Action to perform based on regex matching. Default is 'replace'
type: string
modulus:
description: Modulus to take of the hash of the source
label values.
description: Modulus to take of the hash of the source label values.
format: int64
type: integer
regex:
description: Regular expression against which the extracted
value is matched. Default is '(.*)'
description: Regular expression against which the extracted value is matched. Default is '(.*)'
type: string
replacement:
description: Replacement value against which a regex replace
is performed if the regular expression matches. Regex
capture groups are available. Default is '$1'
description: Replacement value against which a regex replace is performed if the regular expression matches. Regex capture groups are available. Default is '$1'
type: string
separator:
description: Separator placed between concatenated source
label values. default is ';'.
description: Separator placed between concatenated source label values. default is ';'.
type: string
sourceLabels:
description: The source labels select values from existing
labels. Their content is concatenated using the configured
separator and matched against the configured regular
expression for the replace, keep, and drop actions.
description: The source labels select values from existing labels. Their content is concatenated using the configured separator and matched against the configured regular expression for the replace, keep, and drop actions.
items:
type: string
type: array
targetLabel:
description: Label to which the resulting value is written
in a replace action. It is mandatory for replace actions.
Regex capture groups are available.
description: Label to which the resulting value is written in a replace action. It is mandatory for replace actions. Regex capture groups are available.
type: string
type: object
type: array
@ -238,31 +184,25 @@ spec:
anyOf:
- type: integer
- type: string
description: Name or number of the pod port this endpoint refers
to. Mutually exclusive with port.
description: Name or number of the pod port this endpoint refers to. Mutually exclusive with port.
x-kubernetes-int-or-string: true
tlsConfig:
description: TLS configuration to use when scraping the endpoint
properties:
ca:
description: Stuct containing the CA cert to use for the
targets.
description: Stuct containing the CA cert to use for the targets.
properties:
configMap:
description: ConfigMap containing data to use for the
targets.
description: ConfigMap containing data to use for the targets.
properties:
key:
description: The key to select.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the ConfigMap or its
key must be defined
description: Specify whether the ConfigMap or its key must be defined
type: boolean
required:
- key
@ -271,45 +211,35 @@ spec:
description: Secret containing data to use for the targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
description: The key of the secret to select from. Must be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key
must be defined
description: Specify whether the Secret or its key must be defined
type: boolean
required:
- key
type: object
type: object
caFile:
description: Path to the CA cert in the Prometheus container
to use for the targets.
description: Path to the CA cert in the Prometheus container to use for the targets.
type: string
cert:
description: Struct containing the client cert file for
the targets.
description: Struct containing the client cert file for the targets.
properties:
configMap:
description: ConfigMap containing data to use for the
targets.
description: ConfigMap containing data to use for the targets.
properties:
key:
description: The key to select.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the ConfigMap or its
key must be defined
description: Specify whether the ConfigMap or its key must be defined
type: boolean
required:
- key
@ -318,48 +248,38 @@ spec:
description: Secret containing data to use for the targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
description: The key of the secret to select from. Must be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind,
uid?'
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key
must be defined
description: Specify whether the Secret or its key must be defined
type: boolean
required:
- key
type: object
type: object
certFile:
description: Path to the client cert file in the Prometheus
container for the targets.
description: Path to the client cert file in the Prometheus container for the targets.
type: string
insecureSkipVerify:
description: Disable target certificate validation.
type: boolean
keyFile:
description: Path to the client key file in the Prometheus
container for the targets.
description: Path to the client key file in the Prometheus container for the targets.
type: string
keySecret:
description: Secret containing the client key file for the
targets.
description: Secret containing the client key file for the targets.
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
description: The key of the secret to select from. Must be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
description: Specify whether the Secret or its key must be defined
type: boolean
required:
- key
@ -374,12 +294,10 @@ spec:
description: The label to use to retrieve the job name from.
type: string
namespaceSelector:
description: Selector to select which namespaces the Endpoints objects
are discovered from.
description: Selector to select which namespaces the Endpoints objects are discovered from.
properties:
any:
description: Boolean describing whether all namespaces are selected
in contrast to a list restricting them.
description: Boolean describing whether all namespaces are selected in contrast to a list restricting them.
type: boolean
matchNames:
description: List of namespace names.
@ -388,42 +306,30 @@ spec:
type: array
type: object
podTargetLabels:
description: PodTargetLabels transfers labels on the Kubernetes Pod
onto the target.
description: PodTargetLabels transfers labels on the Kubernetes Pod onto the target.
items:
type: string
type: array
sampleLimit:
description: SampleLimit defines per-scrape limit on number of scraped
samples that will be accepted.
description: SampleLimit defines per-scrape limit on number of scraped samples that will be accepted.
format: int64
type: integer
selector:
description: Selector to select Endpoints objects.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
description: matchExpressions is a list of label selector requirements. The requirements are ANDed.
items:
description: A label selector requirement is a selector that
contains values, a key, and an operator that relates the key
and values.
description: A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.
properties:
key:
description: key is the label key that the selector applies
to.
description: key is the label key that the selector applies to.
type: string
operator:
description: operator represents a key's relationship to
a set of values. Valid operators are In, NotIn, Exists
and DoesNotExist.
description: operator represents a key's relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.
type: string
values:
description: values is an array of string values. If the
operator is In or NotIn, the values array must be non-empty.
If the operator is Exists or DoesNotExist, the values
array must be empty. This array is replaced during a strategic
merge patch.
description: values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.
items:
type: string
type: array
@ -435,16 +341,11 @@ spec:
matchLabels:
additionalProperties:
type: string
description: matchLabels is a map of {key,value} pairs. A single
{key,value} in the matchLabels map is equivalent to an element
of matchExpressions, whose key field is "key", the operator
is "In", and the values array contains only "value". The requirements
are ANDed.
description: matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
type: object
targetLabels:
description: TargetLabels transfers labels on the Kubernetes Service
onto the target.
description: TargetLabels transfers labels on the Kubernetes Service onto the target.
items:
type: string
type: array

View File

@ -4,7 +4,7 @@ metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.39.0
app.kubernetes.io/version: v0.40.0
name: prometheus-operator
rules:
- apiGroups:

View File

@ -4,7 +4,7 @@ metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.39.0
app.kubernetes.io/version: v0.40.0
name: prometheus-operator
roleRef:
apiGroup: rbac.authorization.k8s.io

View File

@ -4,7 +4,7 @@ metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.39.0
app.kubernetes.io/version: v0.40.0
name: prometheus-operator
namespace: monitoring
spec:
@ -18,15 +18,15 @@ spec:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.39.0
app.kubernetes.io/version: v0.40.0
spec:
containers:
- args:
- --kubelet-service=kube-system/kubelet
- --logtostderr=true
- --config-reloader-image=carlosedp/configmap-reload:latest
- --prometheus-config-reloader=carlosedp/prometheus-config-reloader:v0.39.0
image: carlosedp/prometheus-operator:v0.39.0
- --prometheus-config-reloader=carlosedp/prometheus-config-reloader:v0.40.0
image: carlosedp/prometheus-operator:v0.40.0
name: prometheus-operator
ports:
- containerPort: 8080

View File

@ -4,7 +4,7 @@ metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.39.0
app.kubernetes.io/version: v0.40.0
name: prometheus-operator
namespace: monitoring
spec:

View File

@ -4,6 +4,6 @@ metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/name: prometheus-operator
app.kubernetes.io/version: v0.39.0
app.kubernetes.io/version: v0.40.0
name: prometheus-operator
namespace: monitoring

View File

@ -49,6 +49,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
'/bin/rpi_exporter',
'--web.listen-address=127.0.0.1:9243',
]) +
container.mixin.securityContext.withPrivileged(true) +
container.mixin.resources.withRequests({ cpu: '50m', memory: '50Mi' }) +
container.mixin.resources.withLimits({ cpu: '100m', memory: '100Mi' });

View File

@ -1,6 +1,8 @@
local utils = import '../utils.libsonnet';
local vars = import '../vars.jsonnet';
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local service = k.core.v1.service;
local servicePort = k.core.v1.service.mixin.spec.portsType;
{
prometheus+:: {
@ -9,5 +11,17 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
kubeSchedulerPrometheusDiscoveryEndpoints:
utils.newEndpoint('kube-scheduler-prometheus-discovery', 'kube-system', vars.k3s.master_ip, 'http-metrics', 10251),
kubeControllerManagerPrometheusDiscoveryService:
service.new('kube-controller-manager-prometheus-discovery', {}, servicePort.newNamed('http-metrics', 10252, 10252)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-controller-manager' }) +
service.mixin.spec.withClusterIp('None'),
kubeSchedulerPrometheusDiscoveryService:
service.new('kube-scheduler-prometheus-discovery', {}, servicePort.newNamed('http-metrics', 10251, 10251)) +
service.mixin.metadata.withNamespace('kube-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'kube-scheduler' }) +
service.mixin.spec.withClusterIp('None'),
},
}

View File

@ -24,5 +24,20 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
service.mixin.metadata.withNamespace('metallb-system') +
service.mixin.metadata.withLabels({ 'k8s-app': 'metallb-controller' }) +
service.mixin.spec.withClusterIp('None'),
clusterRole:
utils.newClusterRole('metallb-exporter', [
{
apis: [''],
res: ['services', 'endpoints', 'pods'],
verbs: ['get', 'list', 'watch'],
},
], null),
clusterRoleBinding:
utils.newClusterRoleBinding('metallb-exporter', 'prometheus-k8s', $._config.namespace, 'metallb-exporter', null),
},
}

44
modules/nginx.jsonnet Normal file
View File

@ -0,0 +1,44 @@
local utils = import '../utils.libsonnet';
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
{
_config+:: {
namespace: 'monitoring',
grafanaDashboards+:: {
'nginx-dashboard.json': (import '../grafana-dashboards/nginx-dashboard.json'),
},
},
nginxExporter+:: {
serviceMonitor:
utils.newServiceMonitor('nginx', $._config.namespace, { 'app.kubernetes.io/name': 'ingress-nginx' }, 'ingress-nginx', 'prometheus', 'http'),
service:
local service = k.core.v1.service;
local servicePort = k.core.v1.service.mixin.spec.portsType;
local nginxPort = servicePort.newNamed('prometheus', 10254, 10254);
service.new('ingress-nginx-metrics', {'app.kubernetes.io/name': 'ingress-nginx'}, nginxPort) +
service.mixin.metadata.withNamespace('ingress-nginx') +
service.mixin.metadata.withLabels({'app.kubernetes.io/name': 'ingress-nginx'}) +
service.mixin.spec.withClusterIp('None'),
clusterRole:
utils.newClusterRole('nginx-exporter', [
{
apis: [''],
res: ['services', 'endpoints', 'pods'],
verbs: ['get', 'list', 'watch'],
},
], null),
serviceAccount:
utils.newServiceAccount('nginx-exporter', $._config.namespace, null),
clusterRoleBinding:
utils.newClusterRoleBinding('nginx-exporter', 'nginx-exporter', $._config.namespace, 'nginx-exporter', null),
},
}

View File

@ -5,15 +5,15 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
namespace: 'monitoring',
versions+:: {
smtpServer: 'v1.0.1',
smtpRelay: 'v1.0.1',
},
imageRepos+:: {
smtpServer: 'carlosedp/docker-smtp',
smtpRelay: 'carlosedp/docker-smtp',
},
},
smtpServer+:: {
smtpRelay+:: {
deployment:
local deployment = k.apps.v1.deployment;
local container = k.apps.v1.deployment.mixin.spec.template.spec.containersType;
@ -21,8 +21,8 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
local podLabels = { run: 'smtp-server' };
local smtpServer =
container.new('smtp-server', $._config.imageRepos.smtpServer + ':' + $._config.versions.smtpServer) +
local smtpRelay =
container.new('smtp-server', $._config.imageRepos.smtpRelay + ':' + $._config.versions.smtpRelay) +
container.withPorts(containerPort.newNamed(25, 'smtp')) +
container.withEnv([
{
@ -44,7 +44,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
{ name: 'RELAY_DOMAINS', value: ':192.168.0.0/24:10.0.0.0/16' },
]);
local c = [smtpServer];
local c = [smtpRelay];
deployment.new('smtp-server', 1, c, podLabels) +
deployment.mixin.metadata.withNamespace($._config.namespace) +
@ -54,9 +54,9 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
service:
local service = k.core.v1.service;
local servicePort = k.core.v1.service.mixin.spec.portsType;
local smtpServerPorts = servicePort.newNamed('smtp', 25, 'smtp');
local smtpRelayPorts = servicePort.newNamed('smtp', 25, 'smtp');
service.new('smtp-server', $.smtpServer.deployment.spec.selector.matchLabels, smtpServerPorts) +
service.new('smtp-server', $.smtpRelay.deployment.spec.selector.matchLabels, smtpRelayPorts) +
service.mixin.metadata.withNamespace($._config.namespace) +
service.mixin.metadata.withLabels({ run: 'smtp-server' }),
},

View File

@ -0,0 +1,51 @@
local utils = import '../utils.libsonnet';
local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
{
_config+:: {
namespace: 'monitoring',
replicas: 1,
imageRepos+:: {
speedtestExporter: 'ghcr.io/miguelndecarvalho/speedtest-exporter',
},
// Add custom dashboards
grafanaDashboards+:: {
'speedtest-exporter-dashboard.json': (import '../grafana-dashboards/speedtest-exporter-dashboard.json'),
},
},
speedtestExporter+:: {
deployment:
local deployment = k.apps.v1.deployment;
local container = k.apps.v1.deployment.mixin.spec.template.spec.containersType;
local containerPort = container.portsType;
local podLabels = { 'k8s-app': 'speedtest-exporter' };
local speedtestExporter =
container.new('speedtest-exporter',
$._config.imageRepos.speedtestExporter) +
container.withPorts(containerPort.newNamed(9798, 'metrics'));
local c = [speedtestExporter];
deployment.new('speedtest-exporter', $._config.replicas, c, podLabels) +
deployment.mixin.metadata.withNamespace($._config.namespace) +
deployment.mixin.metadata.withLabels(podLabels) +
deployment.mixin.spec.selector.withMatchLabels(podLabels) +
deployment.mixin.spec.template.spec.withRestartPolicy('Always'),
service:
local service = k.core.v1.service;
local servicePort = k.core.v1.service.mixin.spec.portsType;
local speedtestExporterPorts = servicePort.newNamed('metrics', 9798, 'metrics');
service.new('speedtest-exporter', $.speedtestExporter.deployment.spec.selector.matchLabels, speedtestExporterPorts) +
service.mixin.metadata.withNamespace($._config.namespace) +
service.mixin.metadata.withLabels({ 'k8s-app': 'speedtest-exporter' }),
serviceMonitor:
utils.newServiceMonitor('speedtest-exporter', $._config.namespace, { 'k8s-app': 'speedtest-exporter' }, $._config.namespace, 'metrics', 'http', 'metrics', '30m', '2m'),
},
}

View File

@ -0,0 +1,14 @@
apiVersion: v1
kind: PersistentVolume
metadata:
name: grafana
labels:
type: local
spec:
capacity:
storage: 1Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
hostPath:
path: "/data/k3s-storage/grafana/"

View File

@ -0,0 +1,14 @@
apiVersion: v1
kind: PersistentVolume
metadata:
name: prometheus
labels:
type: local
spec:
capacity:
storage: 2Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
hostPath:
path: "/data/k3s-storage/prometheus/"

View File

@ -20,4 +20,10 @@ rm -rf manifests
mkdir -p manifests/setup
# optional, but we would like to generate yaml, not json
$JSONNET_BIN -J vendor -m manifests "${1-example.jsonnet}" | xargs -I{} sh -c 'cat {} | $(go env GOPATH)/bin/gojsontoyaml > {}.yaml; rm -f {}' -- {}
$JSONNET_BIN -J vendor -m manifests "${1-example.jsonnet}" | while IFS= read -r file; do
"$(go env GOPATH)/bin/gojsontoyaml" <"$file" >"$file.yaml"
rm -f "$file"
done
# Clean-up json files from manifests dir
find manifests -type f ! -name '*.yaml' -delete

View File

@ -9,9 +9,9 @@ REPO=carlosedp
export AOR_VERSION=2.3
export KSM_VERSION=v1.9.6
export PROM_OP_VERSION=v0.39.0
export PROM_OP_VERSION=v0.40.0
export KUBE_RBAC_VERSION=v0.5.0
export PROM_CONFIG_RELOADER_VERSION=v0.39.0
export PROM_CONFIG_RELOADER_VERSION=v0.40.0
export CONFIGMAP_RELOAD_VERSION=latest
#-------------------------------------------------------------------------------
# Kubernetes addon-resizer

View File

@ -91,38 +91,53 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
),
// Creates ingress objects
newIngress(name, namespace, host, path, serviceName, servicePort):: (
local ingress = k.extensions.v1beta1.ingress;
local ingressTls = ingress.mixin.spec.tlsType;
local ingressRule = ingress.mixin.spec.rulesType;
local httpIngressPath = ingressRule.mixin.http.pathsType;
newIngress(name, namespace, hosts, path, serviceName, servicePort):: (
{
apiVersion: 'networking.k8s.io/v1',
kind: 'Ingress',
metadata: {
name: name,
namespace: namespace,
},
spec: {
rules: [$.newIngressHost(host, path, serviceName, servicePort) for host in hosts],
},
}
),
ingress.new()
+ ingress.mixin.metadata.withName(name)
+ ingress.mixin.metadata.withNamespace(namespace)
+ ingress.mixin.spec.withRules(
ingressRule.new()
+ ingressRule.withHost(host)
+ ingressRule.mixin.http.withPaths(
httpIngressPath.new()
+ httpIngressPath.withPath(path)
+ httpIngressPath.mixin.backend.withServiceName(serviceName)
+ httpIngressPath.mixin.backend.withServicePort(servicePort)
),
)
// Add host to Ingress resource
newIngressHost(host, path, serviceName, servicePort):: (
{
host: host,
http: {
paths: [
{
backend: {
service: {
name: serviceName,
port: {
name: servicePort,
},
},
},
path: path,
pathType: 'Prefix',
},
],
},
}
),
// Add TLS to Ingress resource with secret containing the certificates if exists
addIngressTLS(I, S=''):: (
local ingress = k.extensions.v1beta1.ingress;
addIngressTLS(I, hosts, secretName=''):: (
local ingress = k.networking.v1beta1.ingress;
local ingressTls = ingress.mixin.spec.tlsType;
local host = I.spec.rules[0].host;
local namespace = I.metadata.namespace;
I + ingress.mixin.spec.withTls(
ingressTls.new() +
ingressTls.withHosts(host) +
(if S != '' then { secretName: S } else {})
ingressTls.withHosts(hosts) +
(if secretName != '' then { secretName: secretName } else {})
)
),
@ -130,7 +145,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
newTLSSecret(name, namespace, crt, key):: (
local secret = k.core.v1.secret;
secret.new('ingress-secret') +
secret.new(name) +
secret.mixin.metadata.withNamespace(namespace) +
secret.withType('kubernetes.io/tls') +
secret.withData(
@ -175,7 +190,7 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
),
// Creates http ServiceMonitor objects
newServiceMonitor(name, namespace, matchLabel, matchNamespace, portName, portScheme, path='metrics'):: (
newServiceMonitor(name, namespace, matchLabel, matchNamespace, portName, portScheme, path='metrics', interval='30s', timeout='30s'):: (
{
apiVersion: 'monitoring.coreos.com/v1',
kind: 'ServiceMonitor',
@ -195,7 +210,8 @@ local k = import 'ksonnet/ksonnet.beta.4/k.libsonnet';
{
port: portName,
scheme: portScheme,
interval: '30s',
interval: interval,
scrapeTimeout: timeout,
relabelings: [
{
action: 'replace',

View File

@ -25,6 +25,11 @@
enabled: false,
file: import 'modules/metallb.jsonnet',
},
{
name: 'nginxExporter',
enabled: false,
file: import 'modules/nginx.jsonnet',
},
{
name: 'traefikExporter',
enabled: false,
@ -35,15 +40,23 @@
enabled: false,
file: import 'modules/elasticsearch_exporter.jsonnet',
},
{
name: 'speedtestExporter',
enabled: false,
file: import 'modules/speedtest_exporter.jsonnet',
},
],
k3s: {
enabled: false,
master_ip: ['192.168.15.15'],
master_ip: ['192.168.1.15'],
},
// Domain suffix for the ingresses
suffixDomain: '192.168.15.15.nip.io',
suffixDomain: '192.168.1.15.nip.io',
// Additional domain suffixes for the ingresses.
// For example suffixDomain could be an external one and this a local domain.
additionalDomains: [],
// If TLSingress is true, a self-signed HTTPS ingress with redirect will be created
TLSingress: true,
// If UseProvidedCerts is true, provided files will be used on created HTTPS ingresses.
@ -52,14 +65,34 @@
TLSCertificate: importstr 'server.crt',
TLSKey: importstr 'server.key',
// Setting these to false, defaults to emptyDirs
// Persistent volume configuration
enablePersistence: {
// Setting these to false, defaults to emptyDirs.
prometheus: false,
grafana: false,
// If using a pre-created PV, fill in the names below. If blank, they will use the default StorageClass
prometheusPV: '',
grafanaPV: '',
// If required to use a specific storageClass, keep the PV names above blank and fill the storageClass name below.
storageClass: '',
// Define the PV sizes below
prometheusSizePV: '2Gi',
grafanaSizePV: '20Gi',
},
// Grafana "from" email
// Configuration for Prometheus deployment
prometheus: {
retention: '15d',
scrapeInterval: '30s',
scrapeTimeout: '30s',
},
grafana: {
// Grafana "from" email
from_address: 'myemail@gmail.com',
// Plugins to be installed at runtime.
//Ex. plugins: ['grafana-piechart-panel', 'grafana-clock-panel'],
plugins: [],
//Ex. env: [ { name: 'http_proxy', value: 'host:8080' } ]
env: []
},
}