---
title: Set up prerequisites
description: >
  InfluxDB Clustered requires prerequisites such as a Kubernetes cluster, object storage,
  a PostgreSQL-compatible database, and more. Learn how to set up and configure the
  necessary prerequisites.
menu:
  influxdb3_clustered:
    name: Prerequisites
    parent: Set up your cluster
weight: 201
aliases:
  - /influxdb3/clustered/install/prerequisites/
---

InfluxDB Clustered requires the following prerequisite external dependencies:

- **kubectl command line tool**
- **Kubernetes cluster**
- **kubecfg kubit operator**
- **Kubernetes ingress controller**
- **Object storage**: AWS S3 or S3-compatible storage (including Google Cloud Storage
  or Azure Blob Storage) to store the InfluxDB Parquet files.
- **PostgreSQL-compatible database** _(AWS Aurora, hosted PostgreSQL, etc.)_:
  Stores the [InfluxDB Catalog](/influxdb3/clustered/reference/internals/storage-engine/#catalog).
- **Local or attached storage**:
  Stores the Write-Ahead Log (WAL) for
  [InfluxDB Ingesters](/influxdb3/clustered/reference/internals/storage-engine/#ingester).

The following walks through preparing these prerequisites.

- [Install kubectl](#install-kubectl)
- [Set up your Kubernetes cluster](#set-up-your-kubernetes-cluster)
- [Install the kubecfg kubit operator](#install-the-kubecfg-kubit-operator)
- [Set up a Kubernetes ingress controller](#set-up-a-kubernetes-ingress-controller)
- [Set up your object store](#set-up-your-object-store)
- [Set up your PostgreSQL-compatible database](#set-up-your-postgresql-compatible-database)
- [Set up local or attached storage](#set-up-local-or-attached-storage)

## Install kubectl

Kubernetes provides the `kubectl` command line tool for communicating with a
Kubernetes cluster's control plane. `kubectl` is used to manage your InfluxDB
cluster.

Follow instructions to install `kubectl` on your local machine:

{{% note %}}
InfluxDB Clustered Kubernetes deployments require `kubectl` 1.27 or higher.
{{% /note %}}

- [Install kubectl on Linux](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/)
- [Install kubectl on macOS](https://kubernetes.io/docs/tasks/tools/install-kubectl-macos/)
- [Install kubectl on Windows](https://kubernetes.io/docs/tasks/tools/install-kubectl-windows/)
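
After installing, you can verify that the installed client meets the minimum
version requirement:

<!-- pytest.mark.skip -->

```bash
# Print the installed kubectl client version
kubectl version --client
```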

## Set up your Kubernetes cluster

1. Deploy a Kubernetes cluster. The deployment process depends on your
   Kubernetes environment or Kubernetes cloud provider. Refer to the
   [Kubernetes documentation](https://kubernetes.io/docs/home/) or your cloud
   provider's documentation for information about deploying a Kubernetes cluster.

2. Ensure `kubectl` can connect to your Kubernetes cluster
   (see the verification example after these steps).
   Your [kubeconfig file](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/)
   defines cluster connection credentials.

3. Create two namespaces: `influxdb` and `kubit`. Use
   [`kubectl create namespace`](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_create/kubectl_create_namespace/) to create the
   namespaces:

   <!-- pytest.mark.skip -->

   ```bash
   kubectl create namespace influxdb && \
   kubectl create namespace kubit
   ```

4. Install an [ingress controller](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/)
   in the cluster and a mechanism to obtain a valid TLS certificate
   (for example, [cert-manager](https://cert-manager.io/)), or provide the
   certificate PEM manually out of band.
   To use the InfluxDB-specific ingress controller, install [Ingress NGINX](https://github.com/kubernetes/ingress-nginx).

5. Ensure your Kubernetes cluster can access the InfluxDB container registry,
   or, if running in an air-gapped environment, a local container registry to
   which you can copy the InfluxDB images.
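
Once the cluster and namespaces are in place, you can verify that `kubectl` is
pointed at the intended cluster and that both namespaces exist:

<!-- pytest.mark.skip -->

```bash
# Show the active context and confirm cluster connectivity
kubectl config current-context
kubectl cluster-info

# Confirm the influxdb and kubit namespaces exist
kubectl get namespace influxdb kubit
```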

### Cluster sizing recommendation

As a starting point for a production workload, InfluxData recommends the
following sizing for {{% product-name %}} components:

{{< tabs-wrapper >}}
{{% tabs %}}
[AWS](#)
[Google Cloud Platform](#)
[Microsoft Azure](#)
[On-Prem](#)
{{% /tabs %}}
{{% tab-content %}}
<!--------------------------------- BEGIN AWS --------------------------------->

- **Catalog (PostgreSQL-compatible database) (x1):**
  - _[See below](#postgresql-compatible-database-requirements)_
- **Ingesters and Routers (x3):**
  - EC2 m6i.2xlarge (8 CPU, 32 GB RAM)
  - Local storage: minimum of 2 GB (high-speed SSD)
- **Queriers (x3):**
  - EC2 m6i.2xlarge (8 CPU, 32 GB RAM)
- **Compactors (x1):**
  - EC2 m6i.2xlarge (8 CPU, 32 GB RAM)
- **Kubernetes Control Plane (x1):**
  - EC2 t3.large (2 CPU, 8 GB RAM)

<!---------------------------------- END AWS ---------------------------------->
{{% /tab-content %}}
{{% tab-content %}}
<!--------------------------------- BEGIN GCP --------------------------------->

- **Catalog (PostgreSQL-compatible database) (x1):**
  - _[See below](#postgresql-compatible-database-requirements)_
- **Ingesters and Routers (x3):**
  - GCE c2-standard-8 (8 CPU, 32 GB RAM)
  - Local storage: minimum of 2 GB (high-speed SSD)
- **Queriers (x3):**
  - GCE c2-standard-8 (8 CPU, 32 GB RAM)
- **Compactors (x1):**
  - GCE c2-standard-8 (8 CPU, 32 GB RAM)
- **Kubernetes Control Plane (x1):**
  - GCE c2d-standard-2 (2 CPU, 8 GB RAM)

<!---------------------------------- END GCP ---------------------------------->
{{% /tab-content %}}
{{% tab-content %}}
<!-------------------------------- BEGIN Azure -------------------------------->

- **Catalog (PostgreSQL-compatible database) (x1):**
  - _[See below](#postgresql-compatible-database-requirements)_
- **Ingesters and Routers (x3):**
  - Standard_D8s_v3 (8 CPU, 32 GB RAM)
  - Local storage: minimum of 2 GB (high-speed SSD)
- **Queriers (x3):**
  - Standard_D8s_v3 (8 CPU, 32 GB RAM)
- **Compactors (x1):**
  - Standard_D8s_v3 (8 CPU, 32 GB RAM)
- **Kubernetes Control Plane (x1):**
  - Standard_B2ms (2 CPU, 8 GB RAM)

<!--------------------------------- END Azure --------------------------------->
{{% /tab-content %}}
{{% tab-content %}}
<!------------------------------- BEGIN ON-PREM ------------------------------->

- **Catalog (PostgreSQL-compatible database) (x1):**
  - CPU: 4-8 cores
  - RAM: 16-32 GB
- **Ingesters and Routers (x3):**
  - CPU: 8 cores
  - RAM: 32 GB
  - Local storage: 2 GB (high-speed SSD)
- **Queriers (x3):**
  - CPU: 8 cores
  - RAM: 32 GB
- **Compactors (x1):**
  - CPU: 8 cores
  - RAM: 32 GB
- **Kubernetes Control Plane (x1):**
  - CPU: 2 cores
  - RAM: 8 GB

<!-------------------------------- END ON-PREM -------------------------------->
{{% /tab-content %}}
{{< /tabs-wrapper >}}

Your sizing may need to be different based on your environment, cloud provider,
and workload, but this is a reasonable starting size for your initial testing.

## Install the kubecfg kubit operator

The [`kubecfg kubit` operator](https://github.com/kubecfg/kubit) (maintained by InfluxData)
simplifies the installation and management of the InfluxDB Clustered package.
It manages the application of the jsonnet templates used to install, manage, and
update an InfluxDB cluster.

{{% note %}}
#### The InfluxDB Clustered Helm chart includes the kubit operator

If using the [InfluxDB Clustered Helm chart](https://github.com/influxdata/helm-charts/tree/master/charts/influxdb3-clustered)
to deploy your InfluxDB cluster, you do not need to install the kubit operator
separately. The Helm chart installs the kubit operator.
{{% /note %}}

Use `kubectl` to install the [kubecfg kubit](https://github.com/kubecfg/kubit)
operator **v0.0.18 or later**.

<!-- pytest.mark.skip -->

```bash
kubectl apply -k 'https://github.com/kubecfg/kubit//kustomize/global?ref=v0.0.19'
```
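
After applying the manifests, confirm that the operator is running. Assuming the
operator deploys into the `kubit` namespace created earlier, a quick check looks
like this:

<!-- pytest.mark.skip -->

```bash
# Wait for the kubit operator pod to report a Running status
kubectl get pods --namespace kubit
```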

## Set up a Kubernetes ingress controller

[Kubernetes ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/)
routes HTTP/S requests to services within the cluster and requires deploying an
[ingress controller](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/).
You can provide your own ingress or you can install
[Nginx Ingress Controller](https://github.com/kubernetes/ingress-nginx) to use
the InfluxDB-defined ingress.
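
For example, one common way to install Ingress NGINX is with Helm, using the
chart published by the upstream ingress-nginx project (adjust the release name
and namespace for your environment):

<!-- pytest.mark.skip -->

```bash
# Install or upgrade the ingress-nginx controller in its own namespace
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace
```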

{{% note %}}
InfluxDB Clustered components use gRPC/HTTP2 protocols. If using an external
load balancer, you may need to explicitly enable these protocols on your load
balancers.
{{% /note %}}

## Set up your object store

InfluxDB Clustered supports AWS S3 or S3-compatible storage (including Google
Cloud Storage, Azure Blob Storage, and MinIO) for storing
[InfluxDB Parquet files](/influxdb3/clustered/reference/internals/storage-engine/#object-store).
Refer to your object storage provider's documentation for information about
setting up an object store:

- [Create an AWS S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html)
- [Create a Google Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets)
- [Create an Azure Blob Storage container](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-portal#create-a-container)
- [Create a MinIO bucket](https://min.io/docs/minio/linux/reference/minio-mc/mc-mb.html)

{{% caption %}}
\* This list does not represent all S3-compatible object stores
that work with InfluxDB Clustered. Other S3-compatible object stores should work
as well.
{{% /caption %}}

{{% note %}}
#### Object storage recommendations

We **strongly** recommend the following:

- ##### Enable object versioning

  Enable object versioning in your object store.
  Refer to your object storage provider's documentation for information about
  enabling object versioning (an AWS S3 example follows this note).

- ##### Run the object store in a separate namespace or outside of Kubernetes

  Run the object store in a separate namespace from InfluxDB or external to
  Kubernetes entirely. Doing so makes management of the InfluxDB cluster easier
  and helps to prevent accidental data loss. While deploying everything in the
  same namespace is possible, we do not recommend it for production environments.
{{% /note %}}
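
For example, if you use AWS S3, you can enable versioning on an existing bucket
with the AWS CLI (the bucket name is a placeholder):

<!-- pytest.mark.skip -->

```bash
# Turn on object versioning for the bucket that stores InfluxDB Parquet files
aws s3api put-bucket-versioning \
  --bucket S3_BUCKET_NAME \
  --versioning-configuration Status=Enabled
```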

### Configure object storage permissions

Ensure the identity you use to connect to your S3-compatible object store has
the correct permissions to allow InfluxDB to perform all the actions it needs to.

{{< expand-wrapper >}}
{{% expand "View example AWS S3 access policy" %}}

The IAM role that you use to access AWS S3 should have the following policy:

{{% code-placeholders "S3_BUCKET_NAME" %}}
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "s3:PutObjectAcl",
        "s3:PutObject",
        "s3:ListMultipartUploadParts",
        "s3:GetObjectAcl",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::S3_BUCKET_NAME/*"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::S3_BUCKET_NAME"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "*"
    }
  ]
}
```
{{% /code-placeholders %}}

Replace the following:

- {{% code-placeholder-key %}}`S3_BUCKET_NAME`{{% /code-placeholder-key %}}: Name of your AWS S3 bucket

{{% /expand %}}
{{% expand "View the requirements for Google Cloud Storage" %}}
|
|
|
|
To use Google Cloud Storage (GCS) as your object store, your [IAM principal](https://cloud.google.com/iam/docs/overview) should be granted the `roles/storage.objectUser` role.
|
|
For example, if using [Google Service Accounts](https://cloud.google.com/iam/docs/service-account-overview):
|
|
|
|
{{% code-placeholders "GCP_SERVICE_ACCOUNT|GCP_BUCKET" %}}
|
|
```bash
|
|
gcloud storage buckets add-iam-policy-binding \
|
|
gs://GCP_BUCKET \
|
|
--member="serviceAccount:GCP_SERVICE_ACCOUNT" \
|
|
--role="roles/storage.objectUser"
|
|
```
|
|
{{% /code-placeholders %}}
|
|
|
|
Replace the following:
|
|
|
|
- {{% code-placeholder-key %}}`GCP_SERVICE_ACCOUNT`{{% /code-placeholder-key %}}: Google Service Account name
|
|
- {{% code-placeholder-key %}}`GCP_BUCKET`{{% /code-placeholder-key %}}: GCS bucket name
|
|
|
|
{{% /expand %}}
|
|
|
|
{{% expand "View the requirements for Azure Blob Storage" %}}
|
|
|
|
To use Azure Blob Storage as your object store, your [service principal](https://learn.microsoft.com/en-us/entra/architecture/service-accounts-principal)
|
|
should be granted the `Storage Blob Data Contributor` role.
|
|
This is a built-in role for Azure which encompasses common permissions.
|
|
You can assign it using the following command:
|
|
|
|
{{% code-placeholders "PRINCIPAL|AZURE_SUBSCRIPTION|AZURE_RESOURCE_GROUP|AZURE_STORAGE_ACCOUNT|AZURE_STORAGE_CONTAINER" %}}
|
|
```bash
|
|
az role assignment create \
|
|
--role "Storage Blob Data Contributor" \
|
|
--assignee PRINCIPAL \
|
|
--scope "/subscriptions/AZURE_SUBSCRIPTION/resourceGroups/AZURE_RESOURCE_GROUP/providers/Microsoft.Storage/storageAccounts/AZURE_STORAGE_ACCOUNT/blobServices/default/containers/AZURE_STORAGE_CONTAINER"
|
|
```
|
|
{{% /code-placeholders %}}
|
|
|
|
Replace the following:
|
|
|
|
- {{% code-placeholder-key %}}`PRINCIPAL`{{% /code-placeholder-key %}}: A user, group, or service principal that the role should be assigned to
|
|
- {{% code-placeholder-key %}}`AZURE_SUBSCRIPTION`{{% /code-placeholder-key %}}: Your Azure subscription
|
|
- {{% code-placeholder-key %}}`AZURE_RESOURCE_GROUP`{{% /code-placeholder-key %}}: The resource group that your Azure Blob storage account belongs to
|
|
- {{% code-placeholder-key %}}`AZURE_STORAGE_ACCOUNT`{{% /code-placeholder-key %}}: Azure Blob storage account name
|
|
- {{% code-placeholder-key %}}`AZURE_STORAGE_CONTAINER`{{% /code-placeholder-key %}}: Container name in your Azure Blob storage account
|
|
|
|
{{% /expand %}}
|
|
|
|
{{< /expand-wrapper >}}
|
|
|
|

{{% note %}}
To configure permissions with MinIO, use the
[example AWS access policy](#view-example-aws-s3-access-policy).
{{% /note %}}

## Set up your PostgreSQL-compatible database

The [InfluxDB Catalog](/influxdb3/clustered/reference/internals/storage-engine/#catalog)
that stores metadata related to your time series data requires a PostgreSQL or
PostgreSQL-compatible database _(AWS Aurora, hosted PostgreSQL, etc.)_.
The process for installing and setting up your PostgreSQL-compatible database
depends on the database and database provider you use.
Refer to your database's or provider's documentation for setting up your
PostgreSQL-compatible database.

### PostgreSQL-compatible database requirements

- PostgreSQL version **13 or 14** (see the version check after this list).
- **Minimum of 4 GB of memory** or equivalent provider-specific units.
- To avoid conflicts and prevent issues caused by shared usage with other
  applications, ensure that your PostgreSQL-compatible instance is dedicated
  exclusively to InfluxDB.
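
For example, you can confirm that an existing instance meets the version
requirement with `psql` (the connection values below are placeholders for your
own host and credentials):

<!-- pytest.mark.skip -->

```bash
# Query the server version of the PostgreSQL-compatible instance
psql "postgresql://DB_USER:DB_PASSWORD@DB_HOST:5432/postgres" -c "SELECT version();"
```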

{{% note %}}
We **strongly** recommend running the PostgreSQL-compatible database
in a separate namespace from InfluxDB or external to Kubernetes entirely.
Doing so makes management of the InfluxDB cluster easier and helps to prevent
accidental data loss.

While deploying everything in the same namespace is possible, we do not
recommend it for production environments.
{{% /note %}}

## Set up local or attached storage

The [InfluxDB Ingester](/influxdb3/clustered/reference/internals/storage-engine/#ingester)
needs local or attached storage to store the Write-Ahead Log (WAL).
The read and write speed of the attached storage affects the write performance
of the Ingester, so the faster the storage device, the better your write
performance will be.
The recommended minimum size of the local storage is 2 gibibytes (`2Gi`).

Installation and setup of local or attached storage depends on your underlying
hardware or cloud provider. Refer to your provider's documentation for
information about installing and configuring local storage.
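
In Kubernetes, Ingester WAL volumes are typically provisioned from a storage
class backed by fast SSDs. As a quick check, you can list the storage classes
available in your cluster and confirm that a suitable one exists:

<!-- pytest.mark.skip -->

```bash
# List available storage classes and their provisioners
kubectl get storageclass
```
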
{{< page-nav prev="/influxdb3/clustered/install/set-up-cluster/" prevText="Back" next="/influxdb3/clustered/install/set-up-cluster/configure-cluster" nextText="Configure your cluster" >}}