Merge pull request #5777 from influxdata/fix/issue5772

fix(monolith): get-started and broken install link:
pull/5783/head
Jason Stirnaman 2025-01-15 15:16:04 -06:00 committed by GitHub
commit 19eaf3c759
4 changed files with 631 additions and 257 deletions


@ -37,7 +37,7 @@ Use the InfluxDB 3 quick install script to install {{< product-name >}} on
**Linux** and **macOS**.
> [!Important]
> If using Windows, [download the InfluxDB 3 Core Windows binary](?t=Windows#download-influxdb-3-core-binaries).
> If using Windows, [download the {{% product-name %}} Windows binary](?t=Windows#download-influxdb-3-core-binaries).
1. Use the following command to download and install the appropriate
{{< product-name >}} package on your local machine:
@ -46,7 +46,7 @@ Use the InfluxDB 3 quick install script to install {{< product-name >}} on
curl -O https://www.influxdata.com/d/install_influxdb3.sh && sh install_influxdb3.sh
```
2. Ensure installation completed successfully:
2. Verify that installation completed successfully:
```bash
influxdb3 --version
@ -56,7 +56,7 @@ Use the InfluxDB 3 quick install script to install {{< product-name >}} on
>
> #### influxdb3 not found
>
> If it your system can't locate your `influxdb3` binary, `source` your
> If your system can't locate your `influxdb3` binary, `source` your
> current shell configuration file (`.bashrc`, `.zshrc`, etc.).
>
> {{< code-tabs-wrapper >}}
@ -98,7 +98,7 @@ source ~/.zshrc
- [InfluxDB 3 Core • Linux (ARM) • GNU](https://download.influxdata.com/influxdb/snapshots/influxdb3-core_aarch64-unknown-linux-gnu.tar.gz)
[sha256](ps://dl.influxdata.com/influxdb/snapshots/influxdb3-core_aarch64-unknown-linux-gnu.tar.gz.sha256)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_aarch64-unknown-linux-gnu.tar.gz.sha256)
- [InfluxDB 3 Core • Linux (ARM) • MUSL](https://download.influxdata.com/influxdb/snapshots/influxdb3-core_aarch64-unknown-linux-musl.tar.gz)
@ -136,10 +136,10 @@ source ~/.zshrc
## Docker image
Use the {{< product-name >}} Docker image to deploy {{< product-name >}} in a
Use the `influxdb3-core` Docker image to deploy {{< product-name >}} in a
Docker container.
```
```bash
docker pull quay.io/influxdb/influxdb3-core:latest
```


@ -37,7 +37,7 @@ Use the InfluxDB 3 quick install script to install {{< product-name >}} on
**Linux** and **macOS**.
> [!Important]
> If using Windows, [download the InfluxDB 3 Enterprise Windows binary](?t=Windows#download-influxdb-3-enterprise-binaries).
> If using Windows, [download the {{% product-name %}} Windows binary](?t=Windows#download-influxdb-3-enterprise-binaries).
1. Use the following command to download and install the appropriate
{{< product-name >}} package on your local machine:
@ -46,7 +46,7 @@ Use the InfluxDB 3 quick install script to install {{< product-name >}} on
curl -O https://www.influxdata.com/d/install_influxdb3.sh && sh install_influxdb3.sh enterprise
```
2. Ensure installation completed successfully:
2. Verify that installation completed successfully:
```bash
influxdb3 --version
@ -56,7 +56,7 @@ Use the InfluxDB 3 quick install script to install {{< product-name >}} on
>
> #### influxdb3 not found
>
> If it your system can't locate your `influxdb3` binary, `source` your
> If your system can't locate your `influxdb3` binary, `source` your
> current shell configuration file (`.bashrc`, `.zshrc`, etc.).
>
> {{< code-tabs-wrapper >}}
@ -98,7 +98,7 @@ source ~/.zshrc
- [InfluxDB 3 Enterprise • Linux (ARM) • GNU](https://download.influxdata.com/influxdb/snapshots/influxdb3-enterprise_aarch64-unknown-linux-gnu.tar.gz)
[sha256](ps://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_aarch64-unknown-linux-gnu.tar.gz.sha256)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_aarch64-unknown-linux-gnu.tar.gz.sha256)
- [InfluxDB 3 Enterprise • Linux (ARM) • MUSL](https://download.influxdata.com/influxdb/snapshots/influxdb3-enterprise_aarch64-unknown-linux-musl.tar.gz)
@ -136,10 +136,10 @@ source ~/.zshrc
## Docker image
Use the {{< product-name >}} Docker image to deploy {{< product-name >}} in a
Use the `influxdb3-enterprise` Docker image to deploy {{< product-name >}} in a
Docker container.
```
```bash
docker pull quay.io/influxdb/influxdb3-enterprise:latest
```


@ -1,14 +1,14 @@
> [!Note]
> InfluxDB 3 Core is tailored for real-time data monitoring and recent data.
> InfluxDB 3 Core is purpose-built for real-time data monitoring and recent data.
> InfluxDB 3 Enterprise builds on top of Core with support for historical data
> querying, high availability, read replicas, and more. It will soon enable
> enhanced security, row-level deletions, an administration UI, and additional
> features. You can learn more about InfluxDB 3 Enterprise
> [here](/influxdb3/enterprise/get-started/).
> querying, high availability, read replicas, and more.
> Enterprise will soon unlock
> enhanced security, row-level deletions, an administration UI, and more.
> Learn more about [InfluxDB 3 Enterprise](/influxdb3/enterprise/).
## Getting Started with InfluxDB 3 Core
## Get started with {{% product-name %}}
InfluxDB is a database built to collect, process, transform, and store event and time series data. It is ideal for use cases that require real-time ingest and fast query response times to build user interfaces, monitoring, and automation solutions.
InfluxDB is a database built to collect, process, transform, and store event and time series data, and is ideal for use cases that require real-time ingest and fast query response times to build user interfaces, monitoring, and automation solutions.
Common use cases include:
@ -23,6 +23,7 @@ InfluxDB is optimized for scenarios where near real-time data monitoring is esse
{{% product-name %}} is the InfluxDB 3 open source release.
Core's feature highlights include:
* Diskless architecture with object storage support (or local disk with no dependencies)
* Fast query response times (under 10ms for last-value queries, or 30ms for distinct metadata)
* Embedded Python VM for plugins and triggers
@ -30,6 +31,7 @@ Core's feature highlights include:
* Compatibility with InfluxDB 1.x and 2.x write APIs
The Enterprise version adds onto Core's functionality with:
* Historical query capability and single series indexing
* High availability
* Read replicas
@ -37,7 +39,7 @@ The Enterprise version adds onto Core's functionality with:
* Row-level delete support (coming soon)
* Integrated admin UI (coming soon)
For more information, see the [Enterprise guide](/influxdb3/enterprise/get-started/).
For more information, see how to [get started with Enterprise](/influxdb3/enterprise/get-started/).
### What's in this guide
@ -48,45 +50,31 @@ This guide covers InfluxDB 3 Core (the open source release), including the follo
* [Write data to the database](#write-data)
* [Query the database](#query-the-database)
* [Last values cache](#last-values-cache)
* [Distinct Values Cache](#distinct-values-cache)
* [Distinct values cache](#distinct-values-cache)
* [Python plugins and the processing engine](#python-plugins-and-the-processing-engine)
* [Diskless architecture](#diskless-architecture)
### Installation and Startup
### Install and startup
{{% product-name %}} runs on **Linux**, **macOS**, and **Windows**.
[Run the install script](#run-the-install-script) to get started quickly,
regardless of your operating system.
Or, if you prefer, you can download and install {{% product-name %}} from [build artifacts and Docker images](#optional-download-build-artifacts-and-docker-images).
#### Run the install script
Enter the following command to use [curl](https://curl.se/download.html) to download the script and install {{% product-name %}} for MacOS and Linux operating systems:
{{% tabs-wrapper %}}
{{% tabs %}}
[Linux or macOS](#linux-or-macos)
[Windows](#windows)
[Docker (x86)](#docker-x86)
{{% /tabs %}}
{{% tab-content %}}
<!--------------- BEGIN LINUX AND MACOS -------------->
To get started quickly, download and run the install script--for example, using [curl](https://curl.se/download.html):
```bash
curl -O https://www.influxdata.com/d/install_influxdb3.sh && sh install_influxdb3.sh
curl -O https://www.influxdata.com/d/install_influxdb3.sh \
&& sh install_influxdb3.sh
```
To verify that the download and installation completed successfully, run the following command:
Or, download and install [build artifacts](/influxdb3/core/install/#download-influxdb-3-core-binaries):
```bash
influxdb3 --version
```
If your system doesn't locate `influxdb3`, then `source` the configuration file (for example, .bashrc, .zshrc) for your shell--for example:
```zsh
source ~/.zshrc
```
#### Optional: Download build artifacts and Docker images
Download the latest build artifacts — including Windows — and Docker images from the links below. These are updated with every merge into `main`.
##### {{% product-name %}} (latest):
- Docker: [quay.io/influxdb/influxdb3-core:latest](https://quay.io/influxdb/influxdb3-core:latest)
- [Linux | x86 | musl](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_x86_64-unknown-linux-musl.tar.gz)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_x86_64-unknown-linux-musl.tar.gz.sha256)
@ -101,41 +89,128 @@ Download the latest build artifacts — including Windows — and Docker images
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_aarch64-unknown-linux-gnu.tar.gz.sha256)
- [macOS | Darwin](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_aarch64-apple-darwin.tar.gz)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_aarch64-apple-darwin.tar.gz.sha256)
- [Windows | x86](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_x86_64-pc-windows-gnu.tar.gz)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_x86_64-pc-windows-gnu.tar.gz.sha256)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_aarch64-apple-darwin.tar.gz.sha256)
> [!Note]
> macOS Intel builds are coming soon.
<!--------------- END LINUX AND MACOS -------------->
{{% /tab-content %}}
{{% tab-content %}}
<!--------------- BEGIN WINDOWS -------------->
Download and install the {{% product-name %}} [Windows (x86) binary](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_x86_64-pc-windows-gnu.tar.gz)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-core_x86_64-pc-windows-gnu.tar.gz.sha256)
<!--------------- END WINDOWS -------------->
{{% /tab-content %}}
{{% tab-content %}}
<!--------------- BEGIN DOCKER -------------->
Pull the [`influxdb3-core` image](https://quay.io/repository/influxdb/influxdb3-core?tab=tags&tag=latest):
```bash
docker pull quay.io/influxdb/influxdb3-core:latest
```
<!--------------- END DOCKER -------------->
{{% /tab-content %}}
{{% /tabs-wrapper %}}
_Build artifacts and images update with every merge into the {{% product-name %}} `main` branch._
#### Verify the install
After you have installed {{% product-name %}}, enter the following command to verify that it completed successfully:
```bash
influxdb3 --version
```
If your system doesn't locate `influxdb3`, then `source` the configuration file (for example, .bashrc, .zshrc) for your shell--for example:
```zsh
source ~/.zshrc
```
#### Start InfluxDB
To start your InfluxDB instance, use the `influxdb3 serve` command
and provide an object store configuration and a unique `writer-id`.
and provide the following:
- `--object-store`: InfluxDB supports various storage options, including the local file system, memory, S3 (and compatible services like Ceph or Minio), Google Cloud Storage, and Azure Blob Storage.
- `--writer-id`: This string identifier determines the path under which all files written by this instance will be stored in the configured storage location.
- `--object-store`: Specifies the type of Object store to use. InfluxDB supports the following: local file system (`file`), `memory`, S3 (and compatible services like Ceph or Minio) (`s3`), Google Cloud Storage (`google`), and Azure Blob Storage (`azure`).
- `--writer-id`: A string identifier that determines the server's storage path within the configured storage location.
The following examples show how to start InfluxDB with different object store configurations:
The following examples show how to start InfluxDB 3 with different object store configurations:
```bash
# MEMORY
# Stores data in RAM; doesn't persist data
influxdb3 serve --writer-id=local01 --object-store=memory
```
```bash
# FILESYSTEM
influxdb3 serve --writer-id=local01 --object-store=file --data-dir ~/.influxdb3
# Provide the filesystem directory
influxdb3 serve \
--writer-id=local01 \
--object-store=file \
--data-dir ~/.influxdb3
```
To run the [Docker image](/influxdb3/core/install/#docker-image) and persist data to the filesystem, mount a volume for the Object store--for example, pass the following options:
- `-v /path/on/host:/path/in/container`: Mounts a directory from your filesystem to the container
- `--object-store file --data-dir /path/in/container`: Uses the mount for server storage
```bash
# FILESYSTEM USING DOCKER
# Create a mount
# Provide the mount path
docker run -it \
-v /path/on/host:/path/in/container \
quay.io/influxdb/influxdb3-core:latest serve \
--writer-id my_host \
--object-store file \
--data-dir /path/in/container
```
```bash
# S3 (defaults to us-east-1 for region)
# Specify the Object store type and associated options
influxdb3 serve --writer-id=local01 --object-store=s3 --bucket=[BUCKET] --aws-access-key=[AWS ACCESS KEY] --aws-secret-access-key=[AWS SECRET ACCESS KEY]
```
```bash
# Minio/Open Source Object Store (Uses the AWS S3 API, with additional parameters)
# Specify the Object store type and associated options
influxdb3 serve --writer-id=local01 --object-store=s3 --bucket=[BUCKET] --aws-access-key=[AWS ACCESS KEY] --aws-secret-access-key=[AWS SECRET ACCESS KEY] --aws-endpoint=[ENDPOINT] --aws-allow-http
```
_For more information about server options, run `influxdb3 serve --help`._
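The object store examples above differ only in the options passed to `influxdb3 serve`. As a rough sketch, a script that launches the server could assemble the argument list as follows. The helper name `serve_command` is hypothetical, and the flags used are only those shown in the preceding examples.

```python
import os


def serve_command(writer_id, object_store, **options):
    """Build an argument list for `influxdb3 serve` (hypothetical helper).

    Keyword names use underscores in place of hyphens. A value of True
    produces a bare flag such as --aws-allow-http.
    """
    argv = ["influxdb3", "serve",
            "--writer-id", writer_id,
            "--object-store", object_store]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)
        else:
            argv.extend([flag, str(value)])
    return argv


# In-memory store: nothing is persisted.
print(serve_command("local01", "memory"))

# Local filesystem: expand "~" here because no shell does it for us.
print(serve_command("local01", "file",
                    data_dir=os.path.expanduser("~/.influxdb3")))
```

The resulting list can be passed to `subprocess.run()` without going through a shell, which avoids quoting problems with bucket names or keys.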
> [!Important]
> #### Stopping the Docker container
>
> Currently, a bug prevents using `Ctrl-c` to stop an InfluxDB 3 container.
> Use the `docker kill` command to stop the container:
>
> 1. Enter the following command to find the container ID:
> ```bash
> docker ps -a
> ```
> 2. Enter the command to stop the container:
> ```bash
> docker kill <CONTAINER_ID>
> ```
#### Licensing
When starting {{% product-name %}} for the first time, it prompts you to enter an email address for verification. You will receive an email with a verification link.
Upon verification, the license creation, retrieval, and application are automated.
_During the alpha period, licenses are valid until May 7, 2025._
### Data Model
The database server contains logical databases, which have tables, which have columns. Compared to previous versions of InfluxDB, you can think of a database as a `bucket` in v2 or as a `db/retention_policy` in v1. A `table` is equivalent to a `measurement`, which has columns that can be of type `tag` (a string dictionary), `int64`, `float64`, `uint64`, `bool`, or `string`. Finally, every table has a `time` column that is a nanosecond-precision timestamp.
@ -151,24 +226,23 @@ InfluxDB is a schema-on-write database. You can start writing data and InfluxDB
After a schema is created, InfluxDB validates future write requests against it before accepting the data.
Subsequent requests can add new fields on-the-fly, but can't add new tags.
InfluxDB 3 Core is optimized for recent data only--it accepts writes for data with timestamps from the last 72 hours. It will persist that data in Parquet files for access by third-party systems for longer term historical analysis and queries. If you require longer historical queries with a compactor that optimizes data organization, consider using [InfluxDB 3 Enterprise](/influxdb3/enterprise/get-started/).
InfluxDB 3 Core is optimized for recent data only--it accepts writes for data with timestamps from the last 72 hours. It persists that data in Parquet files for access by third-party systems for longer term historical analysis and queries. If you require longer historical queries with a compactor that optimizes data organization, consider using [InfluxDB 3 Enterprise](/influxdb3/enterprise/get-started/).
**Note**: write requests to the database _don't_ return until a WAL file has been flushed to the configured object store, which by default happens once per second.
This means that individual write requests may not complete quickly, but you can make many concurrent requests to get higher total throughput. In the future, we will add an API parameter to make requests that do not wait for the WAL flush to return.
The database has three write API endpoints that respond to HTTP `POST` requests:
* `/write?db=mydb,precision=ns`
* `/api/v2/write?db=mydb,precision=ns`
* `/api/v3/write?db=mydb,precision=ns`
* `/write?db=mydb&precision=ns`
* `/api/v2/write?bucket=mydb&precision=ns`
* `/api/v3/write_lp?db=mydb&precision=ns&accept_partial=true`
{{% product-name %}} provides the `/write` and `/api/v2` endpoints for backward compatibility with clients that can write data to previous versions of InfluxDB.
{{% product-name %}} provides the `/write` and `/api/v2/write` endpoints for backward compatibility with clients that can write data to previous versions of InfluxDB.
However, these APIs differ from the APIs in the previous versions in the following ways:
- Tags in a table (measurement) are _immutable_
- A tag and a field can't have the same name within a table.
The `/api/v3/write` endpoint accepts the same line protocol syntax as previous versions, and brings new functionality that lets you accept or reject partial writes using the `accept_partial` parameter (`true` is default).
{{% product-name %}} adds the `/api/v3/write_lp` endpoint, which accepts the same line protocol syntax as previous versions, and supports an `?accept_partial=<BOOLEAN>` parameter, which
lets you accept or reject partial writes (default is `true`).
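The three write endpoints differ in the name of the database parameter (`bucket` for `/api/v2/write`, `db` otherwise) and in whether `accept_partial` applies. The following sketch composes the request URL for each one. The helper name `write_url` and the host value are assumptions for illustration.

```python
from urllib.parse import urlencode


def write_url(host, endpoint, db, precision="ns", accept_partial=None):
    """Compose a write URL for one of the three endpoints (hypothetical helper)."""
    # The v2-compatible endpoint names the database parameter `bucket`.
    db_param = "bucket" if endpoint == "/api/v2/write" else "db"
    params = {db_param: db, "precision": precision}
    if accept_partial is not None:
        if endpoint != "/api/v3/write_lp":
            raise ValueError("accept_partial applies only to /api/v3/write_lp")
        params["accept_partial"] = "true" if accept_partial else "false"
    return f"{host}{endpoint}?{urlencode(params)}"


host = "http://localhost:8181"  # assumed local server address
print(write_url(host, "/write", "mydb"))
print(write_url(host, "/api/v2/write", "mydb"))
print(write_url(host, "/api/v3/write_lp", "mydb", accept_partial=True))
```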
The following code block is an example of [line protocol](/influxdb3/core/reference/syntax/line-protocol/), which shows the table name followed by tags, which are an ordered, comma-separated list of key/value pairs where the values are strings, followed by a comma-separated list of key/value pairs that are the fields, and ending with an optional timestamp. The timestamp by default is a nanosecond epoch, but you can specify a different precision through the `precision` query parameter.
@ -187,8 +261,55 @@ If you save the preceding line protocol to a file (for example, `server_data`),
influxdb3 write --database=mydb --file=server_data
```
The written data goes into WAL files, created once per second, and into an in-memory queryable buffer. Later, InfluxDB snapshots the WAL and persists the data into object storage as Parquet files.
We'll cover the [diskless architecture](#diskless-architecture) later in this document.
The following examples show how to write data using `curl` and the `/api/v3/write_lp` HTTP endpoint.
To show the difference between accepting and rejecting partial writes, line `2` in the example contains a `string` value for a `float` field (`temp=hi`).
##### Partial write of line protocol occurred
With `accept_partial=true`:
```
* upload completely sent off: 59 bytes
< HTTP/1.1 400 Bad Request
< transfer-encoding: chunked
< date: Wed, 15 Jan 2025 19:35:36 GMT
<
* Connection #0 to host localhost left intact
{"error":"partial write of line protocol occurred","data":[{"original_line":"dquote> home,room=Sunroom temp=hi","line_number":2,"error_message":"No fields were provided"}]}%
```
Line `1` is written and queryable.
The response is an HTTP error (`400`) status, and the response body contains `partial write of line protocol occurred` and details about the problem line.
##### Parsing failed for write_lp endpoint
With `accept_partial=false`:
```
> curl -v -XPOST "localhost:8181/api/v3/write_lp?db=sensors&precision=auto&accept_partial=false" \
--data-raw "home,room=Sunroom temp=96
dquote> home,room=Sunroom temp=hi"
< HTTP/1.1 400 Bad Request
< transfer-encoding: chunked
< date: Wed, 15 Jan 2025 19:28:27 GMT
<
* Connection #0 to host localhost left intact
{"error":"parsing failed for write_lp endpoint","data":{"original_line":"dquote> home,room=Sunroom temp=hi","line_number":2,"error_message":"No fields were provided"}}%
```
Neither line is written to the database.
The response is an HTTP error (`400`) status, and the response body contains `parsing failed for write_lp endpoint` and details about the problem line.
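Both responses use status `400`, so a client has to inspect the body to tell them apart. In the sample output above, `data` is a list for a partial write and a single object when parsing fails. The following sketch normalizes both shapes; the function name is hypothetical, and the payload shape is assumed from those two samples only.

```python
import json


def rejected_lines(body):
    """Return (error, [(line_number, error_message), ...]) from a 400 body."""
    payload = json.loads(body)
    data = payload.get("data", [])
    # Partial writes report a list; a parsing failure reports one object.
    if isinstance(data, dict):
        data = [data]
    problems = [(item["line_number"], item["error_message"]) for item in data]
    return payload.get("error"), problems


partial = ('{"error":"partial write of line protocol occurred","data":'
           '[{"original_line":"home,room=Sunroom temp=hi","line_number":2,'
           '"error_message":"No fields were provided"}]}')
failed = ('{"error":"parsing failed for write_lp endpoint","data":'
          '{"original_line":"home,room=Sunroom temp=hi","line_number":2,'
          '"error_message":"No fields were provided"}}')

for body in (partial, failed):
    error, problems = rejected_lines(body)
    print(error, problems)
```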
##### Data durability
Written data goes into WAL files, created once per second, and into an in-memory queryable buffer. Later, InfluxDB snapshots the WAL and persists the data into object storage as Parquet files.
We cover the [diskless architecture](#diskless-architecture) later in this guide.
> [!Note]
> ##### Write requests return after WAL flush
> Write requests to the database _don't_ return until a WAL file has been flushed to the configured object store, which by default happens once per second.
> Individual write requests might not complete quickly, but you can make many concurrent requests to get higher total throughput.
> In the future, we will add an API parameter that lets requests return without waiting for the WAL flush.
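
Because each request waits for the WAL flush, throughput comes from overlapping requests rather than from speeding up any single one. The following is a minimal sketch of that pattern using a thread pool. The `send` callable is a placeholder for whatever function posts one line protocol batch to the server, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def write_concurrently(batches, send, max_workers=8):
    """Send each batch with `send`, overlapping the per-request WAL wait.

    Results are returned in the same order as `batches`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, batches))


# Stand-in sender for illustration: reports the payload size instead of
# making an HTTP request.
def fake_send(line_protocol):
    return len(line_protocol.encode("utf-8"))


batches = ["home,room=Sunroom temp=96", "home,room=Kitchen temp=72"]
print(write_concurrently(batches, fake_send))
```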
#### Create a Database or Table
@ -217,12 +338,12 @@ The `query` subcommand includes options to help ensure that the right database i
| Option | Description | Required |
|---------|-------------|--------------|
| `--host` | The host URL of the running {{% product-name %}} server [default: http://127.0.0.1:8181] | No |
| `--host` | The host URL of the server [default: `http://127.0.0.1:8181`] to query | No |
| `--database` | The name of the database to operate on | Yes |
| `--token` | The token for authentication with the {{% product-name %}} server | No |
| `--language` | The query language used to format the provided query string [default: sql] [possible values: sql, influxql] | No |
| `--format` | The format in which to output the query [default: pretty] [possible values: pretty, json, json_lines, csv, parquet] | No |
| `--output` | Put all query output into `output` | No |
| `--token` | The authentication token for the {{% product-name %}} server | No |
| `--language` | The query language of the provided query string [default: `sql`] [possible values: `sql`, `influxql`] | No |
| `--format` | The format in which to output the query [default: `pretty`] [possible values: `pretty`, `json`, `json_lines`, `csv`, `parquet`] | No |
| `--output` | The path to output data to | No |
#### Example: query `“SHOW TABLES”` on the `servers` database:
@ -293,7 +414,8 @@ curl -v "http://127.0.0.1:8181/api/v3/query_sql?db=servers&q=select+*+from+cpu+l
The following example sends an HTTP `POST` request with parameters in a JSON payload:
```bash
curl http://127.0.0.1:8181/api/v3/query_sql --data '{"db": "server", "q": "select * from cpu limit 5"}'
curl http://127.0.0.1:8181/api/v3/query_sql \
--data '{"db": "servers", "q": "select * from cpu limit 5"}'
```
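The same `POST` request can be prepared from Python with the standard library. This sketch only builds the request object, so it runs without a server. The helper name is hypothetical, and the `Content-Type` header is an assumption that the `curl` example above does not set explicitly.

```python
import json
from urllib.request import Request


def build_query_request(host, db, query):
    """Prepare a POST to /api/v3/query_sql with a JSON payload."""
    payload = json.dumps({"db": db, "q": query}).encode("utf-8")
    return Request(
        f"{host}/api/v3/query_sql",
        data=payload,  # supplying data makes urllib issue a POST
        headers={"Content-Type": "application/json"},  # assumed header
    )


req = build_query_request("http://127.0.0.1:8181", "servers",
                          "select * from cpu limit 5")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To send the request, pass `req` to `urllib.request.urlopen()` while the server is running.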
### Query using the Python client
@ -351,7 +473,7 @@ print(table.group_by('cpu').aggregate([('time_system', 'mean')]))
For more information about the Python client library, see the [`influxdb3-python` repository](https://github.com/InfluxCommunity/influxdb3-python) in GitHub.
## Last values cache
### Last values cache
{{% product-name %}} supports a **last-n values cache** which stores the last N values in a series or column hierarchy in memory. This gives the database the ability to answer these kinds of queries in under 10 milliseconds.
You can use the `influxdb3` CLI to create a last value cache.
@ -389,26 +511,26 @@ An example of creating this cache in use:
influxdb3 create last-cache --database=servers --table=cpu --cache-name=cpuCache --key-columns=host,application --value-columns=usage_percent,status --count=5
```
### Querying a Last values cache
#### Query a Last values cache
To leverage the LVC, you need to specifically call on it using the `last_cache()` function. An example of this type of query:
To leverage the LVC, call it using the `last_cache()` function in your query--for example:
```bash
influxdb3 query --database=servers "SELECT * FROM last_cache('cpu', 'cpuCache') WHERE host = 'Bravo';"
```
Usage: $ influxdb3 query --database=servers "SELECT * FROM last_cache('cpu', 'cpuCache') WHERE host = 'Bravo;"
```
{{% note %}}
#### Only works with SQL
The Last Value Cache only works with SQL, not InfluxQL; SQL is the default language.
The Last values cache only works with SQL, not InfluxQL; SQL is the default language.
{{% /note %}}
### Deleting a Last values cache
#### Deleting a Last values cache
Removing a Last values cache is also easy and straightforward, with the instructions below.
To remove a Last values cache, use the following command:
```
Usage: influxdb3 delete delete [OPTIONS] -d <DATABASE_NAME> -t <TABLE> --cache-name <CACHE_NAME>
```bash
influxdb3 delete last_cache [OPTIONS] -d <DATABASE_NAME> -t <TABLE> --cache-name <CACHE_NAME>
Options:
-h, --host <HOST_URL> Host URL of the running InfluxDB 3 server
@ -419,7 +541,7 @@ Options:
--help Print help information
```
## Distinct Values Cache
### Distinct values cache
Similar to the Last values cache, the database can cache in RAM the distinct values for a single column in a table or a hierarchy of columns. This is useful for fast metadata lookups, which can return in under 30 milliseconds. Many of the options are similar to the last value cache. See the CLI output for more information:
@ -427,52 +549,67 @@ Similar to the Last values cache, the database can cache in RAM the distinct val
influxdb3 create distinct_cache -h
```
### Python Plugins and the Processing Engine
### Python plugins and the Processing engine
{{% note %}}
#### Only supported in Docker
> [!Important]
> #### Processing engine only works with Docker
>
> The Processing engine is currently supported only in Docker x86 environments. Non-Docker support is coming soon. The engine, API, and developer experience are actively evolving and may change. Join our [Discord](https://discord.gg/9zaNCW2PRT) for updates and feedback.
As of this writing, the Processing Engine is only supported in Docker environments.
We expect it to launch in non-Docker environments soon. We're still in very active development creating the API and developer experience; things will break and change fast. Join our <a href="https://discord.gg/9zaNCW2PRT">Discord</a> to ask questions and give feedback.
{{% /note %}}
The InfluxDB 3 Processing engine is an embedded Python VM for running code inside the database to process and transform data.
InfluxDB3 has an embedded Python VM for running code inside the database. Currently, we only support plugins that get triggered on WAL file flushes, but more will be coming soon. Specifically, plugins will be able to be triggered by:
To use the Processing engine, you create [plugins](#plugin) and [triggers](#trigger).
* On WAL flush: sends a batch of write data to a plugin once a second (can be configured).
* On Snapshot (persist of Parquet files): sends the metadata to a plugin to do further processing against the Parquet data or send the information elsewhere (for example, adding it to an Iceberg Catalog).
* On Schedule: executes plugin on a schedule configured by the user, and is useful for data collection and deadman monitoring.
* On Request: binds a plugin to an HTTP endpoint at `/api/v3/plugins/<name>` where request headers and content are sent to the plugin, which can then parse, process, and send the data into the database or to third party services
#### Plugin
Plugins work in two parts: plugins and triggers. Plugins are the generic Python code that represent a plugin. Once you've loaded a plugin into the server, you can create many triggers of that plugin. A trigger has a plugin, a database and then a trigger-spec, which can be either all_tables or table:my_table_name where my_table_name is the name of your table you want to filter the plugin to.
A plugin is a Python function that has a signature compatible with one of the [trigger types](#trigger-types).
The [`influxdb3 create plugin`](/influxdb3/core/influxdb3-cli/create/plugin/) command loads a Python plugin file into the server.
You can also specify a list of key/value pairs as arguments supplied to a trigger. This makes it so that you could have many triggers of the same plugin, but with different arguments supplied to check for different things. These commands will give you useful information:
#### Trigger
```
influxdb3 create plugin -h
influxdb3 create trigger -h
```
After you load a plugin into an InfluxDB 3 server, you can create one or more
triggers associated with the plugin.
When you create a trigger, you specify a plugin, a database, optional runtime arguments,
and a trigger-spec, which specifies `all_tables` or `table:my_table_name` (for filtering data sent to the plugin).
When you _enable_ a trigger, the server executes the plugin code according to the
plugin signature.
##### Trigger types
InfluxDB 3 provides the following types of triggers:
- **On WAL flush**: Sends the batch of write data to a plugin once a second (configurable).
> [!Note]
> #### Plugins only work with x86 Docker
> For now, plugins only work with the x86 Docker image.
> Currently, only the **WAL flush** trigger is supported, but more are on the way:
>
> - **On Snapshot**: Sends metadata to a plugin for further processing against the Parquet data or to send the information elsewhere (for example, to an Iceberg Catalog). _Not yet available._
> - **On Schedule**: Executes a plugin on a user-configured schedule, useful for data collection and deadman monitoring. _Not yet available._
> - **On Request**: Binds a plugin to an HTTP endpoint at `/api/v3/plugins/<name>`. _Not yet available._
> The plugin receives the HTTP request headers and content, and can then parse, process, and send the data into the database or to third-party services.
Before we try to load up a plugin and create a trigger for it, we should write one and test it out. To test out and run plugins, you'll need to create a plugin directory. Start up your server with the --plugin-dir argument and point it at your plugin dir (note that you'll need to make this available in your Docker container).
### Test, create, and trigger plugin code
Have a look at this example Python plugin file:
> [!Important]
> #### Processing engine only works with Docker
>
> The Processing engine is currently supported only in Docker x86 environments. Non-Docker support is coming soon. The engine, API, and developer experience are actively evolving and may change. Join our [Discord](https://discord.gg/9zaNCW2PRT) for updates and feedback.
##### Example: Python plugin for WAL flush
```python
# This is the basic structure of the Python code that would be a plugin.
# After this Python exmaple there are instructions below for how to interact
# with the server to test it out, load it in, and set it to trigger on
# writes to either a specific DB or a specific table within a DB. When you
# define the trigger you can provide arguments to it. This will allow you to
# set things like monitoring thresholds, environment variables to look up,
# host names or other things that your generic plugin can use.
# This is the basic structure for Python plugin code that runs in the
# InfluxDB 3 Processing engine.
# you define a function with this exact signature. every time the wal gets
# flushed (once per second by default), you will get the writes either from
# the table you triggered the plugin to or every table in the database that
# you triggered it to
# When creating a trigger, you can provide runtime arguments to your plugin,
# allowing you to write generic code that uses variables such as monitoring
# thresholds, environment variables, and host names.
#
# Use the following exact signature to define a function for the WAL flush
# trigger.
# When you create a trigger for a WAL flush plugin, you specify the database
# and tables that the plugin receives written data from on every WAL flush
# (default is once per second).
def process_writes(influxdb3_local, table_batches, args=None):
# here you can see logging. for now this won't do anything, but soon
# we'll capture this so you can query it from system tables
@ -533,44 +670,99 @@ def process_writes(influxdb3_local, table_batches, args=None):
influxdb3_local.info("done")
```
Then you'll want to drop a file into that plugin directory. You can use the example from above, but comment out the section where it queries (unless you write some data to that table, in which case leave it in!).
##### Test a plugin on the server
To use the server to test what a plugin will do, in advance of actually loading it into the server or creating a trigger that calls it, enter the following command:
Test your InfluxDB 3 plugin safely without affecting written data. During a plugin test:
`influxdb3 test wal_plugin -h`
- A query executed by the plugin queries against the server you send the request to.
- Writes aren't sent to the server but are returned to you.
The important arguments are `lp` or `file`, which read line protocol from that file and yield it as a test to your new plugin.
To test a plugin, do the following:
`--input-arguments` are key/value pairs separated by commas--for example:
1. Create a _plugin directory_--for example, `/path/to/.influxdb/plugins`
2. Make the plugin directory available to the Docker container (for example, using a bind mount)
3. Run the Docker command to [start the server](#start-influxdb) and include the `--plugin-dir` option with your plugin directory path.
4. Save the [preceding example code](#example-python-plugin) to a plugin file inside of the plugin directory. If you haven't yet written data to the table in the example, comment out the lines where it queries.
5. To run the test, enter the following command with the following options:
- `--lp` or `--file`: The line protocol to test
- Optional: `--input-arguments`: A comma-delimited list of `<KEY>=<VALUE>` arguments for your plugin code
```bash
influxdb3 test wal_plugin \
--lp <INPUT_LINE_PROTOCOL> \
--input-arguments "arg1=foo,arg2=bar" \
--database <DATABASE_NAME> \
<PLUGIN_FILENAME>
```
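The `--input-arguments` format described above (a comma-delimited list of `<KEY>=<VALUE>` pairs) can be pictured as a simple mapping. The following is a minimal Python sketch of that format only; `parse_input_arguments` is a hypothetical helper written for illustration, not the server's parser, and the assumption that your plugin sees the pairs as a dictionary-like `args` value is ours.

```python
def parse_input_arguments(raw):
    """Split 'k1=v1,k2=v2' into {'k1': 'v1', 'k2': 'v2'}.

    Hypothetical helper: illustrates the KEY=VALUE format only.
    """
    args = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            # Ignore empty segments, such as a trailing comma.
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        args[key] = value
    return args


print(parse_input_arguments("arg1=foo,arg2=bar"))
```

Checking an argument string this way before passing it to the CLI can catch a missing `=` early.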
The command runs the plugin code with the test data, yields the data to the plugin code, and then responds with the plugin result.
You can quickly see how the plugin behaves, what data it would have written to the database, and any errors.
You can then edit your Python code in the plugins directory, and rerun the test.
The server reloads the file for every request to the `test` API.
For more information, see [`influxdb3 test wal_plugin`](/influxdb3/core/reference/cli/influxdb3/test/wal_plugin/) or run `influxdb3 test wal_plugin -h`.
With the plugin code inside the server plugin directory, and a successful test,
you're ready to create a plugin and a trigger to run on the server.
##### Example: Test, create, and run a plugin
The following example shows how to test a plugin, and then create the plugin and
trigger:
```bash
--input-arguments "arg1=foo,arg2=bar"
# Test and create a plugin
# Requires:
# - A database named `mydb` with a table named `foo`
# - A Python plugin file named `test.py`
# Test a plugin
influxdb3 test wal_plugin \
--lp="my_measure,tag1=asdf f1=1.0 123" \
-d mydb \
--input-arguments="arg1=hello,arg2=world" \
test.py
```
If you execute a query within the plugin, it will query against the live server you're sending this request to. Any writes you do will not be sent into the server, but instead returned back to you.
This will let you see what a plugin would have written back without actually doing it. It will also let you quickly spot errors, change your python file in the plugins directory, and then run the test again. The server will reload the file on every request to the test API.
Once you've done that, you can create the plugin through the command shown above. Then you'll have to create trigger to have it be active and run with data as you write it into the server.
Here's an example of each of the three commands being run:
```
influxdb3 test wal_plugin --lp="my_measure,tag1=asdf f1=1.0 123" -d mydb --input-arguments="arg1=hello,arg2=world" test.py
# make sure you've created mydb first
influxdb3 create plugin -d mydb --code-filename="/Users/pauldix/.influxdb3/plugins/test.py" test_plugin
influxdb3 create trigger -d mydb --plugin=test_plugin --trigger-spec="table:foo" trigger1
```bash
# Create a plugin to run
influxdb3 create plugin \
-d mydb \
--code-filename="/path/to/.influxdb3/plugins/test.py" \
test_plugin
```
After you've tested it, you can create the plugin in the server (the file will need to be there in the plugin-dir) and then create a trigger to trigger it on WAL flushes.
```bash
# Create a trigger that runs the plugin
influxdb3 create trigger \
-d mydb \
--plugin=test_plugin \
--trigger-spec="table:foo" \
--trigger-arguments="arg1=hello,arg2=world" \
trigger1
```
After you have created a plugin and trigger, enter the following command to
enable the trigger and have it run the plugin as you write data:
```bash
influxdb3 enable trigger --database mydb trigger1
```
For more information, see the following:
- [`influxdb3 test wal_plugin`](/influxdb3/core/influxdb3-cli/test/wal_plugin/)
- [`influxdb3 create plugin`](/influxdb3/core/influxdb3-cli/create/plugin/)
- [`influxdb3 create trigger`](/influxdb3/core/influxdb3-cli/create/trigger/)
### Diskless architecture
InfluxDB 3 is able to operate using only object storage with no locally attached disk. While it can use only a disk with no dependencies, the ability to operate without one is a new capability with this release. The figure below illustrates the write path for data landing in the database.
InfluxDB 3 is able to operate using only object storage with no locally attached disk.
While it can use only a disk with no dependencies, the ability to operate without one is a new capability with this release. The figure below illustrates the write path for data landing in the database.
{{< img-hd src="/img/influxdb/influxdb-3-write-path.png" alt="Write Path for InfluxDB 3 Core & Enterprise" />}}
As write requests come in to the server, they are parsed and validated and put into an in-memory WAL buffer. This buffer is flushed every second by default (can be changed through configuration), which will create a WAL file. Once the data is flushed to disk, it is put into a queryable in-memory buffer and then a response is sent back to the client that the write was successful. That data will now show up in queries to the server.
As write requests come in to the server, they are parsed, validated, and put into an in-memory WAL buffer. This buffer is flushed every second by default (can be changed through configuration), which will create a WAL file. Once the data is flushed to disk, it is put into a queryable in-memory buffer and then a response is sent back to the client that the write was successful. That data will now show up in queries to the server.
InfluxDB periodically snapshots the WAL to persist the oldest data in the queryable buffer, allowing the server to remove old WAL files. By default, the server will keep up to 900 WAL files buffered up (15 minutes of data) and attempt to persist the oldest 10 minutes, keeping the most recent 5 minutes around.
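The defaults in the preceding paragraph follow from simple arithmetic: one WAL file per flush, one flush per second. The sketch below works through those numbers; the function and constant names are hypothetical and chosen for illustration, not actual server configuration options.

```python
# Defaults taken from the text above; the names are illustrative only.
FLUSH_INTERVAL_SECONDS = 1    # one WAL file per flush
MAX_BUFFERED_WAL_FILES = 900  # files kept before a snapshot
PERSIST_MINUTES = 10          # oldest data persisted on snapshot


def wal_window(flush_interval, max_files, persist_minutes):
    """Return how much data is buffered, persisted, and retained."""
    buffered_seconds = max_files * flush_interval
    persisted_files = (persist_minutes * 60) // flush_interval
    retained_files = max_files - persisted_files
    return {
        "buffered_minutes": buffered_seconds / 60,
        "persisted_files": persisted_files,
        "retained_files": retained_files,
        "retained_minutes": retained_files * flush_interval / 60,
    }


print(wal_window(FLUSH_INTERVAL_SECONDS, MAX_BUFFERED_WAL_FILES, PERSIST_MINUTES))
```

With the defaults, 900 files cover 15 minutes, a snapshot persists the oldest 600 files, and the most recent 300 files (5 minutes) stay in the buffer.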

View File

@ -1,6 +1,6 @@
## Get started with {{% product-name %}}
InfluxDB is a database built to collect, process, transform, and store event and time series data. It is ideal for use cases that require real-time ingest and fast query response times to build user interfaces, monitoring, and automation solutions.
InfluxDB is a database built to collect, process, transform, and store event and time series data, and is ideal for use cases that require real-time ingest and fast query response times to build user interfaces, monitoring, and automation solutions.
Common use cases include:
@ -27,7 +27,7 @@ The Enterprise version adds onto Core's functionality with:
* Historical query capability and single series indexing
* High availability
* Read replicas
* Enhanced security
* Enhanced security (coming soon)
* Row-level delete support (coming soon)
* Integrated admin UI (coming soon)
@ -40,7 +40,7 @@ This guide covers Enterprise as well as InfluxDB 3 Core, including the following
* [Write data to the database](#write-data)
* [Query the database](#query-the-database)
* [Last values cache](#last-values-cache)
* [Distinct Values Cache](#distinct-values-cache)
* [Distinct values cache](#distinct-values-cache)
* [Python plugins and the processing engine](#python-plugins-and-the-processing-engine)
* [Diskless architecture](#diskless-architecture)
* [Multi-server setups](#multi-server-setup)
@ -48,38 +48,24 @@ This guide covers Enterprise as well as InfluxDB 3 Core, including the following
### Install and startup
{{% product-name %}} runs on **Linux**, **macOS**, and **Windows**.
[Run the install script](#run-the-install-script) to get started quickly,
regardless of your operating system.
Or, if you prefer, you can download and install {{% product-name %}} from [build artifacts and Docker images](#optional-download-build-artifacts-and-docker-images).
#### Run the install script
Enter the following command to use [curl](https://curl.se/download.html) to download the script and install {{% product-name %}}, for MacOS and Linux operating systems:
{{% tabs-wrapper %}}
{{% tabs %}}
[Linux or macOS](#linux-or-macos)
[Windows](#windows)
[Docker (x86)](#docker-x86)
{{% /tabs %}}
{{% tab-content %}}
<!--------------- BEGIN LINUX AND MACOS -------------->
To get started quickly, download and run the install script--for example, using [curl](https://curl.se/download.html):
```bash
curl -O https://www.influxdata.com/d/install_influxdb3.sh && sh install_influxdb3.sh enterprise
curl -O https://www.influxdata.com/d/install_influxdb3.sh \
&& sh install_influxdb3.sh enterprise
```
To verify that the download and installation completed successfully, run the following command:
Or, download and install [build artifacts](/influxdb3/enterprise/install/#download-influxdb-3-enterprise-binaries):
```bash
influxdb3 --version
```
If your system doesn't locate `influxdb3`, then `source` the configuration file (for example, .bashrc, .zshrc) for your shell--for example:
```zsh
source ~/.zshrc
```
#### Optional: Download build artifacts and Docker images
Download the latest build artifacts — including Windows — and Docker images from the links below. These are updated with every merge into `main`.
##### {{% product-name %}} (latest):
- Docker: [quay.io/influxdb/influxdb3-enterprise:latest](quay.io/influxdb/influxdb3-enterprise:latest)
- [Linux | x86_64 | GNU](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_x86_64-unknown-linux-gnu.tar.gz)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_x86_64-unknown-linux-gnu.tar.gz.sha256)
@ -95,40 +81,120 @@ Download the latest build artifacts — including Windows — and Docker images
- [macOS | ARM64](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_aarch64-apple-darwin.tar.gz)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_aarch64-apple-darwin.tar.gz.sha256)
- [Windows | x86_64](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_x86_64-pc-windows-gnu.tar.gz)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_x86_64-pc-windows-gnu.tar.gz.sha256)
> [!Note]
> macOS Intel builds are coming soon.
<!--------------- END LINUX AND MACOS -------------->
{{% /tab-content %}}
{{% tab-content %}}
<!--------------- BEGIN WINDOWS -------------->
Download and install the {{% product-name %}} [Windows (x86) binary](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_x86_64-pc-windows-gnu.tar.gz)
[sha256](https://dl.influxdata.com/influxdb/snapshots/influxdb3-enterprise_x86_64-pc-windows-gnu.tar.gz.sha256)
<!--------------- END WINDOWS -------------->
{{% /tab-content %}}
{{% tab-content %}}
<!--------------- BEGIN DOCKER -------------->
Pull the [`influxdb3-enterprise` image](https://quay.io/repository/influxdb/influxdb3-enterprise?tab=tags&tag=latest):
```bash
docker pull quay.io/influxdb/influxdb3-enterprise:latest
```
<!--------------- END DOCKER -------------->
{{% /tab-content %}}
{{% /tabs-wrapper %}}
_Build artifacts and images update with every merge into the {{% product-name %}} `main` branch._
#### Verify the install
After you have installed {{% product-name %}}, enter the following command to verify that it completed successfully:
```bash
influxdb3 --version
```
If your system doesn't locate `influxdb3`, then `source` the configuration file (for example, .bashrc, .zshrc) for your shell--for example:
```zsh
source ~/.zshrc
```
#### Start InfluxDB
To start your InfluxDB instance, use the `influxdb3 serve` command
and provide an object store configuration and a unique `writer-id`.
and provide the following:
- `--object-store`: InfluxDB supports various storage options, including the local file system, memory, S3 (and compatible services like Ceph or Minio), Google Cloud Storage, and Azure Blob Storage.
- `--writer-id`: This string identifier determines the path under which all files written by this instance will be stored in the configured storage location.
- `--object-store`: Specifies the type of Object store to use. InfluxDB supports the following: local file system (`file`), `memory`, S3 (and compatible services like Ceph or Minio) (`s3`), Google Cloud Storage (`google`), and Azure Blob Storage (`azure`).
- `--writer-id`: A string identifier that determines the server's storage path within the configured storage location, and, in a multi-node setup, is used to reference the node.
The following examples show how to start InfluxDB with different object store configurations:
The following examples show how to start InfluxDB 3 with different object store configurations:
```bash
# MEMORY
# Stores data in RAM; doesn't persist data
influxdb3 serve --writer-id=local01 --object-store=memory
```
```bash
# FILESYSTEM
influxdb3 serve --writer-id=local01 --object-store=file --data-dir ~/.influxdb3
# Provide the filesystem directory
influxdb3 serve \
--writer-id=local01 \
--object-store=file \
--data-dir ~/.influxdb3
```
To run the [Docker image](/influxdb3/enterprise/install/#docker-image) and persist data to the filesystem, mount a volume for the Object store--for example, pass the following options:
- `-v /path/on/host:/path/in/container`: Mounts a directory from your filesystem to the container
- `--object-store file --data-dir /path/in/container`: Uses the mount for server storage
```bash
# FILESYSTEM USING DOCKER
# Create a mount
# Provide the mount path
docker run -it \
-v /path/on/host:/path/in/container \
quay.io/influxdb/influxdb3-enterprise:latest serve \
--writer-id my_host \
--object-store file \
--data-dir /path/in/container
```
```bash
# S3 (defaults to us-east-1 for region)
# Specify the Object store type and associated options
influxdb3 serve --writer-id=local01 --object-store=s3 --bucket=[BUCKET] --aws-access-key=[AWS ACCESS KEY] --aws-secret-access-key=[AWS SECRET ACCESS KEY]
```
```bash
# Minio/Open Source Object Store (Uses the AWS S3 API, with additional parameters)
# Specify the Object store type and associated options
influxdb3 serve --writer-id=local01 --object-store=s3 --bucket=[BUCKET] --aws-access-key=[AWS ACCESS KEY] --aws-secret-access-key=[AWS SECRET ACCESS KEY] --aws-endpoint=[ENDPOINT] --aws-allow-http
```
_For more information about server options, run `influxdb3 serve --help`._
> [!Important]
> #### Stopping the Docker container
>
> Currently, a bug prevents using `Ctrl-c` to stop an InfluxDB 3 container.
> Use the `docker kill` command to stop the container:
>
> 1. Enter the following command to find the container ID:
> ```bash
> docker ps -a
> ```
> 2. Enter the command to stop the container:
> ```bash
> docker kill <CONTAINER_ID>
> ```
#### Licensing
When starting {{% product-name %}} for the first time, it prompts you to enter an email address for verification. You will receive an email with a verification link.
@ -151,24 +217,22 @@ InfluxDB is a schema-on-write database. You can start writing data and InfluxDB
After a schema is created, InfluxDB validates future write requests against it before accepting the data.
Subsequent requests can add new fields on-the-fly, but can't add new tags.
**Note**: write requests to the database _don't_ return until a WAL file has been flushed to the configured object store, which by default happens once per second.
This means that individual write requests may not complete quickly, but you can make many concurrent requests to get higher total throughput. In the future, we will add an API parameter to make requests that do not wait for the WAL flush to return.
The database has three write API endpoints that respond to HTTP `POST` requests:
* `/write?db=mydb,precision=ns`
* `/api/v2/write?db=mydb,precision=ns`
* `/api/v3/write?db=mydb,precision=ns`
* `/write?db=mydb&precision=ns`
* `/api/v2/write?bucket=mydb&precision=ns`
* `/api/v3/write_lp?db=mydb&precision=ns&accept_partial=true`
{{% product-name %}} provides the `/write` and `/api/v2` endpoints for backward compatibility with clients that can write data to previous versions of InfluxDB.
{{% product-name %}} provides the `/write` and `/api/v2/write` endpoints for backward compatibility with clients that can write data to previous versions of InfluxDB.
However, these APIs differ from the APIs in the previous versions in the following ways:
- Tags in a table (measurement) are _immutable_
- A tag and a field can't have the same name within a table.
The `/api/v3/write` endpoint accepts the same line protocol syntax as previous versions, and brings new functionality that lets you accept or reject partial writes using the `accept_partial` parameter (`true` is default).
{{% product-name %}} adds the `/api/v3/write_lp` endpoint, which accepts the same line protocol syntax as previous versions, and supports an `?accept_partial=<BOOLEAN>` parameter, which
lets you accept or reject partial writes (default is `true`).
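The three endpoints differ mainly in path and in the name of the database parameter (`db` versus `bucket`). As a rough sketch, the following Python builds each URL with the standard library; `write_url` is a hypothetical helper, and the `http://localhost:8181` base is an assumption based on the default host used elsewhere in this guide.

```python
from urllib.parse import urlencode


def write_url(base, path, **params):
    """Join a base URL, an endpoint path, and URL-encoded query parameters."""
    return f"{base}{path}?{urlencode(params)}"


BASE = "http://localhost:8181"  # assumed default host and port

v1_url = write_url(BASE, "/write", db="mydb", precision="ns")
v2_url = write_url(BASE, "/api/v2/write", bucket="mydb", precision="ns")
v3_url = write_url(
    BASE, "/api/v3/write_lp", db="mydb", precision="ns", accept_partial="true"
)

for url in (v1_url, v2_url, v3_url):
    print(url)
```

Note that query parameters are joined with `&`, not commas.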
The following code block is an example of [line protocol](/influxdb3/enterprise/reference/syntax/line-protocol/), which shows the table name followed by tags, which are an ordered, comma-separated list of key/value pairs where the values are strings, followed by a comma-separated list of key/value pairs that are the fields, and ending with an optional timestamp. The timestamp by default is a nanosecond epoch, but you can specify a different precision through the `precision` query parameter.
The following code block is an example of [line protocol](/influxdb3/core/reference/syntax/line-protocol/), which shows the table name followed by tags, which are an ordered, comma-separated list of key/value pairs where the values are strings, followed by a comma-separated list of key/value pairs that are the fields, and ending with an optional timestamp. The timestamp by default is a nanosecond epoch, but you can specify a different precision through the `precision` query parameter.
```
cpu,host=Alpha,region=us-west,application=webserver val=1i,usage_percent=20.5,status="OK"
@ -185,8 +249,55 @@ If you save the preceding line protocol to a file (for example, `server_data`),
influxdb3 write --database=mydb --file=server_data
```
The written data goes into WAL files, created once per second, and into an in-memory queryable buffer. Later, InfluxDB snapshots the WAL and persists the data into object storage as Parquet files.
We'll cover the [diskless architecture](#diskless-architecture) later in this document.
The following examples show how to write data using `curl` and the `/api/v3/write_lp` HTTP endpoint.
To show the difference between accepting and rejecting partial writes, line `2` in the example contains a `string` value for a `float` field (`temp=hi`).
##### Partial write of line protocol occurred
With `accept_partial=true`:
```
* upload completely sent off: 59 bytes
< HTTP/1.1 400 Bad Request
< transfer-encoding: chunked
< date: Wed, 15 Jan 2025 19:35:36 GMT
<
* Connection #0 to host localhost left intact
{"error":"partial write of line protocol occurred","data":[{"original_line":"dquote> home,room=Sunroom temp=hi","line_number":2,"error_message":"No fields were provided"}]}%
```
Line `1` is written and queryable.
The response is an HTTP error (`400`) status, and the response body contains `partial write of line protocol occurred` and details about the problem line.
##### Parsing failed for write_lp endpoint
With `accept_partial=false`:
```
> curl -v -XPOST "localhost:8181/api/v3/write_lp?db=sensors&precision=auto&accept_partial=false" \
--data-raw "home,room=Sunroom temp=96
dquote> home,room=Sunroom temp=hi"
< HTTP/1.1 400 Bad Request
< transfer-encoding: chunked
< date: Wed, 15 Jan 2025 19:28:27 GMT
<
* Connection #0 to host localhost left intact
{"error":"parsing failed for write_lp endpoint","data":{"original_line":"dquote> home,room=Sunroom temp=hi","line_number":2,"error_message":"No fields were provided"}}%
```
Neither line is written to the database.
The response is an HTTP error (`400`) status, and the response body contains `parsing failed for write_lp endpoint` and details about the problem line.
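In the two sample responses above, the `data` field is a list for a partial write and a single object for a rejected write. A client that handles both might normalize them as in this sketch; `problem_lines` is a hypothetical helper, and the response shapes are assumed from the two samples shown here rather than from an API specification.

```python
import json

# Response bodies copied from the examples above.
PARTIAL_BODY = (
    '{"error":"partial write of line protocol occurred","data":'
    '[{"original_line":"dquote> home,room=Sunroom temp=hi",'
    '"line_number":2,"error_message":"No fields were provided"}]}'
)
REJECTED_BODY = (
    '{"error":"parsing failed for write_lp endpoint","data":'
    '{"original_line":"dquote> home,room=Sunroom temp=hi",'
    '"line_number":2,"error_message":"No fields were provided"}}'
)


def problem_lines(body):
    """Return (error, [(line_number, error_message), ...]) for a 400 body."""
    payload = json.loads(body)
    data = payload.get("data", [])
    if isinstance(data, dict):
        # The rejected-write sample reports a single object, not a list.
        data = [data]
    details = [(item["line_number"], item["error_message"]) for item in data]
    return payload.get("error"), details


print(problem_lines(PARTIAL_BODY))
print(problem_lines(REJECTED_BODY))
```

Because both cases return HTTP `400`, the `error` string is what tells the client whether any lines were written.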
##### Data durability
Written data goes into WAL files, created once per second, and into an in-memory queryable buffer. Later, InfluxDB snapshots the WAL and persists the data into object storage as Parquet files.
We cover the [diskless architecture](#diskless-architecture) later in this guide.
> [!Note]
> ##### Write requests return after WAL flush
> Write requests to the database _don't_ return until a WAL file has been flushed to the configured object store, which by default happens once per second.
> Individual write requests might not complete quickly, but you can make many concurrent requests to get higher total throughput.
> In the future, we will add an API parameter that lets requests return without waiting for the WAL flush.
#### Create a Database or Table
@ -215,12 +326,12 @@ The `query` subcommand includes options to help ensure that the right database i
| Option | Description | Required |
|---------|-------------|--------------|
| `--host` | The host URL of the running {{% product-name %}} server [default: http://127.0.0.1:8181] | No |
| `--host` | The host URL of the server [default: `http://127.0.0.1:8181`] to query | No |
| `--database` | The name of the database to operate on | Yes |
| `--token` | The token for authentication with the {{% product-name %}} server | No |
| `--language` | The query language used to format the provided query string [default: sql] [possible values: sql, influxql] | No |
| `--format` | The format in which to output the query [default: pretty] [possible values: pretty, json, json_lines, csv, parquet] | No |
| `--output` | Put all query output into `output` | No |
| `--token` | The authentication token for the {{% product-name %}} server | No |
| `--language` | The query language of the provided query string [default: `sql`] [possible values: `sql`, `influxql`] | No |
| `--format` | The format in which to output the query [default: `pretty`] [possible values: `pretty`, `json`, `json_lines`, `csv`, `parquet`] | No |
| `--output` | The path to output data to | No |
#### Example: query `“SHOW TABLES”` on the `servers` database:
@ -291,7 +402,8 @@ curl -v "http://127.0.0.1:8181/api/v3/query_sql?db=servers&q=select+*+from+cpu+l
The following example sends an HTTP `POST` request with parameters in a JSON payload:
```bash
curl http://127.0.0.1:8181/api/v3/query_sql --data '{"db": "server", "q": "select * from cpu limit 5"}'
curl http://127.0.0.1:8181/api/v3/query_sql \
--data '{"db": "server", "q": "select * from cpu limit 5"}'
```
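The same `POST` request can be prepared from Python with only the standard library. This is a sketch under assumptions: `build_query_request` is a hypothetical helper, and the explicit `Content-Type` header is our addition (the `curl` example above doesn't set one).

```python
import json
import urllib.request


def build_query_request(db, sql, host="http://127.0.0.1:8181"):
    """Prepare (but don't send) a POST request to the SQL query endpoint."""
    body = json.dumps({"db": db, "q": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/api/v3/query_sql",
        data=body,
        # Assumption: the curl example above doesn't send this header.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request("servers", "select * from cpu limit 5")
print(request.get_method(), request.full_url)

# To send the request to a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect before it reaches the server.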
### Query using the Python client
@ -349,7 +461,7 @@ print(table.group_by('cpu').aggregate([('time_system', 'mean')]))
For more information about the Python client library, see the [`influxdb3-python` repository](https://github.com/InfluxCommunity/influxdb3-python) in GitHub.
## Last values cache
### Last values cache
{{% product-name %}} supports a **last-n values cache** which stores the last N values in a series or column hierarchy in memory. This gives the database the ability to answer these kinds of queries in under 10 milliseconds.
You can use the `influxdb3` CLI to create a last value cache.
@ -387,26 +499,26 @@ An example of creating this cache in use:
influxdb3 create last-cache --database=servers --table=cpu --cache-name=cpuCache --key-columns=host,application --value-columns=usage_percent,status --count=5
```
### Querying a Last values cache
#### Query a Last values cache
To leverage the LVC, you need to specifically call on it using the `last_cache()` function. An example of this type of query:
To leverage the LVC, call it using the `last_cache()` function in your query--for example:
```bash
influxdb3 query --database=servers "SELECT * FROM last_cache('cpu', 'cpuCache') WHERE host = 'Bravo';"
```
Usage: $ influxdb3 query --database=servers "SELECT * FROM last_cache('cpu', 'cpuCache') WHERE host = 'Bravo;"
```
{{% note %}}
#### Only works with SQL
The Last Value Cache only works with SQL, not InfluxQL; SQL is the default language.
The Last values cache only works with SQL, not InfluxQL; SQL is the default language.
{{% /note %}}
### Deleting a Last values cache
#### Deleting a Last values cache
Removing a Last values cache is also easy and straightforward, with the instructions below.
To remove a Last values cache, use the following command:
```
Usage: influxdb3 delete delete [OPTIONS] -d <DATABASE_NAME> -t <TABLE> --cache-name <CACHE_NAME>
```bash
influxdb3 delete last_cache [OPTIONS] -d <DATABASE_NAME> -t <TABLE> --cache-name <CACHE_NAME>
Options:
-h, --host <HOST_URL> Host URL of the running InfluxDB 3 server
@ -417,7 +529,7 @@ Options:
--help Print help information
```
## Distinct Values Cache
### Distinct values cache
Similar to the Last values cache, the database can cache in RAM the distinct values for a single column in a table or a hierarchy of columns. This is useful for fast metadata lookups, which can return in under 30 milliseconds. Many of the options are similar to the last value cache. See the CLI output for more information:
@ -425,52 +537,67 @@ Similar to the Last values cache, the database can cache in RAM the distinct val
influxdb3 create distinct_cache -h
```
### Python Plugins and the Processing Engine
### Python plugins and the Processing engine
{{% note %}}
#### Only supported in Docker
> [!Important]
> #### Processing engine only works with Docker
>
> The Processing engine is currently supported only in Docker x86 environments. Non-Docker support is coming soon. The engine, API, and developer experience are actively evolving and may change. Join our [Discord](https://discord.gg/9zaNCW2PRT) for updates and feedback.
As of this writing, the Processing Engine is only supported in Docker environments.
We expect it to launch in non-Docker environments soon. We're still in very active development creating the API and developer experience; things will break and change fast. Join our <a href="https://discord.gg/9zaNCW2PRT">Discord</a> to ask questions and give feedback.
{{% /note %}}
The InfluxDB 3 Processing engine is an embedded Python VM for running code inside the database to process and transform data.
InfluxDB3 has an embedded Python VM for running code inside the database. Currently, we only support plugins that get triggered on WAL file flushes, but more will be coming soon. Specifically, plugins will be able to be triggered by:
To use the Processing engine, you create [plugins](#plugin) and [triggers](#trigger).
* On WAL flush: sends a batch of write data to a plugin once a second (can be configured).
* On Snapshot (persist of Parquet files): sends the metadata to a plugin to do further processing against the Parquet data or send the information elsewhere (for example, adding it to an Iceberg Catalog).
* On Schedule: executes plugin on a schedule configured by the user, and is useful for data collection and deadman monitoring.
* On Request: binds a plugin to an HTTP endpoint at `/api/v3/plugins/<name>` where request headers and content are sent to the plugin, which can then parse, process, and send the data into the database or to third party services
#### Plugin
Plugins work in two parts: plugins and triggers. Plugins are the generic Python code that represent a plugin. Once you've loaded a plugin into the server, you can create many triggers of that plugin. A trigger has a plugin, a database and then a trigger-spec, which can be either all_tables or table:my_table_name where my_table_name is the name of your table you want to filter the plugin to.
A plugin is a Python function that has a signature compatible with one of the [trigger types](#trigger-types).
The [`influxdb3 create plugin`](/influxdb3/enterprise/influxdb3-cli/create/plugin/) command loads a Python plugin file into the server.
You can also specify a list of key/value pairs as arguments supplied to a trigger. This makes it so that you could have many triggers of the same plugin, but with different arguments supplied to check for different things. These commands will give you useful information:
#### Trigger
```
influxdb3 create plugin -h
influxdb3 create trigger -h
```
After you load a plugin into an InfluxDB 3 server, you can create one or more
triggers associated with the plugin.
When you create a trigger, you specify a plugin, a database, optional runtime arguments,
and a trigger-spec, which specifies `all_tables` or `table:my_table_name` (for filtering data sent to the plugin).
When you _enable_ a trigger, the server executes the plugin code according to the
plugin signature.
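The two trigger-spec forms named above, `all_tables` and `table:my_table_name`, can be told apart with a few lines of Python. This sketch only illustrates the string format; `parse_trigger_spec` is a hypothetical helper and doesn't reflect how the server validates a spec.

```python
def parse_trigger_spec(spec):
    """Return None for 'all_tables', or the table name for 'table:<name>'."""
    if spec == "all_tables":
        return None
    prefix = "table:"
    if spec.startswith(prefix) and len(spec) > len(prefix):
        return spec[len(prefix):]
    raise ValueError(f"unsupported trigger-spec: {spec!r}")


print(parse_trigger_spec("all_tables"))
print(parse_trigger_spec("table:foo"))
```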
##### Trigger types
InfluxDB 3 provides the following types of triggers:
- **On WAL flush**: Sends the batch of write data to a plugin once a second (configurable).
> [!Note]
> #### Plugins only work with x86 Docker
> For now, plugins only work with the x86 Docker image.
> Currently, only the **WAL flush** trigger is supported, but more are on the way:
>
> - **On Snapshot**: Sends metadata to a plugin for further processing against the Parquet data or to send the information elsewhere (for example, to an Iceberg Catalog). _Not yet available._
> - **On Schedule**: Executes a plugin on a user-configured schedule, useful for data collection and deadman monitoring. _Not yet available._
> - **On Request**: Binds a plugin to an HTTP endpoint at `/api/v3/plugins/<name>`. _Not yet available._
> The plugin receives the HTTP request headers and content, and can then parse, process, and send the data into the database or to third-party services.
Before we try to load up a plugin and create a trigger for it, we should write one and test it out. To test out and run plugins, you'll need to create a plugin directory. Start up your server with the --plugin-dir argument and point it at your plugin dir (note that you'll need to make this available in your Docker container).
### Test, create, and trigger plugin code
Have a look at this example Python plugin file:
> [!Important]
> #### Processing engine only works with Docker
>
> The Processing engine is currently supported only in Docker x86 environments. Non-Docker support is coming soon. The engine, API, and developer experience are actively evolving and may change. Join our [Discord](https://discord.gg/9zaNCW2PRT) for updates and feedback.
##### Example: Python plugin for WAL flush
```python
# This is the basic structure of the Python code that would be a plugin.
# After this Python example there are instructions below for how to interact
# with the server to test it out, load it in, and set it to trigger on
# writes to either a specific DB or a specific table within a DB. When you
# define the trigger you can provide arguments to it. This will allow you to
# set things like monitoring thresholds, environment variables to look up,
# host names or other things that your generic plugin can use.
# This is the basic structure for Python plugin code that runs in the
# InfluxDB 3 Processing engine.
# you define a function with this exact signature. every time the wal gets
# flushed (once per second by default), you will get the writes either from
# the table you triggered the plugin to or every table in the database that
# you triggered it to
# When creating a trigger, you can provide runtime arguments to your plugin,
# allowing you to write generic code that uses variables such as monitoring
# thresholds, environment variables, and host names.
#
# Use the following exact signature to define a function for the WAL flush
# trigger.
# When you create a trigger for a WAL flush plugin, you specify the database
# and tables that the plugin receives written data from on every WAL flush
# (default is once per second).
def process_writes(influxdb3_local, table_batches, args=None):
    # here you can see logging. for now this won't do anything, but soon
    # we'll capture this so you can query it from system tables
@ -531,36 +658,91 @@ def process_writes(influxdb3_local, table_batches, args=None):
    influxdb3_local.info("done")
```
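To experiment with the plugin structure outside the server, the following minimal sketch drives `process_writes` with a stand-in object. `FakeLocal`, its `messages` list, and the shape of `table_batches` (a list of dictionaries with `table_name` and `rows` keys) are assumptions made for illustration; only the function signature and the `info` call come from the preceding example.

```python
# Hypothetical stand-in for the object the server passes as `influxdb3_local`.
# Only `info` is modeled because it is the only method shown in the example.
class FakeLocal:
    def __init__(self):
        self.messages = []

    def info(self, *parts):
        # Record log output so it can be inspected after the call.
        self.messages.append(" ".join(str(p) for p in parts))


def process_writes(influxdb3_local, table_batches, args=None):
    # Assumed batch shape: {"table_name": str, "rows": list of dicts}.
    for batch in table_batches:
        influxdb3_local.info("table", batch["table_name"], "rows", len(batch["rows"]))
    # Trigger arguments are assumed to arrive as a dict of strings.
    if args and "arg1" in args:
        influxdb3_local.info("arg1", args["arg1"])
    influxdb3_local.info("done")


local = FakeLocal()
batches = [{"table_name": "foo", "rows": [{"tag1": "asdf", "f1": 1.0, "time": 123}]}]
process_writes(local, batches, {"arg1": "hello", "arg2": "world"})
print(local.messages)
```

A stub like this only checks the plugin's own control flow. The `influxdb3 test wal_plugin` command remains the way to see how the server actually invokes the plugin.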
Then you'll want to drop a file into that plugin directory. You can use the example from above, but comment out the section where it queries (unless you write some data to that table, in which case leave it in!).
##### Test a plugin on the server
To use the server to test what a plugin will do, in advance of actually loading it into the server or creating a trigger that calls it, enter the following command:
Use InfluxDB 3 to safely test a plugin before you load it, without touching written data.
During a plugin test:
`influxdb3 test wal_plugin -h`
- A query executed by the plugin queries against the server you send the request to.
- Writes aren't sent to the server but are returned to you.
The important arguments are `--lp` or `--file`, which supply line protocol (inline or read from a file) as test input to your new plugin.
To test a plugin, do the following:
`--input-arguments` are key/value pairs separated by commas--for example:
1. Create a _plugin directory_--for example, `/path/to/.influxdb/plugins`
2. Make the plugin directory available to the Docker container (for example, using a bind mount)
3. Run the Docker command to [start the server](#start-influxdb) and include the `--plugin-dir` option with your plugin directory path.
4. Save the [preceding example code](#example-python-plugin-for-wal-flush) to a plugin file inside of the plugin directory. If you haven't yet written data to the table in the example, comment out the lines where it queries.
5. To run the test, enter the following command with the following options:
- `--lp` or `--file`: The line protocol to test
- Optional: `--input-arguments`: A comma-delimited list of `<KEY>=<VALUE>` arguments for your plugin code
```bash
influxdb3 test wal_plugin \
--lp <INPUT_LINE_PROTOCOL> \
--input-arguments "arg1=foo,arg2=bar" \
--database <DATABASE_NAME> \
<PLUGIN_FILENAME>
```
The command runs the plugin code with the test data, yields the data to the plugin code, and then responds with the plugin result.
You can quickly see how the plugin behaves, what data it would have written to the database, and any errors.
You can then edit your Python code in the plugins directory, and rerun the test.
The server reloads the file for every request to the `test` API.
For more information, see [`influxdb3 test wal_plugin`](/influxdb3/enterprise/reference/cli/influxdb3/test/wal_plugin/) or run `influxdb3 test wal_plugin -h`.
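To make the `--input-arguments` format concrete, here is a small sketch of how a comma-delimited list of `<KEY>=<VALUE>` pairs maps to a dictionary. The helper `parse_input_arguments` is hypothetical and is not part of the CLI; how the server parses the option, and whether values reach the plugin as strings, are assumptions.

```python
def parse_input_arguments(raw):
    """Split 'k1=v1,k2=v2' into a dict of strings (hypothetical helper)."""
    result = {}
    for pair in raw.split(","):
        if not pair:
            # Tolerate empty segments such as a trailing comma.
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        result[key] = value
    return result


args = parse_input_arguments("arg1=foo,arg2=bar")
print(args)
```

Because every value in this sketch stays a string, plugin code that expects a number (for example, a monitoring threshold) would need to convert it explicitly.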
With the plugin code inside the server plugin directory, and a successful test,
you're ready to create a plugin and a trigger to run on the server.
##### Example: Test, create, and run a plugin
The following example shows how to test a plugin, and then create the plugin and
trigger:
```bash
--input-arguments "arg1=foo,arg2=bar"
# Test and create a plugin
# Requires:
# - A database named `mydb` with a table named `foo`
# - A Python plugin file named `test.py`
# Test a plugin
influxdb3 test wal_plugin \
--lp="my_measure,tag1=asdf f1=1.0 123" \
-d mydb \
--input-arguments="arg1=hello,arg2=world" \
test.py
```
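The `--lp` value in the test command is a single line of line protocol. As a rough illustration of its parts, the following sketch splits that exact line into measurement, tags, fields, and timestamp. The `split_line_protocol` helper is hypothetical and deliberately simplified: it assumes no escaped spaces or commas and no quoted string fields, and it leaves field values as text.

```python
def split_line_protocol(line):
    """Split one simple line-protocol line into its parts (no escaping support)."""
    # Layout: <measurement>[,<tag>=<value>...] <field>=<value>[,...] [<timestamp>]
    parts = line.split(" ")
    head, field_part = parts[0], parts[1]
    timestamp = int(parts[2]) if len(parts) > 2 else None

    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = dict(pair.split("=", 1) for pair in field_part.split(","))
    return measurement, tags, fields, timestamp


measurement, tags, fields, timestamp = split_line_protocol(
    "my_measure,tag1=asdf f1=1.0 123"
)
print(measurement, tags, fields, timestamp)
```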
If you execute a query within the plugin, it will query against the live server you're sending this request to. Any writes you do will not be sent into the server, but instead returned back to you.
This will let you see what a plugin would have written back without actually doing it. It will also let you quickly spot errors, change your python file in the plugins directory, and then run the test again. The server will reload the file on every request to the test API.
Once you've done that, you can create the plugin through the command shown above. Then you'll have to create a trigger to have it be active and run with data as you write it into the server.
Here's an example of each of the three commands being run:
```
influxdb3 test wal_plugin --lp="my_measure,tag1=asdf f1=1.0 123" -d mydb --input-arguments="arg1=hello,arg2=world" test.py
# make sure you've created mydb first
influxdb3 create plugin -d mydb --code-filename="/Users/pauldix/.influxdb3/plugins/test.py" test_plugin
influxdb3 create trigger -d mydb --plugin=test_plugin --trigger-spec="table:foo" trigger1
```bash
# Create a plugin to run
influxdb3 create plugin \
-d mydb \
--code-filename="/path/to/.influxdb3/plugins/test.py" \
test_plugin
```
After you've tested it, you can create the plugin in the server (the file must already be in the plugin directory) and then create a trigger to run it on WAL flushes.
```bash
# Create a trigger that runs the plugin
influxdb3 create trigger \
-d mydb \
--plugin=test_plugin \
--trigger-spec="table:foo" \
--trigger-arguments="arg1=hello,arg2=world" \
trigger1
```
After you have created a plugin and trigger, enter the following command to
enable the trigger and have it run the plugin as you write data:
```bash
influxdb3 enable trigger --database mydb trigger1
```
For more information, see the following:
- [`influxdb3 test wal_plugin`](/influxdb3/enterprise/influxdb3-cli/test/wal_plugin/)
- [`influxdb3 create plugin`](/influxdb3/enterprise/influxdb3-cli/create/plugin/)
- [`influxdb3 create trigger`](/influxdb3/enterprise/influxdb3-cli/create/trigger/)
### Diskless architecture
@ -568,7 +750,7 @@ InfluxDB 3 is able to operate using only object storage with no locally attached
{{< img-hd src="/img/influxdb/influxdb-3-write-path.png" alt="Write Path for InfluxDB 3 Core & Enterprise" />}}
As write requests come in to the server, they are parsed and validated and put into an in-memory WAL buffer. This buffer is flushed every second by default (can be changed through configuration), which will create a WAL file. Once the data is flushed to disk, it is put into a queryable in-memory buffer and then a response is sent back to the client that the write was successful. That data will now show up in queries to the server.
As write requests come in to the server, they are parsed, validated, and put into an in-memory WAL buffer. This buffer is flushed every second by default (can be changed through configuration), which will create a WAL file. Once the data is flushed to disk, it is put into a queryable in-memory buffer and then a response is sent back to the client that the write was successful. That data will now show up in queries to the server.
InfluxDB periodically snapshots the WAL to persist the oldest data in the queryable buffer, allowing the server to remove old WAL files. By default, the server will keep up to 900 WAL files buffered up (15 minutes of data) and attempt to persist the oldest 10 minutes, keeping the most recent 5 minutes around.
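The default figures above fit together as simple arithmetic. The following sketch works through them, assuming one WAL file per flush at the default one-second interval; the variable names are illustrative and are not configuration option names.

```python
FLUSH_INTERVAL_SECONDS = 1      # default flush interval: one WAL file per second
MAX_BUFFERED_WAL_FILES = 900    # default upper bound on buffered WAL files
PERSIST_MINUTES = 10            # oldest data the server attempts to persist

# 900 files * 1 second each = 900 seconds = 15 minutes of buffered data.
buffered_minutes = MAX_BUFFERED_WAL_FILES * FLUSH_INTERVAL_SECONDS / 60

# Persisting the oldest 10 minutes leaves the most recent 5 minutes buffered.
retained_minutes = buffered_minutes - PERSIST_MINUTES

# The same split expressed as WAL file counts.
persisted_files = PERSIST_MINUTES * 60 // FLUSH_INTERVAL_SECONDS
retained_files = MAX_BUFFERED_WAL_FILES - persisted_files

print(buffered_minutes, retained_minutes, persisted_files, retained_files)
```

If the flush interval is changed through configuration, the same number of files would cover a different span of time, so the minute figures apply only to the default.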