Port "Optimize writes" guide to Serverless and Dedicated (#4973)

* port optimize writes guide to serverless and dedicated

* updated dedicated lp guide
pull/4985/head^2
Scott Anderson 2023-06-21 09:55:11 -06:00 committed by GitHub
parent eef54e021c
commit 7246eb647b
6 changed files with 400 additions and 138 deletions

@ -0,0 +1,122 @@
---
title: Optimize writes to InfluxDB
description: >
Simple tips to improve performance and reduce system overhead when writing data to
InfluxDB Cloud Dedicated.
weight: 203
menu:
influxdb_cloud_dedicated:
name: Optimize writes
parent: write-best-practices
influxdb/cloud/tags: [best practices, write]
related:
- /resources/videos/ingest-data/, How to Ingest Data in InfluxDB (Video)
---
Use these tips to improve performance and reduce system overhead when writing data to InfluxDB.
- [Batch writes](#batch-writes)
- [Sort tags by key](#sort-tags-by-key)
- [Use the coarsest time precision possible](#use-the-coarsest-time-precision-possible)
- [Use gzip compression](#use-gzip-compression)
- [Synchronize hosts with NTP](#synchronize-hosts-with-ntp)
- [Write multiple data points in one request](#write-multiple-data-points-in-one-request)
{{% note %}}
The following tools write to InfluxDB and employ _most_ write optimizations by default:
- [Telegraf](/influxdb/cloud-dedicated/write-data/use-telegraf/)
- [InfluxDB client libraries](/influxdb/cloud-dedicated/reference/client-libraries/)
{{% /note %}}
## Batch writes
Write data in batches to minimize network overhead when writing data to InfluxDB.
{{% note %}}
The optimal batch size is 10,000 lines of line protocol or 10 MB,
whichever threshold is met first.
{{% /note %}}
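The batching rule above can be sketched in Python. This is a minimal illustration, not part of any InfluxDB client: the helper name `batch_lines` is hypothetical, and treating "10 MB" as 10,000,000 bytes is an assumption.

```python
from typing import Iterable, Iterator, List

MAX_LINES = 10_000            # line-count threshold from the note above
MAX_BYTES = 10 * 1000 * 1000  # assumption: "10 MB" means 10,000,000 bytes


def batch_lines(lines: Iterable[str],
                max_lines: int = MAX_LINES,
                max_bytes: int = MAX_BYTES) -> Iterator[List[str]]:
    """Yield batches, closing each one when either threshold is reached."""
    batch: List[str] = []
    size = 0
    for line in lines:
        line_size = len(line.encode("utf-8")) + 1  # +1 for the newline delimiter
        # Close the current batch if this line would push it past the byte limit.
        if batch and size + line_size > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(line)
        size += line_size
        # Close the batch as soon as it reaches the line limit.
        if len(batch) >= max_lines:
            yield batch
            batch, size = [], 0
    if batch:
        yield batch
```

Each yielded batch can then be joined with newlines and sent as one write request.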
## Sort tags by key
Before writing data points to InfluxDB, sort tags by key in lexicographic order.
_Verify sort results match results from the [Go `bytes.Compare` function](http://golang.org/pkg/bytes/#Compare)._
```sh
# Line protocol example with unsorted tags
measurement,tagC=therefore,tagE=am,tagA=i,tagD=i,tagB=think fieldKey=fieldValue 1562020262
# Optimized line protocol example with tags sorted by key
measurement,tagA=i,tagB=think,tagC=therefore,tagD=i,tagE=am fieldKey=fieldValue 1562020262
```
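One way to produce sorted tag sets programmatically is sketched below in Python. Comparing the keys as UTF-8 bytes gives a byte-wise lexicographic order, which is the ordering Go's `bytes.Compare` uses. The helper names `sort_tags` and `build_line` are hypothetical, and the sketch assumes keys and values are already escaped for line protocol.

```python
from typing import Dict


def sort_tags(tags: Dict[str, str]) -> str:
    """Return the tag set as 'k=v,...' with keys in byte-wise order."""
    # Encoding keys to bytes makes the comparison byte-wise, not locale-aware.
    items = sorted(tags.items(), key=lambda kv: kv[0].encode("utf-8"))
    return ",".join(f"{key}={value}" for key, value in items)


def build_line(measurement: str, tags: Dict[str, str],
               fields: str, timestamp: int) -> str:
    """Assemble one line of line protocol with sorted tags.

    Assumes `fields` is an already-formatted field set and that no
    escaping is needed (hypothetical simplification).
    """
    return f"{measurement},{sort_tags(tags)} {fields} {timestamp}"


unsorted_tags = {"tagC": "therefore", "tagE": "am", "tagA": "i",
                 "tagD": "i", "tagB": "think"}
line = build_line("measurement", unsorted_tags,
                  "fieldKey=fieldValue", 1562020262)
```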
## Use the coarsest time precision possible
By default, InfluxDB writes data in nanosecond precision.
However, if your data isn't collected in nanoseconds, there is no need to write at that precision.
For better performance, use the coarsest precision possible for timestamps.
_Specify timestamp precision when [writing to InfluxDB](/influxdb/cloud-dedicated/write-data/)._
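As an illustration, the following Python sketch picks the coarsest unit that represents a set of nanosecond timestamps without losing information. The helper name `coarsest_precision` is hypothetical. The unit names `s`, `ms`, `us`, and `ns` are assumed to match the `precision` values that the write API accepts.

```python
from typing import List, Tuple

# Units ordered from coarsest to finest, with nanoseconds per unit.
UNITS = [("s", 1_000_000_000), ("ms", 1_000_000), ("us", 1_000), ("ns", 1)]


def coarsest_precision(timestamps_ns: List[int]) -> Tuple[str, List[int]]:
    """Return the coarsest lossless unit and the converted timestamps."""
    for unit, factor in UNITS:
        # A unit is lossless only if every timestamp divides evenly by it.
        if all(ts % factor == 0 for ts in timestamps_ns):
            return unit, [ts // factor for ts in timestamps_ns]
    # Not reached: the "ns" unit (factor 1) always divides evenly.
    return "ns", list(timestamps_ns)
```

For example, timestamps collected once per second convert to `s`, while a timestamp with a half-second offset falls back to `ms`.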
## Use gzip compression
Use gzip compression to speed up writes to InfluxDB.
Benchmarks have shown up to a 5x speed improvement when data is compressed.
{{< tabs-wrapper >}}
{{% tabs %}}
[Telegraf](#)
[Client libraries](#)
[InfluxDB API](#)
{{% /tabs %}}
{{% tab-content %}}
### Enable gzip compression in Telegraf
In the `influxdb_v2` output plugin configuration in your `telegraf.conf`, set the
`content_encoding` option to `gzip`:
```toml
[[outputs.influxdb_v2]]
urls = ["https://cluster-id.influxdb.io"]
# ...
content_encoding = "gzip"
```
{{% /tab-content %}}
{{% tab-content %}}
### Enable gzip compression in InfluxDB client libraries
Each [InfluxDB client library](/influxdb/cloud-dedicated/reference/client-libraries/) provides
options for compressing write requests or enforces compression by default.
The method for enabling compression is different for each library.
For specific instructions, see the
[InfluxDB client libraries documentation](/influxdb/cloud-dedicated/reference/client-libraries/).
{{% /tab-content %}}
{{% tab-content %}}
### Use gzip compression with the InfluxDB API
When using the InfluxDB API `/api/v2/write` endpoint to write data,
compress the data with `gzip` and set the `Content-Encoding` header to `gzip`.
{{% code-callout "Content-Encoding: gzip" "orange" %}}
```sh
{{% get-shared-text "/api/cloud-dedicated/write-compressed.sh" %}}
```
{{% /code-callout %}}
{{% /tab-content %}}
{{< /tabs-wrapper >}}
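To make the compression step concrete, here is a minimal Python sketch that uses only the standard library `gzip` module. It prepares a compressed request body and the matching `Content-Encoding` header. It does not send a request, and the helper name `compress_payload` is hypothetical.

```python
import gzip
from typing import Dict, Tuple


def compress_payload(line_protocol: str) -> Tuple[bytes, Dict[str, str]]:
    """Gzip the line protocol and return the body with its encoding header."""
    body = gzip.compress(line_protocol.encode("utf-8"))
    headers = {"Content-Encoding": "gzip"}
    return body, headers


# Build a repetitive sample payload; values are illustrative only.
sample = "\n".join(
    f"mem,host=host1 used_percent=23.43 {1556896326 + i}" for i in range(1000)
)
body, headers = compress_payload(sample)
```

Pass `body` and `headers` to the HTTP client of your choice. Repetitive line protocol such as `sample` usually compresses well, but the actual ratio depends on your data.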
## Synchronize hosts with NTP
Use the Network Time Protocol (NTP) to synchronize time between hosts.
If a timestamp isn't included in line protocol, InfluxDB uses its host's local
time (in UTC) to assign timestamps to each point.
If a host's clock isn't synchronized with NTP, timestamps may be inaccurate.
## Write multiple data points in one request
To write multiple lines in one request, each line of line protocol must be delimited by a newline (`\n`).
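A small Python sketch of building such a request body follows. The helper name `to_request_body` is hypothetical. It strips surrounding whitespace from each line and drops empty entries before joining the lines with `\n`.

```python
from typing import Iterable


def to_request_body(lines: Iterable[str]) -> str:
    """Join lines of line protocol into one newline-delimited body."""
    cleaned = (line.strip() for line in lines)
    # Skip blank entries so the body contains no empty lines.
    return "\n".join(line for line in cleaned if line)


request_body = to_request_body([
    "mem,host=host1 used_percent=23.43234543 1556896326",
    "mem,host=host2 used_percent=26.81522361 1556896326\n",
    "",
])
```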

@ -122,13 +122,13 @@ After setting up InfluxDB and your project, you should have the following:
The following example shows how to construct `Point` objects that follow the [example `home` schema](#example-home-schema), and then write the points as line protocol to an
InfluxDB Cloud Dedicated database.
-{{< code-tabs-wrapper >}}
-{{% code-tabs %}}
+{{< tabs-wrapper >}}
+{{% tabs %}}
[Go](#)
[Node.js](#)
[Python](#)
-{{% /code-tabs %}}
-{{% code-tab-content %}}
+{{% /tabs %}}
+{{% tab-content %}}
<!-- BEGIN GO PROJECT SETUP -->
1. Install [Go 1.13 or later](https://golang.org/doc/install).
@ -140,8 +140,8 @@ InfluxDB Cloud Dedicated database.
```
<!-- END GO SETUP PROJECT -->
-{{% /code-tab-content %}}
-{{% code-tab-content %}}
+{{% /tab-content %}}
+{{% tab-content %}}
<!-- BEGIN NODE.JS PROJECT SETUP -->
Inside of your project directory, install the `@influxdata/influxdb-client` InfluxDB v2 JavaScript client library.
@ -151,8 +151,8 @@ npm install --save @influxdata/influxdb-client
```
<!-- END NODE.JS SETUP PROJECT -->
-{{% /code-tab-content %}}
-{{% code-tab-content %}}
+{{% /tab-content %}}
+{{% tab-content %}}
<!-- BEGIN PYTHON SETUP PROJECT -->
1. **Optional, but recommended**: Use `venv` or `conda` to activate a virtual environment for installing and executing code--for example:
@ -170,35 +170,35 @@ npm install --save @influxdata/influxdb-client
```
<!-- END PYTHON SETUP PROJECT -->
-{{% /code-tab-content %}}
-{{< /code-tabs-wrapper >}}
+{{% /tab-content %}}
+{{< /tabs-wrapper >}}
### Construct points and write line protocol
-{{< code-tabs-wrapper >}}
-{{% code-tabs %}}
+{{< tabs-wrapper >}}
+{{% tabs %}}
[Go](#)
[Node.js](#)
[Python](#)
-{{% /code-tabs %}}
-{{% code-tab-content %}}
+{{% /tabs %}}
+{{% tab-content %}}
<!-- BEGIN GO SETUP SAMPLE -->
1. Create a file for your module--for example: `write-point.go`.
2. In `write-point.go`, enter the following sample code:
```go
package main
import (
"os"
"time"
"fmt"
"github.com/influxdata/influxdb-client-go/v2"
)
func main() {
// Set a log level constant
const debugLevel uint = 4
@ -251,11 +251,11 @@ func main() {
// Ensure background processes finish and release resources.
client.Close()
}
```
<!-- END GO SETUP SAMPLE -->
-{{% /code-tab-content %}}
-{{% code-tab-content %}}
+{{% /tab-content %}}
+{{% tab-content %}}
<!-- BEGIN NODE.JS SETUP SAMPLE -->
1. Create a file for your module--for example: `write-point.js`.
@ -311,9 +311,9 @@ func main() {
})
```
<!-- END NODE.JS SETUP SAMPLE -->
-{{% /code-tab-content %}}
-{{% code-tab-content %}}
+{{% /tab-content %}}
+{{% tab-content %}}
<!-- BEGIN PYTHON SETUP SAMPLE -->
1. Create a file for your module--for example: `write-point.py`.
@ -356,8 +356,8 @@ func main() {
write_api.close()
```
<!-- END PYTHON SETUP PROJECT -->
-{{% /code-tab-content %}}
-{{< /code-tabs-wrapper >}}
+{{% /tab-content %}}
+{{< /tabs-wrapper >}}
The sample code does the following:

@ -0,0 +1,119 @@
---
title: Optimize writes to InfluxDB
description: >
Simple tips to improve performance and reduce system overhead when writing data to
InfluxDB Cloud Serverless.
weight: 203
menu:
influxdb_cloud_serverless:
name: Optimize writes
parent: write-best-practices
influxdb/cloud/tags: [best practices, write]
related:
- /resources/videos/ingest-data/, How to Ingest Data in InfluxDB (Video)
---
Use these tips to improve performance and reduce system overhead when writing data to InfluxDB.
- [Batch writes](#batch-writes)
- [Sort tags by key](#sort-tags-by-key)
- [Use the coarsest time precision possible](#use-the-coarsest-time-precision-possible)
- [Use gzip compression](#use-gzip-compression)
- [Synchronize hosts with NTP](#synchronize-hosts-with-ntp)
- [Write multiple data points in one request](#write-multiple-data-points-in-one-request)
{{% note %}}
The following tools write to InfluxDB and employ _most_ write optimizations by default:
- [Telegraf](/influxdb/cloud-serverless/write-data/use-telegraf/)
- [InfluxDB client libraries](/influxdb/cloud-serverless/reference/client-libraries/)
{{% /note %}}
## Batch writes
Write data in batches to minimize network overhead when writing data to InfluxDB.
{{% note %}}
The optimal batch size is 10,000 lines of line protocol or 10 MB,
whichever threshold is met first.
{{% /note %}}
## Sort tags by key
Before writing data points to InfluxDB, sort tags by key in lexicographic order.
_Verify sort results match results from the [Go `bytes.Compare` function](http://golang.org/pkg/bytes/#Compare)._
```sh
# Line protocol example with unsorted tags
measurement,tagC=therefore,tagE=am,tagA=i,tagD=i,tagB=think fieldKey=fieldValue 1562020262
# Optimized line protocol example with tags sorted by key
measurement,tagA=i,tagB=think,tagC=therefore,tagD=i,tagE=am fieldKey=fieldValue 1562020262
```
## Use the coarsest time precision possible
By default, InfluxDB writes data in nanosecond precision.
However, if your data isn't collected in nanoseconds, there is no need to write at that precision.
For better performance, use the coarsest precision possible for timestamps.
_Specify timestamp precision when [writing to InfluxDB](/influxdb/cloud-serverless/write-data/)._
## Use gzip compression
Use gzip compression to speed up writes to InfluxDB.
Benchmarks have shown up to a 5x speed improvement when data is compressed.
{{< tabs-wrapper >}}
{{% tabs %}}
[Telegraf](#)
[Client libraries](#)
[InfluxDB API](#)
{{% /tabs %}}
{{% tab-content %}}
### Enable gzip compression in Telegraf
In the `influxdb_v2` output plugin configuration in your `telegraf.conf`, set the
`content_encoding` option to `gzip`:
```toml
[[outputs.influxdb_v2]]
urls = ["https://cloud2.influxdata.com"]
# ...
content_encoding = "gzip"
```
{{% /tab-content %}}
{{% tab-content %}}
### Enable gzip compression in InfluxDB client libraries
Each [InfluxDB client library](/influxdb/cloud-serverless/reference/client-libraries/) provides
options for compressing write requests or enforces compression by default.
The method for enabling compression is different for each library.
For specific instructions, see the
[InfluxDB client libraries documentation](/influxdb/cloud-serverless/reference/client-libraries/).
{{% /tab-content %}}
{{% tab-content %}}
### Use gzip compression with the InfluxDB API
When using the InfluxDB API `/api/v2/write` endpoint to write data,
compress the data with `gzip` and set the `Content-Encoding` header to `gzip`.
```sh
{{% get-shared-text "/api/cloud-serverless/write-compressed.sh" %}}
```
{{% /tab-content %}}
{{< /tabs-wrapper >}}
## Synchronize hosts with NTP
Use the Network Time Protocol (NTP) to synchronize time between hosts.
If a timestamp isn't included in line protocol, InfluxDB uses its host's local
time (in UTC) to assign timestamps to each point.
If a host's clock isn't synchronized with NTP, timestamps may be inaccurate.
## Write multiple data points in one request
To write multiple lines in one request, each line of line protocol must be delimited by a newline (`\n`).

@ -128,30 +128,30 @@ InfluxDB Cloud Serverless bucket.
### Construct points and write line protocol
-{{< code-tabs-wrapper >}}
-{{% code-tabs %}}
+{{< tabs-wrapper >}}
+{{% tabs %}}
[Go](#)
[Node.js](#)
[Python](#)
-{{% /code-tabs %}}
-{{% code-tab-content %}}
+{{% /tabs %}}
+{{% tab-content %}}
<!-- BEGIN GO SETUP SAMPLE -->
1. Create a file for your module--for example: `write-point.go`.
2. In `write-point.go`, enter the following sample code:
```go
package main
import (
"os"
"time"
"fmt"
"github.com/influxdata/influxdb-client-go/v2"
)
func main() {
// Set a log level constant
const debugLevel uint = 4
@ -203,11 +203,11 @@ func main() {
// Ensure background processes finish and release resources.
client.Close()
}
```
<!-- END GO SETUP SAMPLE -->
-{{% /code-tab-content %}}
-{{% code-tab-content %}}
+{{% /tab-content %}}
+{{% tab-content %}}
<!-- BEGIN NODE.JS SETUP SAMPLE -->
1. Create a file for your module--for example: `write-point.js`.
@ -263,9 +263,9 @@ func main() {
})
```
<!-- END NODE.JS SETUP SAMPLE -->
-{{% /code-tab-content %}}
-{{% code-tab-content %}}
+{{% /tab-content %}}
+{{% tab-content %}}
<!-- BEGIN PYTHON SETUP SAMPLE -->
1. Create a file for your module--for example: `write-point.py`.
@ -308,8 +308,8 @@ func main() {
write_api.close()
```
<!-- END PYTHON SETUP PROJECT -->
-{{% /code-tab-content %}}
-{{< /code-tabs-wrapper >}}
+{{% /tab-content %}}
+{{< /tabs-wrapper >}}
The sample code does the following:

@ -0,0 +1,12 @@
# Compress the line protocol with gzip, then send the compressed bytes from stdin (@-).
# Query parameters go in the URL so that the request body contains only compressed data.
echo "mem,host=host1 used_percent=23.43234543 1556896326
mem,host=host2 used_percent=26.81522361 1556896326
mem,host=host1 used_percent=22.52984738 1556896336
mem,host=host2 used_percent=27.18294630 1556896336" | gzip | \
curl --request POST "https://cluster-id.influxdb.io/api/v2/write?org=ignored&bucket=DATABASE_NAME&precision=s" \
--header "Authorization: Token DATABASE_TOKEN" \
--header "Content-Encoding: gzip" \
--data-binary @-

@ -0,0 +1,9 @@
# Compress the line protocol with gzip, then send the compressed bytes from stdin (@-).
echo "mem,host=host1 used_percent=23.43234543 1556896326
mem,host=host2 used_percent=26.81522361 1556896326
mem,host=host1 used_percent=22.52984738 1556896336
mem,host=host2 used_percent=27.18294630 1556896336" | gzip | \
curl --request POST "https://cloud2.influxdata.com/api/v2/write?org=YOUR_ORG&bucket=YOUR_BUCKET&precision=s" \
--header "Authorization: Token YOURAUTHTOKEN" \
--header "Content-Encoding: gzip" \
--data-binary @-