diff --git a/content/influxdb/cloud-dedicated/write-data/best-practices/optimize-writes.md b/content/influxdb/cloud-dedicated/write-data/best-practices/optimize-writes.md new file mode 100644 index 000000000..61c70379f --- /dev/null +++ b/content/influxdb/cloud-dedicated/write-data/best-practices/optimize-writes.md @@ -0,0 +1,122 @@ +--- +title: Optimize writes to InfluxDB +description: > + Simple tips to optimize performance and system overhead when writing data to + InfluxDB Cloud Dedicated. +weight: 203 +menu: + influxdb_cloud_dedicated: + name: Optimize writes + parent: write-best-practices +influxdb/cloud/tags: [best practices, write] +related: + - /resources/videos/ingest-data/, How to Ingest Data in InfluxDB (Video) +--- + +Use these tips to optimize performance and system overhead when writing data to InfluxDB. + +- [Batch writes](#batch-writes) +- [Sort tags by key](#sort-tags-by-key) +- [Use the coarsest time precision possible](#use-the-coarsest-time-precision-possible) +- [Use gzip compression](#use-gzip-compression) +- [Synchronize hosts with NTP](#synchronize-hosts-with-ntp) +- [Write multiple data points in one request](#write-multiple-data-points-in-one-request) + +{{% note %}} +The following tools write to InfluxDB and employ _most_ write optimizations by default: + +- [Telegraf](/influxdb/cloud-dedicated/write-data/use-telegraf/) +- [InfluxDB client libraries](/influxdb/cloud-dedicated/reference/client-libraries/) +{{% /note %}} + +## Batch writes + +Write data in batches to minimize network overhead when writing data to InfluxDB. + +{{% note %}} +The optimal batch size is 10,000 lines of line protocol or 10 MBs, +whichever threshold is met first. +{{% /note %}} + +## Sort tags by key + +Before writing data points to InfluxDB, sort tags by key in lexicographic order. 
+_Verify sort results match results from the [Go `bytes.Compare` function](http://golang.org/pkg/bytes/#Compare)._ + +```sh +# Line protocol example with unsorted tags +measurement,tagC=therefore,tagE=am,tagA=i,tagD=i,tagB=think fieldKey=fieldValue 1562020262 + +# Optimized line protocol example with tags sorted by key +measurement,tagA=i,tagB=think,tagC=therefore,tagD=i,tagE=am fieldKey=fieldValue 1562020262 +``` + +## Use the coarsest time precision possible + +By default, InfluxDB writes data in nanosecond precision. +However if your data isn't collected in nanoseconds, there is no need to write at that precision. +For better performance, use the coarsest precision possible for timestamps. + +_Specify timestamp precision when [writing to InfluxDB](/influxdb/cloud-dedicated/write-data/)._ + +## Use gzip compression + +Use gzip compression to speed up writes to InfluxDB. +Benchmarks have shown up to a 5x speed improvement when data is compressed. + +{{< tabs-wrapper >}} +{{% tabs %}} +[Telegraf](#) +[Client libraries](#) +[InfluxDB API](#) +{{% /tabs %}} +{{% tab-content %}} + +### Enable gzip compression in Telegraf + +In the `influxdb_v2` output plugin configuration in your `telegraf.conf`, set the +`content_encoding` option to `gzip`: + +```toml +[[outputs.influxdb_v2]] + urls = ["https://cluster-id.influxdb.io"] + # ... + content_encoding = "gzip" +``` + +{{% /tab-content %}} +{{% tab-content %}} + +### Enable gzip compression in InfluxDB client libraries + +Each [InfluxDB client library](/influxdb/cloud-dedicated/reference/client-libraries/) provides +options for compressing write requests or enforces compression by default. +The method for enabling compression is different for each library. +For specific instructions, see the +[InfluxDB client libraries documentation](/influxdb/cloud-dedicated/reference/client-libraries/). 
+{{% /tab-content %}} +{{% tab-content %}} + +### Use gzip compression with the InfluxDB API + +When using the InfluxDB API `/api/v2/write` endpoint to write data, +compress the data with `gzip` and set the `Content-Encoding` header to `gzip`. + +{{% code-callout "Content-Encoding: gzip" "orange" %}} +```sh +{{% get-shared-text "/api/cloud-dedicated/write-compressed.sh" %}} +``` +{{% /code-callout %}} +{{% /tab-content %}} +{{< /tabs-wrapper >}} + +## Synchronize hosts with NTP + +Use the Network Time Protocol (NTP) to synchronize time between hosts. +If a timestamp isn't included in line protocol, InfluxDB uses its host's local +time (in UTC) to assign timestamps to each point. +If a host's clock isn't synchronized with NTP, timestamps may be inaccurate. + +## Write multiple data points in one request + +To write multiple lines in one request, each line of line protocol must be delimited by a new line (`\n`). diff --git a/content/influxdb/cloud-dedicated/write-data/line protocol/_index.md b/content/influxdb/cloud-dedicated/write-data/line protocol/_index.md index 70bcde495..4cd8795ee 100644 --- a/content/influxdb/cloud-dedicated/write-data/line protocol/_index.md +++ b/content/influxdb/cloud-dedicated/write-data/line protocol/_index.md @@ -122,13 +122,13 @@ After setting up InfluxDB and your project, you should have the following: The following example shows how to construct `Point` objects that follow the [example `home` schema](#example-home-schema), and then write the points as line protocol to an InfluxDB Cloud Dedicated database. -{{< code-tabs-wrapper >}} -{{% code-tabs %}} +{{< tabs-wrapper >}} +{{% tabs %}} [Go](#) [Node.js](#) [Python](#) -{{% /code-tabs %}} -{{% code-tab-content %}} +{{% /tabs %}} +{{% tab-content %}} 1. Install [Go 1.13 or later](https://golang.org/doc/install). @@ -140,8 +140,8 @@ InfluxDB Cloud Dedicated database. 
``` -{{% /code-tab-content %}} -{{% code-tab-content %}} +{{% /tab-content %}} +{{% tab-content %}} Inside of your project directory, install the `@influxdata/influxdb-client` InfluxDB v2 JavaScript client library. @@ -151,8 +151,8 @@ npm install --save @influxdata/influxdb-client ``` -{{% /code-tab-content %}} -{{% code-tab-content %}} +{{% /tab-content %}} +{{% tab-content %}} 1. **Optional, but recommended**: Use `venv` or `conda` to activate a virtual environment for installing and executing code--for example: @@ -170,92 +170,92 @@ npm install --save @influxdata/influxdb-client ``` -{{% /code-tab-content %}} -{{< /code-tabs-wrapper >}} +{{% /tab-content %}} +{{< /tabs-wrapper >}} ### Construct points and write line protocol -{{< code-tabs-wrapper >}} -{{% code-tabs %}} +{{< tabs-wrapper >}} +{{% tabs %}} [Go](#) [Node.js](#) [Python](#) -{{% /code-tabs %}} -{{% code-tab-content %}} +{{% /tabs %}} +{{% tab-content %}} 1. Create a file for your module--for example: `write-point.go`. 2. In `write-point.go`, enter the following sample code: -```go -package main + ```go + package main -import ( - "os" - "time" - "fmt" - "github.com/influxdata/influxdb-client-go/v2" -) + import ( + "os" + "time" + "fmt" + "github.com/influxdata/influxdb-client-go/v2" + ) -func main() { - // Set a log level constant - const debugLevel uint = 4 + func main() { + // Set a log level constant + const debugLevel uint = 4 - /** - * Define options for the client. - * Instantiate the client with the following arguments: - * - An object containing InfluxDB URL and token credentials. - * - Write options for batch size and timestamp precision. - **/ - clientOptions := influxdb2.DefaultOptions(). - SetBatchSize(20). - SetLogLevel(debugLevel). - SetPrecision(time.Second) + /** + * Define options for the client. + * Instantiate the client with the following arguments: + * - An object containing InfluxDB URL and token credentials. + * - Write options for batch size and timestamp precision. 
+ **/ + clientOptions := influxdb2.DefaultOptions(). + SetBatchSize(20). + SetLogLevel(debugLevel). + SetPrecision(time.Second) - client := influxdb2.NewClientWithOptions(os.Getenv("INFLUX_URL"), - os.Getenv("INFLUX_TOKEN"), - clientOptions) + client := influxdb2.NewClientWithOptions(os.Getenv("INFLUX_URL"), + os.Getenv("INFLUX_TOKEN"), + clientOptions) - /** - * Create an asynchronous, non-blocking write client. - * Provide your InfluxDB org and database as arguments - **/ - writeAPI := client.WriteAPI(os.Getenv("INFLUX_ORG"), "get-started") + /** + * Create an asynchronous, non-blocking write client. + * Provide your InfluxDB org and database as arguments + **/ + writeAPI := client.WriteAPI(os.Getenv("INFLUX_ORG"), "get-started") - // Get the errors channel for the asynchronous write client. - errorsCh := writeAPI.Errors() + // Get the errors channel for the asynchronous write client. + errorsCh := writeAPI.Errors() - /** Create a point. - * Provide measurement, tags, and fields as arguments. - **/ - p := influxdb2.NewPointWithMeasurement("home"). - AddTag("room", "Kitchen"). - AddField("temp", 72.0). - AddField("hum", 20.2). - AddField("co", 9). - SetTime(time.Now()) - - // Define a proc for handling errors. - go func() { - for err := range errorsCh { - fmt.Printf("write error: %s\n", err.Error()) - } - }() + /** Create a point. + * Provide measurement, tags, and fields as arguments. + **/ + p := influxdb2.NewPointWithMeasurement("home"). + AddTag("room", "Kitchen"). + AddField("temp", 72.0). + AddField("hum", 20.2). + AddField("co", 9). + SetTime(time.Now()) + + // Define a proc for handling errors. + go func() { + for err := range errorsCh { + fmt.Printf("write error: %s\n", err.Error()) + } + }() - // Write the point asynchronously - writeAPI.WritePoint(p) + // Write the point asynchronously + writeAPI.WritePoint(p) - // Send pending writes from the buffer to the database. - writeAPI.Flush() + // Send pending writes from the buffer to the database. 
+ writeAPI.Flush() - // Ensure background processes finish and release resources. - client.Close() -} -``` + // Ensure background processes finish and release resources. + client.Close() + } + ``` -{{% /code-tab-content %}} -{{% code-tab-content %}} +{{% /tab-content %}} +{{% tab-content %}} 1. Create a file for your module--for example: `write-point.js`. @@ -311,9 +311,9 @@ func main() { }) ``` -{{% /code-tab-content %}} +{{% /tab-content %}} -{{% code-tab-content %}} +{{% tab-content %}} 1. Create a file for your module--for example: `write-point.py`. @@ -356,8 +356,8 @@ func main() { write_api.close() ``` -{{% /code-tab-content %}} -{{< /code-tabs-wrapper >}} +{{% /tab-content %}} +{{< /tabs-wrapper >}} The sample code does the following: diff --git a/content/influxdb/cloud-serverless/write-data/best-practices/optimize-writes.md b/content/influxdb/cloud-serverless/write-data/best-practices/optimize-writes.md new file mode 100644 index 000000000..4c4a4edf7 --- /dev/null +++ b/content/influxdb/cloud-serverless/write-data/best-practices/optimize-writes.md @@ -0,0 +1,119 @@ +--- +title: Optimize writes to InfluxDB +description: > + Simple tips to optimize performance and system overhead when writing data to + InfluxDB Cloud Serverless. +weight: 203 +menu: + influxdb_cloud_serverless: + name: Optimize writes + parent: write-best-practices +influxdb/cloud/tags: [best practices, write] +related: + - /resources/videos/ingest-data/, How to Ingest Data in InfluxDB (Video) +--- + +Use these tips to optimize performance and system overhead when writing data to InfluxDB. 
+ +- [Batch writes](#batch-writes) +- [Sort tags by key](#sort-tags-by-key) +- [Use the coarsest time precision possible](#use-the-coarsest-time-precision-possible) +- [Use gzip compression](#use-gzip-compression) +- [Synchronize hosts with NTP](#synchronize-hosts-with-ntp) +- [Write multiple data points in one request](#write-multiple-data-points-in-one-request) + +{{% note %}} +The following tools write to InfluxDB and employ _most_ write optimizations by default: + +- [Telegraf](/influxdb/cloud-serverless/write-data/use-telegraf/) +- [InfluxDB client libraries](/influxdb/cloud-serverless/reference/client-libraries/) +{{% /note %}} + +## Batch writes + +Write data in batches to minimize network overhead when writing data to InfluxDB. + +{{% note %}} +The optimal batch size is 10,000 lines of line protocol or 10 MBs, +whichever threshold is met first. +{{% /note %}} + +## Sort tags by key + +Before writing data points to InfluxDB, sort tags by key in lexicographic order. +_Verify sort results match results from the [Go `bytes.Compare` function](http://golang.org/pkg/bytes/#Compare)._ + +```sh +# Line protocol example with unsorted tags +measurement,tagC=therefore,tagE=am,tagA=i,tagD=i,tagB=think fieldKey=fieldValue 1562020262 + +# Optimized line protocol example with tags sorted by key +measurement,tagA=i,tagB=think,tagC=therefore,tagD=i,tagE=am fieldKey=fieldValue 1562020262 +``` + +## Use the coarsest time precision possible + +By default, InfluxDB writes data in nanosecond precision. +However if your data isn't collected in nanoseconds, there is no need to write at that precision. +For better performance, use the coarsest precision possible for timestamps. + +_Specify timestamp precision when [writing to InfluxDB](/influxdb/cloud-serverless/write-data/)._ + +## Use gzip compression + +Use gzip compression to speed up writes to InfluxDB. +Benchmarks have shown up to a 5x speed improvement when data is compressed. 
+ +{{< tabs-wrapper >}} +{{% tabs %}} +[Telegraf](#) +[Client libraries](#) +[InfluxDB API](#) +{{% /tabs %}} +{{% tab-content %}} + +### Enable gzip compression in Telegraf + +In the `influxdb_v2` output plugin configuration in your `telegraf.conf`, set the +`content_encoding` option to `gzip`: + +```toml +[[outputs.influxdb_v2]] + urls = ["https://cloud2.influxdata.com"] + # ... + content_encoding = "gzip" +``` +{{% /tab-content %}} +{{% tab-content %}} + +### Enable gzip compression in InfluxDB client libraries + +Each [InfluxDB client library](/influxdb/cloud-serverless/reference/client-libraries/) provides +options for compressing write requests or enforces compression by default. +The method for enabling compression is different for each library. +For specific instructions, see the +[InfluxDB client libraries documentation](/influxdb/cloud-serverless/reference/client-libraries/). +{{% /tab-content %}} +{{% tab-content %}} + +### Use gzip compression with the InfluxDB API + +When using the InfluxDB API `/api/v2/write` endpoint to write data, +compress the data with `gzip` and set the `Content-Encoding` header to `gzip`. + +```sh +{{% get-shared-text "/api/cloud-serverless/write-compressed.sh" %}} +``` +{{% /tab-content %}} +{{< /tabs-wrapper >}} + +## Synchronize hosts with NTP + +Use the Network Time Protocol (NTP) to synchronize time between hosts. +If a timestamp isn't included in line protocol, InfluxDB uses its host's local +time (in UTC) to assign timestamps to each point. +If a host's clock isn't synchronized with NTP, timestamps may be inaccurate. + +## Write multiple data points in one request + +To write multiple lines in one request, each line of line protocol must be delimited by a new line (`\n`). 
diff --git a/content/influxdb/cloud-serverless/write-data/line protocol/_index.md b/content/influxdb/cloud-serverless/write-data/line protocol/_index.md index 154d8256a..ce1a0b272 100644 --- a/content/influxdb/cloud-serverless/write-data/line protocol/_index.md +++ b/content/influxdb/cloud-serverless/write-data/line protocol/_index.md @@ -128,86 +128,86 @@ InfluxDB Cloud Serverless bucket. ### Construct points and write line protocol -{{< code-tabs-wrapper >}} -{{% code-tabs %}} +{{< tabs-wrapper >}} +{{% tabs %}} [Go](#) [Node.js](#) [Python](#) -{{% /code-tabs %}} -{{% code-tab-content %}} +{{% /tabs %}} +{{% tab-content %}} 1. Create a file for your module--for example: `write-point.go`. 2. In `write-point.go`, enter the following sample code: -```go -package main + ```go + package main -import ( - "os" - "time" - "fmt" - "github.com/influxdata/influxdb-client-go/v2" -) + import ( + "os" + "time" + "fmt" + "github.com/influxdata/influxdb-client-go/v2" + ) -func main() { - // Set a log level constant - const debugLevel uint = 4 + func main() { + // Set a log level constant + const debugLevel uint = 4 - /** - * Instantiate a client with a configuration object - * that contains your InfluxDB URL and token. - **/ + /** + * Instantiate a client with a configuration object + * that contains your InfluxDB URL and token. + **/ - clientOptions := influxdb2.DefaultOptions(). - SetBatchSize(20). - SetLogLevel(debugLevel). - SetPrecision(time.Second) + clientOptions := influxdb2.DefaultOptions(). + SetBatchSize(20). + SetLogLevel(debugLevel). + SetPrecision(time.Second) - client := influxdb2.NewClientWithOptions(os.Getenv("INFLUX_URL"), - os.Getenv("INFLUX_TOKEN"), - clientOptions) + client := influxdb2.NewClientWithOptions(os.Getenv("INFLUX_URL"), + os.Getenv("INFLUX_TOKEN"), + clientOptions) - /** - * Create an asynchronous, non-blocking write client. 
- * Provide your InfluxDB org and bucket as arguments - **/ - writeAPI := client.WriteAPI(os.Getenv("INFLUX_ORG"), "get-started") + /** + * Create an asynchronous, non-blocking write client. + * Provide your InfluxDB org and bucket as arguments + **/ + writeAPI := client.WriteAPI(os.Getenv("INFLUX_ORG"), "get-started") - // Get the errors channel for the asynchronous write client. - errorsCh := writeAPI.Errors() + // Get the errors channel for the asynchronous write client. + errorsCh := writeAPI.Errors() - /** Create a point. - * Provide measurement, tags, and fields as arguments. - **/ - p := influxdb2.NewPointWithMeasurement("home"). - AddTag("room", "Kitchen"). - AddField("temp", 72.0). - AddField("hum", 20.2). - AddField("co", 9). - SetTime(time.Now()) - - // Define a proc for handling errors. - go func() { - for err := range errorsCh { - fmt.Printf("write error: %s\n", err.Error()) - } - }() + /** Create a point. + * Provide measurement, tags, and fields as arguments. + **/ + p := influxdb2.NewPointWithMeasurement("home"). + AddTag("room", "Kitchen"). + AddField("temp", 72.0). + AddField("hum", 20.2). + AddField("co", 9). + SetTime(time.Now()) + + // Define a proc for handling errors. + go func() { + for err := range errorsCh { + fmt.Printf("write error: %s\n", err.Error()) + } + }() - // Write the point asynchronously - writeAPI.WritePoint(p) + // Write the point asynchronously + writeAPI.WritePoint(p) - // Send pending writes from the buffer to the bucket. - writeAPI.Flush() + // Send pending writes from the buffer to the bucket. + writeAPI.Flush() - // Ensure background processes finish and release resources. - client.Close() -} -``` + // Ensure background processes finish and release resources. + client.Close() + } + ``` -{{% /code-tab-content %}} -{{% code-tab-content %}} +{{% /tab-content %}} +{{% tab-content %}} 1. Create a file for your module--for example: `write-point.js`. 
@@ -263,9 +263,9 @@ func main() { }) ``` -{{% /code-tab-content %}} +{{% /tab-content %}} -{{% code-tab-content %}} +{{% tab-content %}} 1. Create a file for your module--for example: `write-point.py`. @@ -308,8 +308,8 @@ func main() { write_api.close() ``` -{{% /code-tab-content %}} -{{< /code-tabs-wrapper >}} +{{% /tab-content %}} +{{< /tabs-wrapper >}} The sample code does the following: diff --git a/shared/text/api/cloud-dedicated/write-compressed.sh b/shared/text/api/cloud-dedicated/write-compressed.sh new file mode 100644 index 000000000..97db7139e --- /dev/null +++ b/shared/text/api/cloud-dedicated/write-compressed.sh @@ -0,0 +1,12 @@ +curl --request POST "https://cluster-id.influxdb.io/api/v2/write" \ + --header "Authorization: Token DATABASE_TOKEN" \ + --header "Content-Encoding: gzip" \ + --data-urlencode "org=ignored" \ + --data-urlencode "bucket=DATABASE_NAME" \ + --data-urlencode "precision=s" \ + --data-raw " +mem,host=host1 used_percent=23.43234543 1556896326 +mem,host=host2 used_percent=26.81522361 1556896326 +mem,host=host1 used_percent=22.52984738 1556896336 +mem,host=host2 used_percent=27.18294630 1556896336 +" diff --git a/shared/text/api/cloud-serverless/write-compressed.sh b/shared/text/api/cloud-serverless/write-compressed.sh new file mode 100644 index 000000000..9a642f19f --- /dev/null +++ b/shared/text/api/cloud-serverless/write-compressed.sh @@ -0,0 +1,9 @@ +curl --request POST "https://cloud2.influxdata.com/api/v2/write?org=YOUR_ORG&bucket=YOUR_BUCKET&precision=s" \ + --header "Authorization: Token YOURAUTHTOKEN" \ + --header "Content-Encoding: gzip" \ + --data-raw " +mem,host=host1 used_percent=23.43234543 1556896326 +mem,host=host2 used_percent=26.81522361 1556896326 +mem,host=host1 used_percent=22.52984738 1556896336 +mem,host=host2 used_percent=27.18294630 1556896336 +"