Guides for migrating data (#3865)

* add guide for migrating data from cloud to oss, closes influxdata/DAR#270

* Apply suggestions from code review

Co-authored-by: Jason Stirnaman <jstirnaman@influxdata.com>

* update cloud to oss migration guide to address PR feedback

* Apply suggestions from code review

Co-authored-by: Jason Stirnaman <jstirnaman@influxdata.com>

* fixed typo

* Data migration (#3906)

* WIP migration section

* WIP migrate data

* add migrate data section with associated docs

* Apply suggestions from code review

Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>

* updates to address PR feedback

Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>

Co-authored-by: Jason Stirnaman <jstirnaman@influxdata.com>
Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>
Scott Anderson 2022-03-29 14:44:01 -06:00 committed by GitHub
parent b0c84c0884
commit 939d59d505
14 changed files with 920 additions and 12 deletions

View File

@@ -29,6 +29,15 @@ function scrollToAnchor(target) {
}, 400, 'swing', function () {
window.location.hash = target;
});
// Unique accordion functionality
// If the target is an accordion element, open the accordion after scrolling
if ($target.hasClass('expand')) {
  if (!$(target + ' .expand-label .expand-toggle').hasClass('open')) {
    $(target + '> .expand-label').trigger('click');
  }
}
}
}
@@ -140,13 +149,36 @@ $(".truncate-toggle").click(function(e) {
$(this).closest('.truncate').toggleClass('closed');
})
////////////////////////////// Expand Accordians ///////////////////////////////
////////////////////////////// Expand Accordions ///////////////////////////////
$('.expand-label').click(function() {
  $(this).children('.expand-toggle').toggleClass('open');
  $(this).next('.expand-content').slideToggle(200);
})

// Expand accordions on load based on URL anchor
function openAccordionByHash() {
  var anchor = window.location.hash;
  function expandElement() {
    if ($(anchor).parents('.expand').length > 0) {
      return $(anchor).closest('.expand').children('.expand-label');
    } else if ($(anchor).hasClass('expand')) {
      return $(anchor).children('.expand-label');
    }
  }
  if (expandElement() != null) {
    if (!expandElement().children('.expand-toggle').hasClass('open')) {
      expandElement().children('.expand-toggle').trigger('click');
    }
  }
}

// Open accordions by hash on page load.
openAccordionByHash();
////////////////////////// Inject tooltips on load //////////////////////////////
$('.tooltip').each( function(){

View File

@@ -0,0 +1,14 @@
---
title: Migrate data to InfluxDB
description: >
  Migrate data from InfluxDB OSS (open source) to InfluxDB Cloud or other
  InfluxDB OSS instances--or from InfluxDB Cloud to InfluxDB OSS.
menu:
  influxdb_cloud:
    name: Migrate data
weight: 9
---
Migrate data to InfluxDB from other InfluxDB instances, including InfluxDB OSS
and InfluxDB Cloud.
{{< children >}}

View File

@@ -0,0 +1,380 @@
---
title: Migrate data between InfluxDB Cloud organizations
description: >
  To migrate data from one InfluxDB Cloud organization to another, query the
  data in time-based batches and write the queried data to a bucket in another
  InfluxDB Cloud organization.
menu:
  influxdb_cloud:
    name: Migrate from Cloud to Cloud
    parent: Migrate data
weight: 102
---
To migrate data from one InfluxDB Cloud organization to another, query the
data in time-based batches and write the queried data to a bucket in another
InfluxDB Cloud organization.
Because full data migrations will likely exceed your organizations' limits and
adjustable quotas, migrate your data in batches.

The following guide provides instructions for setting up an InfluxDB task
that queries data from an InfluxDB Cloud bucket in time-based batches and writes
each batch to another InfluxDB Cloud bucket in another organization.
{{% cloud %}}
All query and write requests are subject to your InfluxDB Cloud organization's
[rate limits and adjustable quotas](/influxdb/cloud/account-management/limits/).
{{% /cloud %}}
- [Set up the migration](#set-up-the-migration)
- [Migration task](#migration-task)
  - [Configure the migration](#configure-the-migration)
  - [Migration Flux script](#migration-flux-script)
  - [Configuration help](#configuration-help)
- [Monitor the migration progress](#monitor-the-migration-progress)
- [Troubleshoot migration task failures](#troubleshoot-migration-task-failures)
## Set up the migration
{{% note %}}
The migration process requires two buckets in your destination InfluxDB
organization—one bucket to store the migrated data and another bucket to store migration metadata.
If the destination organization uses the [InfluxDB Cloud Free Plan](/influxdb/cloud/account-management/limits/#free-plan),
any buckets in addition to these two will exceed your plan's bucket limit.
{{% /note %}}
1.  **In the InfluxDB Cloud organization you're migrating data _from_**,
    [create an API token](/influxdb/cloud/security/tokens/create-token/)
    with **read access** to the bucket you want to migrate.
2.  **In the InfluxDB Cloud organization you're migrating data _to_**:
    1.  Add the **InfluxDB Cloud API token from the source organization** as a
        secret using the key, `INFLUXDB_CLOUD_TOKEN`.
        _See [Add secrets](/influxdb/cloud/security/secrets/add/) for more information,
        or see the CLI sketch after these steps._
    2.  [Create a bucket](/influxdb/cloud/organizations/buckets/create-bucket/)
        **to migrate data to**.
    3.  [Create a bucket](/influxdb/cloud/organizations/buckets/create-bucket/)
        **to store temporary migration metadata**.
    4.  [Create a new task](/influxdb/cloud/process-data/manage-tasks/create-task/)
        using the provided [migration task](#migration-task).
        Update the necessary [migration configuration options](#configure-the-migration).
    5.  _(Optional)_ Set up [migration monitoring](#monitor-the-migration-progress).
    6.  Save the task.
{{% note %}}
Newly-created tasks are enabled by default, so the data migration begins when you save the task.
{{% /note %}}
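
If you prefer the command line for the token, secret, and bucket steps above,
the following is a minimal sketch using the `influx` CLI. It assumes you have
CLI connection profiles for both organizations; the org name, bucket names, and
bucket ID are placeholders:

```sh
# In the source organization: create an API token with read access
# to the bucket you want to migrate.
influx auth create \
  --org example-source-org \
  --read-bucket 000xx0x000xx0000

# In the destination organization: store the source token as a secret,
# then create the destination and migration metadata buckets.
influx secret update --key INFLUXDB_CLOUD_TOKEN --value <source-org-token>
influx bucket create --name example-destination-bucket
influx bucket create --name migration
```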
**After the migration is complete**, each subsequent migration task execution
will fail with the following error:
```
error exhausting result iterator: error calling function "die" @41:9-41:86:
Batch range is beyond the migration range. Migration is complete.
```
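
When you see this error, the migration is done and you can deactivate the task.
A minimal sketch using the `influx` CLI (the task ID is a placeholder):

```sh
influx task list
influx task update --id 000xx0x000xx0000 --status inactive
```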
## Migration task
### Configure the migration
1.  Specify how often you want the task to run using the `task.every` option.
    _See [Determine your task interval](#determine-your-task-interval)._
2.  Define the following properties in the `migration`
    [record](/{{< latest "flux" >}}/data-types/composite/record/):

    ##### migration
    - **start**: Earliest time to include in the migration.
      _See [Determine your migration start time](#determine-your-migration-start-time)._
    - **stop**: Latest time to include in the migration.
    - **batchInterval**: Duration of each time-based batch.
      _See [Determine your batch interval](#determine-your-batch-interval)._
    - **batchBucket**: InfluxDB Cloud bucket in the destination organization to
      store migration batch metadata in.
    - **sourceHost**: [InfluxDB Cloud region URL](/influxdb/cloud/reference/regions)
      to migrate data from.
    - **sourceOrg**: InfluxDB Cloud organization to migrate data from.
    - **sourceToken**: InfluxDB Cloud API token with read access to the source
      bucket. To keep the API token secure, store it as a secret in the
      destination InfluxDB Cloud organization.
    - **sourceBucket**: InfluxDB Cloud bucket to migrate data from.
    - **destinationBucket**: InfluxDB Cloud bucket in the destination
      organization to migrate data to.
### Migration Flux script
```js
import "array"
import "experimental"
import "influxdata/influxdb/secrets"
// Configure the task
option task = {every: 5m, name: "Migrate data from InfluxDB Cloud"}
// Configure the migration
migration = {
start: 2022-01-01T00:00:00Z,
stop: 2022-02-01T00:00:00Z,
batchInterval: 1h,
batchBucket: "migration",
sourceHost: "https://cloud2.influxdata.com",
sourceOrg: "example-cloud-org",
sourceToken: secrets.get(key: "INFLUXDB_CLOUD_TOKEN"),
sourceBucket: "example-cloud-bucket",
destinationBucket: "example-oss-bucket",
}
// batchRange dynamically returns a record with start and stop properties for
// the current batch. It queries migration metadata stored in the
// `migration.batchBucket` to determine the stop time of the previous batch.
// It uses the previous stop time as the new start time for the current batch
// and adds the `migration.batchInterval` to determine the current batch stop time.
batchRange = () => {
_lastBatchStop =
(from(bucket: migration.batchBucket)
|> range(start: migration.start)
|> filter(fn: (r) => r._field == "batch_stop")
|> filter(fn: (r) => r.srcOrg == migration.sourceOrg)
|> filter(fn: (r) => r.srcBucket == migration.sourceBucket)
|> last()
|> findRecord(fn: (key) => true, idx: 0))._value
_batchStart =
if exists _lastBatchStop then
time(v: _lastBatchStop)
else
migration.start
return {start: _batchStart, stop: experimental.addDuration(d: migration.batchInterval, to: _batchStart)}
}
// Define a static record with batch start and stop time properties
batch = {start: batchRange().start, stop: batchRange().stop}
// Check to see if the current batch start time is beyond the migration.stop
// time and exit with an error if it is.
finished =
if batch.start >= migration.stop then
die(msg: "Batch range is beyond the migration range. Migration is complete.")
else
"Migration in progress"
// Query all data from the specified source bucket within the batch-defined time
// range. To limit migrated data by measurement, tag, or field, add a `filter()`
// function after `range()` with the appropriate predicate fn.
data = () =>
from(host: migration.sourceHost, org: migration.sourceOrg, token: migration.sourceToken, bucket: migration.sourceBucket)
|> range(start: batch.start, stop: batch.stop)
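        // For example, to migrate only one measurement (hypothetical name),
        // uncomment and adjust the following line:
        // |> filter(fn: (r) => r._measurement == "example-measurement")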

// rowCount is a stream of tables that contains the number of rows returned in
// the batch and is used to generate batch metadata.
rowCount =
    data()
        |> group(columns: ["_start", "_stop"])
        |> count()

// emptyRange is a stream of tables that acts as filler data if the batch is
// empty. This is used to generate batch metadata for empty batches and is
// necessary to correctly increment the time range for the next batch.
emptyRange = array.from(rows: [{_start: batch.start, _stop: batch.stop, _value: 0}])

// metadata returns a stream of tables representing batch metadata.
metadata = () => {
    _input =
        if exists (rowCount |> findRecord(fn: (key) => true, idx: 0))._value then
            rowCount
        else
            emptyRange

    return
        _input
            |> map(
                fn: (r) =>
                    ({
                        _time: now(),
                        _measurement: "batches",
                        srcOrg: migration.sourceOrg,
                        srcBucket: migration.sourceBucket,
                        dstBucket: migration.destinationBucket,
                        batch_start: string(v: batch.start),
                        batch_stop: string(v: batch.stop),
                        rows: r._value,
                        percent_complete:
                            float(v: int(v: r._stop) - int(v: migration.start)) / float(
                                v: int(v: migration.stop) - int(v: migration.start),
                            ) * 100.0,
                    }),
            )
            |> group(columns: ["_measurement", "srcOrg", "srcBucket", "dstBucket"])
}

// Write the queried data to the specified destination InfluxDB Cloud bucket.
data()
    |> to(bucket: migration.destinationBucket)

// Generate and store batch metadata in the migration.batchBucket.
metadata()
    |> experimental.to(bucket: migration.batchBucket)
```
### Configuration help
{{< expand-wrapper >}}
<!----------------------- BEGIN Determine task interval ----------------------->
{{% expand "Determine your task interval" %}}
The task interval determines how often the migration task runs and is defined by
the [`task.every` option](/influxdb/cloud/process-data/task-options/#every).
InfluxDB Cloud rate limits and quotas reset every five minutes, so
**we recommend a `5m` task interval**.
You can use a shorter task interval to execute the migration task more often,
but balance the task interval with your [batch interval](#determine-your-batch-interval)
and the amount of data returned in each batch.
If the total amount of data queried in each five-minute interval exceeds your
InfluxDB Cloud organization's [rate limits and quotas](/influxdb/cloud/account-management/limits/),
the batch will fail until rate limits and quotas reset.
{{% /expand %}}
<!------------------------ END Determine task interval ------------------------>
<!---------------------- BEGIN Determine migration start ---------------------->
{{% expand "Determine your migration start time" %}}
The `migration.start` time should be at or near the same time as the earliest
data point you want to migrate.
All migration batches are determined using the `migration.start` time and
`migration.batchInterval` settings.
To find the time of the earliest point in your bucket, run the following query:
```js
from(bucket: "example-cloud-bucket")
|> range(start: 0)
|> group()
|> first()
|> keep(columns: ["_time"])
```
{{% /expand %}}
<!----------------------- END Determine migration start ----------------------->
<!----------------------- BEGIN Determine batch interval ---------------------->
{{% expand "Determine your batch interval" %}}
The `migration.batchInterval` setting controls the time range queried by each batch.
The "density" of the data in your InfluxDB Cloud bucket and your InfluxDB Cloud
organization's [rate limits and quotas](/influxdb/cloud/account-management/limits/)
determine what your batch interval should be.
For example, if you're migrating data collected from hundreds of sensors with
points recorded every second, your batch interval will need to be shorter.
If you're migrating data collected from five sensors with points recorded every
minute, your batch interval can be longer.
It all depends on how much data gets returned in a single batch.
If points occur at regular intervals, you can get a fairly accurate estimate of
how much data will be returned in a given time range by using the `/api/v2/query`
endpoint to execute a query for the time range duration and then measuring the
size of the response body.
The following `curl` command queries an InfluxDB Cloud bucket for the last day
and returns the size of the response body in bytes.
You can customize the range duration to match your specific use case and
data density.
```sh
INFLUXDB_CLOUD_ORG=<your_influxdb_cloud_org>
INFLUXDB_CLOUD_TOKEN=<your_influxdb_cloud_token>
INFLUXDB_CLOUD_BUCKET=<your_influxdb_cloud_bucket>

curl -so /dev/null --request POST \
  https://cloud2.influxdata.com/api/v2/query?org=$INFLUXDB_CLOUD_ORG \
  --header "Authorization: Token $INFLUXDB_CLOUD_TOKEN" \
  --header "Accept: application/csv" \
  --header "Content-type: application/vnd.flux" \
  --data "from(bucket:\"$INFLUXDB_CLOUD_BUCKET\") |> range(start: -1d, stop: now())" \
  --write-out '%{size_download}'
```
{{% note %}}
You can also use other HTTP API tools like [Postman](https://www.postman.com/)
that provide the size of the response body.
{{% /note %}}
Divide the output of this command by 1000000 to convert it to megabytes (MB),
then use the following formula to estimate your batch interval:
```
batchInterval = (write-rate-limit-mb / response-body-size-mb) * range-duration
```
For example, if the response body of a query that returns one day of data
is 1 MB and you're using the InfluxDB Cloud Free Plan with a write limit of
5 MB per five minutes:
```js
batchInterval = (5 / 1) * 1d
// batchInterval = 5d
```
You _could_ query 5 days of data before hitting your write limit, but this is just an estimate.
We recommend setting the `batchInterval` slightly lower than the calculated interval
to allow for variation between batches.
So in this example, **it would be best to set your `batchInterval` to `4d`**.
##### Important things to note
- This assumes no other queries are running in your source InfluxDB Cloud organization.
- This assumes no other writes are happening in your destination InfluxDB Cloud organization.
{{% /expand %}}
<!------------------------ END Determine batch interval ----------------------->
{{< /expand-wrapper >}}
## Monitor the migration progress
The [InfluxDB Cloud Migration Community template](https://github.com/influxdata/community-templates/tree/master/influxdb-cloud-oss-migration/)
installs the migration task outlined in this guide as well as a dashboard
for monitoring running data migrations.
{{< img-hd src="/img/influxdb/2-1-migration-dashboard.png" alt="InfluxDB Cloud migration dashboard" />}}
<a class="btn" href="https://github.com/influxdata/community-templates/tree/master/influxdb-cloud-oss-migration/#quick-install">Install the InfluxDB Cloud Migration template</a>
## Troubleshoot migration task failures
If the migration task fails, [view your task logs](/influxdb/cloud/process-data/manage-tasks/task-run-history/)
to identify the specific error. Below are common causes of migration task failures.
- [Exceeded rate limits](#exceeded-rate-limits)
- [Invalid API token](#invalid-api-token)
- [Query timeout](#query-timeout)
### Exceeded rate limits
If your data migration causes you to exceed your InfluxDB Cloud organization's
limits and quotas, the task will return an error similar to:
```
too many requests
```
**Possible solutions**:
- Update the `migration.batchInterval` setting in your migration task to use
a smaller interval. Each batch will then query less data.
### Invalid API token
If the API token you store as the `INFLUXDB_CLOUD_TOKEN` secret doesn't have read access to
your InfluxDB Cloud bucket, the task will return an error similar to:
```
unauthorized access
```
**Possible solutions**:
- Ensure the API token has read access to your InfluxDB Cloud bucket.
- Generate a new InfluxDB Cloud API token with read access to the bucket you
  want to migrate. Then, update the `INFLUXDB_CLOUD_TOKEN` secret in your
  destination InfluxDB Cloud organization with the new token, as shown in the
  sketch below.
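
A minimal sketch of updating the secret with the `influx` CLI, assuming a
connection profile for the destination organization (the token value is a
placeholder):

```sh
influx secret update \
  --key INFLUXDB_CLOUD_TOKEN \
  --value <new-source-org-token>
```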
### Query timeout
The InfluxDB Cloud query timeout is 90 seconds. If it takes longer than this to
return the data from the batch interval, the query will time out and the
task will fail.
**Possible solutions**:
- Update the `migration.batchInterval` setting in your migration task to use
a smaller interval. Each batch will then query less data and take less time
to return results.

View File

@@ -0,0 +1,13 @@
---
title: Migrate data from InfluxDB Cloud to InfluxDB OSS
description: >
  To migrate data from InfluxDB Cloud to InfluxDB OSS, query the data from
  InfluxDB Cloud in time-based batches and write the data to InfluxDB OSS.
menu:
  influxdb_cloud:
    name: Migrate from Cloud to OSS
    parent: Migrate data
weight: 103
---
{{< duplicate-oss >}}

View File

@@ -0,0 +1,13 @@
---
title: Migrate data from InfluxDB OSS to InfluxDB Cloud
description: >
  To migrate data from an InfluxDB OSS bucket to an InfluxDB Cloud bucket, export
  your data as line protocol and then write it to your InfluxDB Cloud bucket.
menu:
  influxdb_cloud:
    name: Migrate data from OSS
    parent: Migrate data
weight: 101
---
{{< duplicate-oss >}}

View File

@@ -4,7 +4,7 @@ description: >
  InfluxDB templates are prepackaged InfluxDB configurations that contain everything
  from dashboards and Telegraf configurations to notifications and alerts.
menu: influxdb_2_1
weight: 9
weight: 10
influxdb/v2.1/tags: [templates]
---

View File

@@ -0,0 +1,15 @@
---
title: Migrate data to InfluxDB
description: >
  Migrate data to InfluxDB from other InfluxDB instances, including InfluxDB OSS
  and InfluxDB Cloud.
menu:
  influxdb_2_1:
    name: Migrate data
weight: 9
---
Migrate data to InfluxDB from other InfluxDB instances, including InfluxDB OSS
and InfluxDB Cloud.
{{< children >}}

View File

@@ -0,0 +1,372 @@
---
title: Migrate data from InfluxDB Cloud to InfluxDB OSS
description: >
  To migrate data from InfluxDB Cloud to InfluxDB OSS, query the data from
  InfluxDB Cloud in time-based batches and write the data to InfluxDB OSS.
menu:
  influxdb_2_1:
    name: Migrate from Cloud to OSS
    parent: Migrate data
weight: 102
---
To migrate data from InfluxDB Cloud to InfluxDB OSS, query the data
from InfluxDB Cloud and write the data to InfluxDB OSS.
Because full data migrations will likely exceed your organization's limits and
adjustable quotas, migrate your data in batches.

The following guide provides instructions for setting up an InfluxDB OSS task
that queries data from an InfluxDB Cloud bucket in time-based batches and writes
each batch to an InfluxDB OSS bucket.
{{% cloud %}}
All queries against data in InfluxDB Cloud are subject to your organization's
[rate limits and adjustable quotas](/influxdb/cloud/account-management/limits/).
{{% /cloud %}}
- [Set up the migration](#set-up-the-migration)
- [Migration task](#migration-task)
  - [Configure the migration](#configure-the-migration)
  - [Migration Flux script](#migration-flux-script)
  - [Configuration help](#configuration-help)
- [Monitor the migration progress](#monitor-the-migration-progress)
- [Troubleshoot migration task failures](#troubleshoot-migration-task-failures)
## Set up the migration
1.  [Install and set up InfluxDB OSS](/influxdb/{{< current-version-link >}}/install/).
2.  **In InfluxDB Cloud**, [create an API token](/influxdb/cloud/security/tokens/create-token/)
    with **read access** to the bucket you want to migrate.
3.  **In InfluxDB OSS**:
    1.  Add your **InfluxDB Cloud API token** as a secret using the key,
        `INFLUXDB_CLOUD_TOKEN`.
        _See [Add secrets](/influxdb/{{< current-version-link >}}/security/secrets/add/) for more information,
        or see the CLI sketch after these steps._
    2.  [Create a bucket](/influxdb/{{< current-version-link >}}/organizations/buckets/create-bucket/)
        **to migrate data to**.
    3.  [Create a bucket](/influxdb/{{< current-version-link >}}/organizations/buckets/create-bucket/)
        **to store temporary migration metadata**.
    4.  [Create a new task](/influxdb/{{< current-version-link >}}/process-data/manage-tasks/create-task/)
        using the provided [migration task](#migration-task).
        Update the necessary [migration configuration options](#configure-the-migration).
    5.  _(Optional)_ Set up [migration monitoring](#monitor-the-migration-progress).
    6.  Save the task.
{{% note %}}
Newly-created tasks are enabled by default, so the data migration begins when you save the task.
{{% /note %}}
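
If you prefer the command line for the token, secret, and bucket steps above,
the following is a minimal sketch using the `influx` CLI. It assumes you have
CLI connection profiles for both InfluxDB Cloud and your OSS instance; the org
name, bucket names, and bucket ID are placeholders:

```sh
# In InfluxDB Cloud: create an API token with read access to the
# bucket you want to migrate.
influx auth create \
  --host https://cloud2.influxdata.com \
  --org example-cloud-org \
  --read-bucket 000xx0x000xx0000

# In InfluxDB OSS: store the Cloud token as a secret, then create the
# destination and migration metadata buckets.
influx secret update --key INFLUXDB_CLOUD_TOKEN --value <influxdb-cloud-token>
influx bucket create --name example-oss-bucket
influx bucket create --name migration
```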
**After the migration is complete**, each subsequent migration task execution
will fail with the following error:
```
error exhausting result iterator: error calling function "die" @41:9-41:86:
Batch range is beyond the migration range. Migration is complete.
```
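
When you see this error, the migration is done and you can deactivate the task.
A minimal sketch using the `influx` CLI (the task ID is a placeholder):

```sh
influx task list
influx task update --id 000xx0x000xx0000 --status inactive
```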
## Migration task
### Configure the migration
1.  Specify how often you want the task to run using the `task.every` option.
    _See [Determine your task interval](#determine-your-task-interval)._
2.  Define the following properties in the `migration`
    [record](/{{< latest "flux" >}}/data-types/composite/record/):

    ##### migration
    - **start**: Earliest time to include in the migration.
      _See [Determine your migration start time](#determine-your-migration-start-time)._
    - **stop**: Latest time to include in the migration.
    - **batchInterval**: Duration of each time-based batch.
      _See [Determine your batch interval](#determine-your-batch-interval)._
    - **batchBucket**: InfluxDB OSS bucket to store migration batch metadata in.
    - **sourceHost**: [InfluxDB Cloud region URL](/influxdb/cloud/reference/regions)
      to migrate data from.
    - **sourceOrg**: InfluxDB Cloud organization to migrate data from.
    - **sourceToken**: InfluxDB Cloud API token. To keep the API token secure, store
      it as a secret in InfluxDB OSS.
    - **sourceBucket**: InfluxDB Cloud bucket to migrate data from.
    - **destinationBucket**: InfluxDB OSS bucket to migrate data to.
### Migration Flux script
```js
import "array"
import "experimental"
import "influxdata/influxdb/secrets"
// Configure the task
option task = {every: 5m, name: "Migrate data from InfluxDB Cloud"}
// Configure the migration
migration = {
start: 2022-01-01T00:00:00Z,
stop: 2022-02-01T00:00:00Z,
batchInterval: 1h,
batchBucket: "migration",
sourceHost: "https://cloud2.influxdata.com",
sourceOrg: "example-cloud-org",
sourceToken: secrets.get(key: "INFLUXDB_CLOUD_TOKEN"),
sourceBucket: "example-cloud-bucket",
destinationBucket: "example-oss-bucket",
}
// batchRange dynamically returns a record with start and stop properties for
// the current batch. It queries migration metadata stored in the
// `migration.batchBucket` to determine the stop time of the previous batch.
// It uses the previous stop time as the new start time for the current batch
// and adds the `migration.batchInterval` to determine the current batch stop time.
batchRange = () => {
_lastBatchStop =
(from(bucket: migration.batchBucket)
|> range(start: migration.start)
|> filter(fn: (r) => r._field == "batch_stop")
|> filter(fn: (r) => r.srcOrg == migration.sourceOrg)
|> filter(fn: (r) => r.srcBucket == migration.sourceBucket)
|> last()
|> findRecord(fn: (key) => true, idx: 0))._value
_batchStart =
if exists _lastBatchStop then
time(v: _lastBatchStop)
else
migration.start
return {start: _batchStart, stop: experimental.addDuration(d: migration.batchInterval, to: _batchStart)}
}
// Define a static record with batch start and stop time properties
batch = {start: batchRange().start, stop: batchRange().stop}
// Check to see if the current batch start time is beyond the migration.stop
// time and exit with an error if it is.
finished =
if batch.start >= migration.stop then
die(msg: "Batch range is beyond the migration range. Migration is complete.")
else
"Migration in progress"
// Query all data from the specified source bucket within the batch-defined time
// range. To limit migrated data by measurement, tag, or field, add a `filter()`
// function after `range()` with the appropriate predicate fn.
data = () =>
from(host: migration.sourceHost, org: migration.sourceOrg, token: migration.sourceToken, bucket: migration.sourceBucket)
|> range(start: batch.start, stop: batch.stop)
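        // For example, to migrate only one measurement (hypothetical name),
        // uncomment and adjust the following line:
        // |> filter(fn: (r) => r._measurement == "example-measurement")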

// rowCount is a stream of tables that contains the number of rows returned in
// the batch and is used to generate batch metadata.
rowCount =
    data()
        |> group(columns: ["_start", "_stop"])
        |> count()

// emptyRange is a stream of tables that acts as filler data if the batch is
// empty. This is used to generate batch metadata for empty batches and is
// necessary to correctly increment the time range for the next batch.
emptyRange = array.from(rows: [{_start: batch.start, _stop: batch.stop, _value: 0}])

// metadata returns a stream of tables representing batch metadata.
metadata = () => {
    _input =
        if exists (rowCount |> findRecord(fn: (key) => true, idx: 0))._value then
            rowCount
        else
            emptyRange

    return
        _input
            |> map(
                fn: (r) =>
                    ({
                        _time: now(),
                        _measurement: "batches",
                        srcOrg: migration.sourceOrg,
                        srcBucket: migration.sourceBucket,
                        dstBucket: migration.destinationBucket,
                        batch_start: string(v: batch.start),
                        batch_stop: string(v: batch.stop),
                        rows: r._value,
                        percent_complete:
                            float(v: int(v: r._stop) - int(v: migration.start)) / float(
                                v: int(v: migration.stop) - int(v: migration.start),
                            ) * 100.0,
                    }),
            )
            |> group(columns: ["_measurement", "srcOrg", "srcBucket", "dstBucket"])
}

// Write the queried data to the specified InfluxDB OSS bucket.
data()
    |> to(bucket: migration.destinationBucket)

// Generate and store batch metadata in the migration.batchBucket.
metadata()
    |> experimental.to(bucket: migration.batchBucket)
```
### Configuration help
{{< expand-wrapper >}}
<!----------------------- BEGIN Determine task interval ----------------------->
{{% expand "Determine your task interval" %}}
The task interval determines how often the migration task runs and is defined by
the [`task.every` option](/influxdb/v2.1/process-data/task-options/#every).
InfluxDB Cloud rate limits and quotas reset every five minutes, so
**we recommend a `5m` task interval**.
You can use a shorter task interval to execute the migration task more often,
but balance the task interval with your [batch interval](#determine-your-batch-interval)
and the amount of data returned in each batch.
If the total amount of data queried in each five-minute interval exceeds your
InfluxDB Cloud organization's [rate limits and quotas](/influxdb/cloud/account-management/limits/),
the batch will fail until rate limits and quotas reset.
{{% /expand %}}
<!------------------------ END Determine task interval ------------------------>
<!---------------------- BEGIN Determine migration start ---------------------->
{{% expand "Determine your migration start time" %}}
The `migration.start` time should be at or near the same time as the earliest
data point you want to migrate.
All migration batches are determined using the `migration.start` time and
`migration.batchInterval` settings.
To find the time of the earliest point in your bucket, run the following query:
```js
from(bucket: "example-cloud-bucket")
|> range(start: 0)
|> group()
|> first()
|> keep(columns: ["_time"])
```
{{% /expand %}}
<!----------------------- END Determine migration start ----------------------->
<!----------------------- BEGIN Determine batch interval ---------------------->
{{% expand "Determine your batch interval" %}}
The `migration.batchInterval` setting controls the time range queried by each batch.
The "density" of the data in your InfluxDB Cloud bucket and your InfluxDB Cloud
organization's [rate limits and quotas](/influxdb/cloud/account-management/limits/)
determine what your batch interval should be.
For example, if you're migrating data collected from hundreds of sensors with
points recorded every second, your batch interval will need to be shorter.
If you're migrating data collected from five sensors with points recorded every
minute, your batch interval can be longer.
It all depends on how much data gets returned in a single batch.
If points occur at regular intervals, you can get a fairly accurate estimate of
how much data will be returned in a given time range by using the `/api/v2/query`
endpoint to execute a query for the time range duration and then measuring the
size of the response body.
The following `curl` command queries an InfluxDB Cloud bucket for the last day
and returns the size of the response body in bytes.
You can customize the range duration to match your specific use case and
data density.
```sh
INFLUXDB_CLOUD_ORG=<your_influxdb_cloud_org>
INFLUXDB_CLOUD_TOKEN=<your_influxdb_cloud_token>
INFLUXDB_CLOUD_BUCKET=<your_influxdb_cloud_bucket>

curl -so /dev/null --request POST \
  https://cloud2.influxdata.com/api/v2/query?org=$INFLUXDB_CLOUD_ORG \
  --header "Authorization: Token $INFLUXDB_CLOUD_TOKEN" \
  --header "Accept: application/csv" \
  --header "Content-type: application/vnd.flux" \
  --data "from(bucket:\"$INFLUXDB_CLOUD_BUCKET\") |> range(start: -1d, stop: now())" \
  --write-out '%{size_download}'
```
{{% note %}}
You can also use other HTTP API tools like [Postman](https://www.postman.com/)
that provide the size of the response body.
{{% /note %}}
Divide the output of this command by 1000000 to convert it to megabytes (MB),
then use the following formula to estimate your batch interval:
```
batchInterval = (read-rate-limit-mb / response-body-size-mb) * range-duration
```
For example, if the response body of a query that returns one day of data
is 8 MB and you're using the InfluxDB Cloud Free Plan with a read limit of
300 MB per five minutes:
```js
batchInterval = (300 / 8) * 1d
// batchInterval = 37d
```
You could query 37 days of data before hitting your read limit, but this is just an estimate.
We recommend setting the `batchInterval` slightly lower than the calculated interval
to allow for variation between batches.
So in this example, **it would be best to set your `batchInterval` to `35d`**.
##### Important things to note
- This assumes no other queries are running in your InfluxDB Cloud organization.
- You should also consider your network speeds and whether a batch can be fully
downloaded within the [task interval](#determine-your-task-interval).
{{% /expand %}}
<!------------------------ END Determine batch interval ----------------------->
{{< /expand-wrapper >}}
## Monitor the migration progress
The [InfluxDB Cloud Migration Community template](https://github.com/influxdata/community-templates/tree/master/influxdb-cloud-oss-migration/)
installs the migration task outlined in this guide as well as a dashboard
for monitoring running data migrations.
{{< img-hd src="/img/influxdb/2-1-migration-dashboard.png" alt="InfluxDB Cloud migration dashboard" />}}
<a class="btn" href="https://github.com/influxdata/community-templates/tree/master/influxdb-cloud-oss-migration/#quick-install">Install the InfluxDB Cloud Migration template</a>
## Troubleshoot migration task failures
If the migration task fails, [view your task logs](/influxdb/v2.1/process-data/manage-tasks/task-run-history/)
to identify the specific error. Below are common causes of migration task failures.
- [Exceeded rate limits](#exceeded-rate-limits)
- [Invalid API token](#invalid-api-token)
- [Query timeout](#query-timeout)
### Exceeded rate limits
If your data migration causes you to exceed your InfluxDB Cloud organization's
limits and quotas, the task will return an error similar to:
```
too many requests
```
**Possible solutions**:
- Update the `migration.batchInterval` setting in your migration task to use
a smaller interval. Each batch will then query less data.
### Invalid API token
If the API token you store as the `INFLUXDB_CLOUD_TOKEN` secret doesn't have read access to
your InfluxDB Cloud bucket, the task will return an error similar to:
```
unauthorized access
```
**Possible solutions**:
- Ensure the API token has read access to your InfluxDB Cloud bucket.
- Generate a new InfluxDB Cloud API token with read access to the bucket you
want to migrate. Then, update the `INFLUXDB_CLOUD_TOKEN` secret in your
InfluxDB OSS instance with the new token.
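
A minimal sketch of updating the secret with the `influx` CLI against your
InfluxDB OSS instance (the token value is a placeholder):

```sh
influx secret update \
  --key INFLUXDB_CLOUD_TOKEN \
  --value <new-influxdb-cloud-token>
```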
### Query timeout
The InfluxDB Cloud query timeout is 90 seconds. If it takes longer than this to
return the data from the batch interval, the query will time out and the
task will fail.
**Possible solutions**:
- Update the `migration.batchInterval` setting in your migration task to use
a smaller interval. Each batch will then query less data and take less time
to return results.

View File

@@ -0,0 +1,64 @@
---
title: Migrate data from InfluxDB OSS to other InfluxDB instances
description: >
  To migrate data from an InfluxDB OSS bucket to another InfluxDB OSS or InfluxDB
  Cloud bucket, export your data as line protocol and write it to your other
  InfluxDB bucket.
menu:
  influxdb_2_1:
    name: Migrate data from OSS
    parent: Migrate data
weight: 101
---
To migrate data from an InfluxDB OSS bucket to another InfluxDB OSS or InfluxDB
Cloud bucket, export your data as line protocol and write it to your other
InfluxDB bucket.
{{% cloud %}}
#### InfluxDB Cloud write limits
If migrating data from InfluxDB OSS to InfluxDB Cloud, you are subject to your
[InfluxDB Cloud organization's rate limits and adjustable quotas](/influxdb/cloud/account-management/limits/).
Consider exporting your data in time-based batches to limit the file size
of exported line protocol to match your InfluxDB Cloud organization's limits.
{{% /cloud %}}
1.  [Find the InfluxDB OSS bucket ID](/influxdb/v2.1/organizations/buckets/view-buckets/)
    that contains data you want to migrate.
2.  Use the `influxd inspect export-lp` command to export data in your bucket as
    [line protocol](/influxdb/v2.1/reference/syntax/line-protocol/).
    Provide the following:

    - **bucket ID**: ({{< req >}}) ID of the bucket to migrate.
    - **engine path**: ({{< req >}}) Path to the TSM storage files on disk.
      The default engine path [depends on your operating system](/influxdb/v2.1/reference/internals/file-system-layout/#file-system-layout).
      If you use a [custom engine path](/influxdb/v2.1/reference/config-options/#engine-path),
      provide your custom path.
    - **output path**: ({{< req >}}) File path to output line protocol to.
    - **start time**: Earliest time to export.
    - **end time**: Latest time to export.
    - **measurement**: Export a specific measurement. By default, the command
      exports all measurements.
    - **compression**: ({{< req text="Recommended" color="magenta" >}})
      Use Gzip compression to compress the output line protocol file.

    ```sh
    influxd inspect export-lp \
      --bucket-id 12ab34cd56ef \
      --engine-path ~/.influxdbv2/engine \
      --output-path path/to/export.lp \
      --start 2022-01-01T00:00:00Z \
      --end 2022-01-31T23:59:59Z \
      --compress
    ```
3.  Write the exported line protocol to your InfluxDB OSS or InfluxDB Cloud instance.
    Do any of the following:

    - Write line protocol in the **InfluxDB UI**:
      - [InfluxDB Cloud UI](/influxdb/cloud/write-data/no-code/load-data/#load-csv-or-line-protocol-in-ui)
      - [InfluxDB OSS {{< current-version >}} UI](/influxdb/v2.1/write-data/no-code/load-data/#load-csv-or-line-protocol-in-ui)
    - [Write line protocol using the `influx write` command](/influxdb/v2.1/reference/cli/influx/write/)
      (see the example after this list)
    - [Write line protocol using the InfluxDB API](/influxdb/v2.1/write-data/developer-tools/api/)
    - [Bulk ingest data (InfluxDB Cloud)](/influxdb/cloud/write-data/bulk-ingest-cloud/)
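
For example, a minimal sketch using the `influx write` command, assuming the
gzipped export file from the previous step (bucket name and path are placeholders):

```sh
influx write \
  --bucket example-destination-bucket \
  --compression gzip \
  --file path/to/export.lp.gz
```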

View File

@@ -17,6 +17,7 @@ related:
- /influxdb/v2.1/reference/syntax/line-protocol
- /influxdb/v2.1/reference/syntax/annotated-csv
- /influxdb/v2.1/reference/cli/influx/write
- /influxdb/v2.1/migrate-data/
- /resources/videos/ingest-data/, How to Ingest Data in InfluxDB (Video)
---
@@ -28,6 +29,7 @@ related:
- [Query and explore data](/influxdb/v2.1/query-data/)
- [Process data](/influxdb/v2.1/process-data/)
- [Visualize data](/influxdb/v2.1/visualize-data/)
- [Migrate data](/influxdb/v2.1/migrate-data/)
- [Monitor and alert](/influxdb/v2.1/monitor-alert/)
The following video discusses different ways to write data to InfluxDB:

View File

@@ -36,7 +36,7 @@ The URL in the examples depends on the version and location of your InfluxDB {{<
{{% /code-tab-content %}}
{{% code-tab-content %}}
```js
{{< get-shared-text "api/v2.0/write/write.sh" >}}
{{< get-shared-text "api/v2.0/write/write.mjs" >}}
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}

View File

@@ -14,7 +14,6 @@ menu:
Load data from the following sources in the InfluxDB user interface (UI):
- [CSV or line protocol file](#load-csv-or-line-protocol-in-ui)
- [Line protocol](#load-data-using-line-protocol)
- [Client libraries](#load-data-from-a-client-library-in-the-ui)
- [Telegraf plugins](#load-data-from-a-telegraf-plugin-in-the-ui)
@@ -22,17 +21,17 @@ Load data from the following sources in the InfluxDB user interface (UI):
Load CSV or line protocol data by uploading a file or pasting the data manually into the UI.
1. In the navigation menu on the left, click **Data (Load Data)** > **Sources**.
1. In the navigation menu on the left, click **Load Data** > **Sources**.
{{< nav-icon "data" >}}
2. Under **File Upload**, select the type of data to upload:
- **Annotated CSV**. Verify your CSV file follows the supported [annotated CSV](/influxdb/cloud/reference/syntax/annotated-csv/) syntax.
- **Line Protocol**. Verify your line protocol file adheres to the following conventions:
- Each line represents a data point.
- Each data point requires a:
- [*measurement*](/influxdb/cloud/reference/syntax/line-protocol/#measurement)
- [*field set*](/influxdb/cloud/reference/syntax/line-protocol/#field-set)
- (Optional) [*tag set*](/influxdb/cloud/reference/syntax/line-protocol/#tag-set)
- [*timestamp*](/influxdb/cloud/reference/syntax/line-protocol/#timestamp)
- Each data point requires a:
- [*measurement*](/influxdb/cloud/reference/syntax/line-protocol/#measurement)
- [*field set*](/influxdb/cloud/reference/syntax/line-protocol/#field-set)
- (Optional) [*tag set*](/influxdb/cloud/reference/syntax/line-protocol/#tag-set)
- [*timestamp*](/influxdb/cloud/reference/syntax/line-protocol/#timestamp)
For more information, see supported [line protocol](/influxdb/cloud/reference/syntax/line-protocol/) syntax.
@@ -44,7 +43,7 @@ Load CSV or line protocol data by uploading a file or pasting the data manually
### Load data from a client library in the UI
1. In the navigation menu on the left, click **Data (Load Data)** > **Sources**.
1. In the navigation menu on the left, click **Load Data** > **Sources**.
{{< nav-icon "data" >}}
2. Do one of the following:
- Enter a specific client library to search for in the **Search data writing methods** field.
@@ -59,7 +58,7 @@ Load CSV or line protocol data by uploading a file or pasting the data manually
### Load data from a Telegraf plugin in the UI
1. In the navigation menu on the left, click **Data (Load Data)** > **Sources**.
1. In the navigation menu on the left, click **Load Data** > **Sources**.
{{< nav-icon "data" >}}
2. Do one of the following:
- Enter a specific Telegraf plugin to search for in the **Search data writing methods** field.

View File

@@ -0,0 +1,4 @@
{{- $productPathData := findRE "[^/]+.*?" .Page.RelPermalink -}}
{{- $currentVersion := index $productPathData 1 -}}
{{- $currentVersionLink := replaceRE `\.` "%2e" $currentVersion -}}
{{ $currentVersionLink }}

Binary file not shown (new image, 164 KiB).