Update sample data page (#4629)

* Update sample data page

* port updated sample data to 2.5 and cloud
pull/4636/head^2
Scott Anderson 2022-11-14 16:56:33 -07:00 committed by GitHub
parent f62acc68b0
commit 645b294103
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 366 additions and 333 deletions


@ -10,184 +10,4 @@ menu: influxdb_cloud_ref
weight: 7 weight: 7
--- ---
Use **sample data** to familiarize yourself with time series data and InfluxDB Cloud. {{< duplicate-oss >}}
**Sample datasets** in InfluxDB Cloud let you access time series data without having to write data to InfluxDB. Sample datasets are available for download and can be written to InfluxDB or loaded at query time.
The sample data below contains both static and live datasets. A static sample dataset is not updated regularly and has fixed timestamps. A "live" sample dataset is updated regularly.
{{% note %}}
If writing a static sample dataset to a bucket with a limited retention period, use [sample.alignToNow()](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/aligntonow/) to shift timestamps to align the last point in the set to now. This will prevent writing points with timestamps beyond the bucket's retention period.
{{% /note %}}
## Sample datasets
Do one of the following:
- Run the sample data query in the **Script Editor** found in **Explore** from the left-hand navigation bar:
  - [Air sensor sample data](#air-sensor-sample-data)
  - [Bird migration sample data](#bird-migration-sample-data)
  - [NOAA sample data](#noaa-sample-data)
    - [NOAA NDBC data](#noaa-ndbc-data)
    - [NOAA water sample data](#noaa-water-sample-data)
  - [USGS Earthquake data](#usgs-earthquake-data)
- [Write sample data with an InfluxDB task](#write-sample-data-with-an-influxdb-task)
### Air sensor sample data
{{% caption %}}
**Size**: ~600 KB • **Updated**: every 15m
{{% /caption %}}
Air sensor sample data represents an "Internet of Things" (IoT) use case by simulating
temperature, humidity, and carbon monoxide levels for multiple rooms in a building.
To download and output the air sensor sample dataset, use the
[`sample.data()` function](/influxdb/cloud/reference/flux/stdlib/influxdb-sample/data/).
```js
import "influxdata/influxdb/sample"
sample.data(set: "airSensor")
```
#### Companion SQL sensor data
The air sensor sample dataset is paired with a relational SQL dataset with meta
information about sensors in each room.
These two sample datasets are used to demonstrate
[how to join time series data and relational data with Flux](/influxdb/cloud/query-data/flux/sql/#join-sql-data-with-data-in-influxdb)
in the [Query SQL data sources](/influxdb/cloud/query-data/flux/sql/) guide.
<a class="btn download" href="https://influx-testdata.s3.amazonaws.com/sample-sensor-info.csv" download>Download SQL air sensor data</a>
### Bird migration sample data
{{% caption %}}
**Size**: ~1.2 MB • **Updated**: N/A
{{% /caption %}}
Bird migration sample data is adapted from the
[Movebank: Animal Tracking data set](https://www.kaggle.com/pulkit8595/movebank-animal-tracking)
and represents animal migratory movements throughout 2019.
To download and output the bird migration sample dataset, use the
[`sample.data()` function](/influxdb/cloud/reference/flux/stdlib/influxdb-sample/data/).
```js
import "influxdata/influxdb/sample"
sample.data(set: "birdMigration")
```
The bird migration sample dataset is used in the [Work with geo-temporal data](/influxdb/cloud/query-data/flux/geo/)
guide to demonstrate how to query and analyze geo-temporal data.
### NOAA sample data
The following two National Oceanic and Atmospheric Administration (NOAA) datasets are
available to use with InfluxDB.
- [NOAA NDBC data](#noaa-ndbc-data)
- [NOAA water sample data](#noaa-water-sample-data)
#### NOAA NDBC data
{{% caption %}}
**Size**: ~1.3 MB • **Updated**: every 15m
{{% /caption %}}
The **NOAA National Data Buoy Center (NDBC)** dataset provides observations (updated every 15 minutes) from the NOAA NDBC network of buoys throughout the world.
To download and output the most recent NOAA NDBC observations, use the
[`sample.data()` function](/influxdb/cloud/reference/flux/stdlib/influxdb-sample/data/).
```js
import "influxdata/influxdb/sample"
sample.data(set: "noaa")
```
{{% note %}}
##### Store historical NOAA NDBC data
The **NOAA NDBC sample dataset** returns only the most recent observations,
not historical observations.
To regularly query and store NOAA NDBC observations, add the following as an
[InfluxDB task](/influxdb/v2.0/process-data/manage-tasks/).
Replace `example-org` and `example-bucket` with your organization name and the
name of the bucket to store data in.
{{% get-shared-text "flux/noaa-ndbc-sample-task.md" %}}
{{% /note %}}
#### NOAA water sample data
{{% caption %}}
**Size**: ~10 MB • **Updated**: N/A
{{% /caption %}}
The **NOAA water sample dataset** is a static dataset extracted from
[NOAA Center for Operational Oceanographic Products and Services](http://tidesandcurrents.noaa.gov/stations.html) data.
The sample dataset includes 15,258 observations of water levels (ft) collected every six minutes at two stations
(Santa Monica, CA (ID 9410840) and Coyote Creek, CA (ID 9414575)) over the period
from **August 18, 2015** through **September 18, 2015**.
{{% note %}}
##### Store NOAA water sample data to avoid bandwidth usage
To avoid having to re-download this 10MB dataset every time you run a query,
we recommend that you [create a new bucket](/influxdb/cloud/organizations/buckets/create-bucket/)
(`noaa`) and write the NOAA data to it.
We also recommend updating the timestamps of the data to be relative to `now()`.
To do so, run the following:
```js
import "experimental/csv"
relativeToNow = (tables=<-) => tables
|> elapsed()
|> sort(columns: ["_time"], desc: true)
|> cumulativeSum(columns: ["elapsed"])
|> map(fn: (r) => ({r with _time: time(v: int(v: now()) - r.elapsed * 1000000000)}))
csv.from(url: "https://influx-testdata.s3.amazonaws.com/noaa.csv")
|> relativeToNow()
|> to(bucket: "noaa", org: "example-org")
```
{{% /note %}}
The NOAA water sample dataset is used to demonstrate Flux queries in the
[Common queries](/influxdb/cloud/query-data/common-queries/) and
[Common tasks](/influxdb/cloud/process-data/common-tasks/) guides.
### USGS Earthquake data
{{% caption %}}
**Size**: ~6 MB • **Updated**: every 15m
{{% /caption %}}
The United States Geological Survey (USGS) earthquake dataset contains observations
collected from USGS seismic sensors around the world over the last week.
Data is updated approximately every 15m.
To download and output the last week of USGS seismic data, use the
[`sample.data()` function](/influxdb/cloud/reference/flux/stdlib/influxdb-sample/data/).
```js
import "influxdata/influxdb/sample"
sample.data(set: "usgs")
```
### Write sample data with an InfluxDB task
Use the [Flux InfluxDB sample package](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/) to download and write sample data to InfluxDB.
Add the following as an [InfluxDB task](/influxdb/cloud/process-data/manage-tasks/create-task/).
```js
import "influxdata/influxdb/sample"
option task = {name: "Collect NOAA NDBC data", every: 15m}
sample.data(set: "noaa")
|> to(bucket: "noaa")
```


@ -15,14 +15,31 @@ InfluxData provides many sample time series datasets to use with InfluxDB.
You can also use the [Flux InfluxDB sample package](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/) You can also use the [Flux InfluxDB sample package](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/)
to view, download, and output sample datasets. to view, download, and output sample datasets.
- [Air sensor sample data](#air-sensor-sample-data) - [Live datasets](#live-datasets)
- [Bird migration sample data](#bird-migration-sample-data) - [Air sensor sample data](#air-sensor-sample-data)
- [NOAA sample data](#noaa-sample-data) - [Bitcoin sample data](#bitcoin-sample-data)
- [NOAA NDBC data](#noaa-ndbc-data) - [NOAA NDBC data](#noaa-ndbc-data)
- [USGS Earthquake data](#usgs-earthquake-data)
- [Static datasets](#static-datasets)
- [Bird migration sample data](#bird-migration-sample-data)
- [Machine production sample data](#machine-production-sample-data)
- [NOAA water sample data](#noaa-water-sample-data) - [NOAA water sample data](#noaa-water-sample-data)
## Live datasets
Live sample datasets are continually updated with new data.
These sets can be loaded once and treated as a "static" dataset, or you can create
an [InfluxDB task](/influxdb/v2.4/process-data/get-started/) to periodically
collect and write new data.
- [Air sensor sample data](#air-sensor-sample-data)
- [Bitcoin sample data](#bitcoin-sample-data)
- [NOAA NDBC data](#noaa-ndbc-data)
- [USGS Earthquake data](#usgs-earthquake-data) - [USGS Earthquake data](#usgs-earthquake-data)
## Air sensor sample data ---
### Air sensor sample data
{{% caption %}} {{% caption %}}
**Size**: ~600 KB • **Updated**: every 15m **Size**: ~600 KB • **Updated**: every 15m
@ -31,16 +48,26 @@ to view, download, and output sample datasets.
Air sensor sample data represents an "Internet of Things" (IoT) use case by simulating Air sensor sample data represents an "Internet of Things" (IoT) use case by simulating
temperature, humidity, and carbon monoxide levels for multiple rooms in a building. temperature, humidity, and carbon monoxide levels for multiple rooms in a building.
To download and output the air sensor sample dataset, use the To continually download and write updated air sensor sample data to a bucket,
[`sample.data()` function](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/data/). [create an InfluxDB task](/influxdb/v2.4/process-data/manage-tasks/create-task/)
with the following Flux query.
_Replace `example-bucket` with your target bucket_.
```js ```js
import "influxdata/influxdb/sample" import "influxdata/influxdb/sample"
option task = {
name: "Collect air sensor sample data",
every: 15m,
}
sample.data(set: "airSensor") sample.data(set: "airSensor")
|> to(bucket: "example-bucket")
``` ```
#### Companion SQL sensor data {{< expand-wrapper >}}
{{% expand "Companion relational sensor data" %}}
The air sensor sample dataset is paired with a relational SQL dataset with meta The air sensor sample dataset is paired with a relational SQL dataset with meta
information about sensors in each room. information about sensors in each room.
These two sample datasets are used to demonstrate These two sample datasets are used to demonstrate
@ -49,35 +76,36 @@ in the [Query SQL data sources](/influxdb/v2.4/query-data/flux/sql/) guide.
<a class="btn download" href="https://influx-testdata.s3.amazonaws.com/sample-sensor-info.csv" download>Download SQL air sensor data</a> <a class="btn download" href="https://influx-testdata.s3.amazonaws.com/sample-sensor-info.csv" download>Download SQL air sensor data</a>
## Bird migration sample data {{% /expand %}}
{{< /expand-wrapper >}}
### Bitcoin sample data
{{% caption %}} {{% caption %}}
**Size**: ~1.2 MB • **Updated**: N/A **Size**: ~700 KB • **Updated**: every 15m
{{% /caption %}} {{% /caption %}}
Bird migration sample data is adapted from the The Bitcoin sample dataset provides Bitcoin prices from the last 30
[Movebank: Animal Tracking data set](https://www.kaggle.com/pulkit8595/movebank-animal-tracking) days—[Powered by CoinDesk](https://www.coindesk.com/price/bitcoin).
and represents animal migratory movements throughout 2019.
To download and output the bird migration sample dataset, use the To continually download and write updated Bitcoin sample data to a bucket,
[`sample.data()` function](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/data/). [create an InfluxDB task](/influxdb/v2.4/process-data/manage-tasks/create-task/)
with the following Flux query.
_Replace `example-bucket` with your target bucket_.
```js ```js
import "influxdata/influxdb/sample" import "influxdata/influxdb/sample"
sample.data(set: "birdMigration") option task = {
name: "Collect Bitcoin sample data",
every: 15m,
}
sample.data(set: "bitcoin")
|> to(bucket: "example-bucket")
``` ```
The bird migration sample dataset is used in the [Work with geo-temporal data](/influxdb/v2.4/query-data/flux/geo/) ---
guide to demonstrate how to query and analyze geo-temporal data.
## NOAA sample data
There are two National Oceanic and Atmospheric Administration (NOAA) datasets
available to use with InfluxDB.
- [NOAA NDBC data](#noaa-ndbc-data)
- [NOAA water sample data](#noaa-water-sample-data)
### NOAA NDBC data ### NOAA NDBC data
@ -85,63 +113,29 @@ available to use with InfluxDB.
**Size**: ~1.3 MB • **Updated**: every 15m **Size**: ~1.3 MB • **Updated**: every 15m
{{% /caption %}} {{% /caption %}}
The **NOAA National Data Buoy Center (NDBC)** dataset provides the latest The **National Oceanic and Atmospheric Administration (NOAA) National Data Buoy Center (NDBC)**
observations from the NOAA NDBC network of buoys throughout the world. dataset provides the latest observations from the NOAA NDBC network of buoys throughout the world.
Observations are updated approximately every 15 minutes.
To download and output the most recent NOAA NDBC observations, use the To continually download and write updated NOAA NDBC sample data to a bucket,
[`sample.data()` function](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/data/). [create an InfluxDB task](/influxdb/v2.4/process-data/manage-tasks/create-task/)
with the following Flux query.
_Replace `example-bucket` with your target bucket_.
```js ```js
import "influxdata/influxdb/sample" import "influxdata/influxdb/sample"
option task = {
name: "Collect NOAA NDBC sample data",
every: 15m,
}
sample.data(set: "noaa") sample.data(set: "noaa")
|> to(bucket: "example-bucket")
``` ```
{{% note %}} ---
#### Store historical NOAA NDBC data
The **NOAA NDBC sample dataset** only returns the most recent observations; ### USGS Earthquake data
not historical observations.
To regularly query and store NOAA NDBC observations, add the following as an
[InfluxDB task](/influxdb/v2.4/process-data/manage-tasks/).
Replace `example-org` and `example-bucket` with your organization name and the
name of the bucket to store data in.
{{% get-shared-text "flux/noaa-ndbc-sample-task.md" %}}
{{% /note %}}
### NOAA water sample data
{{% caption %}}
**Size**: ~10 MB • **Updated**: N/A
{{% /caption %}}
The **NOAA water sample dataset** is a static dataset extracted from
[NOAA Center for Operational Oceanographic Products and Services](http://tidesandcurrents.noaa.gov/stations.html) data.
The sample dataset includes 15,258 observations of water levels (ft) collected every six minutes at two stations
(Santa Monica, CA (ID 9410840) and Coyote Creek, CA (ID 9414575)) over the period
from **August 18, 2015** through **September 18, 2015**.
{{% note %}}
#### Store NOAA water sample data to avoid bandwidth usage
To avoid having to re-download this 10MB dataset every time you run a query,
we recommend that you [create a new bucket](/influxdb/v2.4/organizations/buckets/create-bucket/)
(`noaa`) and write the NOAA sample water data to it.
```js
import "experimental/csv"
csv.from(url: "https://influx-testdata.s3.amazonaws.com/noaa.csv")
|> to(bucket: "noaa", org: "example-org")
```
{{% /note %}}
The NOAA water sample dataset is used to demonstrate Flux queries in the
[Common queries](/influxdb/v2.4/query-data/common-queries/) and
[Common tasks](/influxdb/v2.4/process-data/common-tasks/) guides.
## USGS Earthquake data
{{% caption %}} {{% caption %}}
**Size**: ~6 MB • **Updated**: every 15m **Size**: ~6 MB • **Updated**: every 15m
@ -149,13 +143,114 @@ The NOAA water sample dataset is used to demonstrate Flux queries in the
The United States Geological Survey (USGS) earthquake dataset contains observations The United States Geological Survey (USGS) earthquake dataset contains observations
collected from USGS seismic sensors around the world over the last week. collected from USGS seismic sensors around the world over the last week.
Data is updated approximately every 15m.
To download and output the last week of USGS seismic data, use the To continually download and write updated USGS earthquake sample data to a bucket,
[`sample.data()` function](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/data/). [create an InfluxDB task](/influxdb/v2.4/process-data/manage-tasks/create-task/)
with the following Flux query.
_Replace `example-bucket` with your target bucket_.
```js ```js
import "influxdata/influxdb/sample" import "influxdata/influxdb/sample"
option task = {
name: "Collect USGS sample data",
every: 15m,
}
sample.data(set: "usgs") sample.data(set: "usgs")
|> to(bucket: "example-bucket")
``` ```
---
## Static datasets
Static datasets are fixed datasets from a specific past time range.
- [Bird migration sample data](#bird-migration-sample-data)
- [Machine production sample data](#machine-production-sample-data)
- [NOAA water sample data](#noaa-water-sample-data)
---
### Bird migration sample data
{{% caption %}}
**Size**: ~1.2 MB
**Time range**: 2019-04-01T13:00:00Z to 2019-04-12T20:00:00Z
{{% /caption %}}
Bird migration sample data is adapted from the
[Movebank: Animal Tracking dataset](https://www.kaggle.com/pulkit8595/movebank-animal-tracking)
and represents animal migratory movements throughout 2019.
To download and write the bird migration sample data to a bucket, run the
following Flux query.
_Replace `example-bucket` with your target bucket_.
```js
import "influxdata/influxdb/sample"
sample.data(set: "birdMigration")
|> to(bucket: "example-bucket")
```
The bird migration sample dataset is used in the [Work with geo-temporal data](/influxdb/v2.4/query-data/flux/geo/)
guide to demonstrate how to query and analyze geo-temporal data.
---
### Machine production sample data
{{% caption %}}
**Size**: ~11.9 MB
**Time range**: 2021-08-01T00:00:00Z to 2021-08-02T00:00:00Z
{{% /caption %}}
The machine production sample dataset includes states and metrics reported from
four automated grinding wheel stations on a production line.
To download and write the machine production sample data to a bucket, run the
following Flux query.
_Replace `example-bucket` with your target bucket_.
```js
import "influxdata/influxdb/sample"
sample.data(set: "machineProduction")
|> to(bucket: "example-bucket")
```
The machine production data is used in the
[IoT sensor common query](/influxdb/v2.4/query-data/common-queries/iot-common-queries/) guide.
---
### NOAA water sample data
{{% caption %}}
**Size**: ~10 MB
**Time range**: 2019-08-17T00:00:00Z to 2019-09-17T22:00:00Z
{{% /caption %}}
The **National Oceanic and Atmospheric Administration (NOAA) water sample dataset**
is a static dataset extracted from
[NOAA Center for Operational Oceanographic Products and Services](http://tidesandcurrents.noaa.gov/stations.html) data.
The sample dataset includes 15,258 observations of water levels (ft) collected every six minutes at two stations
(Santa Monica, CA (ID 9410840) and Coyote Creek, CA (ID 9414575)) over the period
from **August 17, 2019** through **September 17, 2019**.
To download and write the NOAA water sample data to a bucket, run the
following Flux query.
_Replace `example-bucket` with your target bucket_.
```js
import "influxdata/influxdb/sample"
sample.data(set: "noaaWater")
|> to(bucket: "example-bucket")
```
The NOAA water sample dataset is used to demonstrate Flux queries in the
[Common queries](/influxdb/v2.4/query-data/common-queries/) and
[Common tasks](/influxdb/v2.4/process-data/common-tasks/) guides.


@ -15,14 +15,31 @@ InfluxData provides many sample time series datasets to use with InfluxDB.
You can also use the [Flux InfluxDB sample package](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/) You can also use the [Flux InfluxDB sample package](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/)
to view, download, and output sample datasets. to view, download, and output sample datasets.
- [Air sensor sample data](#air-sensor-sample-data) - [Live datasets](#live-datasets)
- [Bird migration sample data](#bird-migration-sample-data) - [Air sensor sample data](#air-sensor-sample-data)
- [NOAA sample data](#noaa-sample-data) - [Bitcoin sample data](#bitcoin-sample-data)
- [NOAA NDBC data](#noaa-ndbc-data) - [NOAA NDBC data](#noaa-ndbc-data)
- [USGS Earthquake data](#usgs-earthquake-data)
- [Static datasets](#static-datasets)
- [Bird migration sample data](#bird-migration-sample-data)
- [Machine production sample data](#machine-production-sample-data)
- [NOAA water sample data](#noaa-water-sample-data) - [NOAA water sample data](#noaa-water-sample-data)
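To see which datasets the sample package offers before choosing one, the following is a minimal sketch. It assumes `sample.list()` is available in your Flux version.

```js
import "influxdata/influxdb/sample"

// Output a table describing each available sample dataset
sample.list()
```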
## Live datasets
Live sample datasets are continually updated with new data.
These sets can be loaded once and treated as a "static" dataset, or you can create
an [InfluxDB task](/influxdb/v2.5/process-data/get-started/) to periodically
collect and write new data.
- [Air sensor sample data](#air-sensor-sample-data)
- [Bitcoin sample data](#bitcoin-sample-data)
- [NOAA NDBC data](#noaa-ndbc-data)
- [USGS Earthquake data](#usgs-earthquake-data) - [USGS Earthquake data](#usgs-earthquake-data)
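Loading a live dataset once, without a task, can be sketched as follows. The query has no `option task` block, so it runs only when you execute it. `example-bucket` is a placeholder bucket name.

```js
import "influxdata/influxdb/sample"

// One-time load: download the current USGS data and write it to a bucket
sample.data(set: "usgs")
    |> to(bucket: "example-bucket")
```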
## Air sensor sample data ---
### Air sensor sample data
{{% caption %}} {{% caption %}}
**Size**: ~600 KB • **Updated**: every 15m **Size**: ~600 KB • **Updated**: every 15m
@ -31,16 +48,26 @@ to view, download, and output sample datasets.
Air sensor sample data represents an "Internet of Things" (IoT) use case by simulating Air sensor sample data represents an "Internet of Things" (IoT) use case by simulating
temperature, humidity, and carbon monoxide levels for multiple rooms in a building. temperature, humidity, and carbon monoxide levels for multiple rooms in a building.
To download and output the air sensor sample dataset, use the To continually download and write updated air sensor sample data to a bucket,
[`sample.data()` function](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/data/). [create an InfluxDB task](/influxdb/v2.5/process-data/manage-tasks/create-task/)
with the following Flux query.
_Replace `example-bucket` with your target bucket_.
```js ```js
import "influxdata/influxdb/sample" import "influxdata/influxdb/sample"
option task = {
name: "Collect air sensor sample data",
every: 15m,
}
sample.data(set: "airSensor") sample.data(set: "airSensor")
|> to(bucket: "example-bucket")
``` ```
#### Companion SQL sensor data {{< expand-wrapper >}}
{{% expand "Companion relational sensor data" %}}
The air sensor sample dataset is paired with a relational SQL dataset with meta The air sensor sample dataset is paired with a relational SQL dataset with meta
information about sensors in each room. information about sensors in each room.
These two sample datasets are used to demonstrate These two sample datasets are used to demonstrate
@ -49,35 +76,36 @@ in the [Query SQL data sources](/influxdb/v2.5/query-data/flux/sql/) guide.
<a class="btn download" href="https://influx-testdata.s3.amazonaws.com/sample-sensor-info.csv" download>Download SQL air sensor data</a> <a class="btn download" href="https://influx-testdata.s3.amazonaws.com/sample-sensor-info.csv" download>Download SQL air sensor data</a>
## Bird migration sample data {{% /expand %}}
{{< /expand-wrapper >}}
### Bitcoin sample data
{{% caption %}} {{% caption %}}
**Size**: ~1.2 MB • **Updated**: N/A **Size**: ~700 KB • **Updated**: every 15m
{{% /caption %}} {{% /caption %}}
Bird migration sample data is adapted from the The Bitcoin sample dataset provides Bitcoin prices from the last 30
[Movebank: Animal Tracking data set](https://www.kaggle.com/pulkit8595/movebank-animal-tracking) days—[Powered by CoinDesk](https://www.coindesk.com/price/bitcoin).
and represents animal migratory movements throughout 2019.
To download and output the bird migration sample dataset, use the To continually download and write updated Bitcoin sample data to a bucket,
[`sample.data()` function](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/data/). [create an InfluxDB task](/influxdb/v2.5/process-data/manage-tasks/create-task/)
with the following Flux query.
_Replace `example-bucket` with your target bucket_.
```js ```js
import "influxdata/influxdb/sample" import "influxdata/influxdb/sample"
sample.data(set: "birdMigration") option task = {
name: "Collect Bitcoin sample data",
every: 15m,
}
sample.data(set: "bitcoin")
|> to(bucket: "example-bucket")
``` ```
The bird migration sample dataset is used in the [Work with geo-temporal data](/influxdb/v2.5/query-data/flux/geo/) ---
guide to demonstrate how to query and analyze geo-temporal data.
## NOAA sample data
There are two National Oceanic and Atmospheric Administration (NOAA) datasets
available to use with InfluxDB.
- [NOAA NDBC data](#noaa-ndbc-data)
- [NOAA water sample data](#noaa-water-sample-data)
### NOAA NDBC data ### NOAA NDBC data
@ -85,63 +113,29 @@ available to use with InfluxDB.
**Size**: ~1.3 MB • **Updated**: every 15m **Size**: ~1.3 MB • **Updated**: every 15m
{{% /caption %}} {{% /caption %}}
The **NOAA National Data Buoy Center (NDBC)** dataset provides the latest The **National Oceanic and Atmospheric Administration (NOAA) National Data Buoy Center (NDBC)**
observations from the NOAA NDBC network of buoys throughout the world. dataset provides the latest observations from the NOAA NDBC network of buoys throughout the world.
Observations are updated approximately every 15 minutes.
To download and output the most recent NOAA NDBC observations, use the To continually download and write updated NOAA NDBC sample data to a bucket,
[`sample.data()` function](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/data/). [create an InfluxDB task](/influxdb/v2.5/process-data/manage-tasks/create-task/)
with the following Flux query.
_Replace `example-bucket` with your target bucket_.
```js ```js
import "influxdata/influxdb/sample" import "influxdata/influxdb/sample"
option task = {
name: "Collect NOAA NDBC sample data",
every: 15m,
}
sample.data(set: "noaa") sample.data(set: "noaa")
|> to(bucket: "example-bucket")
``` ```
{{% note %}} ---
#### Store historical NOAA NDBC data
The **NOAA NDBC sample dataset** only returns the most recent observations; ### USGS Earthquake data
not historical observations.
To regularly query and store NOAA NDBC observations, add the following as an
[InfluxDB task](/influxdb/v2.5/process-data/manage-tasks/).
Replace `example-org` and `example-bucket` with your organization name and the
name of the bucket to store data in.
{{% get-shared-text "flux/noaa-ndbc-sample-task.md" %}}
{{% /note %}}
### NOAA water sample data
{{% caption %}}
**Size**: ~10 MB • **Updated**: N/A
{{% /caption %}}
The **NOAA water sample dataset** is a static dataset extracted from
[NOAA Center for Operational Oceanographic Products and Services](http://tidesandcurrents.noaa.gov/stations.html) data.
The sample dataset includes 15,258 observations of water levels (ft) collected every six minutes at two stations
(Santa Monica, CA (ID 9410840) and Coyote Creek, CA (ID 9414575)) over the period
from **August 18, 2015** through **September 18, 2015**.
{{% note %}}
#### Store NOAA water sample data to avoid bandwidth usage
To avoid having to re-download this 10MB dataset every time you run a query,
we recommend that you [create a new bucket](/influxdb/v2.5/organizations/buckets/create-bucket/)
(`noaa`) and write the NOAA sample water data to it.
```js
import "experimental/csv"
csv.from(url: "https://influx-testdata.s3.amazonaws.com/noaa.csv")
|> to(bucket: "noaa", org: "example-org")
```
{{% /note %}}
The NOAA water sample dataset is used to demonstrate Flux queries in the
[Common queries](/influxdb/v2.5/query-data/common-queries/) and
[Common tasks](/influxdb/v2.5/process-data/common-tasks/) guides.
## USGS Earthquake data
{{% caption %}} {{% caption %}}
**Size**: ~6 MB • **Updated**: every 15m **Size**: ~6 MB • **Updated**: every 15m
@ -149,13 +143,137 @@ The NOAA water sample dataset is used to demonstrate Flux queries in the
The United States Geological Survey (USGS) earthquake dataset contains observations The United States Geological Survey (USGS) earthquake dataset contains observations
collected from USGS seismic sensors around the world over the last week. collected from USGS seismic sensors around the world over the last week.
Data is updated approximately every 15m.
To download and output the last week of USGS seismic data, use the To continually download and write updated USGS earthquake sample data to a bucket,
[`sample.data()` function](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/data/). [create an InfluxDB task](/influxdb/v2.5/process-data/manage-tasks/create-task/)
with the following Flux query.
_Replace `example-bucket` with your target bucket_.
```js ```js
import "influxdata/influxdb/sample" import "influxdata/influxdb/sample"
option task = {
name: "Collect USGS sample data",
every: 15m,
}
sample.data(set: "usgs") sample.data(set: "usgs")
|> to(bucket: "example-bucket")
``` ```
---
## Static datasets
Static datasets are fixed datasets from a specific past time range.
- [Bird migration sample data](#bird-migration-sample-data)
- [Machine production sample data](#machine-production-sample-data)
- [NOAA water sample data](#noaa-water-sample-data)
{{% cloud-only %}}
{{% note %}}
#### Static sample data and bucket retention periods
If writing a static sample dataset to a bucket with a limited retention period,
use [sample.alignToNow()](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/sample/aligntonow/)
to shift timestamps to align the last point in the set to now.
This will prevent writing points with timestamps beyond the bucket's retention period.
For example:
```js
import "influxdata/influxdb/sample"
sample.data(set: "birdMigration")
|> sample.alignToNow()
|> to(bucket: "example-bucket")
```
{{% /note %}}
{{% /cloud-only %}}
---
### Bird migration sample data
{{% caption %}}
**Size**: ~1.2 MB
**Time range**: 2019-04-01T13:00:00Z to 2019-04-12T20:00:00Z
{{% /caption %}}
Bird migration sample data is adapted from the
[Movebank: Animal Tracking dataset](https://www.kaggle.com/pulkit8595/movebank-animal-tracking)
and represents animal migratory movements throughout 2019.
To download and write the bird migration sample data to a bucket, run the
following Flux query.
_Replace `example-bucket` with your target bucket_.
```js
import "influxdata/influxdb/sample"
sample.data(set: "birdMigration")
|> to(bucket: "example-bucket")
```
The bird migration sample dataset is used in the [Work with geo-temporal data](/influxdb/v2.5/query-data/flux/geo/)
guide to demonstrate how to query and analyze geo-temporal data.
---
### Machine production sample data
{{% caption %}}
**Size**: ~11.9 MB
**Time range**: 2021-08-01T00:00:00Z to 2021-08-02T00:00:00Z
{{% /caption %}}
The machine production sample dataset includes states and metrics reported from
four automated grinding wheel stations on a production line.
To download and write the machine production sample data to a bucket, run the
following Flux query.
_Replace `example-bucket` with your target bucket_.
```js
import "influxdata/influxdb/sample"
sample.data(set: "machineProduction")
|> to(bucket: "example-bucket")
```
The machine production data is used in the
[IoT sensor common query](/influxdb/v2.5/query-data/common-queries/iot-common-queries/) guide.
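After writing the dataset, a quick check that the points landed might look like the following sketch. It assumes the data was written to `example-bucket` with its original timestamps (without `sample.alignToNow()`), so the query range starts at the dataset's first timestamp. Any other data in the bucket within that range is also returned.

```js
// Read back up to five rows per table, starting at the dataset's first timestamp
from(bucket: "example-bucket")
    |> range(start: 2021-08-01T00:00:00Z)
    |> limit(n: 5)
```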
---
### NOAA water sample data
{{% caption %}}
**Size**: ~10 MB
**Time range**: 2019-08-17T00:00:00Z to 2019-09-17T22:00:00Z
{{% /caption %}}
The **National Oceanic and Atmospheric Administration (NOAA) water sample dataset**
is a static dataset extracted from
[NOAA Center for Operational Oceanographic Products and Services](http://tidesandcurrents.noaa.gov/stations.html) data.
The sample dataset includes 15,258 observations of water levels (ft) collected every six minutes at two stations
(Santa Monica, CA (ID 9410840) and Coyote Creek, CA (ID 9414575)) over the period
from **August 17, 2019** through **September 17, 2019**.
To download and write the NOAA water sample data to a bucket, run the
following Flux query.
_Replace `example-bucket` with your target bucket_.
```js
import "influxdata/influxdb/sample"
sample.data(set: "noaaWater")
|> to(bucket: "example-bucket")
```
The NOAA water sample dataset is used to demonstrate Flux queries in the
[Common queries](/influxdb/v2.5/query-data/common-queries/) and
[Common tasks](/influxdb/v2.5/process-data/common-tasks/) guides.