Merge pull request #851 from influxdata/geo-guide

Geo guides
pull/860/head
Scott Anderson 2020-03-23 14:22:13 -06:00 committed by GitHub
commit 3b2d827e37
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 425 additions and 12 deletions


@ -75,6 +75,7 @@ related: # Creates links to specific internal and external content at the bottom
external_url: # Used in children shortcode type="list" for page links that are external
list_image: # Image included with article descriptions in children type="articles" shortcode
list_note: # Used in children shortcode type="list" to add a small note next to listed links
list_code_example: # Code example included with article descriptions in children type="articles" shortcode
```
#### Title usage
@ -406,17 +407,29 @@ The following list types are available:
- **list:** lists children article links in an unordered list.
- **functions:** a special use-case designed for listing Flux functions.
#### Include a code example with a child summary
Use the `list_code_example` frontmatter to provide a code example with an article
in an articles list.
~~~yaml
list_code_example: |
```sh
This is a code example
```
~~~
#### Children frontmatter
Each children list `type` uses [frontmatter properties](#page-frontmatter) when generating the list of articles.
The following table shows which children types use which frontmatter properties:
| Frontmatter | articles | list | functions |
|:----------- |:--------:|:----:|:---------:|
| `list_title` | ✓ | ✓ | ✓ |
| `description` | ✓ | | |
| `external_url` | ✓ | ✓ | |
| `list_image` | ✓ | | |
| `list_note` | | ✓ | |
| Frontmatter | articles | list | functions |
|:----------- |:--------:|:----:|:---------:|
| `list_title` | ✓ | ✓ | ✓ |
| `description` | ✓ | | |
| `external_url` | ✓ | ✓ | |
| `list_image` | ✓ | | |
| `list_note` | | ✓ | |
| `list_code_example` | ✓ | | |
### Inline icons
The `icon` shortcode allows you to inject icons in paragraph text.


@ -33,4 +33,4 @@ data = from(bucket: "example-bucket")
---
{{< children >}}
{{< children pages="all" >}}


@ -0,0 +1,80 @@
---
title: Work with geo-temporal data
description: >
Use the Flux Geo package to filter geo-temporal data and group by geographic location or track.
menu:
v2_0:
name: Geo-temporal data
parent: Query with Flux
weight: 220
---
Use the [Flux Geo package](/v2.0/reference/flux/stdlib/experimental/geo) to
filter geo-temporal data and group by geographic location or track.
{{% warn %}}
The Geo package is experimental and subject to change at any time.
By using it, you agree to the [risks of experimental functions](/v2.0/reference/flux/stdlib/experimental/#use-experimental-functions-at-your-own-risk).
{{% /warn %}}
**To work with geo-temporal data:**
1. Import the `experimental/geo` package.
```js
import "experimental/geo"
```
2. Load geo-temporal data. _See below for [sample geo-temporal data](#sample-data)._
3. Do one or more of the following:
- [Shape data to work with the Geo package](#shape-data-to-work-with-the-geo-package)
- [Filter data by region](#filter-geo-temporal-data-by-region) (using strict or non-strict filters)
- [Group data by area or by track](#group-geo-temporal-data)
{{< children >}}
---
## Sample data
Many of the examples in this section use a `sampleGeoData` variable that represents
a sample set of geo-temporal data.
The [Bird Migration Sample Data](https://github.com/influxdata/influxdb2-sample-data/tree/master/bird-migration-data)
available on GitHub provides sample geo-temporal data that meets the
[requirements of the Flux Geo package](/v2.0/reference/flux/stdlib/experimental/geo/#geo-schema-requirements).
### Load annotated CSV sample data
Use the [experimental `csv.from()` function](/v2.0/reference/flux/stdlib/experimental/csv/from/)
to load the sample bird migration annotated CSV data from GitHub:
```js
import "experimental/csv"
sampleGeoData = csv.from(
url: "https://raw.githubusercontent.com/influxdata/influxdb2-sample-data/master/bird-migration-data/bird-migration.csv"
)
```
{{% note %}}
`csv.from(url: ...)` downloads sample data each time you execute the query **(~1.3 MB)**.
If bandwidth is a concern, use the [`to()` function](/v2.0/reference/flux/stdlib/built-in/outputs/to/)
to write the data to a bucket, and then query the bucket with [`from()`](/v2.0/reference/flux/stdlib/built-in/inputs/from/).
{{% /note %}}
### Write sample data to InfluxDB with line protocol
Use `curl` and the `influx write` command to write bird migration line protocol to InfluxDB.
Replace `example-bucket` with your destination bucket:
```sh
curl https://raw.githubusercontent.com/influxdata/influxdb2-sample-data/master/bird-migration-data/bird-migration.line --output ./tmp-data
influx write -b example-bucket @./tmp-data
rm -f ./tmp-data
```
Use Flux to query the bird migration data and assign it to the `sampleGeoData` variable:
```js
sampleGeoData = from(bucket: "example-bucket")
|> range(start: 2019-01-01T00:00:00Z, stop: 2019-12-31T23:59:59Z)
|> filter(fn: (r) => r._measurement == "migration")
```


@ -0,0 +1,127 @@
---
title: Filter geo-temporal data by region
description: >
Use the `geo.filterRows` function to filter geo-temporal data by box-shaped, circular, or polygonal geographic regions.
menu:
v2_0:
name: Filter by region
parent: Geo-temporal data
weight: 302
list_code_example: |
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(
region: {lat: 30.04, lon: 31.23, radius: 200.0},
strict: true
)
```
---
Use the [`geo.filterRows` function](/v2.0/reference/flux/stdlib/experimental/geo/filterrows/)
to filter geo-temporal data by geographic region:
1. [Define a geographic region](#define-a-geographic-region)
2. [Use strict or non-strict filtering](#strict-and-non-strict-filtering)
The following example uses the [sample bird migration data](/v2.0/query-data/flux/geo/#sample-data)
and queries data points **within 200km of Cairo, Egypt**:
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(
region: {lat: 30.04, lon: 31.23, radius: 200.0},
strict: true
)
```
## Define a geographic region
Many functions in the Geo package filter data based on geographic region.
Define a geographic region using one of the following shapes:
- [box](#box)
- [circle](#circle)
- [polygon](#polygon)
### box
Define a box-shaped region by specifying an object containing the following properties:
- **minLat:** minimum latitude in decimal degrees (WGS 84) _(Float)_
- **maxLat:** maximum latitude in decimal degrees (WGS 84) _(Float)_
- **minLon:** minimum longitude in decimal degrees (WGS 84) _(Float)_
- **maxLon:** maximum longitude in decimal degrees (WGS 84) _(Float)_
##### Example box-shaped region
```js
{
minLat: 40.51757813,
maxLat: 40.86914063,
minLon: -73.65234375,
maxLon: -72.94921875
}
```
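To make the box definition concrete, the following Python sketch tests whether a point falls inside a box-shaped region.
It is an illustration only: `in_box` is a hypothetical helper, not part of the Flux Geo package, and it assumes a box that does not cross the antimeridian.

```python
# Hypothetical helper for illustration; not part of the Flux Geo package.
def in_box(lat, lon, box):
    """Return True when the point lies inside the box (edges included)."""
    return (box["minLat"] <= lat <= box["maxLat"]
            and box["minLon"] <= lon <= box["maxLon"])

# Same box as the example above
box = {
    "minLat": 40.51757813,
    "maxLat": 40.86914063,
    "minLon": -73.65234375,
    "maxLon": -72.94921875,
}

inside = in_box(40.69335938, -73.30078125, box)  # point within the box
outside = in_box(30.04, 31.23, box)              # Cairo, far outside the box
```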
### circle
Define a circular region by specifying an object containing the following properties:
- **lat**: latitude of the circle center in decimal degrees (WGS 84) _(Float)_
- **lon**: longitude of the circle center in decimal degrees (WGS 84) _(Float)_
- **radius**: radius of the circle in kilometers (km) _(Float)_
##### Example circular region
```js
{
lat: 40.69335938,
lon: -73.30078125,
radius: 20.0
}
```
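A circular region keeps points whose distance from the center is no greater than the radius.
The Python sketch below approximates that test with the haversine formula.
The Earth radius constant and the helper names are assumptions made for this example; the Geo package may compute distance differently.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption for this sketch

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def in_circle(lat, lon, circle):
    """Return True when the point is within circle['radius'] km of the center."""
    return haversine_km(lat, lon, circle["lat"], circle["lon"]) <= circle["radius"]

# Same circle as the example above
circle = {"lat": 40.69335938, "lon": -73.30078125, "radius": 20.0}

near = in_circle(40.70, -73.20, circle)       # a few kilometers from the center
far = in_circle(40.69335938, -72.0, circle)   # more than 100 km east of the center
```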
### polygon
Define a polygonal region with an object containing the latitude and longitude for
each point in the polygon:
- **points**: points that define the custom polygon _(Array of objects)_
Define each point with an object containing the following properties:
- **lat**: latitude in decimal degrees (WGS 84) _(Float)_
- **lon**: longitude in decimal degrees (WGS 84) _(Float)_
##### Example polygonal region
```js
{
points: [
{lat: 40.671659, lon: -73.936631},
{lat: 40.706543, lon: -73.749177},
{lat: 40.791333, lon: -73.880327}
]
}
```
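To illustrate how a list of points defines a region, the following Python sketch applies a ray-casting test to the triangle above.
It treats latitude and longitude as flat coordinates, which is a simplification and not necessarily how the Geo package evaluates polygons; `in_polygon` is a hypothetical helper.

```python
# Hypothetical helper for illustration; planar approximation only.
def in_polygon(lat, lon, points):
    """Ray-casting test: count polygon edges crossed by a ray cast from the point."""
    inside = False
    n = len(points)
    for i in range(n):
        a, b = points[i], points[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        if (a["lat"] > lat) != (b["lat"] > lat):
            # Longitude at which the edge crosses the point's latitude
            cross_lon = a["lon"] + (lat - a["lat"]) * (b["lon"] - a["lon"]) / (b["lat"] - a["lat"])
            if lon < cross_lon:
                inside = not inside
    return inside

# Same polygon as the example above
points = [
    {"lat": 40.671659, "lon": -73.936631},
    {"lat": 40.706543, "lon": -73.749177},
    {"lat": 40.791333, "lon": -73.880327},
]

# The centroid of a triangle always lies inside it
centroid_lat = sum(p["lat"] for p in points) / 3
centroid_lon = sum(p["lon"] for p in points) / 3

inside = in_polygon(centroid_lat, centroid_lon, points)
outside = in_polygon(30.04, 31.23, points)  # Cairo
```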
## Strict and non-strict filtering
In most cases, the specified geographic region does not perfectly align with S2 grid cells.
- **Non-strict filtering** returns points that may be outside of the specified region but
inside S2 grid cells partially covered by the region.
- **Strict filtering** returns only points inside the specified region.
_Strict filtering is less performant, but more accurate than non-strict filtering._
<span class="key-geo-cell"></span> S2 grid cell
<span class="key-geo-region"></span> Filter region
<span class="key-geo-point"></span> Returned point
{{< flex >}}
{{% flex-content %}}
**Strict filtering**
{{< svg "/static/svgs/geo-strict.svg" >}}
{{% /flex-content %}}
{{% flex-content %}}
**Non-strict filtering**
{{< svg "/static/svgs/geo-non-strict.svg" >}}
{{% /flex-content %}}
{{< /flex >}}
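The difference between the two modes can be sketched in Python.
This example substitutes one-degree grid squares for S2 grid cells and uses a box-shaped region, so it only mirrors the idea; all function names and sample points are made up for the illustration.

```python
import math

def cell_of(lat, lon, size=1.0):
    """Stand-in for an S2 cell: the size-degree grid square that holds the point."""
    return (math.floor(lat / size), math.floor(lon / size))

def covering_cells(box, size=1.0):
    """Every grid square that the box overlaps, even partially."""
    lats = range(math.floor(box["minLat"] / size), math.floor(box["maxLat"] / size) + 1)
    lons = range(math.floor(box["minLon"] / size), math.floor(box["maxLon"] / size) + 1)
    return {(i, j) for i in lats for j in lons}

def in_box(point, box):
    return (box["minLat"] <= point["lat"] <= box["maxLat"]
            and box["minLon"] <= point["lon"] <= box["maxLon"])

def filter_rows(points, box, strict):
    cells = covering_cells(box)
    # Non-strict pass: keep every point in a cell that the region touches
    candidates = [p for p in points if cell_of(p["lat"], p["lon"]) in cells]
    # Strict pass: also test each candidate against the region itself
    return [p for p in candidates if in_box(p, box)] if strict else candidates

box = {"minLat": 40.2, "maxLat": 40.8, "minLon": -73.9, "maxLon": -73.1}
points = [
    {"id": "a", "lat": 40.5, "lon": -73.5},   # inside the region
    {"id": "b", "lat": 40.95, "lon": -73.5},  # outside the region, same cell as "a"
    {"id": "c", "lat": 42.5, "lon": -73.5},   # in a cell the region does not touch
]

non_strict_ids = [p["id"] for p in filter_rows(points, box, strict=False)]
strict_ids = [p["id"] for p in filter_rows(points, box, strict=True)]
```

Point `b` survives the non-strict pass because it shares a cell with the region, and the strict pass removes it at the cost of one extra check per candidate.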


@ -0,0 +1,70 @@
---
title: Group geo-temporal data
description: >
Use `geo.groupByArea()` to group geo-temporal data by area and `geo.asTracks()`
to group data into tracks or routes.
menu:
v2_0:
parent: Geo-temporal data
weight: 302
list_code_example: |
```js
import "experimental/geo"
sampleGeoData
|> geo.groupByArea(newColumn: "geoArea", level: 5)
|> geo.asTracks(groupBy: ["id"],sortBy: ["_time"])
```
---
Use `geo.groupByArea()` to group geo-temporal data by area and `geo.asTracks()`
to group data into tracks or routes.
- [Group data by area](#group-data-by-area)
- [Group data by track or route](#group-data-by-track-or-route)
### Group data by area
Use the [`geo.groupByArea()` function](/v2.0/reference/flux/stdlib/experimental/geo/groupbyarea/)
to group geo-temporal data points by geographic area.
Areas are determined by [S2 grid cells](https://s2geometry.io/devguide/s2cell_hierarchy.html#s2cellid-numbering).
- Specify a new column to store the unique area identifier for each point with the `newColumn` parameter.
- Specify the [S2 cell level](https://s2geometry.io/resources/s2cell_statistics)
to use when calculating geographic areas with the `level` parameter.
The following example uses the [sample bird migration data](/v2.0/query-data/flux/geo/#sample-data)
to query data points within 200km of Cairo, Egypt and group them by geographic area:
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|> geo.groupByArea(
newColumn: "geoArea",
level: 5
)
```
### Group data by track or route
Use the [`geo.asTracks()` function](/v2.0/reference/flux/stdlib/experimental/geo/astracks/)
to group data points into tracks or routes and order them by time or other columns.
Data must contain a unique identifier for each track. For example: `id` or `tid`.
- Specify columns that uniquely identify each track or route with the `groupBy` parameter.
- Specify which columns to sort by with the `sortBy` parameter. Default is `["_time"]`.
The following example uses the [sample bird migration data](/v2.0/query-data/flux/geo/#sample-data)
to query data points within 200km of Cairo, Egypt and group them into routes unique
to each bird:
```js
import "experimental/geo"
sampleGeoData
|> geo.filterRows(region: {lat: 30.04, lon: 31.23, radius: 200.0})
|> geo.asTracks(
groupBy: ["id"],
sortBy: ["_time"]
)
```
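Conceptually, grouping into tracks means grouping rows by the track identifier and then ordering each group.
The Python sketch below mirrors that idea on a few made-up rows; `as_tracks` and the `bird-1` and `bird-2` identifiers are hypothetical and only illustrate the behavior described above.

```python
from collections import defaultdict

def as_tracks(rows, group_by=("id",), sort_by=("_time",)):
    """Group rows by the group_by columns, then sort each group by the sort_by columns."""
    tracks = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_by)
        tracks[key].append(row)
    for track in tracks.values():
        track.sort(key=lambda r: tuple(r[col] for col in sort_by))
    return dict(tracks)

# Made-up rows; timestamps in the same RFC3339 format sort correctly as strings
rows = [
    {"id": "bird-1", "_time": "2019-01-01T02:00:00Z", "lat": 30.10, "lon": 31.30},
    {"id": "bird-2", "_time": "2019-01-01T01:00:00Z", "lat": 29.90, "lon": 31.10},
    {"id": "bird-1", "_time": "2019-01-01T01:00:00Z", "lat": 30.04, "lon": 31.23},
]

tracks = as_tracks(rows)
bird_1_times = [r["_time"] for r in tracks[("bird-1",)]]
```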


@ -0,0 +1,121 @@
---
title: Shape data to work with the Geo package
description: >
Functions in the Flux Geo package require **lat** and **lon** fields and an **s2_cell_id** tag.
Rename latitude and longitude fields and generate S2 cell ID tokens.
menu:
v2_0:
name: Shape geo-temporal data
parent: Geo-temporal data
weight: 301
list_code_example: |
```js
import "experimental/geo"
sampleGeoData
|> map(fn: (r) => ({ r with
_field:
if r._field == "latitude" then "lat"
else if r._field == "longitude" then "lon"
else r._field
}))
|> map(fn: (r) => ({ r with
s2_cell_id: geo.s2CellIDToken(point: {lon: r.lon, lat: r.lat}, level: 10)
}))
```
---
Functions in the Geo package require the following data schema:
- an **s2_cell_id** tag containing the [S2 Cell ID](https://s2geometry.io/devguide/s2cell_hierarchy.html#s2cellid-numbering)
**as a token**
- a **`lat` field** containing the **latitude in decimal degrees** (WGS 84)
- a **`lon` field** containing the **longitude in decimal degrees** (WGS 84)
<!-- -->
- [Rename latitude and longitude fields](#rename-latitude-and-longitude-fields)
- [Generate S2 cell ID tokens](#generate-s2-cell-id-tokens)
## Rename latitude and longitude fields
Use [`map()`](/v2.0/reference/flux/stdlib/built-in/transformations/map/) to rename
existing latitude and longitude fields to `lat` and `lon`.
```js
from(bucket: "example-bucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement")
|> map(fn: (r) => ({ r with
_field:
if r._field == "existingLatitudeField" then "lat"
else if r._field == "existingLongitudeField" then "lon"
else r._field
}))
```
## Generate S2 cell ID tokens
The Geo package uses the [S2 Geometry Library](https://s2geometry.io/) to represent
geographic coordinates on a three-dimensional sphere.
The sphere is divided into [cells](https://s2geometry.io/devguide/s2cell_hierarchy),
each with a unique 64-bit identifier (S2 cell ID).
Grid and S2 cell ID accuracy are defined by a [level](https://s2geometry.io/resources/s2cell_statistics).
{{% note %}}
To filter more quickly, use higher S2 Cell ID levels,
but know that higher levels increase [series cardinality](/v2.0/reference/glossary/#series-cardinality).
{{% /note %}}
The Geo package requires S2 cell IDs as tokens.
To generate and add S2 cell ID tokens to your data, use one of the following options:
- [Generate S2 cell ID tokens with Telegraf](#generate-s2-cell-id-tokens-with-telegraf)
- [Generate S2 cell ID tokens with language-specific libraries](#generate-s2-cell-id-tokens-language-specific-libraries)
- [Generate S2 cell ID tokens with Flux](#generate-s2-cell-id-tokens-with-flux)
### Generate S2 cell ID tokens with Telegraf
Enable the [Telegraf S2 Geo (`s2geo`) processor](https://github.com/influxdata/telegraf/tree/master/plugins/processors/s2geo)
to generate S2 cell ID tokens at a specified `cell_level` using `lat` and `lon` field values.
Add the `processors.s2geo` configuration to your Telegraf configuration file (`telegraf.conf`):
```toml
[[processors.s2geo]]
## The name of the lat and lon fields containing WGS-84 latitude and
## longitude in decimal degrees.
lat_field = "lat"
lon_field = "lon"
## New tag to create
tag_key = "s2_cell_id"
## Cell level (see https://s2geometry.io/resources/s2cell_statistics.html)
cell_level = 9
```
Telegraf stores the S2 cell ID token in the `s2_cell_id` tag.
### Generate S2 cell ID tokens language-specific libraries
Many programming languages offer S2 libraries with methods for generating S2 cell ID tokens.
Use latitude and longitude with the `s2.CellID.ToToken` method of the S2 Geometry
Library (or its equivalent in your language) to generate `s2_cell_id` tags. For example:
- **Go:** [s2.CellID.ToToken()](https://godoc.org/github.com/golang/geo/s2#CellID.ToToken)
- **Python:** [s2sphere.CellId.to_token()](https://s2sphere.readthedocs.io/en/latest/api.html#s2sphere.CellId)
- **JavaScript:** [s2.cellid.toToken()](https://github.com/mapbox/node-s2/blob/master/API.md#cellidtotoken---string)
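For reference, an S2 token is the hexadecimal form of the 64-bit cell ID with trailing zeros removed.
The Python sketch below shows only that encoding step, assuming you already have a cell ID; the function names and the sample ID are illustrative, and real code should use one of the S2 libraries above to compute cell IDs from coordinates.

```python
# Illustrative helpers; use an S2 library to derive cell IDs from lat/lon.
def cell_id_to_token(cell_id):
    """Hex-encode a 64-bit S2 cell ID and strip trailing zeros."""
    token = format(cell_id & 0xFFFFFFFFFFFFFFFF, "016x").rstrip("0")
    return token or "X"  # an all-zero ID is written as "X"

def token_to_cell_id(token):
    """Reverse the encoding by padding the token back to 16 hex digits."""
    if token == "X":
        return 0
    return int(token.ljust(16, "0"), 16)

sample_id = 0x89C2590000000000  # sample value chosen for illustration
token = cell_id_to_token(sample_id)
round_trip = token_to_cell_id(token)
```

Shorter tokens correspond to larger cells, which is why tokens work well as tag values for coarse-grained filtering.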
### Generate S2 cell ID tokens with Flux
Use the [`geo.s2CellIDToken()` function](/v2.0/reference/flux/stdlib/experimental/geo/s2cellidtoken/)
with existing longitude (`lon`) and latitude (`lat`) field values to generate and add the S2 cell ID token.
Use the [`geo.toRows()` function](/v2.0/reference/flux/stdlib/experimental/geo/torows/)
to pivot **lat** and **lon** fields into row-wise sets, and then use `map()` to add the `s2_cell_id` column:
```js
import "experimental/geo"
from(bucket: "example-bucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "example-measurement")
|> geo.toRows()
|> map(fn: (r) => ({ r with
s2_cell_id: geo.s2CellIDToken(point: {lon: r.lon, lat: r.lat}, level: 10)
}))
```
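The pivot step matters because `lat` and `lon` arrive as separate field rows, and the token needs both values in the same row.
The following Python sketch shows the reshaping on made-up records. It assumes rows are matched on `_time` and an `id` tag, which simplifies what `geo.toRows()` does; `to_rows` is a hypothetical helper.

```python
# Hypothetical helper that mimics the row-wise pivot on a list of records.
def to_rows(records):
    """Fold _field/_value pairs that share a timestamp and id into a single row."""
    rows = {}
    for rec in records:
        key = (rec["_time"], rec["id"])
        row = rows.setdefault(key, {"_time": rec["_time"], "id": rec["id"]})
        row[rec["_field"]] = rec["_value"]
    return list(rows.values())

# Made-up records: one row per field, as stored before the pivot
records = [
    {"_time": "2019-01-01T00:00:00Z", "id": "bird-1", "_field": "lat", "_value": 30.04},
    {"_time": "2019-01-01T00:00:00Z", "id": "bird-1", "_field": "lon", "_value": 31.23},
    {"_time": "2019-01-01T01:00:00Z", "id": "bird-1", "_field": "lat", "_value": 30.10},
    {"_time": "2019-01-01T01:00:00Z", "id": "bird-1", "_field": "lon", "_value": 31.30},
]

rows = to_rows(records)  # each row now carries both lat and lon
```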


@ -48,9 +48,8 @@ Geometry Library to generate `s2_cell_id` tags.
Specify your [S2 Cell ID level](https://s2geometry.io/resources/s2cell_statistics.html).
{{% note %}}
For faster filtering, use higher S2 Cell ID levels.
But know that higher levels increase
[series cardinality](/v2.0/reference/glossary/#series-cardinality).
To filter more quickly, use higher S2 Cell ID levels,
but know that higher levels increase [series cardinality](/v2.0/reference/glossary/#series-cardinality).
{{% /note %}}
Language-specific implementations of the S2 Geometry Library provide methods for


@ -18,7 +18,7 @@
{{ $title := cond ( isset .Params "list_title" ) .Params.list_title .Title }}
{{ $url := cond ( isset .Params "external_url" ) .Params.external_url .RelPermalink }}
{{ $target := cond ( isset .Params "external_url" ) "_blank" "" }}
<h3><a href="{{ $url }}" target="{{ $target }}">{{ $title }}</a></h3>
<h3 id="{{ anchorize $title }}"><a href="{{ $url }}" target="{{ $target }}">{{ $title }}</a></h3>
<p>
{{- if .Description }}{{- .Description | markdownify -}}
{{ else }}{{- .Summary | markdownify -}}
@ -35,6 +35,9 @@
<img src='{{ $img }}'/>
{{ end }}
{{ end }}
{{ if .Params.list_code_example }}
{{ .Params.list_code_example | markdownify }}
{{ end }}
{{ end }}
{{ else if (eq $type "functions") }}