finalized procedural docs for writing with csv

2020-05-26 14:52:08 -06:00 · 4e82a5f551
parent 1be8f826ca
commit 4e82a5f551
2 changed files with 508 additions and 22 deletions
--- a/content/v2.0/reference/syntax/annotated-csv/extended.md
+++ b/content/v2.0/reference/syntax/annotated-csv/extended.md
@@ -75,8 +75,11 @@ The **column value** is the **tag value**.
 #### dateTime
 Indicates the column is the **timestamp**.
 `time` is an alias for `dateTime`.
+If the [timestamp format](#supported-timestamp-formats) includes a time zone,
+the parsed timestamp respects the time zone.
 By default, all timestamps are UTC.
-Use the [`#timezone` annotation](#timezone) to adjust timestamps to a specific timezone.
+You can also use the [`#timezone` annotation](#timezone) to adjust timestamps to
+a specific time zone.

 {{% note %}}
 There can only be **one** `dateTime` column.
@@ -99,7 +102,6 @@ Append the timestamp format to the `dateTime` datatype with (`:`).
 | **RFC3339**      | RFC3339 timestamp | `2020-01-01T00:00:00Z`           |
 | **RFC3339Nano**  | RFC3339 timestamp | `2020-01-01T00:00:00.000000000Z` |
 | **number**       | Unix timestamp    | `1577836800000000000`            |
-| **2006-01-02**   | YYYY-MM-DD date   | `2020-01-01`                     |

 {{% note %}}
 If using the `number` timestamp format and timestamps are **not nanosecond Unix timestamps**,
@@ -107,6 +109,11 @@ use the [`--precision` flag](/v2.0/reference/cli/influx/write/#flags) with the
 `influx write` command to specify the timestamp precision.
 {{% /note %}}

+##### Custom timestamp formats
+To specify a custom timestamp format, use timestamp formats as described in the
+[Go time package](https://golang.org/pkg/time).
+For example, use `2006-01-02` to parse dates such as `2020-01-01`.
+
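A custom format of this kind can be tried in isolation with Go's `time` package. The sketch below is an illustration only: `parseCustom` is a hypothetical helper, not part of the `influx` CLI, and it assumes the format string is handed to `time.Parse` unchanged.

```go
package main

import (
	"fmt"
	"time"
)

// parseCustom is a hypothetical helper (not an influx API).
// It parses value with a Go reference-time layout and returns
// the timestamp as nanoseconds since the Unix epoch.
func parseCustom(layout, value string) (int64, error) {
	t, err := time.Parse(layout, value)
	if err != nil {
		return 0, err
	}
	return t.UnixNano(), nil
}

func main() {
	// "2006-01-02" is the Go layout for a YYYY-MM-DD date.
	ns, err := parseCustom("2006-01-02", "2020-01-01")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(ns) // 1577836800000000000
}
```

Because the layout carries no time zone, `time.Parse` returns a UTC time, which is consistent with the UTC default described above.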
 #### field
 Indicates the column is a **field** and auto-detects the field type.
 The **column label** is the **field key**.
@@ -120,6 +127,12 @@ The column is a **field** of a specified type.
 The **column label** is the **field key**.
 The **column value** is the **field value**.

+- [string](#string)
+- [double](#double)
+- [long](#long)
+- [unsignedLong](#unsignedlong)
+- [boolean](#boolean)
+
 ##### string
 Column is a **[string](/v2.0/reference/glossary/#string) field**.

@@ -156,15 +169,81 @@ For example:

 {{% note %}}
 If your **float separators** include a comma (`,`), wrap the column annotation in double
-quotes (`""`) to prevent the comma from being parsed as column separator or delimitter.
+quotes (`""`) to prevent the comma from being parsed as column separator or delimiter.
 You can also [define a custom column separator](#define-custom-column-separator).
 {{% /note %}}

 ##### long
 Column is an **[integer](/v2.0/reference/glossary/#integer) field**.
+If column values contain separators such as periods (`.`) or commas (`,`), specify
+the following **integer separators**:
+
+- **fraction separator**: Separates the fraction from the whole number.
+  _**Integer values are truncated at the fraction separator when converted to line protocol.**_
+- **ignored separator**: Visually separates the whole number into groups but should
+  be ignored when parsing the integer value.
+
+Use the following syntax to specify **integer separators**:
+
+```sh
+# Syntax
+<fraction-separator><ignored-separator>
+
+# Example
+.,
+
+# With the integer separators above
+# 1,200,000.00 => 1200000i
+```
+
+Append **integer separators** to the `long` datatype annotation with a colon (`:`).
+For example:
+
+```
+#datatype "fieldName|long:.,"
+```
+
+{{% note %}}
+If your **integer separators** include a comma (`,`), wrap the column annotation in double
+quotes (`""`) to prevent the comma from being parsed as column separator or delimiter.
+You can also [define a custom column separator](#define-custom-column-separator).
+{{% /note %}}
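The separator rules above can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the `influx` CLI implementation: `parseLong` is a hypothetical name, and the sketch assumes the ignored separator is removed first and the value is then truncated at the fraction separator.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLong is a hypothetical helper (not an influx API).
// separators follows the "<fraction-separator><ignored-separator>"
// syntax, for example ".,".
func parseLong(value, separators string) (int64, error) {
	seps := []rune(separators)
	if len(seps) > 1 {
		// Drop the ignored (grouping) separator everywhere.
		value = strings.ReplaceAll(value, string(seps[1]), "")
	}
	if len(seps) > 0 {
		// Truncate at the fraction separator, if present.
		if i := strings.IndexRune(value, seps[0]); i >= 0 {
			value = value[:i]
		}
	}
	return strconv.ParseInt(value, 10, 64)
}

func main() {
	n, err := parseLong("1,200,000.00", ".,")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%di\n", n) // 1200000i
}
```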

 ##### unsignedLong
-Column is an **[unsigned integer](/v2.0/reference/glossary/#unsigned-integer) field**.
+Column is an **[unsigned integer (uinteger)](/v2.0/reference/glossary/#unsigned-integer) field**.
+If column values contain separators such as periods (`.`) or commas (`,`), specify
+the following **uinteger separators**:
+
+- **fraction separator**: Separates the fraction from the whole number.
+  _**Uinteger values are truncated at the fraction separator when converted to line protocol.**_
+- **ignored separator**: Visually separates the whole number into groups but should
+  be ignored when parsing the uinteger value.
+
+Use the following syntax to specify **uinteger separators**:
+
+```sh
+# Syntax
+<fraction-separator><ignored-separator>
+
+# Example
+.,
+
+# With the uinteger separators above
+# 1,200,000.00 => 1200000u
+```
+
+Append **uinteger separators** to the `unsignedLong` datatype annotation with a colon (`:`).
+For example:
+
+```
+#datatype "fieldName|unsignedLong:.,"
+```
+
+{{% note %}}
+If your **uinteger separators** include a comma (`,`), wrap the column annotation in double
+quotes (`""`) to prevent the comma from being parsed as column separator or delimiter.
+You can also [define a custom column separator](#define-custom-column-separator).
+{{% /note %}}

 ##### boolean
 Column is a **[boolean](/v2.0/reference/glossary/#boolean) field**.
@@ -176,10 +255,11 @@ specify the **boolean format** with the following syntax:
 <true-values>:<false-values>

 # Example
-y,Y:n,N
+y,Y,1:n,N,0

 # With the boolean format above
-# y => true, Y => true, n => false, N => false
+# y => true, Y => true, 1 => true
+# n => false, N => false, 0 => false
 ```
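The boolean format above reads as two comma-separated lists split by a colon. The Go sketch below illustrates that reading. `parseBool` is a hypothetical helper, not part of the `influx` CLI, and returning an error for a value found in neither list is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBool is a hypothetical helper (not an influx API).
// format follows the "<true-values>:<false-values>" syntax,
// for example "y,Y,1:n,N,0".
func parseBool(value, format string) (bool, error) {
	parts := strings.SplitN(format, ":", 2)
	if len(parts) != 2 {
		return false, fmt.Errorf("invalid boolean format %q", format)
	}
	for _, v := range strings.Split(parts[0], ",") {
		if v == value {
			return true, nil
		}
	}
	for _, v := range strings.Split(parts[1], ",") {
		if v == value {
			return false, nil
		}
	}
	// Assumption: values outside both lists are rejected.
	return false, fmt.Errorf("unsupported boolean value %q", value)
}

func main() {
	for _, v := range []string{"y", "N", "1", "0"} {
		b, err := parseBool(v, "y,Y,1:n,N,0")
		fmt.Println(v, b, err)
	}
}
```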

 Append the **boolean format** to the `boolean` datatype annotation with a colon (`:`).
--- a/content/v2.0/write-data/csv.md
+++ b/content/v2.0/write-data/csv.md
@@ -16,24 +16,430 @@ related:
 ---

 Use the [`influx write` command](/v2.0/reference/cli/influx/write/) to write CSV data
-to InfluxDB. Include annotations with the CSV data to specify how the data translates
-into [line protocol](/v2.0/reference/syntax/line-protocol/).
+to InfluxDB. Include [Extended annotated CSV](/v2.0/reference/syntax/annotated-csv/extended/)
+annotations to specify how the data translates into [line protocol](/v2.0/reference/syntax/line-protocol/).
+Include annotations in the CSV file or inject them using the `--header` flag of
+the `influx write` command.

-InfluxDB requires the following for each point written:
+##### On this page
+- [CSV Annotations](#csv-annotations)
+- [Inject annotation headers](#inject-annotation-headers)
+- [Skip annotation headers](#skip-annotation-headers)
+- [Process input as CSV](#process-input-as-csv)
+- [Specify CSV character encoding](#specify-csv-character-encoding)
+- [Skip rows with errors](#skip-rows-with-errors)
+- [Advanced examples](#advanced-examples)

- measurement
- field set
- timestamp
- (optional) tag set
+##### Example write command
+```sh
+influx write -b example-bucket -f path/to/example.csv
+```

- Extended annotated CSV
+##### example.csv
+```
+#datatype measurement,tag,double,dateTime:RFC3339
+m,host,used_percent,time
+mem,host1,64.23,2020-01-01T00:00:00Z
+mem,host2,72.01,2020-01-01T00:00:00Z
+mem,host1,62.61,2020-01-01T00:00:10Z
+mem,host2,72.98,2020-01-01T00:00:10Z
+mem,host1,63.40,2020-01-01T00:00:20Z
+mem,host2,73.77,2020-01-01T00:00:20Z
+```

- Write command
-  - inject annotation headers
-  - skip headers
+##### Resulting line protocol
+```
+mem,host=host1 used_percent=64.23 1577836800000000000
+mem,host=host2 used_percent=72.01 1577836800000000000
+mem,host=host1 used_percent=62.61 1577836810000000000
+mem,host=host2 used_percent=72.98 1577836810000000000
+mem,host=host1 used_percent=63.40 1577836820000000000
+mem,host=host2 used_percent=73.77 1577836820000000000
+```

- Example commands
-  - Write the raw results of a Flux query
-  - Simple annotated CSV with #datatype annotation
-  - Annotated CSV with #datatype and CSV annotations
-    - Include defaults with the #datatype annotation
+{{% note %}}
+To test the CSV to line protocol conversion process, include the `--dryrun` flag
+with the `influx write` command to print the resulting line protocol to stdout
+rather than write to InfluxDB.
+{{% /note %}}
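To make the mapping from example.csv to line protocol concrete, here is a minimal Go sketch of the conversion. It is an illustration only, not how `influx write` is implemented: `toLineProtocol` is a hypothetical helper that hard-codes the column order measurement, tag, field, time and assumes RFC3339 timestamps, and it skips annotation rows by treating `#` as a comment character.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
	"time"
)

// toLineProtocol is a hypothetical helper (not an influx API).
// It assumes four columns in the order measurement, tag, field, time.
func toLineProtocol(data string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.Comment = '#' // skip annotation rows such as "#datatype ..."
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) < 2 || len(rows[0]) != 4 {
		return nil, fmt.Errorf("expected a 4-column header and at least one data row")
	}
	header := rows[0] // column labels, for example m,host,used_percent,time
	var lines []string
	for _, row := range rows[1:] {
		ts, err := time.Parse(time.RFC3339, row[3])
		if err != nil {
			return nil, err
		}
		lines = append(lines, fmt.Sprintf("%s,%s=%s %s=%s %d",
			row[0], header[1], row[1], header[2], row[2], ts.UnixNano()))
	}
	return lines, nil
}

func main() {
	data := `#datatype measurement,tag,double,dateTime:RFC3339
m,host,used_percent,time
mem,host1,64.23,2020-01-01T00:00:00Z
mem,host2,72.01,2020-01-01T00:00:00Z
`
	lines, err := toLineProtocol(data)
	if err != nil {
		fmt.Println("conversion error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

For real data, the `--dryrun` flag described above is the way to check the conversion, because it applies the annotations themselves.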
+
+## CSV Annotations
+Use **CSV annotations** to specify which element of line protocol each CSV column
+represents and how to format the data. CSV annotations are rows at the beginning
+of a CSV file that describe column properties.
+
+The `influx write` command supports [Extended annotated CSV](/v2.0/reference/syntax/annotated-csv/extended),
+which provides options for specifying how CSV data is converted into line
+protocol and how the data is formatted.
+
+To write data to InfluxDB, data must include the following:
+
+- [measurement](/v2.0/reference/syntax/line-protocol/#measurement)
+- [field set](/v2.0/reference/syntax/line-protocol/#field-set)
+- [timestamp](/v2.0/reference/syntax/line-protocol/#timestamp) _(Optional but recommended)_
+- [tag set](/v2.0/reference/syntax/line-protocol/#tag-set) _(Optional)_
+
+Use CSV annotations to specify which of these elements each column represents.
+
+## Write raw query results back to InfluxDB
+Flux returns query results in [Annotated CSV](/v2.0/reference/syntax/annotated-csv/).
+These results include all annotations necessary to write the data back to InfluxDB.
+
+## Inject annotation headers
+If the CSV data you want to write to InfluxDB does not contain the annotations
+required to properly convert the data to line protocol, use the `--header` flag
+to inject annotation rows into the CSV data.
+
+```sh
+influx write -b example-bucket \
+  -f path/to/example.csv \
+  --header "#constant measurement,birds" \
+  --header "#datatype dateTime:2006-01-02,long,tag"
+```
+
+{{< flex >}}
+{{% flex-content %}}
+##### example.csv
+```
+date,sighted,loc
+2020-01-01,12,Boise
+2020-06-01,78,Boise
+2020-01-01,54,Seattle
+2020-06-01,112,Seattle
+2020-01-01,9,Detroit
+2020-06-01,135,Detroit
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+birds,loc=Boise sighted=12i 1577836800000000000
+birds,loc=Boise sighted=78i 1590969600000000000
+birds,loc=Seattle sighted=54i 1577836800000000000
+birds,loc=Seattle sighted=112i 1590969600000000000
+birds,loc=Detroit sighted=9i 1577836800000000000
+birds,loc=Detroit sighted=135i 1590969600000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}
+
+## Skip annotation headers
+Some CSV data may include annotations that conflict with annotations necessary to
+write CSV data to InfluxDB.
+Use the `--skipHeader` flag to specify the **number of rows to skip** at the
+beginning of the CSV data.
+
+```sh
+influx write -b example-bucket \
+  -f path/to/example.csv \
+  --skipHeader=2
+```
+
+## Process input as CSV
+The `influx write` command automatically processes files with the `.csv` extension as CSV files.
+If your CSV file uses a different extension, use the `--format` flag to explicitly
+declare the format of the input file.
+
+```sh
+influx write -b example-bucket \
+  -f path/to/example.txt \
+  --format csv
+```
+
+{{% note %}}
+The `influx write` command assumes all input files are line protocol unless they
+include the `.csv` extension or you declare the `csv` format with the `--format` flag.
+{{% /note %}}
+
+## Specify CSV character encoding
+The `influx write` command assumes CSV files contain UTF-8 encoded characters.
+If your CSV data uses a different character encoding, specify the encoding
+with the `--encoding` flag.
+
+```sh
+influx write -b example-bucket \
+  -f path/to/example.csv \
+  --encoding "UTF-16"
+```
+
+## Skip rows with errors
+If a row in your CSV data is missing an
+[element required to write to InfluxDB](/v2.0/reference/syntax/line-protocol/#elements-of-line-protocol)
+or contains incorrectly formatted data, the `influx write` command returns an error
+when it processes the row and cancels the write request.
+To skip rows with errors, use the `--skipRowOnError` flag.
+
+```sh
+influx write -b example-bucket \
+  -f path/to/example.csv \
+  --skipRowOnError
+```
+
+{{% warn %}}
+Skipped rows are ignored and are not written to InfluxDB.
+{{% /warn %}}
+
+## Advanced examples
+
+- [Define constants](#define-constants)
+- [Annotation shorthand](#annotation-shorthand)
+- [Use alternate numeric formats](#use-alternate-numeric-formats)
+- [Use alternate boolean format](#use-alternate-boolean-format)
+- [Use different timestamp formats](#use-different-timestamp-formats)
+
+---
+
+### Define constants
+Use the Extended annotated CSV [`#constant` annotation](/v2.0/reference/syntax/annotated-csv/extended/#constant)
+to add a column and value to each row in the CSV data.
+
+{{< flex >}}
+{{% flex-content %}}
+##### CSV with constants
+```
+#constant measurement,example
+#constant tag,source,csv
+#datatype long,dateTime:RFC3339
+count,time
+1,2020-01-01T00:00:00Z
+4,2020-01-02T00:00:00Z
+9,2020-01-03T00:00:00Z
+18,2020-01-04T00:00:00Z
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+example,source=csv count=1i 1577836800000000000
+example,source=csv count=4i 1577923200000000000
+example,source=csv count=9i 1578009600000000000
+example,source=csv count=18i 1578096000000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}
+
+---
+
+### Annotation shorthand
+Extended annotated CSV supports [annotation shorthand](/v2.0/reference/syntax/annotated-csv/extended/#annotation-shorthand),
+which lets you define the **column label**, **datatype**, and **default value** in the column header.
+
+{{< flex >}}
+{{% flex-content %}}
+##### CSV with annotation shorthand
+```
+m|measurement,count|long|0,time|dateTime:RFC3339
+example,1,2020-01-01T00:00:00Z
+example,4,2020-01-02T00:00:00Z
+example,,2020-01-03T00:00:00Z
+example,18,2020-01-04T00:00:00Z
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+example count=1i 1577836800000000000
+example count=4i 1577923200000000000
+example count=0i 1578009600000000000
+example count=18i 1578096000000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}
+
+#### Replace column header with annotation shorthand
+It's possible to replace the column header row in a CSV file with annotation
+shorthand without modifying the CSV file.
+This lets you define column data types and default values while writing to InfluxDB.
+
+To replace an existing column header row with annotation shorthand:
+
+1. Use the `--skipHeader` flag to ignore the existing column header row.
+2. Use the `--header` flag to inject a new column header row that uses annotation shorthand.
+
+<!-- -->
+```sh
+influx write -b example-bucket \
+  -f example.csv \
+  --skipHeader=1 \
+  --header="m|measurement,count|long|0,time|dateTime:RFC3339"
+```
+
+{{< flex >}}
+{{% flex-content %}}
+##### Unmodified example.csv
+```
+m,count,time
+example,1,2020-01-01T00:00:00Z
+example,4,2020-01-02T00:00:00Z
+example,,2020-01-03T00:00:00Z
+example,18,2020-01-04T00:00:00Z
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+example count=1i 1577836800000000000
+example count=4i 1577923200000000000
+example count=0i 1578009600000000000
+example count=18i 1578096000000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}
+
+---
+
+### Use alternate numeric formats
+If your CSV data contains numeric values that use a fraction separator other than the default (`.`)
+or contain group separators, [define your numeric format](/v2.0/reference/syntax/annotated-csv/extended/#double)
+in the `double`, `long`, and `unsignedLong` datatype annotations.
+
+{{% note %}}
+If your **numeric format separators** include a comma (`,`), wrap the column annotation in double
+quotes (`""`) to prevent the comma from being parsed as column separator or delimiter.
+You can also [define a custom column separator](/v2.0/reference/syntax/annotated-csv/extended/#define-custom-column-separator).
+{{% /note %}}
+
+{{< tabs-wrapper >}}
+{{% tabs %}}
+[Floats](#)
+[Integers](#)
+[Uintegers](#)
+{{% /tabs %}}
+{{% tab-content %}}
+{{< flex >}}
+{{% flex-content %}}
+##### CSV with non-default float values
+```
+#datatype measurement,"double:.,",dateTime:RFC3339
+m,lbs,time
+example,"1,280.7",2020-01-01T00:00:00Z
+example,"1,352.5",2020-01-02T00:00:00Z
+example,"1,862.8",2020-01-03T00:00:00Z
+example,"2,014.9",2020-01-04T00:00:00Z
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+example lbs=1280.7 1577836800000000000
+example lbs=1352.5 1577923200000000000
+example lbs=1862.8 1578009600000000000
+example lbs=2014.9 1578096000000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}
+{{% /tab-content %}}
+
+{{% tab-content %}}
+{{< flex >}}
+{{% flex-content %}}
+##### CSV with non-default integer values
+```
+#datatype measurement,"long:.,",dateTime:RFC3339
+m,lbs,time
+example,"1,280.0",2020-01-01T00:00:00Z
+example,"1,352.0",2020-01-02T00:00:00Z
+example,"1,862.0",2020-01-03T00:00:00Z
+example,"2,014.9",2020-01-04T00:00:00Z
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+example lbs=1280i 1577836800000000000
+example lbs=1352i 1577923200000000000
+example lbs=1862i 1578009600000000000
+example lbs=2014i 1578096000000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}
+{{% /tab-content %}}
+
+{{% tab-content %}}
+{{< flex >}}
+{{% flex-content %}}
+##### CSV with non-default uinteger values
+```
+#datatype measurement,"unsignedLong:.,",dateTime:RFC3339
+m,lbs,time
+example,"1,280.0",2020-01-01T00:00:00Z
+example,"1,352.0",2020-01-02T00:00:00Z
+example,"1,862.0",2020-01-03T00:00:00Z
+example,"2,014.9",2020-01-04T00:00:00Z
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+example lbs=1280u 1577836800000000000
+example lbs=1352u 1577923200000000000
+example lbs=1862u 1578009600000000000
+example lbs=2014u 1578096000000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}
+{{% /tab-content %}}
+{{< /tabs-wrapper >}}
+
+---
+
+### Use alternate boolean format
+Line protocol supports only [specific boolean values](/v2.0/reference/syntax/line-protocol/#boolean).
+If your CSV data contains boolean values that line protocol does not support,
+[define your boolean format](/v2.0/reference/syntax/annotated-csv/extended/#boolean)
+in the `boolean` datatype annotation.
+
+{{< flex >}}
+{{% flex-content %}}
+##### CSV with non-default boolean values
+```
+sep=;
+#datatype measurement;"boolean:y,Y,1:n,N,0";dateTime:RFC3339
+m;verified;time
+example;y;2020-01-01T00:00:00Z
+example;n;2020-01-02T00:00:00Z
+example;1;2020-01-03T00:00:00Z
+example;N;2020-01-04T00:00:00Z
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+example verified=true 1577836800000000000
+example verified=false 1577923200000000000
+example verified=true 1578009600000000000
+example verified=false 1578096000000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}
+
+---
+
+### Use different timestamp formats
+The `influx write` command automatically detects **RFC3339** and **number** formatted
+timestamps when converting CSV to line protocol.
+If using a different timestamp format, [define your timestamp format](/v2.0/reference/syntax/annotated-csv/extended/#datetime)
+in the `dateTime` datatype annotation.
+
+{{< flex >}}
+{{% flex-content %}}
+##### CSV with non-default timestamps
+```
+#datatype measurement,dateTime:2006-01-02,field
+m,time,lbs
+example,2020-01-01,1280.7
+example,2020-01-02,1352.5
+example,2020-01-03,1862.8
+example,2020-01-04,2014.9
+```
+{{% /flex-content %}}
+{{% flex-content %}}
+##### Resulting line protocol
+```
+example lbs=1280.7 1577836800000000000
+example lbs=1352.5 1577923200000000000
+example lbs=1862.8 1578009600000000000
+example lbs=2014.9 1578096000000000000
+```
+{{% /flex-content %}}
+{{< /flex >}}