12 KiB
| title | description | menu | weight | influxdb/v2.0/tags | related | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Extended annotated CSV | Extended annotated CSV provides additional annotations and options that specify how CSV data should be converted to [line protocol](/v2.0/reference/syntax/line-protocol/) and written to InfluxDB. |
|
201 |
|
|
Extended annotated CSV provides additional annotations and options that specify
how CSV data should be converted to line protocol
and written to InfluxDB.
InfluxDB uses the csv2lp library
to convert CSV into line protocol.
Extended annotated CSV supports all Annotated CSV
annotations.
{{% warn %}}
The Flux csv.from function only supports
annotated CSV, not extended annotated CSV.
{{% /warn %}}
To write data to InfluxDB, line protocol must include the following:
- measurement
- field set
- timestamp (Optional but recommended)
- tag set (Optional)
Extended CSV annotations identify the element of line protocol a column represents.
CSV Annotations
Extended annotated CSV extends and adds the following annotations:
datatype
Use the #datatype annotation to specify the line protocol element
a column represents.
To explicitly define a column as a field of a specific data type, use the field
type in the annotation (for example: string, double, long, etc.).
| Data type | Resulting line protocol |
|---|---|
| measurement | Column is the measurement |
| tag | Column is a tag |
| dateTime | Column is the timestamp |
| field | Column is a field |
| ignored | Column is ignored |
| string | Column is a string field |
| double | Column is a float field |
| long | Column is an integer field |
| unsignedLong | Column is an unsigned integer field |
| boolean | Column is a boolean field |
measurement
Indicates the column is the measurement.
tag
Indicates the column is a tag. The column label is the tag key. The column value is the tag value.
dateTime
Indicates the column is the timestamp.
time is an alias for dateTime.
If the timestamp format includes a time zone,
the parsed timestamp includes the time zone offset.
By default, all timestamps are UTC.
You can also use the #timezone annotation to adjust timestamps to
a specific time zone.
{{% note %}}
There can only be one dateTime column.
{{% /note %}}
The influx write command converts timestamps to Unix timestamps.
Append the timestamp format to the dateTime datatype with (:).
#datatype dateTime:RFC3339
#datatype dateTime:RFC3339Nano
#datatype dateTime:number
#datatype dateTime:2006-01-02
Supported timestamp formats
| Timestamp format | Description | Example |
|---|---|---|
| RFC3339 | RFC3339 timestamp | 2020-01-01T00:00:00Z |
| RFC3339Nano | RFC3339 timestamp | 2020-01-01T00:00:00.000000000Z |
| number | Unix timestamp | 1577836800000000000 |
{{% note %}}
If using the number timestamp format and timestamps are not in nanoseconds,
use the influx write --precision flag
to specify the timestamp precision.
{{% /note %}}
Custom timestamp formats
To specify a custom timestamp format, use timestamp formats as described in the
Go time package.
For example: 2020-01-02.
field
Indicates the column is a field. The column label is the field key. The column value is the field value.
{{% note %}}
With the field datatype, field values are copies as-is to line protocol.
For information about line protocol values and how they are written to InfluxDB,
see Line protocol data types and formats.
We generally recommend specifying the field type in annotations.
{{% /note %}}
ignored
The column is ignored and not written to InfluxDB.
Field types
The column is a field of a specified type. The column label is the field key. The column value is the field value.
string
Column is a string field.
double
Column is a float field.
By default, InfluxDB expects float values that use a period (.) to separate the
fraction from the whole number.
If column values include or use other separators, such as commas (,) to visually
separate large numbers into groups, specify the following float separators:
- fraction separator: Separates the fraction from the whole number.
- ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the float value.
Use the following syntax to specify float separators:
# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the float separators above
# 1,200,000.15 => 1200000.15
Append float separators to the double datatype annotation with a colon (:).
For example:
#datatype "double:.,"
{{% note %}}
If your float separators include a comma (,), wrap the column annotation in double
quotes ("") to prevent the comma from being parsed as a column separator or delimiter.
You can also define a custom column separator.
{{% /note %}}
long
Column is an integer field.
If column values contain separators such as periods (.) or commas (,), specify
the following integer separators:
- fraction separator: Separates the fraction from the whole number. Integer values are truncated at the fraction separator when converted to line protocol.
- ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the integer value.
Use the following syntax to specify integer separators:
# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the integer separators above
# 1,200,000.00 => 1200000i
Append integer separators to the long datatype annotation with a colon (:).
For example:
#datatype "long:.,"
{{% note %}}
If your integer separators include a comma (,), wrap the column annotation in double
quotes ("") to prevent the comma from being parsed as a column separator or delimiter.
You can also define a custom column separator.
{{% /note %}}
unsignedLong
Column is an unsigned integer (uinteger) field.
If column values contain separators such as periods (.) or commas (,), specify
the following uinteger separators:
- fraction separator: Separates the fraction from the whole number. Uinteger values are truncated at the fraction separator when converted to line protocol.
- ignored separator: Visually separates the whole number into groups but ignores the separator when parsing the uinteger value.
Use the following syntax to specify uinteger separators:
# Syntax
<fraction-separator><ignored-separator>
# Example
.,
# With the uinteger separators above
# 1,200,000.00 => 1200000u
Append uinteger separators to the long datatype annotation with a colon (:).
For example:
#datatype "usignedLong:.,"
{{% note %}}
If your uinteger separators include a comma (,), wrap the column annotation in double
quotes ("") to prevent the comma from being parsed as a column separator or delimiter.
You can also define a custom column separator.
{{% /note %}}
boolean
Column is a boolean field. If column values are not supported boolean values, specify the boolean format with the following syntax:
# Syntax
<true-values>:<false-values>
# Example
y,Y,1:n,N,0
# With the boolean format above
# y => true, Y => true, 1 => true
# n => false, N => false, 0 => false
Append the boolean format to the boolean datatype annotation with a colon (:).
For example:
#datatype "boolean:y,Y:n,N"
{{% note %}}
If your boolean format contains commas (,), wrap the column annotation in double
quotes ("") to prevent the comma from being parsed as a column separator or delimiter.
You can also define a custom column separator.
{{% /note %}}
constant
Use the #constant annotation to define a constant column label and value for each row.
The #constant annotation provides a way to supply
line protocol elements
that don't exist in the CSV data.
Use the following syntax to define constants:
#constant <datatype>,<column-label>,<column-value>
To provide multiple constants, include each #constant annotations on a separate line.
#constant measurement,m
#constant tag,dataSource,csv
{{% note %}}
For constants with measurement and dateTime datatypes, the second value in
the constant definition is the column-value.
{{% /note %}}
timezone
Use the #timezone annotation to update timestamps to a specific timezone.
By default, timestamps are parsed as UTC.
Use the ±HHmm format to specify the timezone offset relative to UTC.
Timezone examples
| Timezone | Offset |
|---|---|
| US Mountain Daylight Time | -0600 |
| Central European Summer Time | +0200 |
| Australia Eastern Standard Time | +1000 |
| Apia Daylight Time | +1400 |
Timezone annotation example
#timezone -0600
Define custom column separator
If columns are delimited using a character other than a comma, use the sep
keyword to define a custom separator in the first line of your CSV file.
sep=;
Annotation shorthand
Extended annotated CSV supports annotation shorthand. Include the column label, datatype, and (optional) default value in each column header row using the following syntax:
<column-label>|<column-datatype>|<column-default-value>
Example annotation shorthand
m|measurement,location|tag|Hong Kong,temp|double,pm|long|0,time|dateTime:RFC3339
weather,San Francisco,51.9,38,2020-01-01T00:00:00Z
weather,New York,18.2,,2020-01-01T00:00:00Z
weather,,53.6,171,2020-01-01T00:00:00Z
The shorthand explained
- The
mcolumn represents the measurement and has no default value. - The
locationcolumn is a tag with the default value,Hong Kong. - The
tempcolumn is a field with float (double) values and no default value. - The
pmcolumn is a field with integer (long) values and a default of0. - The
timecolumn represents the timestamp, uses the RFC3339 timestamp format, and has no default value.
Resulting line protocol
weather,location=San\ Francisco temp=51.9,pm=38i 1577836800000000000
weather,location=New\ York temp=18.2,pm=0i 1577836800000000000
weather,location=Hong\ Kong temp=53.6,pm=171i 1577836800000000000