chore(telegraf): Prepare for data formats documentation from upstream.

- Remove input and output files, to be replaced by generated files.
jts-docs-restructure-telegraf-serializers
Jason Stirnaman 2025-12-09 11:36:05 -06:00
parent 7041a2efb3
commit 14552b9da3
30 changed files with 0 additions and 5033 deletions


@ -1,45 +0,0 @@
---
title: Telegraf input data formats
list_title: Input data formats
description: Telegraf supports parsing input data formats into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Input data formats
weight: 1
parent: Data formats
---
Telegraf [input plugins](/telegraf/v1/plugins/inputs/) consume data in one or more data formats and
parse the data into Telegraf [metrics](/telegraf/v1/metrics/).
Many input plugins use configurable parsers to parse data formats into metrics.
This allows input plugins such as the [`kafka_consumer` input plugin](/telegraf/v1/plugins/#input-kafka_consumer)
to consume and process different data formats, such as InfluxDB line
protocol or JSON.
Telegraf supports the following input **data formats**:
{{< children >}}
Any input plugin containing the `data_format` option can use it to select the
desired parser:
```toml
[[inputs.exec]]
## Commands array
commands = ["/tmp/test.sh", "/usr/bin/mycollector --foo=bar"]
## measurement name suffix (for separating different commands)
name_suffix = "_mycollector"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "json_v2"
```
## Input parser plugins
When you specify a `data_format` in an [input plugin](/telegraf/v1/plugins/inputs/) configuration that supports it, the input plugin uses the associated [parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers) to convert data from its source format into Telegraf metrics.
Many parser plugins provide additional configuration options for specifying details about your data schema and how it should map to fields in Telegraf metrics.
[metrics]: /telegraf/v1/metrics/


@ -1,105 +0,0 @@
---
title: Avro input data format
list_title: Avro
description: Use the `avro` input data format to parse Avro binary or JSON data into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Avro
weight: 10
parent: Input data formats
metadata: [Avro Parser Plugin]
---
Use the `avro` input data format to parse binary or JSON [Avro](https://avro.apache.org/) message data into Telegraf metrics.
## Wire format
Avro messages should conform to [Wire Format](https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html#wire-format) using the following byte-mapping:
| Bytes | Area | Description |
| ----- | ---------- | ------------------------------------------------ |
| 0 | Magic Byte | Confluent serialization format version number; currently always `0`. |
| 1-4 | Schema ID | 4-byte schema ID as returned by Schema Registry. |
| 5- | Data | Serialized data. |
{{% caption %}}
Source: [Confluent Documentation](https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html#wire-format)
{{% /caption %}}
For more information about Avro schema and encodings, see the [specification](https://avro.apache.org/docs/current/specification/) in the Apache Avro documentation.
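As an illustration of the byte layout above, the following Go sketch splits a wire-format message into its schema ID and Avro payload. The helper name `decodeWireHeader` is hypothetical; this is not Telegraf's implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeWireHeader splits a Confluent wire-format message into its
// schema ID and Avro payload, following the byte layout in the table above.
func decodeWireHeader(msg []byte) (schemaID uint32, payload []byte, err error) {
	if len(msg) < 5 {
		return 0, nil, fmt.Errorf("message too short: %d bytes", len(msg))
	}
	if msg[0] != 0 { // magic byte; currently always 0
		return 0, nil, fmt.Errorf("unknown magic byte: %#x", msg[0])
	}
	schemaID = binary.BigEndian.Uint32(msg[1:5]) // 4-byte big-endian schema ID
	return schemaID, msg[5:], nil
}

func main() {
	// magic byte 0, schema ID 42, two bytes of Avro-encoded data
	msg := []byte{0x00, 0x00, 0x00, 0x00, 0x2A, 0x02, 0x10}
	id, payload, err := decodeWireHeader(msg)
	fmt.Println(id, payload, err) // 42 [2 16] <nil>
}
```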
## Configuration
```toml
[[inputs.kafka_consumer]]
## Kafka brokers.
brokers = ["localhost:9092"]
## Topics to consume.
topics = ["telegraf"]
## Maximum length of a message to consume, in bytes (default 0/unlimited);
## larger messages are dropped
max_message_len = 1000000
## Avro data format settings
data_format = "avro"
## Avro message format
## Supported values are "binary" (default) and "json"
# avro_format = "binary"
## URL of the schema registry; exactly one of schema registry and
## schema must be set
avro_schema_registry = "http://localhost:8081"
## Schema string; exactly one of schema registry and schema must be set
#avro_schema = '''
# {
# "type":"record",
# "name":"Value",
# "namespace":"com.example",
# "fields":[
# {
# "name":"tag",
# "type":"string"
# },
# {
# "name":"field",
# "type":"long"
# },
# {
# "name":"timestamp",
# "type":"long"
# }
# ]
# }
#'''
## Measurement string; if not set, the measurement name is determined
## from the schema (as "<namespace>.<name>")
# avro_measurement = "ratings"
## Avro fields to be used as tags; optional.
# avro_tags = ["CHANNEL", "CLUB_STATUS"]
## Avro fields to be used as fields; if empty, any Avro fields
## detected from the schema, not used as tags, will be used as
## measurement fields.
# avro_fields = ["STARS"]
## Avro fields to be used as timestamp; if empty, current time will
## be used for the measurement timestamp.
# avro_timestamp = ""
## If avro_timestamp is specified, avro_timestamp_format must be set
## to one of 'unix', 'unix_ms', 'unix_us', or 'unix_ns'
# avro_timestamp_format = "unix"
## Used to separate parts of array structures. The default is the empty
## string, so a=["a", "b"] becomes a0="a", a1="b".
## If this were set to "_", then it would be a_0="a", a_1="b".
# avro_field_separator = "_"
## Default values for given tags; optional
# tags = { "application": "hermes", "region": "central" }
```


@ -1,355 +0,0 @@
---
title: Binary input data format
list_title: Binary
description: Use the `binary` input data format with user-specified configurations to parse binary protocols into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Binary
weight: 10
parent: Input data formats
metadata: [Binary Parser Plugin]
---
Use the `binary` input data format with user-specified configurations to parse binary protocols into Telegraf metrics.
## Configuration
```toml
[[inputs.file]]
files = ["example.bin"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "binary"
## Do not error-out if none of the filter expressions below matches.
# allow_no_match = false
## Specify the endianness of the data.
## Available values are "be" (big-endian), "le" (little-endian) and "host",
## where "host" means the same endianness as the machine running Telegraf.
# endianess = "host"
## Interpret input as string containing hex-encoded data.
# hex_encoding = false
## Multiple parsing sections are allowed
[[inputs.file.binary]]
## Optional: Metric (measurement) name to use if not extracted from the data.
# metric_name = "my_name"
## Definition of the message format and the extracted data.
## Please note that you need to define all elements of the data in the
## correct order with the correct length as the data is parsed in the order
## given.
## An entry can have the following properties:
## name -- Name of the element (e.g. field or tag). Can be omitted
## for special assignments (i.e. time & measurement) or if
## entry is omitted.
## type -- Data-type of the entry. Can be "int8/16/32/64", "uint8/16/32/64",
## "float32/64", "bool" and "string".
## In case of time, this can be any of "unix" (default), "unix_ms", "unix_us",
## "unix_ns" or a valid Golang time format.
## bits -- Length in bits for this entry. If omitted, the length derived from
## the "type" property will be used. For "time" 64-bit will be used
## as default.
## assignment -- Assignment of the gathered data. Can be "measurement", "time",
## "field" or "tag". If omitted "field" is assumed.
## omit -- Omit the given data. If true, the data is skipped and not added
## to the metric. Omitted entries only need a length definition
## via "bits" or "type".
## terminator -- Terminator for dynamic-length strings. Only used for "string" type.
## Valid values are "fixed" (fixed length string given by "bits"),
## "null" (null-terminated string) or a character sequence specified
## as HEX values (e.g. "0x0D0A"). Defaults to "fixed" for strings.
## timezone -- Timezone of "time" entries. Only applies to "time" assignments.
## Can be "utc", "local" or any valid Golang timezone (e.g. "Europe/Berlin")
entries = [
{ type = "string", assignment = "measurement", terminator = "null" },
{ name = "address", type = "uint16", assignment = "tag" },
{ name = "value", type = "float64" },
{ type = "unix", assignment = "time" },
]
## Optional: Filter evaluated before applying the configuration.
## This option can be used to manage multiple configurations, each specific to
## a certain message type. If no filter is given, the configuration is applied.
# [inputs.file.binary.filter]
# ## Filter message by the exact length in bytes (default: N/A).
# # length = 0
# ## Filter the message by a minimum length in bytes.
# ## Messages of equal or greater length will pass.
# # length_min = 0
# ## List of data parts to match.
# ## Only if all selected parts match, the configuration will be
# ## applied. The "offset" is the start of the data to match in bits,
# ## "bits" is the length in bits and "match" is the value to match
# ## against. Non-byte boundaries are supported, data is always right-aligned.
# selection = [
# { offset = 0, bits = 8, match = "0x1F" },
# ]
#
#
```
In this configuration mode, you explicitly specify the fields and tags
to parse from your data.
A configuration can contain multiple `binary` subsections.
For example, the `file` plugin can apply multiple parser configurations
to the same data. Together with _filters_, this is useful for handling
different message types.
**Note**: The `filter` section needs to be placed _after_ the `entries`
definitions, otherwise the entries will be assigned to the filter section.
### General options and remarks
#### `allow_no_match` (optional)
By specifying `allow_no_match` you allow the parser to silently ignore data
that does not match _any_ given configuration filter. This can be useful if
you only want to collect a subset of the available messages.
#### `endianness` (optional)
This specifies the endianness of the data. If not specified, the parser will
fall back to the "host" endianness, assuming that the message and Telegraf
machine share the same endianness.
Alternatively, you can explicitly specify big-endian format (`"be"`) or
little-endian format (`"le"`).
#### `hex_encoding` (optional)
If `true`, the input data is interpreted as a string containing hex-encoded
data like `C0 C7 21 A9`. The value is _case insensitive_ and can handle spaces,
however prefixes like `0x` or `x` are _not_ allowed.
### Non-byte aligned value extraction
In both `filter` and `entries` definitions, values can be extracted at non-byte
boundaries. For example, you can extract 3 bits starting at bit-offset 8. In those
cases, the result is masked and shifted such that the resulting byte value
is _right_-aligned. If your 3 bits are `101`, the resulting byte value is
`0x05`.
This is especially important when specifying the `match` value in the filter
section.
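The masking and shifting described above can be sketched in Go. `extractBits` is a hypothetical helper for illustration, not the parser's actual code:

```go
package main

import "fmt"

// extractBits returns the value of `bits` bits starting at bit `offset`
// (counting from the most significant bit of data[0]), masked and shifted
// so the result is right-aligned, as described above.
func extractBits(data []byte, offset, bits int) uint64 {
	var v uint64
	for i := offset; i < offset+bits; i++ {
		// take bit i (big-endian bit order) and append it to the result
		v = v<<1 | uint64(data[i/8]>>(7-i%8)&1)
	}
	return v
}

func main() {
	// 3 bits starting at bit-offset 8: data[1] = 0xA0 = 0b1010_0000,
	// so the extracted bits are 101 and the right-aligned value is 0x05.
	fmt.Println(extractBits([]byte{0xFF, 0xA0}, 8, 3)) // 5
}
```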
### Entries definitions
The `entries` array specifies how to parse the message into the measurement name,
timestamp, tags, and fields.
#### `measurement` specification
When setting the `assignment` to `"measurement"`, the extracted value
is used as the metric name, overriding other specifications.
The `type` setting is assumed to be `"string"` and can be omitted similar
to the `name` option. See [`string` type handling](#string-type-handling)
for details and further options.
#### `time` specification
When setting the `assignment` to `"time"`, the extracted value
is used as the timestamp of the metric. The default is the _current
time_ for all created metrics.
The `type` setting specifies the time-format of included timestamps.
Use one of the following:
- `unix` _(default)_
- `unix_ms`
- `unix_us`
- `unix_ns`
- [Go "reference time"][time const]. Consult the Go [time][time parse]
package for details and additional examples on how to set the time format.
For the `unix` format and derivatives, the underlying value is assumed
to be a 64-bit integer. The `bits` setting can be used to specify other
length settings. All other time-formats assume a fixed-length `string`
value to be extracted. The length of the string is automatically
determined using the format setting in `type`.
The `timezone` setting interprets the extracted time in the
given timezone. By default, the time is interpreted as `utc`.
Other valid values are `local` (the local timezone configured for
the machine) or a valid timezone specification (for example, `Europe/Berlin`).
#### `tag` specification
When setting the `assignment` to `"tag"`, the extracted value
is used as a tag. The `name` setting is the name of the tag
and the `type` defaults to `string`. When specifying other types,
the extracted value is first interpreted as the given type and
then converted to `string`.
The `bits` setting can be used to specify the length of the data to
extract and is required for fixed-length `string` types.
#### `field` specification
When setting the `assignment` to `"field"` or omitting the `assignment`
setting, the extracted value is used as a field. The `name` setting
is used as the name of the field and the `type` as the type of the field value.
The `bits` setting can be used to specify the length of the data to
extract. By default the length corresponding to `type` is used.
Please see the [string](#string-type-handling) and [bool](#bool-type-handling)
specific sections when using those types.
#### `string` type handling
Strings are assumed to be fixed-length strings by default. In this case, the
`bits` setting is mandatory to specify the length of the string in _bits_.
To handle dynamic strings, the `terminator` setting can be used to specify
characters to terminate the string. The two named options, `fixed` and `null`
specify fixed-length and null-terminated strings, respectively.
Any other setting is interpreted as a hexadecimal sequence of bytes
matching the end of the string. The termination-sequence is removed from
the result.
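A minimal Go sketch of the terminator handling for dynamic-length strings; the helper name `cutAtTerminator` is hypothetical:

```go
package main

import (
	"bytes"
	"fmt"
)

// cutAtTerminator returns the string up to (and excluding) the terminator
// byte sequence, or the whole input if the terminator is absent.
func cutAtTerminator(data, term []byte) string {
	if i := bytes.Index(data, term); i >= 0 {
		return string(data[:i]) // termination sequence removed from the result
	}
	return string(data)
}

func main() {
	// terminator "0x0D0A" (CR LF), as in the configuration example above
	fmt.Println(cutAtTerminator([]byte("hello\r\nrest"), []byte{0x0D, 0x0A})) // hello
}
```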
#### `bool` type handling
By default, `bool` types are assumed to be _one_ bit in length. You can
specify any other length by using the `bits` setting.
When interpreting values as booleans, any zero value is `false` and
any non-zero value is `true`.
#### Omitting data
Parts of the data can be omitted by setting `omit = true`. In this case,
you only need to specify the length of the chunk to omit by either using
the `type` or `bits` setting. All other options can be skipped.
### Filter definitions
Filters can be used to match the length or the content of the data against
a specified reference. See the [examples section](#examples) for details.
You can also check multiple parts of the message by specifying multiple
entries in the `selection` list of a filter. Each entry is then matched
separately. All entries have to match for the configuration to be applied.
#### `length` and `length_min` options
Using the `length` option, the filter checks if the parsed data has
exactly the given number of _bytes_. Otherwise, the configuration is not applied.
Similarly, for `length_min` the data has to have _at least_ the given number
of _bytes_ to generate a match.
#### `selection` list
Selections can be used with or without length constraints to match the content
of the data. Here, the `offset` and `bits` properties specify the start
and length of the data to check. Both values are in _bits_, allowing for non-byte
aligned value extraction. The extracted data is checked against the
given `match` value specified in HEX.
If multiple `selection` entries are specified _all_ of the selections must
match for the configuration to get applied.
## Examples
The following example uses a binary protocol with three different message
types in little-endian format.
### Message A definition
```text
+--------+------+------+--------+--------+------------+--------------------+--------------------+
| ID | type | len | addr | count | failure | value | timestamp |
+--------+------+------+--------+--------+------------+--------------------+--------------------+
| 0x0201 | 0x0A | 0x18 | 0x7F01 | 0x2A00 | 0x00000000 | 0x6F1283C0CA210940 | 0x10D4DF6200000000 |
+--------+------+------+--------+--------+------------+--------------------+--------------------+
```
### Message B definition
```text
+--------+------+------+------------+
| ID | type | len | value |
+--------+------+------+------------+
| 0x0201 | 0x0B | 0x04 | 0xDEADC0DE |
+--------+------+------+------------+
```
### Message C definition
```text
+--------+------+------+------------+------------+--------------------+
| ID | type | len | value x | value y | timestamp |
+--------+------+------+------------+------------+--------------------+
| 0x0201 | 0x0C | 0x10 | 0x4DF82D40 | 0x5F305C08 | 0x10D4DF6200000000 |
+--------+------+------+------------+------------+--------------------+
```
All messages consist of a 4-byte header, containing the _message type_
in the third byte, followed by a message-specific body. To parse these messages,
you can use the following configuration:
```toml
[[inputs.file]]
files = ["messageA.bin", "messageB.bin", "messageC.bin"]
data_format = "binary"
endianess = "le"
[[inputs.file.binary]]
metric_name = "messageA"
entries = [
{ bits = 32, omit = true },
{ name = "address", type = "uint16", assignment = "tag" },
{ name = "count", type = "int16" },
{ name = "failure", type = "bool", bits = 32, assignment = "tag" },
{ name = "value", type = "float64" },
{ type = "unix", assignment = "time" },
]
[inputs.file.binary.filter]
selection = [{ offset = 16, bits = 8, match = "0x0A" }]
[[inputs.file.binary]]
metric_name = "messageB"
entries = [
{ bits = 32, omit = true },
{ name = "value", type = "uint32" },
]
[inputs.file.binary.filter]
selection = [{ offset = 16, bits = 8, match = "0x0B" }]
[[inputs.file.binary]]
metric_name = "messageC"
entries = [
{ bits = 32, omit = true },
{ name = "x", type = "float32" },
{ name = "y", type = "float32" },
{ type = "unix", assignment = "time" },
]
[inputs.file.binary.filter]
selection = [{ offset = 16, bits = 8, match = "0x0C" }]
```
The above configuration has one `[[inputs.file.binary]]` section per
message type and uses a filter in each of those sections to apply
the correct configuration by comparing the 3rd byte (containing
the message type). This results in the following output:
```text
messageA,address=383,failure=false count=42i,value=3.1415 1658835984000000000
messageB value=3737169374i 1658847037000000000
messageC x=2.718280076980591,y=0.0000000000000000000000000000000006626070178575745 1658835984000000000
```
`messageB` uses the parsing time as its timestamp because the data
contains no timestamp information. The other two metrics use the timestamp
derived from the data.
[time const]: https://golang.org/pkg/time/#pkg-constants
[time parse]: https://golang.org/pkg/time/#Parse


@ -1,69 +0,0 @@
---
title: Collectd input data format
list_title: Collectd
description: Use the `collectd` input data format to parse collectd network binary protocol to create tags for host, instance, type, and type instance.
menu:
telegraf_v1_ref:
name: collectd
weight: 10
parent: Input data formats
metadata: [Collectd Parser Plugin]
---
Use the `collectd` input data format to parse [collectd binary network protocol](https://collectd.org/wiki/index.php/Binary_protocol) data into Telegraf metrics.
The parser creates tags for host, instance, type, and type instance.
All collectd values are added as `float64` fields.
You can control the cryptographic settings with parser options. Create an
authentication file and set `collectd_auth_file` to the path of the file, then
set the desired security level in `collectd_security_level`.
For additional information, including client setup, see the [collectd cryptographic setup documentation][1].
You can also change the path to the typesdb or add additional typesdb using
`collectd_typesdb`.
[1]: https://collectd.org/wiki/index.php/Networking_introduction#Cryptographic_setup
## Configuration
```toml
[[inputs.socket_listener]]
service_address = "udp://:25826"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "collectd"
## Authentication file for cryptographic security levels
collectd_auth_file = "/etc/collectd/auth_file"
## One of none (default), sign, or encrypt
collectd_security_level = "encrypt"
## Path to TypesDB specifications
collectd_typesdb = ["/usr/share/collectd/types.db"]
## Multi-value plugins can be handled in one of two ways:
## "split" parses and stores the multi-value plugin data in separate measurements.
## "join" parses and stores the multi-value plugin data as a single multi-value measurement.
## "split" is the default for backward compatibility with previous versions of InfluxDB.
collectd_parse_multivalue = "split"
```
## Example Output
```text
memory,type=memory,type_instance=buffered value=2520051712 1560455990829955922
memory,type=memory,type_instance=used value=3710791680 1560455990829955922
memory,type=memory,type_instance=buffered value=2520047616 1560455980830417318
memory,type=memory,type_instance=cached value=9472626688 1560455980830417318
memory,type=memory,type_instance=slab_recl value=2088894464 1560455980830417318
memory,type=memory,type_instance=slab_unrecl value=146984960 1560455980830417318
memory,type=memory,type_instance=free value=2978258944 1560455980830417318
memory,type=memory,type_instance=used value=3707047936 1560455980830417318
```


@ -1,599 +0,0 @@
---
title: CSV input data format
list_title: CSV
description: Use the `csv` input data format to parse comma-separated values into Telegraf metrics.
menu:
telegraf_v1_ref:
name: CSV
weight: 10
parent: Input data formats
metadata: [CSV parser plugin]
---
Use the `csv` input data format to parse comma-separated values into Telegraf metrics.
## Configuration
```toml
[[inputs.file]]
files = ["example"]
## The data format to consume.
## Type: string
## Each data format has its own unique set of configuration options.
## For more information about input data formats and options,
## see https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "csv"
## Specifies the number of rows to treat as the header.
## Type: integer
## Default: 0
## The value can be 0 or greater.
## If `0`, doesn't use a header; the parser treats all rows as data and uses the names specified in `csv_column_names`.
## If `1`, uses the first row as the header.
## If greater than `1`, concatenates that number of values for each column.
## Values specified in `csv_column_names` override column names in the header.
csv_header_row_count = 0
## Specifies custom names for columns.
## Type: []string
## Default: []
## Specify names in order by column; unnamed columns are ignored by the parser.
## Required if `csv_header_row_count` is set to `0`.
csv_column_names = []
## Specifies data types for columns.
## Type: []string{"int", "float", "bool", "string"}
## Default: Tries to convert each column to one of the possible types, in the following order: "int", "float", "bool", "string".
## Possible values: "int", "float", "bool", "string".
## Specify types in order by column (for example, `["string", "int", "float"]`).
csv_column_types = []
## Specifies the number of rows to skip before looking for metadata and header information.
## Default: 0
csv_skip_rows = 0
## Specifies the number of rows to parse as metadata (before looking for header information).
## Type: integer
## Default: 0; no metadata rows to parse.
## If set, parses the rows using the characters specified in `csv_metadata_separators`, and then adds the
## parsed key-value pairs as tags in the data.
## To convert the tags to fields, use the converter processor.
csv_metadata_rows = 0
## Specifies metadata separators, in order of precedence, for parsing metadata rows.
## Type: []string
## At least one separator is required if `csv_metadata_rows` is set.
## The specified values set the order of precedence for separators used to parse `csv_metadata_rows` into key-value pairs.
## Separators are case-sensitive.
csv_metadata_separators = [":", "="]
## Specifies a set of characters to trim from metadata rows.
## Type: string
## Default: empty; the parser doesn't trim metadata rows.
## Trim characters are case sensitive.
csv_metadata_trim_set = ""
## Specifies the number of columns to skip in header and data rows.
## Type: integer
## Default: 0; no columns are skipped
csv_skip_columns = 0
## Specifies the separator for columns in the CSV.
## Type: string
## Default: a comma (`,`)
## If you specify an invalid delimiter (for example, `"\u0000"`),
## the parser converts commas to `"\ufffd"` and converts invalid delimiters
## to commas, parses the data, and then reverts invalid characters and commas
## to their original values.
csv_delimiter = ","
## Specifies the character used to indicate a comment row.
## Type: string
## Default: empty; no rows are treated as comments
## The parser skips rows that begin with the specified character.
csv_comment = ""
## Specifies whether to remove leading whitespace from fields.
## Type: boolean
## Default: false
csv_trim_space = false
## Specifies columns (by name) to use as tags.
## Type: []string
## Default: empty
## Columns not specified as tags or measurement name are considered fields.
csv_tag_columns = []
## Specifies whether column tags overwrite metadata and default tags.
## Type: boolean
## Default: false
## If true, the column tag value takes precedence over metadata
## or default tags that have the same name.
csv_tag_overwrite = false
## Specifies the CSV column to use for the measurement name.
## Type: string
## Default: empty; uses the input plugin name for the measurement name.
## If set, the measurement name is extracted from values in the specified
## column and the column isn't included as a field.
csv_measurement_column = ""
## Specifies the CSV column to use for the timestamp.
## Type: string
## Default: empty; uses the current system time as the timestamp in metrics
## If set, the parser extracts time values from the specified column
## to use as timestamps in metrics, and the column isn't included
## as a field in metrics.
## If set, you must also specify a value for `csv_timestamp_format`.
## For more information, see [timestamps](/telegraf/v1/data_formats/input/csv/#timestamps).
csv_timestamp_column = ""
## Specifies the timestamp format for values extracted from `csv_timestamp_column`.
## Type: string
## Possible values: "unix", "unix_ms", "unix_us", "unix_ns", the Go reference time in one of the predefined layouts
## Default: empty
## Required if `csv_timestamp_column` is specified.
## For more information, see [timestamps](/telegraf/v1/data_formats/input/csv/#timestamps).
csv_timestamp_format = ""
## Specifies the time zone to use and outputs location-specific timestamps in metrics.
## Only used if `csv_timestamp_format` is the Go reference time in one of the
## predefined layouts; unix formats are in UTC.
## Type: string
## Default: empty
## Possible values: a time zone name in TZ syntax. For a list of names, see https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List.
## For more information, see [timestamps](/telegraf/v1/data_formats/input/csv/#timestamps).
csv_timezone = ""
## Specifies values to skip--for example, an empty string (`""`).
## Type: []string
## Default: empty
## The parser skips field values that match any of the specified values.
csv_skip_values = []
## Specifies whether to skip CSV lines that can't be parsed.
## Type: boolean
## Default: false
csv_skip_errors = false
## Specifies whether to reset the parser after each call.
## Type: string
## Default: "none"
## Possible values:
## - "none": Do not reset the parser.
## - "always": Reset the parser's state after reading each file in the gather
## cycle. If parsing by line, the setting is ignored.
## Resetting the parser state after parsing each file is helpful when reading
## full CSV structures that include headers or metadata.
csv_reset_mode = "none"
```
## Metrics
With the default configuration, the CSV data format parser creates one metric
for each CSV row, and adds CSV columns as fields in the metric.
A field's data type is automatically determined from its value (unless explicitly defined with `csv_column_types`).
Data format configuration options let you customize how the parser handles
specific CSV rows, columns, and data types.
[Metric filtering](/telegraf/v1/configuration/#metric-filtering) and [aggregator and processor plugins](/telegraf/v1/configure_plugins/aggregator_processor/) provide additional data transformation options--for example:
- Use metric filtering to skip columns and rows.
- Use the [converter processor](https://github.com/influxdata/telegraf/tree/master/plugins/processors/converter/) to convert parsed metadata from tags to fields.
## Timestamps
Every metric has a timestamp--a date and time associated with the fields.
The default timestamp for created metrics is the _current time_ in UTC.
To use extracted values from the CSV as timestamps for metrics, specify
the `csv_timestamp_column` and `csv_timestamp_format` options.
### csv_timestamp_column
The `csv_timestamp_column` option specifies the key (column name) in the CSV data
that contains the time value to extract and use as the timestamp in metrics.
A unix time value may be one of the following data types:
- int64
- float64
- string
If you specify a [Go format](https://go.dev/src/time/format.go) for `csv_timestamp_format`,
values in your timestamp column must be strings.
When using the [`"unix"` format](#csv_timestamp_format), an optional fractional component is allowed.
Other unix time formats, such as `"unix_ms"`, cannot have a fractional component.
### csv_timestamp_format
If specifying `csv_timestamp_column`, you must also specify the format of timestamps in the column.
To specify the format, set `csv_timestamp_format` to one of the following values:
- `"unix"`
- `"unix_ms"`
- `"unix_us"`
- `"unix_ns"`
- a predefined layout from Go [`time` constants](https://pkg.go.dev/time#pkg-constants) using the
Go _reference time_--for example, `"Mon Jan 2 15:04:05 MST 2006"` (the `UnixDate` format string).
For more information about time formats, see the following:
- Unix time documentation
- Go [time][time parse] package documentation
### Time zone
Telegraf outputs timestamps in UTC.
To parse location-aware timestamps in your data,
specify a [`csv_timestamp_format`](#csv_timestamp_format)
that contains time zone information.
If timestamps in the `csv_timestamp_column` contain a time zone offset, the parser uses the offset to calculate the timestamp in UTC.
If `csv_timestamp_format` and your timestamp data contain a time zone abbreviation, the parser tries to resolve the abbreviation to a location in the [IANA Time Zone Database](https://www.iana.org/time-zones) and returns a UTC offset for that location.
To set the location that the parser should use when resolving time zone abbreviations, specify a value for `csv_timezone`, following the TZ syntax in the [Internet Assigned Numbers Authority time zone database](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List).
{{% warn %}}
Prior to Telegraf v1.27, the Telegraf parser ignored abbreviated time zones (for example, "EST") in parsed time values, and used UTC for the timestamp location.
{{% /warn %}}
## Examples
### Extract timestamps from a time column using RFC3339 format
Configuration:
```toml
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_header_row_count = 1
csv_measurement_column = "measurement"
csv_timestamp_column = "time"
csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
[[outputs.file]]
files = ["metrics.out"]
influx_sort_fields = true
```
Input:
```csv
measurement,cpu,time_user,time_system,time_idle,time
cpu,cpu0,42,42,42,2018-09-13T13:03:28Z
```
<!--
```bash
cat <<EOF > telegraf.conf
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_header_row_count = 1
csv_measurement_column = "measurement"
csv_timestamp_column = "time"
csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
[[outputs.file]]
files = ["metrics.out"]
influx_sort_fields = true
EOF
cat <<EOF > example
measurement,cpu,time_user,time_system,time_idle,time
cpu,cpu0,42,42,42,2018-09-13T13:03:28Z
EOF
telegraf --once --config telegraf.conf && cat metrics.out && rm metrics.out
```
-->
Output:
<!--pytest-codeblocks:expected-output-->
```
cpu cpu="cpu0",time_idle=42i,time_system=42i,time_user=42i 1536843808000000000
```
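As a quick sanity check (standard-library Python, not part of Telegraf), the epoch-nanosecond value above converts back to the input time:

```python
from datetime import datetime, timezone

# Convert the emitted epoch-nanosecond timestamp back to a UTC datetime.
ns = 1536843808000000000
dt = datetime.fromtimestamp(ns // 1_000_000_000, tz=timezone.utc)
print(dt.isoformat())  # 2018-09-13T13:03:28+00:00
```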
### Parse timestamp abbreviations
The following example specifies `csv_timezone` for resolving an associated time zone (`EST`) in the input data:
Configuration:
```toml
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_header_row_count = 1
csv_measurement_column = "measurement"
csv_timestamp_column = "time"
csv_timestamp_format = "Mon, 02 Jan 2006 15:04:05 MST"
csv_timezone = "America/New_York"
[[outputs.file]]
files = ["metrics.out"]
influx_sort_fields = true
```
Input:
```csv
measurement,cpu,time_user,time_system,time_idle,time
cpu,cpu1,42,42,42,"Mon, 02 Jan 2006 15:04:05 EST"
cpu,cpu1,42,42,42,"Mon, 02 Jan 2006 15:04:05 GMT"
```
<!--
```bash
cat <<EOF > telegraf.conf
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_header_row_count = 1
csv_measurement_column = "measurement"
csv_timestamp_column = "time"
csv_timestamp_format = "Mon, 02 Jan 2006 15:04:05 MST"
csv_timezone = "America/New_York"
[[outputs.file]]
files = ["metrics.out"]
influx_sort_fields = true
EOF
cat <<EOF > example
measurement,cpu,time_user,time_system,time_idle,time
cpu,cpu1,42,42,42,"Mon, 02 Jan 2006 15:04:05 EST"
cpu,cpu1,42,42,42,"Mon, 02 Jan 2006 15:04:05 GMT"
EOF
telegraf --once --config telegraf.conf && cat metrics.out && rm metrics.out
```
-->
The parser resolves the `GMT` and `EST` abbreviations and outputs the following:
<!--pytest-codeblocks:expected-output-->
```
cpu cpu="cpu1",time_idle=42i,time_system=42i,time_user=42i 1136232245000000000
cpu cpu="cpu1",time_idle=42i,time_system=42i,time_user=42i 1136214245000000000
```
The timestamps represent the following dates, respectively:
```text
2006-01-02 20:04:05
2006-01-02 15:04:05
```
### Parse metadata into tags
Configuration:
```toml
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_measurement_column = "measurement"
csv_metadata_rows = 2
csv_metadata_separators = [":", "="]
csv_metadata_trim_set = "# "
csv_header_row_count = 1
csv_tag_columns = ["Version","cpu"]
csv_timestamp_column = "time"
csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
[[outputs.file]]
files = ["metrics.out"]
influx_sort_fields = true
```
Input:
```csv
# Version=1.1
# File Created: 2021-11-17T07:02:45+10:00
Version,measurement,cpu,time_user,time_system,time_idle,time
1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z
```
<!--
```bash
cat <<EOF > telegraf.conf
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_measurement_column = "measurement"
csv_metadata_rows = 2
csv_metadata_separators = [":", "="]
csv_metadata_trim_set = "# "
csv_header_row_count = 1
csv_tag_columns = ["Version","cpu"]
csv_timestamp_column = "time"
csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
[[outputs.file]]
files = ["metrics.out"]
influx_sort_fields = true
EOF
cat <<EOF > example
# Version=1.1
# File Created: 2021-11-17T07:02:45+10:00
Version,measurement,cpu,time_user,time_system,time_idle,time
1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z
EOF
telegraf --once --config telegraf.conf && cat metrics.out && rm metrics.out
```
-->
Output:
<!--pytest-codeblocks:expected-output-->
```
cpu,File\ Created=2021-11-17T07:02:45+10:00,Version=1.1,cpu=cpu0 time_idle=42i,time_system=42i,time_user=42i 1536843808000000000
```
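The metadata options in this example (`csv_metadata_separators`, `csv_metadata_trim_set`) can be sketched as a small splitter: try each separator in order, split on its first occurrence, and trim the configured character set from key and value. This is an illustrative helper, not Telegraf's parser:

```python
def parse_metadata(rows, separators, trim_set):
    """Split each metadata row on the first matching separator,
    then trim the configured characters from both key and value."""
    tags = {}
    for row in rows:
        for sep in separators:
            if sep in row:
                key, _, value = row.partition(sep)
                tags[key.strip(trim_set)] = value.strip(trim_set)
                break
    return tags

rows = ["# Version=1.1", "# File Created: 2021-11-17T07:02:45+10:00"]
print(parse_metadata(rows, [":", "="], "# "))
```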
### Allow tag column values to overwrite parsed metadata
Configuration:
```toml
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_measurement_column = "measurement"
csv_metadata_rows = 2
csv_metadata_separators = [":", "="]
csv_metadata_trim_set = " #"
csv_header_row_count = 1
csv_tag_columns = ["Version","cpu"]
csv_tag_overwrite = true
csv_timestamp_column = "time"
csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
[[outputs.file]]
files = ["metrics.out"]
influx_sort_fields = true
```
Input:
```csv
# Version=1.1
# File Created: 2021-11-17T07:02:45+10:00
Version,measurement,cpu,time_user,time_system,time_idle,time
1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z
```
<!--
```bash
cat <<EOF > telegraf.conf
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_measurement_column = "measurement"
csv_metadata_rows = 2
csv_metadata_separators = [":", "="]
csv_metadata_trim_set = " #"
csv_header_row_count = 1
csv_tag_columns = ["Version","cpu"]
csv_tag_overwrite = true
csv_timestamp_column = "time"
csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
[[outputs.file]]
files = ["metrics.out"]
influx_sort_fields = true
EOF
cat <<EOF > example
# Version=1.1
# File Created: 2021-11-17T07:02:45+10:00
Version,measurement,cpu,time_user,time_system,time_idle,time
1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z
EOF
telegraf --once --config telegraf.conf && cat metrics.out && rm metrics.out
```
-->
Output:
<!--pytest-codeblocks:expected-output-->
```
cpu,File\ Created=2021-11-17T07:02:45+10:00,Version=1.2,cpu=cpu0 time_idle=42i,time_system=42i,time_user=42i 1536843808000000000
```
### Combine multiple header rows
Configuration:
```toml
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_comment = "#"
csv_header_row_count = 2
csv_measurement_column = "measurement"
csv_timestamp_column = "time"
csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
[[outputs.file]]
## Files to write to.
files = ["metrics.out"]
## Use determinate ordering.
influx_sort_fields = true
```
Input:
```csv
# Version=1.1
# File Created: 2021-11-17T07:02:45+10:00
Version,measurement,cpu,time,time,time,time
_system,,,_user,_system,_idle,
1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z
```
<!--
```bash
cat <<EOF > telegraf.conf
[agent]
omit_hostname = true
[[inputs.file]]
files = ["example"]
data_format = "csv"
csv_comment = "#"
csv_header_row_count = 2
csv_measurement_column = "measurement"
csv_timestamp_column = "time"
csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
[[outputs.file]]
## Files to write to.
files = ["metrics.out"]
## Use determinate ordering.
influx_sort_fields = true
EOF
cat <<EOF > example
# Version=1.1
# File Created: 2021-11-17T07:02:45+10:00
Version,measurement,cpu,time,time,time,time
_system,,,_user,_system,_idle,
1.2,cpu,cpu0,42,42,42,2018-09-13T13:03:28Z
EOF
telegraf --once --config telegraf.conf && cat metrics.out && rm metrics.out
```
-->
Output:
<!--pytest-codeblocks:expected-output-->
```
cpu Version_system=1.2,cpu="cpu0",time_idle=42i,time_system=42i,time_user=42i 1536843808000000000
```
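The header merge in this example can be sketched as a per-column concatenation of the two header rows (illustrative only, not Telegraf's code):

```python
def merge_headers(rows):
    """Concatenate header rows column by column to form final column names."""
    columns = [row.split(",") for row in rows]
    return ["".join(parts) for parts in zip(*columns)]

headers = ["Version,measurement,cpu,time,time,time,time",
           "_system,,,_user,_system,_idle,"]
print(merge_headers(headers))
```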
[time parse]: https://pkg.go.dev/time#Parse
[metric filtering]: /telegraf/v1/configuration/#metric-filtering
---
title: Dropwizard input data format
list_title: Dropwizard
description: Use the `dropwizard` input data format to parse Dropwizard JSON representations into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Dropwizard
weight: 10
parent: Input data formats
metadata: [Dropwizard parser plugin]
---
Use the `dropwizard` input data format to parse the [JSON Dropwizard][dropwizard]
representation of a single Dropwizard metric registry into Telegraf metrics. By default, tags are
parsed from metric names as if they were actual InfluxDB line protocol keys
(`measurement<,tag_set>`), which can be overridden by defining a custom [template
pattern][templates]. All field value types are supported: `string`, `number`, and
`boolean`.
[templates]: https://github.com/influxdata/telegraf/blob/master/docs/TEMPLATE_PATTERN.md
[dropwizard]: https://metrics.dropwizard.io/3.1.0/manual/json/
## Configuration
```toml
[[inputs.file]]
files = ["example"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "dropwizard"
## Used by the templating engine to join matched values when cardinality is > 1
separator = "_"
## Each template line requires a template pattern. It can have an optional
## filter before the template and separated by spaces. It can also have optional extra
## tags following the template. Multiple tags should be separated by commas and no spaces
## similar to the line protocol format. There can be only one default template.
## Templates support below format:
## 1. filter + template
## 2. filter + template + extra tag(s)
## 3. filter + template with field key
## 4. default template
## By providing an empty template array, templating is disabled and measurements are parsed as InfluxDB line protocol keys (measurement<,tag_set>)
templates = []
## You may use an appropriate [gjson path](https://github.com/tidwall/gjson#path-syntax)
## to locate the metric registry within the JSON document
# dropwizard_metric_registry_path = "metrics"
## You may use an appropriate [gjson path](https://github.com/tidwall/gjson#path-syntax)
## to locate the default time of the measurements within the JSON document
# dropwizard_time_path = "time"
# dropwizard_time_format = "2006-01-02T15:04:05Z07:00"
## You may use an appropriate [gjson path](https://github.com/tidwall/gjson#path-syntax)
## to locate the tags map within the JSON document
# dropwizard_tags_path = "tags"
## You may even use tag paths per tag
# [inputs.exec.dropwizard_tag_paths]
# tag1 = "tags.tag1"
# tag2 = "tags.tag2"
```
## Examples
A typical JSON representation of a Dropwizard metric registry:
```json
{
"version": "3.0.0",
"counters" : {
"measurement,tag1=green" : {
"count" : 1
}
},
"meters" : {
"measurement" : {
"count" : 1,
"m15_rate" : 1.0,
"m1_rate" : 1.0,
"m5_rate" : 1.0,
"mean_rate" : 1.0,
"units" : "events/second"
}
},
"gauges" : {
"measurement" : {
"value" : 1
}
},
"histograms" : {
"measurement" : {
"count" : 1,
"max" : 1.0,
"mean" : 1.0,
"min" : 1.0,
"p50" : 1.0,
"p75" : 1.0,
"p95" : 1.0,
"p98" : 1.0,
"p99" : 1.0,
"p999" : 1.0,
"stddev" : 1.0
}
},
"timers" : {
"measurement" : {
"count" : 1,
"max" : 1.0,
"mean" : 1.0,
"min" : 1.0,
"p50" : 1.0,
"p75" : 1.0,
"p95" : 1.0,
"p98" : 1.0,
"p99" : 1.0,
"p999" : 1.0,
"stddev" : 1.0,
"m15_rate" : 1.0,
"m1_rate" : 1.0,
"m5_rate" : 1.0,
"mean_rate" : 1.0,
"duration_units" : "seconds",
"rate_units" : "calls/second"
}
}
}
```
This is translated into the following metrics, one per metric type:
```text
measurement,metric_type=counter,tag1=green count=1
measurement,metric_type=meter count=1,m15_rate=1.0,m1_rate=1.0,m5_rate=1.0,mean_rate=1.0
measurement,metric_type=gauge value=1
measurement,metric_type=histogram count=1,max=1.0,mean=1.0,min=1.0,p50=1.0,p75=1.0,p95=1.0,p98=1.0,p99=1.0,p999=1.0
measurement,metric_type=timer count=1,max=1.0,mean=1.0,min=1.0,p50=1.0,p75=1.0,p95=1.0,p98=1.0,p99=1.0,p999=1.0,stddev=1.0,m15_rate=1.0,m1_rate=1.0,m5_rate=1.0,mean_rate=1.0
```
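The translation can be sketched by walking each registry section and tagging results with `metric_type` (an illustrative helper; Telegraf's parser additionally handles templates and typed values):

```python
registry = {
    "counters": {"measurement,tag1=green": {"count": 1}},
    "gauges": {"measurement": {"value": 1}},
}

def flatten(reg):
    """Yield (measurement, tags, fields) per registry entry; metric_type
    is derived from the section name (counters -> counter, and so on)."""
    for section, entries in reg.items():
        metric_type = section.rstrip("s")
        for name, fields in entries.items():
            measurement, _, tag_set = name.partition(",")
            tags = {"metric_type": metric_type}
            if tag_set:
                key, _, value = tag_set.partition("=")
                tags[key] = value
            yield measurement, tags, fields

for m, tags, fields in flatten(registry):
    print(m, tags, fields)
```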
You can also parse a Dropwizard registry nested in an inner field of a larger
JSON document. For example, to parse the following JSON
document:
```json
{
"time" : "2017-02-22T14:33:03.662+02:00",
"tags" : {
"tag1" : "green",
"tag2" : "yellow"
},
"metrics" : {
"counters" : {
"measurement" : {
"count" : 1
}
},
"meters" : {},
"gauges" : {},
"histograms" : {},
"timers" : {}
}
}
```
and translate it into:
```text
measurement,metric_type=counter,tag1=green,tag2=yellow count=1 1487766783662000000
```
use the following additional configuration properties:
```toml
dropwizard_metric_registry_path = "metrics"
dropwizard_time_path = "time"
dropwizard_time_format = "2006-01-02T15:04:05Z07:00"
dropwizard_tags_path = "tags"
## tag paths per tag are supported too, eg.
#[inputs.yourinput.dropwizard_tag_paths]
# tag1 = "tags.tag1"
# tag2 = "tags.tag2"
```
---
title: Form URL-encoded input data format
list_title: Form URL-encoded
description:
Use the `form-urlencoded` data format to parse `application/x-www-form-urlencoded`
data, such as HTTP query strings.
menu:
telegraf_v1_ref:
name: Form URL-encoded
weight: 10
parent: Input data formats
metadata: [Form URLencoded parser plugin]
---
Use the `form-urlencoded` data format to parse `application/x-www-form-urlencoded`
data, such as HTTP query strings.
A common use case is to pair it with the [http_listener_v2](/telegraf/v1/plugins/#input-http_listener_v2) input plugin to parse
the HTTP request body or query parameters.
## Configuration
```toml
[[inputs.http_listener_v2]]
## Address and port to host HTTP listener on
service_address = ":8080"
## Part of the request to consume. Available options are "body" and
## "query".
data_source = "body"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "form_urlencoded"
## Array of key names which should be collected as tags.
## By default, keys with string value are ignored if not marked as tags.
form_urlencoded_tag_keys = ["tag1"]
```
## Examples
### Basic parsing
Config:
```toml
[[inputs.http_listener_v2]]
name_override = "mymetric"
service_address = ":8080"
data_source = "query"
data_format = "form_urlencoded"
form_urlencoded_tag_keys = ["tag1"]
```
Request:
```bash
curl -i -XGET 'http://localhost:8080/telegraf?tag1=foo&field1=0.42&field2=42'
```
Output:
```text
mymetric,tag1=foo field1=0.42,field2=42
```
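The tag/field split can be sketched with Python's standard library (a hypothetical helper, not the plugin's implementation):

```python
from urllib.parse import parse_qsl

def parse_form(query, tag_keys):
    """Partition URL-encoded pairs into tags (strings) and numeric fields."""
    tags, fields = {}, {}
    for key, value in parse_qsl(query):
        if key in tag_keys:
            tags[key] = value
        else:
            try:
                fields[key] = float(value)
            except ValueError:
                pass  # untagged string values are ignored, per the plugin docs
    return tags, fields

print(parse_form("tag1=foo&field1=0.42&field2=42", ["tag1"]))
```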
---
title: Graphite input data format
list_title: Graphite
description: Use the `graphite` input data format to parse Graphite dot buckets into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Graphite
weight: 10
parent: Input data formats
---
Use the `graphite` input data format to parse Graphite _dot_ buckets directly into
Telegraf metrics with a measurement name, a single field, and optional tags.
By default, the separator is left as `.`, but this can be changed using the
`separator` argument. For more advanced options, Telegraf supports specifying
[templates](#templates) to translate graphite buckets into Telegraf metrics.
## Configuration
```toml
[[inputs.exec]]
## Commands array
commands = ["/tmp/test.sh", "/usr/bin/mycollector --foo=bar"]
## measurement name suffix (for separating different commands)
name_suffix = "_mycollector"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "graphite"
## This string will be used to join the matched values.
separator = "_"
## Each template line requires a template pattern. It can have an optional
## filter before the template and separated by spaces. It can also have optional extra
## tags following the template. Multiple tags should be separated by commas and no spaces
## similar to the line protocol format. There can be only one default template.
## Templates support below format:
## 1. filter + template
## 2. filter + template + extra tag(s)
## 3. filter + template with field key
## 4. default template
templates = [
"*.app env.service.resource.measurement",
"stats.* .host.measurement* region=eu-east,agent=sensu",
"stats2.* .host.measurement.field",
"measurement*"
]
```
## Templates
[Template patterns](/telegraf/v1/configure_plugins/template-patterns/) specify how a dot-delimited
string should be mapped to and from [metrics](/telegraf/v1/metrics/).
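A template such as `env.service.resource.measurement` can be sketched as a positional mapping over the dot-separated bucket (illustrative only; Telegraf templates also support filters, `measurement*` globs, and extra tags):

```python
def apply_template(bucket, template, separator="_"):
    """Map dot-separated bucket parts onto template part names.
    Parts named "measurement" are joined to form the measurement name."""
    parts = bucket.split(".")
    names = template.split(".")
    tags, measurement = {}, []
    for name, part in zip(names, parts):
        if name == "measurement":
            measurement.append(part)
        elif name:
            tags[name] = part
    return separator.join(measurement), tags

print(apply_template("prod.web.cpu.load", "env.service.resource.measurement"))
```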
---
title: Grok input data format
list_title: Grok
description: Use the `grok` data format to parse line-delimited data using a regular expression-like language.
menu:
telegraf_v1_ref:
name: Grok
weight: 10
parent: Input data formats
---
Use the `grok` data format to parse line-delimited data using a regular expression-like
language.
For an introduction to grok patterns, see [Grok Basics](https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#_grok_basics)
in the Logstash documentation. The grok parser uses a slightly modified version of Logstash **grok**
patterns, with the following format:
```text
%{<capture_syntax>[:<semantic_name>][:<modifier>]}
```
The `capture_syntax` defines the grok pattern used to parse the input
line and the `semantic_name` is used to name the field or tag. The extension
`modifier` controls the data type that the parsed item is converted to or
other special handling.
By default, all named captures are converted into string fields.
If a pattern does not have a semantic name, it is not captured.
Timestamp modifiers can be used to convert captures to the timestamp of the
parsed metric. If no timestamp is parsed, the metric is created using the
current time.
You must capture at least one field per line.
- Available modifiers:
  - string (default if nothing is specified)
  - int
  - float
  - duration (for example, `5.23ms` is converted to int nanoseconds)
  - tag (converts the field into a tag)
  - drop (drops the field completely)
  - measurement (use the matched text as the measurement name)
- Timestamp modifiers:
  - ts (auto-learns the timestamp format)
  - ts-ansic ("Mon Jan _2 15:04:05 2006")
  - ts-unix ("Mon Jan _2 15:04:05 MST 2006")
  - ts-ruby ("Mon Jan 02 15:04:05 -0700 2006")
  - ts-rfc822 ("02 Jan 06 15:04 MST")
  - ts-rfc822z ("02 Jan 06 15:04 -0700")
  - ts-rfc850 ("Monday, 02-Jan-06 15:04:05 MST")
  - ts-rfc1123 ("Mon, 02 Jan 2006 15:04:05 MST")
  - ts-rfc1123z ("Mon, 02 Jan 2006 15:04:05 -0700")
  - ts-rfc3339 ("2006-01-02T15:04:05Z07:00")
  - ts-rfc3339nano ("2006-01-02T15:04:05.999999999Z07:00")
  - ts-httpd ("02/Jan/2006:15:04:05 -0700")
  - ts-epoch (seconds since Unix epoch; may contain a decimal)
  - ts-epochnano (nanoseconds since Unix epoch)
  - ts-epochmilli (milliseconds since Unix epoch)
  - ts-syslog ("Jan 02 15:04:05", parsed time is set to the current year)
  - ts-"CUSTOM"

CUSTOM time layouts must be within quotes and be the representation of the
"reference time", which is `Mon Jan 2 15:04:05 -0700 MST 2006`.
To match a comma decimal point, use a period in the pattern string. For example,
`%{TIMESTAMP:timestamp:ts-"2006-01-02 15:04:05.000"}` matches `"2018-01-02 15:04:05,000"`.
See https://golang.org/pkg/time/#Parse for more details.
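Conceptually, a grok pattern compiles to a regular expression with named captures plus per-capture type conversion; a rough Python sketch of the `%{NUMBER:value:int}` idea (not Telegraf's grok engine):

```python
import re

# Sketch: a grok-like pattern with named captures and an "int" modifier.
line = "2017-02-21 13:10:34 value=42"
pattern = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) value=(?P<value>\d+)"
)
m = pattern.match(line)
fields = {"value": int(m.group("value"))}  # the :int modifier converts the capture
print(m.group("timestamp"), fields)
```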
Telegraf has many of its own [built-in patterns](https://github.com/influxdata/telegraf/blob/master/plugins/parsers/grok/influx_patterns.go),
as well as support for most of
[Logstash's core patterns](https://github.com/logstash-plugins/logstash-patterns-core/blob/main/patterns/ecs-v1/grok-patterns).
Go regular expressions do not support lookahead or lookbehind;
Logstash patterns that depend on these are not supported.
For help building and testing patterns, see [tips for creating patterns](#tips-for-creating-patterns).
<!-- TOC -->
- [Configuration](#configuration)
- [Timestamp Examples](#timestamp-examples)
- [TOML Escaping](#toml-escaping)
- [Tips for creating patterns](#tips-for-creating-patterns)
- [Performance](#performance)
## Configuration
```toml
[[inputs.file]]
## Files to parse each interval.
## These accept standard unix glob matching rules, but with the addition of
## ** as a "super asterisk". ie:
## /var/log/**.log -> recursively find all .log files in /var/log
## /var/log/*/*.log -> find all .log files with a parent dir in /var/log
## /var/log/apache.log -> only tail the apache log file
files = ["/var/log/apache/access.log"]
## The dataformat to be read from files
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "grok"
## This is a list of patterns to check the given log file(s) for.
## Note that adding patterns here increases processing time. The most
## efficient configuration is to have one pattern.
## Other common built-in patterns are:
## %{COMMON_LOG_FORMAT} (plain apache & nginx access logs)
## %{COMBINED_LOG_FORMAT} (access logs + referrer & agent)
grok_patterns = ["%{COMBINED_LOG_FORMAT}"]
## Full path(s) to custom pattern files.
grok_custom_pattern_files = []
## Custom patterns can also be defined here. Put one pattern per line.
grok_custom_patterns = '''
'''
## Timezone allows you to provide an override for timestamps that
## don't already include an offset
## e.g. 04/06/2016 12:41:45 data one two 5.43µs
##
## Default: "" which renders UTC
## Options are as follows:
## 1. Local -- interpret based on machine localtime
## 2. "Canada/Eastern" -- Unix TZ values like those found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
## 3. UTC -- or blank/unspecified, will return timestamp in UTC
grok_timezone = "Canada/Eastern"
## When set to "disable" timestamp will not incremented if there is a
## duplicate.
# grok_unique_timestamp = "auto"
## Enable multiline messages to be processed.
# grok_multiline = false
```
### Timestamp Examples
This example input and config parses a file using a custom timestamp conversion:
```text
2017-02-21 13:10:34 value=42
```
```toml
[[inputs.file]]
grok_patterns = ['%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"} value=%{NUMBER:value:int}']
```
This example input and config parses a file using a timestamp in unix time:
```text
1466004605 value=42
1466004605.123456789 value=42
```
```toml
[[inputs.file]]
grok_patterns = ['%{NUMBER:timestamp:ts-epoch} value=%{NUMBER:value:int}']
```
This example parses a file using a built-in conversion and a custom pattern:
```text
Wed Apr 12 13:10:34 PST 2017 value=42
```
```toml
[[inputs.file]]
grok_patterns = ["%{TS_UNIX:timestamp:ts-unix} value=%{NUMBER:value:int}"]
grok_custom_patterns = '''
TS_UNIX %{DAY} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND} %{TZ} %{YEAR}
'''
```
This example input and config parses a file using a custom timestamp conversion
that doesn't match any specific standard:
```text
21/02/2017 13:10:34 value=42
```
```toml
[[inputs.file]]
grok_patterns = ['%{MY_TIMESTAMP:timestamp:ts-"02/01/2006 15:04:05"} value=%{NUMBER:value:int}']
grok_custom_patterns = '''
MY_TIMESTAMP (?:\d{2}.\d{2}.\d{4} \d{2}:\d{2}:\d{2})
'''
```
For cases where the timestamp itself is without offset, the `timezone` config
option is available to denote an offset. By default (with `timezone` omitted,
blank, or set to `"UTC"`), times are processed as if in the UTC timezone. If
specified as `timezone = "Local"`, the timestamp is processed based on the
current machine timezone configuration. Lastly, if using a timezone from the
list of Unix
[timezones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones), grok
offsets the timestamp accordingly.
#### TOML Escaping
When saving patterns to the configuration file, keep in mind the different TOML
[string](https://github.com/toml-lang/toml#string) types and the escaping
rules for each. These escaping rules must be applied in addition to the
escaping required by the grok syntax. Using the Multi-line line literal
syntax with `'''` may be useful.
The following config examples will parse this input file:
```text
|42|\uD83D\uDC2F|'telegraf'|
```
Since `|` is a special character in the grok language, we must escape it to
get a literal `|`. With a basic TOML string, special characters such as
backslash must be escaped, requiring us to escape the backslash a second time.
```toml
[[inputs.file]]
grok_patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"]
grok_custom_patterns = "UNICODE_ESCAPE (?:\\\\u[0-9A-F]{4})+"
```
We cannot use a literal TOML string for the pattern, because we cannot match a
`'` within it. However, it works well for the custom pattern.
```toml
[[inputs.file]]
grok_patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"]
grok_custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+'
```
A multi-line literal string allows us to encode the pattern:
```toml
[[inputs.file]]
grok_patterns = ['''
\|%{NUMBER:value:int}\|%{UNICODE_ESCAPE:escape}\|'%{WORD:name}'\|
''']
grok_custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+'
```
### Tips for creating patterns
Complex patterns can be difficult to read and write.
For help building and debugging grok patterns, see the following tools:
- [Grok Constructor](https://grokconstructor.appspot.com/)
- [Grok Debugger](https://grokdebugger.com/)
We recommend the following steps for building and testing a new pattern with Telegraf and your data:
1. In your Telegraf configuration, do the following to help you isolate and view the captured metrics:
- Configure a file output that writes to stdout:
```toml
[[outputs.file]]
files = ["stdout"]
```
- Disable other outputs while testing.
*Keep in mind that the file output will only print once per `flush_interval`.*
2. For the input, start with a sample file that contains a single line of your data,
and then remove all but the first token or piece of the line.
3. In your Telegraf configuration, add the section of your pattern that matches the piece of data from the previous step.
4. Run Telegraf and verify that the metric is parsed successfully.
5. If successful, add the next token to the data file, update the pattern configuration in Telegraf, and then retest.
6. Continue one token at a time until the entire line is successfully parsed.
#### Performance
Performance depends heavily on the regular expressions that you use, but there
are a few techniques that can help:
- Avoid using patterns such as `%{DATA}` that will always match.
- If possible, add `^` and `$` anchors to your pattern:
```toml
[[inputs.file]]
grok_patterns = ["^%{COMBINED_LOG_FORMAT}$"]
```
---
title: InfluxDB line protocol input data format
list_title: InfluxDB line protocol
description: Use the `influx` line protocol input data format to parse InfluxDB metrics directly into Telegraf metrics.
menu:
telegraf_v1_ref:
name: InfluxDB line protocol
weight: 10
parent: Input data formats
---
Use the `influx` line protocol input data format to parse InfluxDB [line protocol](/influxdb3/cloud-serverless/reference/syntax/line-protocol/) data into Telegraf [metrics](/telegraf/v1/metrics/).
## Configuration
```toml
[[inputs.file]]
files = ["example"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"
## Influx line protocol parser
## 'internal' is the default. 'upstream' is a newer parser that is faster
## and more memory efficient.
## influx_parser_type = "internal"
```
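For illustration only, a naive sketch of how a protocol line splits into measurement, tags, fields, and timestamp; it ignores escaping and quoting, which Telegraf's real parsers handle:

```python
def parse_line(line):
    """Naive line protocol split: measurement[,tags] fields [timestamp].
    Does not handle escaped or quoted separators."""
    head, fields_part, *rest = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = dict(pair.split("=", 1) for pair in fields_part.split(","))
    timestamp = int(rest[0]) if rest else None
    return measurement, tags, fields, timestamp

print(parse_line("cpu,host=a usage_idle=99.5 1536843808000000000"))
```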
---
title: JSON input data format
list_title: JSON
description: |
The `json` input data format parses JSON objects, or an array of objects, into Telegraf metrics.
For most cases, use the JSON v2 input data format instead.
menu:
telegraf_v1_ref:
name: JSON
weight: 10
parent: Input data formats
---
{{% note %}}
The following information applies to the legacy JSON input data format.
For most cases, use the [JSON v2 input data format](/telegraf/v1/data_formats/input/json_v2/) instead.
{{% /note %}}
The `json` data format parses a [JSON][json] object or an array of objects into
metric fields.
**NOTE:** All JSON numbers are converted to float fields. JSON strings and
booleans are ignored unless specified in the `tag_keys` or `json_string_fields`
options.
## Configuration
```toml
[[inputs.file]]
files = ["example"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "json"
## When strict is true and a JSON array is being parsed, all objects within the
## array must be valid
json_strict = true
## Query is a GJSON path that specifies a specific chunk of JSON to be
## parsed, if not specified the whole document will be parsed.
##
## GJSON query paths are described here:
## https://github.com/tidwall/gjson/tree/v1.3.0#path-syntax
json_query = ""
## Tag keys is an array of keys that should be added as tags. Matching keys
## are no longer saved as fields. Supports wildcard glob matching.
tag_keys = [
"my_tag_1",
"my_tag_2",
"tags_*",
"tag*"
]
  ## Array of glob patterns for keys whose string or boolean values should be added as string fields.
json_string_fields = []
## Name key is the key to use as the measurement name.
json_name_key = ""
## Time key is the key containing the time that should be used to create the
## metric.
json_time_key = ""
## Time format is the time layout that should be used to interpret the json_time_key.
## The time must be `unix`, `unix_ms`, `unix_us`, `unix_ns`, or a time in the
## "reference time". To define a different format, arrange the values from
## the "reference time" in the example to match the format you will be
## using. For more information on the "reference time", visit
## https://golang.org/pkg/time/#Time.Format
## ex: json_time_format = "Mon Jan 2 15:04:05 -0700 MST 2006"
## json_time_format = "2006-01-02T15:04:05Z07:00"
## json_time_format = "01/02/2006 15:04:05"
## json_time_format = "unix"
## json_time_format = "unix_ms"
json_time_format = ""
## Timezone allows you to provide an override for timestamps that
## don't already include an offset
## e.g. 04/06/2016 12:41:45
##
## Default: "" which renders UTC
## Options are as follows:
## 1. Local -- interpret based on machine localtime
## 2. "America/New_York" -- Unix TZ values like those found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
## 3. UTC -- or blank/unspecified, will return timestamp in UTC
json_timezone = ""
```
### json_query
The `json_query` option is a [GJSON][gjson] path that can be used to transform the
JSON document before parsing. The query is performed before any other
options are applied, and the resulting document is parsed instead of the
original one. Consequently, the result of the query should be a JSON object or
an array of objects.
Consult the GJSON [path syntax][gjson syntax] for details and examples, and
consider using the [GJSON playground][gjson playground] for developing and
debugging your query.
### json_time_key, json_time_format, json_timezone
By default, the current time is used for all created metrics. To set the
time from the JSON document, use the `json_time_key` and
`json_time_format` options together to set the time to a value in the parsed
document.
The `json_time_key` option specifies the key containing the time value and
`json_time_format` must be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or
the Go "reference time" which is defined to be the specific time:
`Mon Jan 2 15:04:05 MST 2006`.
Consult the Go [time][time parse] package for details and additional examples
on how to set the time format.
When parsing times that don't include a timezone specifier, times are assumed to
be UTC. To default to another timezone, or to local time, specify the
`json_timezone` option. This option should be set to a [Unix TZ
value](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones), such as
`America/New_York`, to `Local` to utilize the system timezone, or to `UTC`.
## Examples
### Basic Parsing
Config:
```toml
[[inputs.file]]
files = ["example"]
name_override = "myjsonmetric"
data_format = "json"
```
Input:
```json
{
"a": 5,
"b": {
"c": 6
},
"ignored": "I'm a string"
}
```
Output:
```text
myjsonmetric a=5,b_c=6
```
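The underscore-joined flattening shown above can be sketched recursively (illustrative; the real parser also applies `tag_keys`, `json_string_fields`, and the other options):

```python
def flatten(obj, prefix=""):
    """Flatten nested objects into float fields with '_'-joined keys;
    non-numeric leaves are dropped, as the legacy parser does by default."""
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            fields.update(flatten(value, name + "_"))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            fields[name] = float(value)
    return fields

print(flatten({"a": 5, "b": {"c": 6}, "ignored": "I'm a string"}))
```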
### Name, Tags, and String Fields
Config:
```toml
[[inputs.file]]
files = ["example"]
json_name_key = "name"
tag_keys = ["my_tag_1"]
json_string_fields = ["b_my_field"]
data_format = "json"
```
Input:
```json
{
"a": 5,
"b": {
"c": 6,
"my_field": "description"
},
"my_tag_1": "foo",
"name": "my_json"
}
```
Output:
```text
my_json,my_tag_1=foo a=5,b_c=6,b_my_field="description"
```
### Arrays
If the JSON data is an array, then each object within the array is parsed with
the configured settings.
Config:
```toml
[[inputs.file]]
files = ["example"]
data_format = "json"
json_time_key = "b_time"
json_time_format = "02 Jan 06 15:04 MST"
```
Input:
```json
[
{
"a": 5,
"b": {
"c": 6,
"time":"04 Jan 06 15:04 MST"
}
},
{
"a": 7,
"b": {
"c": 8,
"time":"11 Jan 07 15:04 MST"
}
}
]
```
Output:
```text
file a=5,b_c=6 1136387040000000000
file a=7,b_c=8 1168527840000000000
```
### Query
The `json_query` option can be used to parse a subset of the document.
Config:
```toml
[[inputs.file]]
files = ["example"]
data_format = "json"
tag_keys = ["first"]
json_string_fields = ["last"]
json_query = "obj.friends"
```
Input:
```json
{
"obj": {
"name": {"first": "Tom", "last": "Anderson"},
"age":37,
"children": ["Sara","Alex","Jack"],
"fav.movie": "Deer Hunter",
"friends": [
{"first": "Dale", "last": "Murphy", "age": 44},
{"first": "Roger", "last": "Craig", "age": 68},
{"first": "Jane", "last": "Murphy", "age": 47}
]
}
}
```
Output:
```text
file,first=Dale last="Murphy",age=44
file,first=Roger last="Craig",age=68
file,first=Jane last="Murphy",age=47
```
[gjson]: https://github.com/tidwall/gjson
[gjson syntax]: https://github.com/tidwall/gjson#path-syntax
[gjson playground]: https://gjson.dev/
[json]: https://www.json.org/
[time parse]: https://golang.org/pkg/time/#Parse
---
title: JSON v2 input data format
list_title: JSON v2
description: Use the `json_v2` input data format to parse [JSON][json] objects, or an array of objects, into Telegraf metrics.
menu:
telegraf_v1_ref:
name: JSON v2
weight: 10
parent: Input data formats
---
Use the `json_v2` input data format to parse a [JSON][json] object or an array of objects into Telegraf metrics.
The parser supports [GJSON Path Syntax](https://github.com/tidwall/gjson/blob/v1.7.5/SYNTAX.md) for querying JSON.
To test your GJSON path, use [GJSON Playground](https://gjson.dev/).
You can find multiple examples [here](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/json_v2/testdata) in the Telegraf repository.
<!--
is this still true?
{{% note %}}
All JSON numbers are converted to float fields. JSON String are
ignored unless specified in the `tag_key` or `json_string_fields` options.
{{% /note %}}
-->
## Configuration
Configure this parser by describing the metric you want by defining the fields and tags from the input.
The configuration is divided into config sub-tables called `field`, `tag`, and `object`.
In the example below you can see all the possible configuration keys you can define for each config table.
In the sections that follow these configuration keys are defined in more detail.
```toml
[[inputs.file]]
files = []
data_format = "json_v2"
[[inputs.file.json_v2]]
measurement_name = "" # A string that will become the new measurement name
measurement_name_path = "" # A string with valid GJSON path syntax, will override measurement_name
timestamp_path = "" # A string with valid GJSON path syntax to a valid timestamp (single value)
timestamp_format = "" # A string with a valid timestamp format (see below for possible values)
timestamp_timezone = "" # A string with a valid timezone (see below for possible values)
[[inputs.file.json_v2.field]]
path = "" # A string with valid GJSON path syntax
rename = "new name" # A string with a new name for the tag key
type = "int" # A string specifying the type (int,uint,float,string,bool)
optional = false # true: suppress errors if configured path does not exist
[[inputs.file.json_v2.tag]]
path = "" # A string with valid GJSON path syntax
rename = "new name" # A string with a new name for the tag key
type = "float" # A string specifying the type (int,uint,float,string,bool)
optional = false # true: suppress errors if configured path does not exist
[[inputs.file.json_v2.object]]
path = "" # A string with valid GJSON path syntax
timestamp_key = "" # A JSON key (for a nested key, prepend the parent keys with underscores) to a valid timestamp
timestamp_format = "" # A string with a valid timestamp format (see below for possible values)
timestamp_timezone = "" # A string with a valid timezone (see below for possible values)
disable_prepend_keys = false # true: don't prepend parent keys to nested key names
included_keys = [] # List of JSON keys (for a nested key, prepend the parent keys with underscores) that should be only included in result
excluded_keys = [] # List of JSON keys (for a nested key, prepend the parent keys with underscores) that shouldn't be included in result
tags = [] # List of JSON keys (for a nested key, prepend the parent keys with underscores) to be a tag instead of a field
optional = false # true: suppress errors if configured path does not exist
[inputs.file.json_v2.object.renames] # A map of JSON keys (for a nested key, prepend the parent keys with underscores) with a new name for the tag key
key = "new name"
[inputs.file.json_v2.object.fields] # A map of JSON keys (for a nested key, prepend the parent keys with underscores) with a type (int,uint,float,string,bool)
key = "int"
```
### Root configuration options
* **measurement_name (OPTIONAL)**: Will set the measurement name to the provided string.
* **measurement_name_path (OPTIONAL)**: You can define a query with [GJSON Path Syntax](https://github.com/tidwall/gjson/blob/v1.7.5/SYNTAX.md) to set a measurement name from the JSON input.
The query must return a single data value or it will use the default measurement name.
This takes precedence over `measurement_name`.
* **timestamp_path (OPTIONAL)**: You can define a query with [GJSON Path Syntax](https://github.com/tidwall/gjson/blob/v1.7.5/SYNTAX.md) to set a timestamp from the JSON input.
The query must return a single data value or it will default to the current time.
* **timestamp_format (OPTIONAL, but REQUIRED when timestamp_path is defined)**: Must be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or
the Go "reference time" which is defined to be the specific time:
`Mon Jan 2 15:04:05 MST 2006`
* **timestamp_timezone (OPTIONAL, but REQUIRES timestamp_path)**: This option should be set to a
[Unix TZ value](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones),
such as `America/New_York`, to `Local` to utilize the system timezone, or to `UTC`.
Defaults to `UTC`.
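As a sketch, assuming the input document has hypothetical `name` and `ts` keys at its root, the root options could be combined like this:

```toml
[[inputs.file]]
  files = ["example.json"]
  data_format = "json_v2"
  [[inputs.file.json_v2]]
    measurement_name_path = "name"   # "name" is a hypothetical key; overrides measurement_name
    timestamp_path = "ts"            # "ts" is a hypothetical key
    timestamp_format = "unix_ms"     # required because timestamp_path is set
    timestamp_timezone = "UTC"
```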
## Arrays and Objects
The following describes the high-level approach when parsing arrays and objects:
- **Array**: Every element in an array is treated as a *separate* metric
- **Object**: Every key-value pair in an object is added to a *single* metric
When handling nested arrays and objects, the rules above continue to apply as the parser creates metrics.
When an object has multiple arrays as values,
the arrays will become separate metrics containing only non-array values from the object.
Below you can see an example of this behavior,
with an input JSON containing an array of book objects that has a nested array of characters.
**Example JSON:**
```json
{
"book": {
"title": "The Lord Of The Rings",
"chapters": [
"A Long-expected Party",
"The Shadow of the Past"
],
"author": "Tolkien",
"characters": [
{
"name": "Bilbo",
"species": "hobbit"
},
{
"name": "Frodo",
"species": "hobbit"
}
],
"random": [
1,
2
]
}
}
```
**Example configuration:**
```toml
[[inputs.file]]
files = ["./testdata/multiple_arrays_in_object/input.json"]
data_format = "json_v2"
[[inputs.file.json_v2]]
[[inputs.file.json_v2.object]]
path = "book"
tags = ["title"]
disable_prepend_keys = true
```
**Expected metrics:**
```
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",chapters="A Long-expected Party"
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",chapters="The Shadow of the Past"
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",name="Bilbo",species="hobbit"
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",name="Frodo",species="hobbit"
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",random=1
file,title=The\ Lord\ Of\ The\ Rings author="Tolkien",random=2
```
You can find more complicated examples in the [`testdata`][testdata] folder in the Telegraf repo.
## Types
For each field, you can optionally define the type of the resulting metric value.
The following rules are in place for this configuration:
* If a type is explicitly defined, the parser will enforce this type and convert the data to the defined type if possible.
If the type can't be converted then the parser will fail.
* If a type isn't defined, the parser will use the default type defined in the JSON (int, float, string).
The type values you can set:
* `int`: bools, floats, or strings (containing valid numbers) can be converted to an int.
* `uint`: bools, floats, or strings (containing valid numbers) can be converted to a uint.
* `string`: any data can be formatted as a string.
* `float`: strings (containing valid numbers) or integers can be converted to a float.
* `bool`: the strings "true" or "false" (regardless of capitalization), or the integers `0` or `1`, can be converted to a bool.
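For example, to force a hypothetical `status_code` key to an integer even when the document encodes it as a string such as `"200"`, set an explicit type on the field (a sketch, not tied to a specific input):

```toml
[[inputs.file.json_v2.field]]
  path = "status_code"  # hypothetical key; a string value like "200" converts to an int
  type = "int"
```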
[json]: https://www.json.org/
[testdata]: https://github.com/influxdata/telegraf/tree/master/plugins/parsers/json_v2/testdata
---
title: Logfmt input data format
list_title: Logfmt
description: Use the `logfmt` input data format to parse logfmt data into Telegraf metrics.
menu:
telegraf_v1_ref:
name: logfmt
weight: 10
parent: Input data formats
---
Use the `logfmt` data format to parse [logfmt] data into Telegraf metrics.
[logfmt]: https://brandur.org/logfmt
## Configuration
```toml
[[inputs.file]]
files = ["example"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "logfmt"
## Array of key names which should be collected as tags. Globs accepted.
logfmt_tag_keys = ["method","host"]
```
## Metrics
Each key/value pair in the line is added to a new metric as a field. The type
of the field is automatically determined based on the contents of the value.
## Examples
```text
- method=GET host=example.org ts=2018-07-24T19:43:40.275Z connect=4ms service=8ms status=200 bytes=1653
+ logfmt,host=example.org,method=GET ts="2018-07-24T19:43:40.275Z",connect="4ms",service="8ms",status=200i,bytes=1653i
```
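The type-inference behavior above can be sketched in a few lines. This is not Telegraf's implementation, just a minimal illustration of how each value is tried as an int, then a float, then a bool, and otherwise kept as a string (quoted values containing spaces are not handled in this sketch):

```python
def parse_logfmt(line):
    """Split a logfmt line into typed fields (simplified sketch)."""
    fields = {}
    for pair in line.split():
        if "=" not in pair:
            continue  # bare keys carry no value; skip them in this sketch
        key, _, raw = pair.partition("=")
        raw = raw.strip('"')
        # Try the numeric types first, then booleans, then fall back to string.
        for cast in (int, float):
            try:
                fields[key] = cast(raw)
                break
            except ValueError:
                pass
        else:
            if raw.lower() in ("true", "false"):
                fields[key] = raw.lower() == "true"
            else:
                fields[key] = raw
    return fields
```

Applied to the example line above, `status=200` and `bytes=1653` become integers while `connect=4ms` stays a string, matching the `status=200i,bytes=1653i` output shown.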
---
title: Nagios input data format
list_title: Nagios
description: Use the `nagios` input data format to parse the output of Nagios plugins into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Nagios
weight: 10
parent: Input data formats
---
Use the `nagios` input data format to parse the output of
[Nagios plugins](https://www.nagios.org/downloads/nagios-plugins/) into
Telegraf metrics.
## Configuration
```toml
[[inputs.exec]]
## Commands array
commands = ["/usr/lib/nagios/plugins/check_load -w 5,6,7 -c 7,8,9"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "nagios"
```
---
title: OpenTSDB Telnet "PUT" API input data format
list_title: OpenTSDB Telnet PUT API
description:
Use the `opentsdb` data format to parse OpenTSDB Telnet `PUT` API data into Telegraf metrics.
menu:
telegraf_v1_ref:
name: OpenTSDB
weight: 10
parent: Input data formats
metadata: []
---
Use the `opentsdb` data format to parse [OpenTSDB Telnet `PUT` API](http://opentsdb.net/docs/build/html/api_telnet/put.html) data into
Telegraf metrics. There are no additional configuration options for OpenTSDB.
For more detail on the format, see:
- [OpenTSDB Telnet "PUT" API guide](http://opentsdb.net/docs/build/html/api_telnet/put.html)
- [OpenTSDB data specification](http://opentsdb.net/docs/build/html/user_guide/writing/index.html#data-specification)
## Configuration
```toml
[[inputs.file]]
files = ["example"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "opentsdb"
```
## Example
```opentsdb
put sys.cpu.user 1356998400 42.5 host=webserver01 cpu=0
```
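Based on the OpenTSDB data model (metric name, timestamp, value, and tags), the line above would parse into a metric roughly like the following, with the value stored in a `value` field:

```text
sys.cpu.user,cpu=0,host=webserver01 value=42.5 1356998400000000000
```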
---
title: Prometheus Remote Write input data format
list_title: Prometheus Remote Write
description:
Use the `prometheusremotewrite` input data format to parse Prometheus Remote Write samples into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Prometheus Remote Write
weight: 10
parent: Input data formats
---
Use the `prometheusremotewrite` input data format to parse [Prometheus Remote Write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) samples into Telegraf metrics.
{{% note %}}
If you are using InfluxDB 1.x and the [Prometheus Remote Write endpoint](https://github.com/influxdata/telegraf/blob/master/plugins/parsers/prometheusremotewrite/README.md)
to write metrics, you can migrate to InfluxDB 2.0 and use this parser.
For the metrics to completely align with the 1.x endpoint, add a Starlark processor as described [here](https://github.com/influxdata/telegraf/blob/master/plugins/processors/starlark/README.md).
{{% /note %}}
This parser converts Prometheus Remote Write samples directly into Telegraf
metrics. It can be used with the [http_listener_v2](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/http_listener_v2) input plugin. There are no
additional configuration options for Prometheus Remote Write samples.
## Configuration
```toml
[[inputs.http_listener_v2]]
## Address and port to host HTTP listener on
service_address = ":1234"
## Paths to listen to.
paths = ["/receive"]
## Data format to consume.
data_format = "prometheusremotewrite"
```
## Example Input
```go
prompb.WriteRequest{
Timeseries: []*prompb.TimeSeries{
{
Labels: []*prompb.Label{
{Name: "__name__", Value: "go_gc_duration_seconds"},
{Name: "instance", Value: "localhost:9090"},
{Name: "job", Value: "prometheus"},
{Name: "quantile", Value: "0.99"},
},
Samples: []prompb.Sample{
{Value: 4.63, Timestamp: time.Date(2020, 4, 1, 0, 0, 0, 0, time.UTC).UnixNano()},
},
},
},
}
```
## Example Output
```text
prometheus_remote_write,instance=localhost:9090,job=prometheus,quantile=0.99 go_gc_duration_seconds=4.63 1614889298859000000
```
## For alignment with the [InfluxDB v1.x Prometheus Remote Write Spec](/influxdb/v1/supported_protocols/prometheus/#how-prometheus-metrics-are-parsed-in-influxdb)
- Use the [Starlark processor rename prometheus remote write script](https://github.com/influxdata/telegraf/blob/master/plugins/processors/starlark/testdata/rename_prometheus_remote_write.star) to rename the measurement to the field name and rename the field to `value`.
---
title: Value input data format
list_title: Value
description: Use the `value` input data format to parse single values into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Value
weight: 10
parent: Input data formats
---
Use the `value` input data format to parse single values into Telegraf metrics.
## Configuration
Specify the measurement name and a field to use as the parsed metric.
> To specify the measurement name for your metric, set `name_override`; otherwise, the input plugin name (for example, "exec") is used as the measurement name.
You **must** tell Telegraf what type of metric to collect by using the
`data_type` configuration option. Available data type options are:
1. integer
2. float or long
3. string
4. boolean
```toml
[[inputs.exec]]
## Commands array
commands = ["cat /proc/sys/kernel/random/entropy_avail"]
## override the default metric name of "exec"
name_override = "entropy_available"
## override the field name of "value"
# value_field_name = "value"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "value"
data_type = "integer" # required
```
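With this configuration, if the command prints a single integer (for example, `3421`), the parser would produce a metric along these lines, using the overridden measurement name and the default `value` field:

```text
entropy_available value=3421i
```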
---
title: Wavefront input data format
list_title: Wavefront
description: Use the `wavefront` input data format to parse Wavefront data into Telegraf metrics.
menu:
telegraf_v1_ref:
name: Wavefront
weight: 10
parent: Input data formats
---
Use the `wavefront` input data format to parse Wavefront data into Telegraf metrics.
For more information on the Wavefront native data format, see
[Wavefront Data Format](https://docs.wavefront.com/wavefront_data_format.html) in the Wavefront documentation.
## Configuration
There are no additional configuration options for Wavefront Data Format line-protocol.
```toml
[[inputs.file]]
files = ["example"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "wavefront"
```
---
title: XML input data format
list_title: XML
description: Use the `xml` input data format to parse XML data into Telegraf metrics.
menu:
telegraf_v1_ref:
name: XML
weight: 10
parent: Input data formats
metadata: [XPath parser plugin]
---
Use the `xml` input data format, provided by the [XPath parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/xpath), with XPath expressions to parse XML data into Telegraf metrics.
## Configuration
```toml
[[inputs.file]]
files = ["example.xml"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "xml"
## Multiple parsing sections are allowed
[[inputs.file.xml]]
## Optional: XPath-query to select a subset of nodes from the XML document.
#metric_selection = "/Bus/child::Sensor"
## Optional: XPath-query to set the metric (measurement) name.
#metric_name = "string('example')"
## Optional: Query to extract metric timestamp.
## If not specified the time of execution is used.
#timestamp = "/Gateway/Timestamp"
## Optional: Format of the timestamp determined by the query above.
## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
## time format. If not specified, a "unix" timestamp (in seconds) is expected.
#timestamp_format = "2006-01-02T15:04:05Z"
## Tag definitions using the given XPath queries.
[inputs.file.xml.tags]
name = "substring-after(Sensor/@name, ' ')"
device = "string('the ultimate sensor')"
## Integer field definitions using XPath queries.
[inputs.file.xml.fields_int]
consumers = "Variable/@consumers"
## Non-integer field definitions using XPath queries.
## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string.
[inputs.file.xml.fields]
temperature = "number(Variable/@temperature)"
power = "number(Variable/@power)"
frequency = "number(Variable/@frequency)"
ok = "Mode != 'ok'"
```
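To make the queries above concrete, here is a hypothetical input document they could be applied to; the element and attribute names are assumptions chosen to match the example queries:

```xml
<Bus>
  <Sensor name="Room SensorA">
    <Variable consumers="3" temperature="20.0" power="123.4" frequency="49.9"/>
    <Mode>ok</Mode>
  </Sensor>
</Bus>
```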
---
title: XPath JSON input data format
list_title: XPath JSON
description:
Use the `xpath_json` input data format and XPath expressions to parse JSON into Telegraf metrics.
menu:
telegraf_v1_ref:
name: XPath JSON
weight: 10
parent: Input data formats
metadata: [XPath parser plugin]
---
Use the `xpath_json` input data format, provided by the [XPath parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/xpath), with [XPath][xpath] expressions to parse JSON data into Telegraf metrics.
For information about supported XPath functions, see [the underlying XPath library][xpath lib].
**NOTE:** The types of fields are specified using [XPath functions][xpath
lib]. The only exception is _integer_ fields, which need to be specified in a
`fields_int` section.
## Supported data formats
| name | `data_format` setting | comment |
| --------------------------------------- | --------------------- | ------- |
| [Extensible Markup Language (XML)][xml] | `"xml"` | |
| [JSON][json] | `"xpath_json"` | |
| [MessagePack][msgpack] | `"xpath_msgpack"` | |
| [Protocol-buffers][protobuf] | `"xpath_protobuf"` | [see additional parameters](#protocol-buffers-additional-settings)|
### Protocol-buffers additional settings
To use the protocol-buffer format, you need to specify additional
(_mandatory_) properties for the parser. Those options are described below.
#### `xpath_protobuf_file` (mandatory)
Use this option to specify the name of the protocol-buffer definition file
(`.proto`).
#### `xpath_protobuf_type` (mandatory)
This option contains the top-level message type to use for deserializing the
data to be parsed. Usually, this is constructed from the `package` name in the
protocol-buffer definition file and the `message` name as `<package
name>.<message name>`.
#### `xpath_protobuf_import_paths` (optional)
In case you import other protocol-buffer definitions within your `.proto` file
(i.e. you use the `import` statement) you can use this option to specify paths
to search for the imported definition file(s). By default, imports are only
searched for in `.`, the current working directory (usually the directory
from which you start Telegraf).
Imagine you have multiple protocol-buffer definitions (e.g. `A.proto`,
`B.proto`, and `C.proto`) in a directory (e.g. `/data/my_proto_files`) where your
top-level file (e.g. `A.proto`) imports at least one other definition:
```protobuf
syntax = "proto3";
package foo;
import "B.proto";
message Measurement {
...
}
```
In this case, use the following settings:
```toml
[[inputs.file]]
files = ["example.dat"]
data_format = "xpath_protobuf"
xpath_protobuf_file = "A.proto"
xpath_protobuf_type = "foo.Measurement"
xpath_protobuf_import_paths = [".", "/data/my_proto_files"]
...
```
#### `xpath_protobuf_skip_bytes` (optional)
This option allows skipping a number of bytes before the parser attempts to
parse the protocol-buffer message. This is useful when the raw data has a
header, e.g. containing the message length, or in the case of GRPC messages.
This is a list of known headers and the corresponding values for
`xpath_protobuf_skip_bytes`:
| name | setting | comment |
| --------------------------------------- | ------- | ------- |
| [GRPC protocol][GRPC] | 5 | GRPC adds a 5-byte header for _Length-Prefixed-Messages_ |
| [PowerDNS logging][PDNS] | 2 | Sent messages contain a 2-byte header containing the message length |
[GRPC]: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
[PDNS]: https://docs.powerdns.com/recursor/lua-config/protobuf.html
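For instance, a sketch of consuming length-prefixed GRPC payloads (the file name and message type below are assumptions carried over from the earlier example) skips the 5-byte header:

```toml
[[inputs.file]]
  files = ["example.dat"]
  data_format = "xpath_protobuf"
  xpath_protobuf_file = "A.proto"          # hypothetical definition file
  xpath_protobuf_type = "foo.Measurement"  # hypothetical message type
  ## GRPC Length-Prefixed-Message framing adds a 5-byte header.
  xpath_protobuf_skip_bytes = 5
```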
## Configuration
```toml
[[inputs.file]]
files = ["example.xml"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "xml"
## PROTOCOL-BUFFER definitions
## Protocol-buffer definition file
# xpath_protobuf_file = "sparkplug_b.proto"
## Name of the protocol-buffer message type to use in a fully qualified form.
# xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
## List of paths to use when looking up imported protocol-buffer definition files.
# xpath_protobuf_import_paths = ["."]
## Number of (header) bytes to ignore before parsing the message.
# xpath_protobuf_skip_bytes = 0
## Print the internal XML document when in debug logging mode.
## This is especially useful when using the parser with non-XML formats like protocol-buffers
## to get an idea on the expression necessary to derive fields etc.
# xpath_print_document = false
## Allow the results of one of the parsing sections to be empty.
## Useful when not all selected files have the exact same structure.
# xpath_allow_empty_selection = false
## Get native data-types for all data-format that contain type information.
## Currently, protobuf, msgpack and JSON support native data-types
# xpath_native_types = false
## Multiple parsing sections are allowed
[[inputs.file.xpath]]
## Optional: XPath-query to select a subset of nodes from the XML document.
# metric_selection = "/Bus/child::Sensor"
## Optional: XPath-query to set the metric (measurement) name.
# metric_name = "string('example')"
## Optional: Query to extract metric timestamp.
## If not specified the time of execution is used.
# timestamp = "/Gateway/Timestamp"
## Optional: Format of the timestamp determined by the query above.
## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
## time format. If not specified, a "unix" timestamp (in seconds) is expected.
# timestamp_format = "2006-01-02T15:04:05Z"
## Optional: Timezone of the parsed time
## This will locate the parsed time to the given timezone. Please note that
## for times with timezone-offsets (e.g. RFC3339) the timestamp is unchanged.
## This is ignored for all (unix) timestamp formats.
# timezone = "UTC"
## Optional: List of fields to convert to hex-strings if they are
## containing byte-arrays. This might be the case for e.g. protocol-buffer
## messages encoding data as byte-arrays. Wildcard patterns are allowed.
## By default, all byte-array-fields are converted to string.
# fields_bytes_as_hex = []
## Tag definitions using the given XPath queries.
[inputs.file.xpath.tags]
name = "substring-after(Sensor/@name, ' ')"
device = "string('the ultimate sensor')"
## Integer field definitions using XPath queries.
[inputs.file.xpath.fields_int]
consumers = "Variable/@consumers"
## Non-integer field definitions using XPath queries.
## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string.
[inputs.file.xpath.fields]
temperature = "number(Variable/@temperature)"
power = "number(Variable/@power)"
frequency = "number(Variable/@frequency)"
ok = "Mode != 'ok'"
```
In this configuration mode, you explicitly specify the fields and tags you want
to scrape from your data.
A configuration can contain multiple _xpath_ subsections (for example, so the
file plugin processes the XML string multiple times). Consult the [XPath syntax][xpath] and
the [underlying library's functions][xpath lib] for details and help regarding
XPath queries. Consider using an XPath tester such as [xpather.com][xpather] or
[Code Beautify's XPath Tester][xpath tester] for help developing and debugging
your query.
## Configuration (batch)
As an alternative to the configuration above, fields can also be specified in
batch. Instead of specifying each field in its own section, you define a
`name` and a `value` selector used to determine the names and values of the
fields in the metric.
```toml
[[inputs.file]]
files = ["example.xml"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "xml"
## PROTOCOL-BUFFER definitions
## Protocol-buffer definition file
# xpath_protobuf_file = "sparkplug_b.proto"
## Name of the protocol-buffer message type to use in a fully qualified form.
# xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
## List of paths to use when looking up imported protocol-buffer definition files.
# xpath_protobuf_import_paths = ["."]
## Print the internal XML document when in debug logging mode.
## This is especially useful when using the parser with non-XML formats like protocol-buffers
## to get an idea on the expression necessary to derive fields etc.
# xpath_print_document = false
## Allow the results of one of the parsing sections to be empty.
## Useful when not all selected files have the exact same structure.
# xpath_allow_empty_selection = false
## Get native data-types for all data-format that contain type information.
## Currently, protobuf, msgpack and JSON support native data-types
# xpath_native_types = false
## Multiple parsing sections are allowed
[[inputs.file.xpath]]
## Optional: XPath-query to select a subset of nodes from the XML document.
metric_selection = "/Bus/child::Sensor"
## Optional: XPath-query to set the metric (measurement) name.
# metric_name = "string('example')"
## Optional: Query to extract metric timestamp.
## If not specified the time of execution is used.
# timestamp = "/Gateway/Timestamp"
## Optional: Format of the timestamp determined by the query above.
## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
## time format. If not specified, a "unix" timestamp (in seconds) is expected.
# timestamp_format = "2006-01-02T15:04:05Z"
## Field specifications using a selector.
field_selection = "child::*"
## Optional: Queries to specify field name and value.
## These options are only to be used in combination with 'field_selection'!
## By default the node name and node content is used if a field-selection
## is specified.
# field_name = "name()"
# field_value = "."
## Optional: Expand field names relative to the selected node
## This allows to flatten out nodes with non-unique names in the subtree
# field_name_expansion = false
## Tag specifications using a selector.
## tag_selection = "child::*"
## Optional: Queries to specify tag name and value.
## These options are only to be used in combination with 'tag_selection'!
## By default the node name and node content is used if a tag-selection
## is specified.
# tag_name = "name()"
# tag_value = "."
## Optional: Expand tag names relative to the selected node
## This allows to flatten out nodes with non-unique names in the subtree
# tag_name_expansion = false
## Tag definitions using the given XPath queries.
[inputs.file.xpath.tags]
name = "substring-after(Sensor/@name, ' ')"
device = "string('the ultimate sensor')"
```
**Please note**: The resulting fields are _always_ of type string.
It is also possible to specify a mixture of the two alternative ways of
specifying fields. In this case, _explicitly_ defined tags and fields take
_precedence_ over the batch instances if both use the same tag or field name.
### metric_selection (optional)
You can specify an [XPath][xpath] query to select a subset of nodes from the XML
document, each of which is used to generate a new metric with the specified
fields, tags, etc.
Relative queries in subsequent queries are relative to the
`metric_selection`. To specify absolute paths, start the query with a
slash (`/`).
Specifying `metric_selection` is optional. If not specified, all relative queries
are relative to the root node of the XML document.
### metric_name (optional)
By specifying `metric_name` you can override the metric/measurement name with
the result of the given [XPath][xpath] query. If not specified, the default
metric name is used.
### timestamp, timestamp_format, timezone (optional)
By default, the current time is used for all created metrics. To set the
time from values in the XML document, you can specify an [XPath][xpath] query in
`timestamp` and set the format in `timestamp_format`.
The `timestamp_format` can be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or
an accepted [Go "reference time"][time const]. Consult the Go [time][time parse]
package for details and additional examples on how to set the time format. If
`timestamp_format` is omitted `unix` format is assumed as result of the
`timestamp` query.
The `timezone` setting is used to locate the parsed time in the given
timezone. This is helpful for cases where the time does not contain timezone
information, e.g. `2023-03-09 14:04:40` and is not located in _UTC_, which is
the default setting. It is also possible to set the `timezone` to `Local`, which
uses the configured host timezone.
For time formats with timezone information, e.g. RFC3339, the resulting
timestamp is unchanged. The `timezone` setting is ignored for all `unix`
timestamp formats.
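As a sketch of how the three options combine (the timezone value here is an assumption, not from the examples on this page), for a document whose timestamp node holds a zone-less local time:

```toml
[[inputs.file.xpath]]
  ## Query returning the time value, e.g. "2023-03-09 14:04:40"
  timestamp = "/Gateway/Timestamp"
  ## Go reference-time layout matching the value above
  timestamp_format = "2006-01-02 15:04:05"
  ## Interpret the parsed, zone-less time in this timezone
  timezone = "Europe/Berlin"
```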
### tags sub-section
[XPath][xpath] queries in the `tag name = query` format to add tags to the
metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
__NOTE:__ Results of tag-queries will always be converted to strings.
### fields_int sub-section
[XPath][xpath] queries in the `field name = query` format to add integer typed
fields to the metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
__NOTE:__ Results of field_int-queries will always be converted to
__int64__. The conversion fails if the query result is not convertible.
### fields sub-section
[XPath][xpath] queries in the `field name = query` format to add non-integer
fields to the metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
The type of the field is specified in the [XPath][xpath] query using the type
conversion functions of XPath, such as `number()`, `boolean()`, or `string()`. If
no conversion is performed in the query, the field will be of type string.
__NOTE: Path conversion functions always succeed, even when converting text
to float.__
### field_selection, field_name, field_value (optional)
You can specify an [XPath][xpath] query to select a set of nodes forming the
fields of the metric. The specified path can be absolute (starting with `/`) or
relative to the currently selected node. Each node selected by `field_selection`
forms a new field within the metric.
The _name_ and the _value_ of each field can be specified using the optional
`field_name` and `field_value` queries. The queries are relative to the selected
field if not starting with `/`. If not specified, the field's _name_ defaults to
the node name and the field's _value_ defaults to the content of the selected
field node.
__NOTE__: `field_name` and `field_value` queries are only evaluated if a
`field_selection` is specified.
Specifying `field_selection` is optional. It is an alternative way to specify
fields, especially for documents where the node names are not known a priori or
where a large number of fields must be specified. These options can also be
combined with the field specifications above.
__NOTE: Path conversion functions always succeed, even when converting text
to float.__
### field_name_expansion (optional)
When _true_, field names selected with `field_selection` are expanded to a
_path_ relative to the _selected node_. This is necessary if, for example, you
select all leaf nodes as fields and those leaf nodes do not have unique names;
set this to `true` whenever the fields you select contain duplicate names.
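A hedged sketch of that situation (the leaf-node selector below is an assumption, not taken from the examples on this page): select every element leaf node as a field and enable expansion so identically named leaves get distinct, path-derived names instead of overwriting each other.

```toml
[[inputs.file.xpath]]
  metric_selection = "/Bus/child::Sensor"
  ## Select all element leaf nodes below the sensor as fields.
  field_selection = "descendant::*[not(*)]"
  ## Expand field names to paths relative to the selected node
  ## so duplicate leaf-node names stay distinguishable.
  field_name_expansion = true
```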
### tag_selection, tag_name, tag_value (optional)
You can specify an [XPath][xpath] query to select a set of nodes forming the tags
of the metric. The specified path can be absolute (starting with `/`) or
relative to the currently selected node. Each node selected by `tag_selection`
forms a new tag within the metric.
The _name_ and the _value_ of each tag can be specified using the optional
`tag_name` and `tag_value` queries. The queries are relative to the selected tag
if not starting with `/`. If not specified, the tag's _name_ defaults to the node
name and the tag's _value_ defaults to the content of the selected tag node.
__NOTE__: `tag_name` and `tag_value` queries are only evaluated if a
`tag_selection` is specified.
Specifying `tag_selection` is optional. It is an alternative way to specify
tags, especially for documents where the node names are not known a priori or
where a large number of tags must be specified. These options can also be
combined with the tag specifications above.
### tag_name_expansion (optional)
When _true_, tag names selected with `tag_selection` are expanded to a _path_
relative to the _selected node_. This is necessary if, for example, you select
all leaf nodes as tags and those leaf nodes do not have unique names; set this
to `true` whenever the tags you select contain duplicate names.
## Examples
This `example.xml` file is used in the configuration examples below:
```xml
<?xml version="1.0"?>
<Gateway>
<Name>Main Gateway</Name>
<Timestamp>2020-08-01T15:04:03Z</Timestamp>
<Sequence>12</Sequence>
<Status>ok</Status>
</Gateway>
<Bus>
<Sensor name="Sensor Facility A">
<Variable temperature="20.0"/>
<Variable power="123.4"/>
<Variable frequency="49.78"/>
<Variable consumers="3"/>
<Mode>busy</Mode>
</Sensor>
<Sensor name="Sensor Facility B">
<Variable temperature="23.1"/>
<Variable power="14.3"/>
<Variable frequency="49.78"/>
<Variable consumers="1"/>
<Mode>standby</Mode>
</Sensor>
<Sensor name="Sensor Facility C">
<Variable temperature="19.7"/>
<Variable power="0.02"/>
<Variable frequency="49.78"/>
<Variable consumers="0"/>
<Mode>error</Mode>
</Sensor>
</Bus>
```
### Basic Parsing
This example shows the basic usage of the xml parser.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
[inputs.file.xpath.tags]
gateway = "substring-before(/Gateway/Name, ' ')"
[inputs.file.xpath.fields_int]
seqnr = "/Gateway/Sequence"
[inputs.file.xpath.fields]
ok = "/Gateway/Status = 'ok'"
```
Output:
```text
file,gateway=Main,host=Hugin seqnr=12i,ok=true 1598610830000000000
```
In the _tags_ definition the XPath function `substring-before()` is used to only
extract the sub-string before the space. To get the integer value of
`/Gateway/Sequence` we have to use the _fields_int_ section as there is no XPath
expression to convert node values to integers (only float).
The `ok` field is filled with a boolean by specifying a query comparing the
query result of `/Gateway/Status` with the string _ok_. Use the type conversions
available in the XPath syntax to specify field types.
### Time and metric names
This is an example for using time and name of the metric from the XML document
itself.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_name = "name(/Gateway/Status)"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
[inputs.file.xpath.tags]
gateway = "substring-before(/Gateway/Name, ' ')"
[inputs.file.xpath.fields]
ok = "/Gateway/Status = 'ok'"
```
Output:
```text
Status,gateway=Main,host=Hugin ok=true 1596294243000000000
```
In addition to the basic parsing example, the metric name is defined as the
name of the `/Gateway/Status` node and the timestamp is derived from the XML
document instead of using the execution time.
### Multi-node selection
For XML documents containing metrics for e.g. multiple devices (like `Sensor`s
in the _example.xml_), multiple metrics can be generated using node
selection. This example shows how to generate a metric for each _Sensor_ in the
example.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_selection = "/Bus/child::Sensor"
metric_name = "string('sensors')"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
[inputs.file.xpath.tags]
name = "substring-after(@name, ' ')"
[inputs.file.xpath.fields_int]
consumers = "Variable/@consumers"
[inputs.file.xpath.fields]
temperature = "number(Variable/@temperature)"
power = "number(Variable/@power)"
frequency = "number(Variable/@frequency)"
ok = "Mode != 'error'"
```
Output:
```text
sensors,host=Hugin,name=Facility\ A consumers=3i,frequency=49.78,ok=true,power=123.4,temperature=20 1596294243000000000
sensors,host=Hugin,name=Facility\ B consumers=1i,frequency=49.78,ok=true,power=14.3,temperature=23.1 1596294243000000000
sensors,host=Hugin,name=Facility\ C consumers=0i,frequency=49.78,ok=false,power=0.02,temperature=19.7 1596294243000000000
```
Using the `metric_selection` option we select all `Sensor` nodes in the XML
document. Please note that all field and tag definitions are relative to these
selected nodes. An exception is the timestamp definition, which is relative to
the root node of the XML document.
### Batch field processing with multi-node selection
For XML documents containing metrics with a large number of fields or where the
fields are not known before (e.g. an unknown set of `Variable` nodes in the
_example.xml_), field selectors can be used. This example shows how to generate
a metric for each _Sensor_ in the example with fields derived from the
_Variable_ nodes.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_selection = "/Bus/child::Sensor"
metric_name = "string('sensors')"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
field_selection = "child::Variable"
field_name = "name(@*[1])"
field_value = "number(@*[1])"
[inputs.file.xpath.tags]
name = "substring-after(@name, ' ')"
```
Output:
```text
sensors,host=Hugin,name=Facility\ A consumers=3,frequency=49.78,power=123.4,temperature=20 1596294243000000000
sensors,host=Hugin,name=Facility\ B consumers=1,frequency=49.78,power=14.3,temperature=23.1 1596294243000000000
sensors,host=Hugin,name=Facility\ C consumers=0,frequency=49.78,power=0.02,temperature=19.7 1596294243000000000
```
Using the `metric_selection` option we select all `Sensor` nodes in the XML
document. For each _Sensor_ we then use `field_selection` to select all child
nodes of the sensor as _field-nodes_. Please note that the field selection is
relative to the selected nodes. For each selected _field-node_ we use
`field_name` and `field_value` to determine the field's name and value,
respectively. The `field_name` query derives the name from the node's first
attribute, while `field_value` derives the value of the first attribute and
converts the result to a number.
[xpath lib]: https://github.com/antchfx/xpath
[json]: https://www.json.org/
[msgpack]: https://msgpack.org/
[protobuf]: https://developers.google.com/protocol-buffers
[xml]: https://www.w3.org/XML/
[xpath]: https://www.w3.org/TR/xpath/
[xpather]: http://xpather.com/
[xpath tester]: https://codebeautify.org/Xpath-Tester
[time const]: https://golang.org/pkg/time/#pkg-constants
[time parse]: https://golang.org/pkg/time/#Parse

---
title: XPath MessagePack input data format
list_title: XPath MessagePack
description:
Use the `xpath_msgpack` input data format and XPath expressions to parse MessagePack data into Telegraf metrics.
menu:
telegraf_v1_ref:
name: XPath MessagePack
weight: 10
parent: Input data formats
metadata: [XPath parser plugin]
---
Use the `xpath_msgpack` input data format, provided by the [XPath parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/xpath), with XPath expressions to parse MessagePack data into Telegraf metrics.
For information about supported XPath functions, see [the underlying XPath library][xpath lib].
**NOTE:** Field types are specified using [XPath functions][xpath lib]. The
only exceptions are _integer_ fields, which need to be specified in a
`fields_int` section.
## Supported data formats
| name | `data_format` setting | comment |
| --------------------------------------- | --------------------- | ------- |
| [Extensible Markup Language (XML)][xml] | `"xml"` | |
| [JSON][json] | `"xpath_json"` | |
| [MessagePack][msgpack] | `"xpath_msgpack"` | |
| [Protocol-buffers][protobuf] | `"xpath_protobuf"` | [see additional parameters](#protocol-buffers-additional-settings)|
### Protocol-buffers additional settings
To use the protocol-buffer format, you need to specify additional
(_mandatory_) properties for the parser. Those options are described below.
#### `xpath_protobuf_file` (mandatory)
Use this option to specify the name of the protocol-buffer definition file
(`.proto`).
#### `xpath_protobuf_type` (mandatory)
This option contains the top-level message type to use for deserializing the
data to be parsed. Usually, this is constructed from the `package` name in the
protocol-buffer definition file and the `message` name as
`<package name>.<message name>`.
#### `xpath_protobuf_import_paths` (optional)
If you import other protocol-buffer definitions within your `.proto` file
(that is, you use the `import` statement), you can use this option to specify
paths to search for the imported definition file(s). By default, imports are
only searched for in `.`, the current working directory, which is usually the
directory you are in when starting Telegraf.
Imagine you have multiple protocol-buffer definitions (e.g. `A.proto`,
`B.proto`, and `C.proto`) in a directory (e.g. `/data/my_proto_files`) where
your top-level file (e.g. `A.proto`) imports at least one other definition:
```protobuf
syntax = "proto3";
package foo;
import "B.proto";
message Measurement {
...
}
```
In this case, use the following setting:
```toml
[[inputs.file]]
files = ["example.dat"]
data_format = "xpath_protobuf"
xpath_protobuf_file = "A.proto"
xpath_protobuf_type = "foo.Measurement"
xpath_protobuf_import_paths = [".", "/data/my_proto_files"]
...
```
#### `xpath_protobuf_skip_bytes` (optional)
This option allows skipping a number of bytes before trying to parse the
protocol-buffer message. This is useful in cases where the raw data has a
header, e.g. for the message length, or in the case of gRPC messages.
The following is a list of known headers and the corresponding values for
`xpath_protobuf_skip_bytes`:
| name | setting | comment |
| --------------------------------------- | ------- | ------- |
| [GRPC protocol][GRPC] | 5 | GRPC adds a 5-byte header for _Length-Prefixed-Messages_ |
| [PowerDNS logging][PDNS] | 2 | Sent messages contain a 2-byte header containing the message length |
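For example, a sketch for gRPC-framed payloads (the file, definition, and type names below are hypothetical):

```toml
[[inputs.file]]
  files = ["payload.dat"]
  data_format = "xpath_protobuf"
  xpath_protobuf_file = "measurement.proto"
  xpath_protobuf_type = "foo.Measurement"
  ## Skip the 5-byte gRPC Length-Prefixed-Message header
  xpath_protobuf_skip_bytes = 5
```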
[GRPC]: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
[PDNS]: https://docs.powerdns.com/recursor/lua-config/protobuf.html
## Configuration
```toml
[[inputs.file]]
files = ["example.xml"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "xml"
## PROTOCOL-BUFFER definitions
## Protocol-buffer definition file
# xpath_protobuf_file = "sparkplug_b.proto"
## Name of the protocol-buffer message type to use in a fully qualified form.
# xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
## List of paths to use when looking up imported protocol-buffer definition files.
# xpath_protobuf_import_paths = ["."]
## Number of (header) bytes to ignore before parsing the message.
# xpath_protobuf_skip_bytes = 0
## Print the internal XML document when in debug logging mode.
## This is especially useful when using the parser with non-XML formats like protocol-buffers
## to get an idea on the expression necessary to derive fields etc.
# xpath_print_document = false
## Allow the results of one of the parsing sections to be empty.
## Useful when not all selected files have the exact same structure.
# xpath_allow_empty_selection = false
## Get native data-types for all data-format that contain type information.
## Currently, protobuf, msgpack and JSON support native data-types
# xpath_native_types = false
## Multiple parsing sections are allowed
[[inputs.file.xpath]]
## Optional: XPath-query to select a subset of nodes from the XML document.
# metric_selection = "/Bus/child::Sensor"
## Optional: XPath-query to set the metric (measurement) name.
# metric_name = "string('example')"
## Optional: Query to extract metric timestamp.
## If not specified the time of execution is used.
# timestamp = "/Gateway/Timestamp"
## Optional: Format of the timestamp determined by the query above.
## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
## time format. If not specified, a "unix" timestamp (in seconds) is expected.
# timestamp_format = "2006-01-02T15:04:05Z"
## Optional: Timezone of the parsed time
## This will locate the parsed time to the given timezone. Please note that
## for times with timezone-offsets (e.g. RFC3339) the timestamp is unchanged.
## This is ignored for all (unix) timestamp formats.
# timezone = "UTC"
## Optional: List of fields to convert to hex-strings if they are
## containing byte-arrays. This might be the case for e.g. protocol-buffer
## messages encoding data as byte-arrays. Wildcard patterns are allowed.
## By default, all byte-array-fields are converted to string.
# fields_bytes_as_hex = []
## Tag definitions using the given XPath queries.
[inputs.file.xpath.tags]
name = "substring-after(Sensor/@name, ' ')"
device = "string('the ultimate sensor')"
## Integer field definitions using XPath queries.
[inputs.file.xpath.fields_int]
consumers = "Variable/@consumers"
## Non-integer field definitions using XPath queries.
## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string.
[inputs.file.xpath.fields]
temperature = "number(Variable/@temperature)"
power = "number(Variable/@power)"
frequency = "number(Variable/@frequency)"
ok = "Mode != 'ok'"
```
In this configuration mode, you explicitly specify the field and tags you want
to scrape from your data.
A configuration can contain multiple _xpath_ subsections (for example, to let
the file plugin process the XML string multiple times). Consult the [XPath syntax][xpath] and
the [underlying library's functions][xpath lib] for details and help regarding
XPath queries. Consider using an XPath tester such as [xpather.com][xpather] or
[Code Beautify's XPath Tester][xpath tester] for help developing and debugging
your query.
## Configuration (batch)
As an alternative to the configuration above, fields can also be specified in
batch. Instead of specifying each field in its own section, you define a
`name` and a `value` selector used to determine the name and value of the fields
in the metric.
```toml
[[inputs.file]]
files = ["example.xml"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "xml"
## PROTOCOL-BUFFER definitions
## Protocol-buffer definition file
# xpath_protobuf_file = "sparkplug_b.proto"
## Name of the protocol-buffer message type to use in a fully qualified form.
# xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
## List of paths to use when looking up imported protocol-buffer definition files.
# xpath_protobuf_import_paths = ["."]
## Print the internal XML document when in debug logging mode.
## This is especially useful when using the parser with non-XML formats like protocol-buffers
## to get an idea on the expression necessary to derive fields etc.
# xpath_print_document = false
## Allow the results of one of the parsing sections to be empty.
## Useful when not all selected files have the exact same structure.
# xpath_allow_empty_selection = false
## Get native data-types for all data-format that contain type information.
## Currently, protobuf, msgpack and JSON support native data-types
# xpath_native_types = false
## Multiple parsing sections are allowed
[[inputs.file.xpath]]
## Optional: XPath-query to select a subset of nodes from the XML document.
metric_selection = "/Bus/child::Sensor"
## Optional: XPath-query to set the metric (measurement) name.
# metric_name = "string('example')"
## Optional: Query to extract metric timestamp.
## If not specified the time of execution is used.
# timestamp = "/Gateway/Timestamp"
## Optional: Format of the timestamp determined by the query above.
## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
## time format. If not specified, a "unix" timestamp (in seconds) is expected.
# timestamp_format = "2006-01-02T15:04:05Z"
## Field specifications using a selector.
field_selection = "child::*"
## Optional: Queries to specify field name and value.
## These options are only to be used in combination with 'field_selection'!
## By default the node name and node content is used if a field-selection
## is specified.
# field_name = "name()"
# field_value = "."
## Optional: Expand field names relative to the selected node
## This allows to flatten out nodes with non-unique names in the subtree
# field_name_expansion = false
## Tag specifications using a selector.
# tag_selection = "child::*"
## Optional: Queries to specify tag name and value.
## These options are only to be used in combination with 'tag_selection'!
## By default the node name and node content is used if a tag-selection
## is specified.
# tag_name = "name()"
# tag_value = "."
## Optional: Expand tag names relative to the selected node
## This allows to flatten out nodes with non-unique names in the subtree
# tag_name_expansion = false
## Tag definitions using the given XPath queries.
[inputs.file.xpath.tags]
name = "substring-after(Sensor/@name, ' ')"
device = "string('the ultimate sensor')"
```
**Please note**: The resulting fields are _always_ of type string.
It is also possible to specify a mixture of the two alternative ways of
specifying fields. In this case, _explicitly_ defined tags and fields take
_precedence_ over the batch instances if both use the same tag or field name.
### metric_selection (optional)
You can specify an [XPath][xpath] query to select a subset of nodes from the XML
document, each used to generate a new metric with the specified fields, tags,
etc.
Relative queries in subsequent queries are relative to the
`metric_selection`. To specify absolute paths, start the query with a
slash (`/`).
Specifying `metric_selection` is optional. If not specified, all relative queries
are relative to the root node of the XML document.
### metric_name (optional)
By specifying `metric_name` you can override the metric/measurement name with
the result of the given [XPath][xpath] query. If not specified, the default
metric name is used.
### timestamp, timestamp_format, timezone (optional)
By default, the current time is used for all created metrics. To set the
time from values in the XML document, you can specify an [XPath][xpath] query in
`timestamp` and set the format in `timestamp_format`.
The `timestamp_format` can be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or
an accepted [Go "reference time"][time const]. Consult the Go [time][time parse]
package for details and additional examples on how to set the time format. If
`timestamp_format` is omitted, the result of the `timestamp` query is
interpreted as a `unix` timestamp.
The `timezone` setting is used to locate the parsed time in the given
timezone. This is helpful for cases where the time does not contain timezone
information, e.g. `2023-03-09 14:04:40`, and is not located in _UTC_, which is
the default setting. It is also possible to set the `timezone` to `Local`,
which uses the configured host timezone.
For time formats with timezone information, e.g. RFC3339, the resulting
timestamp is unchanged. The `timezone` setting is ignored for all `unix`
timestamp formats.
### tags sub-section
[XPath][xpath] queries in the `tag name = query` format to add tags to the
metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
__NOTE:__ Results of tag-queries will always be converted to strings.
### fields_int sub-section
[XPath][xpath] queries in the `field name = query` format to add integer typed
fields to the metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
__NOTE:__ Results of field_int-queries will always be converted to
__int64__. The conversion fails if the query result is not convertible.
### fields sub-section
[XPath][xpath] queries in the `field name = query` format to add non-integer
fields to the metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
The type of the field is specified in the [XPath][xpath] query using the type
conversion functions of XPath, such as `number()`, `boolean()`, or `string()`. If
no conversion is performed in the query, the field will be of type string.
__NOTE: Path conversion functions always succeed, even when converting text
to float.__
### field_selection, field_name, field_value (optional)
You can specify an [XPath][xpath] query to select a set of nodes forming the
fields of the metric. The specified path can be absolute (starting with `/`) or
relative to the currently selected node. Each node selected by `field_selection`
forms a new field within the metric.
The _name_ and the _value_ of each field can be specified using the optional
`field_name` and `field_value` queries. The queries are relative to the selected
field if not starting with `/`. If not specified, the field's _name_ defaults to
the node name and the field's _value_ defaults to the content of the selected
field node.
__NOTE__: `field_name` and `field_value` queries are only evaluated if a
`field_selection` is specified.
Specifying `field_selection` is optional. It is an alternative way to specify
fields, especially for documents where the node names are not known a priori or
where a large number of fields must be specified. These options can also be
combined with the field specifications above.
__NOTE: Path conversion functions always succeed, even when converting text
to float.__
### field_name_expansion (optional)
When _true_, field names selected with `field_selection` are expanded to a
_path_ relative to the _selected node_. This is necessary if we e.g. select all
leaf nodes as fields and those leaf nodes do not have unique names. That is in
case you have duplicate names in the fields you select you should set this to
`true`.
### tag_selection, tag_name, tag_value (optional)
You can specify an [XPath][xpath] query to select a set of nodes forming the tags
of the metric. The specified path can be absolute (starting with `/`) or
relative to the currently selected node. Each node selected by `tag_selection`
forms a new tag within the metric.
The _name_ and the _value_ of each tag can be specified using the optional
`tag_name` and `tag_value` queries. The queries are relative to the selected tag
if not starting with `/`. If not specified, the tag's _name_ defaults to the node
name and the tag's _value_ defaults to the content of the selected tag node.
__NOTE__: `tag_name` and `tag_value` queries are only evaluated if a
`tag_selection` is specified.
Specifying `tag_selection` is optional. It is an alternative way to specify
tags, especially for documents where the node names are not known a priori or
where a large number of tags must be specified. These options can also be
combined with the tag specifications above.
### tag_name_expansion (optional)
When _true_, tag names selected with `tag_selection` are expanded to a _path_
relative to the _selected node_. This is necessary if, for example, you select
all leaf nodes as tags and those leaf nodes do not have unique names; set this
to `true` whenever the tags you select contain duplicate names.
## Examples
This `example.xml` file is used in the configuration examples below:
```xml
<?xml version="1.0"?>
<Gateway>
<Name>Main Gateway</Name>
<Timestamp>2020-08-01T15:04:03Z</Timestamp>
<Sequence>12</Sequence>
<Status>ok</Status>
</Gateway>
<Bus>
<Sensor name="Sensor Facility A">
<Variable temperature="20.0"/>
<Variable power="123.4"/>
<Variable frequency="49.78"/>
<Variable consumers="3"/>
<Mode>busy</Mode>
</Sensor>
<Sensor name="Sensor Facility B">
<Variable temperature="23.1"/>
<Variable power="14.3"/>
<Variable frequency="49.78"/>
<Variable consumers="1"/>
<Mode>standby</Mode>
</Sensor>
<Sensor name="Sensor Facility C">
<Variable temperature="19.7"/>
<Variable power="0.02"/>
<Variable frequency="49.78"/>
<Variable consumers="0"/>
<Mode>error</Mode>
</Sensor>
</Bus>
```
### Basic Parsing
This example shows the basic usage of the xml parser.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
[inputs.file.xpath.tags]
gateway = "substring-before(/Gateway/Name, ' ')"
[inputs.file.xpath.fields_int]
seqnr = "/Gateway/Sequence"
[inputs.file.xpath.fields]
ok = "/Gateway/Status = 'ok'"
```
Output:
```text
file,gateway=Main,host=Hugin seqnr=12i,ok=true 1598610830000000000
```
In the _tags_ definition the XPath function `substring-before()` is used to only
extract the sub-string before the space. To get the integer value of
`/Gateway/Sequence` we have to use the _fields_int_ section as there is no XPath
expression to convert node values to integers (only float).
The `ok` field is filled with a boolean by specifying a query comparing the
query result of `/Gateway/Status` with the string _ok_. Use the type conversions
available in the XPath syntax to specify field types.
### Time and metric names
This is an example for using time and name of the metric from the XML document
itself.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_name = "name(/Gateway/Status)"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
[inputs.file.xpath.tags]
gateway = "substring-before(/Gateway/Name, ' ')"
[inputs.file.xpath.fields]
ok = "/Gateway/Status = 'ok'"
```
Output:
```text
Status,gateway=Main,host=Hugin ok=true 1596294243000000000
```
In addition to the basic parsing example, the metric name is defined as the
name of the `/Gateway/Status` node and the timestamp is derived from the XML
document instead of using the execution time.
### Multi-node selection
For XML documents containing metrics for e.g. multiple devices (like `Sensor`s
in the _example.xml_), multiple metrics can be generated using node
selection. This example shows how to generate a metric for each _Sensor_ in the
example.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_selection = "/Bus/child::Sensor"
metric_name = "string('sensors')"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
[inputs.file.xpath.tags]
name = "substring-after(@name, ' ')"
[inputs.file.xpath.fields_int]
consumers = "Variable/@consumers"
[inputs.file.xpath.fields]
temperature = "number(Variable/@temperature)"
power = "number(Variable/@power)"
frequency = "number(Variable/@frequency)"
ok = "Mode != 'error'"
```
Output:
```text
sensors,host=Hugin,name=Facility\ A consumers=3i,frequency=49.78,ok=true,power=123.4,temperature=20 1596294243000000000
sensors,host=Hugin,name=Facility\ B consumers=1i,frequency=49.78,ok=true,power=14.3,temperature=23.1 1596294243000000000
sensors,host=Hugin,name=Facility\ C consumers=0i,frequency=49.78,ok=false,power=0.02,temperature=19.7 1596294243000000000
```
Using the `metric_selection` option we select all `Sensor` nodes in the XML
document. Please note that all field and tag definitions are relative to these
selected nodes. An exception is the timestamp definition, which is relative to
the root node of the XML document.
### Batch field processing with multi-node selection
For XML documents containing metrics with a large number of fields or where the
fields are not known before (e.g. an unknown set of `Variable` nodes in the
_example.xml_), field selectors can be used. This example shows how to generate
a metric for each _Sensor_ in the example with fields derived from the
_Variable_ nodes.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_selection = "/Bus/child::Sensor"
metric_name = "string('sensors')"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
field_selection = "child::Variable"
field_name = "name(@*[1])"
field_value = "number(@*[1])"
[inputs.file.xpath.tags]
name = "substring-after(@name, ' ')"
```
Output:
```text
sensors,host=Hugin,name=Facility\ A consumers=3,frequency=49.78,power=123.4,temperature=20 1596294243000000000
sensors,host=Hugin,name=Facility\ B consumers=1,frequency=49.78,power=14.3,temperature=23.1 1596294243000000000
sensors,host=Hugin,name=Facility\ C consumers=0,frequency=49.78,power=0.02,temperature=19.7 1596294243000000000
```
The `metric_selection` option selects all `Sensor` nodes in the XML
document. For each _Sensor_ we then use `field_selection` to select all child
nodes of the sensor as _field-nodes_. Note that the field selection is
relative to the selected nodes. For each selected _field-node_ we use
`field_name` and `field_value` to determine the field's name and value,
respectively. Here, `field_name` derives the field name from the name of the
node's first attribute, while `field_value` takes the value of that first
attribute and converts the result to a number.
[xpath lib]: https://github.com/antchfx/xpath
[json]: https://www.json.org/
[msgpack]: https://msgpack.org/
[protobuf]: https://developers.google.com/protocol-buffers
[xml]: https://www.w3.org/XML/
[xpath]: https://www.w3.org/TR/xpath/
[xpather]: http://xpather.com/
[xpath tester]: https://codebeautify.org/Xpath-Tester
[time const]: https://golang.org/pkg/time/#pkg-constants
[time parse]: https://golang.org/pkg/time/#Parse

---
title: XPath Protocol Buffers input data format
list_title: XPath Protocol Buffers
description:
Use the `xpath_protobuf` input data format and XPath expressions to parse protobuf (Protocol Buffer) data into Telegraf metrics.
menu:
telegraf_v1_ref:
name: XPath Protocol Buffers
weight: 10
parent: Input data formats
metadata: [XPath parser plugin]
---
Use the `xpath_protobuf` input data format, provided by the [XPath parser plugin](https://github.com/influxdata/telegraf/tree/master/plugins/parsers/xpath), with XPath expressions to parse Protocol Buffer data into Telegraf metrics.
For information about supported XPath functions, see [the underlying XPath library][xpath lib].
**NOTE:** Field types are specified using [XPath functions][xpath
lib]. The only exception is _integer_ fields, which need to be specified in a
`fields_int` section.
## Supported data formats
| name | `data_format` setting | comment |
| --------------------------------------- | --------------------- | ------- |
| [Extensible Markup Language (XML)][xml] | `"xml"` | |
| [JSON][json] | `"xpath_json"` | |
| [MessagePack][msgpack] | `"xpath_msgpack"` | |
| [Protocol-buffers][protobuf] | `"xpath_protobuf"` | [see additional parameters](#protocol-buffers-additional-settings)|
### Protocol-buffers additional settings
To use the protocol-buffer format, you need to specify additional
(_mandatory_) properties for the parser. Those options are described here.
#### `xpath_protobuf_file` (mandatory)
Use this option to specify the name of the protocol-buffer definition file
(`.proto`).
#### `xpath_protobuf_type` (mandatory)
This option contains the top-level message type to use for deserializing the
data to be parsed. Usually, this is constructed from the `package` name in the
protocol-buffer definition file and the `message` name as `<package
name>.<message name>`.
#### `xpath_protobuf_import_paths` (optional)
In case you import other protocol-buffer definitions within your `.proto` file
(that is, you use the `import` statement), you can use this option to specify
paths to search for the imported definition file(s). By default, imports are
only searched for in `.`, the current working directory, which is usually the
directory you are in when starting Telegraf.
Imagine you have multiple protocol-buffer definitions (e.g. `A.proto`,
`B.proto` and `C.proto`) in a directory (e.g. `/data/my_proto_files`) where your
top-level file (e.g. `A.proto`) imports at least one other definition:
```protobuf
syntax = "proto3";
package foo;
import "B.proto";
message Measurement {
...
}
```
You should use the following setting
```toml
[[inputs.file]]
files = ["example.dat"]
data_format = "xpath_protobuf"
xpath_protobuf_file = "A.proto"
xpath_protobuf_type = "foo.Measurement"
xpath_protobuf_import_paths = [".", "/data/my_proto_files"]
...
```
#### `xpath_protobuf_skip_bytes` (optional)
This option allows skipping a number of bytes before trying to parse
the protocol-buffer message. This is useful in cases where the raw data
has a header, e.g. for the message length, or in the case of gRPC messages.
This is a list of known headers and the corresponding values for
`xpath_protobuf_skip_bytes`:
| name | setting | comment |
| --------------------------------------- | ------- | ------- |
| [GRPC protocol][GRPC] | 5 | GRPC adds a 5-byte header for _Length-Prefixed-Messages_ |
| [PowerDNS logging][PDNS] | 2 | Sent messages contain a 2-byte header containing the message length |
[GRPC]: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
[PDNS]: https://docs.powerdns.com/recursor/lua-config/protobuf.html
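As a sketch of why `xpath_protobuf_skip_bytes = 5` fits gRPC, the following Python snippet (illustrative only; the payload bytes are hypothetical) builds a gRPC-style length-prefixed frame and strips the 5-byte header before protobuf parsing would begin:

```python
import struct

# A gRPC Length-Prefixed-Message: 1 compression-flag byte plus a
# 4-byte big-endian payload length, followed by the protobuf payload.
payload = b"\x08\x2a"  # hypothetical serialized protobuf bytes
frame = struct.pack(">BI", 0, len(payload)) + payload

# Setting xpath_protobuf_skip_bytes = 5 makes the parser drop this
# header before deserializing the remaining bytes as protobuf.
skipped = frame[5:]
assert skipped == payload
```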
## Configuration
```toml
[[inputs.file]]
files = ["example.xml"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "xml"
## PROTOCOL-BUFFER definitions
## Protocol-buffer definition file
# xpath_protobuf_file = "sparkplug_b.proto"
## Name of the protocol-buffer message type to use in a fully qualified form.
# xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
## List of paths to use when looking up imported protocol-buffer definition files.
# xpath_protobuf_import_paths = ["."]
## Number of (header) bytes to ignore before parsing the message.
# xpath_protobuf_skip_bytes = 0
## Print the internal XML document when in debug logging mode.
## This is especially useful when using the parser with non-XML formats like protocol-buffers
## to get an idea on the expression necessary to derive fields etc.
# xpath_print_document = false
## Allow the results of one of the parsing sections to be empty.
## Useful when not all selected files have the exact same structure.
# xpath_allow_empty_selection = false
## Get native data-types for all data-format that contain type information.
## Currently, protobuf, msgpack and JSON support native data-types
# xpath_native_types = false
## Multiple parsing sections are allowed
[[inputs.file.xpath]]
## Optional: XPath-query to select a subset of nodes from the XML document.
# metric_selection = "/Bus/child::Sensor"
## Optional: XPath-query to set the metric (measurement) name.
# metric_name = "string('example')"
## Optional: Query to extract metric timestamp.
## If not specified the time of execution is used.
# timestamp = "/Gateway/Timestamp"
## Optional: Format of the timestamp determined by the query above.
## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
## time format. If not specified, a "unix" timestamp (in seconds) is expected.
# timestamp_format = "2006-01-02T15:04:05Z"
## Optional: Timezone of the parsed time
## This will locate the parsed time to the given timezone. Please note that
## for times with timezone-offsets (e.g. RFC3339) the timestamp is unchanged.
## This is ignored for all (unix) timestamp formats.
# timezone = "UTC"
## Optional: List of fields to convert to hex-strings if they are
## containing byte-arrays. This might be the case for e.g. protocol-buffer
## messages encoding data as byte-arrays. Wildcard patterns are allowed.
## By default, all byte-array-fields are converted to string.
# fields_bytes_as_hex = []
## Tag definitions using the given XPath queries.
[inputs.file.xpath.tags]
name = "substring-after(Sensor/@name, ' ')"
device = "string('the ultimate sensor')"
## Integer field definitions using XPath queries.
[inputs.file.xpath.fields_int]
consumers = "Variable/@consumers"
## Non-integer field definitions using XPath queries.
## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string.
[inputs.file.xpath.fields]
temperature = "number(Variable/@temperature)"
power = "number(Variable/@power)"
frequency = "number(Variable/@frequency)"
ok = "Mode != 'error'"
```
In this configuration mode, you explicitly specify the fields and tags you want
to scrape from your data.
A configuration can contain multiple _xpath_ subsections (for example, for the
file plugin to process the XML string multiple times). Consult the
[XPath syntax][xpath] and the [underlying library's functions][xpath lib] for
details and help regarding XPath queries. Consider using an XPath tester such
as [xpather.com][xpather] or [Code Beautify's XPath Tester][xpath tester] for
help developing and debugging your query.
## Configuration (batch)
As an alternative to the configuration above, fields can also be specified in
batch. Instead of specifying each field in a section, you define a `name` and a
`value` selector used to determine the names and values of the fields in the
metric.
```toml
[[inputs.file]]
files = ["example.xml"]
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "xml"
## PROTOCOL-BUFFER definitions
## Protocol-buffer definition file
# xpath_protobuf_file = "sparkplug_b.proto"
## Name of the protocol-buffer message type to use in a fully qualified form.
# xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
## List of paths to use when looking up imported protocol-buffer definition files.
# xpath_protobuf_import_paths = ["."]
## Print the internal XML document when in debug logging mode.
## This is especially useful when using the parser with non-XML formats like protocol-buffers
## to get an idea on the expression necessary to derive fields etc.
# xpath_print_document = false
## Allow the results of one of the parsing sections to be empty.
## Useful when not all selected files have the exact same structure.
# xpath_allow_empty_selection = false
## Get native data-types for all data-format that contain type information.
## Currently, protobuf, msgpack and JSON support native data-types
# xpath_native_types = false
## Multiple parsing sections are allowed
[[inputs.file.xpath]]
## Optional: XPath-query to select a subset of nodes from the XML document.
metric_selection = "/Bus/child::Sensor"
## Optional: XPath-query to set the metric (measurement) name.
# metric_name = "string('example')"
## Optional: Query to extract metric timestamp.
## If not specified the time of execution is used.
# timestamp = "/Gateway/Timestamp"
## Optional: Format of the timestamp determined by the query above.
## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
## time format. If not specified, a "unix" timestamp (in seconds) is expected.
# timestamp_format = "2006-01-02T15:04:05Z"
## Field specifications using a selector.
field_selection = "child::*"
## Optional: Queries to specify field name and value.
## These options are only to be used in combination with 'field_selection'!
## By default the node name and node content is used if a field-selection
## is specified.
# field_name = "name()"
# field_value = "."
## Optional: Expand field names relative to the selected node
## This allows to flatten out nodes with non-unique names in the subtree
# field_name_expansion = false
## Tag specifications using a selector.
## tag_selection = "child::*"
## Optional: Queries to specify tag name and value.
## These options are only to be used in combination with 'tag_selection'!
## By default the node name and node content is used if a tag-selection
## is specified.
# tag_name = "name()"
# tag_value = "."
## Optional: Expand tag names relative to the selected node
## This allows to flatten out nodes with non-unique names in the subtree
# tag_name_expansion = false
## Tag definitions using the given XPath queries.
[inputs.file.xpath.tags]
name = "substring-after(Sensor/@name, ' ')"
device = "string('the ultimate sensor')"
```
**Please note**: The resulting fields are _always_ of type string.
It is also possible to specify a mixture of the two alternative ways of
specifying fields. In this case, _explicitly_ defined tags and fields take
_precedence_ over the batch instances if both use the same tag or field name.
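This precedence rule behaves like a simple map merge, sketched here in Python with hypothetical values (not the parser's actual code):

```python
# Fields gathered by the batch selectors (values as parsed)
batch_fields = {"temperature": "20.0", "power": "123.4"}
# Fields defined explicitly in a fields section
explicit_fields = {"temperature": 20.0}

# Explicit definitions override batch-selected ones with the same name
merged = {**batch_fields, **explicit_fields}
assert merged == {"temperature": 20.0, "power": "123.4"}
```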
### metric_selection (optional)
You can specify an [XPath][xpath] query to select a subset of nodes from the
XML document, each used to generate a new metric with the specified fields,
tags, etc.
Relative queries in subsequent queries are relative to the
`metric_selection`. To specify absolute paths, start the query with a
slash (`/`).
Specifying `metric_selection` is optional. If not specified, all relative queries
are relative to the root node of the XML document.
### metric_name (optional)
By specifying `metric_name` you can override the metric/measurement name with
the result of the given [XPath][xpath] query. If not specified, the default
metric name is used.
### timestamp, timestamp_format, timezone (optional)
By default, the current time is used for all created metrics. To set the
time from values in the XML document, you can specify an [XPath][xpath] query
in `timestamp` and set the format in `timestamp_format`.
The `timestamp_format` can be set to `unix`, `unix_ms`, `unix_us`, `unix_ns`, or
an accepted [Go "reference time"][time const]. Consult the Go [time][time parse]
package for details and additional examples on how to set the time format. If
`timestamp_format` is omitted, the result of the `timestamp` query is expected
to be a `unix` timestamp (in seconds).
The `timezone` setting locates the parsed time in the given timezone. This is
helpful for cases where the time does not contain timezone information, e.g.
`2023-03-09 14:04:40`, and is not located in _UTC_, which is the default
setting. It is also possible to set `timezone` to `Local`, which uses the
configured host timezone.
For time formats with timezone information, e.g. RFC3339, the resulting
timestamp is unchanged. The `timezone` setting is ignored for all `unix`
timestamp formats.
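For example, the Go reference-time layout `2006-01-02T15:04:05Z` used throughout this page corresponds to the following Python `strptime` format; this sketch checks the nanosecond timestamp that appears in the example outputs:

```python
from datetime import datetime, timezone

# Go layout "2006-01-02T15:04:05Z" corresponds to "%Y-%m-%dT%H:%M:%SZ"
ts = datetime.strptime("2020-08-01T15:04:03Z", "%Y-%m-%dT%H:%M:%SZ")
ts = ts.replace(tzinfo=timezone.utc)  # timezone = "UTC" (the default)

# Nanosecond epoch timestamp, as used in line protocol output
ns = int(ts.timestamp()) * 1_000_000_000
assert ns == 1596294243000000000
```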
### tags sub-section
[XPath][xpath] queries in the `tag name = query` format to add tags to the
metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
__NOTE:__ Results of tag-queries will always be converted to strings.
### fields_int sub-section
[XPath][xpath] queries in the `field name = query` format to add integer typed
fields to the metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
__NOTE:__ Results of field_int-queries will always be converted to
__int64__. The conversion will fail in case the query result is not convertible!
### fields sub-section
[XPath][xpath] queries in the `field name = query` format to add non-integer
fields to the metrics. The specified path can be absolute (starting with `/`) or
relative. Relative paths use the currently selected node as reference.
The type of the field is specified in the [XPath][xpath] query using the type
conversion functions of XPath such as `number()`, `boolean()` or `string()`. If
no conversion is performed in the query, the field will be of type string.
__NOTE: XPath conversion functions always succeed, even if you convert text
to a float!__
### field_selection, field_name, field_value (optional)
You can specify a [XPath][xpath] query to select a set of nodes forming the
fields of the metric. The specified path can be absolute (starting with `/`) or
relative to the currently selected node. Each node selected by `field_selection`
forms a new field within the metric.
The _name_ and the _value_ of each field can be specified using the optional
`field_name` and `field_value` queries. The queries are relative to the selected
field if not starting with `/`. If not specified the field's _name_ defaults to
the node name and the field's _value_ defaults to the content of the selected
field node.
__NOTE__: `field_name` and `field_value` queries are only evaluated if a
`field_selection` is specified.
Specifying `field_selection` is optional. This is an alternative way to specify
fields especially for documents where the node names are not known a priori or
if there is a large number of fields to be specified. These options can also be
combined with the field specifications above.
__NOTE: XPath conversion functions always succeed, even if you convert text
to a float!__
### field_name_expansion (optional)
When _true_, field names selected with `field_selection` are expanded to a
_path_ relative to the _selected node_. This is necessary if, for example, you
select all leaf nodes as fields and those leaf nodes do not have unique names.
That is, if the fields you select have duplicate names, you should set this
option to `true`.
### tag_selection, tag_name, tag_value (optional)
You can specify a [XPath][xpath] query to select a set of nodes forming the tags
of the metric. The specified path can be absolute (starting with `/`) or
relative to the currently selected node. Each node selected by `tag_selection`
forms a new tag within the metric.
The _name_ and the _value_ of each tag can be specified using the optional
`tag_name` and `tag_value` queries. The queries are relative to the selected tag
if not starting with `/`. If not specified the tag's _name_ defaults to the node
name and the tag's _value_ defaults to the content of the selected tag node.
__NOTE__: `tag_name` and `tag_value` queries are only evaluated if a
`tag_selection` is specified.
Specifying `tag_selection` is optional. This is an alternative way to specify
tags especially for documents where the node names are not known a priori or if
there is a large number of tags to be specified. These options can also be
combined with the tag specifications above.
### tag_name_expansion (optional)
When _true_, tag names selected with `tag_selection` are expanded to a _path_
relative to the _selected node_. This is necessary if we e.g. select all leaf
nodes as tags and those leaf nodes do not have unique names. That is in case you
have duplicate names in the tags you select you should set this to `true`.
## Examples
This `example.xml` file is used in the configuration examples below:
```xml
<?xml version="1.0"?>
<Gateway>
  <Name>Main Gateway</Name>
  <Timestamp>2020-08-01T15:04:03Z</Timestamp>
  <Sequence>12</Sequence>
  <Status>ok</Status>
</Gateway>

<Bus>
  <Sensor name="Sensor Facility A">
    <Variable temperature="20.0"/>
    <Variable power="123.4"/>
    <Variable frequency="49.78"/>
    <Variable consumers="3"/>
    <Mode>busy</Mode>
  </Sensor>
  <Sensor name="Sensor Facility B">
    <Variable temperature="23.1"/>
    <Variable power="14.3"/>
    <Variable frequency="49.78"/>
    <Variable consumers="1"/>
    <Mode>standby</Mode>
  </Sensor>
  <Sensor name="Sensor Facility C">
    <Variable temperature="19.7"/>
    <Variable power="0.02"/>
    <Variable frequency="49.78"/>
    <Variable consumers="0"/>
    <Mode>error</Mode>
  </Sensor>
</Bus>
```
### Basic Parsing
This example shows the basic usage of the XML parser.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
[inputs.file.xpath.tags]
gateway = "substring-before(/Gateway/Name, ' ')"
[inputs.file.xpath.fields_int]
seqnr = "/Gateway/Sequence"
[inputs.file.xpath.fields]
ok = "/Gateway/Status = 'ok'"
```
Output:
```text
file,gateway=Main,host=Hugin seqnr=12i,ok=true 1598610830000000000
```
In the _tags_ definition, the XPath function `substring-before()` is used to
extract only the sub-string before the space. To get the integer value of
`/Gateway/Sequence`, we have to use the _fields_int_ section, as there is no
XPath expression to convert node values to integers (only floats).
The `ok` field is filled with a boolean by comparing the query result of
`/Gateway/Status` with the string _ok_. Use the type conversions available in
the XPath syntax to specify field types.
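The XPath expressions above can be mimicked in plain Python to see what values end up in the metric (a rough emulation using the example document's values, not the parser's actual code):

```python
# Values from example.xml
name, status, sequence = "Main Gateway", "ok", "12"

gateway_tag = name.split(" ", 1)[0]  # substring-before(/Gateway/Name, ' ')
ok_field = (status == "ok")          # /Gateway/Status = 'ok'
seqnr_field = int(sequence)          # fields_int forces int64 conversion

assert (gateway_tag, seqnr_field, ok_field) == ("Main", 12, True)
```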
### Time and metric names
This example uses the time and the name of the metric from the XML document
itself.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_name = "name(/Gateway/Status)"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
[inputs.file.xpath.tags]
gateway = "substring-before(/Gateway/Name, ' ')"
[inputs.file.xpath.fields]
ok = "/Gateway/Status = 'ok'"
```
Output:
```text
Status,gateway=Main,host=Hugin ok=true 1596294243000000000
```
In addition to the basic parsing example, the metric name is defined as the
name of the `/Gateway/Status` node, and the timestamp is derived from the XML
document instead of using the execution time.
### Multi-node selection
For XML documents containing metrics for e.g. multiple devices (like `Sensor`s
in the _example.xml_), multiple metrics can be generated using node
selection. This example shows how to generate a metric for each _Sensor_ in the
example.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_selection = "/Bus/child::Sensor"
metric_name = "string('sensors')"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
[inputs.file.xpath.tags]
name = "substring-after(@name, ' ')"
[inputs.file.xpath.fields_int]
consumers = "Variable/@consumers"
[inputs.file.xpath.fields]
temperature = "number(Variable/@temperature)"
power = "number(Variable/@power)"
frequency = "number(Variable/@frequency)"
ok = "Mode != 'error'"
```
Output:
```text
sensors,host=Hugin,name=Facility\ A consumers=3i,frequency=49.78,ok=true,power=123.4,temperature=20 1596294243000000000
sensors,host=Hugin,name=Facility\ B consumers=1i,frequency=49.78,ok=true,power=14.3,temperature=23.1 1596294243000000000
sensors,host=Hugin,name=Facility\ C consumers=0i,frequency=49.78,ok=false,power=0.02,temperature=19.7 1596294243000000000
```
The `metric_selection` option selects all `Sensor` nodes in the XML
document. Note that all field and tag definitions are relative to these
selected nodes; the only exception is the timestamp definition, which is
relative to the root node of the XML document.
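A rough Python emulation of this selection logic, using the standard library's `ElementTree` with a trimmed-down copy of the example document (not the parser's actual implementation):

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<Bus>"
    '<Sensor name="Sensor Facility A"><Mode>busy</Mode></Sensor>'
    '<Sensor name="Sensor Facility C"><Mode>error</Mode></Sensor>'
    "</Bus>"
)

metrics = []
for sensor in doc.findall("Sensor"):  # metric_selection = "/Bus/child::Sensor"
    metrics.append({
        # substring-after(@name, ' ') -> tag relative to the selected node
        "name": sensor.get("name").split(" ", 1)[1],
        # Mode != 'error' -> field relative to the selected node
        "ok": sensor.findtext("Mode") != "error",
    })

assert metrics == [{"name": "Facility A", "ok": True},
                   {"name": "Facility C", "ok": False}]
```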
### Batch field processing with multi-node selection
For XML documents containing metrics with a large number of fields, or where
the fields are not known beforehand (e.g. an unknown set of `Variable` nodes in
the _example.xml_), field selectors can be used. This example shows how to
generate a metric for each _Sensor_ in the example with fields derived from the
_Variable_ nodes.
Config:
```toml
[[inputs.file]]
files = ["example.xml"]
data_format = "xml"
[[inputs.file.xpath]]
metric_selection = "/Bus/child::Sensor"
metric_name = "string('sensors')"
timestamp = "/Gateway/Timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
field_selection = "child::Variable"
field_name = "name(@*[1])"
field_value = "number(@*[1])"
[inputs.file.xpath.tags]
name = "substring-after(@name, ' ')"
```
Output:
```text
sensors,host=Hugin,name=Facility\ A consumers=3,frequency=49.78,power=123.4,temperature=20 1596294243000000000
sensors,host=Hugin,name=Facility\ B consumers=1,frequency=49.78,power=14.3,temperature=23.1 1596294243000000000
sensors,host=Hugin,name=Facility\ C consumers=0,frequency=49.78,power=0.02,temperature=19.7 1596294243000000000
```
The `metric_selection` option selects all `Sensor` nodes in the XML
document. For each _Sensor_ we then use `field_selection` to select all child
nodes of the sensor as _field-nodes_. Note that the field selection is
relative to the selected nodes. For each selected _field-node_ we use
`field_name` and `field_value` to determine the field's name and value,
respectively. Here, `field_name` derives the field name from the name of the
node's first attribute, while `field_value` takes the value of that first
attribute and converts the result to a number.
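The first-attribute logic (`name(@*[1])` / `number(@*[1])`) can be sketched in Python like this (an emulation with a trimmed-down sensor, not the parser's code):

```python
import xml.etree.ElementTree as ET

sensor = ET.fromstring(
    '<Sensor name="Sensor Facility A">'
    '<Variable temperature="20.0"/><Variable consumers="3"/>'
    "</Sensor>"
)

fields = {}
for var in sensor.findall("Variable"):  # field_selection = "child::Variable"
    attr, value = next(iter(var.attrib.items()))
    fields[attr] = float(value)         # field_name / field_value on @*[1]

assert fields == {"temperature": 20.0, "consumers": 3.0}
```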
[xpath lib]: https://github.com/antchfx/xpath
[json]: https://www.json.org/
[msgpack]: https://msgpack.org/
[protobuf]: https://developers.google.com/protocol-buffers
[xml]: https://www.w3.org/XML/
[xpath]: https://www.w3.org/TR/xpath/
[xpather]: http://xpather.com/
[xpath tester]: https://codebeautify.org/Xpath-Tester
[time const]: https://golang.org/pkg/time/#pkg-constants
[time parse]: https://golang.org/pkg/time/#Parse

---
title: Telegraf output data formats
list_title: Output data formats
description: Telegraf serializes metrics into output data formats.
menu:
telegraf_v1_ref:
name: Output data formats
weight: 1
parent: Data formats
---
In addition to output-specific data formats, Telegraf supports the following set
of common data formats that may be selected when configuring many of the Telegraf
output plugins.
{{< children >}}
You can identify the plugins with support by the presence of a
`data_format` configuration option, for example, in the File (`file`) output plugin:
```toml
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["stdout"]
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "influx"
```

---
title: Carbon2 output data format
list_title: Carbon2
description: Use the `carbon2` output data format (serializer) to format and output Telegraf metrics as Carbon2 format.
menu:
telegraf_v1_ref:
name: Carbon2
weight: 10
parent: Output data formats
---
Use the `carbon2` output data format (serializer) to format and output Telegraf metrics as [Carbon2 format](http://metrics20.org/implementations/).
### Configuration
```toml
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["stdout", "/tmp/metrics.out"]
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "carbon2"
```
Standard form:
```
metric=name field=field_1 host=foo 30 1234567890
metric=name field=field_2 host=foo 4 1234567890
metric=name field=field_N host=foo 59 1234567890
```
### Metrics
The serializer converts metrics by creating `intrinsic_tags` from the combination of the metric name and its fields. If one Telegraf metric has 4 fields, the `carbon2` output will contain 4 separate metrics. Each line has a `metric` tag that represents the name of the metric and a `field` tag that represents the field.
### Example
If we take the following InfluxDB Line Protocol:
```
weather,location=us-midwest,season=summer temperature=82,wind=100 1234567890
```
After serializing in Carbon2, the result would be:
```
metric=weather field=temperature location=us-midwest season=summer 82 1234567890
metric=weather field=wind location=us-midwest season=summer 100 1234567890
```
### Fields and tags with spaces
When a field key, tag key, or tag value contains spaces, the spaces are replaced with `_`.
### Tags with empty values
When a tag's value is empty, it is replaced with `null`.
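A minimal Python sketch of this serialization (one line per field, with the space and empty-value handling described above; illustrative, not the plugin's actual code):

```python
def serialize_carbon2(name, tags, fields, timestamp):
    # One carbon2 line per field; spaces in tag keys/values become '_',
    # empty tag values become 'null'.
    tag_str = " ".join(
        f"{k.replace(' ', '_')}={str(v).replace(' ', '_') or 'null'}"
        for k, v in sorted(tags.items())
    )
    return [
        f"metric={name} field={field} {tag_str} {value} {timestamp}"
        for field, value in sorted(fields.items())
    ]

lines = serialize_carbon2(
    "weather",
    {"location": "us-midwest", "season": "summer"},
    {"temperature": 82, "wind": 100},
    1234567890,
)
assert lines[0] == ("metric=weather field=temperature "
                    "location=us-midwest season=summer 82 1234567890")
```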

---
title: Graphite output data format
list_title: Graphite
description: Use the `graphite` output data format (serializer) to format and output Telegraf metrics as Graphite Message Format.
menu:
telegraf_v1_ref:
name: Graphite
weight: 10
parent: Output data formats
identifier: output-data-format-graphite
---
Use the `graphite` output data format (serializer) to format and output Telegraf metrics as [Graphite Message Format](https://graphite.readthedocs.io/en/latest/feeding-carbon.html#step-3-understanding-the-graphite-message-format).
The serializer uses either the _template pattern_ method (_default_) or the _tag support_ method.
To use the tag support method, set the [`graphite_tag_support`](#graphite_tag_support) option.
## Configuration
```toml
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["stdout", "/tmp/metrics.out"]
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "graphite"
## Prefix added to each graphite bucket
prefix = "telegraf"
## Graphite template pattern
template = "host.tags.measurement.field"
## Support Graphite tags, recommended to enable when using Graphite 1.1 or later.
# graphite_tag_support = false
```
### graphite_tag_support
When the `graphite_tag_support` option is enabled, the template pattern is not
used. Instead, tags are encoded using
[Graphite tag support](http://graphite.readthedocs.io/en/latest/tags.html),
added in Graphite 1.1. The `metric_path` is a combination of the optional
`prefix` option, measurement name, and field name.
The tag `name` is reserved by Graphite; any conflicting tags will be encoded as `_name`.
**Example conversion**:
```
cpu,cpu=cpu-total,dc=us-east-1,host=tars usage_idle=98.09,usage_user=0.89 1455320660004257758
=>
cpu.usage_user;cpu=cpu-total;dc=us-east-1;host=tars 0.89 1455320690
cpu.usage_idle;cpu=cpu-total;dc=us-east-1;host=tars 98.09 1455320690
```
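The tag-support encoding can be sketched in Python as follows (illustrative, not the serializer's code; the `metric_path` is measurement and field joined by a dot, with `;tag=value` pairs appended):

```python
def graphite_tagged(name, tags, fields, timestamp, prefix=""):
    # metric_path = [prefix.]measurement.field;tag1=v1;tag2=v2 value ts
    base = f"{prefix}.{name}" if prefix else name
    tag_str = "".join(f";{k}={v}" for k, v in sorted(tags.items()))
    return [
        f"{base}.{field}{tag_str} {value} {timestamp}"
        for field, value in sorted(fields.items())
    ]

lines = graphite_tagged(
    "cpu",
    {"cpu": "cpu-total", "dc": "us-east-1", "host": "tars"},
    {"usage_idle": 98.09, "usage_user": 0.89},
    1455320690,
)
assert lines[0] == ("cpu.usage_idle;cpu=cpu-total;dc=us-east-1;host=tars "
                    "98.09 1455320690")
```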
### Templates
To learn more about using templates and template patterns, see [Template patterns](/telegraf/v1/configure_plugins/template-patterns/).

---
title: InfluxDB line protocol output data format
list_title: InfluxDB line protocol
description: Use the `influx` output data format (serializer) to format and output metrics as InfluxDB line protocol format.
menu:
telegraf_v1_ref:
name: InfluxDB line protocol
weight: 10
parent: Output data formats
identifier: output-data-format-influx
---
Use the `influx` output data format (serializer) to format and output metrics as [InfluxDB line protocol][line protocol].
InfluxData recommends this data format unless another format is required for interoperability.
## Configuration
```toml
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["stdout", "/tmp/metrics.out"]
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "influx"
## Maximum line length in bytes. Useful only for debugging.
influx_max_line_bytes = 0
## When true, fields will be output in ascending lexical order. Enabling
## this option will result in decreased performance and is only recommended
## when you need predictable ordering while debugging.
influx_sort_fields = false
## When true, Telegraf will output unsigned integers as unsigned values,
## i.e.: `42u`. You will need a version of InfluxDB supporting unsigned
## integer values. Enabling this option will result in field type errors if
## existing data has been written.
influx_uint_support = false
```
[line protocol]: /influxdb/v1/write_protocols/line_protocol_tutorial/

---
title: JSON output data format
list_title: JSON
description: Use the `json` output data format (serializer) to format and output Telegraf metrics as JSON documents.
menu:
telegraf_v1_ref:
name: JSON
weight: 10
parent: Output data formats
identifier: output-data-format-json
---
Use the `json` output data format (serializer) to format and output Telegraf metrics as [JSON](https://www.json.org/json-en.html) documents.
## Configuration
```toml
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["stdout", "/tmp/metrics.out"]
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "json"
## The resolution to use for the metric timestamp. Must be a duration string
## such as "1ns", "1us", "1ms", "10ms", "1s". Durations are truncated to
## the power of 10 less than the specified units.
json_timestamp_units = "1s"
```
## Examples
### Standard format
```json
{
"fields": {
"field_1": 30,
"field_2": 4,
"field_N": 59,
"n_images": 660
},
"name": "docker",
"tags": {
"host": "raynor"
},
"timestamp": 1458229140
}
```
### Batch format
When an output plugin needs to emit multiple metrics at one time, it may use the
batch format. The use of batch format is determined by the plugin -- reference
the documentation for the specific plugin.
```json
{
"metrics": [
{
"fields": {
"field_1": 30,
"field_2": 4,
"field_N": 59,
"n_images": 660
},
"name": "docker",
"tags": {
"host": "raynor"
},
"timestamp": 1458229140
},
{
"fields": {
"field_1": 30,
"field_2": 4,
"field_N": 59,
"n_images": 660
},
"name": "docker",
"tags": {
"host": "raynor"
},
"timestamp": 1458229140
}
]
}
```


@@ -1,49 +0,0 @@
---
title: MessagePack output data format
list_title: MessagePack
description: Use the `msgpack` output data format (serializer) to convert Telegraf metrics into MessagePack format.
menu:
telegraf_v1_ref:
name: MessagePack
weight: 10
parent: Output data formats
---
The `msgpack` output data format (serializer) translates Telegraf metrics into [MessagePack](https://msgpack.org/), an efficient binary serialization format that, like JSON, lets you exchange data among multiple languages.
## Configuration
```toml
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["stdout", "/tmp/metrics.out"]
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "msgpack"
```
## Example output
The output is the MessagePack binary representation of metrics, with a structure identical to the following JSON:
```
{
"name":"cpu",
"time": <TIMESTAMP>, // https://github.com/msgpack/msgpack/blob/master/spec.md#timestamp-extension-type
"tags":{
"tag_1":"host01",
...
},
"fields":{
"field_1":30,
"field_2":true,
"field_3":"field_value",
"field_4":30.1
...
}
}
```


@@ -1,91 +0,0 @@
---
title: ServiceNow metrics output data format
list_title: ServiceNow metrics
description: Use the `nowmetric` ServiceNow metrics output data format (serializer) to output Telegraf metrics as ServiceNow Operational Intelligence format.
menu:
telegraf_v1_ref:
name: ServiceNow metrics
weight: 10
parent: Output data formats
---
The `nowmetric` output data format (serializer) outputs Telegraf metrics in the [ServiceNow Operational Intelligence format](https://docs.servicenow.com/bundle/kingston-it-operations-management/page/product/event-management/reference/mid-POST-metrics.html).
It can be used to write to a file using the File output plugin, or to send metrics to a MID Server with the REST endpoint enabled using the standard Telegraf HTTP output.
If you're using the HTTP output plugin, this serializer batches the metrics so you don't end up with an HTTP POST per metric.
An example event looks like:
```json
[{
"metric_type": "Disk C: % Free Space",
"resource": "C:\\",
"node": "lnux100",
"value": 50,
"timestamp": 1473183012000,
"ci2metric_id": {
"node": "lnux100"
},
"source": "Telegraf"
}]
```
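Based on the example above, the mapping from a Telegraf metric to ServiceNow records can be sketched as follows. This is an illustration only, not the serializer's actual Go code; the `metric_type` and `resource` values it produces, and the use of the `host` tag for the node, are assumptions:

```python
def to_nowmetric(name, tags, fields, timestamp_ns):
    # Sketch: each Telegraf field becomes one record; the "host" tag is
    # assumed to identify the node/CI, and timestamps are in milliseconds.
    node = tags.get("host", "")
    return [
        {
            "metric_type": field_key,  # assumption: field name as metric_type
            "resource": "",
            "node": node,
            "value": field_value,
            "timestamp": timestamp_ns // 1_000_000,
            "ci2metric_id": {"node": node},
            "source": "Telegraf",
        }
        for field_key, field_value in fields.items()
    ]

records = to_nowmetric("disk", {"host": "lnux100"}, {"free_percent": 50},
                       1473183012000000000)
```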
## Using with the HTTP output plugin
To send this data to a ServiceNow MID Server with the Web Server extension activated, use the HTTP output plugin. You need to add some custom headers to manage the MID Web Server authorization. Here's a sample configuration for an HTTP output:
```toml
[[outputs.http]]
## URL is the address to send metrics to
url = "http://<mid server fqdn or ip address>:9082/api/mid/sa/metrics"
## Timeout for HTTP message
# timeout = "5s"
## HTTP method, one of: "POST" or "PUT"
method = "POST"
## HTTP Basic Auth credentials
username = 'evt.integration'
password = 'P@$$w0rd!'
## Optional TLS Config
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "nowmetric"
## Additional HTTP headers
[outputs.http.headers]
# Should be set manually to "application/json" for json data_format
Content-Type = "application/json"
Accept = "application/json"
```
Starting with the London release, you also need to explicitly create an event rule to allow binding of metric events to host CIs.
For details, see the [ServiceNow documentation](https://docs.servicenow.com/bundle/london-it-operations-management/page/product/event-management/task/event-rule-bind-metrics-to-host.html).
## Using with the File output plugin
You can use the File output plugin to write the payload to a file.
In this case, add the following section to your Telegraf configuration file.
```toml
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["C:/Telegraf/metrics.out"]
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "nowmetric"
```


@@ -1,149 +0,0 @@
---
title: Splunk metrics output data format
list_title: Splunk metrics
description: Use the `splunkmetric` metric output data format (serializer) to output Telegraf metrics in a format that can be consumed by a Splunk metrics index.
menu:
telegraf_v1_ref:
name: Splunk metrics
weight: 10
parent: Output data formats
---
Use the `splunkmetric` output data format (serializer) to output Telegraf metrics in a format that can be consumed by a Splunk metrics index.
The output data format can write to a file using the File output plugin, or send metrics to an HTTP Event Collector (HEC) using the standard Telegraf HTTP output.
If you're using the HTTP output, this serializer batches the metrics so you don't end up with an HTTP POST per metric.
The data is output in a format that conforms to the Splunk HEC JSON format as described in
[Send metrics in JSON format](http://dev.splunk.com/view/event-collector/SP-CAAAFDN).
An example event looks like:
```json
{
"time": 1529708430,
"event": "metric",
"host": "patas-mbp",
"fields": {
"_value": 0.6,
"cpu": "cpu0",
"dc": "mobile",
"metric_name": "cpu.usage_user",
"user": "ronnocol"
}
}
```
In the above snippet, the following keys are dimensions:
* cpu
* dc
* user
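One way to picture the mapping: each Telegraf field becomes its own Splunk metric event named `<measurement>.<field>`, with tags carried along as dimensions. The following Python sketch illustrates this under that assumption; it is not the serializer's actual Go code, and top-level keys such as `host` are handled differently in practice:

```python
def to_splunk_metrics(name, tags, fields, time):
    # Sketch: one event per field; tags become dimensions alongside
    # metric_name and _value.
    return [
        {
            "time": time,
            "event": "metric",
            "fields": {"metric_name": f"{name}.{key}", "_value": value, **tags},
        }
        for key, value in fields.items()
    ]

events = to_splunk_metrics("cpu",
                           {"cpu": "cpu0", "dc": "mobile", "user": "ronnocol"},
                           {"usage_user": 0.6}, 1529708430)
# events[0]["fields"]["metric_name"] == "cpu.usage_user"
```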
## Using with the HTTP output
To send this data to a Splunk HEC, use the HTTP output. You need to add some custom headers to manage the HEC authorization. Here's a sample configuration for an HTTP output:
```toml
[[outputs.http]]
## URL is the address to send metrics to
url = "https://localhost:8088/services/collector"
## Timeout for HTTP message
# timeout = "5s"
## HTTP method, one of: "POST" or "PUT"
# method = "POST"
## HTTP Basic Auth credentials
# username = "username"
# password = "pa$$word"
## Optional TLS Config
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "splunkmetric"
## Provides time, index, source overrides for the HEC
splunkmetric_hec_routing = true
## Additional HTTP headers
[outputs.http.headers]
# Should be set manually to "application/json" for json data_format
Content-Type = "application/json"
Authorization = "Splunk xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
X-Splunk-Request-Channel = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
```
## Overrides
You can override the default values for the HEC token you are using by adding additional tags to the config file.
The following aspects of the token can be overridden with tags:
* index
* source
You can set these tags either in `[global_tags]` or with a more advanced configuration as documented [here](https://github.com/influxdata/telegraf/blob/master/docs/CONFIGURATION.md).
For example, the following configuration overrides the index only for the cpu metric:
```toml
[[inputs.cpu]]
percpu = false
totalcpu = true
[inputs.cpu.tags]
index = "cpu_metrics"
```
## Using with the File output
You can use the File output plugin when running Telegraf on a machine with a Splunk forwarder.
A sample event when `splunkmetric_hec_routing` is false (or unset) looks like:
```json
{
"_value": 0.6,
"cpu": "cpu0",
"dc": "mobile",
"metric_name": "cpu.usage_user",
"user": "ronnocol",
"time": 1529708430
}
```
Data formatted in this manner can be ingested with a simple `props.conf` file that
looks like this:
```ini
[telegraf]
category = Metrics
description = Telegraf Metrics
pulldown_type = 1
DATETIME_CONFIG =
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
disabled = false
INDEXED_EXTRACTIONS = json
KV_MODE = none
TIMESTAMP_FIELDS = time
TIME_FORMAT = %s.%3N
```
An example configuration of a file based output is:
```toml
# Send telegraf metrics to file(s)
[[outputs.file]]
## Files to write to, "stdout" is a specially handled file.
files = ["/tmp/metrics.out"]
## Data format to output.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "splunkmetric"
splunkmetric_hec_routing = false
```