docs-v2

3.1 KiB

Raw Blame History

title

description

aliases

weight

join() function

The `join()` function merges two or more input streams whose values are equal on a set of common columns into a single output stream.

/v2.0/reference/flux/functions/transformations/join

v2_0_ref

name	parent
join	built-in-transformations

401

The join() function merges two or more input streams whose values are equal on a set of common columns into a single output stream. Null values are not considered equal when comparing column values. The resulting schema is the union of the input schemas. The resulting group key is the union of the input group keys.

Function type: Transformation
Output data type: Object

join(tables: {key1: table1, key2: table2}, on: ["_time", "_field"], method: "inner")

Output schema

The column schema of the output stream is the union of the input schemas. It is also the same for the output group key. Columns are renamed using the pattern <column>_<table> to prevent ambiguity in joined tables.

Example:

If you have two streams of data, data_1 and data_2, with the following group keys:

data_1: [_time, _field]
data_2: [_time, _field]

And join them with:

join(tables: {d1: data_1, d2: data_2}, on: ["_time"])

The resulting group keys for all tables will be: [_time, _field_d1, _field_d2]

Parameters

tables

The map of streams to be joined. Required

Data type: Object

{{% note %}} join() currently only supports two input streams. {{% /note %}}

on

The list of columns on which to join. Required

Data type: Array of strings

method

The method used to join. Defaults to "inner".

Data type: String

Possible Values:

inner
cross
left
right
full

{{% note %}} The on parameter and the cross method are mutually exclusive. {{% /note %}}

Examples

Example join with sample data

Given the following two streams of data:

SF_Temp**

_time	_field	_value
0001	"temp"	70
0002	"temp"	75
0003	"temp"	72

NY_Temp**

_time	_field	_value
0001	"temp"	55
0002	"temp"	56
0003	"temp"	55

And the following join query:

join(
  tables: {sf: SF_Temp, ny: NY_Temp},
  on: ["_time", "_field"]
)

The output will be:

_time	_field	_value_ny	_value_sf
0001	"temp"	55	70
0002	"temp"	56	75
0003	"temp"	55	72

Cross-measurement join

data_1 = from(bucket:"example-bucket")
  |> range(start:-15m)
  |> filter(fn: (r) =>
    r._measurement == "cpu" and
    r._field == "usage_system"
  )

data_2 = from(bucket:"example-bucket")
  |> range(start:-15m)
  |> filter(fn: (r) =>
    r._measurement == "mem" and
    r._field == "used_percent"
  )

join(
  tables: {d1: data_1, d2: data_2},
  on: ["_time", "host"]
)

3.1 KiB Raw Blame History