Merge pull request #764 from influxdata/js-transpiler-pivot

docs: update the transpiler docs to use pivot
Jonathan A. Sternberg 2018-09-04 09:44:06 -05:00 committed by GitHub
commit 1776778a06
1 changed file with 28 additions and 18 deletions


@@ -7,10 +7,10 @@ The InfluxQL Transpiler exists to rewrite an InfluxQL query into its equivalent

1. [Identify the cursors](#identify-cursors)
2. [Identify the query type](#identify-query-type)
3. [Group the cursors](#group-cursors)
4. [Create the cursors for each group](#create-groups)
    1. [Create cursor](#create-cursor)
    2. [Filter by measurement and fields](#filter-cursor)
    3. [Generate the pivot table](#generate-pivot-table)
    4. [Evaluate the condition](#evaluate-condition)
    5. [Perform the grouping](#perform-grouping)
    6. [Evaluate the function](#evaluate-function)
@@ -41,32 +41,42 @@ We group the cursors based on the query type. For raw queries and selectors, all

We create the cursors within each group. This process is repeated for every group.
### <a name="create-cursor"></a> Create cursor
The cursor is generated using the following template:
    create_cursor = (db, rp="autogen", start, stop=now()) => from(bucket: db+"/"+rp)
        |> range(start: start, stop: stop)
This is called once per group.
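For illustration, here is a minimal sketch of what this template expands to for a query against the `telegraf` database over the last five minutes (the database name, retention policy, and time range are illustrative):

    // cursor for db = "telegraf", rp = "autogen", start = -5m (illustrative values)
    from(bucket: "telegraf/autogen")
        |> range(start: -5m, stop: now())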
### <a name="identify-variables"></a> Identify the variables ### <a name="identify-variables"></a> Identify the variables
Each of the variables in the group are identified. This involves inspecting the condition to collect the common variables in the expression while also retrieving the variables for each expression within the group. For a function call, this retrieves the variable used as a function argument rather than the function itself. Each of the variables in the group are identified. This involves inspecting the condition to collect the common variables in the expression while also retrieving the variables for each expression within the group. For a function call, this retrieves the variable used as a function argument rather than the function itself.
If a wildcard is identified, then the schema must be consulted for all of the fields and tags. If there is a wildcard in the dimensions (the group by clause), then the dimensions are excluded from the field expansion. If there is a specific listing of dimensions in the grouping, then those specific tags are excluded. If a wildcard is identified in the fields, then the field filter is cleared and only the measurement filter is used. If a regex wildcard is identified, it is added as one of the field filters.
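As a sketch of how these rules interact (the query, measurement, and tag names are only illustrative), a query like `SELECT * FROM cpu GROUP BY host` consults the schema for the fields of `cpu`, excludes the `host` tag from the expansion, and, because the fields come from a star wildcard, keeps only the measurement filter in the filter template described below:

    // star wildcard in the fields: the field filter is cleared,
    // leaving only the measurement filter on the cursor
    from(bucket: "telegraf/autogen")
        |> range(start: -5m, stop: now())
        |> filter(fn: (r) => r._measurement == "cpu")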
### <a name="generate-cursor"></a> Generate each cursor ### <a name="filter-cursor"></a> Filter by measurement and fields
The base cursor for each variable is generated using the following template: A filter expression is generated by using the measurement and the fields that were identified. It follows this template:
create_cursor = (db, rp="autogen", start, stop=now(), m, f) => from(bucket: db+"/"+rp) ... |> filter(fn: (r) => r._measurement == <measurement> and <field_expr>)
|> range(start: start, stop: stop)
|> filter(fn: (r) => r._measurement == m and r._field == f)
### <a name="join-cursors"></a> Join the cursors The `<measurement>` is equal to the measurement name from the `FROM` clause. The `<field_expr>` section is generated differently depending on the fields that were found. If more than one field was selected, then each of the field filters is combined by using `or` and the expression itself is surrounded by parenthesis. For a non-wildcard field, the following expression is used:
After creating the base cursors, each of them is joined into a single stream using an `inner_join`. r._field == <name>
**TODO(jsternberg):** Raw queries need to evaluate `fill()` at this stage while selectors and aggregates should not. For a regex wildcard, the following is used:
> SELECT usage_user, usage_system FROM telegraf..cpu WHERE time >= now() - 5m r._field =~ <regex>
val1 = create_cursor(bucket: "telegraf/autogen", start: -5m, m: "cpu", f: "usage_user")
val1 = create_cursor(bucket: "telegraf/autogen", start: -5m, m: "cpu", f: "usage_system")
inner_join(tables: {val1: val1, val2: val2}, except: ["_field"], fn: (tables) => {val1: tables.val1, val2: tables.val2})
If there is only one cursor, then nothing needs to be done. If a star wildcard was used, the `<field_expr>` is omitted from the filter expression.
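Putting the filter rules together, a sketch of the generated filter for a two-field query such as `SELECT usage_user, usage_system FROM telegraf..cpu WHERE time >= now() - 5m` (query and names are illustrative) might look like:

    // two non-wildcard fields, combined with `or` and wrapped in parentheses
    from(bucket: "telegraf/autogen")
        |> range(start: -5m, stop: now())
        |> filter(fn: (r) => r._measurement == "cpu" and (r._field == "usage_user" or r._field == "usage_system"))
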
### <a name="generate-pivot-table"></a> Generate the pivot table
If there was more than one field selected or if one of the fields was some form of wildcard, a pivot expression is generated.
    ... |> pivot(rowKey: ["_time"], colKey: ["_field"], valueCol: "_value")
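Continuing the same illustrative two-field query, the cursor for the group would then end with the pivot call shown above (a sketch, not the transpiler's verbatim output):

    // pivot collapses the per-field rows into one row per _time,
    // with usage_user and usage_system as columns
    from(bucket: "telegraf/autogen")
        |> range(start: -5m, stop: now())
        |> filter(fn: (r) => r._measurement == "cpu" and (r._field == "usage_user" or r._field == "usage_system"))
        |> pivot(rowKey: ["_time"], colKey: ["_field"], valueCol: "_value")
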
### <a name="evaluate-condition"></a> Evaluate the condition ### <a name="evaluate-condition"></a> Evaluate the condition