From a45bd08cf6fbe61dfda1c1a9a5774d37e036fbd3 Mon Sep 17 00:00:00 2001
From: "Jonathan A. Sternberg"
Date: Fri, 31 Aug 2018 16:00:52 -0500
Subject: [PATCH] docs: update the transpiler docs to use pivot

Instead of generating multiple cursors, a pivot is used to join fields
within the same series. This should be easier than generating a new
cursor for everything.
---
 query/influxql/README.md | 46 ++++++++++++++++++++++++----------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/query/influxql/README.md b/query/influxql/README.md
index f80d2be99c..8e6a562f35 100644
--- a/query/influxql/README.md
+++ b/query/influxql/README.md
@@ -7,10 +7,10 @@ The InfluxQL Transpiler exists to rewrite an InfluxQL query into its equivalent
 1. [Identify the cursors](#identify-cursors)
 2. [Identify the query type](#identify-query-type)
 3. [Group the cursors](#group-cursors)
-4. [Create the cursors for each group](#group-cursors)
-    1. [Identify the variables](#identify-variables)
-    2. [Generate each cursor](#generate-cursor)
-    3. [Join the cursors](#join-cursors)
+4. [Create the cursors for each group](#create-groups)
+    1. [Create cursor](#create-cursor)
+    2. [Filter by measurement and fields](#filter-cursor)
+    3. [Generate the pivot table](#generate-pivot-table)
     4. [Evaluate the condition](#evaluate-condition)
     5. [Perform the grouping](#perform-grouping)
     6. [Evaluate the function](#evaluate-function)
@@ -41,32 +41,42 @@ We group the cursors based on the query type. For raw queries and selectors, all
 
 We create the cursors within each group. This process is repeated for every group.
 
+### Create cursor
+
+The cursor is generated using the following template:
+
+    create_cursor = (db, rp="autogen", start, stop=now()) => from(bucket: db+"/"+rp)
+        |> range(start: start, stop: stop)
+
+This is called once per group.
+
 ### Identify the variables
 
 Each of the variables in the group are identified. This involves inspecting the condition to collect the common variables in the expression while also retrieving the variables for each expression within the group. For a function call, this retrieves the variable used as a function argument rather than the function itself.
 
-If a wildcard is identified, then the schema must be consulted for all of the fields and tags. If there is a wildcard in the dimensions (the group by clause), then the dimensions are excluded from the field expansion. If there is a specific listing of dimensions in the grouping, then those specific tags are excluded.
+If a wildcard is identified in the fields, then the field filter is cleared and only the measurement filter is used. If a regex wildcard is identified, it is added as one of the field filters.
 
-### Generate each cursor
+### Filter by measurement and fields
 
-The base cursor for each variable is generated using the following template:
+A filter expression is generated by using the measurement and the fields that were identified. It follows this template:
 
-    create_cursor = (db, rp="autogen", start, stop=now(), m, f) => from(bucket: db+"/"+rp)
-        |> range(start: start, stop: stop)
-        |> filter(fn: (r) => r._measurement == m and r._field == f)
+    ... |> filter(fn: (r) => r._measurement == and )
 
-### Join the cursors
+The `` is equal to the measurement name from the `FROM` clause. The `` section is generated differently depending on the fields that were found. If more than one field was selected, then each of the field filters is combined by using `or` and the expression itself is surrounded by parenthesis. For a non-wildcard field, the following expression is used:
 
-After creating the base cursors, each of them is joined into a single stream using an `inner_join`.
+    r._field ==
 
-**TODO(jsternberg):** Raw queries need to evaluate `fill()` at this stage while selectors and aggregates should not.
+For a regex wildcard, the following is used:
 
-    > SELECT usage_user, usage_system FROM telegraf..cpu WHERE time >= now() - 5m
-    val1 = create_cursor(bucket: "telegraf/autogen", start: -5m, m: "cpu", f: "usage_user")
-    val1 = create_cursor(bucket: "telegraf/autogen", start: -5m, m: "cpu", f: "usage_system")
-    inner_join(tables: {val1: val1, val2: val2}, except: ["_field"], fn: (tables) => {val1: tables.val1, val2: tables.val2})
+    r._field =~
 
-If there is only one cursor, then nothing needs to be done.
+If a star wildcard was used, the `` is omitted from the filter expression.
+
+### Generate the pivot table
+
+If there was more than one field selected or if one of the fields was some form of wildcard, a pivot expression is generated.
+
+    ... |> pivot(rowKey: ["_time"], colKey: ["_field"], valueCol: "_value")
 
 ### Evaluate the condition