5143-Add Optimize Queries page with query analysis help (#5165)
* chore(test): Use my python client fork (pending approval) to allow custom headers.
* feat(query): Add Optimize Queries page with query analysis help
  - Closes "Client library query traces: Python" (#5143)
  - Dedicated and Clustered examples for enabling query tracing and extracting headers
  - system.queries table
  - EXPLAIN and ANALYZE
  - For now, skip tests for sample Flight responses until we add in code samples.
* Review edits to the Cloud Dedicated and Clustered optimize-queries.md pages
* feat(v3): influx-trace-id for dedicated; tracing not ready for clustered (#5143)

Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>
parent 6be4bbd3bc
commit 5ad8e80361
@@ -25,6 +25,7 @@ related:
  - /influxdb/cloud-dedicated/query-data/sql/
  - /influxdb/cloud-dedicated/reference/influxql/
  - /influxdb/cloud-dedicated/reference/sql/
  - /influxdb/cloud-dedicated/query-data/execute-queries/troubleshoot/
list_code_example: |

  ```py
@@ -305,7 +306,7 @@ and specify the following arguments:

#### Example {#execute-query-example}

The following example shows how to use SQL or InfluxQL to select all fields in a measurement, and then use PyArrow functions to extract metadata and aggregate data.

{{% code-tabs-wrapper %}}
{{% code-tabs %}}
@@ -0,0 +1,442 @@
---
title: Optimize queries
description: >
  Optimize your SQL and InfluxQL queries to improve performance and reduce their memory and compute (CPU) requirements.
weight: 401
menu:
  influxdb_cloud_dedicated:
    name: Optimize queries
    parent: Execute queries
influxdb/cloud-dedicated/tags: [query, sql, influxql]
related:
  - /influxdb/cloud-dedicated/query-data/sql/
  - /influxdb/cloud-dedicated/query-data/influxql/
  - /influxdb/cloud-dedicated/query-data/execute-queries/troubleshoot/
  - /influxdb/cloud-dedicated/reference/client-libraries/v3/
---

Use the following tools to help you identify performance bottlenecks and troubleshoot problems in queries:

<!-- TOC -->

- [EXPLAIN and ANALYZE](#explain-and-analyze)
- [Enable trace logging](#enable-trace-logging)
  - [Avoid unnecessary tracing](#avoid-unnecessary-tracing)
  - [Syntax](#syntax)
  - [Example](#example)
  - [Tracing response header](#tracing-response-header)
    - [Trace response header syntax](#trace-response-header-syntax)
  - [Inspect Flight response headers](#inspect-flight-response-headers)
- [Retrieve query information](#retrieve-query-information)

<!-- /TOC -->

## EXPLAIN and ANALYZE

To view the query engine's execution plan and metrics for a SQL or InfluxQL query, prepend [`EXPLAIN`](/influxdb/cloud-dedicated/reference/sql/explain/) or [`EXPLAIN ANALYZE`](/influxdb/cloud-dedicated/reference/sql/explain/#explain-analyze) to the query.
The report can reveal query bottlenecks, such as a large number of table scans or Parquet files, and can help answer the triage question, "Is the query slow due to the amount of work required, or due to a problem with the schema, compactor, etc.?"

The following example shows how to use the InfluxDB v3 Python client library and pandas to view `EXPLAIN` and `EXPLAIN ANALYZE` results for a query:

<!-- Import for tests and hide from users.

```python
import os
```

-->

{{% code-placeholders "DATABASE_(NAME|TOKEN)" %}}
<!--pytest-codeblocks:cont-->

```python
from influxdb_client_3 import InfluxDBClient3
import pandas as pd
import tabulate  # Required for pandas.to_markdown()

# Instantiate an InfluxDB client.
client = InfluxDBClient3(token="DATABASE_TOKEN",
                         host="{{< influxdb/host >}}",
                         database="DATABASE_NAME")

sql_explain = '''EXPLAIN
                 SELECT temp
                 FROM home
                 WHERE time >= now() - INTERVAL '90 days'
                 AND room = 'Kitchen'
                 ORDER BY time'''

table = client.query(sql_explain)
df = table.to_pandas()
print(df.to_markdown(index=False))

assert df.shape == (2, 2), f"Expect 2 rows and 2 columns; got {df.shape}"
assert 'physical_plan' in df.plan_type.values, "Expect physical_plan"
assert 'logical_plan' in df.plan_type.values, "Expect logical_plan"
```

{{< expand-wrapper >}}
{{% expand "View EXPLAIN example results" %}}

| plan_type     | plan |
|:--------------|:-----|
| logical_plan  | Projection: home.temp |
|               | Sort: home.time ASC NULLS LAST |
|               | Projection: home.temp, home.time |
|               | TableScan: home projection=[room, temp, time], full_filters=[home.time >= TimestampNanosecond(1688676582918581320, None), home.room = Dictionary(Int32, Utf8("Kitchen"))] |
| physical_plan | ProjectionExec: expr=[temp@0 as temp] |
|               | SortExec: expr=[time@1 ASC NULLS LAST] |
|               | EmptyExec: produce_one_row=false |

{{% /expand %}}
{{< /expand-wrapper >}}

<!--pytest-codeblocks:cont-->

```python
sql_explain_analyze = '''EXPLAIN ANALYZE
                         SELECT *
                         FROM home
                         WHERE time >= now() - INTERVAL '90 days'
                         ORDER BY time'''

table = client.query(sql_explain_analyze)
df = table.to_pandas()
print(df.to_markdown(index=False))

assert df.shape == (1, 2)
assert 'Plan with Metrics' in df.plan_type.values, "Expect plan metrics"

client.close()
```

{{% /code-placeholders %}}

Replace the following:

- {{% code-placeholder-key %}}`DATABASE_NAME`{{% /code-placeholder-key %}}: your {{% product-name %}} database
- {{% code-placeholder-key %}}`DATABASE_TOKEN`{{% /code-placeholder-key %}}: a [database token](/influxdb/cloud-dedicated/admin/tokens/) with sufficient permissions to the specified database

{{< expand-wrapper >}}
{{% expand "View EXPLAIN ANALYZE example results" %}}

| plan_type         | plan |
|:------------------|:-----|
| Plan with Metrics | ProjectionExec: expr=[temp@0 as temp], metrics=[output_rows=0, elapsed_compute=1ns] |
|                   | SortExec: expr=[time@1 ASC NULLS LAST], metrics=[output_rows=0, elapsed_compute=1ns, spill_count=0, spilled_bytes=0] |
|                   | EmptyExec: produce_one_row=false, metrics=[] |

{{% /expand %}}
{{< /expand-wrapper >}}

## Enable trace logging

When you enable trace logging for a query, InfluxDB propagates your _trace ID_ through system processes and collects additional log information.

InfluxDB Support can then use the trace ID that you provide to filter, collate, and analyze log information for the query run.
The tracing system follows the [OpenTelemetry traces](https://opentelemetry.io/docs/concepts/signals/traces/) model for providing observability into a request.

{{% warn %}}
#### Avoid unnecessary tracing

Only enable tracing for a query when you need to request troubleshooting help from InfluxDB Support.
To manage resources, InfluxDB has an upper limit for the number of trace requests.
Too many traces can cause InfluxDB to evict log information.
{{% /warn %}}

To enable tracing for a query, include the `influx-trace-id` header in your query request.

### Syntax

Use the following syntax for the `influx-trace-id` header:

```http
influx-trace-id: TRACE_ID:1112223334445:0:1
```

In the header value, replace the following:

- `TRACE_ID`: a unique string, 8-16 bytes long, encoded as hexadecimal (32 maximum hex characters).
  The trace ID should uniquely identify the query run.
- `:1112223334445:0:1`: InfluxDB constant values (required, but ignored)
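As a quick standalone sketch, the header value can be built with only the Python standard library, using the same `secrets` approach as the examples on this page. `make_influx_trace_value` is a hypothetical helper name, not part of any client library:

```python
import secrets

def make_influx_trace_value(num_bytes: int = 8) -> str:
    """Build a conforming influx-trace-id header value.

    The trace ID is a random 8-16 byte value encoded as hexadecimal;
    the trailing fields are the constant values InfluxDB requires
    but ignores.
    """
    if not 8 <= num_bytes <= 16:
        raise ValueError("trace IDs must be 8-16 bytes long")
    trace_id = secrets.token_bytes(num_bytes).hex()
    return f"{trace_id}:1112223334445:0:1"

trace_value = make_influx_trace_value()
print(trace_value)
```

Store the generated trace ID so you can provide it to InfluxDB Support later.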
|
||||||
|
|
||||||
|
### Example
|
||||||
|
|
||||||
|
The following examples show how to create and pass a trace ID to enable query tracing in InfluxDB:
|
||||||
|
|
||||||
|
{{< tabs-wrapper >}}
|
||||||
|
{{% tabs %}}
|
||||||
|
[Python with FlightCallOptions](#)
|
||||||
|
[Python with FlightClientMiddleware](#python-with-flightclientmiddleware)
|
||||||
|
{{% /tabs %}}
|
||||||
|
{{% tab-content %}}
|
||||||
|
<!---- BEGIN PYTHON WITH FLIGHTCALLOPTIONS ---->
|
||||||
|
Use the `InfluxDBClient3` InfluxDB Python client and pass the `headers` argument in the
|
||||||
|
`query()` method.
|
||||||
|
|
||||||
|
<!-- Import for tests and hide from users.
|
||||||
|
```python
|
||||||
|
import os
|
||||||
|
```
|
||||||
|
-->
|
||||||
|
|
||||||
|
{{% code-placeholders "DATABASE_(NAME|TOKEN)|APP_REQUEST_ID" %}}
|
||||||
|
|
||||||
|
<!--pytest-codeblocks:cont-->
|
||||||
|
|
||||||
|
```python
|
||||||
|
from influxdb_client_3 import InfluxDBClient3
|
||||||
|
import secrets
|
||||||
|
|
||||||
|
def use_flightcalloptions_trace_header():
|
||||||
|
print('# Use FlightCallOptions to enable tracing.')
|
||||||
|
client = InfluxDBClient3(token=f"DATABASE_TOKEN",
|
||||||
|
host=f"{{< influxdb/host >}}",
|
||||||
|
database=f"DATABASE_NAME")
|
||||||
|
|
||||||
|
# Generate a trace ID for the query:
|
||||||
|
# 1. Generate a random 8-byte value as bytes.
|
||||||
|
# 2. Encode the value as hexadecimal.
|
||||||
|
random_bytes = secrets.token_bytes(8)
|
||||||
|
trace_id = random_bytes.hex()
|
||||||
|
|
||||||
|
# Append required constants to the trace ID.
|
||||||
|
trace_value = f"{trace_id}:1112223334445:0:1"
|
||||||
|
|
||||||
|
# Encode the header key and value as bytes.
|
||||||
|
# Create a list of header tuples.
|
||||||
|
headers = [((b"influx-trace-id", trace_value.encode('utf-8')))]
|
||||||
|
|
||||||
|
sql = "SELECT * FROM home WHERE time >= now() - INTERVAL '30 days'"
|
||||||
|
influxql = "SELECT * FROM home WHERE time >= -90d"
|
||||||
|
|
||||||
|
# Use the query() headers argument to pass the list as FlightCallOptions.
|
||||||
|
client.query(sql, headers=headers)
|
||||||
|
|
||||||
|
client.close()
|
||||||
|
|
||||||
|
use_flightcalloptions_trace_header()
|
||||||
|
```
|
||||||
|
|
||||||
|
{{% /code-placeholders %}}
|
||||||
|
<!---- END PYTHON WITH FLIGHTCALLOPTIONS ---->
|
||||||
|
{{% /tab-content %}}
|
||||||
|
{{% tab-content %}}
|
||||||
|
<!---- BEGIN PYTHON WITH MIDDLEWARE ---->
|
||||||
|
Use the `InfluxDBClient3` InfluxDB Python client and `flight.ClientMiddleware` to pass and inspect headers.
|
||||||
|
|
||||||
|
### Tracing response header
|
||||||
|
|
||||||
|
With tracing enabled and a valid trace ID in the request, InfluxDB's `DoGet` action response contains a header with the trace ID that you sent.
|
||||||
|
|
||||||
|
#### Trace response header syntax
|
||||||
|
|
||||||
|
```http
|
||||||
|
trace-id: TRACE_ID
|
||||||
|
```
|
||||||
|
|
||||||
|
### Inspect Flight response headers
|
||||||
|
|
||||||
|
To inspect Flight response headers when using a client library, pass a `FlightClientMiddleware` instance.
|
||||||
|
that defines a middleware callback function for the `onHeadersReceived` event (the particular function name you use depends on the client library language).
|
||||||
|
|
||||||
|
The following example uses Python client middleware that adds request headers and extracts the trace ID from the `DoGet` response headers:
|
||||||
|
|
||||||
|
<!-- Import for tests and hide from users.
|
||||||
|
```python
|
||||||
|
import os
|
||||||
|
```
|
||||||
|
-->
|
||||||
|
|
||||||
|
{{% code-placeholders "DATABASE_(NAME|TOKEN)|APP_REQUEST_ID" %}}
|
||||||
|
|
||||||
|
<!--pytest-codeblocks:cont-->
|
||||||
|
|
||||||
|
```python
|
||||||
|
import pyarrow.flight as flight
|
||||||
|
|
||||||
|
class TracingClientMiddleWareFactory(flight.ClientMiddleware):
|
||||||
|
# Defines a custom middleware factory that returns a middleware instance.
|
||||||
|
def __init__(self):
|
||||||
|
self.request_headers = []
|
||||||
|
self.response_headers = []
|
||||||
|
self.traces = []
|
||||||
|
|
||||||
|
def addRequestHeader(self, header):
|
||||||
|
self.request_headers.append(header)
|
||||||
|
|
||||||
|
def addResponseHeader(self, header):
|
||||||
|
self.response_headers.append(header)
|
||||||
|
|
||||||
|
def addTrace(self, traceid):
|
||||||
|
self.traces.append(traceid)
|
||||||
|
|
||||||
|
def createTrace(self, traceid):
|
||||||
|
# Append InfluxDB constants to the trace ID.
|
||||||
|
trace = f"{traceid}:1112223334445:0:1"
|
||||||
|
|
||||||
|
# To the list of request headers,
|
||||||
|
# add a tuple with the header key and value as bytes.
|
||||||
|
self.addRequestHeader((b"influx-trace-id", trace.encode('utf-8')))
|
||||||
|
|
||||||
|
def start_call(self, info):
|
||||||
|
return TracingClientMiddleware(info.method, self)
|
||||||
|
|
||||||
|
class TracingClientMiddleware(flight.ClientMiddleware):
|
||||||
|
# Defines middleware with client event callback methods.
|
||||||
|
def __init__(self, method, callback_obj):
|
||||||
|
self._method = method
|
||||||
|
self.callback = callback_obj
|
||||||
|
|
||||||
|
def call_completed(self, exception):
|
||||||
|
print('callback: call_completed')
|
||||||
|
if(exception):
|
||||||
|
print(f" ...with exception: {exception}")
|
||||||
|
|
||||||
|
def sending_headers(self):
|
||||||
|
print('callback: sending_headers: ', self.callback.request_headers)
|
||||||
|
if len(self.callback.request_headers) > 0:
|
||||||
|
return dict(self.callback.request_headers)
|
||||||
|
|
||||||
|
def received_headers(self, headers):
|
||||||
|
self.callback.addResponseHeader(headers)
|
||||||
|
# For the DO_GET action, extract the trace ID from the response headers.
|
||||||
|
if str(self._method) == "FlightMethod.DO_GET" and "trace-id" in headers:
|
||||||
|
trace_id = headers["trace-id"][0]
|
||||||
|
self.callback.addTrace(trace_id)
|
||||||
|
|
||||||
|
from influxdb_client_3 import InfluxDBClient3
|
||||||
|
import secrets
|
||||||
|
|
||||||
|
def use_middleware_trace_header():
|
||||||
|
print('# Use Flight client middleware to enable tracing.')
|
||||||
|
|
||||||
|
# Instantiate the middleware.
|
||||||
|
res = TracingClientMiddleWareFactory()
|
||||||
|
|
||||||
|
# Instantiate the client, passing in the middleware instance that provides
|
||||||
|
# event callbacks for the request.
|
||||||
|
client = InfluxDBClient3(token=f"DATABASE_TOKEN",
|
||||||
|
host=f"{{< influxdb/host >}}",
|
||||||
|
database=f"DATABASE_NAME",
|
||||||
|
flight_client_options={"middleware": (res,)})
|
||||||
|
|
||||||
|
# Generate a trace ID for the query:
|
||||||
|
# 1. Generate a random 8-byte value as bytes.
|
||||||
|
# 2. Encode the value as hexadecimal.
|
||||||
|
random_bytes = secrets.token_bytes(8)
|
||||||
|
trace_id = random_bytes.hex()
|
||||||
|
|
||||||
|
res.createTrace(trace_id)
|
||||||
|
|
||||||
|
sql = "SELECT * FROM home WHERE time >= now() - INTERVAL '30 days'"
|
||||||
|
|
||||||
|
client.query(sql)
|
||||||
|
client.close()
|
||||||
|
assert trace_id in res.traces[0], "Expect trace ID in DoGet response."
|
||||||
|
|
||||||
|
use_middleware_trace_header()
|
||||||
|
```
|
||||||
|
{{% /code-placeholders %}}
|
||||||
|
<!---- END PYTHON WITH MIDDLEWARE ---->
|
||||||
|
{{% /tab-content %}}
|
||||||
|
{{< /tabs-wrapper >}}

Replace the following:

- {{% code-placeholder-key %}}`DATABASE_NAME`{{% /code-placeholder-key %}}: your {{% product-name %}} database
- {{% code-placeholder-key %}}`DATABASE_TOKEN`{{% /code-placeholder-key %}}: a [database token](/influxdb/cloud-dedicated/admin/tokens/) with sufficient permissions to the specified database

{{% note %}}
Store or log your query trace ID to ensure you can provide it to InfluxDB Support for troubleshooting.
{{% /note %}}

After you run your query with tracing enabled, do the following:

- Remove the tracing header from subsequent runs of the query (to [avoid unnecessary tracing](#avoid-unnecessary-tracing)).
- Provide the trace ID in a request to InfluxDB Support.

## Retrieve query information

In addition to the SQL standard `information_schema`, {{% product-name %}} contains _system_ tables that provide access to InfluxDB-specific information.
The information in each system table is scoped to the namespace you're querying;
you can only retrieve system information for that particular instance.

To get information about queries you've run on the current instance, use SQL to query the [`system.queries` table](/influxdb/cloud-dedicated/reference/internals/system-tables/#systemqueries-measurement), which contains information from the querier instance currently handling queries.
If you [enabled trace logging for the query](#enable-trace-logging), the trace ID appears in the `system.queries.trace_id` column for the query.

The `system.queries` table is an InfluxDB v3 **debug feature**.
To enable the feature and query `system.queries`, include an `"iox-debug"` header set to `"true"` and use SQL to query the table.

The following sample code shows how to use the Python client library to do the following:

1. Enable tracing for a query.
2. Retrieve the trace ID record from `system.queries`.

<!-- Import for tests and hide from users.

```python
import os
```

-->

{{% code-placeholders "DATABASE_(NAME|TOKEN)|APP_REQUEST_ID" %}}

<!--pytest-codeblocks:cont-->

```python
from influxdb_client_3 import InfluxDBClient3
import secrets
import pandas

def get_query_information():
    print('# Get query information')

    client = InfluxDBClient3(token="DATABASE_TOKEN",
                             host="{{< influxdb/host >}}",
                             database="DATABASE_NAME")

    random_bytes = secrets.token_bytes(16)
    trace_id = random_bytes.hex()
    trace_value = (f"{trace_id}:1112223334445:0:1").encode('utf-8')
    sql = "SELECT * FROM home WHERE time >= now() - INTERVAL '30 days'"

    try:
        client.query(sql, headers=[(b'influx-trace-id', trace_value)])
        client.close()
    except Exception as e:
        print("Query error: ", e)

    client = InfluxDBClient3(token="DATABASE_TOKEN",
                             host="{{< influxdb/host >}}",
                             database="DATABASE_NAME")

    import time
    df = pandas.DataFrame()

    for i in range(0, 5):
        time.sleep(1)
        # To query the system.queries table for your trace ID, pass the following:
        # - the iox-debug: true request header
        # - a SQL query for the trace_id column
        reader = client.query(f'''SELECT compute_duration, query_type, query_text,
                              success, trace_id
                              FROM system.queries
                              WHERE issue_time >= now() - INTERVAL '1 day'
                              AND trace_id = '{trace_id}'
                              ORDER BY issue_time DESC
                              ''',
                              headers=[(b"iox-debug", b"true")],
                              mode="reader")

        df = reader.read_all().to_pandas()
        if df.shape[0]:
            break

    assert df.shape == (1, 5), "Expect a row for the query trace ID."
    print(df)

get_query_information()
```

{{% /code-placeholders %}}

The output is similar to the following:

```text
  compute_duration query_type                        query_text  success  trace_id
0           0 days        sql  SELECT compute_duration, quer...     True  67338...
```
@@ -24,6 +24,7 @@ Learn how to handle responses and troubleshoot errors encountered when querying

- [Internal Error: Received RST_STREAM](#internal-error-received-rst_stream)
- [Internal Error: stream terminated by RST_STREAM with NO_ERROR](#internal-error-stream-terminated-by-rst_stream-with-no_error)
- [Invalid Argument: Invalid ticket](#invalid-argument-invalid-ticket)
- [Timeout: Deadline exceeded](#timeout-deadline-exceeded)
- [Unauthenticated: Unauthenticated](#unauthenticated-unauthenticated)
- [Unauthorized: Permission denied](#unauthorized-permission-denied)
- [FlightUnavailableError: Could not get default pem root certs](#flightunavailableerror-could-not-get-default-pem-root-certs)
@@ -80,7 +81,8 @@ SELECT co, delete, hum, room, temp, time

The Python client library outputs the following schema representation:

<!--pytest.mark.skip-->
```python
Schema:
  co: int64
  -- field metadata --
```
@@ -175,7 +177,7 @@ _For a list of gRPC codes that servers and clients may return, see [Status codes

**Example**:

```structuredtext
Flight returned internal error, with message: Received RST_STREAM with error code 2. gRPC client debug context: UNKNOWN:Error received from peer ipv4:34.196.233.7:443 {grpc_message:"Received RST_STREAM with error code 2"}
```
@@ -192,11 +194,12 @@ Flight returned internal error, with message: Received RST_STREAM with error cod

**Example**:

<!--pytest.mark.skip-->
```sh
pyarrow._flight.FlightInternalError: Flight returned internal error, with message: stream terminated by RST_STREAM with error code: NO_ERROR. gRPC client debug context: UNKNOWN:Error received from peer ipv4:3.123.149.45:443 {created_time:"2023-07-26T14:12:44.992317+02:00", grpc_status:13, grpc_message:"stream terminated by RST_STREAM with error code: NO_ERROR"}. Client context: OK
```

**Potential reasons**:

- The server terminated the stream, but there wasn't any specific error associated with it.
- Possible network disruption, even if it's temporary.
@@ -208,21 +211,35 @@ pyarrow._flight.FlightInternalError: Flight returned internal error, with messag

**Example**:

<!--pytest.mark.skip-->
```sh
pyarrow.lib.ArrowInvalid: Flight returned invalid argument error, with message: Invalid ticket. Error: Invalid ticket. gRPC client debug context: UNKNOWN:Error received from peer ipv4:54.158.68.83:443 {created_time:"2023-08-31T17:56:42.909129-05:00", grpc_status:3, grpc_message:"Invalid ticket. Error: Invalid ticket"}. Client context: IOError: Server never sent a data message. Detail: Internal
```

**Potential reasons**:

- The request is missing the database name or some other required metadata value.
- The request contains bad query syntax.

<!-- END -->

#### Timeout: Deadline exceeded

**Example**:

<!--pytest.mark.skip-->
```sh
pyarrow._flight.FlightTimedOutError: Flight returned timeout error, with message: Deadline Exceeded. gRPC client debug context: UNKNOWN:Deadline Exceeded {grpc_status:4, created_time:"2023-09-27T15:30:58.540385-05:00"}. Client context: IOError: Server never sent a data message. Detail: Internal
```

**Potential reasons**:

- The server's response time exceeded the number of seconds allowed by the client.
  See how to specify `timeout` in [FlightCallOptions](https://arrow.apache.org/docs/python/generated/pyarrow.flight.FlightCallOptions.html#pyarrow.flight.FlightCallOptions).

#### Unauthenticated: Unauthenticated

**Example**:

<!--pytest.mark.skip-->
```sh
Flight returned unauthenticated error, with message: unauthenticated. gRPC client debug context: UNKNOWN:Error received from peer ipv4:34.196.233.7:443 {grpc_message:"unauthenticated", grpc_status:16, created_time:"2023-08-28T15:38:33.380633-05:00"}. Client context: IOError: Server never sent a data message. Detail: Internal
```
@@ -238,6 +255,7 @@ Flight returned unauthenticated error, with message: unauthenticated. gRPC clien

**Example**:

<!--pytest.mark.skip-->
```sh
pyarrow._flight.FlightUnauthorizedError: Flight returned unauthorized error, with message: Permission denied. gRPC client debug context: UNKNOWN:Error received from peer ipv4:54.158.68.83:443 {grpc_message:"Permission denied", grpc_status:7, created_time:"2023-08-31T17:51:08.271009-05:00"}. Client context: IOError: Server never sent a data message. Detail: Internal
```
@@ -254,6 +272,7 @@ pyarrow._flight.FlightUnauthorizedError: Flight returned unauthorized error, wit

If unable to locate a root certificate for _gRPC+TLS_, the Flight client returns errors similar to the following:

<!--pytest.mark.skip-->
```sh
UNKNOWN:Failed to load file... filename:"/usr/share/grpc/roots.pem",
children:[UNKNOWN:No such file or directory
```
@@ -16,34 +16,68 @@ related:

InfluxDB system measurements contain time series data used by and generated from the
InfluxDB internal monitoring system.

Each {{% product-name %}} namespace includes the following system measurements:

<!-- TOC -->

- [system.queries measurement](#systemqueries-measurement)
  - [system.queries schema](#systemqueries-schema)

## system.queries measurement

The `system.queries` measurement stores log entries for queries executed for the provided namespace (database) on the node that is currently handling queries.
The following example shows how to list queries recorded in the `system.queries` measurement:

```python
from influxdb_client_3 import InfluxDBClient3

client = InfluxDBClient3(token=DATABASE_TOKEN,
                         host=HOSTNAME,
                         org='',
                         database=DATABASE_NAME)

client.query('select * from home')

reader = client.query('''
    SELECT *
    FROM system.queries
    WHERE issue_time >= now() - INTERVAL '1 day'
      AND query_text LIKE '%select * from home%'
    ''',
    language='sql',
    headers=[(b"iox-debug", b"true")],
    mode="reader")

print("# system.queries schema\n")
print(reader.schema)
```
<!--pytest-codeblocks:expected-output-->

`system.queries` has the following schema:

```python
# system.queries schema

issue_time: timestamp[ns] not null
query_type: string not null
query_text: string not null
completed_duration: duration[ns]
success: bool not null
trace_id: string
```
_When listing measurements (tables) available within a namespace, some clients and query tools may include the `queries` table in the list of namespace tables._

`system.queries` reflects a process-local, in-memory, namespace-scoped query log.
The query log isn't shared across instances within the same deployment.
While this table may be useful for debugging and monitoring queries, keep the following in mind:

- Records stored in `system.queries` are volatile.
- Records are lost on pod restarts.
- Queries for one namespace can evict records from another namespace.
- Data reflects the state of a specific pod answering queries for the namespace. The log view is scoped to the requesting namespace, and queries aren't leaked across namespaces.
- A query for records in `system.queries` can return different results depending on the pod the request was routed to.

**Data retention:** System data can be transient and is deleted on pod restarts.
The log size per instance is limited and the log view is scoped to the requesting namespace.

### system.queries schema

- **system.queries** _(measurement)_
  - **fields**:
@@ -26,6 +26,7 @@ related:

  - /influxdb/cloud-serverless/query-data/sql/
  - /influxdb/cloud-serverless/reference/influxql/
  - /influxdb/cloud-serverless/reference/sql/
  - /influxdb/cloud-serverless/query-data/execute-queries/troubleshoot/

list_code_example: |
  ```py

@@ -33,7 +34,7 @@ list_code_example: |

  # Instantiate an InfluxDB client
  client = InfluxDBClient3(
      host='{{< influxdb/host >}}',
      token='DATABASE_TOKEN',
      database='DATABASE_NAME'
  )

@@ -306,7 +307,7 @@ and specify the following arguments:

#### Example {#execute-query-example}

The following example shows how to use SQL or InfluxQL to select all fields in a measurement, and then use PyArrow functions to extract metadata and aggregate data.

{{% code-tabs-wrapper %}}
{{% code-tabs %}}
@@ -0,0 +1,106 @@

---
title: Optimize queries
description: >
  Optimize your SQL and InfluxQL queries to improve performance and reduce their memory and compute (CPU) requirements.
weight: 401
menu:
  influxdb_cloud_serverless:
    name: Optimize queries
    parent: Execute queries
influxdb/cloud-serverless/tags: [query, sql, influxql]
related:
  - /influxdb/cloud-serverless/query-data/sql/
  - /influxdb/cloud-serverless/query-data/influxql/
  - /influxdb/cloud-serverless/query-data/execute-queries/troubleshoot/
  - /influxdb/cloud-serverless/reference/client-libraries/v3/
---

## Troubleshoot query performance

Use the following tools to help you identify performance bottlenecks and troubleshoot problems in queries:

<!-- TOC -->

- [Troubleshoot query performance](#troubleshoot-query-performance)
  - [EXPLAIN and ANALYZE](#explain-and-analyze)
  - [Enable trace logging](#enable-trace-logging)

<!-- /TOC -->
### EXPLAIN and ANALYZE

To view the query engine's execution plan and metrics for an SQL query, prepend [`EXPLAIN`](/influxdb/cloud-serverless/reference/sql/explain/) or [`EXPLAIN ANALYZE`](/influxdb/cloud-serverless/reference/sql/explain/#explain-analyze) to the query.
The report can reveal query bottlenecks, such as a large number of table scans or Parquet files, and can help answer the question, "Is the query slow due to the amount of work required, or due to a problem with the schema, compactor, etc.?"

The following example shows how to use the InfluxDB v3 Python client library and pandas to view `EXPLAIN` and `EXPLAIN ANALYZE` results for a query:

<!-- Import for tests and hide from users.
```python
import os
```
-->
<!--pytest-codeblocks:cont-->
{{% code-placeholders "BUCKET_NAME|API_TOKEN|APP_REQUEST_ID" %}}
```python
from influxdb_client_3 import InfluxDBClient3
import pandas as pd
import tabulate  # Required for pandas.to_markdown()

def explain_and_analyze():
    print('Use SQL EXPLAIN and ANALYZE to view query plan information.')

    # Instantiate an InfluxDB client.
    client = InfluxDBClient3(token=f"API_TOKEN",
                             host=f"{{< influxdb/host >}}",
                             database=f"BUCKET_NAME")

    sql_explain = '''EXPLAIN SELECT *
                     FROM home
                     WHERE time >= now() - INTERVAL '90 days'
                     ORDER BY time'''

    table = client.query(sql_explain)
    df = table.to_pandas()

    sql_explain_analyze = '''EXPLAIN ANALYZE SELECT *
                             FROM home
                             WHERE time >= now() - INTERVAL '90 days'
                             ORDER BY time'''

    table = client.query(sql_explain_analyze)

    # Combine the DataFrames and output the plan information.
    df = pd.concat([df, table.to_pandas()])

    assert df.shape == (3, 2) and df.columns.to_list() == ['plan_type', 'plan']
    print(df[['plan_type', 'plan']].to_markdown(index=False))

    client.close()

explain_and_analyze()
```
{{% /code-placeholders %}}
Replace the following:

- {{% code-placeholder-key %}}`BUCKET_NAME`{{% /code-placeholder-key %}}: the name of your {{% product-name %}} bucket
- {{% code-placeholder-key %}}`API_TOKEN`{{% /code-placeholder-key %}}: an [API token](/influxdb/cloud-serverless/admin/tokens/) with sufficient permissions to the specified bucket

The output is similar to the following:

```markdown
| plan_type         | plan                                                                                                                                       |
|:------------------|:-------------------------------------------------------------------------------------------------------------------------------------------|
| logical_plan      | Sort: home.time ASC NULLS LAST                                                                                                               |
|                   | TableScan: home projection=[co, hum, room, sensor, temp, time], full_filters=[home.time >= TimestampNanosecond(1688491380936276013, None)]   |
| physical_plan     | SortExec: expr=[time@5 ASC NULLS LAST]                                                                                                       |
|                   | EmptyExec: produce_one_row=false                                                                                                             |
| Plan with Metrics | SortExec: expr=[time@5 ASC NULLS LAST], metrics=[output_rows=0, elapsed_compute=1ns, spill_count=0, spilled_bytes=0]                         |
|                   | EmptyExec: produce_one_row=false, metrics=[]                                                                                                 |
```

### Enable trace logging

Customers with an {{% product-name %}} [annual or support contract](https://www.influxdata.com/influxdb-cloud-pricing/) can [contact InfluxData Support](https://support.influxdata.com/) to enable tracing and request help troubleshooting your query.
With tracing enabled, InfluxDB Support can trace system processes and analyze log information for a query instance.
The tracing system follows the [OpenTelemetry traces](https://opentelemetry.io/docs/concepts/signals/traces/) model for providing observability into a request.
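Other examples in this PR pass gRPC metadata to the query API through the client's `headers` parameter (for example, `headers=[(b"iox-debug", b"true")]`). The sketch below only builds such a metadata pair; the `influx-trace-id` header name is taken from this PR's title and the random value is hypothetical, so confirm the exact name and value format with InfluxData Support before relying on them:

```python
import secrets

# Build a gRPC metadata pair shaped like the iox-debug header used elsewhere
# in this PR. The header name and the random hex value are illustrative only.
trace_id = secrets.token_hex(8)
headers = [(b"influx-trace-id", trace_id.encode())]

print(headers[0][0].decode(), trace_id)
```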
@@ -197,7 +197,7 @@ Flight returned internal error, with message: Received RST_STREAM with error cod

pyarrow._flight.FlightInternalError: Flight returned internal error, with message: stream terminated by RST_STREAM with error code: NO_ERROR. gRPC client debug context: UNKNOWN:Error received from peer ipv4:3.123.149.45:443 {created_time:"2023-07-26T14:12:44.992317+02:00", grpc_status:13, grpc_message:"stream terminated by RST_STREAM with error code: NO_ERROR"}. Client context: OK
```

**Potential reasons**:

- The server terminated the stream, but there wasn't any specific error associated with it.
- Possible network disruption, even if it's temporary.

@@ -213,7 +213,7 @@ pyarrow._flight.FlightInternalError: Flight returned internal error, with messag

ArrowInvalid: Flight returned invalid argument error, with message: bucket "otel5" not found. gRPC client debug context: UNKNOWN:Error received from peer ipv4:3.123.149.45:443 {grpc_message:"bucket \"otel5\" not found", grpc_status:3, created_time:"2023-08-09T16:37:30.093946+01:00"}. Client context: IOError: Server never sent a data message. Detail: Internal
```

**Potential reasons**:

- The specified bucket doesn't exist.

@@ -227,7 +227,7 @@ ArrowInvalid: Flight returned invalid argument error, with message: bucket "otel

pyarrow.lib.ArrowInvalid: Flight returned invalid argument error, with message: Invalid ticket. Error: Invalid ticket. gRPC client debug context: UNKNOWN:Error received from peer ipv4:54.158.68.83:443 {created_time:"2023-08-31T17:56:42.909129-05:00", grpc_status:3, grpc_message:"Invalid ticket. Error: Invalid ticket"}. Client context: IOError: Server never sent a data message. Detail: Internal
```

**Potential reasons**:

- The request is missing the bucket name or some other required metadata value.
- The request contains bad query syntax.
@@ -0,0 +1,115 @@

---
title: Optimize queries
description: >
  Optimize your SQL and InfluxQL queries to improve performance and reduce their memory and compute (CPU) requirements.
weight: 401
menu:
  influxdb_clustered:
    name: Optimize queries
    parent: Execute queries
influxdb/clustered/tags: [query, sql, influxql]
related:
  - /influxdb/clustered/query-data/sql/
  - /influxdb/clustered/query-data/influxql/
  - /influxdb/clustered/query-data/execute-queries/troubleshoot/
  - /influxdb/clustered/reference/client-libraries/v3/
---

Use the following tools to help you identify performance bottlenecks and troubleshoot problems in queries:

<!-- TOC -->

- [EXPLAIN and ANALYZE](#explain-and-analyze)

<!-- /TOC -->
### EXPLAIN and ANALYZE

To view the query engine's execution plan and metrics for an SQL query, prepend [`EXPLAIN`](/influxdb/clustered/reference/sql/explain/) or [`EXPLAIN ANALYZE`](/influxdb/clustered/reference/sql/explain/#explain-analyze) to the query.
The report can reveal query bottlenecks, such as a large number of table scans or Parquet files, and can help answer the question, "Is the query slow due to the amount of work required, or due to a problem with the schema, compactor, etc.?"

The following example shows how to use the InfluxDB v3 Python client library and pandas to view `EXPLAIN` and `EXPLAIN ANALYZE` results for a query:

<!-- Import for tests and hide from users.
```python
import os
```
-->

{{% code-placeholders "DATABASE_(NAME|TOKEN)" %}}
<!--pytest-codeblocks:cont-->

```python
from influxdb_client_3 import InfluxDBClient3
import pandas as pd
import tabulate  # Required for pandas.to_markdown()

# Instantiate an InfluxDB client.
client = InfluxDBClient3(token=f"DATABASE_TOKEN",
                         host=f"{{< influxdb/host >}}",
                         database=f"DATABASE_NAME")

sql_explain = '''EXPLAIN
                 SELECT temp
                 FROM home
                 WHERE time >= now() - INTERVAL '90 days'
                   AND room = 'Kitchen'
                 ORDER BY time'''

table = client.query(sql_explain)
df = table.to_pandas()
print(df.to_markdown(index=False))

assert df.shape == (2, 2), f'Expect 2 rows and 2 columns, got {df.shape}'
assert 'physical_plan' in df.plan_type.values, "Expect physical_plan"
assert 'logical_plan' in df.plan_type.values, "Expect logical_plan"
```
{{< expand-wrapper >}}
{{% expand "View EXPLAIN example results" %}}

| plan_type     | plan |
|:--------------|:-----|
| logical_plan  | Projection: home.temp |
|               | Sort: home.time ASC NULLS LAST |
|               | Projection: home.temp, home.time |
|               | TableScan: home projection=[room, temp, time], full_filters=[home.time >= TimestampNanosecond(1688676582918581320, None), home.room = Dictionary(Int32, Utf8("Kitchen"))] |
| physical_plan | ProjectionExec: expr=[temp@0 as temp] |
|               | SortExec: expr=[time@1 ASC NULLS LAST] |
|               | EmptyExec: produce_one_row=false |

{{% /expand %}}
{{< /expand-wrapper >}}
<!--pytest-codeblocks:cont-->

```python
sql_explain_analyze = '''EXPLAIN ANALYZE
                         SELECT *
                         FROM home
                         WHERE time >= now() - INTERVAL '90 days'
                         ORDER BY time'''

table = client.query(sql_explain_analyze)
df = table.to_pandas()
print(df.to_markdown(index=False))

assert df.shape == (1, 2)
assert 'Plan with Metrics' in df.plan_type.values, "Expect plan metrics"

client.close()
```
{{% /code-placeholders %}}
Replace the following:

- {{% code-placeholder-key %}}`DATABASE_NAME`{{% /code-placeholder-key %}}: your {{% product-name %}} database
- {{% code-placeholder-key %}}`DATABASE_TOKEN`{{% /code-placeholder-key %}}: a [database token](/influxdb/clustered/admin/tokens/) with sufficient permissions to the specified database

{{< expand-wrapper >}}
{{% expand "View EXPLAIN ANALYZE example results" %}}

| plan_type         | plan |
|:------------------|:-----|
| Plan with Metrics | ProjectionExec: expr=[temp@0 as temp], metrics=[output_rows=0, elapsed_compute=1ns] |
|                   | SortExec: expr=[time@1 ASC NULLS LAST], metrics=[output_rows=0, elapsed_compute=1ns, spill_count=0, spilled_bytes=0] |
|                   | EmptyExec: produce_one_row=false, metrics=[] |

{{% /expand %}}
{{< /expand-wrapper >}}
@@ -197,7 +197,7 @@ Flight returned internal error, with message: Received RST_STREAM with error cod

pyarrow._flight.FlightInternalError: Flight returned internal error, with message: stream terminated by RST_STREAM with error code: NO_ERROR. gRPC client debug context: UNKNOWN:Error received from peer ipv4:3.123.149.45:443 {created_time:"2023-07-26T14:12:44.992317+02:00", grpc_status:13, grpc_message:"stream terminated by RST_STREAM with error code: NO_ERROR"}. Client context: OK
```

**Potential reasons**:

- The server terminated the stream, but there wasn't any specific error associated with it.
- Possible network disruption, even if it's temporary.

@@ -213,7 +213,7 @@ pyarrow._flight.FlightInternalError: Flight returned internal error, with messag

pyarrow.lib.ArrowInvalid: Flight returned invalid argument error, with message: Invalid ticket. Error: Invalid ticket. gRPC client debug context: UNKNOWN:Error received from peer ipv4:54.158.68.83:443 {created_time:"2023-08-31T17:56:42.909129-05:00", grpc_status:3, grpc_message:"Invalid ticket. Error: Invalid ticket"}. Client context: IOError: Server never sent a data message. Detail: Internal
```

**Potential reasons**:

- The request is missing the database name or some other required metadata value.
- The request contains bad query syntax.
@@ -1,5 +1,6 @@

## Code sample dependencies
# Temporary fork for passing headers in query options.
influxdb3-python @ git+https://github.com/jstirnaman/influxdb3-python@4abd41c710e79f85333ba81258b10daff54d05b0
pandas
## Tabulate for printing pandas DataFrames.
tabulate