docs-v2/content/influxdb/clustered/query-data/execute-queries/optimize-queries.md

---
title: Optimize queries
description: >
  Optimize your SQL and InfluxQL queries to improve performance and reduce their memory and compute (CPU) requirements.
weight: 401
menu:
  influxdb_clustered:
    name: Optimize queries
    parent: Execute queries
influxdb/clustered/tags: [query, sql, influxql]
related:
  - /influxdb/clustered/query-data/sql/
  - /influxdb/clustered/query-data/influxql/
  - /influxdb/clustered/query-data/execute-queries/troubleshoot/
  - /influxdb/clustered/reference/client-libraries/v3/
---

Use the following tools to help you identify performance bottlenecks and troubleshoot problems in queries:

EXPLAIN and ANALYZE

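To view the query engine's execution plan and metrics for an SQL query, prepend EXPLAIN or EXPLAIN ANALYZE to the query. EXPLAIN returns the logical and physical plans without running the query; EXPLAIN ANALYZE runs the query and returns the physical plan annotated with runtime metrics. The report can reveal bottlenecks, such as a large number of table scans or Parquet files, and can help you determine whether a query is slow because of the amount of work it requires or because of a problem with the schema, the compactor, or another component.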

The following example shows how to use the InfluxDB v3 Python client library and pandas to view EXPLAIN and EXPLAIN ANALYZE results for a query:

{{% code-placeholders "DATABASE_(NAME|TOKEN)" %}}

from influxdb_client_3 import InfluxDBClient3
import pandas as pd
import tabulate # Required for pandas.to_markdown()

# Instantiate an InfluxDB client.
client = InfluxDBClient3(token="DATABASE_TOKEN",
                         host="{{< influxdb/host >}}",
                         database="DATABASE_NAME")

sql_explain = '''EXPLAIN
              SELECT temp
              FROM home
              WHERE time >= now() - INTERVAL '90 days'
              AND room = 'Kitchen'
              ORDER BY time'''

table = client.query(sql_explain)
df = table.to_pandas()
print(df.to_markdown(index=False))

assert df.shape == (2, 2), f'Expect {df.shape} to have 2 columns, 2 rows'
assert 'physical_plan' in df.plan_type.values, "Expect physical_plan"
assert 'logical_plan' in df.plan_type.values, "Expect logical_plan"

{{< expand-wrapper >}} {{% expand "View EXPLAIN example results" %}}

| plan_type     | plan |
|:--------------|:-----|
| logical_plan  | Projection: home.temp |
|               |   Sort: home.time ASC NULLS LAST |
|               |     Projection: home.temp, home.time |
|               |       TableScan: home projection=[room, temp, time], full_filters=[home.time >= TimestampNanosecond(1688676582918581320, None), home.room = Dictionary(Int32, Utf8("Kitchen"))] |
| physical_plan | ProjectionExec: expr=[temp@0 as temp] |
|               |   SortExec: expr=[time@1 ASC NULLS LAST] |
|               |     EmptyExec: produce_one_row=false |

{{% /expand %}}
{{< /expand-wrapper >}}
sql_explain_analyze = '''EXPLAIN ANALYZE
                      SELECT *
                      FROM home
                      WHERE time >= now() - INTERVAL '90 days'
                      ORDER BY time'''

table = client.query(sql_explain_analyze)
df = table.to_pandas()
print(df.to_markdown(index=False))

assert df.shape == (1, 2), f'Expect {df.shape} to have 1 row, 2 columns'
assert 'Plan with Metrics' in df.plan_type.values, "Expect plan metrics"

client.close()

{{% /code-placeholders %}}

Replace the following:

  • {{% code-placeholder-key %}}DATABASE_NAME{{% /code-placeholder-key %}}: your {{% product-name %}} database
  • {{% code-placeholder-key %}}DATABASE_TOKEN{{% /code-placeholder-key %}}: a database token with sufficient permissions to the specified database

{{< expand-wrapper >}} {{% expand "View EXPLAIN ANALYZE example results" %}}

| plan_type         | plan |
|:------------------|:-----|
| Plan with Metrics | ProjectionExec: expr=[temp@0 as temp], metrics=[output_rows=0, elapsed_compute=1ns] |
|                   |   SortExec: expr=[time@1 ASC NULLS LAST], metrics=[output_rows=0, elapsed_compute=1ns, spill_count=0, spilled_bytes=0] |
|                   |     EmptyExec: produce_one_row=false, metrics=[] |

{{% /expand %}}
{{< /expand-wrapper >}}
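
When a plan has many operators, it can help to reduce the report to operator names and their metrics before looking for bottlenecks. The following is a minimal sketch, not part of the client library: `summarize_plan` and `SAMPLE_PLAN` are hypothetical names, and the parsing assumes the flat `metrics=[key=value, ...]` format shown in the example results above. Plans with other metric formats may need a different pattern.

```python
import re
from collections import Counter

# Plan text copied from the EXPLAIN ANALYZE example results in this section.
SAMPLE_PLAN = """\
ProjectionExec: expr=[temp@0 as temp], metrics=[output_rows=0, elapsed_compute=1ns]
  SortExec: expr=[time@1 ASC NULLS LAST], metrics=[output_rows=0, elapsed_compute=1ns, spill_count=0, spilled_bytes=0]
    EmptyExec: produce_one_row=false, metrics=[]
"""

# The operator name is the first word on each plan line.
OPERATOR_RE = re.compile(r"^\s*(\w+)")
# Assumption: metrics appear once, at the end of the line, without nested brackets.
METRICS_RE = re.compile(r"metrics=\[([^\]]*)\]\s*$")


def summarize_plan(plan_text):
    """Return one dict per plan line with the operator name and its metrics."""
    operators = []
    for line in plan_text.splitlines():
        name_match = OPERATOR_RE.match(line)
        if not name_match:
            continue  # Skip blank lines.
        metrics = {}
        metrics_match = METRICS_RE.search(line)
        if metrics_match and metrics_match.group(1).strip():
            for pair in metrics_match.group(1).split(","):
                key, _, value = pair.strip().partition("=")
                metrics[key] = value  # Values stay as strings, for example "1ns".
        operators.append({"operator": name_match.group(1), "metrics": metrics})
    return operators


operators = summarize_plan(SAMPLE_PLAN)

# Count how often each operator appears, for example to spot many scans.
operator_counts = Counter(op["operator"] for op in operators)

for op in operators:
    print(op["operator"], op["metrics"])
print(operator_counts)
```

To apply the sketch to a live query, pass it the text from the plan column of the DataFrame returned in the example above instead of `SAMPLE_PLAN`.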