The Apache Arrow Python bindings (`pyarrow`) let Python scripts and applications query data stored in InfluxDB over Arrow Flight.

> [!Note]
> Use InfluxDB 3 client libraries
>
> We recommend using the `influxdb3-python` Python client library for integrating InfluxDB 3 with your Python application code.
>
> InfluxDB 3 client libraries wrap Apache Arrow Flight clients and provide convenient methods for writing, querying, and processing data stored in {{% product-name %}}. Client libraries can query using SQL or InfluxQL.
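
As a point of comparison with the lower-level Flight examples that follow, here is a minimal sketch of a query made through `influxdb3-python`. It assumes the package is installed (`pip install influxdb3-python`) and that the caller supplies `host`, `token`, and `database`. The function name `query_room_averages` and the query text are illustrative only.

```python
def query_room_averages(host: str, token: str, database: str):
    """Return the average temperature per room as a pandas DataFrame."""
    # Imported here so this sketch can be loaded without the package installed.
    from influxdb_client_3 import InfluxDBClient3

    client = InfluxDBClient3(host=host, token=token, database=database)
    try:
        # language may be "sql" or "influxql"; mode="pandas" asks for a DataFrame.
        return client.query(
            query="SELECT room, avg(temp) AS avg_temp FROM home GROUP BY room",
            language="sql",
            mode="pandas",
        )
    finally:
        # Release the underlying Flight connection even if the query fails.
        client.close()
```

The client library builds the Flight ticket and authorization header internally, which is the work the `pyarrow.flight` examples do by hand.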

The following examples use the `pyarrow.flight` and `pandas` Python modules to query and format data stored in an {{% product-name %}} database:

{{< code-tabs-wrapper >}}
{{% code-tabs %}}
SQL InfluxQL
{{% /code-tabs %}}
{{% code-tab-content %}}

{{% code-placeholders "DATABASE_NAME|DATABASE_TOKEN" %}}

# Using pyarrow>=12.0.0 FlightClient
from pyarrow.flight import FlightClient, Ticket, FlightCallOptions
import json
# pandas is required by to_pandas(); tabulate is required by to_markdown()
import pandas
import tabulate

# Downsampling query groups data into 2-hour bins
sql="""
  SELECT DATE_BIN(INTERVAL '2 hours', time) AS time,
    room,
    selector_max(temp, time)['value'] AS 'max temp',
    selector_min(temp, time)['value'] AS 'min temp',
    avg(temp) AS 'average temp'
  FROM home
  GROUP BY
    1,
    room
  ORDER BY room, 1"""
  
flight_ticket = Ticket(json.dumps({
  "namespace_name": "DATABASE_NAME",
  "sql_query": sql,
  "query_type": "sql"
}))

# Send the database token as a Bearer token in the authorization header
token = (b"authorization", "Bearer DATABASE_TOKEN".encode("utf-8"))
options = FlightCallOptions(headers=[token])
client = FlightClient("grpc+tls://{{< influxdb/host >}}:443")

reader = client.do_get(flight_ticket, options)
arrow_table = reader.read_all()
# Use pyarrow and pandas to view and analyze data
data_frame = arrow_table.to_pandas()
print(data_frame.to_markdown())

{{% /code-placeholders %}}

{{% /code-tab-content %}}
{{% code-tab-content %}}

{{% code-placeholders "DATABASE_NAME|DATABASE_TOKEN" %}}

# Using pyarrow>=12.0.0 FlightClient
from pyarrow.flight import FlightClient, Ticket, FlightCallOptions
import json
# pandas is required by to_pandas(); tabulate is required by to_markdown()
import pandas
import tabulate

# Downsampling query returns the first temp value in each 2-hour bin
influxql="""
  SELECT FIRST(temp)
  FROM home
  WHERE room = 'kitchen'
    AND time >= now() - 100d
    AND time <= now() - 10d
  GROUP BY time(2h)"""
  
flight_ticket = Ticket(json.dumps({
  "namespace_name": "DATABASE_NAME",
  "sql_query": influxql,
  "query_type": "influxql"
}))

# Send the database token as a Bearer token in the authorization header
token = (b"authorization", "Bearer DATABASE_TOKEN".encode("utf-8"))
options = FlightCallOptions(headers=[token])
client = FlightClient("grpc+tls://{{< influxdb/host >}}:443")

reader = client.do_get(flight_ticket, options)
arrow_table = reader.read_all()
# Use pyarrow and pandas to view and analyze data
data_frame = arrow_table.to_pandas()
print(data_frame.to_markdown())

{{% /code-placeholders %}}

{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}

Replace the following:

  • {{% code-placeholder-key %}}DATABASE_NAME{{% /code-placeholder-key %}}: the name of your {{% product-name %}} database
  • {{% code-placeholder-key %}}DATABASE_TOKEN{{% /code-placeholder-key %}}: a database token with read permission on the specified database
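
Both tabs build the same two pieces of request data: a JSON ticket that names the database, the query text, and the query language, and an `authorization` header that carries the token. The sketch below factors these into small helpers using only the standard library, so the values can be checked before they reach `Ticket` and `FlightCallOptions`. The function names, the `ALLOWED_QUERY_TYPES` set, and the `INFLUX_DATABASE` and `INFLUX_TOKEN` environment variable names are illustrative assumptions, not part of the InfluxDB or PyArrow APIs.

```python
import json
import os

# Query languages used in the examples above (assumption: no others are needed here).
ALLOWED_QUERY_TYPES = {"sql", "influxql"}


def build_ticket_payload(database: str, query: str, query_type: str = "sql") -> bytes:
    """Return the JSON ticket body as UTF-8 bytes, suitable for pyarrow.flight.Ticket."""
    if query_type not in ALLOWED_QUERY_TYPES:
        raise ValueError(f"unsupported query_type: {query_type!r}")
    payload = {
        "namespace_name": database,
        # The examples above use the "sql_query" key for both SQL and InfluxQL text.
        "sql_query": query,
        "query_type": query_type,
    }
    return json.dumps(payload).encode("utf-8")


def build_auth_header(token: str) -> tuple[bytes, bytes]:
    """Return the (name, value) pair passed in FlightCallOptions(headers=[...])."""
    return (b"authorization", f"Bearer {token}".encode("utf-8"))


def settings_from_env(environ=None) -> dict:
    """Read the database name and token from environment variables (hypothetical names)."""
    environ = os.environ if environ is None else environ
    return {
        "database": environ["INFLUX_DATABASE"],
        "token": environ["INFLUX_TOKEN"],
    }
```

With these helpers, the ticket in either tab could be created as `Ticket(build_ticket_payload(database, query, "sql"))` and the call options as `FlightCallOptions(headers=[build_auth_header(token)])`. Reading the token from the environment keeps it out of source files.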