412 lines
14 KiB
Markdown
412 lines
14 KiB
Markdown
---
|
|
title: Use Python to query data
|
|
seotitle: Use Python and SQL or InfluxQL to query data
|
|
list_title: Use Python
|
|
description: >
|
|
Use the `influxdb_client_3` Python module and SQL or InfluxQL to query data stored in InfluxDB.
|
|
Execute queries and retrieve data over the Flight+gRPC protocol, and then process data using common Python tools.
|
|
weight: 401
|
|
menu:
|
|
influxdb3_clustered:
|
|
parent: Use client libraries
|
|
name: Use Python
|
|
identifier: query-with-python-sql
|
|
influxdb3/clustered/tags: [Flight client, query, flight, python, sql, influxql]
|
|
related:
|
|
- /influxdb3/clustered/reference/client-libraries/v3/python/
|
|
- /influxdb3/clustered/process-data/tools/pandas/
|
|
- /influxdb3/clustered/process-data/tools/pyarrow/
|
|
- /influxdb3/clustered/query-data/influxql/
|
|
- /influxdb3/clustered/query-data/sql/
|
|
- /influxdb3/clustered/reference/influxql/
|
|
- /influxdb3/clustered/reference/sql/
|
|
|
|
list_code_example: |
|
|
```py
|
|
from influxdb_client_3 import InfluxDBClient3
|
|
|
|
# Instantiate an InfluxDB client
|
|
client = InfluxDBClient3(
|
|
host='{{< influxdb/host >}}',
|
|
token='DATABASE_TOKEN',
|
|
database='DATABASE_NAME'
|
|
)
|
|
|
|
# Execute the query and return an Arrow table
|
|
table = client.query(
|
|
query="SELECT * FROM home",
|
|
language="sql"
|
|
)
|
|
|
|
# Return query results as a markdown table
|
|
print(table.to_pandas().to_markdown())
|
|
```
|
|
---
|
|
|
|
Use the InfluxDB `influxdb_client_3` Python client library module and SQL or InfluxQL to query data stored in InfluxDB.
|
|
Execute queries and retrieve data over the Flight+gRPC protocol, and then process data using common Python tools.
|
|
|
|
- [Get started using Python to query InfluxDB](#get-started-using-python-to-query-influxdb)
|
|
- [Create a Python virtual environment](#create-a-python-virtual-environment)
|
|
- [Install Python](#install-python)
|
|
- [Create a project virtual environment](#venv-install)
|
|
- [Install Anaconda](?t=Anaconda#conda-install)
|
|
- [Query InfluxDB](#query-influxdb)
|
|
- [Install the influxdb3-python library](#install-the-influxdb3-python-library)
|
|
- [Create an InfluxDB client](#create-an-influxdb-client)
|
|
- [Execute a query](#execute-a-query)
|
|
|
|
## Get started using Python to query InfluxDB
|
|
|
|
This guide assumes the following prerequisites:
|
|
|
|
- an {{% product-name %}} [database](/influxdb3/clustered/admin/databases/) with data to query
|
|
- a [database token](/influxdb3/clustered/admin/tokens/#database-tokens) with _read_ access to the database
|
|
|
|
To learn how to set up InfluxDB and write data, see the [Setup instructions](/influxdb3/clustered/get-started/setup/) in the Get Started tutorial.
|
|
|
|
## Create a Python virtual environment
|
|
|
|
This guide follows the recommended practice of using Python _virtual environments_.
|
|
If you don't want to use virtual environments and you have Python installed,
|
|
continue to [Query InfluxDB](#query-influxdb).
|
|
Python [virtual environments](https://docs.python.org/3/library/venv.html) keep
|
|
the Python interpreter and dependencies for your project self-contained and isolated from other projects.
|
|
|
|
To install Python and create a virtual environment, choose one of the following options:
|
|
|
|
- [Python venv](?t=venv#venv-install): The [`venv` module](https://docs.python.org/3/library/venv.html) comes standard in Python as of version 3.5.
|
|
- [Anaconda® Distribution](?t=Anaconda#conda-install): A Python/R data science distribution that provides Python and the **conda** package and environment manager.
|
|
|
|
{{< tabs-wrapper >}}
|
|
{{% tabs "small" %}}
|
|
[venv]()
|
|
[Anaconda]()
|
|
{{% /tabs %}}
|
|
{{% tab-content %}}
|
|
<!--------------------------------- Begin venv -------------------------------->
|
|
|
|
### Install Python
|
|
|
|
1. Follow the [Python installation instructions](https://wiki.python.org/moin/BeginnersGuide/Download)
|
|
to install a recent version of the Python programming language for your system.
|
|
2. Check that you can run `python` and `pip` commands.
|
|
`pip` is a package manager included in most Python distributions.
|
|
|
|
In your terminal, enter the following commands:
|
|
|
|
```sh
|
|
python --version
|
|
```
|
|
|
|
```sh
|
|
pip --version
|
|
```
|
|
|
|
Depending on your system, you may need to use version-specific commands--for example.
|
|
|
|
```sh
|
|
python3 --version
|
|
```
|
|
|
|
```sh
|
|
pip3 --version
|
|
```
|
|
|
|
If neither `pip` nor `pip<PYTHON_VERSION>` works, follow one of the [Pypa.io Pip installation](https://pip.pypa.io/en/stable/installation/) methods for your system.
|
|
|
|
### Create a project virtual environment {#venv-install}
|
|
|
|
1. Create a directory for your Python project and change to the new directory--for example:
|
|
|
|
```sh
|
|
mkdir ./PROJECT_DIRECTORY && cd $_
|
|
```
|
|
|
|
2. Use the Python `venv` module to create a virtual environment--for example:
|
|
|
|
```sh
|
|
python -m venv envs/virtualenv-1
|
|
```
|
|
|
|
`venv` creates the new virtual environment directory in your project.
|
|
|
|
3. To activate the new virtual environment in your terminal, run the `source` command and pass the path of the virtual environment `activate` script:
|
|
|
|
```sh
|
|
source envs/VIRTUAL_ENVIRONMENT_NAME/bin/activate
|
|
```
|
|
|
|
For example:
|
|
|
|
```sh
|
|
source envs/virtualenv-1/bin/activate
|
|
```
|
|
<!---------------------------------- End venv --------------------------------->
|
|
{{% /tab-content %}}
|
|
{{% tab-content %}}
|
|
<!-------------------------------- Begin conda -------------------------------->
|
|
|
|
### Install Anaconda {#conda-install}
|
|
|
|
1. Follow the [Anaconda installation instructions](https://docs.continuum.io/anaconda/install/) for your system.
|
|
2. Check that you can run the `conda` command:
|
|
|
|
```sh
|
|
conda
|
|
```
|
|
|
|
3. Use `conda` to create a virtual environment--for example:
|
|
|
|
```sh
|
|
conda create --prefix envs/virtualenv-1
|
|
```
|
|
|
|
`conda` creates a virtual environment in a directory named `./envs/virtualenv-1`.
|
|
|
|
4. To activate the new virtual environment, use the `conda activate` command and pass the directory path of the virtual environment:
|
|
|
|
```sh
|
|
conda activate envs/VIRTUAL_ENVIRONMENT_NAME
|
|
```
|
|
|
|
For example:
|
|
|
|
```sh
|
|
conda activate ./envs/virtualenv-1
|
|
```
|
|
|
|
<!--------------------------------- END conda --------------------------------->
|
|
{{% /tab-content %}}
|
|
{{< /tabs-wrapper >}}
|
|
|
|
When a virtual environment is activated, the name displays at the beginning of your terminal command line--for example:
|
|
|
|
{{% code-callout "(virtualenv-1)"%}}
|
|
```sh
|
|
(virtualenv-1) $ PROJECT_DIRECTORY
|
|
```
|
|
{{% /code-callout %}}
|
|
|
|
## Query InfluxDB
|
|
|
|
1. [Install the influxdb3-python library](#install-the-influxdb3-python-library)
|
|
2. [Create an InfluxDB client](#create-an-influxdb-client)
|
|
3. [Execute a query](#execute-a-query)
|
|
|
|
### Install the influxdb3-python library
|
|
|
|
The `influxdb3-python` package provides the `influxdb_client_3` module for integrating {{% product-name %}} with your Python code.
|
|
The module supports writing data to InfluxDB and querying data using SQL or InfluxQL.
|
|
|
|
Install the following dependencies:
|
|
|
|
{{% req type="key" text="Already installed in the [Write data section](/influxdb3/clustered/get-started/write/?t=Python#write-line-protocol-to-influxdb)" color="magenta" %}}
|
|
|
|
- `influxdb3-python` {{< req text="\* " color="magenta" >}}: Provides the `influxdb_client_3` module and also installs the [`pyarrow` package](https://arrow.apache.org/docs/python/index.html) for working with Arrow data returned from queries.
|
|
- `pandas`: Provides [pandas modules](https://pandas.pydata.org/) for analyzing and manipulating data.
|
|
- `tabulate`: Provides the [`tabulate` function](https://pypi.org/project/tabulate/) for formatting tabular data.
|
|
|
|
Enter the following command in your terminal:
|
|
|
|
```sh
|
|
pip install influxdb3-python pandas tabulate
|
|
```
|
|
|
|
With `influxdb3-python` and `pyarrow` installed, you're ready to query and
|
|
analyze data stored in an InfluxDB database.
|
|
|
|
### Create an InfluxDB client
|
|
|
|
The following example shows how to use Python with the `influxdb_client_3`
|
|
module to instantiate a client configured for an {{% product-name %}} database.
|
|
|
|
In your editor, copy and paste the following sample code to a new file--for
|
|
example, `query-example.py`:
|
|
|
|
{{% code-placeholders "DATABASE_(NAME|TOKEN)" %}}
|
|
```py
|
|
# query-example.py
|
|
|
|
from influxdb_client_3 import InfluxDBClient3
|
|
|
|
# Instantiate an InfluxDBClient3 client configured for your database
|
|
client = InfluxDBClient3(
|
|
host='{{< influxdb/host >}}',
|
|
token='DATABASE_TOKEN',
|
|
database='DATABASE_NAME'
|
|
)
|
|
```
|
|
{{% /code-placeholders %}}
|
|
|
|
{{< expand-wrapper >}}
|
|
{{% expand "<span class='req'>Important</span>: If using **Windows**, specify the **Windows** certificate path" %}}
|
|
|
|
If using a non-POSIX-compliant operating system (such as Windows), specify the root certificate path when instantiating the client.
|
|
|
|
1. In your terminal, install the Python `certifi` package.
|
|
|
|
```sh
|
|
pip install certifi
|
|
```
|
|
|
|
2. In your Python code, import `certifi` and call the `certifi.where()` method to retrieve the certificate path.
|
|
3. When instantiating the client, pass the `flight_client_options.tls_root_certs=<ROOT_CERT_PATH>` option with the certificate path.
|
|
|
|
The following example shows how to use the Python `certifi` package and client library options to pass the certificate path:
|
|
|
|
{{% code-placeholders "DATABASE_(NAME|TOKEN)" %}}
|
|
{{< code-callout "flight_client_options|tls_root_certs|(cert\b)" >}}
|
|
```py
|
|
from influxdb_client_3 import InfluxDBClient3, flight_client_options
|
|
import certifi
|
|
|
|
fh = open(certifi.where(), "r")
|
|
cert = fh.read()
|
|
fh.close()
|
|
|
|
client = InfluxDBClient3(
|
|
host="{{< influxdb/host >}}",
|
|
token='DATABASE_TOKEN',
|
|
database='DATABASE_NAME',
|
|
flight_client_options=flight_client_options(
|
|
tls_root_certs=cert))
|
|
...
|
|
```
|
|
{{< /code-callout >}}
|
|
{{% /code-placeholders %}}
|
|
|
|
For more information, see [`influxdb_client_3` query exceptions](/influxdb3/clustered/reference/client-libraries/v3/python/#query-exceptions).
|
|
|
|
{{% /expand %}}
|
|
{{< /expand-wrapper >}}
|
|
|
|
Replace the following configuration values:
|
|
|
|
- **`database`**: the name of the [{{% product-name %}} database](/influxdb3/clustered/admin/databases/) to query
|
|
- **`token`**: a [database token](/influxdb3/clustered/admin/tokens/#database-tokens) with _read_ access to the specified database.
|
|
_Store this in a secret store or environment variable to avoid exposing the raw token string._
|
|
|
|
### Execute a query
|
|
|
|
To execute a query, call the following client method:
|
|
|
|
[`query(query,language)` method](/influxdb3/clustered/reference/client-libraries/v3/python/#influxdbclient3query)
|
|
|
|
and specify the following arguments:
|
|
|
|
- **query**: A string. The SQL or InfluxQL query to execute.
|
|
- **language**: A string (`"sql"` or `"influxql"`). The `query` language.
|
|
|
|
#### Example {#execute-query-example}
|
|
|
|
The following examples shows how to use SQL or InfluxQL to select all fields in a measurement, and then output the results formatted as a Markdown table.
|
|
|
|
{{% code-tabs-wrapper %}}
|
|
{{% code-tabs %}}
|
|
[SQL](#)
|
|
[InfluxQL](#)
|
|
{{% /code-tabs %}}
|
|
{{% code-tab-content %}}
|
|
<!---- BEGIN SQL EXAMPLE --->
|
|
{{% influxdb/custom-timestamps %}}
|
|
{{% code-placeholders "DATABASE_(NAME|TOKEN)" %}}
|
|
```py
|
|
# query-example.py
|
|
|
|
from influxdb_client_3 import InfluxDBClient3
|
|
|
|
client = InfluxDBClient3(
|
|
host='{{< influxdb/host >}}',
|
|
token='DATABASE_TOKEN',
|
|
database='DATABASE_NAME'
|
|
)
|
|
|
|
# Execute the query and return an Arrow table
|
|
table = client.query(
|
|
query="SELECT * FROM home",
|
|
language="sql"
|
|
)
|
|
|
|
print("\n#### View Schema information\n")
|
|
print(table.schema)
|
|
print(table.schema.names)
|
|
print(table.schema.types)
|
|
print(table.field('room').type)
|
|
print(table.schema.field('time').metadata)
|
|
|
|
print("\n#### View column types (timestamp, tag, and field) and data types\n")
|
|
print(table.schema.field('time').metadata[b'iox::column::type'])
|
|
print(table.schema.field('room').metadata[b'iox::column::type'])
|
|
print(table.schema.field('temp').metadata[b'iox::column::type'])
|
|
|
|
print("\n#### Use PyArrow to read the specified columns\n")
|
|
print(table.column('temp'))
|
|
print(table.select(['room', 'temp']))
|
|
print(table.select(['time', 'room', 'temp']))
|
|
|
|
print("\n#### Use PyArrow compute functions to aggregate data\n")
|
|
print(table.group_by('hum').aggregate([]))
|
|
print(table.group_by('room').aggregate([('temp', 'mean')]))
|
|
```
|
|
{{% /code-placeholders %}}
|
|
{{% /influxdb/custom-timestamps %}}
|
|
<!---- END SQL EXAMPLE ---->
|
|
{{% /code-tab-content %}}
|
|
{{% code-tab-content %}}
|
|
<!---- BEGIN INFLUXQL EXAMPLE ---->
|
|
{{% code-placeholders "DATABASE_(NAME|TOKEN)" %}}
|
|
```py
|
|
# query-example.py
|
|
|
|
from influxdb_client_3 import InfluxDBClient3
|
|
|
|
client = InfluxDBClient3(
|
|
host='{{< influxdb/host >}}',
|
|
token='DATABASE_TOKEN',
|
|
database='DATABASE_NAME'
|
|
)
|
|
|
|
# Execute the query and return an Arrow table
|
|
table = client.query(
|
|
query="SELECT * FROM home",
|
|
language="influxql"
|
|
)
|
|
|
|
print("\n#### View Schema information\n")
|
|
print(table.schema)
|
|
print(table.schema.names)
|
|
print(table.schema.types)
|
|
print(table.field('room').type)
|
|
print(table.schema.field('time').metadata)
|
|
|
|
print("\n#### View column types (timestamp, tag, and field) and data types\n")
|
|
print(table.schema.field('time').metadata[b'iox::column::type'])
|
|
print(table.schema.field('room').metadata[b'iox::column::type'])
|
|
print(table.schema.field('temp').metadata[b'iox::column::type'])
|
|
|
|
print("\n#### Use PyArrow to read the specified columns\n")
|
|
print(table.column('temp'))
|
|
print(table.select(['room', 'temp']))
|
|
print(table.select(['time', 'room', 'temp']))
|
|
|
|
print("\n#### Use PyArrow compute functions to aggregate data\n")
|
|
print(table.group_by('hum').aggregate([]))
|
|
print(table.group_by('room').aggregate([('temp', 'mean')]))
|
|
```
|
|
{{% /code-placeholders %}}
|
|
<!---- END INFLUXQL EXAMPLE ---->
|
|
{{% /code-tab-content %}}
|
|
{{% /code-tabs-wrapper %}}
|
|
|
|
Replace the following configuration values:
|
|
|
|
- **`database`**: the name of the [{{% product-name %}} database](/influxdb3/clustered/admin/databases/) to query
|
|
- **`token`**: a [database token](/influxdb3/clustered/admin/tokens/#database-tokens) with _read_ access to the specified database.
|
|
_Store this in a secret store or environment variable to avoid exposing the raw token string._
|
|
|
|
Next, learn how to use Python tools to work with time series data:
|
|
|
|
- [Use PyArrow](/influxdb3/clustered/process-data/tools/pyarrow/)
|
|
- [Use pandas](/influxdb3/clustered/process-data/tools/pandas/)
|