4806-document-different-ways-to-execute-queries-against-iox (#4850)

* chore(refactor): refactor the SQL schema intro into a partial shortcode for reuse.

* fix(cloud-iox): #4806 Setup and query Grafana Flight SQL

- Help configuring a Homebrew installed Grafana
- Add some query help
- Add note about schema elements.
- Warn about functions not working.
- Screenshot of query builder.

* fix(cloud-iox): Grafana frontmatter

* fix(cloud-iox): Remove incorrect note about aggregate function support. Clarify required time column and the use of aggregations.

* wip: pandas.md

* wip: python.md

* chore(cloud-iox): update frontmatter for Execute queries

* feature(cloud-iox): use Python and Flight SQL to query, pandas and pyarrow to analyze:

- Adds /tools to Query Data
- Adds using Python with flightsql-dbapi to query data
- Adds starter for using PyArrow to analyze data
- Adds starter for using pandas to analyze data

* chore(cloud-iox): Move pages from Visualize data into Query Data > Tools

- Move Grafana
- Move Superset
- #4806

* Update content/influxdb/cloud-iox/query-data/tools/grafana.md

* Apply suggestions from code review

@sanderson Thanks for the review! Sorry for all the whitespace fixes.

Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>

* fix(cloud-iox): infinite cardinality (#4851) (#4853)

* fix(cloud-iox): pyarrow examples.

* fix(cloud-iox): wip - python examples.

* fix(cloud-iox): Python flightsql-dbapi, pandas, pyarrow guides

- part of #4806
- Add list code example
- Add code comments
- Fix whitespace
- Fix description
- add related
- add steps
- fix frontmatter
- add comments
- cleanup example

* Update content/influxdb/cloud-iox/query-data/execute-queries/flight-sql/python.md

* Update content/influxdb/cloud-iox/query-data/tools/grafana.md

* fix(cloud-iox): #4806 Grafana instructions

* fix(cloud-iox): #4806 pandas instructions

---------

Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>
pull/4858/head
Jason Stirnaman 2023-04-10 15:06:52 -05:00 committed by GitHub
parent f57adc4118
commit 07650896dc
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
10 changed files with 671 additions and 40 deletions

@ -1,7 +1,7 @@
---
title: Execute queries
seotitle: Different ways to query InfluxDB
description: There are multiple ways to query data from InfluxDB including the InfluxDB UI, CLI, and API.
seotitle: Execute queries for data stored in an InfluxDB bucket powered by IOx
description: Use tools and libraries to query data stored in an InfluxDB bucket powered by IOx.
weight: 103
menu:
influxdb_cloud_iox:

@ -0,0 +1,298 @@
---
title: Use Python and the Flight SQL library to query data
description: >
Use Python and the `flightsql-dbapi` Flight SQL library to query data
stored in a bucket powered by InfluxDB IOx.
weight: 101
menu:
influxdb_cloud_iox:
parent: Query with Flight SQL
name: Use Python
identifier: query_with_python
influxdb/cloud-iox/tags: [query, flightsql, python]
related:
- /influxdb/cloud-iox/query-data/tools/pandas/
- /influxdb/cloud-iox/query-data/tools/pyarrow/
- /influxdb/cloud-iox/query-data/sql/
list_code_example: |
```py
from flightsql import FlightSQLClient
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
info = client.execute("SELECT * FROM home")
ticket = info.endpoints[0].ticket
reader = client.do_get(ticket)
```
---
Use Python and the Flight SQL library to query data stored in a bucket powered by InfluxDB IOx.
- [Get started using Python to query InfluxDB](#get-started-using-python-to-query-influxdb)
- [Create a Python virtual environment](#create-a-python-virtual-environment)
- [Install Python](#install-python)
- [Create a project virtual environment](#create-a-project-virtual-environment)
- [Install Anaconda](#install-anaconda)
- [Query InfluxDB using Flight SQL](#query-influxdb-using-flight-sql)
- [Install the Flight SQL Python Library](#install-the-flight-sql-python-library)
- [Create a query client](#create-a-query-client)
- [Execute a query](#execute-a-query)
- [Retrieve data for Flight SQL query results](#retrieve-data-for-flight-sql-query-results)
## Get started using Python to query InfluxDB
This guide follows the recommended practice of using Python _virtual environments_.
If you don't want to use virtual environments and you have Python installed,
continue to [Query InfluxDB using Flight SQL](#query-influxdb-using-flight-sql).
## Create a Python virtual environment
Python [virtual environments](https://docs.python.org/3/library/venv.html) keep the Python interpreter and dependencies for your project self-contained and isolated from other projects.
To install Python and create a virtual environment, choose one of the following options:
- [Python venv](?t=venv#venv-install): The [`venv` module](https://docs.python.org/3/library/venv.html) comes standard in Python as of version 3.5.
- [Anaconda® Distribution](?t=Anaconda#conda-install): A Python/R data science distribution that provides Python and the **conda** package and environment manager.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[venv](#venv-install)
[Anaconda](#conda-install)
{{% /code-tabs %}}
{{% code-tab-content %}}
<!-- Begin venv -->
### Install Python
1. Follow the [Python installation instructions](https://wiki.python.org/moin/BeginnersGuide/Download)
to install a recent version of the Python programming language for your system.
2. Check that you can run `python` and `pip` commands.
`pip` is a package manager included in most Python distributions.
In your terminal, enter the following commands:
```sh
python --version
```
```sh
pip --version
```
Depending on your system, you may need to use version-specific commands--for example:
```sh
python3 --version
```
```sh
pip3 --version
```
If neither `pip` nor `pip<PYTHON_VERSION>` works, follow one of the [Pypa.io Pip installation](https://pip.pypa.io/en/stable/installation/) methods for your system.
### Create a project virtual environment
1. Create a directory for your Python project and change to the new directory--for example:
```sh
mkdir ./PROJECT_DIRECTORY && cd $_
```
2. Use the Python `venv` module to create a virtual environment--for example:
```sh
python -m venv envs/virtualenv-1
```
`venv` creates the new virtual environment directory in your project.
3. To activate the new virtual environment in your terminal, run the `source` command and pass the file path of the virtual environment `activate` script:
```sh
source envs/VIRTUAL_ENVIRONMENT_NAME/bin/activate
```
For example:
```sh
source envs/virtualenv-1/bin/activate
```
<!-- End venv -->
{{% /code-tab-content %}}
{{% code-tab-content %}}
<!-- Begin conda -->
### Install Anaconda
1. Follow the [Anaconda installation instructions](https://docs.continuum.io/anaconda/install/) for your system.
2. Check that you can run the `conda` command:
```sh
conda
```
3. Use `conda` to create a virtual environment--for example:
```sh
conda create --prefix envs/virtualenv-1
```
`conda` creates a virtual environment in a directory named `./envs/virtualenv-1`.
4. To activate the new virtual environment, use the `conda activate` command and pass the directory path of the virtual environment:
```sh
conda activate envs/VIRTUAL_ENVIRONMENT_NAME
```
For example:
```sh
conda activate ./envs/virtualenv-1
```
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
When a virtual environment is activated, the name displays at the beginning of your terminal command line--for example:
{{% code-callout "(virtualenv-1)"%}}
```sh
(virtualenv-1) $ PROJECT_DIRECTORY
```
{{% /code-callout %}}
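If you want to confirm from Python itself that the interpreter is running inside a `venv` environment, a minimal sketch such as the following may help. It assumes an environment created with the `venv` module; the file name `check-venv.py` is only illustrative. The check doesn't apply to conda environments, where you can inspect the `CONDA_PREFIX` environment variable instead.

```py
# check-venv.py
import sys


def in_virtual_environment() -> bool:
    """Return True if the interpreter is running inside a venv-style environment."""
    # In an environment created by `venv`, sys.prefix points to the environment
    # directory and sys.base_prefix points to the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter prefix:", sys.prefix)
```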
## Query InfluxDB using Flight SQL
1. [Install the Flight SQL Python Library](#install-the-flight-sql-python-library)
2. [Create a query client](#create-a-query-client)
3. [Execute a query](#execute-a-query)
### Install the Flight SQL Python Library
The [`flightsql-dbapi`](https://github.com/influxdata/flightsql-dbapi) Flight SQL library for Python provides a
[DB API 2](https://peps.python.org/pep-0249/) interface and
[SQLAlchemy](https://www.sqlalchemy.org/) dialect for
[Flight SQL](https://arrow.apache.org/docs/format/FlightSql.html).
Installing `flightsql-dbapi` also installs the [`pyarrow`](https://arrow.apache.org/docs/python/index.html) library that you'll use for working with Arrow data.
In your terminal, use `pip` to install `flightsql-dbapi`:
```sh
pip install flightsql-dbapi
```
With `flightsql-dbapi` and `pyarrow` installed, you're ready to query and analyze data stored in an InfluxDB bucket.
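To check that the installation worked in the active environment, you could run a short script like the following sketch. It relies only on the Python standard library; the helper name `missing_modules` and the file name are hypothetical, and `flightsql` is the module name imported in this guide's examples.

```py
# check-install.py
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that can't be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # `flightsql` is the module imported in this guide's examples;
    # `pyarrow` is installed as a dependency of flightsql-dbapi.
    missing = missing_modules(["flightsql", "pyarrow"])
    print("missing modules:", missing or "none")
```

If the script reports a missing module, check that the virtual environment is activated and run `pip install flightsql-dbapi` again.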
### Create a query client
The following example shows how to use Python with `flightsql-dbapi`
and the _DB API 2_ interface to instantiate a Flight SQL client configured for an InfluxDB bucket.
1. In your editor, copy and paste the following sample code to a new file--for example, `query-example.py`:
```py
# query-example.py
from flightsql import FlightSQLClient
# Instantiate a FlightSQLClient configured for your bucket
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
```
2. Replace the following configuration values:
- **`INFLUX_READ_WRITE_TOKEN`**: Your InfluxDB token with read permissions on the bucket you want to query.
- **`INFLUX_BUCKET`**: The name of your InfluxDB bucket.
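Instead of pasting a token into the source file, you may prefer to read the configuration values from environment variables. The following is a minimal sketch, not part of `flightsql-dbapi`: the variable names `INFLUX_HOST`, `INFLUX_TOKEN`, and `INFLUX_BUCKET` and the helper `client_options` are assumptions made for illustration.

```py
# client-config.py
import os


def client_options(env=os.environ):
    """Build FlightSQLClient keyword arguments from environment variables."""
    # INFLUX_HOST, INFLUX_TOKEN, and INFLUX_BUCKET are hypothetical variable
    # names chosen for this sketch.
    return {
        "host": env.get("INFLUX_HOST", "cloud2.influxdata.com"),
        "token": env["INFLUX_TOKEN"],
        "metadata": {"bucket-name": env["INFLUX_BUCKET"]},
        "features": {"metadata-reflection": "true"},
    }
```

With the variables set in your shell, you could then instantiate the client with `FlightSQLClient(**client_options())`, which passes the same keyword arguments shown in the preceding example.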
### Execute a query
To execute an SQL query, call the query client's `execute(query)` method and pass the query as a string.
#### Syntax {#execute-query-syntax}
```py
execute(query: str, call_options: Optional[FlightSQLCallOptions] = None)
```
#### Example {#execute-query-example}
```py
# query-example.py
from flightsql import FlightSQLClient
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
# Execute the query
info = client.execute("SELECT * FROM home")
```
`execute()` returns a `flight.FlightInfo` object that contains metadata and an `endpoints: [...]` list. Each endpoint contains the following:
- A list of addresses where you can retrieve the data.
- A `ticket` value that identifies the data to retrieve.
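Because the examples in this guide read `info.endpoints[0]`, an empty `endpoints` list would raise an `IndexError`. The following sketch shows one way to guard against that. The helper `first_ticket` is hypothetical, and the stand-in object only mimics the `endpoints` and `ticket` attributes described above; a real `FlightInfo` comes from `client.execute(query)`.

```py
# ticket-example.py
from types import SimpleNamespace


def first_ticket(info):
    """Return the ticket from the first endpoint of a FlightInfo-like object."""
    if not info.endpoints:
        raise ValueError("FlightInfo contains no endpoints")
    return info.endpoints[0].ticket


# Stand-in object that mimics only the attributes used above;
# a real FlightInfo comes from client.execute(query).
fake_info = SimpleNamespace(endpoints=[SimpleNamespace(ticket="example-ticket")])
print(first_ticket(fake_info))
```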
Next, use the ticket to [retrieve data for Flight SQL query results](#retrieve-data-for-flight-sql-query-results).
### Retrieve data for Flight SQL query results
To retrieve Arrow data for a query result, call the client's `do_get(ticket)` method.
#### Syntax {#retrieve-data-syntax}
```py
do_get(ticket, call_options: Optional[FlightSQLCallOptions] = None)
```
#### Example {#retrieve-data-example}
The following sample shows how to use Python with `flightsql-dbapi` and `pyarrow` to query InfluxDB and retrieve data.
```py
# query-example.py
from flightsql import FlightSQLClient
# Instantiate a FlightSQLClient configured for a bucket
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
# Execute the query to retrieve FlightInfo
info = client.execute("SELECT * FROM home")
# Extract the ticket for retrieving data
ticket = info.endpoints[0].ticket
# Use the ticket to request the Arrow data stream.
# Return a FlightStreamReader for streaming the results.
reader = client.do_get(ticket)
# Read all data to a pyarrow.Table
table = reader.read_all()
```
`do_get(ticket)` returns a [`pyarrow.flight.FlightStreamReader`](https://arrow.apache.org/docs/python/generated/pyarrow.flight.FlightStreamReader.html) for streaming Arrow [record batches](https://arrow.apache.org/docs/python/data.html#record-batches).
To read data from the stream, call one of the following `FlightStreamReader` methods:
- `read_all()`: Read all record batches as a [`pyarrow.Table`](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html).
- `read_chunk()`: Read the next RecordBatch and metadata.
- `read_pandas()`: Read all record batches and convert them to a [`pandas.DataFrame`](https://pandas.pydata.org/docs/reference/frame.html).
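For large results, you might process the stream one record batch at a time instead of calling `read_all()`. The following sketch wraps `read_chunk()` in a generator. It assumes that `read_chunk()` returns an object with a `data` attribute and raises `StopIteration` at the end of the stream, as the PyArrow documentation describes. The `iter_batches` helper and the `FakeReader` class are hypothetical; `FakeReader` stands in for a real `FlightStreamReader` so that the example runs without a server.

```py
from types import SimpleNamespace


def iter_batches(reader):
    """Yield record batches from a FlightStreamReader-like object one at a time."""
    while True:
        try:
            chunk = reader.read_chunk()
        except StopIteration:
            # The stream is finished.
            return
        yield chunk.data


class FakeReader:
    """Stand-in reader that mimics read_chunk() for demonstration only."""

    def __init__(self, batches):
        self._batches = iter(batches)

    def read_chunk(self):
        # next() raises StopIteration when no batches remain.
        return SimpleNamespace(data=next(self._batches), app_metadata=None)


for batch in iter_batches(FakeReader(["batch-1", "batch-2"])):
    print(batch)
```

With a real client, you would pass the reader returned by `client.do_get(ticket)` to `iter_batches()`.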
Next, learn how to use Python tools to work with time series data:
- [Use PyArrow](/influxdb/cloud-iox/query-data/tools/pyarrow/)
- [Use pandas](/influxdb/cloud-iox/query-data/tools/pandas/)

@ -86,7 +86,7 @@ pip3 --version
If neither `pip` nor `pip3` works, follow one of the [Pypa.io Pip Installation](https://pip.pypa.io/en/stable/installation/) methods for your system.
3. Use Pip to install the `flightsql-dbapi` Flight SQL SQL Alchemy library.
3. Use Pip to install the `flightsql-dbapi` library.
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
@ -105,11 +105,11 @@ pip3 install flightsql-dbapi
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
Flight SQL SQL Alchemy is a Python library that provides a
The `flightsql-dbapi` library for Python provides a
[DB API 2](https://peps.python.org/pep-0249/) interface and
[SQLAlchemy](https://www.sqlalchemy.org/) dialect for
[Flight SQL](https://arrow.apache.org/docs/format/FlightSql.html).
Later, you'll add it to Superset's Docker configuration.
Later, you'll add `flightsql-dbapi` to Superset's Docker configuration.
{{% warn %}}
The `flightsql-dbapi` library is experimental and under active development.

@ -0,0 +1,14 @@
---
title: Use analysis and visualization tools with InfluxDB Cloud (IOx) APIs
description: Use popular tools to analyze and visualize time series data stored in an InfluxDB bucket powered by IOx.
weight: 201
menu:
influxdb_cloud_iox:
name: Analyze and visualize data
parent: Query data
influxdb/cloud-iox/tags: [analysis, visualization, tools]
aliases:
- /influxdb/cloud-iox/visualize-data/
---
{{< children >}}

@ -9,8 +9,10 @@ weight: 101
menu:
influxdb_cloud_iox:
name: Use Grafana
parent: Visualize data
influxdb/cloud-iox/tags: [visualization]
parent: Analyze and visualize data
influxdb/cloud-iox/tags: [query, visualization]
aliases:
- /influxdb/cloud-iox/query-data/tools/grafana/
alt_engine: /influxdb/cloud/tools/grafana/
---
@ -28,11 +30,11 @@ Install the [grafana-flight-sql-plugin](https://github.com/influxdata/grafana-fl
<!-- TOC -->
- [Install Grafana](#install-grafana)
- [Download the Grafana Flight SQL Plugin](#download-the-grafana-flight-sql-plugin)
- [Download the Grafana Flight SQL plugin](#download-the-grafana-flight-sql-plugin)
- [Extract the Flight SQL plugin](#extract-the-flight-sql-plugin)
- [Install the Grafana Flight SQL plugin](#install-the-grafana-flight-sql-plugin)
- [Install with Docker Run](#install-with-docker-run)
- [Install with Docker-Compose](#install-with-docker-compose)
- [Install with Docker Run](#install-with-docker-run)
- [Install with Docker-Compose](#install-with-docker-compose)
- [Configure the Flight SQL datasource](#configure-the-flight-sql-datasource)
- [Query InfluxDB with Grafana](#query-influxdb-with-grafana)
- [Build visualizations with Grafana](#build-visualizations-with-grafana)
@ -41,18 +43,19 @@ Install the [grafana-flight-sql-plugin](https://github.com/influxdata/grafana-fl
## Install Grafana
Follow [Grafana installations instructions](https://grafana.com/docs/grafana/latest/setup-grafana/installation/)
for your operating system to Install Grafana.
Follow [Grafana instructions](https://grafana.com/docs/grafana/latest/setup-grafana/installation/)
to install Grafana for your operating system.
{{% warn %}}
Because Grafana Flight SQL Plugin is a custom plugin, you can't use it with Grafana Cloud.
For more information, see [Find and Use Plugins in the Grafana Cloud documentation](https://grafana.com/docs/grafana-cloud/fundamentals/find-and-use-plugins/).
{{% /warn %}}
## Download the Grafana Flight SQL plugin
Download the latest release from [influxdata/grafana-flightsql-datasource releases](https://github.com/influxdata/grafana-flightsql-datasource/releases).
{{% warn %}}
Because Grafana Flight SQL Plugin is a custom plugin, you can't use it with Grafana Cloud.
For more information, see [Find and Use Plugins in the Grafana Cloud documentation](https://grafana.com/docs/grafana-cloud/fundamentals/find-and-use-plugins/)
The Grafana Flight SQL plugin is experimental and subject to change.
{{% /warn %}}
@ -78,11 +81,6 @@ unzip influxdata-flightsql-datasource.zip -d /custom/plugins/directory/
Install the custom-built Flight SQL plugin in a local or Docker-based instance
of Grafana OSS or Grafana Enterprise.
{{% warn %}}
Because Grafana Flight SQL Plugin is a custom plugin, you can't use it with Grafana Cloud.
For more information, see [Find and Use Plugins in the Grafana Cloud documentation](https://grafana.com/docs/grafana-cloud/fundamentals/find-and-use-plugins/)
{{% /warn %}}
{{< tabs-wrapper >}}
{{% tabs %}}
[Local](#)

@ -0,0 +1,200 @@
---
title: Use pandas to analyze and visualize data
seotitle: Use Python and pandas to analyze and visualize data
description: >
Use the [pandas](https://pandas.pydata.org/) Python data analysis library
to analyze and visualize data stored in a bucket powered by InfluxDB IOx.
weight: 101
menu:
influxdb_cloud_iox:
parent: Analyze and visualize data
name: Use pandas
influxdb/cloud-iox/tags: [analysis, pandas, pyarrow, python, visualization]
related:
- /influxdb/cloud-iox/query-data/execute-queries/flight-sql/python/
- /influxdb/cloud-iox/query-data/tools/pyarrow/
- /influxdb/cloud-iox/query-data/sql/
list_code_example: |
```py
...
dataframe = reader.read_pandas()
dataframe = dataframe.set_index('time')
print(dataframe.index)
resample = dataframe.resample("1H")
resample['temp'].mean()
```
---
Use [pandas](https://pandas.pydata.org/), the Python data analysis library, to process, analyze, and visualize data
stored in an InfluxDB bucket powered by InfluxDB IOx.
> **pandas** is an open source, BSD-licensed library providing high-performance,
> easy-to-use data structures and data analysis tools for the Python programming language.
>
> {{% caption %}}[pandas documentation](https://pandas.pydata.org/docs/){{% /caption %}}
<!-- TOC -->
- [Install prerequisites](#install-prerequisites)
- [Install pandas](#install-pandas)
- [Use PyArrow to convert query results to pandas](#use-pyarrow-to-convert-query-results-to-pandas)
- [Use pandas to analyze data](#use-pandas-to-analyze-data)
- [View data information and statistics](#view-data-information-and-statistics)
- [Downsample time series](#downsample-time-series)
<!-- /TOC -->
## Install prerequisites
The examples in this guide assume that you use a Python virtual environment and the Flight SQL library for Python.
Installing `flightsql-dbapi` also installs the [`pyarrow`](https://arrow.apache.org/docs/python/index.html) library that provides Python bindings for Apache Arrow.
For more information, see how to [get started querying InfluxDB with Python and flightsql-dbapi](/influxdb/cloud-iox/query-data/execute-queries/flight-sql/python/).
## Install pandas
To use pandas, you need to install and import the `pandas` library.
In your terminal, use `pip` to install `pandas` in your active [Python virtual environment](/influxdb/cloud-iox/query-data/execute-queries/flight-sql/python/#create-a-project-virtual-environment):
```sh
pip install pandas
```
## Use PyArrow to convert query results to pandas
The following steps use Python, `flightsql-dbapi`, and `pyarrow` to query InfluxDB and stream Arrow data to a pandas `DataFrame`.
1. In your editor, copy and paste the following code to a new file--for example, `pandas-example.py`:
```py
# pandas-example.py
from flightsql import FlightSQLClient
import pandas
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
info = client.execute("SELECT * FROM home")
reader = client.do_get(info.endpoints[0].ticket)
# Read all record batches in the stream to a pandas DataFrame
dataframe = reader.read_pandas()
dataframe.info()
```
2. Replace the following configuration values:
- **`INFLUX_READ_WRITE_TOKEN`**: Your InfluxDB token with read permissions on the bucket you want to query.
- **`INFLUX_BUCKET`**: The name of your InfluxDB bucket.
3. In your terminal, use the Python interpreter to run the file:
```sh
python pandas-example.py
```
The `pyarrow.flight.FlightStreamReader` [`read_pandas()`](https://arrow.apache.org/docs/python/generated/pyarrow.flight.FlightStreamReader.html#pyarrow.flight.FlightStreamReader.read_pandas) method:
- Takes the same options as [`pyarrow.Table.to_pandas()`](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.to_pandas).
- Reads all Arrow record batches in the stream to a `pyarrow.Table` and then converts the `Table` to a [`pandas.DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas.DataFrame).
Next, [use pandas to analyze data](#use-pandas-to-analyze-data).
## Use pandas to analyze data
- [View data information and statistics](#view-data-information-and-statistics)
- [Downsample time series](#downsample-time-series)
### View data information and statistics
The following example uses the DataFrame `info()` and `describe()`
methods to print information about the DataFrame.
```py
# pandas-example.py
from flightsql import FlightSQLClient
import pandas
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
info = client.execute("SELECT * FROM home")
reader = client.do_get(info.endpoints[0].ticket)
dataframe = reader.read_pandas()
# Print a summary of the DataFrame to stdout
dataframe.info()
# Calculate summary statistics for the data
print(dataframe.describe())
```
### Downsample time series
The pandas library provides extensive features for working with time series data.
The [`pandas.DataFrame.resample()` method](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html) downsamples and upsamples data to time-based groups--for example:
```py
from flightsql import FlightSQLClient
import pandas
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
info = client.execute("SELECT * FROM home")
reader = client.do_get(info.endpoints[0].ticket)
dataframe = reader.read_pandas()
# Use the `time` column to generate a DatetimeIndex for the DataFrame
dataframe = dataframe.set_index('time')
# Print information about the index
print(dataframe.index)
# Downsample data into 1-hour groups based on the DatetimeIndex
resample = dataframe.resample("1H")
# Print a summary that shows the start time and average temp for each group
print(resample['temp'].mean())
```
{{< expand-wrapper >}}
{{% expand "View example results" %}}
```sh
time
1970-01-01 00:00:00 22.374138
1970-01-01 01:00:00 NaN
1970-01-01 02:00:00 NaN
1970-01-01 03:00:00 NaN
1970-01-01 04:00:00 NaN
...
2023-07-16 22:00:00 NaN
2023-07-16 23:00:00 22.600000
2023-07-17 00:00:00 22.513889
2023-07-17 01:00:00 22.208333
2023-07-17 02:00:00 22.300000
Freq: H, Name: temp, Length: 469323, dtype: float64
```
{{% /expand %}}
{{< /expand-wrapper >}}
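In the example results, groups that contain no rows have a mean of `NaN`. The following sketch shows one way to drop those empty groups. To run without an InfluxDB connection, it uses a small in-memory DataFrame with hypothetical sample values in place of query results, and it passes a `pandas.Timedelta` as the resample rule, which is equivalent to a 1-hour frequency string.

```py
import pandas

# Hypothetical sample rows standing in for query results.
dataframe = pandas.DataFrame({
    "time": pandas.to_datetime([
        "2023-01-01 08:00:00",
        "2023-01-01 08:30:00",
        "2023-01-01 10:15:00",
    ]),
    "temp": [21.0, 23.0, 22.5],
})

# Use the `time` column to generate a DatetimeIndex for the DataFrame
dataframe = dataframe.set_index("time")

# Downsample into 1-hour groups and calculate the average temp for each group
hourly = dataframe["temp"].resample(pandas.Timedelta(hours=1)).mean()

# The 09:00 group has no rows, so its mean is NaN; dropna() removes it
hourly_with_data = hourly.dropna()
print(hourly_with_data)
```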
For more detail and examples, see the [pandas documentation](https://pandas.pydata.org/docs/index.html).

@ -0,0 +1,136 @@
---
title: Use the PyArrow library to analyze data
description: >
Use [PyArrow](https://arrow.apache.org/docs/python/) to read and analyze InfluxDB query results from a bucket powered by InfluxDB IOx.
weight: 101
menu:
influxdb_cloud_iox:
parent: Analyze and visualize data
name: Use PyArrow
influxdb/cloud-iox/tags: [analysis, arrow, pyarrow, python]
related:
- /influxdb/cloud-iox/query-data/tools/pandas/
- /influxdb/cloud-iox/query-data/execute-queries/flight-sql/python/
- /influxdb/cloud-iox/query-data/sql/
list_code_example: |
```py
...
table = reader.read_all()
table.group_by('room').aggregate([('temp', 'mean')])
```
---
Use [PyArrow](https://arrow.apache.org/docs/python/) to read and analyze query results
from an InfluxDB bucket powered by InfluxDB IOx.
The PyArrow library provides efficient computation, aggregation, serialization, and conversion of Arrow format data.
> Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable
> big data systems to store, process and move data fast.
>
> The Arrow Python bindings (also named “PyArrow”) have first-class integration with NumPy, pandas, and built-in Python objects. They are based on the C++ implementation of Arrow.
> {{% caption %}}[PyArrow documentation](https://arrow.apache.org/docs/python/index.html){{% /caption %}}
<!-- TOC -->
- [Install prerequisites](#install-prerequisites)
- [Use PyArrow to read query results](#use-pyarrow-to-read-query-results)
- [Use PyArrow to analyze data](#use-pyarrow-to-analyze-data)
- [Group and aggregate data](#group-and-aggregate-data)
<!-- /TOC -->
## Install prerequisites
The examples in this guide assume that you use a Python virtual environment and the Flight SQL library for Python.
For more information, see how to [get started using Python to query InfluxDB](/influxdb/cloud-iox/query-data/execute-queries/flight-sql/python/).
Installing `flightsql-dbapi` also installs the [`pyarrow`](https://arrow.apache.org/docs/python/index.html) library that provides Python bindings for Apache Arrow.
## Use PyArrow to read query results
The following example shows how to use Python with `flightsql-dbapi` and `pyarrow` to query InfluxDB and view Arrow data as a PyArrow `Table`.
1. In your editor, copy and paste the following sample code to a new file--for example, `pyarrow-example.py`:
```py
# pyarrow-example.py
from flightsql import FlightSQLClient
# Instantiate a FlightSQLClient configured for a bucket
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
# Execute the query to retrieve FlightInfo
info = client.execute('SELECT * FROM home')
# Use the ticket to request the Arrow data stream.
# Return a FlightStreamReader for streaming the results.
reader = client.do_get(info.endpoints[0].ticket)
# Read all data to a pyarrow.Table
table = reader.read_all()
print(table)
```
2. Replace the following configuration values:
- **`INFLUX_READ_WRITE_TOKEN`**: Your InfluxDB token with read permissions on the bucket you want to query.
- **`INFLUX_BUCKET`**: The name of your InfluxDB bucket.
3. In your terminal, use the Python interpreter to run the file:
```sh
python pyarrow-example.py
```
The `FlightStreamReader.read_all()` method reads all Arrow record batches in the stream as a [`pyarrow.Table`](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html).
Next, [use PyArrow to analyze data](#use-pyarrow-to-analyze-data).
## Use PyArrow to analyze data
### Group and aggregate data
With a `pyarrow.Table`, you can use values in a column as _keys_ for grouping.
The following example shows how to query InfluxDB, group the table data, and then calculate an aggregate value for each group:
```py
# pyarrow-example.py
from flightsql import FlightSQLClient
client = FlightSQLClient(host='cloud2.influxdata.com',
token='INFLUX_READ_WRITE_TOKEN',
metadata={'bucket-name': 'INFLUX_BUCKET'},
features={'metadata-reflection': 'true'})
info = client.execute('SELECT * FROM home')
reader = client.do_get(info.endpoints[0].ticket)
table = reader.read_all()
# Use PyArrow to aggregate data
print(table.group_by('room').aggregate([('temp', 'mean')]))
```
{{< expand-wrapper >}}
{{% expand "View example results" %}}
```arrow
pyarrow.Table
temp_mean: double
room: string
----
temp_mean: [[22.581987577639747,22.10807453416151]]
room: [["Kitchen","Living Room"]]
```
{{% /expand %}}
{{< /expand-wrapper >}}
For more detail and examples, see the [PyArrow documentation](https://arrow.apache.org/docs/python/getstarted.html) and the [Apache Arrow Python Cookbook](https://arrow.apache.org/cookbook/py/data.html).

@ -8,10 +8,12 @@ description: >
weight: 101
menu:
influxdb_cloud_iox:
parent: Visualize data
parent: Analyze and visualize data
name: Use Superset
identifier: visualize_with_superset
influxdb/cloud-iox/tags: [visualization]
influxdb/cloud-iox/tags: [query, visualization]
aliases:
- /influxdb/cloud-iox/query-data/tools/superset/
related:
- /influxdb/cloud-iox/query-data/execute-queries/flight-sql/superset/
---

@ -1,17 +0,0 @@
---
title: Visualize data
seotitle: Visualize data stored in InfluxDB
description: >
Use tools like Grafana and Apache Superset to visualize time series data
stored in InfluxDB.
weight: 5
menu:
influxdb_cloud_iox:
name: Visualize data
influxdb/cloud-iox/tags: [visualization]
---
Use visualization tools like Grafana and Apache Superset to visualize your
time series data stored in InfluxDB.
{{< children >}}

@ -1,3 +1,3 @@
When working with the InfluxDB SQL implementation, a **bucket** is equivalent
to a databases, a **measurement** is structured as a table, and **time**,
to a database, a **measurement** is structured as a table, and **time**,
**fields**, and **tags** are structured as columns.
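
To illustrate this mapping, the following sketch builds an SQL query string in which the measurement name is used as the table and the time, tag, and field names are used as columns. The `build_select` helper is hypothetical, and the example assumes a `home` measurement with a `room` tag and a `temp` field.

```py
def build_select(measurement, columns):
    """Build an SQL SELECT statement for a measurement (table) and its columns."""
    # Double-quote identifiers so names with spaces or mixed case stay intact.
    quoted = ", ".join('"{}"'.format(c.replace('"', '""')) for c in columns)
    return 'SELECT {} FROM "{}"'.format(quoted, measurement.replace('"', '""'))


# `home` is the measurement; `time`, `room`, and `temp` are its columns.
query = build_select("home", ["time", "room", "temp"])
print(query)
```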