---
title: Summarize query results and data distribution
description: >
  Query data stored in InfluxDB and use tools like pandas to summarize the results schema and distribution.
menu:
  influxdb_cloud_dedicated:
    name: Summarize data
    parent: Process & visualize data
weight: 101
influxdb/cloud-dedicated/tags: [analysis, pandas, pyarrow, python, schema]
related:
  - /influxdb/cloud-dedicated/query-data/execute-queries/client-libraries/python/
---

Query data stored in InfluxDB and use tools like pandas to summarize the schema and distribution of the results.

{{% note %}}
#### Sample data

The following examples use the sample data written in the
[Get started writing data guide](/influxdb/cloud-dedicated/get-started/write/).
To run the example queries and return results, first
[write the sample data](/influxdb/cloud-dedicated/get-started/write/#write-line-protocol-to-influxdb)
to your {{% product-name %}} database.
{{% /note %}}

### View data information and statistics

#### Using Python and pandas

The following example uses the [InfluxDB client library for Python](/influxdb/cloud-dedicated/reference/client-libraries/v3/python/) to query an {{% product-name %}} database,
and then uses the pandas [`DataFrame.info()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.info.html) and [`DataFrame.describe()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html) methods to summarize the schema and distribution of the data.

1.  In your editor, create a file (for example, `pandas-example.py`) and enter the following sample code:
    <!-- tabs-wrapper allows code-placeholders to work when indented -->
    {{% tabs-wrapper %}}
{{% code-placeholders "DATABASE_TOKEN|DATABASE_NAME" %}}
```py
# pandas-example.py

from influxdb_client_3 import InfluxDBClient3
import pandas

# Replace DATABASE_TOKEN and DATABASE_NAME with your own values.
client = InfluxDBClient3(token='DATABASE_TOKEN',
                         host='{{< influxdb/host >}}',
                         database='DATABASE_NAME',
                         org="")

# Query all rows in the home measurement that have a room value.
table = client.query("select * from home where room like '%'")

# Convert the query result to a pandas DataFrame.
dataframe = table.to_pandas()

# Print information about the results DataFrame,
# including the index dtype and columns, non-null values, and memory usage.
dataframe.info()

# Calculate descriptive statistics that summarize the distribution of the results.
print(dataframe.describe())
```
{{% /code-placeholders %}}
    {{% /tabs-wrapper %}}

2.  Enter the following command in your terminal to execute the file using the Python interpreter:

    ```sh
    python pandas-example.py
    ```

    The output is similar to the following:

    ```sh
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 411 entries, 0 to 410
    Data columns (total 8 columns):
     #   Column     Non-Null Count  Dtype
    ---  ------     --------------  -----
     0   co         405 non-null    float64
     1   host       2 non-null      object
     2   hum        406 non-null    float64
     3   room       411 non-null    object
     4   sensor     1 non-null      object
     5   sensor_id  2 non-null      object
     6   temp       411 non-null    float64
     7   time       411 non-null    datetime64[ns]
    dtypes: datetime64[ns](1), float64(3), object(4)
    memory usage: 25.8+ KB

                  co         hum        temp                           time
    count  405.000000  406.000000  411.000000                            411
    mean     5.320988   35.860591   23.803893  2008-06-12 13:33:49.074302208
    min      0.000000   20.200000   18.400000     1970-01-01 00:00:01.641024
    25%      0.000000   35.900000   22.200000  1970-01-01 00:00:01.685054600
    50%      1.000000   36.000000   22.500000            2023-03-21 05:46:40
    75%      9.000000   36.300000   22.800000            2023-07-15 21:34:10
    max     26.000000   80.000000   74.000000            2023-07-17 02:07:00
    std      7.640154    3.318794    8.408807                            NaN
    ```
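
To help interpret the rows in the `describe()` output, the following is a minimal sketch (not part of the original example) that computes the same kinds of statistics for a single column using only Python's standard `statistics` module. The function name `describe_values` and the sample temperatures are hypothetical. The sketch assumes the pandas defaults as commonly documented: null values are skipped, `std` is the sample standard deviation, and percentiles are linearly interpolated.

```python
import statistics


def describe_values(values):
    """Summarize one column in the style of a describe() report (illustrative only)."""
    # Null (None) values are skipped, which is why a column's count
    # can be lower than the total number of rows.
    present = [v for v in values if v is not None]

    # method="inclusive" interpolates linearly between the sorted data points.
    q25, q50, q75 = statistics.quantiles(present, n=4, method="inclusive")

    return {
        "count": len(present),
        "mean": statistics.fmean(present),
        "std": statistics.stdev(present),  # sample standard deviation
        "min": min(present),
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "max": max(present),
    }


# Hypothetical temperature readings with one missing value.
temps = [21.0, 22.5, None, 22.0, 23.5]
for name, value in describe_values(temps).items():
    print(f"{name:>5}  {value:.6f}")
```

Because nulls are excluded before any statistic is computed, a sparse column such as `co` in the output above reports a `count` lower than the number of rows in the DataFrame.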