diff --git a/content/enterprise_influxdb/v1.9/concepts/schema_and_data_layout.md b/content/enterprise_influxdb/v1.9/concepts/schema_and_data_layout.md
index 603c5f4e8..0d8e2ef4b 100644
--- a/content/enterprise_influxdb/v1.9/concepts/schema_and_data_layout.md
+++ b/content/enterprise_influxdb/v1.9/concepts/schema_and_data_layout.md
@@ -9,136 +9,160 @@ menu:
parent: Concepts
---
-Every InfluxDB use case is special and your [schema](/enterprise_influxdb/v1.9/concepts/glossary/#schema) will reflect that uniqueness.
-There are, however, general guidelines to follow and pitfalls to avoid when designing your schema.
+Each InfluxDB use case is unique and your [schema](/enterprise_influxdb/v1.9/concepts/glossary/#schema) reflects that uniqueness.
+In general, a schema designed for querying leads to simpler and more performant queries.
+We recommend the following design guidelines for most use cases:
-
+ - [Where to store data (tag or field)](#where-to-store-data-tag-or-field)
+ - [Avoid too many series](#avoid-too-many-series)
+ - [Use recommended naming conventions](#use-recommended-naming-conventions)
+ - [Shard Group Duration Management](#shard-group-duration-management)
-## General recommendations
+## Where to store data (tag or field)
-### Encouraged schema design
+Your queries should guide what data you store in [tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag) and what you store in [fields](/enterprise_influxdb/v1.9/concepts/glossary/#field):
-We recommend that you:
+- Store commonly-queried and grouping ([`group()`](/flux/v0.x/stdlib/universe/group) or [`GROUP BY`](/enterprise_influxdb/v1.9/query_language/explore-data/#group-by-tags)) metadata in tags.
+- Store data in fields if each data point contains a different value.
+- Store numeric values as fields ([tag values](/enterprise_influxdb/v1.9/concepts/glossary/#tag-value) only support string values).
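
As a rough illustration of the tag-versus-field split above, the following Python sketch formats a single point as line protocol. The helper names `to_line_protocol` and `format_field` are hypothetical (they are not part of any InfluxDB client library), and the sketch assumes keys and tag values contain no spaces, commas, or equals signs that would need escaping.

```python
def format_field(value):
    """Render a field value using line protocol type syntax."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integer fields carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Anything else is written as a double-quoted string field.
    return '"' + str(value).replace('"', '\\"') + '"'


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point: tag values are always written as strings, fields keep their type."""
    tag_set = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    field_set = ",".join(f"{key}={format_field(value)}" for key, value in fields.items())
    head = f"{measurement},{tag_set}" if tag_set else measurement
    return f"{head} {field_set} {timestamp_ns}"


point = to_line_protocol(
    "weather_sensor",
    {"crop": "blueberries", "plot": 1, "region": "north"},  # metadata used for filtering and grouping
    {"temp": 50.1},  # numeric value that changes from point to point
    1472515200000000000,
)
print(point)
```

Note that `plot` is passed as a number but is stored as the string `1` once it becomes a tag value, which is why numeric data that you want to aggregate belongs in fields.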
-- [Encode meta data in tags](#encode-meta-data-in-tags)
-- [Avoid using keywords as tag or field names](#avoid-using-keywords-as-tag-or-field-names)
+## Avoid too many series
-#### Encode meta data in tags
+InfluxDB indexes the following data elements to speed up reads:
-[Tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag) are indexed and [fields](/enterprise_influxdb/v1.9/concepts/glossary/#field) are not indexed.
-This means that queries on tags are more performant than those on fields.
+- [measurement](/enterprise_influxdb/v1.9/concepts/glossary/#measurement)
+- [tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag)
-In general, your queries should guide what gets stored as a tag and what gets stored as a field:
+[Tag values](/enterprise_influxdb/v1.9/concepts/glossary/#tag-value) are indexed and [field values](/enterprise_influxdb/v1.9/concepts/glossary/#field-value) are not.
+This means that querying by tags is more performant than querying by fields.
+However, when too many indexes are created, both writes and reads may start to slow down.
-- Store commonly-queried meta data in tags
-- Store data in tags if you plan to use them with the InfluxQL `GROUP BY` clause
-- Store data in fields if you plan to use them with an [InfluxQL](/enterprise_influxdb/v1.9/query_language/functions/) function
-- Store numeric values as fields ([tag values](/enterprise_influxdb/v1.9/concepts/glossary/#tag-value) only support string values)
+Each unique set of indexed data elements forms a [series key](/enterprise_influxdb/v1.9/concepts/glossary/#series-key).
+[Tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag) containing highly variable information like unique IDs, hashes, and random strings lead to a large number of [series](/enterprise_influxdb/v1.9/concepts/glossary/#series), also known as high [series cardinality](/enterprise_influxdb/v1.9/concepts/glossary/#series-cardinality).
+High series cardinality is a primary driver of high memory usage for many database workloads.
+Therefore, to reduce memory consumption, consider storing high-cardinality values in field values rather than in tags or field keys.
-#### Avoid using keywords as tag or field names
+{{% note %}}
-Not required, but simplifies writing queries because you won't have to wrap tag or field names in double quotes.
-See [InfluxQL](https://github.com/influxdata/influxql/blob/master/README.md#keywords) and [Flux](https://github.com/influxdata/flux/blob/master/docs/SPEC.md#keywords) keywords to avoid.
+If reads and writes to InfluxDB start to slow down, you may have high series cardinality (too many series).
+See [how to find and reduce high series cardinality](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter).
-Also, if a tag or field name contains characters other than `[A-z,_]`, you must wrap it in double quotes in InfluxQL or use [bracket notation](/{{< latest "influxdb" "v2" >}}/query-data/get-started/syntax-basics/#records) in Flux.
+{{% /note %}}
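
To make the effect of a highly variable tag concrete, the following Python sketch counts unique series keys in a small batch of line protocol. It follows the description above and treats a series key as the measurement plus its tag set. The function names and the `request_id` tag are hypothetical, and the parsing assumes no escaped spaces or commas.

```python
def series_key(line: str) -> str:
    """Return the measurement plus sorted tag set of one line protocol point."""
    # The first space-separated token holds the measurement and the tag set.
    head = line.split(" ", 1)[0]
    measurement, *tags = head.split(",")
    return ",".join([measurement, *sorted(tags)])


def series_cardinality(lines: list[str]) -> int:
    """Count the unique series keys in a batch of points."""
    return len({series_key(line) for line in lines})


# Both points share one tag set, so they belong to a single series.
low = [
    "weather_sensor,region=north temp=50.1 1472515200000000000",
    "weather_sensor,region=north temp=50.3 1472515260000000000",
]

# A unique ID stored as a tag creates a new series for every point.
high = [
    "weather_sensor,region=north,request_id=a1f3 temp=50.1 1472515200000000000",
    "weather_sensor,region=north,request_id=9c2e temp=50.3 1472515260000000000",
]

print(series_cardinality(low), series_cardinality(high))
```

Moving `request_id` into a field would bring the second batch back to a single series.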
-### Discouraged schema design
+## Use recommended naming conventions
-We recommend that you:
+Use the following conventions when naming your tag and field keys:
-- [Avoid too many series](#avoid-too-many-series)
-- [Avoid the same name for a tag and a field](#avoid-the-same-name-for-a-tag-and-a-field)
-- [Avoid encoding data in measurement names](#avoid-encoding-data-in-measurement-names)
-- [Avoid putting more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
+- [Avoid reserved keywords in tag and field keys](#avoid-reserved-keywords-in-tag-and-field-keys)
+- [Avoid the same tag and field name](#avoid-the-same-name-for-a-tag-and-a-field)
+- [Avoid encoding data in measurements and keys](#avoid-encoding-data-in-measurements-and-keys)
+- [Avoid more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
-#### Avoid too many series
+### Avoid reserved keywords in tag and field keys
-[Tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag) containing highly variable information like UUIDs, hashes, and random strings lead to a large number of [series](/enterprise_influxdb/v1.9/concepts/glossary/#series) in the database, also known as high series cardinality. High series cardinality is a primary driver of high memory usage for many database workloads.
+Although not required, avoiding reserved keywords in your tag keys and field keys simplifies writing queries because you won't have to wrap your keys in double quotes.
+See [InfluxQL](https://github.com/influxdata/influxql/blob/master/README.md#keywords) and [Flux keywords](/{{< latest "flux" >}}/spec/lexical-elements/#keywords) to avoid.
-
+Also, if a tag key or field key contains characters other than `[A-z,_]`, you must wrap it in double quotes in InfluxQL or use [bracket notation](/{{< latest "flux" >}}/data-types/composite/record/#bracket-notation) in Flux.
-If the system has memory constraints, consider storing high-cardinality data as a field rather than a tag. For more information, see [series cardinality](/enterprise_influxdb/v1.9/concepts/glossary/#series-cardinality).
-
-
-
-#### Avoid the same name for a tag and a field
+### Avoid the same name for a tag and a field
Avoid using the same name for a tag and field key.
This often results in unexpected behavior when querying data.
-If you inadvertently add the same name for a tag and field key, see
+If you inadvertently add the same name for a tag and a field, see
[Frequently asked questions](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#tag-and-field-key-with-the-same-name)
for information about how to query the data predictably and how to fix the issue.
-#### Avoid encoding data in measurement names
+### Avoid encoding data in measurements and keys
-InfluxDB queries merge data that falls within the same [measurement](/enterprise_influxdb/v1.9/concepts/glossary/#measurement); it's better to differentiate data with [tags](/enterprise_influxdb/v1.9/concepts/glossary/#tag) than with detailed measurement names. If you encode data in a measurement name, you must use a regular expression to query the data, making some queries more complicated or impossible.
+Store data in [tag values](/enterprise_influxdb/v1.9/concepts/glossary/#tag-value) or [field values](/enterprise_influxdb/v1.9/concepts/glossary/#field-value), not in [tag keys](/enterprise_influxdb/v1.9/concepts/glossary/#tag-key), [field keys](/enterprise_influxdb/v1.9/concepts/glossary/#field-key), or [measurements](/enterprise_influxdb/v1.9/concepts/glossary/#measurement). If you design your schema to store data in tag and field values,
+your queries will be easier to write and more efficient.
-_Example:_
+In addition, you'll keep cardinality low by not creating measurements and keys as you write data.
+To learn more about the performance impact of high series cardinality, see [how to find and reduce high series cardinality](/enterprise_influxdb/v1.9/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter).
-Consider the following schema represented by line protocol.
+#### Compare schemas
+Compare the following valid schemas represented by line protocol.
+
+**Recommended**: the following schema stores metadata in separate `crop`, `plot`, and `region` tags. The `temp` field contains variable numeric data.
+
+##### {id="good-measurements-schema"}
```
-Schema 1 - Data encoded in the measurement name
--------------
-blueberries.plot-1.north temp=50.1 1472515200000000000
-blueberries.plot-2.midwest temp=49.8 1472515200000000000
-```
-
-The long measurement names (`blueberries.plot-1.north`) with no tags are similar to Graphite metrics.
-Encoding the `plot` and `region` in the measurement name makes the data more difficult to query.
-
-For example, calculating the average temperature of both plots 1 and 2 is not possible with schema 1.
-Compare this to schema 2:
-
-```
-Schema 2 - Data encoded in tags
+Good Measurements schema - Data encoded in tags (recommended)
-------------
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
```
-Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region:
+**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the measurement, similar to Graphite metrics.
-##### Flux
+##### {id="bad-measurements-schema"}
+```
+Bad Measurements schema - Data encoded in the measurement (not recommended)
+-------------
+blueberries.plot-1.north temp=50.1 1472515200000000000
+blueberries.plot-2.midwest temp=49.8 1472515200000000000
+```
+
+**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the field key.
+
+##### {id="bad-keys-schema"}
+```
+Bad Keys schema - Data encoded in field keys (not recommended)
+-------------
+weather_sensor blueberries.plot-1.north.temp=50.1 1472515200000000000
+weather_sensor blueberries.plot-2.midwest.temp=49.8 1472515200000000000
+```
+
+#### Compare queries
+
+Compare the following queries of the [_Good Measurements_](#good-measurements-schema) and [_Bad Measurements_](#bad-measurements-schema) schemas.
+The [Flux](/{{< latest "flux" >}}/) queries calculate the average `temp` for blueberries in the `north` region.
+
+**Easy to query**: [_Good Measurements_](#good-measurements-schema) data is easily filtered by `region` tag values, as in the following example.
```js
-// Schema 1 - Query for data encoded in the measurement name
-from(bucket:"/")
- |> range(start:2016-08-30T00:00:00Z)
- |> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
- |> mean()
-
-// Schema 2 - Query for data encoded in tags
-from(bucket:"/")
+// Query *Good Measurements*, data stored in separate tags (recommended)
+from(bucket: "/")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
-##### InfluxQL
+**Difficult to query**: [_Bad Measurements_](#bad-measurements-schema) requires regular expressions to extract `plot` and `region` from the measurement, as in the following example.
+
+```js
+// Query *Bad Measurements*, data encoded in the measurement (not recommended)
+from(bucket: "/")
+ |> range(start:2016-08-30T00:00:00Z)
+ |> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
+ |> mean()
+```
+
+Complex measurements make some queries impossible. For example, calculating the average temperature of both plots is not possible with the [_Bad Measurements_](#bad-measurements-schema) schema.
+
+
+##### InfluxQL example to query schemas
```
-# Schema 1 - Query for data encoded in the measurement name
+# Query *Bad Measurements*, data encoded in the measurement (not recommended)
> SELECT mean("temp") FROM /\.north$/
-# Schema 2 - Query for data encoded in tags
+# Query *Good Measurements*, data stored in separate tag values (recommended)
> SELECT mean("temp") FROM "weather_sensor" WHERE "region" = 'north'
```
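
If existing writers already emit Graphite-style names, one option is to rewrite points before they reach InfluxDB. The following Python sketch shows the idea under stated assumptions: the `retag` function and the `GRAPHITE_NAME` pattern are hypothetical, and they only handle names shaped like `blueberries.plot-1.north` with no escaped characters.

```python
import re

# Hypothetical pattern for names such as "blueberries.plot-1.north".
GRAPHITE_NAME = re.compile(
    r"^(?P<crop>[^.]+)\.plot-(?P<plot>[^.]+)\.(?P<region>[^.]+)$"
)


def retag(line: str, measurement: str = "weather_sensor") -> str:
    """Move crop, plot, and region out of the measurement and into tags."""
    name, rest = line.split(" ", 1)
    match = GRAPHITE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected measurement name: {name!r}")
    # groupdict() preserves the order in which the groups are defined.
    tags = ",".join(f"{key}={value}" for key, value in match.groupdict().items())
    return f"{measurement},{tags} {rest}"


print(retag("blueberries.plot-1.north temp=50.1 1472515200000000000"))
```

The rewritten point matches the recommended schema, so the tag-based queries shown earlier work without regular expressions.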
### Avoid putting more than one piece of information in one tag
-Splitting a single tag with multiple pieces into separate tags simplifies your queries and reduces the need for regular expressions.
+Splitting a single tag with multiple pieces into separate tags simplifies your queries and improves performance by
+ reducing the need for regular expressions.
Consider the following schema represented by line protocol.
+#### Example line protocol schemas
+
```
Schema 1 - Multiple data encoded in a single tag
-------------
@@ -159,7 +183,7 @@ weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000
Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region.
Schema 2 is preferable because using multiple tags, you don't need a regular expression.
-##### Flux
+#### Flux example to query schemas
```js
// Schema 1 - Query for multiple data encoded in a single tag
@@ -175,7 +199,7 @@ from(bucket:"/")
|> mean()
```
-##### InfluxQL
+#### InfluxQL example to query schemas
```
# Schema 1 - Query for multiple data encoded in a single tag
diff --git a/content/influxdb/cloud/query-data/execute-queries/query-sample-data.md b/content/influxdb/cloud/query-data/execute-queries/query-sample-data.md
index 7e0eb4f42..e1ec04471 100644
--- a/content/influxdb/cloud/query-data/execute-queries/query-sample-data.md
+++ b/content/influxdb/cloud/query-data/execute-queries/query-sample-data.md
@@ -31,7 +31,7 @@ Approximate sample dataset sizes are listed for each [sample dataset](/influxdb/
- **Bird migration sample data**: Explore, visualize, and monitor the latitude and longitude of bird migration patterns.
- **NOAA NDBC sample data**: Explore, visualize, and monitor NDBC's observations from their buoys. This data observes air temperature, wind speed, and more from specific locations.
- **NOAA water sample data**: Explore, visualize, and monitor temperature, water level, pH, and quality from specific locations.
- - **USGS Earthquake data**: Explore, visualize, and monitor earthquake monitoring data. This data includes alerts, cdi, quarry blast, magnitide, and more.
+ - **USGS Earthquake data**: Explore, visualize, and monitor earthquake monitoring data. This data includes alerts, cdi, quarry blast, magnitude, and more.
2. Do one of the following to download sample data:
- [Add sample data with community template](#add-sample-data-with-community-templates)
- [Add sample data using the InfluxDB UI](#add-sample-data)
@@ -42,6 +42,8 @@ Approximate sample dataset sizes are listed for each [sample dataset](/influxdb/
{{< nav-icon "settings" >}}
+2. Paste the Sample Data community template URL in the **resource manifest file** field:
+
2. Paste the [Sample Data community template URL](https://github.com/influxdata/community-templates/blob/master/sample-data/sample-data.yml) in the **resource manifest file** field and click the **{{< caps >}}Lookup Template{{< /caps >}}** button.
#### Sample Data community template URL
diff --git a/content/influxdb/cloud/write-data/best-practices/schema-design.md b/content/influxdb/cloud/write-data/best-practices/schema-design.md
index 8cfe9eb79..c231ce2d3 100644
--- a/content/influxdb/cloud/write-data/best-practices/schema-design.md
+++ b/content/influxdb/cloud/write-data/best-practices/schema-design.md
@@ -1,7 +1,7 @@
---
title: InfluxDB schema design
description: >
- Improve InfluxDB schema design and data layout. Store unique values in fields and other tips to make your data more performant.
+ Design your schema for simpler and more performant queries.
menu:
influxdb_cloud:
name: Schema design
diff --git a/content/influxdb/v1.8/concepts/schema_and_data_layout.md b/content/influxdb/v1.8/concepts/schema_and_data_layout.md
index 65e722972..3d032c167 100644
--- a/content/influxdb/v1.8/concepts/schema_and_data_layout.md
+++ b/content/influxdb/v1.8/concepts/schema_and_data_layout.md
@@ -1,7 +1,7 @@
---
title: InfluxDB schema design and data layout
description: >
- General guidelines for InfluxDB schema design and data layout.
+ Improve InfluxDB schema design and data layout to reduce high cardinality and make your data more performant.
menu:
influxdb_1_8:
name: Schema design and data layout
@@ -9,132 +9,161 @@ menu:
parent: Concepts
---
-Every InfluxDB use case is special and your [schema](/influxdb/v1.8/concepts/glossary/#schema) will reflect that uniqueness.
-There are, however, general guidelines to follow and pitfalls to avoid when designing your schema.
+Each InfluxDB use case is unique and your [schema](/influxdb/v1.8/concepts/glossary/#schema) reflects that uniqueness.
+In general, a schema designed for querying leads to simpler and more performant queries.
+We recommend the following design guidelines for most use cases:
-
+ - [Where to store data (tag or field)](#where-to-store-data-tag-or-field)
+ - [Avoid too many series](#avoid-too-many-series)
+ - [Use recommended naming conventions](#use-recommended-naming-conventions)
+ - [Shard Group Duration Management](#shard-group-duration-management)
-## General recommendations
+## Where to store data (tag or field)
-### Encouraged schema design
+Your queries should guide what data you store in [tags](/influxdb/v1.8/concepts/glossary/#tag) and what you store in [fields](/influxdb/v1.8/concepts/glossary/#field):
-We recommend that you:
+- Store commonly-queried and grouping ([`group()`](/flux/v0.x/stdlib/universe/group) or [`GROUP BY`](/influxdb/v1.8/query_language/explore-data/#group-by-tags)) metadata in tags.
+- Store data in fields if each data point contains a different value.
+- Store numeric values as fields ([tag values](/influxdb/v1.8/concepts/glossary/#tag-value) only support string values).
-- [Encode meta data in tags](#encode-meta-data-in-tags)
-- [Avoid using keywords as tag or field names](#avoid-using-keywords-as-tag-or-field-names)
+## Avoid too many series
-#### Encode meta data in tags
+InfluxDB indexes the following data elements to speed up reads:
-[Tags](/influxdb/v1.8/concepts/glossary/#tag) are indexed and [fields](/influxdb/v1.8/concepts/glossary/#field) are not indexed.
-This means that queries on tags are more performant than those on fields.
+- [measurement](/influxdb/v1.8/concepts/glossary/#measurement)
+- [tags](/influxdb/v1.8/concepts/glossary/#tag)
-In general, your queries should guide what gets stored as a tag and what gets stored as a field:
+[Tag values](/influxdb/v1.8/concepts/glossary/#tag-value) are indexed and [field values](/influxdb/v1.8/concepts/glossary/#field-value) are not.
+This means that querying by tags is more performant than querying by fields.
+However, when too many indexes are created, both writes and reads may start to slow down.
-- Store commonly-queried meta data in tags
-- Store data in tags if you plan to use them with the InfluxQL `GROUP BY` clause
-- Store data in fields if you plan to use them with an [InfluxQL](/influxdb/v1.8/query_language/functions/) function
-- Store numeric values as fields ([tag values](/influxdb/v1.8/concepts/glossary/#tag-value) only support string values)
+Each unique set of indexed data elements forms a [series key](/influxdb/v1.8/concepts/glossary/#series-key).
+[Tags](/influxdb/v1.8/concepts/glossary/#tag) containing highly variable information like unique IDs, hashes, and random strings lead to a large number of [series](/influxdb/v1.8/concepts/glossary/#series), also known as high [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality).
+High series cardinality is a primary driver of high memory usage for many database workloads.
+Therefore, to reduce memory consumption, consider storing high-cardinality values in field values rather than in tags or field keys.
-#### Avoid using keywords as tag or field names
+{{% note %}}
-Not required, but simplifies writing queries because you won't have to wrap tag or field names in double quotes.
-See [InfluxQL](https://github.com/influxdata/influxql/blob/master/README.md#keywords) and [Flux](https://github.com/influxdata/flux/blob/master/docs/SPEC.md#keywords) keywords to avoid.
+If reads and writes to InfluxDB start to slow down, you may have high series cardinality (too many series).
+See [how to find and reduce high series cardinality](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter).
-Also, if a tag or field name contains characters other than `[A-z,_]`, you must wrap it in double quotes in InfluxQL or use [bracket notation](/{{< latest "influxdb" "v2" >}}/query-data/get-started/syntax-basics/#records) in Flux.
+{{% /note %}}
-### Discouraged schema design
+## Use recommended naming conventions
-We recommend that you:
+Use the following conventions when naming your tag and field keys:
-- [Avoid too many series](#avoid-too-many-series)
-- [Avoid the same name for a tag and a field](#avoid-the-same-name-for-a-tag-and-a-field)
-- [Avoid encoding data in measurement names](#avoid-encoding-data-in-measurement-names)
-- [Avoid putting more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
+- [Avoid reserved keywords in tag and field keys](#avoid-reserved-keywords-in-tag-and-field-keys)
+- [Avoid the same tag and field name](#avoid-the-same-name-for-a-tag-and-a-field)
+- [Avoid encoding data in measurements and keys](#avoid-encoding-data-in-measurements-and-keys)
+- [Avoid more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
-#### Avoid too many series
+### Avoid reserved keywords in tag and field keys
-[Tags](/influxdb/v1.8/concepts/glossary/#tag) containing highly variable information like UUIDs, hashes, and random strings lead to a large number of [series](/influxdb/v1.8/concepts/glossary/#series) in the database, also known as high series cardinality. High series cardinality is a primary driver of high memory usage for many database workloads.
+Although not required, avoiding reserved keywords in your tag and field keys simplifies writing queries because you won't have to wrap your keys in double quotes.
+See [InfluxQL](https://github.com/influxdata/influxql/blob/master/README.md#keywords) and [Flux keywords](/{{< latest "flux" >}}/spec/lexical-elements/#keywords) to avoid.
-See [Hardware sizing guidelines](/influxdb/v1.8/guides/hardware_sizing/) for [series cardinality](/influxdb/v1.8/concepts/glossary/#series-cardinality) recommendations based on your hardware. If the system has memory constraints, consider storing high-cardinality data as a field rather than a tag.
+Also, if a tag or field key contains characters other than `[A-z,_]`, you must wrap it in double quotes in InfluxQL or use [bracket notation](/{{< latest "flux" >}}/data-types/composite/record/#bracket-notation) in Flux.
-#### Avoid the same name for a tag and a field
+### Avoid the same name for a tag and a field
Avoid using the same name for a tag and field key.
This often results in unexpected behavior when querying data.
-If you inadvertently add the same name for a tag and field key, see
+If you inadvertently add the same name for a tag and a field, see
[Frequently asked questions](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#tag-and-field-key-with-the-same-name)
for information about how to query the data predictably and how to fix the issue.
-#### Avoid encoding data in measurement names
+### Avoid encoding data in measurements and keys
-InfluxDB queries merge data that falls within the same [measurement](/influxdb/v1.8/concepts/glossary/#measurement); it's better to differentiate data with [tags](/influxdb/v1.8/concepts/glossary/#tag) than with detailed measurement names. If you encode data in a measurement name, you must use a regular expression to query the data, making some queries more complicated or impossible.
-_Example:_
+Store data in [tag values](/influxdb/v1.8/concepts/glossary/#tag-value) or [field values](/influxdb/v1.8/concepts/glossary/#field-value), not in [tag keys](/influxdb/v1.8/concepts/glossary/#tag-key), [field keys](/influxdb/v1.8/concepts/glossary/#field-key), or [measurements](/influxdb/v1.8/concepts/glossary/#measurement). If you design your schema to store data in tag and field values,
+your queries will be easier to write and more efficient.
-Consider the following schema represented by line protocol.
+In addition, you'll keep cardinality low by not creating measurements and keys as you write data.
+To learn more about the performance impact of high series cardinality, see [how to find and reduce high series cardinality](/influxdb/v1.8/troubleshooting/frequently-asked-questions/#why-does-series-cardinality-matter).
+#### Compare schemas
+
+Compare the following valid schemas represented by line protocol.
+
+**Recommended**: the following schema stores metadata in separate `crop`, `plot`, and `region` tags. The `temp` field contains variable numeric data.
+
+##### {id="good-measurements-schema"}
```
-Schema 1 - Data encoded in the measurement name
--------------
-blueberries.plot-1.north temp=50.1 1472515200000000000
-blueberries.plot-2.midwest temp=49.8 1472515200000000000
-```
-
-The long measurement names (`blueberries.plot-1.north`) with no tags are similar to Graphite metrics.
-Encoding the `plot` and `region` in the measurement name makes the data more difficult to query.
-
-For example, calculating the average temperature of both plots 1 and 2 is not possible with schema 1.
-Compare this to schema 2:
-
-```
-Schema 2 - Data encoded in tags
+Good Measurements schema - Data encoded in tags (recommended)
-------------
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
```
-Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region:
+**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the measurement, similar to Graphite metrics.
-##### Flux
+##### {id="bad-measurements-schema"}
+```
+Bad Measurements schema - Data encoded in the measurement (not recommended)
+-------------
+blueberries.plot-1.north temp=50.1 1472515200000000000
+blueberries.plot-2.midwest temp=49.8 1472515200000000000
+```
+
+**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the field key.
+
+##### {id="bad-keys-schema"}
+```
+Bad Keys schema - Data encoded in field keys (not recommended)
+-------------
+weather_sensor blueberries.plot-1.north.temp=50.1 1472515200000000000
+weather_sensor blueberries.plot-2.midwest.temp=49.8 1472515200000000000
+```
+
+#### Compare queries
+
+Compare the following queries of the [_Good Measurements_](#good-measurements-schema) and [_Bad Measurements_](#bad-measurements-schema) schemas.
+The [Flux](/{{< latest "flux" >}}/) queries calculate the average `temp` for blueberries in the `north` region.
+
+**Easy to query**: [_Good Measurements_](#good-measurements-schema) data is easily filtered by `region` tag values, as in the following example.
```js
-// Schema 1 - Query for data encoded in the measurement name
-from(bucket:"/")
- |> range(start:2016-08-30T00:00:00Z)
- |> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
- |> mean()
-
-// Schema 2 - Query for data encoded in tags
-from(bucket:"/")
+// Query *Good Measurements*, data stored in separate tag values (recommended)
+from(bucket: "/")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
-##### InfluxQL
+**Difficult to query**: [_Bad Measurements_](#bad-measurements-schema) requires regular expressions to extract `plot` and `region` from the measurement, as in the following example.
+
+```js
+// Query *Bad Measurements*, data encoded in the measurement (not recommended)
+from(bucket: "/")
+ |> range(start:2016-08-30T00:00:00Z)
+ |> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
+ |> mean()
+```
+
+Complex measurements make some queries impossible. For example, calculating the average temperature of both plots is not possible with the [_Bad Measurements_](#bad-measurements-schema) schema.
+
+
+##### InfluxQL example to query schemas
```
-# Schema 1 - Query for data encoded in the measurement name
+# Query *Bad Measurements*, data encoded in the measurement (not recommended)
> SELECT mean("temp") FROM /\.north$/
-# Schema 2 - Query for data encoded in tags
+# Query *Good Measurements*, data stored in separate tag values (recommended)
> SELECT mean("temp") FROM "weather_sensor" WHERE "region" = 'north'
```
### Avoid putting more than one piece of information in one tag
-Splitting a single tag with multiple pieces into separate tags simplifies your queries and reduces the need for regular expressions.
+Splitting a single tag with multiple pieces into separate tags simplifies your queries and improves performance by
+ reducing the need for regular expressions.
Consider the following schema represented by line protocol.
+#### Example line protocol schemas
+
```
Schema 1 - Multiple data encoded in a single tag
-------------
@@ -155,7 +184,7 @@ weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000
Use Flux or InfluxQL to calculate the average `temp` for blueberries in the `north` region.
Schema 2 is preferable because using multiple tags, you don't need a regular expression.
-##### Flux
+#### Flux example to query schemas
```js
// Schema 1 - Query for multiple data encoded in a single tag
@@ -171,7 +200,7 @@ from(bucket:"/")
|> mean()
```
-##### InfluxQL
+#### InfluxQL example to query schemas
```
# Schema 1 - Query for multiple data encoded in a single tag
diff --git a/content/influxdb/v2.0/write-data/best-practices/resolve-high-cardinality.md b/content/influxdb/v2.0/write-data/best-practices/resolve-high-cardinality.md
index c073ee0a6..faf4426df 100644
--- a/content/influxdb/v2.0/write-data/best-practices/resolve-high-cardinality.md
+++ b/content/influxdb/v2.0/write-data/best-practices/resolve-high-cardinality.md
@@ -10,11 +10,48 @@ menu:
---
If reads and writes to InfluxDB have started to slow down, high [series cardinality](/influxdb/v2.0/reference/glossary/#series-cardinality) (too many series) may be causing memory issues.
+Take steps to understand and resolve high series cardinality.
+
+1. [Learn the causes of high cardinality](#learn-the-causes-of-high-series-cardinality)
+2. [Measure series cardinality](#measure-series-cardinality)
+3. [Resolve high cardinality](#resolve-high-cardinality)
+
+## Learn the causes of high series cardinality
+
+{{% oss-only %}}
+
+ InfluxDB indexes the following data elements to speed up reads:
+ - [measurement](/influxdb/v2.0/reference/glossary/#measurement)
+ - [tags](/influxdb/v2.0/reference/glossary/#tag)
+
+{{% /oss-only %}}
+{{% cloud-only %}}
+
+ IndexDB indexes the following data elements to speed up reads:
+ - [measurement](/influxdb/v2.0/reference/glossary/#measurement)
+ - [tags](/influxdb/v2.0/reference/glossary/#tag)
+ - [field keys](/influxdb/cloud/reference/glossary/#field-key)
+
+{{% /cloud-only %}}
+
+Each unique set of indexed data elements forms a [series key](/influxdb/v2.0/reference/glossary/#series-key).
+[Tags](/influxdb/v2.0/reference/glossary/#tag) containing highly variable information like unique IDs, hashes, and random strings lead to a large number of [series](/influxdb/v2.0/reference/glossary/#series), also known as high [series cardinality](/influxdb/v2.0/reference/glossary/#series-cardinality).
+High series cardinality is a primary driver of high memory usage for many database workloads.
+
+## Measure series cardinality
+
+Use the following to measure series cardinality of your buckets:
+- [`influxdb.cardinality()`](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/cardinality): Flux function that returns the number of unique [series keys](/influxdb/v2.0/reference/glossary/#series) in your data.
+
+- [`SHOW SERIES CARDINALITY`](/influxdb/v2.0/query_language/spec/#show-series-cardinality): InfluxQL command that returns the number of unique [series keys](/influxdb/v2.0/reference/glossary/#series) in your data.
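+
+As a minimal sketch, the following Flux query returns the number of unique series keys in data from the past 30 days. The bucket name `example-bucket` and the 30-day window are placeholders; adjust them for your data.
+
+```js
+import "influxdata/influxdb"
+
+// Return the series cardinality of data in the last 30 days.
+// "example-bucket" and the time range are illustrative assumptions.
+influxdb.cardinality(bucket: "example-bucket", start: -30d)
+```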
+
+## Resolve high cardinality
To resolve high series cardinality, complete the following steps (for multiple buckets if applicable):
1. [Review tags](#review-tags).
-2. [Adjust your schema](#adjust-your-schema).
+2. [Improve your schema](#improve-your-schema).
+3. [Delete high cardinality data](#delete-data-to-reduce-high-cardinality).
## Review tags
@@ -80,38 +117,14 @@ cardinalityByTag(bucket: "example-bucket")
|> count()
```
-These queries should help to identify the sources of high cardinality in each of your buckets. To determine which specific tags are growing, check the cardinality again after 24 hours to see if one or more tags have grown significantly.
+These queries should help identify the sources of high cardinality in each of your buckets. To determine which specific tags are growing, check the cardinality again after 24 hours to see if one or more tags have grown significantly.
-## Adjust your schema
+## Improve your schema
-Usually, resolving high cardinality is as simple as changing a tag with many unique values to a field. Review the following potential solutions for resolving high cardinality:
+To minimize cardinality in the future, design your schema for easy and performant querying.
+Review [best practices for schema design](/influxdb/v2.0/write-data/best-practices/schema-design/).
-- Delete data to reduce high cardinality
-- Design schema for read performance
+## Delete data to reduce high cardinality
-### Delete data to reduce high cardinality
-
-Consider whether you need the data causing high cardinality. In some cases, you may decide you no longer need this data, in which case you may choose to [delete the whole bucket](/influxdb/v2.0/organizations/buckets/delete-bucket/) or [delete a range of data](/influxdb/v2.0/write-data/delete-data/).
-
-### Design schema for read performance
-
-Tags are valuable for indexing, so during a query, the query engine doesn't need to scan every single record in a bucket. However, too many indexes may create performance problems. The trick is to create a middle ground between scanning and indexing.
-
-For example, if you query for specific user IDs with thousands of users, a simple query like this, where `userId` is a field, requires InfluxDB to scan every row for the `userId`:
-
-```js
-from(bucket: "example-bucket")
- |> range(start: -7d)
- |> filter(fn: (r) => r._field == "userId" and r._value == "abcde")
-```
-
-If you include a tag in your schema that can be reasonably indexed, such as a `company` tag, you can reduce the number of rows scanned and retrieve data more quickly:
-
-```js
-from(bucket: "example-bucket")
- |> range(start: -7d)
- |> filter(fn: (r) => r.company == "Acme")
- |> filter(fn: (r) => r._field == "userId" and r._value == "abcde")
-```
-
-Consider tags that can be reasonably indexed to make your queries more performant. For more guidelines to consider, see [InfluxDB schema design](/influxdb/v2.0/write-data/best-practices/schema-design/).
+Consider whether you need the data that is causing high cardinality.
+If you no longer need this data, you can [delete the whole bucket](/influxdb/v2.0/organizations/buckets/delete-bucket/) or [delete a range of data](/influxdb/v2.0/write-data/delete-data/).
diff --git a/content/influxdb/v2.0/write-data/best-practices/schema-design.md b/content/influxdb/v2.0/write-data/best-practices/schema-design.md
index 49645a45c..f49eb9105 100644
--- a/content/influxdb/v2.0/write-data/best-practices/schema-design.md
+++ b/content/influxdb/v2.0/write-data/best-practices/schema-design.md
@@ -1,7 +1,7 @@
---
title: InfluxDB schema design
description: >
- Improve InfluxDB schema design and data layout to reduce high cardinality and make your data more performant.
+ Design your schema for simpler and more performant queries.
menu:
influxdb_2_0:
name: Schema design
@@ -9,220 +9,239 @@ menu:
parent: write-best-practices
---
-Each InfluxDB use case is unique and your [schema](/influxdb/v2.0/reference/glossary/#schema) design reflects the uniqueness. We recommend the following design guidelines for most use cases:
+Design your [schema](/influxdb/v2.0/reference/glossary/#schema) for simpler and more performant queries.
+Follow design guidelines to make your schema easy to query.
+Learn how these guidelines lead to more performant queries.
-- [Where to store data (tag or field)](#where-to-store-data-tags-or-fields)
-- [Avoid too many series](#avoid-too-many-series)
-- [Use recommended naming conventions](#use-recommended-naming-conventions)
-
+- [Design to query](#design-to-query)
+ - [Keep measurements and keys simple](#keep-measurements-and-keys-simple)
+- [Use tags and fields](#use-tags-and-fields)
+ - [Use fields for unique and numeric data](#use-fields-for-unique-and-numeric-data)
+ - [Use tags to improve query performance](#use-tags-to-improve-query-performance)
+ - [Keep tags simple](#keep-tags-simple)
{{% note %}}
-Follow these guidelines to minimize high series cardinality and make your data more performant.
+
+Good schema design can prevent high series cardinality, resulting in better-performing queries. If you notice data reads and writes slowing down or want to learn how cardinality affects performance, see how to [resolve high cardinality](/influxdb/v2.0/write-data/best-practices/resolve-high-cardinality/).
+
{{% /note %}}
-## Where to store data (tag or field)
+## Design to query
-[Tags](/influxdb/v2.0/reference/glossary/#tag) are indexed and [fields](/influxdb/v2.0/reference/glossary/#field) are not.
-This means that querying by tags is more performant than querying by fields.
+The schemas below demonstrate [measurements](/influxdb/v2.0/reference/glossary/#measurement), [tag keys](/influxdb/v2.0/reference/glossary/#tag-key), and [field keys](/influxdb/v2.0/reference/glossary/#field-key) that are easy to query.
-In general, your queries should guide what gets stored as a tag and what gets stored as a field:
+| measurement | tag key | tag key | field key | field key |
+|----------------------|-----------|---------|-----------|-------------|
+| airSensor | sensorId | station | humidity | temperature |
+| waterQualitySensor | sensorId | station | pH | temperature |
-- Store commonly-queried meta data in tags.
-- Store data in fields if each data point contains a different value.
-- Store numeric values as fields ([tag values](/influxdb/v2.0/reference/glossary/#tag-value) only support string values).
+The `airSensor` and `waterQualitySensor` schemas illustrate the following guidelines:
+- Each measurement is a simple name that describes a schema.
+- Keys [don't repeat within a schema](#avoid-duplicate-names-for-tags-and-fields).
+- Keys [don't use reserved keywords or special characters](#avoid-keywords-and-special-characters-in-keys).
+- Tags (`sensorId` and `station`) [store metadata common across many data points](#use-tags-to-improve-query-performance).
+- Fields (`humidity`, `pH`, and `temperature`) [store numeric data](#use-fields-for-unique-and-numeric-data).
+- Fields [store unique or highly variable](#use-fields-for-unique-and-numeric-data) data.
+- Measurements and keys [don't contain data](#keep-measurements-and-keys-simple); tag values and field values will store data.
-## Avoid too many series
-
-[Tags](/influxdb/v2.0/reference/glossary/#tag) containing highly variable information like unique IDs, hashes, and random strings lead to a large number of [series](/influxdb/v2.0/reference/glossary/#series), also known as high [series cardinality](/influxdb/v2.0/reference/glossary/#series-cardinality).
-
-High series cardinality is a primary driver of high memory usage for many database workloads.
-InfluxDB uses measurements and tags to create indexes and speed up reads. However, when too many indexes created, both writes and reads may start to slow down. Therefore, if a system has memory constraints, consider storing high-cardinality data as a field rather than a tag.
-
-{{% note %}}
-If reads and writes to InfluxDB start to slow down, you may have high series cardinality (too many series). See how to [resolve high cardinality](/influxdb/v2.0/write-data/best-practices/resolve-high-cardinality/).
-{{% /note %}}
-
-## Use recommended naming conventions
-
-Use the following conventions when naming your tag and field keys:
-
-- [Avoid keywords in tag and field names](#avoid-keywords-as-tag-or-field-names)
-- [Avoid the same tag and field name](#avoid-the-same-name-for-a-tag-and-a-field)
-- [Avoid encoding data in measurement names](#avoid-encoding-data-in-measurement-names)
-- [Avoid more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
-
-### Avoid keywords as tag or field names
-
-Not required, but simplifies writing queries because you won't have to wrap tag or field names in double quotes.
-See [Flux keywords](/{{< latest "flux" >}}/spec/lexical-elements/#keywords) to avoid.
-
-Also, if a tag or field name contains non-alphanumeric characters, you must use [bracket notation](/{{< latest "flux" >}}/data-types/composite/record/#bracket-notation) in Flux.
-
-### Avoid the same name for a tag and a field
-
-Avoid using the same name for a tag and field key, which may result in unexpected behavior when querying data.
-
-### Avoid encoding data in measurement names
-
-InfluxDB queries merge data that falls within the same [measurement](/influxdb/v2.0/reference/glossary/#measurement), so it's better to differentiate data with [tags](/influxdb/v2.0/reference/glossary/#tag) than with detailed measurement names. If you encode data in a measurement name, you must use a regular expression to query the data, making some queries more complicated.
-
-#### Example line protocol schemas
-
-Consider the following schema represented by line protocol.
+The following points (formatted as line protocol) use the `airSensor` and `waterQualitySensor` schemas:
```
-Schema 1 - Data encoded in the measurement name
+airSensor,sensorId=A0100,station=Harbor humidity=35.0658,temperature=21.667 1636729543000000000
+waterQualitySensor,sensorId=W0101,station=Harbor pH=6.1,temperature=16.103 1472515200000000000
+```
+
+### Keep measurements and keys simple
+
+Store data in [tag values](/influxdb/v2.0/reference/glossary/#tag-value) or [field values](/influxdb/v2.0/reference/glossary/#field-value), not in [tag keys](/influxdb/v2.0/reference/glossary/#tag-key), [field keys](/influxdb/v2.0/reference/glossary/#field-key), or [measurements](/influxdb/v2.0/reference/glossary/#measurement).
+If you design your schema to store data in tag and field values,
+your queries will be easier to write and more efficient.
+
+{{% oss-only %}}
+
+In addition, you'll keep cardinality low by not creating measurements and keys as you write data.
+To learn more about the performance impact of high series cardinality, see how to [resolve high cardinality](/influxdb/v2.0/write-data/best-practices/resolve-high-cardinality/).
+
+{{% /oss-only %}}
+
+#### Compare schemas
+
+Compare the following valid schemas represented by line protocol.
+
+**Recommended**: the following schema stores metadata in separate `crop`, `plot`, and `region` tags. The `temp` field contains variable numeric data.
+
+##### {id="good-measurements-schema"}
+```
+Good Measurements schema - Data encoded in tags (recommended)
+-------------
+weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
+weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
+```
+
+**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the measurement, similar to Graphite metrics.
+
+##### {id="bad-measurements-schema"}
+```
+Bad Measurements schema - Data encoded in the measurement (not recommended)
-------------
blueberries.plot-1.north temp=50.1 1472515200000000000
blueberries.plot-2.midwest temp=49.8 1472515200000000000
```
-The long measurement names (`blueberries.plot-1.north`) with no tags are similar to Graphite metrics.
-Encoding the `plot` and `region` in the measurement name makes the data more difficult to query.
-
-For example, calculating the average temperature of both plots 1 and 2 is not possible with schema 1.
-Compare this to schema 2:
+**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the field key.
+##### {id="bad-keys-schema"}
```
-Schema 2 - Data encoded in tags
+Bad Keys schema - Data encoded in field keys (not recommended)
-------------
-weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
-weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
+weather_sensor blueberries.plot-1.north.temp=50.1 1472515200000000000
+weather_sensor blueberries.plot-2.midwest.temp=49.8 1472515200000000000
```
-#### Flux example to query schemas
+#### Compare queries
-Use Flux to calculate the average `temp` for blueberries in the `north` region:
+Compare the following queries of the [_Good Measurements_](#good-measurements-schema) and [_Bad Measurements_](#bad-measurements-schema) schemas.
+The [Flux](/{{< latest "flux" >}}/) queries calculate the average `temp` for blueberries in the `north` region.
+
+**Easy to query**: [_Good Measurements_](#good-measurements-schema) data is easily filtered by `region` tag values, as in the following example.
```js
-// Schema 1 - Query for data encoded in the measurement name
-from(bucket:"example-bucket")
- |> range(start:2016-08-30T00:00:00Z)
- |> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
- |> mean()
-
-// Schema 2 - Query for data encoded in tags
+// Query *Good Measurements*, data stored in separate tags (recommended)
from(bucket:"example-bucket")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
-In schema 1, we see that querying the `plot` and `region` in the measurement name makes the data more difficult to query.
-
-### Avoid putting more than one piece of information in one tag
-
-Splitting a single tag with multiple pieces into separate tags simplifies your queries and reduces the need for regular expressions.
-
-#### Example line protocol schemas
-
-Consider the following schema represented by line protocol.
+**Difficult to query**: [_Bad Measurements_](#bad-measurements-schema) requires regular expressions to extract `plot` and `region` from the measurement, as in the following example.
+```js
+// Query *Bad Measurements*, data encoded in the measurement (not recommended)
+from(bucket:"example-bucket")
+ |> range(start:2016-08-30T00:00:00Z)
+ |> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
+ |> mean()
```
-Schema 1 - Multiple data encoded in a single tag
+
+Complex measurements make some queries impossible. For example, calculating the average temperature of both plots is not possible with the [_Bad Measurements_](#bad-measurements-schema) schema.
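+
+By contrast, the [_Good Measurements_](#good-measurements-schema) schema supports this query without regular expressions. As a sketch (assuming the same `example-bucket` bucket and time range used in the preceding examples), the following query filters by the `crop` tag, and then calls `group()` to merge the series for both plots before calculating the mean:
+
+```js
+// Average temperature across all blueberry plots (Good Measurements schema)
+from(bucket: "example-bucket")
+    |> range(start: 2016-08-30T00:00:00Z)
+    |> filter(fn: (r) => r._measurement == "weather_sensor" and r.crop == "blueberries" and r._field == "temp")
+    |> group() // merge all matching series into a single table
+    |> mean()
+```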
+
+#### Keep keys simple
+
+In addition to keeping your keys free of data, follow these guidelines to make them easier to query:
+- [Avoid keywords and special characters](#avoid-keywords-and-special-characters-in-keys)
+- [Avoid duplicate names for tags and fields](#avoid-duplicate-names-for-tags-and-fields)
+
+##### Avoid keywords and special characters in keys
+
+To simplify query writing, don't include reserved keywords or special characters in tag and field keys.
+If you use [Flux keywords](/{{< latest "flux" >}}/spec/lexical-elements/#keywords) in keys,
+then you'll have to wrap the keys in double quotes.
+If you use non-alphanumeric characters in keys, then you'll have to use [bracket notation](/{{< latest "flux" >}}/data-types/composite/record/#bracket-notation) in [Flux](/{{< latest "flux" >}}/).
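+
+For illustration, assume a hypothetical tag key named `sensor-id` (not part of the schemas above). Because the key contains a hyphen, dot notation can't reference it in Flux, and the query must use bracket notation instead:
+
+```js
+// "sensor-id" is a hypothetical tag key used only to show bracket notation.
+from(bucket: "example-bucket")
+    |> range(start: -1h)
+    |> filter(fn: (r) => r["sensor-id"] == "A0100")
+```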
+
+##### Avoid duplicate names for tags and fields
+
+Avoid using the same name for a [tag key](/influxdb/v2.0/reference/glossary/#tag-key) and a [field key](/influxdb/v2.0/reference/glossary/#field-key) within the same schema.
+Your query results may be unpredictable if you have a tag and a field with the same name.
+
+{{% cloud-only %}}
+
+{{% note %}}
+Use [explicit bucket schemas]() to enforce unique tag and field keys within a schema.
+{{% /note %}}
+
+{{% /cloud-only %}}
+
+## Use tags and fields
+
+[Tag values](/influxdb/v2.0/reference/glossary/#tag-value) are indexed and [field values](/influxdb/v2.0/reference/glossary/#field-value) aren't.
+This means that querying tags is more performant than querying fields.
+Your queries should guide what you store in tags and what you store in fields.
+
+### Use fields for unique and numeric data
+
+- Store unique or frequently changing values as field values.
+- Store numeric values as field values. ([Tag values](/influxdb/v2.0/reference/glossary/#tag-value) only store strings.)
+
+### Use tags to improve query performance
+
+- Store values as tag values if they can be reasonably indexed.
+- Store values as [tag values](/influxdb/v2.0/reference/glossary/#tag-value) if the values are used in [filter()](/{{< latest "flux" >}}/stdlib/universe/filter/) or [group()](/{{< latest "flux" >}}/stdlib/universe/group/) functions.
+- Store values as tag values if the values are shared across multiple data points, i.e. metadata about the field.
+
+Because InfluxDB indexes tags, the query engine doesn't need to scan every record in a bucket to locate a tag value.
+For example, consider a bucket that stores data about thousands of users. With `userId` stored in a [field](/influxdb/v2.0/reference/glossary/#field), a query for user `abcde` requires InfluxDB to scan `userId` in every row.
+
+```js
+from(bucket: "example-bucket")
+ |> range(start: -7d)
+ |> filter(fn: (r) => r._field == "userId" and r._value == "abcde")
+```
+
+To retrieve data more quickly, filter on a tag to reduce the number of rows scanned.
+The tag should store data that can be reasonably indexed.
+The following query filters by the `company` tag to reduce the number of rows scanned for `userId`.
+
+```js
+from(bucket: "example-bucket")
+ |> range(start: -7d)
+ |> filter(fn: (r) => r.company == "Acme")
+ |> filter(fn: (r) => r._field == "userId" and r._value == "abcde")
+```
+
+### Keep tags simple
+
+Use one tag for each data attribute.
+If your source data contains multiple data attributes in a single parameter,
+split each attribute into its own tag.
+When each tag represents one attribute (not multiple concatenated attributes) of your data,
+you'll reduce the need for regular expressions in your queries.
+Without regular expressions, your queries will be easier to write and more performant.
+
+#### Compare schemas
+
+Compare the following valid schemas represented by line protocol.
+
+**Recommended**: the following schema splits location data into `plot` and `region` tags.
+
+##### {id="good-tags-schema"}
+```
+Good Tags schema - Data encoded in multiple tags
+-------------
+weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
+weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
+```
+
+**Not recommended**: the following schema stores multiple attributes (`plot` and `region`) concatenated within the `location` tag value (`plot-1.north`).
+
+##### {id="bad-tags-schema"}
+```
+Bad Tags schema - Multiple data encoded in a single tag
-------------
weather_sensor,crop=blueberries,location=plot-1.north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,location=plot-2.midwest temp=49.8 1472515200000000000
```
-The schema 1 data encodes multiple parameters, the `plot` and `region`, into a long tag value (`plot-1.north`).
-Compare this to schema 2:
+#### Compare queries
-```
-Schema 2 - Data encoded in multiple tags
--------------
-weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
-weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
-```
+Compare queries of the [_Good Tags_](#good-tags-schema) and [_Bad Tags_](#bad-tags-schema) schemas.
+The [Flux](/{{< latest "flux" >}}/) queries calculate the average `temp` for blueberries in the `north` region.
-Schema 2 is preferable because, with multiple tags, you don't need a regular expression.
-
-#### Flux example to query schemas
-
-The following Flux examples show how to calculate the average `temp` for blueberries in the `north` region; both for schema 1 and schema 2.
+**Easy to query**: [_Good Tags_](#good-tags-schema) data is easily filtered by `region` tag values, as in the following example.
```js
-// Schema 1 - Query for multiple data encoded in a single tag
-from(bucket:"example-bucket")
- |> range(start:2016-08-30T00:00:00Z)
- |> filter(fn: (r) => r._measurement == "weather_sensor" and r.location =~ /\.north$/ and r._field == "temp")
- |> mean()
-
-// Schema 2 - Query for data encoded in multiple tags
+// Query *Good Tags* schema, data encoded in multiple tags
from(bucket:"example-bucket")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
-In schema 1, we see that querying the `plot` and `region` in a single tag makes the data more difficult to query.
-
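+
+**Difficult to query**: [_Bad Tags_](#bad-tags-schema) requires a regular expression to extract `region` from the `location` tag value, as in the following example.
+
+```js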
+// Query *Bad Tags* schema, multiple data encoded in a single tag
+from(bucket:"example-bucket")
+ |> range(start:2016-08-30T00:00:00Z)
+ |> filter(fn: (r) => r._measurement == "weather_sensor" and r.location =~ /\.north$/ and r._field == "temp")
+ |> mean()
+```
diff --git a/content/influxdb/v2.1/write-data/best-practices/resolve-high-cardinality.md b/content/influxdb/v2.1/write-data/best-practices/resolve-high-cardinality.md
index d68e6cd94..09030107f 100644
--- a/content/influxdb/v2.1/write-data/best-practices/resolve-high-cardinality.md
+++ b/content/influxdb/v2.1/write-data/best-practices/resolve-high-cardinality.md
@@ -10,11 +10,48 @@ menu:
---
If reads and writes to InfluxDB have started to slow down, high [series cardinality](/influxdb/v2.1/reference/glossary/#series-cardinality) (too many series) may be causing memory issues.
+Take steps to understand and resolve high series cardinality.
+
+1. [Learn the causes of high cardinality](#learn-the-causes-of-high-series-cardinality)
+2. [Measure series cardinality](#measure-series-cardinality)
+3. [Resolve high cardinality](#resolve-high-cardinality)
+
+## Learn the causes of high series cardinality
+
+{{% oss-only %}}
+
+ IndexDB indexes the following data elements to speed up reads:
+ - [measurement](/influxdb/v2.1/reference/glossary/#measurement)
+ - [tags](/influxdb/v2.1/reference/glossary/#tag)
+
+{{% /oss-only %}}
+{{% cloud-only %}}
+
+ IndexDB indexes the following data elements to speed up reads:
+ - [measurement](/influxdb/v2.1/reference/glossary/#measurement)
+ - [tags](/influxdb/v2.1/reference/glossary/#tag)
+ - [field keys](/influxdb/cloud/reference/glossary/#field-key)
+
+{{% /cloud-only %}}
+
+Each unique set of indexed data elements forms a [series key](/influxdb/v2.1/reference/glossary/#series-key).
+[Tags](/influxdb/v2.1/reference/glossary/#tag) containing highly variable information like unique IDs, hashes, and random strings lead to a large number of [series](/influxdb/v2.1/reference/glossary/#series), also known as high [series cardinality](/influxdb/v2.1/reference/glossary/#series-cardinality).
+High series cardinality is a primary driver of high memory usage for many database workloads.
+
+## Measure series cardinality
+
+Use the following to measure series cardinality of your buckets:
+- [`influxdb.cardinality()`](/{{< latest "flux" >}}/stdlib/influxdata/influxdb/cardinality): Flux function that returns the number of unique [series keys](/influxdb/v2.1/reference/glossary/#series) in your data.
+
+- [`SHOW SERIES CARDINALITY`](/influxdb/v2.1/query_language/spec/#show-series-cardinality): InfluxQL command that returns the number of unique [series keys](/influxdb/v2.1/reference/glossary/#series) in your data.
+
+## Resolve high cardinality
To resolve high series cardinality, complete the following steps (for multiple buckets if applicable):
1. [Review tags](#review-tags).
-2. [Adjust your schema](#adjust-your-schema).
+2. [Improve your schema](#improve-your-schema).
+3. [Delete high cardinality data](#delete-data-to-reduce-high-cardinality).
## Review tags
@@ -80,38 +117,14 @@ cardinalityByTag(bucket: "example-bucket")
|> count()
```
-These queries should help to identify the sources of high cardinality in each of your buckets. To determine which specific tags are growing, check the cardinality again after 24 hours to see if one or more tags have grown significantly.
+These queries should help identify the sources of high cardinality in each of your buckets. To determine which specific tags are growing, check the cardinality again after 24 hours to see if one or more tags have grown significantly.
-## Adjust your schema
+## Improve your schema
-Usually, resolving high cardinality is as simple as changing a tag with many unique values to a field. Review the following potential solutions for resolving high cardinality:
+To minimize cardinality in the future, design your schema for easy and performant querying.
+Review [best practices for schema design](/influxdb/v2.1/write-data/best-practices/schema-design/).
-- Delete data to reduce high cardinality
-- Design schema for read performance
+## Delete data to reduce high cardinality
-### Delete data to reduce high cardinality
-
-Consider whether you need the data causing high cardinality. In some cases, you may decide you no longer need this data, in which case you may choose to [delete the whole bucket](/influxdb/v2.1/organizations/buckets/delete-bucket/) or [delete a range of data](/influxdb/v2.1/write-data/delete-data/).
-
-### Design schema for read performance
-
-Tags are valuable for indexing, so during a query, the query engine doesn't need to scan every single record in a bucket. However, too many indexes may create performance problems. The trick is to create a middle ground between scanning and indexing.
-
-For example, if you query for specific user IDs with thousands of users, a simple query like this, where `userId` is a field, requires InfluxDB to scan every row for the `userId`:
-
-```js
-from(bucket: "example-bucket")
- |> range(start: -7d)
- |> filter(fn: (r) => r._field == "userId" and r._value == "abcde")
-```
-
-If you include a tag in your schema that can be reasonably indexed, such as a `company` tag, you can reduce the number of rows scanned and retrieve data more quickly:
-
-```js
-from(bucket: "example-bucket")
- |> range(start: -7d)
- |> filter(fn: (r) => r.company == "Acme")
- |> filter(fn: (r) => r._field == "userId" and r._value == "abcde")
-```
-
-Consider tags that can be reasonably indexed to make your queries more performant. For more guidelines to consider, see [InfluxDB schema design](/influxdb/v2.1/write-data/best-practices/schema-design/).
+Consider whether you need the data that is causing high cardinality.
+If you no longer need this data, you can [delete the whole bucket](/influxdb/v2.1/organizations/buckets/delete-bucket/) or [delete a range of data](/influxdb/v2.1/write-data/delete-data/).
diff --git a/content/influxdb/v2.1/write-data/best-practices/schema-design.md b/content/influxdb/v2.1/write-data/best-practices/schema-design.md
index 6857ddab8..eae3a2a07 100644
--- a/content/influxdb/v2.1/write-data/best-practices/schema-design.md
+++ b/content/influxdb/v2.1/write-data/best-practices/schema-design.md
@@ -1,7 +1,7 @@
---
title: InfluxDB schema design
description: >
- Improve InfluxDB schema design and data layout to reduce high cardinality and make your data more performant.
+ Design your schema for simpler and more performant queries.
menu:
influxdb_2_1:
name: Schema design
@@ -9,220 +9,238 @@ menu:
parent: write-best-practices
---
-Each InfluxDB use case is unique and your [schema](/influxdb/v2.1/reference/glossary/#schema) design reflects the uniqueness. We recommend the following design guidelines for most use cases:
+Design your [schema](/influxdb/v2.1/reference/glossary/#schema) for simpler and more performant queries.
+Follow design guidelines to make your schema easy to query.
+Learn how these guidelines lead to more performant queries.
-- [Where to store data (tag or field)](#where-to-store-data-tags-or-fields)
-- [Avoid too many series](#avoid-too-many-series)
-- [Use recommended naming conventions](#use-recommended-naming-conventions)
-
+- [Design to query](#design-to-query)
+ - [Keep measurements and keys simple](#keep-measurements-and-keys-simple)
+- [Use tags and fields](#use-tags-and-fields)
+ - [Use fields for unique and numeric data](#use-fields-for-unique-and-numeric-data)
+ - [Use tags to improve query performance](#use-tags-to-improve-query-performance)
+ - [Keep tags simple](#keep-tags-simple)
{{% note %}}
-Follow these guidelines to minimize high series cardinality and make your data more performant.
+
+Good schema design can prevent high series cardinality, resulting in better-performing queries. If you notice data reads and writes slowing down or want to learn how cardinality affects performance, see how to [resolve high cardinality](/influxdb/v2.1/write-data/best-practices/resolve-high-cardinality/).
+
{{% /note %}}
-## Where to store data (tag or field)
+## Design to query
-[Tags](/influxdb/v2.1/reference/glossary/#tag) are indexed and [fields](/influxdb/v2.1/reference/glossary/#field) are not.
-This means that querying by tags is more performant than querying by fields.
+The schemas below demonstrate [measurements](/influxdb/v2.1/reference/glossary/#measurement), [tag keys](/influxdb/v2.1/reference/glossary/#tag-key), and [field keys](/influxdb/v2.1/reference/glossary/#field-key) that are easy to query.
-In general, your queries should guide what gets stored as a tag and what gets stored as a field:
+| measurement | tag key | tag key | field key | field key |
+|----------------------|-----------|---------|-----------|-------------|
+| airSensor | sensorId | station | humidity | temperature |
+| waterQualitySensor | sensorId | station | pH | temperature |
-- Store commonly-queried meta data in tags.
-- Store data in fields if each data point contains a different value.
-- Store numeric values as fields ([tag values](/influxdb/v2.1/reference/glossary/#tag-value) only support string values).
+The `airSensor` and `waterQualitySensor` schemas illustrate the following guidelines:
+- Each measurement is a simple name that describes a schema.
+- Keys [don't repeat within a schema](#avoid-duplicate-names-for-tags-and-fields).
+- Keys [don't use reserved keywords or special characters](#avoid-keywords-and-special-characters-in-keys).
+- Tags (`sensorId` and `station`) [store metadata common across many data points](#use-tags-to-improve-query-performance).
+- Fields (`humidity`, `pH`, and `temperature`) [store numeric data](#use-fields-for-unique-and-numeric-data).
+- Fields [store unique or highly variable](#use-fields-for-unique-and-numeric-data) data.
+- Measurements and keys [don't contain data](#keep-measurements-and-keys-simple); tag values and field values will store data.
-## Avoid too many series
-
-[Tags](/influxdb/v2.1/reference/glossary/#tag) containing highly variable information like unique IDs, hashes, and random strings lead to a large number of [series](/influxdb/v2.1/reference/glossary/#series), also known as high [series cardinality](/influxdb/v2.1/reference/glossary/#series-cardinality).
-
-High series cardinality is a primary driver of high memory usage for many database workloads.
-InfluxDB uses measurements and tags to create indexes and speed up reads. However, when too many indexes created, both writes and reads may start to slow down. Therefore, if a system has memory constraints, consider storing high-cardinality data as a field rather than a tag.
-
-{{% note %}}
-If reads and writes to InfluxDB start to slow down, you may have high series cardinality (too many series). See how to [resolve high cardinality](/influxdb/v2.1/write-data/best-practices/resolve-high-cardinality/).
-{{% /note %}}
-
-## Use recommended naming conventions
-
-Use the following conventions when naming your tag and field keys:
-
-- [Avoid keywords in tag and field names](#avoid-keywords-as-tag-or-field-names)
-- [Avoid the same tag and field name](#avoid-the-same-name-for-a-tag-and-a-field)
-- [Avoid encoding data in measurement names](#avoid-encoding-data-in-measurement-names)
-- [Avoid more than one piece of information in one tag](#avoid-putting-more-than-one-piece-of-information-in-one-tag)
-
-### Avoid keywords as tag or field names
-
-Not required, but simplifies writing queries because you won't have to wrap tag or field names in double quotes.
-See [Flux keywords](/{{< latest "flux" >}}/spec/lexical-elements/#keywords) to avoid.
-
-Also, if a tag or field name contains non-alphanumeric characters, you must use [bracket notation](/{{< latest "flux" >}}/data-types/composite/record/#bracket-notation) in Flux.
-
-### Avoid the same name for a tag and a field
-
-Avoid using the same name for a tag and field key, which may result in unexpected behavior when querying data.
-
-### Avoid encoding data in measurement names
-
-InfluxDB queries merge data that falls within the same [measurement](/influxdb/v2.1/reference/glossary/#measurement), so it's better to differentiate data with [tags](/influxdb/v2.1/reference/glossary/#tag) than with detailed measurement names. If you encode data in a measurement name, you must use a regular expression to query the data, making some queries more complicated.
-
-#### Example line protocol schemas
-
-Consider the following schema represented by line protocol.
+The following points (formatted as line protocol) use the `airSensor` and `waterQualitySensor` schemas:
```
-Schema 1 - Data encoded in the measurement name
+airSensor,sensorId=A0100,station=Harbor humidity=35.0658,temperature=21.667 1636729543000000000
+waterQualitySensor,sensorId=W0101,station=Harbor pH=6.1,temperature=16.103 1472515200000000000
+```
+
+### Keep measurements and keys simple
+
+Store data in [tag values](/influxdb/v2.1/reference/glossary/#tag-value) or [field values](/influxdb/v2.1/reference/glossary/#field-value), not in [tag keys](/influxdb/v2.1/reference/glossary/#tag-key), [field keys](/influxdb/v2.1/reference/glossary/#field-key), or [measurements](/influxdb/v2.1/reference/glossary/#measurement). If you design your schema to store data in tag and field values,
+your queries will be easier to write and more efficient.
+
+{{% oss-only %}}
+
+In addition, you'll keep cardinality low by not creating measurements and keys as you write data.
+To learn more about the performance impact of high series cardinality, see how to [resolve high cardinality](/influxdb/v2.1/write-data/best-practices/resolve-high-cardinality/).
+
+{{% /oss-only %}}
+
+#### Compare schemas
+
+Compare the following valid schemas represented by line protocol.
+
+**Recommended**: the following schema stores metadata in separate `crop`, `plot`, and `region` tags. The `temp` field contains variable numeric data.
+
+##### {id="good-measurements-schema"}
+```
+Good Measurements schema - Data encoded in tags (recommended)
+-------------
+weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
+weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
+```
+
+**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the measurement, similar to Graphite metrics.
+
+##### {id="bad-measurements-schema"}
+```
+Bad Measurements schema - Data encoded in the measurement (not recommended)
-------------
blueberries.plot-1.north temp=50.1 1472515200000000000
blueberries.plot-2.midwest temp=49.8 1472515200000000000
```
-The long measurement names (`blueberries.plot-1.north`) with no tags are similar to Graphite metrics.
-Encoding the `plot` and `region` in the measurement name makes the data more difficult to query.
-
-For example, calculating the average temperature of both plots 1 and 2 is not possible with schema 1.
-Compare this to schema 2:
+**Not recommended**: the following schema stores multiple attributes (`crop`, `plot` and `region`) concatenated (`blueberries.plot-1.north`) within the field key.
+##### {id="bad-keys-schema"}
```
-Schema 2 - Data encoded in tags
+Bad Keys schema - Data encoded in field keys (not recommended)
-------------
-weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
-weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
+weather_sensor blueberries.plot-1.north.temp=50.1 1472515200000000000
+weather_sensor blueberries.plot-2.midwest.temp=49.8 1472515200000000000
```
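+
+Field keys that contain data are as awkward to query as measurements that contain data. As a rough sketch, assuming the same `example-bucket` used in the other examples on this page, finding the `north` temperature in the [_Bad Keys_](#bad-keys-schema) schema means matching `_field` with a regular expression:
+
+```js
+// Sketch: query *Bad Keys*, data encoded in field keys (not recommended)
+from(bucket: "example-bucket")
+  |> range(start: 2016-08-30T00:00:00Z)
+  // The region and the value name are both buried in the field key,
+  // so a regular expression is needed to pick them out.
+  |> filter(fn: (r) => r._measurement == "weather_sensor" and r._field =~ /\.north\.temp$/)
+  |> mean()
+```
+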
-#### Flux example to query schemas
+#### Compare queries
-Use Flux to calculate the average `temp` for blueberries in the `north` region:
+Compare the following queries of the [_Good Measurements_](#good-measurements-schema) and [_Bad Measurements_](#bad-measurements-schema) schemas.
+The [Flux](/{{< latest "flux" >}}/) queries calculate the average `temp` for blueberries in the `north` region.
+
+**Easy to query**: [_Good Measurements_](#good-measurements-schema) data is easily filtered by `region` tag values, as in the following example.
```js
-// Schema 1 - Query for data encoded in the measurement name
-from(bucket:"example-bucket")
- |> range(start:2016-08-30T00:00:00Z)
- |> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
- |> mean()
-
-// Schema 2 - Query for data encoded in tags
+// Query *Good Measurements*, data stored in separate tags (recommended)
from(bucket:"example-bucket")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
-In schema 1, we see that querying the `plot` and `region` in the measurement name makes the data more difficult to query.
-
-### Avoid putting more than one piece of information in one tag
-
-Splitting a single tag with multiple pieces into separate tags simplifies your queries and reduces the need for regular expressions.
-
-#### Example line protocol schemas
-
-Consider the following schema represented by line protocol.
+**Difficult to query**: [_Bad Measurements_](#bad-measurements-schema) requires regular expressions to extract `plot` and `region` from the measurement, as in the following example.
+```js
+// Query *Bad Measurements*, data encoded in the measurement (not recommended)
+from(bucket:"example-bucket")
+ |> range(start:2016-08-30T00:00:00Z)
+ |> filter(fn: (r) => r._measurement =~ /\.north$/ and r._field == "temp")
+ |> mean()
```
-Schema 1 - Multiple data encoded in a single tag
+
+Complex measurements make some queries much harder to write. For example, calculating the average temperature across both plots with the [_Bad Measurements_](#bad-measurements-schema) schema requires a regular expression that matches every plot's measurement name.
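+
+By contrast, with the [_Good Measurements_](#good-measurements-schema) schema, averaging `temp` across both plots needs only tag and field filters. The following is a minimal sketch, assuming the same `example-bucket` and time range as the examples above; `group()` with no arguments merges the matching series into a single table before `mean()` runs.
+
+```js
+// Sketch: average temp across all blueberry plots in *Good Measurements*
+from(bucket: "example-bucket")
+  |> range(start: 2016-08-30T00:00:00Z)
+  |> filter(fn: (r) => r._measurement == "weather_sensor" and r.crop == "blueberries" and r._field == "temp")
+  // Ungroup so that plot 1 and plot 2 are averaged together, not per series.
+  |> group()
+  |> mean()
+```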
+
+#### Keep keys simple
+
+In addition to keeping your keys free of data, follow these guidelines to make them easier to query:
+- [Avoid keywords and special characters](#avoid-keywords-and-special-characters-in-keys)
+- [Avoid duplicate names for tags and fields](#avoid-duplicate-names-for-tags-and-fields)
+
+##### Avoid keywords and special characters in keys
+
+To simplify query writing, don't include reserved keywords or special characters in tag and field keys.
+If you use [Flux keywords](/{{< latest "flux" >}}/spec/lexical-elements/#keywords) in keys,
+then you'll have to wrap the keys in double quotes.
+If you use non-alphanumeric characters in keys, then you'll have to use [bracket notation](/{{< latest "flux" >}}/data-types/composite/record/#bracket-notation) in [Flux](/{{< latest "flux" >}}/).
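+
+For example, a key that contains a hyphen can't be referenced with dot notation. The following is a minimal sketch, assuming a hypothetical tag key named `sensor-id` (it isn't part of the schemas on this page):
+
+```js
+// `sensor-id` is a hypothetical tag key used only for illustration.
+from(bucket: "example-bucket")
+  |> range(start: -1h)
+  // Dot notation (r.sensor-id) doesn't work here; use bracket notation.
+  |> filter(fn: (r) => r["sensor-id"] == "A0100")
+```
+
+A key such as `sensorId` avoids the issue and can be written as `r.sensorId`.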
+
+##### Avoid duplicate names for tags and fields
+
+Avoid using the same name for a [tag key](/influxdb/v2.1/reference/glossary/#tag-key) and a [field key](/influxdb/v2.1/reference/glossary/#field-key) within the same schema.
+Your query results may be unpredictable if you have a tag and a field with the same name.
+
+{{% cloud-only %}}
+
+{{% note %}}
+Use [explicit bucket schemas]() to enforce unique tag and field keys within a schema.
+{{% /note %}}
+
+{{% /cloud-only %}}
+
+## Use tags and fields
+
+[Tag values](/influxdb/v2.1/reference/glossary/#tag-value) are indexed and [field values](/influxdb/v2.1/reference/glossary/#field-value) aren't.
+This means that querying tags is more performant than querying fields.
+Your queries should guide what you store in tags and what you store in fields.
+
+### Use fields for unique and numeric data
+
+- Store unique or frequently changing values as field values.
+- Store numeric values as field values ([tag values](/influxdb/v2.1/reference/glossary/#tag-value) only store strings).
+
+### Use tags to improve query performance
+
+- Store values as tag values if they can be reasonably indexed.
+- Store values as [tag values](/influxdb/v2.1/reference/glossary/#tag-value) if the values are used in [filter()](/{{< latest "flux" >}}/universe/filter/) or [group()](/{{< latest "flux" >}}/universe/group/) functions.
+- Store values as tag values if the values are shared across multiple data points, that is, metadata about the field.
+
+Because InfluxDB indexes tags, the query engine doesn't need to scan every record in a bucket to locate a tag value.
+For example, consider a bucket that stores data about thousands of users. With `userId` stored in a [field](/influxdb/v2.1/reference/glossary/#field), a query for user `abcde` requires InfluxDB to scan `userId` in every row.
+
+```js
+from(bucket: "example-bucket")
+ |> range(start: -7d)
+ |> filter(fn: (r) => r._field == "userId" and r._value == "abcde")
+```
+
+To retrieve data more quickly, filter on a tag to reduce the number of rows scanned.
+The tag should store data that can be reasonably indexed.
+The following query filters by the `company` tag to reduce the number of rows scanned for `userId`.
+
+```js
+from(bucket: "example-bucket")
+ |> range(start: -7d)
+ |> filter(fn: (r) => r.company == "Acme")
+ |> filter(fn: (r) => r._field == "userId" and r._value == "abcde")
+```
+
+### Keep tags simple
+
+Use one tag for each data attribute.
+If your source data contains multiple data attributes in a single parameter,
+split each attribute into its own tag.
+When each tag represents one attribute (not multiple concatenated attributes) of your data,
+you'll reduce the need for regular expressions in your queries.
+Without regular expressions, your queries will be easier to write and more performant.
+
+#### Compare schemas
+
+Compare the following valid schemas represented by line protocol.
+
+**Recommended**: the following schema splits location data into `plot` and `region` tags.
+
+##### {id="good-tags-schema"}
+```
+Good Tags schema - Data encoded in multiple tags
+-------------
+weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
+weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
+```
+
+**Not recommended**: the following schema stores multiple attributes (`plot` and `region`) concatenated within the `location` tag value (`plot-1.north`).
+
+##### {id="bad-tags-schema"}
+```
+Bad Tags schema - Multiple data encoded in a single tag
-------------
weather_sensor,crop=blueberries,location=plot-1.north temp=50.1 1472515200000000000
weather_sensor,crop=blueberries,location=plot-2.midwest temp=49.8 1472515200000000000
```
-The schema 1 data encodes multiple parameters, the `plot` and `region`, into a long tag value (`plot-1.north`).
-Compare this to schema 2:
+#### Compare queries
-```
-Schema 2 - Data encoded in multiple tags
--------------
-weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000
-weather_sensor,crop=blueberries,plot=2,region=midwest temp=49.8 1472515200000000000
-```
+Compare queries of the [_Good Tags_](#good-tags-schema) and [_Bad Tags_](#bad-tags-schema) schemas.
+The [Flux](/{{< latest "flux" >}}/) queries calculate the average `temp` for blueberries in the `north` region.
-Schema 2 is preferable because, with multiple tags, you don't need a regular expression.
-
-#### Flux example to query schemas
-
-The following Flux examples show how to calculate the average `temp` for blueberries in the `north` region; both for schema 1 and schema 2.
+**Easy to query**: [_Good Tags_](#good-tags-schema) data is easily filtered by `region` tag values, as in the following example.
```js
-// Schema 1 - Query for multiple data encoded in a single tag
-from(bucket:"example-bucket")
- |> range(start:2016-08-30T00:00:00Z)
- |> filter(fn: (r) => r._measurement == "weather_sensor" and r.location =~ /\.north$/ and r._field == "temp")
- |> mean()
-
-// Schema 2 - Query for data encoded in multiple tags
+// Query *Good Tags* schema, data encoded in multiple tags
from(bucket:"example-bucket")
|> range(start:2016-08-30T00:00:00Z)
|> filter(fn: (r) => r._measurement == "weather_sensor" and r.region == "north" and r._field == "temp")
|> mean()
```
-In schema 1, we see that querying the `plot` and `region` in a single tag makes the data more difficult to query.
-
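+**Difficult to query**: [_Bad Tags_](#bad-tags-schema) requires regular expressions to parse the concatenated `location` values, as in the following example.
+
+```js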
+// Query *Bad Tags* schema, multiple data encoded in a single tag
+from(bucket:"example-bucket")
+ |> range(start:2016-08-30T00:00:00Z)
+ |> filter(fn: (r) => r._measurement == "weather_sensor" and r.location =~ /\.north$/ and r._field == "temp")
+ |> mean()
+```