influxdb/query
Andrew Lamb 5fb3e00f2a
fix: Properly record total_count and null_count in statistics (#2103)
* fix: fix statistics calculation in mutable_buffer

* refactor: expose null counts in read_buffer

* refactor: expose null_count in parquet_file

* fix: update server crate tests

* fix: update query_tests tests

* docs: tweak comments

* refactor: Use storage_stats rather than adding `null_count`

* refactor: rename test data field for clarity

* fix: fixup merge conflicts

* refactor: rename initial_non_null_count to initial_total_count

* refactor: calculate null_count as row_count - to_add
2021-07-26 18:13:36 +00:00
src fix: Properly record total_count and null_count in statistics (#2103) 2021-07-26 18:13:36 +00:00
Cargo.toml chore: update to arrow 5.0 and master datafusion (#2049) 2021-07-19 12:49:51 +00:00
README.md docs: Add query README file and explain some rationale (#648) 2021-01-12 18:26:32 -05:00

IOx Query Layer

The IOx query layer is responsible for translating query requests written in different query languages, then planning and executing them against Chunks stored across the various IOx storage systems.

Query Frontends

  • SQL
  • Storage gRPC
  • Flux (possibly in the future)
  • InfluxQL (possibly in the future)
  • Others (possibly in the future)

Sources of Chunk data

  • ReadBuffer
  • MutableBuffer
  • Parquet Files
  • Others (possibly in the future, like Remote Chunk?)

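The goal is to use a shared query / plan representation to avoid implementing N*M combinations of query language and Chunk source: each of the N frontends targets the shared representation, and each of the M Chunk sources implements it once.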

Thus, query planning is implemented in terms of traits, and each Chunk source provides its own implementation of those traits.

Among other things, this means that this crate should not depend directly on the ReadBuffer or the MutableBuffer.

┌───────────────┐  ┌────────────────┐    ┌──────────────┐      ┌──────────────┐
│Mutable Buffer │  │  Read Buffer   │    │Parquet Files │  ... │Future Source │
│               │  │                │    │              │      │              │
└───────────────┘  └────────────────┘    └──────────────┘      └──────────────┘
        ▲                   ▲                    ▲                     ▲
        └───────────────────┴─────────┬──────────┴─────────────────────┘
                                      │
                                      │
                     ┌─────────────────────────────────┐
                     │          Shared Common          │
                     │   Predicate, Plans, Execution   │
                     └─────────────────────────────────┘
                                      ▲
                                      │
                                      │
               ┌──────────────────────┼─────────────────────────┐
               │                      │                         │
               │                      │                         │
               │                      │                         │
     ┌───────────────────┐  ┌──────────────────┐      ┌──────────────────┐
     │   SQL Frontend    │  │   gRPC Storage   │ ...  │ Future Frontend  │
     │                   │  │     Frontend     │      │ (e.g. InfluxQL)  │
     └───────────────────┘  └──────────────────┘      └──────────────────┘
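
As a rough illustration of the layering in the diagram above, here is a minimal Rust sketch. Every name in it (`Chunk`, `Predicate`, `count_matching`, the two toy sources, and the two toy frontends) is hypothetical and chosen for this example only; none of it is the actual API of this crate. It assumes a deliberately tiny "predicate" (a single lower bound on an integer column) so the structure stays visible.

```rust
// Hypothetical sketch: these types are illustrative and are NOT the IOx API.

/// A simplified predicate: keep values that are at least `min`.
pub struct Predicate {
    pub min: i64,
}

/// The trait the shared query layer plans against. Each storage system
/// implements it, so the shared code never names a concrete source.
pub trait Chunk {
    fn name(&self) -> &str;
    /// Return the values in this chunk that satisfy `predicate`.
    fn scan(&self, predicate: &Predicate) -> Vec<i64>;
}

/// Stand-in for a write-optimized source that keeps rows in arrival order.
pub struct ToyMutableBuffer {
    pub rows: Vec<i64>,
}

/// Stand-in for a read-optimized source that keeps values sorted ascending.
pub struct ToyReadBuffer {
    pub sorted: Vec<i64>,
}

impl Chunk for ToyMutableBuffer {
    fn name(&self) -> &str {
        "toy_mutable_buffer"
    }

    fn scan(&self, predicate: &Predicate) -> Vec<i64> {
        // Unsorted data: check every row.
        self.rows
            .iter()
            .copied()
            .filter(|v| *v >= predicate.min)
            .collect()
    }
}

impl Chunk for ToyReadBuffer {
    fn name(&self) -> &str {
        "toy_read_buffer"
    }

    fn scan(&self, predicate: &Predicate) -> Vec<i64> {
        // Sorted data: binary-search for the first matching value.
        let start = self.sorted.partition_point(|v| *v < predicate.min);
        self.sorted[start..].to_vec()
    }
}

/// "Shared common" execution: written once, works for any `Chunk`.
pub fn count_matching(chunks: &[Box<dyn Chunk>], predicate: &Predicate) -> usize {
    chunks.iter().map(|c| c.scan(predicate).len()).sum()
}

/// Two toy frontends. Both lower their request to the same `Predicate`,
/// so neither needs to know which sources exist.
pub fn sql_frontend(min: i64) -> Predicate {
    Predicate { min }
}

pub fn grpc_frontend(min: i64) -> Predicate {
    Predicate { min }
}

/// Sample data used by `main`.
pub fn sample_chunks() -> Vec<Box<dyn Chunk>> {
    vec![
        Box::new(ToyMutableBuffer { rows: vec![5, 1, 3] }),
        Box::new(ToyReadBuffer { sorted: vec![2, 4, 6] }),
    ]
}

fn main() {
    let chunks = sample_chunks();
    for chunk in &chunks {
        println!("source: {}", chunk.name());
    }
    println!("sql matched: {}", count_matching(&chunks, &sql_frontend(3)));
    println!("grpc matched: {}", count_matching(&chunks, &grpc_frontend(7)));
}
```

In this sketch, adding a third source means writing one more `impl Chunk`, and adding a third frontend means writing one more function that produces a `Predicate`. Neither change touches `count_matching`.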

We are trying to avoid ending up with something like this:

                          ┌─────────────────────────────────────────────────┐
                          │                                                 │
                          ▼                                                 │
                   ┌────────────┐                                           │
                   │Read Buffer │                  ┌────────────────────────┤
        ┌──────────┼────────────┼─────┬────────────┼────────────────────────┤
        │          └────────────┘     │            ▼                        │
        ▼                 ▲           │    ┌──────────────┐                 │
┌───────────────┐         │           │    │Parquet Files │                 │
│Mutable Buffer │         │           ├───▶│              │...              │
│               │◀────────┼───────────┤    └──────────────┘   ┌─────────────┼┐
└───────────────┘         │           │            ▲          │Future Source││
        ▲                 │           ├────────────┼─────────▶│             ││◀─┐
        │                 │           │            │          └─────────────┼┘  │
        │                 │           │            │                        │   │
        │                 │           │            │                        │   │
        │      ┌──────────┘           │            │                        │   │
        │      │                      │            │                        │   │
        │      ├──────────────────────┼────────────┘                        │   │
        └──────┤                      │                                     │   │
               │                      │                                     │   │
               │                      │                                     │   │
               │                      │                                     │   │
               │                      │                                     │   │
               │                      │                                     │   │
               │                      │                                     │   │
               │                      │                                     │   │
     ┌───────────────────┐  ┌──────────────────┐      ┌──────────────────┐  │   │
     │   SQL Frontend    │  │   gRPC Storage   │ ...  │ Future Frontend  │  │   │
     │                   │  │     Frontend     │      │ (e.g. InfluxQL)  │──┴───┘
     └───────────────────┘  └──────────────────┘      └──────────────────┘