Commit Graph

301 Commits (418cc4cf0eafffcd56dcf35de07ce0892dc05557)

Author SHA1 Message Date
Edd Robinson 326016966f refactor: update read_buffer/src/row_group.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-03-22 16:15:49 +00:00
Edd Robinson ffa06e4f0e test: more test coverage 2021-03-22 16:15:49 +00:00
Edd Robinson e5f7cc143a refactor: use AggregateVec for results 2021-03-22 16:15:49 +00:00
Edd Robinson 3725161fc3 refactor: use AggregateVec on rle grouping 2021-03-22 16:15:49 +00:00
Edd Robinson fd7cdc4fa4 refactor: remove single column group_by 2021-03-22 16:15:49 +00:00
Edd Robinson 9c48cebf78 refactor: column-wise aggregates 2021-03-22 16:15:49 +00:00
Edd Robinson d88e20c7fe fix: fix bad expected cardinality 2021-03-22 16:15:49 +00:00
Andrew Lamb 6e1795fda0
refactor: Move some types (not yet exposed to clients) into internal_types (#1015)
* refactor: Move some types (not yet exposed to clients) into internal_types

* docs: Add README.md explaining the rationale

* refactor: remove some stragglers

* fix: fix benches

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: add clippy lints

* fix: fmt

* docs: Apply suggestions from code review

fix typos

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-19 16:27:57 +00:00
Raphael Taylor-Davies 65f7a1ac5b
fix: use consistent crate versions (#989)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-15 15:42:19 +00:00
Andrew Lamb 6ac7e2c1a7
feat: Add management API and CLI to list chunks (#968)
* feat: Add management API and CLI to list chunks

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: add comment to protobuf

* fix: fix comment

* fix: fmt, fixup merge errors

* fix: fascinating type dance with prost generated types

* fix: clippy

* fix: move command to influxdb_iox database chunk list

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-03-12 13:56:14 +00:00
Edd Robinson 41a0784603 refactor: enable clipp Self 2021-03-02 15:51:13 +00:00
Edd Robinson 58f5ad5da2 refactor: tidy up with clippy 2021-03-01 18:45:39 +00:00
Edd Robinson 5b329996a9 refactor: update read_buffer/src/chunk.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-03-01 11:42:54 +00:00
Edd Robinson 5bf164f5a9 refactor: PR feedback 2021-03-01 11:42:54 +00:00
Edd Robinson 01791fbc07 feat: wire up column_values to exposed API 2021-03-01 11:42:54 +00:00
Edd Robinson d9e8132a3a refactor: wire up column_values for row_group 2021-03-01 11:42:54 +00:00
Edd Robinson 9b1346ddea feat: wire up checking non-null values on dictionaries 2021-03-01 11:42:54 +00:00
Edd Robinson fcc978bb75 refactor: wire up distinct_values with iterator 2021-03-01 11:42:54 +00:00
Edd Robinson 7d0248cc94 feat: implement distinct_values on dictionary 2021-03-01 11:42:54 +00:00
Edd Robinson cd83fcbfdb feat: implement Selection on column_names 2021-02-22 15:32:55 +00:00
Marko Mikulicic b8dc4c93dc docs: Rename read group to row group 2021-02-19 23:37:51 +00:00
Edd Robinson baa45d2c4c refactor: unlock the power of STRINGS 2021-02-17 21:17:56 +00:00
Edd Robinson b5922b6a08 refactor: apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-02-16 22:07:45 +00:00
Edd Robinson bedb748ae0 feat: implement size on rest of read buffer 2021-02-16 22:01:27 +00:00
Edd Robinson 6d694a0fd3 refactor: remove size/rows from column meta 2021-02-16 22:01:27 +00:00
Edd Robinson f952d673ba feat: add size on bool encoding 2021-02-16 22:01:27 +00:00
Edd Robinson 497ab1140b feat: implement size on fixed_null encoding 2021-02-16 22:01:27 +00:00
Edd Robinson 5c65e7072e feat: implement size on RLE 2021-02-16 22:01:27 +00:00
Edd Robinson e9f1b0f3e2 refactor: add arc clone lint 2021-02-15 12:35:14 +00:00
Edd Robinson 0fe590cedd refactor: move row/value concepts into module 2021-02-15 11:14:14 +00:00
Edd Robinson 11453eca46 refactor: move StringEncoding to module 2021-02-15 11:14:14 +00:00
Edd Robinson 3769724009 refactor: move BooleanEncoding into own module 2021-02-15 11:14:14 +00:00
Edd Robinson e2d798da05 refactor: move FloatEncoding to module 2021-02-15 11:14:14 +00:00
Edd Robinson cbf79cb822 refactor: use Self 2021-02-15 11:14:14 +00:00
Edd Robinson 1da4514112 refactor: move IntegerEncoding to own module 2021-02-15 11:14:14 +00:00
Edd Robinson edc217c783 refactor: move encodings 2021-02-15 11:14:14 +00:00
Marko Mikulicic 9e39e91139 chore: Cleaning things in prep for rust 2021
Also remove a NUL byte in a test string literal; some editors drop them.
2021-02-12 16:48:17 +00:00
Andrew Lamb 92d237988d
refactor: Change ReadBuffer column_names to return Option<BTreeSet> (#792)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-11 15:16:10 +00:00
Andrew Lamb a316b16960
feat: Change table_names to return either Some(set) or None, rather than a plan (try 2) (#776)
* feat: Change table_names to return either Some(set) or None, rather than a plan

* docs: improve comments

* docs: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: merge conflict

* fix: don't clone a string unless needed

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-02-09 12:20:59 -05:00
Andrew Lamb 8399c56587
feat: remove RwLock on entire ReadBuffer (#761) 2021-02-05 16:58:17 -05:00
Edd Robinson f0748cc379 refactor: address PR feedback
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-02-05 15:28:20 +00:00
Edd Robinson 48b29a9c72 refactor: change name back 2021-02-05 15:28:20 +00:00
Edd Robinson 6bec4c6eef feat: expose column_names via external API 2021-02-05 15:28:20 +00:00
Edd Robinson fd28738abf fix: implement on all types 2021-02-05 15:28:20 +00:00
Edd Robinson 519db7e8f9 feat: implement column_names on row group 2021-02-05 15:28:20 +00:00
Edd Robinson 115e542e70 feat: add non-null checking to column abstraction 2021-02-05 15:28:20 +00:00
Edd Robinson 4614abb7f8 feat: teach encoders ability to detect non-null values 2021-02-05 15:28:20 +00:00
Carol (Nichols || Goulding) fbf776c6b3
chore: Clean up Cargo.tomls (#754)
* fix: test_helpers crate should only be a dev-dep

* fix: object_store no longer has a build script, so no longer needs a build dep

* chore: Alphabetize all Cargo.tomls
2021-02-04 18:56:02 -05:00
Andrew Lamb 288861e646
feat: implement table_schema in partition chunk, mutable buffer, read buffer (#705)
fix: sort output schema by name

fix: Update data_types/src/schema.rs

Co-authored-by: Edd Robinson <me@edd.io>

refactor: Update read_buffer/src/lib.rs

Co-authored-by: Edd Robinson <me@edd.io>

Co-authored-by: Edd Robinson <me@edd.io>
2021-02-01 13:54:58 -05:00
Edd Robinson 2885b0528b refactor: address PR comments 2021-02-01 16:56:56 +00:00
Edd Robinson d44099d242
refactor: update read_buffer/src/column/bool.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-02-01 15:17:51 +00:00
Edd Robinson 7e1ac34906 refactor: recordbatch boolean -> Bool 2021-02-01 12:29:04 +00:00
Edd Robinson 172c8d146c refactor: wire up bool encoding to read buffer column 2021-02-01 12:02:14 +00:00
Edd Robinson 36d9541cbc feat: an arrow-backed boolean encoding 2021-02-01 12:02:14 +00:00
Edd Robinson 679e04fdb3 refactor: remove redundant From implementations 2021-02-01 10:52:27 +00:00
Edd Robinson 42e2178110 refactor: simplify arrow -> read buffer column 2021-01-31 21:19:18 +00:00
Edd Robinson 28b596b883 refactor: wire up fixed null encoding 2021-01-31 21:12:47 +00:00
Edd Robinson 0195dfc03a test: add coverage for NULL values in queries 2021-01-31 21:12:06 +00:00
Edd Robinson 4d107334dd test: adjust number of rows 2021-01-31 12:01:26 +00:00
Edd Robinson 0bed5e2290
refactor: update read_buffer/src/row_group.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-01-31 11:55:04 +00:00
Edd Robinson 02c154c746 refactor: min/max support 2021-01-30 22:58:39 +00:00
Edd Robinson 9cc9c714ed refactor: wire up sum 2021-01-30 22:42:58 +00:00
Edd Robinson fd25b6a9e2 refactor: wire up aggregate count 2021-01-30 21:01:25 +00:00
Edd Robinson 75a2e1caff refactor: results can just contain aggregates 2021-01-30 10:02:03 +00:00
Edd Robinson a71d3eea86 refactor: address PR feedback 2021-01-29 22:01:51 +00:00
Edd Robinson 3bb58fe971 refactor: tidy up commented code 2021-01-29 22:01:51 +00:00
Edd Robinson 46f20df089 refactor: change Rc -> Arc 2021-01-29 22:01:51 +00:00
Edd Robinson ec10c81041 refactor: simplify chunk locking implementation 2021-01-29 22:01:51 +00:00
Edd Robinson 6b6c1476f6 refactor: implement table meta-data rebuilding 2021-01-29 22:01:51 +00:00
Edd Robinson 13cbc12298 feat: make database support concurrent access 2021-01-29 22:01:51 +00:00
Edd Robinson bc08c6b404 feat: support concurrent access to Chunk 2021-01-29 22:01:51 +00:00
Edd Robinson 30b90943bc feat: make Table concurrent-safe 2021-01-29 22:01:51 +00:00
Edd Robinson 050185ad92 refactor: ensure meta updated when rowgroup add/removed 2021-01-29 22:01:51 +00:00
Edd Robinson 338bbb9b55 refactor: materialise rb for read_aggregate at table 2021-01-29 22:01:51 +00:00
Edd Robinson e3afab12a7 refactor: rb from table read_filter 2021-01-29 22:01:51 +00:00
Edd Robinson 9d3c623a14 refactor: baseline Rc 2021-01-29 22:01:51 +00:00
Andrew Lamb 2282a68e65
refactor: Move selection to the data_types crate and remove redundant implementation (#704) 2021-01-29 13:35:07 -05:00
Edd Robinson c8ce27ce5e perf: add benchmark for table_names
This commit adds some benchmarks for `table_names` against the read
buffer's Database implementation. On my laptop these look like:

database_table_names_all_tables
                        time:   [2.2104 us 2.2242 us 2.2381 us]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

database_table_names_meta_pred_no_match
                        time:   [1.8389 us 1.8488 us 1.8593 us]
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

database_table_names_single_pred_match
                        time:   [5.5457 us 5.5694 us 5.5919 us]
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

database_table_names_multi_pred_match
                        time:   [478.85 us 480.32 us 481.83 us]
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

database_table_names_multi_pred_match_multi_tables
                        time:   [476.47 us 478.93 us 482.25 us]
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
2021-01-26 17:00:53 +00:00
Edd Robinson 42d629ac32 feat: wire up predicate support to external API 2021-01-26 17:00:53 +00:00
Edd Robinson c89a569e03 feat: add per-chunk pred support in table_names 2021-01-26 17:00:53 +00:00
Edd Robinson 8a23e22957 feat: determine if row group satisfies predicate 2021-01-26 17:00:53 +00:00
Andrew Lamb c3b0371c84
feat: Initial RPC Query Frontend (#692)
* feat: Initial RPC Query Frontend

* docs: s/immutable buffer/mutable buffer

* docs: Correct type in docstring
2021-01-25 08:33:39 -05:00
Edd Robinson 3da9b73464 refactor: add assertion 2021-01-25 11:26:15 +00:00
Edd Robinson 5fe5ed0569 test: more aggregate coverage 2021-01-25 11:21:20 +00:00
Edd Robinson c60cfbd2bb refactor: update read_buffer/src/lib.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-01-25 11:21:20 +00:00
Edd Robinson a4bde5c252 refactor: error on unsupported aggregates 2021-01-25 11:21:20 +00:00
Edd Robinson d848526124 refactor: tidy up 2021-01-25 11:21:20 +00:00
Edd Robinson 9270874760 feat: wire up read_aggregate to external API 2021-01-25 11:21:20 +00:00
Edd Robinson e6b8d0e072 feat: add support for converting to record batch: 2021-01-25 11:21:20 +00:00
Edd Robinson 09ec6b78d3 feat: add ability to merge ReadAggregateResults 2021-01-25 11:21:20 +00:00
Andrew Lamb 7969808f09
feat: Chunk Migration APIs and query data in the read buffer via SQL (#668)
* feat: Chunk Migration APIs and query data in the read buffer via SQL

* fix: Make code more consistent

* fix: fmt / clippy

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: Remove unnecessary Result and make chunks() infallible

* chore: Apply more suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Edd Robinson <me@edd.io>
2021-01-19 13:28:26 -05:00
Edd Robinson 221ed86853 refactor: apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-01-19 15:33:44 +00:00
Edd Robinson 57365b082c refactor: clean up filter_map 2021-01-19 15:33:44 +00:00
Edd Robinson dbdd885e58 refactor: follow snafu style guide 2021-01-19 15:33:44 +00:00
Edd Robinson 5f6335573b fix: ensure table missing handled 2021-01-19 15:33:44 +00:00
Edd Robinson 9f65c4b6ef refactor: encapsulate column meta data 2021-01-19 15:33:44 +00:00
Edd Robinson 0c7424465d refactor: use schema type for read_filter 2021-01-19 15:33:44 +00:00
Edd Robinson 93f4f8aa41 feat: teach read_buffer schema -> data_types schema 2021-01-19 15:33:44 +00:00
Edd Robinson 864e9e4dac refactor: tidy up columns in row group 2021-01-19 15:33:44 +00:00
Edd Robinson 71fce96b3b feat: encapsulate semantic column type in result schema 2021-01-19 15:33:44 +00:00
Edd Robinson 17358589ed refactor: move AggregateType to schema 2021-01-19 15:33:44 +00:00
Edd Robinson d805ce6189 refactor: move LogicalDataType into Schema 2021-01-19 15:33:44 +00:00
Edd Robinson e34979532d refactor: fix Display implementation 2021-01-18 12:05:11 +00:00
Edd Robinson bdeacdcf37 docs: define merging 2021-01-18 12:05:11 +00:00
Edd Robinson e2a16d7d4c refactor: fix error handling 2021-01-18 12:05:11 +00:00
Edd Robinson 3c61fdb773 refactor: update read_buffer/src/row_group.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-01-18 12:05:11 +00:00
Edd Robinson a82bf8de78 refactor: Update read_buffer/src/chunk.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-01-18 12:05:11 +00:00
Edd Robinson 26d11be09c refactor: wire up read_aggregate to chunk 2021-01-18 12:05:11 +00:00
Edd Robinson eaabb82404 refactor: wire up read_aggregate in table 2021-01-18 12:05:11 +00:00
Edd Robinson 0a4d890f31 refactor: move schema information for results into type 2021-01-18 12:05:11 +00:00
Edd Robinson 6dbb00715e refactor: rename method 2021-01-18 12:05:11 +00:00
Edd Robinson 42c0ccf274 refactor: rename method to 2021-01-18 12:05:11 +00:00
Andrew Lamb 71627120b9
refactor: consolidate line protocol schema creation into data_types and port code to use it (#663)
* refactor: consolidate line protocol schema creation into data_types, and port code to use it

refactor: Port mutable buffer to use SchemaBuilder

* fix: doctest

* refactor: remove unnecessary clippyisms

* docs: Improve comments via suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>

* refactor: use more idiomatic try_ naming and TryInto trait

* docs: Change from line protocol data model to InfluxDB data model

* refactor: rename LP --> Influx in code

* feat: add support for UInteger type

Co-authored-by: Edd Robinson <me@edd.io>
2021-01-15 17:29:30 -05:00
Edd Robinson 7f4f44211f test: fix broken test 2021-01-15 12:54:24 +00:00
Edd Robinson c6ff633afd test: expose internals for benchmarking
This commit is a bit of a hack. The first thing I could think of. The
problem is that I want to be able to benchmark various modules in the
read buffer but I don't want to expose those internals via the external
API.

Because criterion only lets you exercise the exported API I needed to
expose some internals. I did this by creating a documented module
`benchmarks` in the `read_buffer` crate, which re-exports identifiers
that can be used by a criterion crate.

The idea is that it will be clear that this module is not part of the
public API.
2021-01-14 22:46:27 +00:00
Edd Robinson d55bc4835a refactor: rename method 2021-01-14 21:45:54 +00:00
Edd Robinson d82293ba23 refactor: add API method 2021-01-14 21:26:50 +00:00
Edd Robinson 1d5bc6f345 refactor: expose column_names for getting tag keys 2021-01-14 21:26:50 +00:00
Edd Robinson 676b58dc2c fix: ensure correct chunks used 2021-01-14 21:26:50 +00:00
Edd Robinson 4c4d2e8e67 refactor: add API method for read_aggregate_window 2021-01-14 21:26:50 +00:00
Edd Robinson 3026668878 refactor: add API method for read_aggregate 2021-01-14 21:26:50 +00:00
Edd Robinson bdb8a78569 refactor: reduce expose API 2021-01-14 21:26:50 +00:00
Edd Robinson d536f82879 fix: get benchmarks compiling 2021-01-14 14:06:17 +00:00
Edd Robinson 6c5fdc0fae refactor: clean up comment 2021-01-14 13:46:20 +00:00
Edd Robinson 2ba438cf4f refactor: update read_buffer/src/row_group.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-01-14 13:46:20 +00:00
Edd Robinson 05e083d0e2 feat: support converting from DF types 2021-01-14 13:14:08 +00:00
Edd Robinson 728d556b7d refactor: encapsulate read buffer predicates in type 2021-01-14 13:14:08 +00:00
Andrew Lamb a5240af080
docs: Document desired crate dependencies in comments (#638)
* docs: Document the desire for read buffer and mutable buffer to be independent of query layer

* docs: Document desire for the query layer to not depend on storage systems

* fix: Apply suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>

Co-authored-by: Edd Robinson <me@edd.io>
2021-01-12 17:49:03 -05:00
Edd Robinson 9ec0ae26e1 refactor: implement From<ArrayRef> 2021-01-11 16:11:21 +00:00
Edd Robinson 9eef7b4d7f feat: add enum for selecting 'all' columns 2021-01-11 16:11:21 +00:00
Edd Robinson 61466fed44 refactor: add outline for read_aggregate 2021-01-11 16:11:21 +00:00
Edd Robinson 5a15a11a5c feat: lazily return record batches for read_filter 2021-01-11 16:11:21 +00:00
Edd Robinson c3019a91bd feat: add support for determining logical column types 2021-01-11 16:11:20 +00:00
Edd Robinson 1d972e01c8 refactor: Update read_buffer/src/lib.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-01-08 21:03:38 +00:00
Edd Robinson 23f27e10fa feat: add partition concept to ReadBuffer 2021-01-08 21:03:38 +00:00
Edd Robinson 46f85bb6a6 refactor: address PR comments
Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Dom <dom@itsallbroken.com>
2021-01-08 16:19:19 +00:00
Edd Robinson 4ce6821d90 feat: implement table_names on 2021-01-08 16:19:19 +00:00
Edd Robinson 590b74e386 refactor: add test for adding chunk to database 2021-01-08 16:19:19 +00:00
Edd Robinson 6df2de62bb refactor: provide an API for table row groups 2021-01-08 16:19:19 +00:00
Edd Robinson 954da31e83 test: fix tests 2021-01-08 16:19:19 +00:00
Edd Robinson 2178a6eae4 feat: hook up record batch -> chunk to store 2021-01-08 16:19:19 +00:00
Edd Robinson b1ab6a189d feat: record batch -> read buffer column 2021-01-08 16:19:19 +00:00
Edd Robinson 8382501440 refactor: validate column types 2021-01-08 16:19:19 +00:00
Andrew Lamb 8219403fab
feat: Instantiate ReadBuffer as part of server creation (#620)
* feat: Instantiate ReadBuffer as part of server creation

* refactor: remove Store from read_buffer
2021-01-07 13:25:42 -05:00
Edd Robinson 937442cfa0 refactor: update partition refs to chunk 2020-12-28 21:08:56 +00:00
Edd Robinson c46cf6fdcf refactor: rename partition to chunk 2020-12-28 21:08:56 +00:00
Edd Robinson c0dc93a8cb refactor: change partition to chunk 2020-12-28 21:08:56 +00:00
Edd Robinson fa8afe845d refactor: fix benchmarks 2020-12-22 21:26:05 +00:00
Edd Robinson b1aabc14b2 refactor: rename Segment to RowGroup 2020-12-22 21:26:04 +00:00
Edd Robinson 0af935d123 refactor: rename segment module to row_group 2020-12-22 21:26:04 +00:00
Edd Robinson 199ba68769 refactor: rename segment_store crate to read_buffer 2020-12-22 21:26:04 +00:00