Commit Graph

59 Commits (488252d9f106497f1805a2e2638f1df5c6b39e05)

Author SHA1 Message Date
Jacob Marble 488252d9f1 fix: test 2021-03-09 07:52:52 -08:00
Jacob Marble 09c4f88758 fix: cargo fmt 2021-03-09 07:45:05 -08:00
Jacob Marble abb991ffca fix: add tests to unsigned integer parsing in LP and MUB 2021-03-08 15:29:05 -08:00
Jacob Marble ac1b0c04ae fix(line-protocol): add unsigned integer field type
Fixes #904

The line protocol parser was lacking the unsigned integer type, which
suffixes values with `u`. This adds unsigned integer support to the line
protocol parser, and fills a few corresponding gaps in the mutable
buffer.
2021-03-08 09:59:12 -08:00
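A minimal sketch of the suffix rule described in the commit above, not the parser's actual code. `FieldValue` and `parse_numeric_field` are hypothetical names introduced for illustration. It assumes the usual line protocol convention that a trailing `i` marks a signed integer and an unsuffixed number is a float, alongside the `u` suffix the commit adds.

```rust
/// Hypothetical field value type for illustration only; the real
/// line protocol parser defines its own types.
#[derive(Debug, PartialEq)]
enum FieldValue {
    I64(i64),
    U64(u64),
    F64(f64),
}

/// Parse a numeric field value by inspecting its suffix.
/// `u` means unsigned integer, `i` means signed integer, and
/// anything else is attempted as a float. Returns `None` when the
/// digits do not fit the requested type (for example `-1u`).
fn parse_numeric_field(raw: &str) -> Option<FieldValue> {
    if let Some(digits) = raw.strip_suffix('u') {
        digits.parse::<u64>().ok().map(FieldValue::U64)
    } else if let Some(digits) = raw.strip_suffix('i') {
        digits.parse::<i64>().ok().map(FieldValue::I64)
    } else {
        raw.parse::<f64>().ok().map(FieldValue::F64)
    }
}
```

Under these assumptions, `parse_numeric_field("42u")` yields `Some(FieldValue::U64(42))`, while `parse_numeric_field("-1u")` yields `None` because a negative value cannot be an unsigned integer.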
Andrew Lamb 746373a687
refactor: Remove mutable_buffer crate dependency on query crate (#927) 2021-03-05 11:34:27 +00:00
Andrew Lamb a6965769b4
refactor: Remove impl of `query::PartitionChunk` in mutable_buffer: This PR (#923) 2021-03-05 10:51:08 +00:00
Andrew Lamb b1e5cfedf7
refactor: Move chunk predicate creation from query::Predicate into server crate (#922)
* refactor: Create `ChunkPredicate` predicate in the Server crate

* refactor: move chunk predicate out of chunk

* fix: remove comment in server/src/db/pred.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-04 23:23:43 +00:00
Andrew Lamb 3be5c26f92
refactor: Remove impl of `query::Database` in mutable_buffer (#914) 2021-03-04 22:02:42 +00:00
Andrew Lamb 945d2f8d45
refactor: Remove query stuff from mutable buffer (#912)
* refactor: remove query code from mutable_buffer database.rs

* refactor: remove query code from mutable_buffer table.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-04 21:42:56 +00:00
Andrew Lamb 8b1f100df3
feat: make read_group and read_window_aggregate work across chunks (#905)
* feat: make read_group and read_window_aggregate work across chunks

* refactor: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: Update query/src/frontend/influxrpc.rs

Improve logic and use strings directly

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: fmt

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-04 17:06:31 +00:00
Andrew Lamb 7d8d00781c
feat: Make read_filter work for mutable buffer and read buffer (#882)
* feat: port read_filter to InfluxRPCPlanner

* fix: remove commented out vestigial test

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: fmt

* fix: Update arrow_deps/src/util.rs

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-03-01 16:50:29 +00:00
Nga Tran 18de3bdcab chore: merge main into branch
Merge branch 'main' into ntran/optimize_column_selection
2021-02-26 15:29:43 -05:00
Nga Tran f37e5846aa feat: fmt auto fix 2021-02-26 14:56:10 -05:00
NGA TRAN eb81975151 feat: Optimize Column Selection 2021-02-26 14:28:46 -05:00
Andrew Lamb 12deacd8a0
refactor: move SeriesSetPlans into its own module (#878)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-25 23:12:39 +00:00
Andrew Lamb 8fb7651719
feat: Port tag_values to the InfluxRPCPlanner (#859)
* feat: Port tag_values to the InfluxRPCPlanner

* refactor: merge imports

* refactor: rename column_names to tag_column_names for clarity

* fix: Update query/src/frontend/influxrpc.rs

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: use ensure!

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: less silly whitespace

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: code review comments

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-24 23:11:22 +00:00
Andrew Lamb ed7859e182
refactor: rename tag_column_names --> tag_keys in InfluxRPCPlanner (#860) 2021-02-23 17:04:53 +00:00
Carol (Nichols || Goulding) 56c515a063 refactor: Simplify partition sorting code 2021-02-22 11:19:47 -05:00
Carol (Nichols || Goulding) 04e2631f26 refactor: Switch to explicit Arc::clone 2021-02-22 11:10:07 -05:00
Carol (Nichols || Goulding) d0707725cf Merge remote-tracking branch 'origin/main' into pd-mutable-buffer-data-eviction 2021-02-22 10:21:59 -05:00
Edd Robinson 92eb8b9e85 refactor: make certain Database method sync
A couple of methods don't seem to have any await points in their
implementations, so it feels like they could just be `sync`.
2021-02-19 17:14:17 +00:00
Andrew Lamb 9b91e0624c
feat: implement field_columns plan (#819)
* feat: implement field_columns plan

* fix: fix doc tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-17 20:43:24 +00:00
Andrew Lamb 94a93e56ff
feat: implement `tag_keys` in gRPC planner and across mutable buffer (#795)
* feat: move tag_column_names into rpc planner

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: compile error

* refactor: remove PassThrough error type

* fix: Avoid extra layers of errors in mutable buffer chunk

* fix: use HashMap::get rather than values() and find

* fix: push filtering down to chunk in gRPC planner

* fix: fixup trait bounds to be non-silly

* fix: remove incorrect comment

* fix: remove cruft

* fix: clippy + fmt

* fix: correct comment

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-02-15 16:47:52 +00:00
Edd Robinson 8a85158a98 refactor: add arc clone lint 2021-02-15 12:40:19 +00:00
Paul Dix dc465e5d02 feat: Add function to check db size and drop partitions
Adds functionality to the server Db to check the mutable buffer size and drop partitions based on the database rules.
2021-02-13 17:19:40 -06:00
Paul Dix 83bfa6d949 feat: Add created_at, last_write_at tracking to partition and sorting
This commit adds created_at and last_write_at instants to partitions in the mutable buffer. It adds a method on the mutable buffer database to get back the partitions in sorted order based on either the created_at or last_write_at instants. Ordering based on the summary stats from a column is still left to do.

Finally, it modifies the helper function to create replicated write to take a Partitioner trait that can generate partition keys based on lines, rather than taking the DatabaseRules struct directly. This makes it easier to write test cases where data is split into multiple partitions in the mutable buffer.
2021-02-13 17:19:40 -06:00
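The sorting described in the commit above could look roughly like the following sketch. `Partition`, `PartitionSort`, and `sorted_partition_keys` are hypothetical names chosen for illustration; only the `created_at` and `last_write_at` fields come from the commit message, and the mutable buffer's real types and method signatures may differ.

```rust
use std::time::Instant;

/// Hypothetical sort selector; the real crate may name this differently.
#[derive(Debug, Clone, Copy)]
enum PartitionSort {
    CreatedAt,
    LastWriteAt,
}

/// Hypothetical, simplified partition carrying the two instants
/// mentioned in the commit message.
#[derive(Debug)]
struct Partition {
    key: String,
    created_at: Instant,
    last_write_at: Instant,
}

/// Return partition keys ordered oldest-first by the chosen instant.
fn sorted_partition_keys(partitions: &[Partition], sort: PartitionSort) -> Vec<&str> {
    let mut refs: Vec<&Partition> = partitions.iter().collect();
    match sort {
        PartitionSort::CreatedAt => refs.sort_by_key(|p| p.created_at),
        PartitionSort::LastWriteAt => refs.sort_by_key(|p| p.last_write_at),
    }
    refs.into_iter().map(|p| p.key.as_str()).collect()
}
```

Sorting references rather than the partitions themselves leaves the caller's collection untouched, which suits a read-only query such as choosing which partition to drop first.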
Marko Mikulicic 9e39e91139 chore: Cleaning things in prep for rust 2021
Also remove a NUL byte in a test string literal; some editors drop them.
2021-02-12 16:48:17 +00:00
Andrew Lamb a316b16960
feat: Change table_names to return either Some(set) or None, rather than a plan (try 2) (#776)
* feat: Change table_names to return either Some(set) or None, rather than a plan

* docs: improve comments

* docs: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: merge conflict

* fix: don't clone a string unless needed

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-02-09 12:20:59 -05:00
Paul Dix e5da2ab589
feat: add ability to roll up summaries from multiple chunks (#763) 2021-02-08 18:11:21 -05:00
Paul Dix 47bc28460e
refactor: rename partition, table, and column in partition_meta for clarity (#757)
* refactor: rename partition, table, and column in partition_meta for clarity
2021-02-05 08:00:22 -05:00
Paul Dix de7bc7d645
feat: add column name to the partition metadata summaries (#755) 2021-02-05 07:20:16 -05:00
Andrew Lamb 3ec483b769
refactor: Reduce async in mutable buffer, use std::sync (#749)
* refactor: Reduce async in mutable buffer, use std::sync

* fix: logical conflict with new code
2021-02-05 06:47:40 -05:00
Carol (Nichols || Goulding) fbf776c6b3
chore: Clean up Cargo.tomls (#754)
* fix: test_helpers crate should only be a dev-dep

* fix: object_store no longer has a build script, so no longer needs a build dep

* chore: Alphabetize all Cargo.tomls
2021-02-04 18:56:02 -05:00
Paul Dix 5c3661dd91 chore: refactor how columns are kept in the table in the mutable buffer
This one is a bit of a yak shave in advance of adding column names to the summary statistics. I needed the column and its name (or identifier) to be together, rather than the id to index map that existed before. I think the table_id and column_id stuff should be refactored out over time since they add a ton of complexity to the code and don't add much value. Having those as Strings would be much easier and probably be a drop in the bucket for memory usage. Basically, I don't think they need to be interned. But that would be an even more massive refactor touching so many things in the MutableBuffer, so I leave it as a later exercise.

Hopefully this makes the code simpler and cleaner in the interim and it gives me the column_id with the column so that I can easily look up the name when generating the summary statistics for a chunk.
2021-02-04 16:31:55 -05:00
Paul Dix 1f8043a3f8 feat: add approximate memory size tracking to mutable buffer
This updates the mutable buffer, partitions, chunks, dictionary, tables, and individual columns to be able to return their approximate memory size used. This doesn't aim to be exact. There are spots where I'm not counting table or column pointers or the partition key. My expectation is that the data size will dominate and a few pointers here and there won't matter.
2021-02-04 13:50:43 -05:00
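As a rough illustration of the approximate size tracking described in the commit above, the sketch below rolls column sizes up into a table size. `Column`, `Table`, and `approx_size` are hypothetical names and a simplified layout, not the mutable buffer's actual structures. Like the commit, it counts the data and ignores small bookkeeping such as the outer `Vec` of columns.

```rust
use std::mem::size_of;

/// Hypothetical column holding nullable 64-bit integers.
struct Column {
    values: Vec<Option<i64>>,
}

impl Column {
    /// Approximate bytes used: the struct itself plus its stored values.
    /// Spare `Vec` capacity is deliberately not counted.
    fn approx_size(&self) -> usize {
        size_of::<Self>() + self.values.len() * size_of::<Option<i64>>()
    }
}

/// Hypothetical table that is just a list of columns.
struct Table {
    columns: Vec<Column>,
}

impl Table {
    /// Approximate bytes used: the sum over columns. The table's own
    /// pointers are ignored on the expectation that data size dominates.
    fn approx_size(&self) -> usize {
        self.columns.iter().map(Column::approx_size).sum()
    }
}
```

The same roll-up pattern extends upward: a chunk would sum its tables and a partition its chunks, so each level only needs to know how to ask its children.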
Andrew Lamb d66eae1a44
feat: Implement TableProvider trait for `Db` (#730)
* feat: Implement TableProvider for Db

Gets us selection pushdown in plans, sets us up for predicate pushdown

Includes: SendableRecordBatchStreams for mutable buffer and read buffer results

fixup snapshots

* docs: comments
2021-02-03 14:18:47 -05:00
Andrew Lamb abc26a33c1
chore: Update dependencies (again) (#718)
* chore: Update dependencies (again)

* refactor: update for changes in DataFusion API

* fix: fmt

* fix: clippy
2021-02-02 18:33:01 -05:00
Andrew Lamb 288861e646
feat: implement table_schema in partition chunk, mutable buffer, read buffer (#705)
fix: sort output schema by name

fix: Update data_types/src/schema.rs

Co-authored-by: Edd Robinson <me@edd.io>

refactor: Update read_buffer/src/lib.rs

Co-authored-by: Edd Robinson <me@edd.io>

Co-authored-by: Edd Robinson <me@edd.io>
2021-02-01 13:54:58 -05:00
Andrew Lamb f3bd8bd0e3
chore: update deps (tokio 1.0 and ecosystem) (#707)
* chore: Update arrow + tokio deps

* chore: Use bleeding edge azure

* chore: Update aws + other deps

* fix: fmt

* fix: Switch to in-house version of routerify

* fix: Upgrade to hyper 0.14

The hyper::error module is now private; hyper::Error is the public
re-export

* fix: Upgrade cloud storage to get tokio upgrade

* fix: Upgrade open_telemetry

* fix: Do not call `panic::set_hook` during another panic

Doing so leads to a double panic which aborts the process.

* fix: new h2 error who dis

Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2021-01-29 16:11:55 -05:00
Andrew Lamb 2282a68e65
refactor: Move selection to the data_types crate and remove redundant implementation (#704) 2021-01-29 13:35:07 -05:00
Andrew Lamb efb1e0f8ae
feat: Add selection interface to mutable buffer and query interface (#700)
* feat: Add selection interface to mutable buffer and query interface

* docs: Update mutable_buffer/src/table.rs

* refactor: rename for consistency

* refactor: use map and filter_map rather than fold
2021-01-27 14:31:10 -05:00
Andrew Lamb 504ca67532
test: revamp rpc query testing so it works in multiple chunk scenarios (#696)
* test: revamp testing so it works in multiple scenarios, fix bug found by same

* fix: Update docs in server/src/db.rs

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: use tsp rather than different functions

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-01-25 16:34:19 -05:00
Andrew Lamb c3b0371c84
feat: Initial RPC Query Frontend (#692)
* feat: Initial RPC Query Frontend

* docs: s/immutable buffer/mutable buffer

* docs: Correct type in docstring
2021-01-25 08:33:39 -05:00
Andrew Lamb 75b0a62fa5
refactor: Remove dead code (#686) 2021-01-21 19:20:39 -05:00
Andrew Lamb 747b96d801
chore: Upgrade arrow dependencies, reduce duplication with upstream (#676) 2021-01-21 08:58:11 -05:00
Andrew Lamb 7969808f09
feat: Chunk Migration APIs and query data in the read buffer via SQL (#668)
* feat: Chunk Migration APIs and query data in the read buffer via SQL

* fix: Make code more consistent

* fix: fmt / clippy

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: Remove unnecessary Result and make chunks() infallible

* chore: Apply more suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Edd Robinson <me@edd.io>
2021-01-19 13:28:26 -05:00
Andrew Lamb 71627120b9
refactor: consolidate line protocol schema creation into data_types and port code to use it (#663)
* refactor: consolidate line protocol schema creation into data_types, and port code to use it

refactor: Port mutable buffer to use SchemaBuilder

* fix: doctest

* refactor: remove unnecessary clippyisms

* docs: Improve comments via suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>

* refactor: use more idiomatic try_ naming and TryInto trait

* docs: Change from line protocol data model to InfluxDB data model

* refactor: rename LP --> Influx in code

* feat: add support for UInteger type

Co-authored-by: Edd Robinson <me@edd.io>
2021-01-15 17:29:30 -05:00
Hu Ming 99605b27d7
chore: rename (#660) 2021-01-14 12:49:03 -05:00
Andrew Lamb a5240af080
docs: Document desired crate dependencies in comments (#638)
* docs: Document the desire for read buffer and mutable buffer to be independent of query layer

* docs: Document desire for the query layer to not depend on storage systems

* fix: Apply suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>

Co-authored-by: Edd Robinson <me@edd.io>
2021-01-12 17:49:03 -05:00
Andrew Lamb 6376891da3
feat: implement query planning in terms of chunks (#647) 2021-01-12 16:04:45 -05:00