48 Commits (d8130a60004a1d21ae14cd46ac4ce215015b4ed4)

Author SHA1 Message Date
Andrew Lamb 4e1e8dbf79
chore: Upgrade version of Arrow/DataFusion (2 of 3) (#391)
* chore: Upgrade version of Arrow/DataFusion (2 of 3)

* fix: Fixup error type usage and use async stream interface

* fix: post merge fixups
2020-10-26 13:49:16 -04:00
Andrew Lamb 880958d9c7
feat: switch end-to-end test to use write_buffer implementation rather than partitioned store (#386)
* feat: switch end-to-end test to use write_buffer implementation rather than partitioned store

* fix: Apply PR suggestions
2020-10-26 13:42:38 -04:00
Andrew Lamb 88b9f43110
chore: Upgrade version of Arrow/DataFusion (1 of 3) (#390)
* chore: Upgrade version of Arrow/DataFusion

* fix: update code for deps
2020-10-26 11:46:02 -04:00
Andrew Lamb a66bd4a738
feat: Implement measurement_fields gRPC route (#384)
* feat: Implement measurement_fields gRPC route

* fix: Apply suggestions from code review in github

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>

* fix: apply code review comments locally

* fix: fix based on code review

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2020-10-23 14:01:46 -04:00
Andrew Lamb 0b443fdb12
feat: add --num-threads CLI argument, allow single threaded operation (#378) 2020-10-21 17:12:32 -04:00
Paul Dix 57bb63717d chore: convert writer id to u32 2020-10-21 08:30:19 -04:00
Paul Dix a7bde81d8c feat: Add server id to replicated writes
Adds the server id to replicated writes. Changes type from u8 to u16 for more potential write servers. Also updates the server struct in Delorean to no longer have locks. Locking will happen higher up, likely with the server being arc_swap'd out on any configuration update.
2020-10-21 08:30:19 -04:00
Andrew Lamb 53b529fe19
refactor: Remove some uses of pub use (#377)
* refactor: Remove some uses of pub use

* fix: remove bad comment
2020-10-21 06:38:38 -04:00
Andrew Lamb afa59f9086
fix: do not run plans in write buffer when predicates refer to columns that do not exist (#372)
* fix: do not run plans for tables in write buffer when predicates refer to columns that do not exist

* fix: Apply suggestions from code review
2020-10-20 16:40:52 -04:00
Andrew Lamb ee344c3d51
feat: Plan for computing groups (#366) 2020-10-19 14:14:43 -04:00
Andrew Lamb bfb966b1f1
feat: basic read_group plumbing (#365)
* feat: basic read_group plumbing

* fix: Update delorean_storage/src/exec.rs
2020-10-19 11:45:46 -04:00
Carol (Nichols || Goulding) 083e6947df refactor: Take impl Into<String> Db names and possibly avoid some allocations
Also remove some explicit clones that SNAFU will take care of
2020-10-16 13:57:16 -04:00
Carol (Nichols || Goulding) 4faf5f04dc refactor: Implement Default on Db to make creation easier 2020-10-16 13:57:16 -04:00
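
The two `Db` refactors above (083e6947df and 4faf5f04dc) describe a common Rust idiom. The following is a minimal sketch of that idiom only, not the project's actual code: the `Db` struct here is a hypothetical stand-in with a single assumed `name` field, and deriving `Default` is an assumption (the commit may implement it by hand).

```rust
/// Hypothetical stand-in for the real `Db` type; the `name` field is assumed.
#[derive(Debug, Default, PartialEq)]
pub struct Db {
    pub name: String,
}

impl Db {
    /// Accepts anything convertible into a `String`.
    /// An owned `String` is moved in without a fresh allocation;
    /// a `&str` is copied once, inside `into()`.
    pub fn new(name: impl Into<String>) -> Self {
        Self { name: name.into() }
    }
}
```

With this signature, callers that already own a `String` can hand it over directly instead of cloning, and `Db::default()` gives tests an empty instance without spelling out every field.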
Andrew Lamb dc4898e1e4
feat: conversion between SeriesSets and ReadResponses (#362)
* feat: conversion between SeriesSets and ReadResponses

* fix: Address PR review comments 1

* fix: Address PR review comments 1
2020-10-16 06:37:50 -04:00
Andrew Lamb 0ef76db208
feat: implement series_query for write buffer database, tests for same (#360)
* feat: implement series_query for write buffer database, tests for same

* fix: fixup comments

* fix: sort field columns too
2020-10-15 17:23:14 -04:00
Paul Dix dbc6b7b2d6 feat: Make table return all columns if none specified for arrow batch 2020-10-14 07:28:58 -04:00
Andrew Lamb 80088ffe37
feat: gRPC plumbing + interface structures for read_filter (#351)
* feat: gRPC plumbing + support structures for read_filter

* fix: cleanup comments
2020-10-12 14:12:53 -04:00
Paul Dix a80eb0fed3 feat: Store replicated writes
This commit refactors the flatbuffers data types from the wal to a new crate where they can be used by storage, write buffer, and cluster. It also refactors cluster to move the configuration types out to the data types crate so they can be used across storage and elsewhere.

Finally, it adds a new method to store replicated writes on a database in the database trait and implements it.
2020-10-11 15:45:08 -04:00
Andrew Lamb 2b8c04f2b4
chore: Update arrow (again) to pick up latest changes to datafusion (#345) 2020-10-09 07:17:02 -04:00
Andrew Lamb aaeb0d4c84
refactor: implement automatic error conversion for errors that do not have lots of context (#341)
* refactor: implement automatic error conversion for errors that do not have lots of context

* fix: implement code review suggestions
2020-10-08 11:21:54 -04:00
Andrew Lamb 5400c55b2a
refactor: apply timestamp predicate in visit code (#340) 2020-10-07 12:33:04 -04:00
Andrew Lamb 9a81bf4d72
feat: implement column_values for write buffer database (#339) 2020-10-07 10:12:28 -04:00
Andrew Lamb 3ba1a95795
refactor: extract "traverse the write buffer structure" into a visitor trait/pattern (#338) 2020-10-06 17:08:46 -04:00
Andrew Lamb 3d670fb556
feat: Implement gRPC routes tag_values and measurement_tag_values (#337) 2020-10-06 17:07:03 -04:00
Paul Dix 1b69a5a79c
refactor: WriteBuffer database and WAL Flatbuffers (#331)
* chore: Refactor write buffer WAL

This commit refactors the WAL to remove partition events and to collapse rows into a single write buffer entry.
This further simplifies the WAL by removing WriteBufferBatch.
Finally, this removes the concept of a partition generation as that is currently not used.

* refactor: WriteBuffer database and WAL Flatbuffers

This refactor updates the WriteBuffer write path significantly. At the public API it takes parsed lines, but then immediately converts them over to a built Flatbuffer byte array, which has also been significantly refactored.

The Flatbuffer structure has been updated so that a WriteBufferBatch contains a vec of WriteBufferEntry. Each of those entries corresponds to a collection of data that is bound for a single partition. The generated partition key is now kept as part of this entry.

Within the WriteBufferEntry you now have a vec of TableWriteBatch which have the table name and a vec of Row. This pulls the table name out of the row, eliminating redundancy for writes that have multiple rows being written into the same table.

The database now has methods to accept the Flatbuffer WriteBufferEntry with updates down the line to Partition and Table.

This also has a nice little performance bump for WAL restore:

wal-restoration/restore_single_entry_single_partition
                        time:   [684.51 us 688.45 us 692.53 us]
                        thrpt:  [1.4440 Melem/s 1.4525 Melem/s 1.4609 Melem/s]
                 change:
                        time:   [-55.913% -55.351% -54.800%] (p = 0.00 < 0.05)
                        thrpt:  [+121.24% +123.97% +126.82%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

wal-restoration/restore_multiple_entry_multiple_partition
                        time:   [8.7483 ms 8.8964 ms 9.0815 ms]
                        thrpt:  [1.3214 Melem/s 1.3489 Melem/s 1.3717 Melem/s]
                 change:
                        time:   [-55.952% -55.166% -54.213%] (p = 0.00 < 0.05)
                        thrpt:  [+118.40% +123.04% +127.02%]
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

* fix: fmt

Co-authored-by: alamb <andrew@nerdnetworks.org>
2020-10-02 13:52:00 -04:00
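
The nesting described in the commit above (#331) can be pictured with plain Rust structs. This is only an illustrative sketch: the real types are generated from a Flatbuffers schema, and every field name below, the string-valued `Row`, and the `row_count` helper are assumptions made for the example.

```rust
/// Hypothetical row: the real Row holds typed values; strings are used for brevity.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub values: Vec<(String, String)>,
}

/// Rows bound for one table; the table name is stored once, not per row.
#[derive(Debug, Clone, PartialEq)]
pub struct TableWriteBatch {
    pub table_name: String,
    pub rows: Vec<Row>,
}

/// All data bound for a single partition, carrying its generated partition key.
#[derive(Debug, Clone, PartialEq)]
pub struct WriteBufferEntry {
    pub partition_key: String,
    pub table_batches: Vec<TableWriteBatch>,
}

/// A batch is a list of per-partition entries.
#[derive(Debug, Clone, PartialEq)]
pub struct WriteBufferBatch {
    pub entries: Vec<WriteBufferEntry>,
}

impl WriteBufferBatch {
    /// Illustrative helper: total number of rows across all entries and tables.
    pub fn row_count(&self) -> usize {
        self.entries
            .iter()
            .flat_map(|entry| &entry.table_batches)
            .map(|table| table.rows.len())
            .sum()
    }
}
```

Hoisting `table_name` into `TableWriteBatch` is what removes the per-row repetition the commit message mentions: several rows for the same table share one name.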
Andrew Lamb 45c4f1e24e
refactor: make table_names API consistent with tag_keys API, other cleanups (#327) 2020-10-02 09:42:06 -04:00
Andrew Lamb 2b98da593b
feat: write_database support for predicates (#326)
* feat: write_database support for predicates

* fix: temporarily pull in arrow fork to pick up fix for ARROW-10136

* fix: Update mutex usage based on PR feedback

* fix: more mutex polish and use OptionExt

* fix: update comments

* fix: rust-fu the table lookup

* fix: update docs

* fix: more idiomatic rust types

* fix: better usage of reference types
2020-10-01 14:34:53 -04:00
Andrew Lamb 8a14896487
chore: update version of datafusion (#324)
* chore: update version of datafusion

* chore: Update interfaces to be async
2020-09-30 08:02:15 -04:00
Andrew Lamb d40ed663fb
fix: respect `columns` parameter in table_to_arrow (#322) 2020-09-29 07:48:19 -04:00
Andrew Lamb d606a1f1cd
refactor: split delorean_write_buffer/src/database.rs into multiple modules (#317) 2020-09-28 06:20:59 -04:00
Carol (Nichols || Goulding) 818ffff411 refactor: Use value_as_[type]_values methods
These call value_type anyway, so it feels like this eliminates duplicate
calls... not seeing too much of a difference in profiling though.
2020-09-24 16:23:12 -04:00
Carol (Nichols || Goulding) cbc11717cc fix: Only do one string interner lookup when we want to insert
Previously, we needed to add an entry to the WAL when we added a new
string in the dictionary; now that the WAL entries are self-describing,
we only need to do one lookup here.
2020-09-24 16:20:09 -04:00
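
The single-lookup idea in the commit above (cbc11717cc) can be sketched with the standard library. This is not the project's interner: `Dictionary`, `lookup_or_insert`, and `lookup_id` are hypothetical names, and a `HashMap` plus a `Vec` stands in for whatever interner the code actually uses.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical string dictionary mapping strings to small integer ids.
#[derive(Debug, Default)]
pub struct Dictionary {
    ids: HashMap<String, u32>,
    strings: Vec<String>,
}

impl Dictionary {
    /// Returns the id for `value`, inserting it if missing.
    /// The map is probed once through the entry API, so there is no
    /// separate "does it exist?" lookup before the insert.
    pub fn lookup_or_insert(&mut self, value: &str) -> u32 {
        let next_id = self.strings.len() as u32;
        match self.ids.entry(value.to_string()) {
            Entry::Occupied(existing) => *existing.get(),
            Entry::Vacant(slot) => {
                self.strings.push(value.to_string());
                *slot.insert(next_id)
            }
        }
    }

    /// Resolves an id back to its string, if the id is known.
    pub fn lookup_id(&self, id: u32) -> Option<&str> {
        self.strings.get(id as usize).map(String::as_str)
    }
}
```

The sketch trades an extra `String` allocation per call for the single hash probe; a dedicated interner would typically avoid both.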
Andrew Lamb 77f58efca7
chore: update Arrow/Parquet/DataFusion versions, consolidate references into new crate (#309)
* chore: consolidate all arrow/parquet/datafusion dependencies

* chore: update datafusion version
2020-09-24 08:46:54 -04:00
Carol (Nichols || Goulding) 0b64840282 fix: Update schema for wal flatbuffers to be self describing
As recommended by Paul here: https://github.com/influxdata/delorean/issues/277#issuecomment-693670676
2020-09-23 21:22:02 -04:00
Andrew Lamb 498478c066
refactor: rename delorean_storage_interface to delorean_storage (#308) 2020-09-22 17:18:53 -04:00
Andrew Lamb d0f2902c8d
feat: implement tag_keys and measurement_tag_keys (#307)
* feat: implement tag_keys and measurement_tag_keys

* fix: fix timestamp bound evaluation
2020-09-22 16:42:45 -04:00
Jake Goulding cf4cb72b1c feat: Add a WAL restoration benchmark with multiple entries and partitions 2020-09-18 16:45:03 -04:00
Jake Goulding 648d42568d feat: Add a benchmark for restoring the WAL 2020-09-18 16:45:01 -04:00
Carol (Nichols || Goulding) 5858aa2d08
refactor: Changes to write buffer unrelated to self-describing change (#303)
* refactor: Extract column count to be a method on ParsedLine

* refactor: Simplify parsing nanoseconds as datetime

* refactor: Extract a constructor for WalEntryBuilder

* refactor: Extract a method on WalEntryBuilder to get bytes

* refactor: Avoid an iteration and a vec allocation

* refactor: Organize and alphabetize dependencies and imports

* fix: Propagate warning config to new crates
2020-09-18 16:29:19 -04:00
Carol (Nichols || Goulding) 841ee6e808 test: Improve display of test failures 2020-09-18 11:15:29 -04:00
Carol (Nichols || Goulding) 10f5c116cc test: Write a failing test for only restoring part of the WAL
The expected output for `cpu` probably isn't right; but this doesn't run
successfully yet.
2020-09-18 11:15:29 -04:00
Carol (Nichols || Goulding) eb7d5469b7 feat: Keep track of partitions to record them in a wal batch
So that each wal batch describes the partitions it uses.

Only restore a partition if it hasn't already been restored.
2020-09-18 11:15:29 -04:00
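
The "restore each partition only once" rule from the commit above (eb7d5469b7) can be sketched as follows. The `WalBatch` struct, its `partition_ids` field, and the `partitions_to_restore` function are hypothetical simplifications for illustration; the actual WAL batch format and restore logic are not shown in this log.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified WAL batch that lists the partitions it uses.
pub struct WalBatch {
    pub partition_ids: Vec<u32>,
}

/// Returns partition ids in first-seen order, each at most once,
/// so a partition named by several batches is restored a single time.
pub fn partitions_to_restore(batches: &[WalBatch]) -> Vec<u32> {
    let mut seen = HashSet::new();
    let mut order = Vec::new();
    for batch in batches {
        for &id in &batch.partition_ids {
            // `insert` returns false when the id was already present.
            if seen.insert(id) {
                order.push(id);
            }
        }
    }
    order
}
```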
Carol (Nichols || Goulding) c66726600b refactor: Make restoring to partitions entirely separate from database
By removing the database name argument (which was only used for errors)
and making a separate error type for restoration errors.

This makes it easier to benchmark the restore_partitions_from_wal
function independently. This also has a nice side effect of splitting
errors that can happen during restoration from errors that can happen at
other times.

Switched some fields that were named `_id` to not have a suffix to set
up easier switching of those from u32 to String in a future commit.
2020-09-18 09:38:03 -04:00
Carol (Nichols || Goulding) 73455044bd feat: Add more structured information to SchemaMismatch errors 2020-09-18 09:23:37 -04:00
Carol (Nichols || Goulding) a1f62e3b35 refactor: Extract function to restore partitions from wal entries 2020-09-18 08:58:11 -04:00
Carol (Nichols || Goulding) 5a27466295 refactor: Reorder a few lines to make the next extraction clearer 2020-09-18 08:55:49 -04:00
Carol (Nichols || Goulding) 474dfad9fc refactor: Extract a struct to hold statistics about restoring from WAL 2020-09-18 08:55:42 -04:00
Andrew Lamb 642b1b4370
refactor: move write_buffer to delorean_write_buffer crate (#299) 2020-09-18 08:11:48 -04:00