Marco Neumann
4172d7946c
refactor: make `SchemaMerger` self-consuming
...
The error handling in `merge` was incomplete: it could leave the
merger in a half-modified state if an error occurred. That is generally
a bad idea and can lead to ugly bugs. Also, the "builder" pattern used
here usually consumes itself (and provides a `Clone` impl), which makes
modifications easier to reason about. So this commit changes
`SchemaMerger` to a self-consuming builder.
A nice side effect of the new pattern is that it is checked at compile
time and no longer contains a runtime assert.
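A minimal sketch of the self-consuming pattern described in this message. The `Merger` and `Conflict` types, the string-based column types, and the `build` method are hypothetical stand-ins for illustration, not the actual `SchemaMerger` API:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a schema merger (not the real `SchemaMerger`).
#[derive(Debug, Clone, Default)]
pub struct Merger {
    /// Column name -> type name (assumed representation).
    fields: BTreeMap<String, String>,
}

/// Hypothetical error: the same column was seen with two different types.
#[derive(Debug, PartialEq)]
pub struct Conflict {
    pub column: String,
    pub existing: String,
    pub new: String,
}

impl Merger {
    pub fn new() -> Self {
        Self::default()
    }

    /// Takes `self` by value. On error the partially modified merger is
    /// dropped, so a caller can never observe a half-merged state.
    pub fn merge(mut self, other: &[(&str, &str)]) -> Result<Self, Conflict> {
        for (name, ty) in other {
            if let Some(existing) = self.fields.get(*name) {
                if existing.as_str() != *ty {
                    return Err(Conflict {
                        column: name.to_string(),
                        existing: existing.clone(),
                        new: ty.to_string(),
                    });
                }
            }
            self.fields.insert(name.to_string(), ty.to_string());
        }
        Ok(self)
    }

    /// Consumes the merger. Using it after `build` is rejected by the
    /// compiler, so no runtime "already built" assert is needed.
    pub fn build(self) -> Vec<(String, String)> {
        self.fields.into_iter().collect()
    }
}
```

A caller that wants to keep the previous state across a fallible merge can call `clone()` first, which is the role of the clone impl mentioned above.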
2021-07-06 18:20:05 +02:00
Raphael Taylor-Davies
5fe49aa017
feat: add flush guard to PersistenceWindows ( #1883 )
...
* feat: add flush guard to PersistenceWindows
* docs: Update comments based on code review
* fix: fmt
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2021-07-02 20:15:33 +00:00
Andrew Lamb
07826306ed
fix: Always deduplicate data prior to insertion into the ReadBuffer ( #1863 )
...
* fix: mark ReadBuffer as always deduplicated
* fix: Use compact plans during merge
* docs: Update server/src/db/chunk.rs
Co-authored-by: Nga Tran <ntran@influxdata.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:23:37 +00:00
Jacob Marble
0779b0d9bd
feat: add gRPC listener for new write protocol ( #1842 )
...
* feat: add gRPC listener for new write protocol
* chore: clippy happy
* chore: lint
* chore: cargo fmt --all
* chore: cargo clippy
* chore: protobuf-lint
* chore: more formatting
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-01 16:15:12 +00:00
Andrew Lamb
bed6ec8c31
feat: Handle merging chunks that have different schemas ( #1761 )
...
* feat: Handle merging chunks that have different schemas
* test: print out original (non deduplicated) data in tests
2021-06-21 15:52:13 +00:00
Andrew Lamb
6559a9e997
refactor: use Schema to compute InfluxDB primary keys ( #1757 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-18 21:15:31 +00:00
Andrew Lamb
de67bd3efe
refactor: Remove PartitionChunk::table_schema ( #1756 )
...
* refactor: Remove PartitionChunk::table_schema
* docs: update comments
2021-06-18 16:13:16 +00:00
Andrew Lamb
ec43a87909
chore: Update itertools deps ( #1750 )
2021-06-17 17:56:44 +00:00
Raphael Taylor-Davies
dd422492e2
feat: sort order in schema ( #1357 ) ( #1667 )
...
* feat: sort order in schema (#1357 )
* chore: review feedback
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-14 18:10:41 +00:00
Andrew Lamb
a614fef5bc
chore: remove more unused dependencies ( #1658 )
...
* chore: remove more unused deps
* refactor: move benchmarks into server_benchmarks crate
2021-06-09 10:17:20 +00:00
Raphael Taylor-Davies
07c4277ca7
refactor: schema merge to give more control over field merging ( #1653 )
...
* refactor: schema merge to give more control over field merging
* chore: review feedback
2021-06-09 06:30:45 +00:00
Andrew Lamb
34ba268cf1
feat: Group chunks by potential overlap ( #1654 )
...
* feat: Group chunks by potential overlap
* docs: clarify in what way the calculation is conservative
* fix: Add test for mixed nulls
2021-06-08 16:55:29 +00:00
Raphael Taylor-Davies
1e7ef193a6
refactor: use field metadata to store influx types ( #1642 )
...
* refactor: use field metadata to store influx types
make SchemaBuilder non-consuming
* chore: remove unused variants
* chore: fix lints
2021-06-07 13:26:39 +00:00
Raphael Taylor-Davies
5749a2c119
chore: cleanup legacy TSM -> parquet code ( #1639 )
...
* chore: cleanup legacy parquet code
* chore: remove tests of removed functionality
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-07 12:59:33 +00:00
Raphael Taylor-Davies
4fcc04e6c9
chore: enable arrow prettyprint feature ( #1566 )
2021-05-27 10:28:14 +00:00
Andrew Lamb
14ba25f86d
chore: Update datafusion and use released version of arrow crates ( #1546 )
...
* chore: Update datafusion and use released version of arrow crate
* fix: Update for change in API
2021-05-24 15:37:22 +00:00
Carol (Nichols || Goulding)
febc1538ff
chore: Update Rust version ( #1445 )
...
* chore: Update Rust version
* refactor: Make struct constructor field orderings consistent
Sometimes I changed the struct definition, sometimes changed the struct
construction instance, depending on consistency with code around each
(other similar structs, function argument orders, etc)
More info: https://rust-lang.github.io/rust-clippy/master/index.html#inconsistent_struct_constructor
* refactor: Use flatten where appropriate
One instance is a false positive with a clippy bug.
More info:
- https://rust-lang.github.io/rust-clippy/master/index.html#filter_map_identity
- https://rust-lang.github.io/rust-clippy/master/index.html#manual_flatten
* refactor: Use Option map instead of match
More info: https://rust-lang.github.io/rust-clippy/master/index.html#manual_map
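As an illustration of the kinds of rewrites the `manual_map`, `filter_map_identity`, and `manual_flatten` lints suggest, here is a small sketch. The function names are hypothetical and not taken from the codebase:

```rust
/// Before: a `match` that only transforms the `Some` case.
/// This is the shape that clippy's `manual_map` lint flags.
#[allow(clippy::manual_map)]
pub fn double_with_match(x: Option<u32>) -> Option<u32> {
    match x {
        Some(v) => Some(v * 2),
        None => None,
    }
}

/// After: the same logic expressed with `Option::map`.
pub fn double_with_map(x: Option<u32>) -> Option<u32> {
    x.map(|v| v * 2)
}

/// Before: `filter_map` with an identity closure (`filter_map_identity`).
#[allow(clippy::filter_map_identity)]
pub fn present_with_filter_map(items: Vec<Option<u32>>) -> Vec<u32> {
    items.into_iter().filter_map(|x| x).collect()
}

/// After: `flatten` skips the `None` entries directly.
pub fn present_with_flatten(items: Vec<Option<u32>>) -> Vec<u32> {
    items.into_iter().flatten().collect()
}
```

Both versions of each pair behave identically; the lints only target readability.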
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-06 22:07:10 +00:00
Raphael Taylor-Davies
10f89a3e8d
refactor: split entry out into separate crate ( #1428 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-06 11:36:23 +00:00
Raphael Taylor-Davies
411cf134e9
refactor: explode arrow_deps ( #1425 )
...
* refactor: explode arrow_deps
* chore: workaround doctest bug
2021-05-05 16:59:12 +00:00
Carol (Nichols || Goulding)
7d5c988fba
feat: Actually route SequencedEntry to the Write Buffer, if present
...
Connects to #1157 .
Rearrange some code and comments to be consistent with the design. Make
some more places not care whether they're getting an owned or borrowed
SequencedEntry.
2021-05-05 10:55:11 -04:00
Paul Dix
979f5f9347
refactor: write buffer to use sequenced entry and new segment
...
This refactors the write buffer to use the sequenced entry structure and the new segment definition. It removes the old replicated write and write_buffer.fbs.
Finally, it updates the SequencedEntry wrapper type around the Flatbuffer structure to be a trait so that SequencedEntry can be initialized from a borrowed Flatbuffer or an owned Vec<u8>.
How writes go into segments in the buffer and any kind of validation will likely have to be updated based on what kinds of guarantees we want to make in the buffer. However, that should probably come after we've rethought the design a bit around the new layout of chunks in the Parquet persistence.
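The idea of turning the wrapper into a trait, so that an entry can be backed by either borrowed or owned bytes, can be sketched roughly as follows. All names here (`RawEntry`, `OwnedEntry`, `BorrowedEntry`, `total_size`) are hypothetical, and the sketch works on plain byte slices rather than real Flatbuffer accessors:

```rust
/// Hypothetical trait: anything that can expose the serialized entry bytes.
pub trait RawEntry {
    /// Serialized bytes backing this entry.
    fn data(&self) -> &[u8];

    /// Default method shared by every implementor.
    fn size(&self) -> usize {
        self.data().len()
    }
}

/// Entry that owns its buffer (for example, bytes read from storage).
pub struct OwnedEntry {
    data: Vec<u8>,
}

impl OwnedEntry {
    pub fn new(data: Vec<u8>) -> Self {
        Self { data }
    }
}

impl RawEntry for OwnedEntry {
    fn data(&self) -> &[u8] {
        &self.data
    }
}

/// Entry that borrows a buffer owned elsewhere, avoiding a copy.
pub struct BorrowedEntry<'a> {
    data: &'a [u8],
}

impl<'a> BorrowedEntry<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data }
    }
}

impl RawEntry for BorrowedEntry<'_> {
    fn data(&self) -> &[u8] {
        self.data
    }
}

/// Code written against the trait does not care which backing is used.
pub fn total_size(entries: &[&dyn RawEntry]) -> usize {
    entries.iter().map(|e| e.size()).sum()
}
```

Consumers such as `total_size` accept both variants, which is the property the message refers to when it says places no longer need to care whether the entry is owned or borrowed.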
2021-04-30 17:00:23 -04:00
Andrew Lamb
40b9b09cdc
refactor: rename assert_table_eq to assert_batches_eq ( #1368 )
2021-04-30 10:51:08 +00:00
Carol (Nichols || Goulding)
9aefcd216f
fix: Validate that ClockValue is never 0
2021-04-28 13:54:55 -04:00
Carol (Nichols || Goulding)
2f4d7189ff
fix: Validate ServerId when creating structs from flatbuffers
...
When we get the flatbuffers, we won't have the server ID separately: it
is stored inside the flatbuffers. But we want to validate the `ServerId`
once, when the `SequencedEntry` is created, so that its `server_id`
method can assume it has a valid `ServerId`.
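A rough sketch of the "validate once at construction" approach. The struct layout, the `InvalidServerId` error, and the rule that zero is the invalid value are assumptions made for illustration; the real types read the ID out of the flatbuffer:

```rust
use std::convert::TryFrom;
use std::num::NonZeroU32;

/// Hypothetical ID type: the zero value is assumed to be invalid.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ServerId(NonZeroU32);

#[derive(Debug, PartialEq, Eq)]
pub struct InvalidServerId;

impl TryFrom<u32> for ServerId {
    type Error = InvalidServerId;

    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        NonZeroU32::new(raw).map(ServerId).ok_or(InvalidServerId)
    }
}

impl ServerId {
    pub fn get(self) -> u32 {
        self.0.get()
    }
}

/// Hypothetical entry: the raw ID would normally come from the serialized
/// bytes; here it is passed in directly to keep the sketch self-contained.
#[derive(Debug)]
pub struct SequencedEntry {
    server_id: ServerId,
    data: Vec<u8>,
}

impl SequencedEntry {
    /// Validation happens exactly once, at construction time.
    pub fn new(raw_server_id: u32, data: Vec<u8>) -> Result<Self, InvalidServerId> {
        let server_id = ServerId::try_from(raw_server_id)?;
        Ok(Self { server_id, data })
    }

    /// Infallible: the ID was already validated in `new`.
    pub fn server_id(&self) -> ServerId {
        self.server_id
    }

    pub fn data(&self) -> &[u8] {
        &self.data
    }
}
```

Because the only way to obtain a `SequencedEntry` is through `new`, the accessor needs no `Result` or `Option` in its signature.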
2021-04-28 13:06:12 -04:00
Raphael Taylor-Davies
6bdc153361
feat: sort RUB (read buffer) chunks ( #1308 )
...
* feat: sort chunks before upserting to read buffer (#1216 )
* chore: review feedback
* chore: fix merge conflict
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 16:50:48 +00:00
Marco Neumann
eddc9319ff
docs: deny broken intradoc links
2021-04-27 13:22:28 +02:00
Raphael Taylor-Davies
20117de078
feat: string dictionary encoding ( #1220 ) ( #1262 )
...
* feat: string dictionary encoding (#1220 )
* chore: review comments
* chore: fix lint
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 09:36:58 +00:00
Carol (Nichols || Goulding)
272cdb85ce
fix: Use the ServerId type everywhere, for writing, querying, anything
2021-04-26 18:44:32 +00:00
Jake Goulding
67f5ad841d
refactor: Introduce ServerId and CurrentServerId types
2021-04-26 18:44:32 +00:00
Raphael Taylor-Davies
0a835436ac
feat: use bitmasks within MUB ( #1274 ) ( #1289 )
...
* feat: use bitmasks within MUB (#1274 )
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-26 18:00:16 +00:00
Marko Mikulicic
83d6550316
feat: Implement write_entry_downstream
2021-04-21 20:50:46 +00:00
Carol (Nichols || Goulding)
88ca1a5245
fix: Rename wal.fbs to write_buffer.fbs
2021-04-21 17:43:03 +00:00
Carol (Nichols || Goulding)
80995afb70
fix: Change WAL to Write Buffer in comments and documentation
2021-04-21 17:43:03 +00:00
Carol (Nichols || Goulding)
f136931225
fix: Inconsistent ordering lints
2021-04-19 08:48:11 -04:00
Andrew Lamb
e226b5a820
feat: Use TimestampNanosecondArray for timestamps in IOx ( #1230 )
...
* refactor: Create Arrow arrays using iterators
* feat: use Timestamp64(TimeUnit::Nanosecond) for timestamps
* feat: add support for timestamp array
* fix: update more tests
* fix: remove unnecessary code
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-16 15:55:33 +00:00
Marko Mikulicic
878b1b318e
feat: Initial scaffolding for routing layer
...
Part of #916
Adding first-class concept of ShardId in shard config, fixes #1156
NEXT:
- [ ] implement sharder
- [ ] implement `write_entry_downstream`
- [ ] add tests
2021-04-15 09:02:47 +00:00
Edd Robinson
4db3a4b3b5
test: enable writer to split large batches
2021-04-14 09:36:39 +00:00
Paul Dix
bd13c09bad
refactor: make sharder optional when generating entry
2021-04-13 12:52:14 +00:00
Paul Dix
7e28f8ef66
feat: Implement Entry writing to Db
...
This removes the old ReplicatedWrite structure and implements the writing of an Entry to the Db. I also call out in `server/lib.rs` and in the `Db` where sharding and replication might happen.
I've also added helpers in various places to write line protocol to chunks, tables, and databases. That enabled removing a good amount of code from the test helpers crate.
2021-04-13 12:52:14 +00:00
Raphael Taylor-Davies
1997324344
feat: mutable buffer snapshotting ( #1179 )
...
* feat: mutable buffer snapshotting
* chore: review feedback
2021-04-13 12:14:54 +00:00
Paul Dix
5893c17905
refactor: PR feedback and change ClockValue to actual type.
2021-04-12 18:43:14 +00:00
Paul Dix
31115742ec
feat: Add writing of Entry structures to MB Chunk
...
This adds writing of Entry of a vec of TableWriteBatch to the Mutable Buffer Chunk. This is in addition to the previous method of writing via ReplicatedWrite. The next step is to remove the old ReplicatedWrite bits.
Test helpers for parsing line protocol into Entry and writing line protocol directly to Chunks have also been added.
2021-04-12 18:43:14 +00:00
Paul Dix
0a3386f24a
refactor: Make PartitionWrite, Table, and Column return keys/names
2021-04-12 18:43:14 +00:00
Paul Dix
3f928ed374
refactor: Add ClockValue and WriterId to Entry
2021-04-12 18:43:14 +00:00
Paul Dix
dad8d6bafd
chore: Update Entry with test helpers
2021-04-12 18:43:14 +00:00
Paul Dix
3bb283fe20
chore: make SequencedEntryRaw the method of choice for SequencedEntry
2021-04-08 12:57:11 -04:00
Paul Dix
81926279fc
chore: add benchmark for SequencedEntryRaw::new_from_entry_bytes
2021-04-08 12:57:11 -04:00
Paul Dix
c002d83e9a
feat: add SequencedEntryRaw for raw entry bytes
2021-04-08 12:57:11 -04:00
Paul Dix
0546968e13
chore: add SequencedEntry::new_from_entry benchmark
2021-04-08 12:57:11 -04:00
Paul Dix
0c082e2347
feat: add sequenced entry builder
2021-04-08 12:57:11 -04:00