Raphael Taylor-Davies
b0e8b75a8a
fix: TestCatalogState unique chunk ID
2021-08-19 17:19:12 +01:00
kodiakhq[bot]
47431148d5
Merge branch 'main' into er/refactor/read_buffer/bitmap_size
2021-08-18 21:20:13 +00:00
Raphael Taylor-Davies
e81b82c0a4
feat: split db worker loop ( #2337 )
...
* feat: split db worker loop
* chore: review feedback
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-18 17:33:13 +00:00
Carol (Nichols || Goulding)
61263c8774
feat: Add a debugging-suitable way to get the object storage path of a database
2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding)
fbf3ceb1e2
refactor: Extract listing of all databases into iox_object_store
2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding)
f782e77dcc
test: Use the iox_object_store when testing a database's object store files
2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding)
ff89398132
fix: Remove DatabaseConfig store_path field
...
This is now managed by the iox_object_store crate.
2021-08-18 11:32:39 -04:00
Jake Goulding
63111d9d9a
refactor: Move the database rules functionality to iox_object_store
2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding)
4447f1e22c
test: Adjust parquet file sizes; only storing relative paths now
2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding)
6d5cb9c117
refactor: Extract a ParquetFilePath to handle paths to parquet files in a db's object store
2021-08-18 11:32:39 -04:00
Edd Robinson
b9f09fce49
feat: improve bitset size estimation
2021-08-17 22:54:22 +01:00
Edd Robinson
1daa30cc7d
fix: include enum in sizing
2021-08-17 22:54:22 +01:00
kodiakhq[bot]
006d4db0c1
Merge branch 'main' into er/feat/read_buffer/row_group_metrics
2021-08-17 21:44:01 +00:00
Andrew Lamb
6b2ac77b8b
docs: Add some doc comments about sortedness in catalog Partition chunks ( #2323 )
...
* docs: Note on iteration order in catalog::Partition
* test: add tests for chunk_id order
2021-08-17 15:17:12 +00:00
Edd Robinson
211d814c8c
Merge branch 'main' into er/feat/read_buffer/row_group_metrics
2021-08-17 13:00:44 +01:00
Edd Robinson
c795fc7f9d
feat: add metric to track total row groups
2021-08-17 12:55:11 +01:00
Marco Neumann
55e9a3beda
docs: better explain locking
2021-08-17 10:14:20 +02:00
Marco Neumann
e540798eed
test: drop two chunks in `drop_partition` test
2021-08-17 10:07:26 +02:00
Marco Neumann
5b0c3728b6
fix: ensure that code invariants hold
2021-08-17 10:03:28 +02:00
Marco Neumann
32cf23100d
docs: explain why `drop_partition` does not deadlock
2021-08-17 09:52:30 +02:00
Marco Neumann
4a5dfc895a
docs: clarify that `Partition::chunks` returns an ordered iterator
2021-08-17 09:52:07 +02:00
Marco Neumann
177d5fbb35
docs: fix typo in `Step::Drop`
...
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-08-17 09:44:35 +02:00
Marco Neumann
9454e06d61
test: test interaction of dropping partitions and replay
2021-08-17 09:44:35 +02:00
Marco Neumann
77892a0998
feat: add API to drop entire partitions
2021-08-17 09:44:35 +02:00
Ning Sun
c012e996ab
refactor: remove display methods, use fmt::Display instead. ( #2272 )
...
* refactor: remove display methods, use fmt::Display instead.
Signed-off-by: Ning Sun <sunng@protonmail.com>
* refactor: update a few calls from .display to .to_string()
* fix: consistently use `Path` rather than occasionally `DirsAndFileName`
* fix: fixup for merge conflicts
* fix: update test
* fix: Catch another case or two
* fix: fmt
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-16 18:00:22 +00:00
Marco Neumann
5caa2ad8ec
fix: typo
2021-08-16 18:31:45 +02:00
Marco Neumann
114a9004b3
test: restore `write_buffer_errors_propagate`
...
This was removed in #2203 due to insufficient mocking capabilities.
2021-08-16 18:31:43 +02:00
Marco Neumann
825c19d726
fix: disallow dropping unpersisted chunks from persisted DB
...
It doesn't play well w/ replay at the moment since we would forget which
sequence numbers we've already seen.
Fixes #2291 .
2021-08-16 13:21:30 +02:00
Edd Robinson
13aaa1f105
Merge branch 'main' into er/feat/read_buffer_metrics
2021-08-13 15:02:03 +01:00
kodiakhq[bot]
d506da2a1a
Merge branch 'main' into cn/extract-iox-object-store
2021-08-13 13:45:35 +00:00
Edd Robinson
efde3a8f5a
feat: expose required bytes metric
2021-08-13 11:57:46 +01:00
Edd Robinson
311d36d776
refactor: include capacity in Read Buffer chunk size
2021-08-13 11:57:46 +01:00
Edd Robinson
fa8da19c45
refactor: expose enc size API into column
2021-08-13 11:57:46 +01:00
kodiakhq[bot]
1307450c78
Merge branch 'main' into crepererum/replay_skip_while_in_error_state_part_1b
2021-08-13 07:03:25 +00:00
Carol (Nichols || Goulding)
564238ad8c
refactor: Organize uses
2021-08-12 15:05:32 -04:00
Carol (Nichols || Goulding)
ae6b0e669b
refactor: Extract a database persister type that wraps object store
...
Connects to #2193 .
2021-08-12 15:05:32 -04:00
Edd Robinson
c68bbb6309
test: update test
2021-08-12 15:05:47 +01:00
Raphael Taylor-Davies
2c4384625a
feat: shutdown Database and Server on drop ( #2241 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-12 12:37:47 +00:00
Marco Neumann
e8bc7ee909
feat: server functionality to recover DB by skipping replay
2021-08-12 14:18:38 +02:00
kodiakhq[bot]
7956729ffa
Merge branch 'main' into crepererum/improve_write_buffer_mocking
2021-08-12 10:00:19 +00:00
Marco Neumann
1eb6e1f7f2
refactor: write buffer mocking is only required for tests
2021-08-12 11:46:24 +02:00
kodiakhq[bot]
c46c2a35fa
Merge branch 'main' into crepererum/database_creation_code_move
2021-08-12 09:30:33 +00:00
Andrew Lamb
34a1c1674f
chore: remove unused dependency ( #2247 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-12 08:57:12 +00:00
Marco Neumann
a5c74f2798
feat: ability to inject mocked write buffers into server/database
2021-08-12 10:46:16 +02:00
Marco Neumann
7d105e9229
docs: fix warnings
2021-08-12 09:30:54 +02:00
Dom
3de6b44e23
build: use new rustdoc lint name ( #2261 )
...
* fix: nocache feature code rot
The MBChunk::snapshot code when using the "nocache" option no longer
compiles - this commit updates it to match the not(nocache) code.
* build: use updated broken_intra_doc_links name
The broken_intra_doc_links lint was renamed
rustdoc::broken_intra_doc_links
https://doc.rust-lang.org/rustdoc/lints.html
2021-08-11 19:48:51 +00:00
Marco Neumann
794a9c039d
refactor: move database creation code around
...
Now all the code that is required to create a new database lives under
`server::database`, so it can easily be used for tests that don't
involve a server.
2021-08-11 18:44:55 +02:00
Marco Neumann
65b1ca2071
fix: also seed persistence windows when skipping replay
2021-08-11 10:27:52 +02:00
Marco Neumann
2082042626
test: do not wipe-on-error during tests
2021-08-11 10:27:51 +02:00
Marco Neumann
2eaf486eac
fix: always remember max seen sequ. numbers during replay
...
Do not forget max seen sequence numbers for partition-sequencer
combinations that can be skipped during replay.
Fixes #2215 .
2021-08-11 10:26:12 +02:00
Raphael Taylor-Davies
2344c28f4e
feat: drain database jobs on shutdown ( #2239 )
...
* feat: drain database jobs on shutdown
* chore: fmt
* chore: review feedback
* chore: use join() not member directly
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 16:47:37 +00:00
Raphael Taylor-Davies
29ac62c6f8
fix: reduce flakiness of lock_tracker_metrics test ( #2238 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 11:47:08 +00:00
Marco Neumann
4cf9244457
test: restore test assertions
2021-08-10 11:29:48 +02:00
Marco Neumann
cd414f28ef
fix: incorrect speculation of post-persist sequence ranges
...
This fixes an edge case where the speculated sequence ranges that can be
obtained from flush handles do not account for overlapping windows. The
symptom being that the resulting partition checkpoint marked sequence
numbers as unpersisted that were already persisted.
Fixes #2206 .
2021-08-10 11:29:48 +02:00
Raphael Taylor-Davies
cd5f4e1755
feat: background worker panic handling ( #2091 ) ( #2234 )
...
* feat: worker panic handling (#2091 )
* chore: add test comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 09:17:56 +00:00
Raphael Taylor-Davies
564819d24f
feat: Server own background worker ( #2232 )
...
* feat: Server own background worker
* chore: fix shutdown
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-09 18:01:48 +00:00
Marco Neumann
4dcee10d1e
refactor: do not construct replay plan when skipping replay
...
Up until now we only skipped the execution of the replay plan, not its
construction. The replay plan construction has some bugs left, so let's
move this part behind the toggle as well.
2021-08-09 15:23:39 +02:00
Raphael Taylor-Davies
c11eb25d4e
feat: remove create_database_lock ( #2227 )
...
* feat: remove create_database_lock
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-09 13:22:11 +00:00
kodiakhq[bot]
bf15e50ce7
Merge branch 'main' into crepererum/fix_checkpoint_ordering3
2021-08-09 12:27:20 +00:00
Raphael Taylor-Davies
54a8fff328
feat: database initialization logging ( #2228 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-09 12:13:33 +00:00
Andrew Lamb
559db4529d
refactor: Move DatabaseStore out of query crate ( #2219 )
...
* refactor: Move DatabaseStore out of query crate
* fix: doc links
2021-08-09 12:06:25 +00:00
Marco Neumann
92334a3747
docs: explain test intent
2021-08-09 13:26:31 +02:00
Marco Neumann
ae93a1cb89
test: adjust replay tests
2021-08-09 10:54:23 +02:00
Marco Neumann
950286e5b7
feat: make replay planning work w/ unordered checkpoints
2021-08-09 10:54:23 +02:00
Marco Neumann
57bbae7e34
refactor: persistence windows row counts are non-zero
2021-08-09 10:33:24 +02:00
Raphael Taylor-Davies
c957d8154f
feat: blocking Freezable ( #2224 )
...
* feat: blocking Freezable
* chore: test
2021-08-08 19:26:11 +00:00
Raphael Taylor-Davies
1f450ef371
feat: add Database abstraction ( #2186 ) ( #2203 )
...
* feat: add Database abstraction
* chore: minor tweaks
* chore: remove redundant test fixture restart
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-08 17:14:23 +00:00
Andrew Lamb
d41b44d312
feat: use zstd compression when writing parquet files ( #2218 )
...
* feat: use ZSTD when writing parquet files
* fix: test
2021-08-06 18:45:55 +00:00
Andrew Lamb
5d525cdc70
docs: Add note about what uses `ApplicationState` ( #2216 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-06 14:44:06 +00:00
Marco Neumann
4c79e3548e
test: do not rely on too many edge cases
2021-08-06 10:24:26 +02:00
Marco Neumann
882f89cecf
fix: only warn when partition ckpt and DB ckpt mins are out-of-sync
...
There are currently a few bugs and semi-understood edge cases that can
lead to this case. So instead of bailing out, just issue a warning.
2021-08-06 09:48:26 +02:00
Marco Neumann
4ffdb3d95d
test: drop-unpersisted is not required to trigger that bug
2021-08-06 09:48:26 +02:00
Marco Neumann
bde2b2b5df
refactor: `Tick` -> `MakeWritesPersistable`
2021-08-05 14:21:36 +02:00
Marco Neumann
548145a70e
docs: state that `background_worker_now_override` is for testing only
...
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-08-05 14:08:24 +02:00
Marco Neumann
015d858f88
test: add failing regression test for #2185
...
We need a partition that is partially persisted for this.
This requires some rework for the time handling in `Db` to make it
mockable.
The remaining bits are test framework extensions.
2021-08-05 11:44:44 +02:00
Raphael Taylor-Davies
dd9beab166
feat: error database if no rules ( #2187 )
2021-08-04 11:58:59 +00:00
Marco Neumann
60aee3e70c
refactor: avoid copying a sequence
2021-08-04 13:23:30 +02:00
Marco Neumann
1b2e331ec1
test: extend replay tests a bit
2021-08-04 13:23:30 +02:00
Marco Neumann
af1edcdcbb
fix: docstrings
...
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-08-04 13:23:30 +02:00
Marco Neumann
39f30fd0b6
test: make test queries easier to understand
2021-08-04 13:23:30 +02:00
Marco Neumann
567ef7e991
test: expand replay tests a bit
2021-08-04 13:23:30 +02:00
Marco Neumann
b868cd160e
docs: fix code comment about sequence ranges
2021-08-04 13:23:30 +02:00
Marco Neumann
ed70b73fd8
test: deterministic concurrency for `TestDb`
2021-08-04 13:23:30 +02:00
Marco Neumann
a2bc97b923
feat: prune sequence numbers during replay
...
This only prunes entire sequence numbers, it does not (yet!) prune
individual rows for sequence numbers that are partially persisted.
2021-08-04 13:23:30 +02:00
Andrew Lamb
7a18087044
feat: Log messages during database initialization ( #2180 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-04 11:04:41 +00:00
Marco Neumann
65991270e4
refactor: rename handle and shutdown to link them to background worker
2021-08-04 12:04:47 +02:00
Marco Neumann
c2faf0876b
fix: fix typo and explain policy storage
2021-08-04 11:55:31 +02:00
Marco Neumann
42953b0561
fix: increase max wait time for compaction to 60s
2021-08-04 11:51:07 +02:00
Marco Neumann
164c6e3743
feat: improve hard buffer logging and use that as test assertions
2021-08-04 11:49:05 +02:00
Marco Neumann
657f469317
test: fix `seek_to_end_works`
2021-08-04 11:33:47 +02:00
Marco Neumann
6ce1984d75
test: improve hard buffer limit tests
2021-08-04 11:33:47 +02:00
Marco Neumann
3ac88ffc49
fix: hard buffer limits around write buffer consumption
...
- when reading entries from write buffer during normal playback, do not
throw away entries when hitting the hard buffer limit. instead wait
for compaction to sort it out
- during playback, wait for compaction
2021-08-04 11:33:47 +02:00
Marco Neumann
9ea04a42ff
refactor: start background worker before performing replay
...
This enables compaction during replay.
2021-08-04 11:33:47 +02:00
Marco Neumann
0fe8eda89e
refactor: move lifecycle policy into Db struct
2021-08-04 11:33:47 +02:00
Jacob Marble
98d4c9fca1
feat: switch protobuf write service to canonical definition ( #2182 )
...
* feat: switch protobuf write service to canonical definition
The protobuf definition used for the proto write endpoint was a WIP. Now
that a canonical definition exists at
https://github.com/influxdata/influxdb-pb-data-protocol/ we can switch
to that.
* chore: lint etc
* chore: fix rustdoc nit in proto definition comment
2021-08-04 00:16:49 +00:00
Raphael Taylor-Davies
ffb36cd50c
refactor: extract ApplicationState from Server ( #2167 )
...
* refactor: extract Application from Server
* chore: review feedback
2021-08-03 09:36:55 +00:00
Marco Neumann
f504d6002a
docs: error handling for `seek_to_end`
2021-08-03 09:40:40 +02:00
Marco Neumann
c912e91c95
feat: add flag to skip replay
...
Closes #2169 .
2021-08-02 18:14:19 +02:00
Carol (Nichols || Goulding)
9d15798288
fix: Address or allow Clippy warnings new with Rust 1.54
2021-07-30 09:59:59 -04:00
kodiakhq[bot]
545222303f
Merge branch 'main' into cn/cc-only
2021-07-29 17:18:16 +00:00
Carol (Nichols || Goulding)
79a04f861f
refactor: Take chunk and write time when creating a new MUB chunk
...
This makes it more consistent with the API of creating a new read buffer
chunk and a new object store chunk.
2021-07-29 10:11:50 -04:00
Raphael Taylor-Davies
431774c8b7
refactor: extract resolver from server::Config ( #2143 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-29 13:14:58 +00:00
Raphael Taylor-Davies
336ff30484
refactor: make server fields private ( #2144 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-29 13:06:05 +00:00
Raphael Taylor-Davies
df3b162475
refactor: move connection manager to separate module ( #2142 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-29 12:58:15 +00:00
Carol (Nichols || Goulding)
ad0a9549de
fix: Avoid an unnecessary parsing of iox metadata
...
In one case where ParquetChunk::new was being called, the calling code
had just parsed the IoxMetadata too. In the other case, the calling code
had just *created* the IoxMetadata being parsed. In both cases, this
re-parsing wasn't actually needed; the two bits of info
ParquetChunk::new needs can be easily passed in.
2021-07-28 14:25:56 -04:00
Carol (Nichols || Goulding)
af7866a638
refactor: Remove first/last write times from ParquetFile chunks
2021-07-28 14:12:36 -04:00
Marco Neumann
9371f781fe
test: add "missing entry" replay test
2021-07-28 17:34:02 +02:00
Marco Neumann
04e797c706
refactor: pass sequencer numbers directly to DB checkpoint
...
First of all using a partition checkpoint as some kind of intermediate
representation was kinda a hack because partition checkpoints should
only be created for to-be-persisted partitions, not for the others.
API-wise it should only be possible to construct a partition checkpoint
from a flush handle.
Also we were only able to construct partition checkpoints for partitions
that had unpersisted data, otherwise there was no sane way to fill the
`min_unpersisted_timestamp`. We must however scan all partitions no
matter if there is unpersisted data so that we can determine the maximum
seen sequence numbers. This was caught by a replay test resulting in a
catalog state where the last database checkpoint had lower maximum seen
sequence numbers than some partition checkpoint, bailing out with an
error.
So overall it turns out that passing the sequencer numbers directly
instead of wrapping them into a partition checkpoint is the better
implementation.
2021-07-28 17:28:34 +02:00
Marco Neumann
a0764cbafd
test: add failing replay test
2021-07-28 17:28:34 +02:00
Marco Neumann
29ddc36154
docs: state the reason for some replay tests
2021-07-28 17:28:34 +02:00
Marco Neumann
ca90e92ecc
fix: replay tests should not fail when awaiting on query results
2021-07-28 17:28:34 +02:00
Carol (Nichols || Goulding)
11b7755325
refactor: Remove first/last write times from RUB chunks
2021-07-28 11:22:22 -04:00
Carol (Nichols || Goulding)
4689b5e4e5
refactor: Remove first/last write times from MUB chunks
2021-07-28 11:02:57 -04:00
Carol (Nichols || Goulding)
0f5398c4b9
refactor: Store first/last write on DbChunk snapshots
2021-07-28 11:02:56 -04:00
Carol (Nichols || Goulding)
bc2ec3338f
refactor: Move MBChunk creation inside CatalogChunk new_open
2021-07-28 11:02:56 -04:00
Carol (Nichols || Goulding)
b5195571fa
refactor: Move MBChunk creation inside partition create_open_chunk
2021-07-28 11:02:56 -04:00
kodiakhq[bot]
7b73190d79
Merge branch 'main' into crepererum/ingest_wallclock
2021-07-28 13:49:08 +00:00
Marco Neumann
0fcec6b742
refactor: move ingest timestamp from sequence to sequenced entry
2021-07-28 15:40:35 +02:00
Raphael Taylor-Davies
754d647c06
feat: enable row timestamp metrics with environment variable ( #2135 )
...
* feat: enable row timestamp metrics with environment variable
* chore: fix test
* chore: fix typo
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-28 13:39:56 +00:00
Carol (Nichols || Goulding)
8add00e761
feat: Make CatalogChunk first/last write times required
...
Connects to #1927 .
2021-07-28 09:22:06 -04:00
Carol (Nichols || Goulding)
09e48018a0
refactor: Move ts_to_timestamp fn into the only file it's used in
2021-07-28 09:22:06 -04:00
Carol (Nichols || Goulding)
7c9a21632b
refactor: Organize uses
2021-07-28 09:22:04 -04:00
Marco Neumann
7b1301851a
feat: metric for ingest wall-clock time
2021-07-28 14:41:18 +02:00
Marco Neumann
e736bc6953
feat: add ingest timestamp to `Sequence`
...
This allows us to track wall-clock ingest time for entries that we
receive via write buffer (e.g. Kafka).
2021-07-28 14:41:18 +02:00
Marko Mikulicic
ec0804900a
feat(iox): Quick&Dirty KafkaProducer sink implementation
...
RoutingRules such as RoutingConfig and ShardConfig use a sink to decide where to write
the entries.
The write buffer is currently implemented in the `db` and is accessed by using the `write_local_entry`
code path. This PR simply invokes that legacy code path whenever a "kafka" sink is selected.
This allows us immediately to benefit from the ability of the ShardingConfig to select or reject
tables and send some to kafka, some to devnull.
This PR does not allow us yet to split an input batch into multiple shards and send each
to a different kafka topic. For that, we'll need to pull the write buffer code path out of
the `db` and do something similar to a ConnectionManager but for write buffers. TODO
2021-07-28 10:13:22 +02:00
Andrew Lamb
3ea84c6be4
feat: expose null_counts in system.chunk_columns ( #2105 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-27 11:05:23 +00:00
kodiakhq[bot]
5551dd3a87
Merge branch 'main' into devnull
2021-07-27 09:57:16 +00:00
kodiakhq[bot]
119b913fa3
Merge branch 'main' into crepererum/improve_replay_tests
2021-07-27 07:27:58 +00:00
Andrew Lamb
5fb3e00f2a
fix: Properly record total_count and null_count in statistics ( #2103 )
...
* fix: Properly record total_count and null_count in statistics
* fix: fix statistics calculation in mutable_buffer
* refactor: expose null counts in read_buffer
* refactor: expose null_count in parquet_file
* fix: update server crate tests
* fix: update query_tests tests
* docs: tweak comments
* refactor: Use storage_stats rather than adding `null_count`
* refactor: rename test data field for clarity
* fix: fixup merge conflicts
* refactor: rename initial_non_null_count to initial_total_count
* refactor: calculate null_count as row_count - to_add
2021-07-26 18:13:36 +00:00
Marko Mikulicic
094945a72d
feat: Add '/dev/null' sink
2021-07-26 19:19:11 +02:00
Marco Neumann
d7e0b03064
refactor: use `drop` instead of `Option`
2021-07-26 17:43:03 +02:00
Marco Neumann
2d5a095d2d
refactor: rename `ActionOrTest` to `Step`
2021-07-26 17:34:13 +02:00
Marco Neumann
5787fbdb21
refactor: rename framework tests
2021-07-26 17:32:46 +02:00
Marco Neumann
aa61eb2732
refactor: improve replay test naming and add more docs
2021-07-26 17:31:13 +02:00
Marco Neumann
43cb148566
fix: docstring
...
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-07-26 17:14:01 +02:00
Marco Neumann
43f29422f8
refactor: isolate replay code and improve tests
...
This puts all the replay logic under `server::db::replay` as well as its
error variants and tests.
The tests are reworked using a more generic
test framework which allows us to specify a number of steps instead of
filling pre-defined ones with variables. Each step is either an action
(e.g. restart DB, perform replay, ingest data into the write buffer
state) or a check (e.g. assert that these partitions exist, await until
the background worker has ingested these partitions). The entire
framework is kept generic so it should be easy to create more checks and
actions in the future. The resulting tests are more verbose, but (at
least in my opinion) easier to follow along since the reader can see
what's happening at which step and does not jump back and forth between
the test config and the "driver" that uses the config.
2021-07-26 17:14:01 +02:00
kodiakhq[bot]
009c77d864
Merge branch 'main' into cn/parquet-first-last
2021-07-26 14:59:54 +00:00
Raphael Taylor-Davies
0b88deea43
refactor: don't pass sequence to MUB ( #2107 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-26 14:40:39 +00:00
Marko Mikulicic
e5ee252876
feat: Add kafka sink variant
2021-07-26 11:08:02 +02:00
Marko Mikulicic
d58a3ccbc7
refactor: Add sink to routing config
...
This deprecates the "target" field in the RoutingConfig and replaces it with the "sink"
field, which has a variant that accepts a node group.
This commit is backward compatible in that it will accept existing configs.
The configs will roundtrip to the new format though (i.e. `database get` will render
the sink field).
2021-07-26 11:08:01 +02:00
Marko Mikulicic
16a82ba350
refactor: Generalize sinks: Rename Shard to Sink
...
The ShardConfig applies matchers that resolve to a shard number.
The config then applies a mapping between shard numbers to targets.
The type that encapsulated the target that a shard points to was also called
a "Shard". This is confusing. This commit changes it to "Sink", i.e. a destination
for traffic to go to. Subsequent commits will expand the definition of a Sink to
encompass different kinds of sinks (like kafka write buffer, "devnull", ...)
This changes only the name of the protobuf message and the related rust types,
it doesn't change any name of the json-rendered protobuf configs.
2021-07-26 11:08:00 +02:00
Raphael Taylor-Davies
c595039c81
feat: add row timestamp metrics ( #2101 )
...
* feat: add row timestamp metrics
* chore: review feedback
2021-07-23 19:17:11 +00:00
Jake Goulding
d928bc84e6
feat: Thread time_of_{first,last}_write through Parquet metadata
2021-07-23 14:07:35 -04:00
Raphael Taylor-Davies
446af5eb15
fix: consistent write timestamps ( #2104 )
...
* fix: consistent write timestamps
* chore: fix benchmarks
2021-07-23 18:04:15 +00:00
Carol (Nichols || Goulding)
3c794153dd
refactor: Organize uses
2021-07-23 13:48:15 -04:00
Carol (Nichols || Goulding)
7de946c534
fix: ChunkStage::WrittenToObjectStore is now called ChunkStage::Persisted
2021-07-23 13:11:42 -04:00
Raphael Taylor-Davies
844a025c7c
feat: drop based on LRU ( #2075 ) ( #2092 )
...
* feat: drop based on LRU (#2075 )
* chore: review feedback
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-23 08:31:28 +00:00
Marco Neumann
53b00ec4e0
test: split replay tests
2021-07-23 10:17:02 +02:00
Marco Neumann
be1bc7025c
refactor: use a single seek loop during replay
2021-07-23 10:05:11 +02:00
Marco Neumann
ace247d5c2
feat: add replay logging
2021-07-23 10:03:02 +02:00
Marco Neumann
0c89930b7c
feat: check that replay plan and write buffer are in-sync
2021-07-23 09:39:46 +02:00
Marco Neumann
db0f501b02
feat: implement naive replay
2021-07-23 09:24:04 +02:00
Marco Neumann
6ef3680554
feat: collect replay plan during catalog loading
2021-07-23 09:23:06 +02:00
kodiakhq[bot]
71f3f1aba2
Merge branch 'main' into cn/refactorings
2021-07-22 19:44:18 +00:00
Andrew Lamb
01c79f1a1a
fix: Print all timestamps using RFC3339 format ( #2098 )
...
* fix: Use IOx pretty printer rather than arrow pretty printer
* chore: update tests in the query crate
* chore: update influxdb_iox tests
* chore: Update end to end tests
* chore: update query_tests
* chore: update mutable_buffer tests
* refactor: update parquet_file tests
* refactor: update db tests
* chore: update kafka integration test output
* fix: merge conflict
2021-07-22 19:04:52 +00:00
Raphael Taylor-Davies
20d06e3225
feat: include more information in system.operations table ( #2097 )
...
* feat: include more information in system.operations table
* chore: review feedback
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 17:16:09 +00:00
Carol (Nichols || Goulding)
14cb2a6bef
test: Add assertions for first/last write times as chunks move
2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding)
37f24ebfc7
feat: Record first/last write times for creation of read_buffer::Chunk
2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding)
0c44179aa9
feat: Add first/last write time on DbChunk
...
To eventually be used in collect_rub
2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding)
8d1d877196
feat: Record first/last write times for RUB chunks
2021-07-22 11:35:22 -04:00
Carol (Nichols || Goulding)
28fc01ecee
test: Make test failure messages easier to read
2021-07-22 11:15:19 -04:00
Carol (Nichols || Goulding)
6feea3b2d5
feat: Require at least one RecordBatch to create a read_buffer::Chunk::new
...
In the signature only for the moment.
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding)
d347750366
refactor: Make collect_rub create the RBChunk
...
Which gets rid of the need for new_rub_chunk.
This will enable creating RBChunks that are guaranteed to have data.
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding)
0a724878e6
refactor: Organize uses
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding)
7371b0aabf
refactor: Use existing new_rub_chunk function that has the same code
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding)
eadcb3265a
refactor: Use some TryStreamExt adapters in collect_rub
2021-07-22 11:15:18 -04:00
Raphael Taylor-Davies
38e375d11a
feat: add chunk storage metrics ( #2069 )
...
* feat: add chunk storage metrics
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 15:13:09 +00:00
Raphael Taylor-Davies
8c974beba0
feat: add access timestamps to CatalogChunk ( #2075 ) ( #2081 )
...
* feat: add access timestamps to CatalogChunk (#2075 )
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 12:19:30 +00:00
kodiakhq[bot]
8c4f5cb237
Merge branch 'main' into crepererum/fix_db_checkpoints
2021-07-21 16:46:13 +00:00
Marco Neumann
cddf94653c
refactor: use `write_buffer` subsystem for ingest metrics
2021-07-21 15:07:59 +02:00
Marco Neumann
fd00206fbb
refactor: increase watermark update frequency to once per 10s
2021-07-21 15:02:48 +02:00
Marco Neumann
2f1efcf517
docs: clarify difference
2021-07-21 15:00:53 +02:00
Marco Neumann
4d5f209030
docs: do not repeat unix that often
2021-07-21 14:59:07 +02:00
Marco Neumann
ec866de193
fix: collect checkpoint data from all tables
2021-07-21 14:48:29 +02:00
Marco Neumann
7d597d1d5c
refactor: make ingest metrics easier to understand
2021-07-21 13:57:53 +02:00
Marco Neumann
fb931bb1ca
feat: write buffer ingestion metrics
2021-07-21 11:59:52 +02:00
Raphael Taylor-Davies
091837420f
feat: add PersistenceWindows system table ( #2030 ) ( #2062 )
...
* feat: add PersistenceWindows system table (#2030 )
* chore: update log
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 13:10:57 +00:00
Raphael Taylor-Davies
e4d2c51e8b
fix: update PersistenceWindows on rules update ( #2018 ) ( #2060 )
...
* fix: update PersistenceWindows on rules update (#2018 )
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 12:44:47 +00:00
kodiakhq[bot]
58dd7e9532
Merge branch 'main' into crepererum/writer_buffer_seek
2021-07-20 12:29:18 +00:00
Raphael Taylor-Davies
cf8a60252d
refactor: split system_tables module into smaller modules ( #2061 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 12:19:20 +00:00
Marco Neumann
ec7ebdff29
refactor: use lifetimes to ensure single stream / no seek while streaming
2021-07-20 13:52:33 +02:00
Marco Neumann
b0663a0337
feat: disallow multiple write buffer streams and seeking while streams
...
Multiple streams will mess up ordering. Seeking while streaming is
likely a bug and should not work.
2021-07-20 12:35:20 +02:00
Raphael Taylor-Davies
767c2a6fe1
refactor: explicit server startup state machine ( #2040 )
...
* refactor: explicit server startup state machine
* chore: update `ServerStage` docs
* chore: further docs
* chore: more logging
* chore: format
2021-07-20 10:11:18 +00:00
kodiakhq[bot]
5bf68c4a57
Merge branch 'main' into jg/snafu-driveby
2021-07-19 20:20:30 +00:00
Raphael Taylor-Davies
1c8c227668
refactor: push database rules update into Db ( #2052 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-19 16:05:21 +00:00
kodiakhq[bot]
1d1ac12522
Merge branch 'main' into crepererum/write_buffer_multiple_streams
2021-07-19 15:50:42 +00:00
Andrew Lamb
4da8a16c18
chore: update to arrow 5.0 and master datafusion ( #2049 )
...
* chore: update to arrow 5.0 and master datafusion
* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Raphael Taylor-Davies
e2a23c7ac3
fix: persist deadlock ( #2045 ) ( #2046 )
2021-07-19 11:52:48 +00:00
Marco Neumann
592424c896
refactor: use one stream per sequencer/partition
...
Advantages are:
- for large DBs w/ many partitions we can ingest data in-parallel
- on top of this change we can implement per-sequencer seeking, which is
required for replay
2021-07-19 12:26:58 +02:00
kodiakhq[bot]
a1d47a8a7a
Merge branch 'main' into crepererum/simplify_testdb_lifecycle_rules
2021-07-19 09:53:35 +00:00
Raphael Taylor-Davies
5fc98c7c56
feat: add failure reporting to TaskTracker ( #2031 )
...
* feat: add failure reporting to TaskTracker
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-19 09:17:20 +00:00
Marco Neumann
2263189e09
test: make TestDb lifecycle better for testing
...
This is a leftover from #1972 .
2021-07-19 09:50:44 +02:00
Jake Goulding
449ba46b22
refactor: Make more use of SNAFU's context methods and ensure! macro
2021-07-16 16:31:50 -04:00
Edd Robinson
54ad69ed86
fix: ensure correct table meta size used
2021-07-16 10:48:45 -04:00
Marco Neumann
f57ba6afdb
fix: use fixed-size timestamps for parquet metadata ( #2032 )
...
This fixes flaky tests that rely on predictable file sizes.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-16 13:14:02 +00:00
Marco Neumann
2498642c00
fix: `persist_partition` docstring
2021-07-16 12:46:07 +02:00
Marco Neumann
1ef2bc1887
refactor: `Db::{write_chunk_to_object_store => Db::persist_partition}`
...
The previous method allowed to persist any chunk -- even ones that
should not be persisted yet and w/o any order of persistence. That will
break our persistence windows. So instead offer a sane higher-level
interface that can trigger persistence of a partition within the
boundaries of the lifecycle rules. This needs some adjustments for our
test suite.
2021-07-16 12:07:58 +02:00
Marco Neumann
9683d91f32
refactor: adjust to upstream changes
2021-07-16 11:45:34 +02:00
Marco Neumann
2b0a4bbe0a
feat: persist real (non-fake) part.+DB checkpoints
2021-07-16 11:45:34 +02:00
Marco Neumann
8276511bd3
feat: allow to construct partition checkpoint from partition
2021-07-16 11:45:34 +02:00