Carol (Nichols || Goulding)
d347750366
refactor: Make collect_rub create the RBChunk
...
Which gets rid of the need for new_rub_chunk.
This will enable creating RBChunks that are guaranteed to have data.
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding)
0a724878e6
refactor: Organize uses
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding)
7371b0aabf
refactor: Use existing new_rub_chunk function that has the same code
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding)
eadcb3265a
refactor: Use some TryStreamExt adapters in collect_rub
2021-07-22 11:15:18 -04:00
Raphael Taylor-Davies
38e375d11a
feat: add chunk storage metrics (#2069)
...
* feat: add chunk storage metrics
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 15:13:09 +00:00
Raphael Taylor-Davies
8c974beba0
feat: add access timestamps to CatalogChunk (#2075) (#2081)
...
* feat: add access timestamps to CatalogChunk (#2075)
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 12:19:30 +00:00
kodiakhq[bot]
f4b9fe20fd
Merge pull request #2084 from influxdata/crepererum/fix_checkpoints_again
...
refactor: correctly track "seen" ranges in persistence checkpoints
2021-07-22 11:45:39 +00:00
Marco Neumann
50241bae9e
refactor: do not abuse `uint64::MAX` as a sentinel for `None`
2021-07-22 12:51:43 +02:00
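The idea behind this refactor can be illustrated with a minimal sketch. It is not the code from this commit: `MaxSeen` and its methods are hypothetical names, and the sketch only assumes that the sentinel in question is the maximum value of Rust's `u64`. Using `Option<u64>` keeps "nothing seen yet" distinct from every valid sequence number, including `u64::MAX` itself.

```rust
/// Hypothetical tracker for the highest sequence number seen so far.
/// `None` means "nothing seen yet"; no numeric value is reserved for that.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct MaxSeen {
    max: Option<u64>,
}

impl MaxSeen {
    /// Record a sequence number, keeping the largest one observed.
    pub fn observe(&mut self, sequence_number: u64) {
        self.max = Some(match self.max {
            Some(current) => current.max(sequence_number),
            None => sequence_number,
        });
    }

    /// Largest sequence number observed, or `None` if there was none.
    pub fn get(&self) -> Option<u64> {
        self.max
    }
}
```

With a sentinel, a genuine sequence number equal to the sentinel would be indistinguishable from "no data"; with `Option`, the compiler forces callers to handle the empty case explicitly.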
Marco Neumann
47ad397918
fix: address review comments
2021-07-22 12:37:07 +02:00
Marco Neumann
57a9d5ade0
refactor: correctly track "seen" ranges in persistence checkpoints
...
Now we can handle all these cases:
There are two partitions w/ a single write each:
1. A reads sequence number 1
2. B reads sequence number 2
3. we persist A which only knows the sequences up until 1
=> the DB checkpoint needs the global max, otherwise we forget sequences
during replay (2 in this case, so B would be gone)
1. B reads sequence number 1
2. A reads sequence number 2
3. we persist A which (w/o this commit) would not track the sequencer at
all in this checkpoint (since there is nothing to replay)
=> we MUST also remember that we already read up until 2, otherwise we'll
re-read 2 after replay
=> the partition checkpoint needs the local seen max (no matter if there's
something to persist)
2021-07-21 19:19:49 +02:00
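The two cases in the commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the commit's implementation: `PartitionState`, `read`, `persist`, and `global_max_seen` are hypothetical names, and a single sequencer is assumed.

```rust
/// Hypothetical per-partition bookkeeping for one sequencer.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct PartitionState {
    /// Sequence numbers held in memory and not yet persisted.
    unpersisted: Vec<u64>,
    /// Highest sequence number this partition has ever read.
    max_seen: Option<u64>,
}

impl PartitionState {
    /// Record that this partition read the given sequence number.
    pub fn read(&mut self, seq: u64) {
        self.unpersisted.push(seq);
        self.max_seen = Some(self.max_seen.map_or(seq, |m| m.max(seq)));
    }

    /// Persist all buffered data. The returned value is the local seen max
    /// that the partition checkpoint should carry, even though nothing is
    /// left to replay for this partition afterwards.
    pub fn persist(&mut self) -> Option<u64> {
        self.unpersisted.clear();
        self.max_seen
    }
}

/// Global seen max across partitions, as needed by the DB checkpoint.
pub fn global_max_seen<'a>(
    partitions: impl IntoIterator<Item = &'a PartitionState>,
) -> Option<u64> {
    partitions.into_iter().filter_map(|p| p.max_seen).max()
}
```

In the first case (A reads 1, B reads 2, A is persisted), `global_max_seen` yields 2, so B's write is not forgotten. In the second case (B reads 1, A reads 2, A is persisted), `persist` still reports 2 for A, so sequence number 2 is not re-read after replay.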
kodiakhq[bot]
e1b2909818
Merge pull request #2079 from influxdata/crepererum/fix_db_checkpoints
...
fix: checkpoint collection (replay preparation)
2021-07-21 16:53:52 +00:00
kodiakhq[bot]
8c4f5cb237
Merge branch 'main' into crepererum/fix_db_checkpoints
2021-07-21 16:46:13 +00:00
kodiakhq[bot]
13ae2f0d78
Merge pull request #2070 from influxdata/ntran/dedup_compare_cols_order
...
feat: new algorithm to compute key ranges for deduplication
2021-07-21 15:50:11 +00:00
kodiakhq[bot]
18dd108ba6
Merge branch 'main' into ntran/dedup_compare_cols_order
2021-07-21 15:42:30 +00:00
Nga Tran
86add39175
refactor: address review comments
2021-07-21 11:41:21 -04:00
kodiakhq[bot]
56dd430d8f
Merge pull request #2077 from influxdata/crepererum/sequencer_metrics
...
feat: write buffer ingestion metrics
2021-07-21 13:30:22 +00:00
kodiakhq[bot]
91acf3911c
Merge branch 'main' into crepererum/sequencer_metrics
2021-07-21 13:23:23 +00:00
Marco Neumann
55490c279a
fix: Kafka watermark error for new partitions
2021-07-21 15:21:52 +02:00
Marco Neumann
cddf94653c
refactor: use `write_buffer` subsystem for ingest metrics
2021-07-21 15:07:59 +02:00
Marco Neumann
fd00206fbb
refactor: increase watermark update frequency to once per 10s
2021-07-21 15:02:48 +02:00
Marco Neumann
2f1efcf517
docs: clarify difference
2021-07-21 15:00:53 +02:00
Marco Neumann
4d5f209030
docs: do not repeat unix that often
2021-07-21 14:59:07 +02:00
Marco Neumann
a5fc1c7d38
fix: collect min AND max in database checkpoints
...
This is required to correctly handle the following case:
1. There are two partitions A and B w/ a single write each (from the same
sequencer).
2. We persist A:
- The partition checkpoint for A will be empty because after persistence
there will be nothing to replay (the single write is persisted and
we're ready).
- The database checkpoint that contains the global minimum of all ranges
recognizes that for the sequencer there is indeed something left (the
minimum sequence number from B).
3. DB restart happens, replay starts
4. We scan all persisted files, figure out that we have a DB checkpoint
with a sequence minimum but (w/o the change in this commit) there is no
maximum. Only partition checkpoints contain maxima, and the only partition
checkpoint that was persisted was the one for partition A and that one was
empty (see above).
5. So now how do we recover partition B?
2021-07-21 14:48:29 +02:00
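A rough sketch of why the database checkpoint needs both bounds is given below. It is an assumption-laden illustration, not the code from this commit: `ReplayRange` and `database_range` are hypothetical names, and each partition is reduced to a pair of (minimum unpersisted sequence number, maximum seen sequence number) for a single sequencer.

```rust
/// Hypothetical inclusive range of sequence numbers to replay.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ReplayRange {
    pub min: u64,
    pub max: u64,
}

/// Combine per-partition `(min_unpersisted, max_seen)` pairs into one
/// database-level range. Returns `None` when nothing is left to replay.
pub fn database_range(partitions: &[(Option<u64>, Option<u64>)]) -> Option<ReplayRange> {
    // Global minimum over everything that is still unpersisted.
    let min = partitions
        .iter()
        .filter_map(|(min_unpersisted, _)| *min_unpersisted)
        .min()?;
    // Global maximum over everything that was ever seen.
    let max = partitions
        .iter()
        .filter_map(|(_, max_seen)| *max_seen)
        .max()?;
    Some(ReplayRange { min, max })
}
```

In the scenario from the commit message, A has been persisted (no unpersisted minimum) while B still holds one write. The combined range then covers B's sequence number, so replay can recover partition B even though the only persisted partition checkpoint (A's) is empty.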
Marco Neumann
ec866de193
fix: collect checkpoint data from all tables
2021-07-21 14:48:29 +02:00
Marco Neumann
7d597d1d5c
refactor: make ingest metrics easier to understand
2021-07-21 13:57:53 +02:00
Raphael Taylor-Davies
ffe6e62aee
feat: add instant to datetime conversion (#2078)
...
* feat: add instant to datetime conversion
* chore: review feedback
2021-07-21 11:43:27 +00:00
Marco Neumann
fb931bb1ca
feat: write buffer ingestion metrics
2021-07-21 11:59:52 +02:00
Marco Neumann
5df88c70aa
feat: add ability to fetch watermarks from write buffer
2021-07-21 11:59:52 +02:00
kodiakhq[bot]
58108b79ec
Merge pull request #2058 from influxdata/pd/add-cache-config
...
feat: add parquet cache size setting to database rules
2021-07-21 09:42:07 +00:00
kodiakhq[bot]
94a45339fd
Merge branch 'main' into pd/add-cache-config
2021-07-21 09:35:26 +00:00
Andrew Lamb
387667330a
chore: Update datafusion deps (#2073)
...
* chore: Update datafusion deps
* fix: update tests
2021-07-21 08:27:03 +00:00
Paul Dix
a4704dd165
chore: update parquet_cache_limit to u64 and 0 for default
2021-07-20 15:41:06 -04:00
Paul Dix
297e059085
feat: add parquet cache size setting to database rules
2021-07-20 15:41:06 -04:00
Nga Tran
d547c22e97
refactor: comments
2021-07-20 15:27:41 -04:00
Nga Tran
150e166813
refactor: fix comments
2021-07-20 15:16:24 -04:00
Nga Tran
fa6d216a85
refactor: cleanup
2021-07-20 15:11:02 -04:00
Nga Tran
b98888e8d6
feat: implement key_ranges function that uses new range identify algo
2021-07-20 14:58:54 -04:00
Raphael Taylor-Davies
61da0fe4df
fix: update last_instant when rotating into persistable window (#2067)
2021-07-20 16:38:28 +00:00
Raphael Taylor-Davies
091837420f
feat: add PersistenceWindows system table (#2030) (#2062)
...
* feat: add PersistenceWindows system table (#2030)
* chore: update log
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 13:10:57 +00:00
Raphael Taylor-Davies
e4d2c51e8b
fix: update PersistenceWindows on rules update (#2018) (#2060)
...
* fix: update PersistenceWindows on rules update (#2018)
* chore: review feedback
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 12:44:47 +00:00
kodiakhq[bot]
7d9e1f9704
Merge pull request #2059 from influxdata/crepererum/writer_buffer_seek
...
feat: implement `seek` for write buffer
2021-07-20 12:36:20 +00:00
kodiakhq[bot]
58dd7e9532
Merge branch 'main' into crepererum/writer_buffer_seek
2021-07-20 12:29:18 +00:00
kodiakhq[bot]
2a7848cbf2
Merge pull request #2064 from influxdata/biggermsg
...
fix: Increase kafka message size to 30MiB
2021-07-20 12:28:55 +00:00
kodiakhq[bot]
a4951b5835
Merge branch 'main' into biggermsg
2021-07-20 12:22:19 +00:00
Raphael Taylor-Davies
cf8a60252d
refactor: split system_tables module into smaller modules (#2061)
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 12:19:20 +00:00
Marko Mikulicic
c01cfbc34c
fix: Increase kafka message size
2021-07-20 14:17:37 +02:00
kodiakhq[bot]
cf30d19fd7
Merge pull request #2063 from influxdata/er/fix/flaky_compact_test
...
test: ensure high enough limit
2021-07-20 12:02:27 +00:00
Marco Neumann
ec7ebdff29
refactor: use lifetimes to ensure single stream / no seek while streaming
2021-07-20 13:52:33 +02:00
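The general technique named in this commit subject can be sketched as follows. This is not the write buffer's actual API: `MockBuffer`, `EntryStream`, and the index-based `seek` are hypothetical stand-ins. The point is only that a stream holding `&mut` access to its buffer makes a second stream, or a `seek` during streaming, a compile-time error rather than a runtime check.

```rust
/// Hypothetical in-memory buffer standing in for a write buffer reader.
pub struct MockBuffer {
    entries: Vec<u64>,
    position: usize,
}

/// Stream that exclusively borrows the buffer for as long as it lives.
pub struct EntryStream<'a> {
    buffer: &'a mut MockBuffer,
}

impl MockBuffer {
    pub fn new(entries: Vec<u64>) -> Self {
        Self { entries, position: 0 }
    }

    /// Start streaming from the current position. Because the returned
    /// stream holds `&mut self`, no other stream can exist concurrently.
    pub fn stream(&mut self) -> EntryStream<'_> {
        EntryStream { buffer: self }
    }

    /// Move the read position. Needs `&mut self`, so the borrow checker
    /// rejects this call while an `EntryStream` is still alive.
    pub fn seek(&mut self, position: usize) {
        self.position = position;
    }
}

impl<'a> Iterator for EntryStream<'a> {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        let item = self.buffer.entries.get(self.buffer.position).copied()?;
        self.buffer.position += 1;
        Some(item)
    }
}
```

Calling `seek` after a stream has been dropped works as usual; calling it while a stream is still in scope fails to compile with a "cannot borrow as mutable more than once" error.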
Edd Robinson
cc0aaa58a7
test: ensure high enough limit
2021-07-20 12:43:10 +01:00
Marco Neumann
b0663a0337
feat: disallow multiple write buffer streams and seeking while streaming
...
Multiple streams will mess up ordering. Seeking while streaming is
likely a bug and should not work.
2021-07-20 12:35:20 +02:00