Commit Graph

1362 Commits (36e87d7b2e15a42907c4487a765035d5085c06c4)

Author SHA1 Message Date
Marco Neumann 3f2e46c397 feat: prune old transactions from preserved catalog 2021-09-14 12:08:17 +02:00
Nga Tran 042a78e5a7 feat: apply delete predicate during query to eliminate deleted data 2021-09-13 18:02:55 -04:00
Andrew Lamb 5eef76c868
chore: Update dependencies (including datafusion) (#2521)
* chore: Update datafusion deps to pre-release

* refactor: Update IOx to use new datafusion Statistics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 21:30:44 +00:00
Raphael Taylor-Davies f3bcafcfea
feat: migrate http metrics to metric crate (#2508)
* feat: migrate http metrics

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 18:56:20 +00:00
kodiakhq[bot] e76d70ea36
Merge branch 'main' into ntran/delete_pred_chunks 2021-09-13 16:15:57 +00:00
Nga Tran 40499b222e chore: merge main to branch 2021-09-13 12:15:16 -04:00
Nga Tran 8292c4d2e4 refactor: address review comments 2021-09-13 11:44:18 -04:00
Jake Goulding 0b6e577da5 fix: Return same error when querying deleted vs uncreated database
Closes #2446
2021-09-13 11:43:07 -04:00
Raphael Taylor-Davies 574149d644
feat: migrate remaining catalog metrics to new crate (#2490)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 14:42:14 +00:00
Raphael Taylor-Davies 20143e4f4e
feat: migrate chunk pruning metrics (#2516) 2021-09-13 13:13:47 +00:00
Nga Tran 3798ca09bb feat: save delete predicates in chunks 2021-09-10 17:16:18 -04:00
Raphael Taylor-Davies b8f7319704
feat: migrate read buffer metrics to metric crate (#2510)
* feat: migrate read buffer metrics to metric crate

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-10 19:51:43 +00:00
kodiakhq[bot] f6c0d94991
Merge branch 'main' into crepererum/rust_155 2021-09-10 10:59:59 +00:00
Andrew Lamb ec63321bb0
feat: Fewer errors on update_database_rules (#2433)
* fix: serialize concurrent database rules updates

* fix: second attempt

* docs: Apply suggestions from code review

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-09-10 10:46:26 +00:00
Marco Neumann 368f0369ee chore: Rust 1.55 2021-09-10 12:36:49 +02:00
Raphael Taylor-Davies eed81e752d
feat: remove deprecated catalog metrics (#2489)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-10 10:12:04 +00:00
kodiakhq[bot] faa05f394b
Merge branch 'main' into ntran/parse_delete_2 2021-09-09 18:28:39 +00:00
Raphael Taylor-Davies 44918e4afc
feat: migrate chunk metrics (#2491)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-09 16:02:16 +00:00
kodiakhq[bot] 76271a141a
Merge branch 'main' into crepererum/remove_process_clock 2021-09-09 15:08:40 +00:00
Marco Neumann 4d6ec4bfe6 refactor: remove process clock
The process clock is a leftover from the pre-Kafka write buffer design
and is no longer required.
2021-09-09 16:55:48 +02:00
kodiakhq[bot] a9e2ed4c14
Merge branch 'main' into crepererum/fix_job_metrics 2021-09-09 14:53:00 +00:00
Raphael Taylor-Davies 3cee899f77
feat: migrate catalog timestamp summary to `metric` crate (#2486)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-09 14:52:25 +00:00
Marco Neumann a5d4d954fb fix: increase job duration histogram range
The default upper limit of 10s is too tight for many jobs. This now
increases the histogram range to 5000s (no joke, we've seen jobs w/ over
40min run time, even though that shouldn't happen).
2021-09-09 16:48:21 +02:00
Marco Neumann 40d3f53aee feat: add DB and table name to job metrics 2021-09-09 16:37:44 +02:00
Marco Neumann 0a31f5f2e5 fix: fix job metrics naming
For duration histograms, the exporter takes care of the correct suffix
depending on the resolution used by it. For example, the Prometheus
exporter will use a `..._seconds` metric to encode the histogram. The
IOx internal metric should therefore NOT append any resolution. This
then removes the `_nanoseconds` suffix, renaming the externally visible
metric from

```text
influxdb_iox_job_completed_{cpu,wall}_nanoseconds_seconds
```

to

```text
influxdb_iox_job_completed_{cpu,wall}_seconds
```
2021-09-09 16:37:44 +02:00
Raphael Taylor-Davies 9de12745e7
feat: migrate lock metrics to metric crate (#2481)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-09 08:56:19 +00:00
Nga Tran 00df7b064c feat: finally have the delete predicate parsed 2021-09-08 17:30:10 -04:00
Marco Neumann 801cf08be7 feat: auto-creation of sequencers by write buffer
For Kafka, that basically means that we create a topic if it doesn't
exist yet.

Closed #2455.
Fixes #2189.
2021-09-07 18:24:57 +02:00
Marco Neumann d5662328b0 refactor: `n_sequencers` should be non-zero 2021-09-07 18:18:20 +02:00
Nga Tran dbe4bcff22 chore: merge main to branch 2021-09-07 10:54:59 -04:00
Marco Neumann 31cbb646b9 feat: skip individual rows during replay based on timestamp 2021-09-07 11:44:52 +02:00
Marco Neumann fe0df2ab0c fix: job metric race condition 2021-09-06 14:33:59 +02:00
Marco Neumann 998bafcd85 fix: typo
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-09-06 13:39:22 +02:00
Marco Neumann 77287ad228 feat: rework job metrics to be push-based, add wall/cpu time histograms 2021-09-06 13:39:22 +02:00
Marco Neumann e6f12f965c feat: expose job metrics
Closes #2416.
2021-09-06 13:39:22 +02:00
kodiakhq[bot] f6e040df3d
Merge branch 'main' into dependabot/cargo/tokio-1.11.0 2021-09-06 09:18:55 +00:00
Raphael Taylor-Davies a4b0cbc0e7
feat: migrate jemalloc metrics to `metric` crate (#2435)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-06 09:18:27 +00:00
dependabot[bot] b67610d9b9
chore(deps): bump tokio from 1.10.1 to 1.11.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.10.1 to 1.11.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.10.1...tokio-1.11.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 09:11:38 +00:00
Nga Tran 9de3b79a90 refactor: more cleanup 2021-09-06 01:45:47 -04:00
Nga Tran de0bd80c3d refactor: cleanup 2021-09-06 01:07:07 -04:00
Nga Tran 4801b2c238 feat: Have the ParseDelete message and its corresponding ProvidedParseDelete struct ready for building delete parser 2021-09-06 00:13:59 -04:00
dependabot[bot] b1bb390893
chore(deps): bump parking_lot from 0.11.1 to 0.11.2
Bumps [parking_lot](https://github.com/Amanieu/parking_lot) from 0.11.1 to 0.11.2.
- [Release notes](https://github.com/Amanieu/parking_lot/releases)
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Amanieu/parking_lot/compare/0.11.1...0.11.2)

---
updated-dependencies:
- dependency-name: parking_lot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 01:18:24 +00:00
kodiakhq[bot] 2d41fd519f
Merge branch 'main' into cn/list-soft-deleted 2021-09-03 15:16:32 +00:00
Marco Neumann 3c968ac092 feat: correctly account MUB sizes
Fixes #1565.
2021-09-03 09:15:49 +02:00
Nga Tran a85d95d2e9 refactor: cleanup 2021-09-02 17:25:41 -04:00
Nga Tran e2274a9f41 feat: parser for delete predicate 2021-09-02 17:02:05 -04:00
Carol (Nichols || Goulding) ce6030a3cb feat: Wire list deleted databases through gRPC and CLI APIs 2021-09-02 15:48:07 -04:00
Marco Neumann ecf1f99ddb refactor: more flexible write buffer config
This allows:

- different types (instead of guessing through the connection URL)
- sequencer counts (not used yet but will be by #2455)
- extensible configs (e.g. to configure Kafka in a more granular way,
  not wired up yet)
- future extensions (since we use a message now instead of a single
  string)

**BREAKING: This requires changes for deployed systems / existing DBs!**
2021-09-02 16:41:35 +02:00
kodiakhq[bot] b3d04b3e26
Merge branch 'main' into cn/server-startup-delete 2021-09-01 13:32:34 +00:00
Carol (Nichols || Goulding) c89ad70d07
test: Ensure we deleted some tombstone file
To guard against forgetting to change this test if we change the tombstone file.

Co-authored-by: Marco Neumann <marco@crepererum.net>
2021-09-01 09:04:25 -04:00
kodiakhq[bot] e183ecb3e7
Merge branch 'main' into cn/list-but-not-deleted-databases 2021-09-01 13:03:09 +00:00
Marco Neumann 06833110ab test: allow creation of less complex parquet chunks 2021-09-01 11:26:05 +02:00
Nga Tran a4183de411 feat: more progress on the delete flow from grpc API to catalog chunks 2021-08-31 17:42:07 -04:00
Marco Neumann 79ad48ac3a chore: rename "labels" to "attributes" 2021-08-31 11:31:15 +02:00
Nga Tran f962d0ef2e feat: first step to add delete_predicate into chunk catalog 2021-08-30 17:16:08 -04:00
Carol (Nichols || Goulding) 396bc6a3ad test: Database startup error when there are multiple active generations
Fixes #2196.
2021-08-30 15:49:12 -04:00
Carol (Nichols || Goulding) c4693a08a5 fix: Remove an unnecessary clone 2021-08-30 14:14:23 -04:00
Carol (Nichols || Goulding) e67624dd37 fix: Assert on which error getting a deleted database returns 2021-08-30 11:29:25 -04:00
Carol (Nichols || Goulding) 442a26bb99 fix: Remove some unneeded snafu-related allocations 2021-08-30 10:49:20 -04:00
Carol (Nichols || Goulding) 01103002f4 fix: Return an error if we can't find an iox object store to write a tombstone file in 2021-08-30 10:42:46 -04:00
Carol (Nichols || Goulding) d688678464 feat: Add an iox_object_store API for writing the tombstone file
Connects to #1871.
2021-08-30 10:42:45 -04:00
Marco Neumann 96b0026203 fix: make "persist partition" a bit more stable
- add longer wait times to tests
- exclude chunks that have active lifecycle actions early (instead of
  failing the whole set)
- properly catch the "no chunks" case

Fixes #2434.
2021-08-30 13:11:12 +02:00
Andrew Lamb 2f49e47a23
feat: return DatabaseRules for ListDatabases request (#2431) 2021-08-28 10:53:24 +00:00
Andrew Lamb 779b027271
feat: Store only the database rules sent by the client (do not store default values) (#2430)
* feat: add omit_default to protobuf definition

* feat: Persist only the client provided rules

* fix: Remove race conditions

* fix: merge conflict

* refactor: do not use macro

* refactor: restore use of glob import

* fix: review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-28 10:26:32 +00:00
Nga Tran 499af57299 chore: merge main to branch and resolve conflicts 2021-08-27 17:51:07 -04:00
Nga Tran b79eaa34d1 refactor: address review comments 2021-08-27 15:53:27 -04:00
kodiakhq[bot] 400ee89e70
Merge branch 'main' into crepererum/refactor_catalog_crate 2021-08-27 14:16:14 +00:00
Marco Neumann a2efe3299d refactor: restructure catalog code in `parquet_file`
No functional change (except for slightly changing error messages). This
will make it easier to add more functionality.
2021-08-27 15:06:31 +02:00
Raphael Taylor-Davies fcec394a28
feat: connect up new metrics (#2428) 2021-08-27 12:55:35 +00:00
Edd Robinson 6c7f8d6630 feat: add delete to crate Read Buffer API 2021-08-27 12:30:20 +01:00
Nga Tran bcd39e225c feat: Management API for delete 2021-08-26 17:31:21 -04:00
Raphael Taylor-Davies e3e801d29a
feat: propagate span context into storage RPC queries (#2407)
* feat: propagate span context into storage RPC queries

* refactor: create ExecutionContextProvider trait

* chore: cleanup imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
Carol (Nichols || Goulding) 7cf7fb02ed refactor: Rename database ObjectStore state types to DatabaseObjectStore 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 6d0959fbc3 fix: Move IOx object store creation logic into Database state machine 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 199d212b18 refactor: Move find-or-create IoxObjectStore logic into tests
This is the only place this logic is used; it's not appropriate for
production usage as we only ever want to either find and error or create
and error in real life.
2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) c7eceac8a3 refactor: Have server determine database generation from object store 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 5e1b57de9a refactor: Borrow arcs instead of as_ref 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) cee2f21d47 feat: Add a find_or_create object store function for tests 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) 18ba3b5c59 feat: Create database directories with a generation ID 2021-08-26 09:14:22 -04:00
Marco Neumann 026202a05c fix: correctly account for parquet metadata size
We need to hold the parquet metadata in memory so that we're able to
create catalog checkpoints. We used to do that by holding the decoded
structure (provided by the upstream `parquet` crate) in memory and
serializing that data on demand to Apache Thrift.

There are two drawbacks:

1. We did not account for the memory usage of the decoded structures (or
   at least not fully).
2. We actually don't need the decoded data in-memory, since for the
   checkpoint creation we only need to write the serialized data.

So this PR changes our wrapper so it holds the serialized data which is
then only decoded when it's really necessary. Since the serialized data
is a simple byte vector, we can also easily account for the size.

Note that this makes the accounted size of parquet chunks larger.
However, this data was always there, we just ignored it up until now. If
the size of the parquet metadata really becomes an issue, we could trade
some CPU time for memory by compressing it.
2021-08-26 13:24:32 +02:00
kodiakhq[bot] b1ecf1bfed
Merge branch 'main' into crepererum/job_start_time_in_system_table 2021-08-26 08:04:10 +00:00
Andrew Lamb ddf6c6362e
chore: update DataFusion again (#2411)
* chore: update datafusion ref

* chore: run cargo update

* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
Marco Neumann 558aa54aa3 feat: add start time to `operations` system table 2021-08-26 10:00:29 +02:00
Edd Robinson 69329b0b38
Merge branch 'main' into er/refactor/read_buffer/rle_entries 2021-08-25 12:08:44 +01:00
Edd Robinson 11e88877f4 fix: correct size estimation of RLE encoding 2021-08-25 12:03:04 +01:00
Edd Robinson f3c57c47fa
Merge branch 'main' into er/refactor/read_buffer/table_arg 2021-08-25 10:30:12 +01:00
kodiakhq[bot] c98723e3b3
Merge branch 'main' into crepererum/rub_shrink_rle 2021-08-25 08:58:22 +00:00
Marco Neumann 2ad9843e5f feat: make `RLE` a bit smaller by capacity-based allocation
For some demo data this reduced the overall chunk size from

195049367 bytes
to
191088095 bytes
2021-08-25 10:22:43 +02:00
kodiakhq[bot] 5d97acb2f3
Merge branch 'main' into crepererum/issue2372 2021-08-25 07:08:15 +00:00
Edd Robinson 5648817285 refactor: remove redundant argument 2021-08-24 22:26:17 +01:00
Raphael Taylor-Davies f7792aafe6
feat: query tracing (#2273) (#2391)
* feat: query tracing (#2273)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 17:35:59 +00:00
Marco Neumann 363d202202 feat: stop application executor in one dedicated place 2021-08-24 14:46:36 +02:00
Raphael Taylor-Davies a6c9cc2bf2
refactor: rework exec module (#2384)
* refactor: rework exec module

* chore: update docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 08:39:54 +00:00
Andrew Lamb 35cf560c9f
fix: do not error if partition has no chunks (#2383)
* fix: do not error if partition has no chunks

* fix: do not panic

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 17:33:54 +00:00
Raphael Taylor-Davies 0946ffe916
refactor: reuse IOxExecutionContext (#2373)
* refactor: reuse IOxExecutionContext

* fix: orphaned comment

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 15:47:15 +00:00
kodiakhq[bot] ec0152714e
Merge branch 'main' into catalog-test-determinism 2021-08-19 17:53:04 +00:00
Raphael Taylor-Davies b0e8b75a8a fix: TestCatalogState unique chunk ID 2021-08-19 17:19:12 +01:00
kodiakhq[bot] 47431148d5
Merge branch 'main' into er/refactor/read_buffer/bitmap_size 2021-08-18 21:20:13 +00:00
Raphael Taylor-Davies e81b82c0a4
feat: split db worker loop (#2337)
* feat: split db worker loop

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-18 17:33:13 +00:00
Carol (Nichols || Goulding) 61263c8774 feat: Add a debugging-suitable way to get the object storage path of a database 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) fbf3ceb1e2 refactor: Extract listing of all databases into iox_object_store 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) f782e77dcc test: Use the iox_object_store when testing a database's object store files 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) ff89398132 fix: Remove DatabaseConfig store_path field
This is now managed by the iox_object_store crate.
2021-08-18 11:32:39 -04:00
Jake Goulding 63111d9d9a refactor: Move the database rules functionality to iox_object_store 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) 4447f1e22c test: Adjust parquet file sizes; only storing relative paths now 2021-08-18 11:32:39 -04:00
Carol (Nichols || Goulding) 6d5cb9c117 refactor: Extract a ParquetFilePath to handle paths to parquet files in a db's object store 2021-08-18 11:32:39 -04:00
Edd Robinson b9f09fce49 feat: improve bitset size estimation 2021-08-17 22:54:22 +01:00
Edd Robinson 1daa30cc7d fix: include enum in sizing 2021-08-17 22:54:22 +01:00
kodiakhq[bot] 006d4db0c1
Merge branch 'main' into er/feat/read_buffer/row_group_metrics 2021-08-17 21:44:01 +00:00
Andrew Lamb 6b2ac77b8b
docs: Add some doc comments about sortedness in catalog Partition chunks (#2323)
* docs: Note on iteration order in catalog::Partition

* test: add tests for chunk_id order
2021-08-17 15:17:12 +00:00
Edd Robinson 211d814c8c
Merge branch 'main' into er/feat/read_buffer/row_group_metrics 2021-08-17 13:00:44 +01:00
Edd Robinson c795fc7f9d feat: add metric to track total row groups 2021-08-17 12:55:11 +01:00
Marco Neumann 55e9a3beda docs: better explain locking 2021-08-17 10:14:20 +02:00
Marco Neumann e540798eed test: drop two chunks in `drop_partition` test 2021-08-17 10:07:26 +02:00
Marco Neumann 5b0c3728b6 fix: ensure that code invariants hold 2021-08-17 10:03:28 +02:00
Marco Neumann 32cf23100d docs: explain why `drop_partition` does not deadlock 2021-08-17 09:52:30 +02:00
Marco Neumann 4a5dfc895a docs: clarify that `Partition::chunks` returns an ordered iterator 2021-08-17 09:52:07 +02:00
Marco Neumann 177d5fbb35 docs: fix typo in `Step::Drop`
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-08-17 09:44:35 +02:00
Marco Neumann 9454e06d61 test: test interaction of dropping partitions and replay 2021-08-17 09:44:35 +02:00
Marco Neumann 77892a0998 feat: add API to drop entire partitions 2021-08-17 09:44:35 +02:00
Ning Sun c012e996ab
refactor: remove display methods, use fmt::Display instead. (#2272)
* refactor: remove display methods, use fmt::Display instead.

Signed-off-by: Ning Sun <sunng@protonmail.com>

* refactor: update a few calls from .display to .to_string()

* fix: consistently use `Path` rather than occasionally `DirsAndFileName`

* fix: fixup for merge conflicts

* fix: update test

* fix: Catch another case or two

* fix: fmt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-16 18:00:22 +00:00
Marco Neumann 5caa2ad8ec fix: typo 2021-08-16 18:31:45 +02:00
Marco Neumann 114a9004b3 test: restore `write_buffer_errors_propagate`
This was removed in #2203 due to insufficient mocking capabilities.
2021-08-16 18:31:43 +02:00
Marco Neumann 825c19d726 fix: disallow dropping unpersisted chunks from persisted DB
It doesn't play well w/ replay at the moment since we would forget which
sequence numbers we've already seen.

Fixes #2291.
2021-08-16 13:21:30 +02:00
Edd Robinson 13aaa1f105
Merge branch 'main' into er/feat/read_buffer_metrics 2021-08-13 15:02:03 +01:00
kodiakhq[bot] d506da2a1a
Merge branch 'main' into cn/extract-iox-object-store 2021-08-13 13:45:35 +00:00
Edd Robinson efde3a8f5a feat: expose required bytes metric 2021-08-13 11:57:46 +01:00
Edd Robinson 311d36d776 refactor: include capacity in Read Buffer chunk size 2021-08-13 11:57:46 +01:00
Edd Robinson fa8da19c45 refactor: expose enc size API into column 2021-08-13 11:57:46 +01:00
kodiakhq[bot] 1307450c78
Merge branch 'main' into crepererum/replay_skip_while_in_error_state_part_1b 2021-08-13 07:03:25 +00:00
Carol (Nichols || Goulding) 564238ad8c refactor: Organize uses 2021-08-12 15:05:32 -04:00
Carol (Nichols || Goulding) ae6b0e669b refactor: Extract a database persister type that wraps object store
Connects to #2193.
2021-08-12 15:05:32 -04:00
Edd Robinson c68bbb6309 test: update test 2021-08-12 15:05:47 +01:00
Raphael Taylor-Davies 2c4384625a
feat: shutdown Database and Server on drop (#2241)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-12 12:37:47 +00:00
Marco Neumann e8bc7ee909 feat: server functionality to recover DB by skipping replay 2021-08-12 14:18:38 +02:00
kodiakhq[bot] 7956729ffa
Merge branch 'main' into crepererum/improve_write_buffer_mocking 2021-08-12 10:00:19 +00:00
Marco Neumann 1eb6e1f7f2 refactor: write buffer mocking is only required for tests 2021-08-12 11:46:24 +02:00
kodiakhq[bot] c46c2a35fa
Merge branch 'main' into crepererum/database_creation_code_move 2021-08-12 09:30:33 +00:00
Andrew Lamb 34a1c1674f
chore: remove unused dependency (#2247)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-12 08:57:12 +00:00
Marco Neumann a5c74f2798 feat: ability to inject mocked write buffers into server/database 2021-08-12 10:46:16 +02:00
Marco Neumann 7d105e9229 docs: fix warnings 2021-08-12 09:30:54 +02:00
Dom 3de6b44e23
build: use new rustdoc lint name (#2261)
* fix: nocache feature code rot

The MBChunk::snapshot code when using the "nocache" option no longer
compiles - this commit updates it to match the not(nocache) code.

* build: use updated broken_intra_doc_links name

The broken_intra_doc_links lint was renamed
rustdoc::broken_intra_doc_links

https://doc.rust-lang.org/rustdoc/lints.html
2021-08-11 19:48:51 +00:00
Marco Neumann 794a9c039d refactor: move database creation code around
Now all the code that is required to create a new database lives under
`server::database`, so it can easily be used for tests that don't
involve a server.
2021-08-11 18:44:55 +02:00
Marco Neumann 65b1ca2071 fix: also seed persistence windows when skipping replay 2021-08-11 10:27:52 +02:00
Marco Neumann 2082042626 test: do not wipe-on-error during tests 2021-08-11 10:27:51 +02:00
Marco Neumann 2eaf486eac fix: always remember max seen sequ. numbers during replay
Do not forget max seen sequence numbers for partition-sequencer
combinations that can be skipped during replay.

Fixes #2215.
2021-08-11 10:26:12 +02:00
Raphael Taylor-Davies 2344c28f4e
feat: drain database jobs on shutdown (#2239)
* feat: drain database jobs on shutdown

* chore: fmt

* chore: review feedback

* chore: use join() not member directly

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 16:47:37 +00:00
Raphael Taylor-Davies 29ac62c6f8
fix: reduce flakiness of lock_tracker_metrics test (#2238)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 11:47:08 +00:00
Marco Neumann 4cf9244457 test: restore test assertions 2021-08-10 11:29:48 +02:00
Marco Neumann cd414f28ef fix: incorrect speculation of post-persist sequence ranges
This fixes an edge case where the speculated sequence ranges that can be
obtained from flush handles do not account for overlapping windows. The
symptom was that the resulting partition checkpoint marked sequence
numbers as unpersisted that were already persisted.

Fixes #2206.
2021-08-10 11:29:48 +02:00
Raphael Taylor-Davies cd5f4e1755
feat: background worker panic handling (#2091) (#2234)
* feat: worker panic handling (#2091)

* chore: add test comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 09:17:56 +00:00
Raphael Taylor-Davies 564819d24f
feat: Server own background worker (#2232)
* feat: Server own background worker

* chore: fix shutdown

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-09 18:01:48 +00:00
Marco Neumann 4dcee10d1e refactor: do not construct replay plan when skipping replay
Up until now we only skipped the execution of the replay plan, not its
construction. The replay plan construction has some bugs left, so let's
move this part behind the toggle as well.
2021-08-09 15:23:39 +02:00
Raphael Taylor-Davies c11eb25d4e
feat: remove create_database_lock (#2227)
* feat: remove create_database_lock

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-09 13:22:11 +00:00
kodiakhq[bot] bf15e50ce7
Merge branch 'main' into crepererum/fix_checkpoint_ordering3 2021-08-09 12:27:20 +00:00
Raphael Taylor-Davies 54a8fff328
feat: database initialization logging (#2228)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-09 12:13:33 +00:00
Andrew Lamb 559db4529d
refactor: Move DatabaseStore out of query crate (#2219)
* refactor: Move DatabaseStore out of query crate

* fix: doc links
2021-08-09 12:06:25 +00:00
Marco Neumann 92334a3747 docs: explain test intent 2021-08-09 13:26:31 +02:00
Marco Neumann ae93a1cb89 test: adjust replay tests 2021-08-09 10:54:23 +02:00
Marco Neumann 950286e5b7 feat: make replay planning work w/ unordered checkpoints 2021-08-09 10:54:23 +02:00
Marco Neumann 57bbae7e34 refactor: persistence windows row counts are non-zero 2021-08-09 10:33:24 +02:00
Raphael Taylor-Davies c957d8154f
feat: blocking Freezable (#2224)
* feat: blocking Freezable

* chore: test
2021-08-08 19:26:11 +00:00
Raphael Taylor-Davies 1f450ef371
feat: add Database abstraction (#2186) (#2203)
* feat: add Database abstraction

* chore: minor tweaks

* chore: remove redundant test fixture restart

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-08 17:14:23 +00:00
Andrew Lamb d41b44d312
feat: use zstd compression when writing parquet files (#2218)
* feat: use ZSTD when writing parquet files

* fix: test
2021-08-06 18:45:55 +00:00
Andrew Lamb 5d525cdc70
docs: Add note about what uses `ApplicationState` (#2216)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-06 14:44:06 +00:00
Marco Neumann 4c79e3548e test: do not rely on too many edge cases 2021-08-06 10:24:26 +02:00
Marco Neumann 882f89cecf fix: only warn when partition ckpt and DB ckpt mins are out-of-sync
There are currently a few bugs and semi-understood edge cases that can
lead to this case. So instead of bailing out, just issue a warning.
2021-08-06 09:48:26 +02:00
Marco Neumann 4ffdb3d95d test: drop-unpersisted is not required to trigger that bug 2021-08-06 09:48:26 +02:00
Marco Neumann bde2b2b5df refactor: `Tick` -> `MakeWritesPersistable` 2021-08-05 14:21:36 +02:00
Marco Neumann 548145a70e
docs: state that `background_worker_now_override` is for testing only
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-08-05 14:08:24 +02:00
Marco Neumann 015d858f88 test: add failing regression test for #2185
We need a partition that is partially persisted for this.
This requires some rework for the time handling in `Db` to make it
mockable.

The remaining bits are test framework extensions.
2021-08-05 11:44:44 +02:00
Raphael Taylor-Davies dd9beab166
feat: error database if no rules (#2187) 2021-08-04 11:58:59 +00:00
Marco Neumann 60aee3e70c refactor: avoid copying a sequence 2021-08-04 13:23:30 +02:00
Marco Neumann 1b2e331ec1 test: extend replay tests a bit 2021-08-04 13:23:30 +02:00
Marco Neumann af1edcdcbb fix: docstrings
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-08-04 13:23:30 +02:00
Marco Neumann 39f30fd0b6 test: make test queries easier to understand 2021-08-04 13:23:30 +02:00
Marco Neumann 567ef7e991 test: expand replay tests a bit 2021-08-04 13:23:30 +02:00
Marco Neumann b868cd160e docs: fix code comment about sequence ranges 2021-08-04 13:23:30 +02:00
Marco Neumann ed70b73fd8 test: deterministic concurrency for `TestDb` 2021-08-04 13:23:30 +02:00
Marco Neumann a2bc97b923 feat: prune sequence numbers during replay
This only prunes entire sequence numbers, it does not (yet!) prune
individual rows for sequence numbers that are partially persisted.
2021-08-04 13:23:30 +02:00
Andrew Lamb 7a18087044
feat: Log messages during database initialization (#2180)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-04 11:04:41 +00:00
Marco Neumann 65991270e4 refactor: rename handle and shutdown to link them to background worker 2021-08-04 12:04:47 +02:00
Marco Neumann c2faf0876b fix: fix typo and explain policy storage 2021-08-04 11:55:31 +02:00
Marco Neumann 42953b0561 fix: increase max wait time for compaction to 60s 2021-08-04 11:51:07 +02:00
Marco Neumann 164c6e3743 feat: improve hard buffer logging and use that as test assertions 2021-08-04 11:49:05 +02:00
Marco Neumann 657f469317 test: fix `seek_to_end_works` 2021-08-04 11:33:47 +02:00
Marco Neumann 6ce1984d75 test: improve hard buffer limit tests 2021-08-04 11:33:47 +02:00
Marco Neumann 3ac88ffc49 fix: hard buffer limits around write buffer consumption
- when reading entries from write buffer during normal playback, do not
  throw away entries when hitting the hard buffer limit. instead wait
  for compaction to sort it out
- during playback, wait for compaction
2021-08-04 11:33:47 +02:00
Marco Neumann 9ea04a42ff refactor: start background worker before performing replay
This enables compaction during replay.
2021-08-04 11:33:47 +02:00
Marco Neumann 0fe8eda89e refactor: move lifecycle policy into Db struct 2021-08-04 11:33:47 +02:00
Jacob Marble 98d4c9fca1
feat: switch protobuf write service to canonical definition (#2182)
* feat: switch protobuf write service to canonical definition

The protobuf definition used for the proto write endpoint was a WIP. Now
that a canonical definition exists at
https://github.com/influxdata/influxdb-pb-data-protocol/ we can switch
to that.

* chore: lint etc

* chore: fix rustdoc nit in proto definition comment
2021-08-04 00:16:49 +00:00
Raphael Taylor-Davies ffb36cd50c
refactor: extract ApplicationState from Server (#2167)
* refactor: extract Application from Server

* chore: review feedback
2021-08-03 09:36:55 +00:00
Marco Neumann f504d6002a docs: error handling for `seek_to_end` 2021-08-03 09:40:40 +02:00
Marco Neumann c912e91c95 feat: add flag to skip replay
Closes #2169.
2021-08-02 18:14:19 +02:00
Carol (Nichols || Goulding) 9d15798288 fix: Address or allow Clippy warnings new with Rust 1.54 2021-07-30 09:59:59 -04:00
kodiakhq[bot] 545222303f
Merge branch 'main' into cn/cc-only 2021-07-29 17:18:16 +00:00
Carol (Nichols || Goulding) 79a04f861f refactor: Take chunk and write time when creating a new MUB chunk
This makes it more consistent with the API of creating a new read buffer
chunk and a new object store chunk.
2021-07-29 10:11:50 -04:00
Raphael Taylor-Davies 431774c8b7
refactor: extract resolver from server::Config (#2143)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-29 13:14:58 +00:00
Raphael Taylor-Davies 336ff30484
refactor: make server fields private (#2144)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-29 13:06:05 +00:00
Raphael Taylor-Davies df3b162475
refactor: move connection manager to separate module (#2142)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-29 12:58:15 +00:00
Carol (Nichols || Goulding) ad0a9549de fix: Avoid an unnecessary parsing of iox metadata
In one case where ParquetChunk::new was being called, the calling code
had just parsed the IoxMetadata too. In the other case, the calling code
had just *created* the IoxMetadata being parsed. In both cases, this
re-parsing wasn't actually needed; the two bits of info
ParquetChunk::new needs can be easily passed in.
2021-07-28 14:25:56 -04:00
Carol (Nichols || Goulding) af7866a638 refactor: Remove first/last write times from ParquetFile chunks 2021-07-28 14:12:36 -04:00
Marco Neumann 9371f781fe test: add "missing entry" replay test 2021-07-28 17:34:02 +02:00
Marco Neumann 04e797c706 refactor: pass sequencer numbers directly to DB checkpoint
First of all using a partition checkpoint as some kind of intermediate
representation was kinda a hack because partition checkpoints should
only be created for to-be-persisted partitions, not for the others.
API-wise it should only be possible to construct a partition checkpoint
from a flush handle.

Also we were only able to construct partition checkpoints for partitions
that had unpersisted data, otherwise there was no sane way to fill the
`min_unpersisted_timestamp`. We must however scan all partitions no
matter if there is unpersisted data so that we can determine the maximum
seen sequence numbers. This was caught by a replay test resulting in a
catalog state where the last database checkpoint had lower maximum seen
sequence numbers than some partition checkpoint, bailing out with an
error.

So overall it turns out that passing the sequencer numbers directly
instead of wrapping them into a partition checkpoint is the better
implementation.
2021-07-28 17:28:34 +02:00
Marco Neumann a0764cbafd test: add failing replay test 2021-07-28 17:28:34 +02:00
Marco Neumann 29ddc36154 docs: state the reason for some replay tests 2021-07-28 17:28:34 +02:00
Marco Neumann ca90e92ecc fix: replay tests should not fail when awaiting on query results 2021-07-28 17:28:34 +02:00
Carol (Nichols || Goulding) 11b7755325 refactor: Remove first/last write times from RUB chunks 2021-07-28 11:22:22 -04:00
Carol (Nichols || Goulding) 4689b5e4e5 refactor: Remove first/last write times from MUB chunks 2021-07-28 11:02:57 -04:00
Carol (Nichols || Goulding) 0f5398c4b9 refactor: Store first/last write on DbChunk snapshots 2021-07-28 11:02:56 -04:00
Carol (Nichols || Goulding) bc2ec3338f refactor: Move MBChunk creation inside CatalogChunk new_open 2021-07-28 11:02:56 -04:00
Carol (Nichols || Goulding) b5195571fa refactor: Move MBChunk creation inside partition create_open_chunk 2021-07-28 11:02:56 -04:00
kodiakhq[bot] 7b73190d79
Merge branch 'main' into crepererum/ingest_wallclock 2021-07-28 13:49:08 +00:00
Marco Neumann 0fcec6b742 refactor: move ingest timestamp from sequence to sequenced entry 2021-07-28 15:40:35 +02:00
Raphael Taylor-Davies 754d647c06
feat: enable row timestamp metrics with environment variable (#2135)
* feat: enable row timestamp metrics with environment variable

* chore: fix test

* chore: fix typo

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-28 13:39:56 +00:00
Carol (Nichols || Goulding) 8add00e761 feat: Make CatalogChunk first/last write times required
Connects to #1927.
2021-07-28 09:22:06 -04:00
Carol (Nichols || Goulding) 09e48018a0 refactor: Move ts_to_timestamp fn into the only file it's used in 2021-07-28 09:22:06 -04:00
Carol (Nichols || Goulding) 7c9a21632b refactor: Organize uses 2021-07-28 09:22:04 -04:00
Marco Neumann 7b1301851a feat: metric for ingest wall-clock time 2021-07-28 14:41:18 +02:00
Marco Neumann e736bc6953 feat: add ingest timestamp to `Sequence`
This allows us to track wall-clock ingest time for entries that we
receive via write buffer (e.g. Kafka).
2021-07-28 14:41:18 +02:00
Marko Mikulicic ec0804900a
feat(iox): Quick&Dirty KafkaProducer sink implementation
RoutingRules such as RoutingConfig and ShardConfig use a sink to decide where to write
the entries.

The write buffer is currently implemented in the `db` and is accessed by using the `write_local_entry`
code path. This PR simply invokes that legacy code path whenever a "kafka" sink is selected.

This allows us immediately to benefit from the ability of the ShardingConfig to select or reject
tables and send some to kafka, some to devnull.

This PR does not allow us yet to split an input batch into multiple shards and send each
to a different kafka topic. For that, we'll need to pull the write buffer code path out of
the `db` and do something similar to a ConnectionManager but for write buffers. TODO
2021-07-28 10:13:22 +02:00
Andrew Lamb 3ea84c6be4
feat: expose null_counts in system.chunk_columns (#2105)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-27 11:05:23 +00:00
kodiakhq[bot] 5551dd3a87
Merge branch 'main' into devnull 2021-07-27 09:57:16 +00:00
kodiakhq[bot] 119b913fa3
Merge branch 'main' into crepererum/improve_replay_tests 2021-07-27 07:27:58 +00:00
Andrew Lamb 5fb3e00f2a
fix: Properly record total_count and null_count in statistics (#2103)
* fix: Properly record total_count and null_count in statistics

* fix: fix statistics calculation in mutable_buffer

* refactor: expose null counts in read_buffer

* refactor: expose null_count in parquet_file

* fix: update server crate tests

* fix: update query_tests tests

* docs: tweak comments

* refactor: Use storage_stats rather than adding `null_count`

* refactor: rename test data field for clarity

* fix: fixup merge conflicts

* refactor: rename initial_non_null_count to initial_total_count

* refactor: calculate null_count as row_count - to_add
2021-07-26 18:13:36 +00:00
Marko Mikulicic 094945a72d
feat: Add '/dev/null' sink 2021-07-26 19:19:11 +02:00
Marco Neumann d7e0b03064 refactor: use `drop` instead of `Option` 2021-07-26 17:43:03 +02:00
Marco Neumann 2d5a095d2d refactor: rename `ActionOrTest` to `Step` 2021-07-26 17:34:13 +02:00
Marco Neumann 5787fbdb21 refactor: rename framework tests 2021-07-26 17:32:46 +02:00
Marco Neumann aa61eb2732 refactor: improve replay test naming and add more docs 2021-07-26 17:31:13 +02:00
Marco Neumann 43cb148566 fix: docstring
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-07-26 17:14:01 +02:00
Marco Neumann 43f29422f8 refactor: isolate replay code and improve tests
This puts all the replay logic under `server::db::replay` as well as its
error variants and tests.

The tests are reworked using a more generic
test framework which allows us to specify a number of steps instead of
filling pre-defined ones with variables. Each step is either an action
(e.g. restart DB, perform replay, ingest data into the write buffer
state) or a check (e.g. assert that these partitions exist, await until
the background worker has ingested these partitions). The entire
framework is kept generic so it should be easy to create more checks and
actions in the future. The resulting tests are more verbose, but (at
least in my opinion) easier to follow along since the reader can see
what's happening at which step and does not jump back and forth between
the test config and the "driver" that uses the config.
2021-07-26 17:14:01 +02:00
kodiakhq[bot] 009c77d864
Merge branch 'main' into cn/parquet-first-last 2021-07-26 14:59:54 +00:00
Raphael Taylor-Davies 0b88deea43
refactor: don't pass sequence to MUB (#2107)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-26 14:40:39 +00:00
Marko Mikulicic e5ee252876
feat: Add kafka sink variant 2021-07-26 11:08:02 +02:00
Marko Mikulicic d58a3ccbc7
refactor: Add sink to routing config
This deprecates the "target" field in the RoutingConfig and replaces it with the "sink"
field, which has a variant that accepts a node group.

This commit is backward compatible in that it will accept existing configs.
The configs will roundtrip to the new format though (i.e. `database get` will render
the sink field).
2021-07-26 11:08:01 +02:00
Marko Mikulicic 16a82ba350
refactor: Generalize sinks: Rename Shard to Sink
The ShardConfig applies matchers that resolve to a shard number.
The config then applies a mapping between shard numbers to targets.
The type that encapsulated the target that a shard points to was also called
a "Shard". This is confusing. This commit changes it to "Sink", i.e. a destination
for traffic to go to. Subsequent commits will expand the definition of a Sink to
encompass different kinds of sinks (like kafka write buffer, "devnull", ...)

This changes only the name of the protobuf message and the related rust types,
it doesn't change any name of the json-rendered protobuf configs.
2021-07-26 11:08:00 +02:00
Raphael Taylor-Davies c595039c81
feat: add row timestamp metrics (#2101)
* feat: add row timestamp metrics

* chore: review feedback
2021-07-23 19:17:11 +00:00
Jake Goulding d928bc84e6 feat: Thread time_of_{first,last}_write through Parquet metadata 2021-07-23 14:07:35 -04:00
Raphael Taylor-Davies 446af5eb15
fix: consistent write timestamps (#2104)
* fix: consistent write timestamps

* chore: fix benchmarks
2021-07-23 18:04:15 +00:00
Carol (Nichols || Goulding) 3c794153dd refactor: Organize uses 2021-07-23 13:48:15 -04:00
Carol (Nichols || Goulding) 7de946c534 fix: ChunkStage::WrittenToObjectStore is now called ChunkStage::Persisted 2021-07-23 13:11:42 -04:00
Raphael Taylor-Davies 844a025c7c
feat: drop based on LRU (#2075) (#2092)
* feat: drop based on LRU (#2075)

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-23 08:31:28 +00:00
Marco Neumann 53b00ec4e0 test: split replay tests 2021-07-23 10:17:02 +02:00
Marco Neumann be1bc7025c refactor: use a single seek loop during replay 2021-07-23 10:05:11 +02:00
Marco Neumann ace247d5c2 feat: add replay logging 2021-07-23 10:03:02 +02:00
Marco Neumann 0c89930b7c feat: check that replay plan and write buffer are in-sync 2021-07-23 09:39:46 +02:00
Marco Neumann db0f501b02 feat: implement naive replay 2021-07-23 09:24:04 +02:00
Marco Neumann 6ef3680554 feat: collect replay plan during catalog loading 2021-07-23 09:23:06 +02:00
kodiakhq[bot] 71f3f1aba2
Merge branch 'main' into cn/refactorings 2021-07-22 19:44:18 +00:00
Andrew Lamb 01c79f1a1a
fix: Print all timestamps using RFC3339 format (#2098)
* fix: Use IOx pretty printer rather than arrow pretty printer

* chore: update tests in the query crate

* chore: update influxdb_iox tests

* chore: Update end to end tests

* chore: update query_tests

* chore: update mutable_buffer tests

* refactor: update parquet_file tests

* refactor: update db tests

* chore: update kafka integration test output

* fix: merge conflict
2021-07-22 19:04:52 +00:00
Raphael Taylor-Davies 20d06e3225
feat: include more information in system.operations table (#2097)
* feat: include more information in system.operations table

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 17:16:09 +00:00
Carol (Nichols || Goulding) 14cb2a6bef test: Add assertions for first/last write times as chunks move 2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding) 37f24ebfc7 feat: Record first/last write times for creation of read_buffer::Chunk 2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding) 0c44179aa9 feat: Add first/last write time on DbChunk
To eventually be used in collect_rub
2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding) 8d1d877196 feat: Record first/last write times for RUB chunks 2021-07-22 11:35:22 -04:00
Carol (Nichols || Goulding) 28fc01ecee test: Make test failure messages easier to read 2021-07-22 11:15:19 -04:00
Carol (Nichols || Goulding) 6feea3b2d5 feat: Require at least one RecordBatch to create a read_buffer::Chunk::new
In the signature only for the moment.
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding) d347750366 refactor: Make collect_rub create the RBChunk
Which gets rid of the need for new_rub_chunk.

This will enable creating RBChunks that are guaranteed to have data.
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding) 0a724878e6 refactor: Organize uses 2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding) 7371b0aabf refactor: Use existing new_rub_chunk function that has the same code 2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding) eadcb3265a refactor: Use some TryStreamExt adapters in collect_rub 2021-07-22 11:15:18 -04:00
Raphael Taylor-Davies 38e375d11a
feat: add chunk storage metrics (#2069)
* feat: add chunk storage metrics

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 15:13:09 +00:00
Raphael Taylor-Davies 8c974beba0
feat: add access timestamps to CatalogChunk (#2075) (#2081)
* feat: add access timestamps to CatalogChunk (#2075)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-22 12:19:30 +00:00
kodiakhq[bot] 8c4f5cb237
Merge branch 'main' into crepererum/fix_db_checkpoints 2021-07-21 16:46:13 +00:00
Marco Neumann cddf94653c refactor: use `write_buffer` subsystem for ingest metrics 2021-07-21 15:07:59 +02:00
Marco Neumann fd00206fbb refactor: increase watermark update frequency to once per 10s 2021-07-21 15:02:48 +02:00
Marco Neumann 2f1efcf517 docs: clarify difference 2021-07-21 15:00:53 +02:00
Marco Neumann 4d5f209030 docs: do not repeat unix that often 2021-07-21 14:59:07 +02:00
Marco Neumann ec866de193 fix: collect checkpoint data from all tables 2021-07-21 14:48:29 +02:00
Marco Neumann 7d597d1d5c refactor: make ingest metrics easier to understand 2021-07-21 13:57:53 +02:00
Marco Neumann fb931bb1ca feat: write buffer ingestion metrics 2021-07-21 11:59:52 +02:00
Raphael Taylor-Davies 091837420f
feat: add PersistenceWindows system table (#2030) (#2062)
* feat: add PersistenceWindows system table (#2030)

* chore: update log

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 13:10:57 +00:00
Raphael Taylor-Davies e4d2c51e8b
fix: update PersistenceWindows on rules update (#2018) (#2060)
* fix: update PersistenceWindows on rules update (#2018)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 12:44:47 +00:00
kodiakhq[bot] 58dd7e9532
Merge branch 'main' into crepererum/writer_buffer_seek 2021-07-20 12:29:18 +00:00
Raphael Taylor-Davies cf8a60252d
refactor: split system_tables module into smaller modules (#2061)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-20 12:19:20 +00:00
Marco Neumann ec7ebdff29 refactor: use lifetimes to ensure single stream / no seek while streaming 2021-07-20 13:52:33 +02:00
Marco Neumann b0663a0337 feat: disallow multiple write buffer streams and seeking while streams
Multiple streams will mess up ordering. Seeking while streaming is
likely a bug and should not work.
2021-07-20 12:35:20 +02:00
Raphael Taylor-Davies 767c2a6fe1
refactor: explicit server startup state machine (#2040)
* refactor: explicit server startup state machine

* chore: update `ServerStage` docs

* chore: further docs

* chore: more logging

* chore: format
2021-07-20 10:11:18 +00:00
kodiakhq[bot] 5bf68c4a57
Merge branch 'main' into jg/snafu-driveby 2021-07-19 20:20:30 +00:00
Raphael Taylor-Davies 1c8c227668
refactor: push database rules update into Db (#2052)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-19 16:05:21 +00:00
kodiakhq[bot] 1d1ac12522
Merge branch 'main' into crepererum/write_buffer_multiple_streams 2021-07-19 15:50:42 +00:00
Andrew Lamb 4da8a16c18
chore: update to arrow 5.0 and master datafusion (#2049)
* chore: update to arrow 5.0 and master datafusion

* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Raphael Taylor-Davies e2a23c7ac3
fix: persist deadlock (#2045) (#2046) 2021-07-19 11:52:48 +00:00
Marco Neumann 592424c896 refactor: use one stream per sequencer/partition
Advantages are:

- for large DBs w/ many partitions we can ingest data in-parallel
- on top of this change we can implement per-sequencer seeking, which is
  required for replay
2021-07-19 12:26:58 +02:00
kodiakhq[bot] a1d47a8a7a
Merge branch 'main' into crepererum/simplify_testdb_lifecycle_rules 2021-07-19 09:53:35 +00:00
Raphael Taylor-Davies 5fc98c7c56
feat: add failure reporting to TaskTracker (#2031)
* feat: add failure reporting to TaskTracker

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-19 09:17:20 +00:00
Marco Neumann 2263189e09 test: make TestDb lifecycle better for testing
This is a leftover from #1972.
2021-07-19 09:50:44 +02:00
Jake Goulding 449ba46b22 refactor: Make more use of SNAFU's context methods and ensure! macro 2021-07-16 16:31:50 -04:00
Edd Robinson 54ad69ed86 fix: ensure correct table meta size used 2021-07-16 10:48:45 -04:00
Marco Neumann f57ba6afdb
fix: use fixed-size timestamps for parquet metadata (#2032)
This fixes flaky tests that rely on predictable files sizes.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-16 13:14:02 +00:00
Marco Neumann 2498642c00 fix: `persist_partition` docstring 2021-07-16 12:46:07 +02:00
Marco Neumann 1ef2bc1887 refactor: `Db::{write_chunk_to_object_store => Db::persist_partition}`
The previous method allowed persisting any chunk -- even ones that
should not be persisted yet and w/o any order of persistence. That will
break our persistence windows. So instead offer a sane higher-level
interface that can trigger persistence of a partition within the
boundaries of the lifecycle rules. This needs some adjustments for our
test suite.
2021-07-16 12:07:58 +02:00
Marco Neumann 9683d91f32 refactor: adjust to upstream changes 2021-07-16 11:45:34 +02:00
Marco Neumann 2b0a4bbe0a feat: persist real (non-fake) part.+DB checkpoints 2021-07-16 11:45:34 +02:00
Marco Neumann 8276511bd3 feat: allow to construct partition checkpoint from partition 2021-07-16 11:45:34 +02:00
Marco Neumann a9ea8e9ced docs: add docstring to some `Partition` methods 2021-07-16 11:45:34 +02:00
Marco Neumann 71b5030fc0 refactor: remove unused `LockableChunk::write_to_object_store` 2021-07-16 11:45:34 +02:00
Raphael Taylor-Davies 00b89cd751
fix: freeze chunks in write path (#2021) (#2022)
* fix: freeze chunks in write path (#2021)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-16 08:51:37 +00:00
kodiakhq[bot] 50aa1f857d
Merge branch 'main' into ntran/refactor_use_sort_key 2021-07-15 21:17:26 +00:00
kodiakhq[bot] 76d9b8f7cc
Merge branch 'main' into debugkafka 2021-07-15 21:07:35 +00:00
Edd Robinson d5dcb40438
refactor: track future execution (#2014)
* refactor: track future execution

* refactor: update server/src/db/lifecycle/compact.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 20:58:52 +00:00
Marko Mikulicic 06399e88e0
chore: Add some debug logs to write buffer 2021-07-15 22:18:03 +02:00
Nga Tran cfe0bfa88b refactor: address review comments and add useful log info to catch resort 2021-07-15 15:39:12 -04:00
Andrew Lamb 3fd6430fb6
fix: rename `estimated_bytes` to `memory_bytes` and expose `object_store_bytes` in ChunkSummary and system.chunks (#2017)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 16:00:24 +00:00
Marco Neumann b5428e53a5 refactor: write buffer testing + better mocking
This refactors the write buffer a bit for:

- **Testing:** Add generic tests for the Kafka and the mocking
  implementation. The same interface can easily be used to add new
  implementations (e.g. via Redis, filesystem, ...).
- **Partition on Write:** The caller of the writer operation must now
  specify the partition/sequencer ID. The implicit partitioning of the
  Kafka writer would have led to broken data since we must never spill
  entries w/ the same primary key over multiple partitions. At the
  moment we will only use partition 0 but we can easily implement
  better logic in the future.
- **Improved Mocking:** The mocked implementation now simulates a system
  that feels more real. Especially the handling around multiple streams
  and "write while read" has been improved. This will be helpful for
  testing and for new features like seeking (during replay). A solid
  realistic mock also helps us to ensure that the tests using the mock
  do not rely on unrealistic behavior too much.
2021-07-15 17:20:45 +02:00
Raphael Taylor-Davies d71f38f27c
feat: compute PartitionCheckpoint from PersistenceWindows (#2011)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 12:17:23 +00:00
Andrew Lamb 0c86d1dccf
feat: Record parquet bytes size in catalog / parquet_file (#2006)
* feat: Store object store size in parquet_file

* fix: update TRANSACTION_VERSION to 8

* refactor: rename os_bytes --> file_size_bytes
2021-07-15 12:07:11 +00:00
Marco Neumann 4741483f72 docs: explain why we update memory metrics when lifecycle action is cleared 2021-07-15 12:07:56 +02:00
Marco Neumann 924b0db542 fix: account for memory size in drop lifecycle action 2021-07-15 12:07:56 +02:00
Marco Neumann cccdd8a43f fix: correct code comment 2021-07-15 12:07:56 +02:00
Marco Neumann 77a9191a11 fix: chunk dropping over lifecycle policy should also respect the preserved catalog 2021-07-15 12:07:56 +02:00
Marco Neumann 71cb15f017 refactor: use lifecycle action to drop chunks
This avoids holding partition locks while the preserved catalog IO is
done.
2021-07-15 12:07:56 +02:00
Marco Neumann e570c66697 feat: add "dropping" chunk lifecycle action 2021-07-15 12:07:56 +02:00
Marco Neumann 68e20779a2 test: add test for clearing lifecycle actions from chunks 2021-07-15 12:07:56 +02:00
Marco Neumann d89fca00be feat: persist "drop chunk" 2021-07-15 12:07:56 +02:00
Raphael Taylor-Davies 3e0d1eb560
refactor: introduce PartitionAddr (#2010) 2021-07-15 10:01:33 +00:00
Nga Tran 0b1f2b1fd0 chore: merge main to branch 2021-07-14 16:17:14 -04:00
Nga Tran 552e3fb691 fix: Padd stats compute deterministic order of sort key and update tests that got changed by the use of sort key 2021-07-14 14:06:41 -04:00
kodiakhq[bot] 833debd5b5
Merge branch 'main' into cn/exploration 2021-07-14 17:30:55 +00:00
Raphael Taylor-Davies cbeeb97cff
feat: flush open window on persist (#1985)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-14 16:58:20 +00:00
Raphael Taylor-Davies 1d00fa2fd8
refactor: track memory metrics in catalog (#1995)
* refactor: track memory metrics in catalog

* chore: update comment
2021-07-14 16:23:00 +00:00
Carol (Nichols || Goulding) 8070065e2f fix: Change RUB chunk table_summaries to table_summary
Because chunks now have only one table.

Connects to #1718, #1613, #1295
2021-07-14 11:18:02 -04:00
Carol (Nichols || Goulding) 649b467adb fix: CatalogChunk no longer needs to record a write when created from a MUB chunk 2021-07-14 10:28:12 -04:00
Carol (Nichols || Goulding) 7ccbab8c90 feat: Make a TableSummaryAndTimes to use to slowly replace TableSummary
And use TableSummaryAndTimes with the mutable buffer chunks when turning
them into catalog chunks.

It's proving too big to switch over everything using TableSummary at
once, so this will let us switch over more incrementally.
2021-07-14 10:28:12 -04:00
Edd Robinson 4dedb657f2
Merge branch 'main' into alamb/go_go_go_go 2021-07-14 14:04:13 +01:00
Raphael Taylor-Davies f1c1620c84
feat: make persistence windows interface harder to use incorrectly (#1977)
* feat: make persistence windows interface harder to use incorrectly

* chore: review feedback

* chore: update comment

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-14 13:03:18 +00:00
Edd Robinson 0e5276ed20
Merge branch 'main' into alamb/go_go_go_go 2021-07-14 13:56:35 +01:00
Marco Neumann 9cb9ae0874 chore: move write buffer into its own crate 2021-07-14 14:09:18 +02:00
Marko Mikulicic d427fed9dc
fix: Remove bad max.request.size config param 2021-07-14 13:54:18 +02:00
Nga Tran 8fd0df04f2 feat: continue building and using sort_key if available 2021-07-13 16:25:58 -04:00
Andrew Lamb 781c4fa666 fix: update server tests 2021-07-13 15:44:57 -04:00
Marko Mikulicic 239c931f26
fix: Raise max message to 10M
And log message size on kafka write error.

Turns out the kafka partition message size limit default is 1MB, and the
client side "max request size" default is also 1MB.
The error message we get from our kafka client is misleading: it says

```
KafkaError (Message production error: MessageSizeTooLarge (Broker: Message size too large)) }
```

which to my mind seemed as if the broker ("Broker:") had said "Message size too large".
That was a lie; I killed the broker and the client kept saying the same error message which means
it didn't even try to send the message out.

TODO: make this a proper parameter. (but let's unblock)
2021-07-13 17:47:36 +02:00
kodiakhq[bot] 6a09678f34
Merge branch 'main' into crepererum/update_deps 2021-07-13 14:18:57 +00:00
Raphael Taylor-Davies 6c8b2b4fa7
feat: add integration test of compaction freezing (#1938) (#1975)
* feat: add integration test of compaction freezing (#1938)

* chore: update server/src/db/lifecycle/compact.rs

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-07-13 14:11:10 +00:00
Marco Neumann 157a0cc98c chore: update flatbuffers to 2.0 2021-07-13 15:44:45 +02:00
Marko Mikulicic bf20641d78
chore: Log whether the write buffer is enabled 2021-07-13 14:15:52 +02:00
Raphael Taylor-Davies 5a0caeab44
feat: skip over fully persisted partitions (#1962) (#1973)
* feat: skip over fully persisted partitions (#1962)

* chore: review feedback
2021-07-13 10:40:45 +00:00
Andrew Lamb d35b74c226
fix: Fix doc build warnings (#1945)
* fix: Fix doc build warnings

* refactor: add deny bare_urls to crates

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 08:03:42 +00:00
Paul Dix 708aebaeb3
fix: freeze open chunk when compacting (#1971)
Closes #1938. Unfortunately, this contains only a unit test to ensure that open chunks are frozen when set_compacting is called. It would be better to have a more end-to-end integration test that ensures this behavior, but I've confirmed by hand (with some sleeps and a hacked up end-to-end test) that this fixes it.
2021-07-13 07:44:02 +00:00
Nga Tran 5418a1fe6b refactor: remove unused comments 2021-07-12 18:14:38 -04:00
Nga Tran 23895e6673 feat: Using sort_key to avoid resorts 2021-07-12 18:08:45 -04:00
Carol (Nichols || Goulding) 6764a2d68e fix: Write Buffer errors are known, not UnknownDatabaseErrors
Fixes #1956.
2021-07-12 11:21:31 -04:00
Carol (Nichols || Goulding) 3bd7486016 test: Rename a test type alias to not shadow super::Error 2021-07-12 10:46:29 -04:00
Andrew Lamb 670826daf9
refactor: make object_store construction interface consistent (#1944)
* refactor: make object_store construction interface consistent

* fix: benchmarks

* fix: doc build

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-12 12:56:36 +00:00
Andrew Lamb 9534220035
feat: Add any lifecycle_action to system.chunks and API (#1947) 2021-07-09 17:38:29 +00:00
Raphael Taylor-Davies 7af560aa99
feat: Persist lifecycle action (#1888)
* feat: add split and persist operation

* docs: Improve doc strings

* refactor: use for loop rather than map

* refactor: Make it clear that the lifecycle policy picks the split timestamp

* fix: race condition

* docs: improve comments

* fix: logical merge conflict

* fix: clippy

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2021-07-09 13:21:46 +00:00
Andrew Lamb 1a79bf7e99
refactor: Make aws/azure/gcs optional features and stop compiling 100 dependencies during dev (#1933)
* feat: make aws, gcp, azure dependencies optional

* fix: only run object store tests if the features are enabled

* fix: clean up testing

* fix: rename step

* fix: add to list of jobs

* fix: remove test with object store

* fix: review comments
2021-07-09 11:38:30 +00:00
Andrew Lamb 3cb8f297b1
refactor: encapsulate the ObjectStore implementations in the object store crate (#1932)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-09 10:38:32 +00:00
Marco Neumann bc958e2ff0 refactor: use Arcs to pass schemas around 2021-07-09 09:45:12 +02:00
Marco Neumann 09e611deb7 refactor: lift query schema generation up to caller
No longer scan chunks during query planning to determine the schema
(except for the lifetime jobs where we have a good reason to do so).
Instead pass the schema down from whoever is triggering the query.
For real SQL queries, we then just use the table-wide schemas
introduced in #1913.

Apart from avoiding schema merges we now also don't crash any longer
when no chunks are left in the table (aka columns are present but all
rows are gone).

Fixes #1768.
Fixes #1884.
2021-07-09 09:24:21 +02:00
kodiakhq[bot] c37053ad46
Merge branch 'main' into cn/chunk-times 2021-07-08 20:58:54 +00:00
kodiakhq[bot] a2726c7e92
Merge branch 'main' into cn/kafka-read-metrics-and-e2e-tests 2021-07-08 20:40:19 +00:00
Carol (Nichols || Goulding) 22495dd355 fix: Take a TableBatch in the MBChunk constructor
Thus ensuring all MBChunks will have data in them.
2021-07-08 16:39:35 -04:00
Carol (Nichols || Goulding) 548c64539e fix: Wrap lines at 100 chars 2021-07-08 16:39:33 -04:00
Carol (Nichols || Goulding) 74c0a6cb00 fix: Arrange use statements so rustfmt can manage their order 2021-07-08 16:39:02 -04:00
kodiakhq[bot] c8126784a8
Merge branch 'main' into ntran/avoid_sort_in_scan 2021-07-08 20:22:18 +00:00
Andrew Lamb 72928aab3d
refactor: Move ChunkLifecycleAction to the data_types crate (#1939) 2021-07-08 20:18:33 +00:00
Andrew Lamb dd3eff7748
refactor: Always use `row_count` for count of rows in system.* tables (#1937) 2021-07-08 19:28:11 +00:00
Carol (Nichols || Goulding) c6bf0a26f4 feat: Add metrics for when ingesting from the write buffer fails
So that we have some way of figuring out what might be going on.
2021-07-08 09:57:51 -04:00
Carol (Nichols || Goulding) 80e1dcafe0 feat: Support reading from all Kafka partitions
When reading from the Kafka write buffer, subscribe to all partitions in
a topic and start from the smallest offset available, instead of
assuming there will only be 1 partition per topic.
2021-07-08 09:30:59 -04:00
Carol (Nichols || Goulding) c90ef7b14b fix: Create one consumer group per server+database
This hasn't caused any problems for me yet, but seemed like a good idea
because we want to be sure we don't get any of Kafka's consumer
rebalancing if we have multiple partitions.
2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) e5168936f5 feat: Better error messages through to gRPC API + e2e Kafka Read tests 2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) c53ae41d57 fix: Remove unneeded Option from the reading mock 2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) 854c28c41a feat: Stream messages from Kafka into the database 2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) ee500f5bda feat: Support configuring a write buffer for writing OR reading 2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) 63d26f6f3f refactor: Rename KafkaBuffer to KafkaBufferProducer 2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) e5de73133c feat: Change write buffer connection rule to take either Writing or Reading connection info
A database on one IOx server can, exclusively:

- Not interact with Kafka at all
- Send writes to Kafka
- Read writes from Kafka

Notably, a database on a particular server will never write *and* read from Kafka at the same time.
2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) fd4bcc2fa5 refactor: Rename the WriteBuffer trait to be WriteBufferWriting 2021-07-08 09:28:34 -04:00
Carol (Nichols || Goulding) 83e50cfba4 refactor: Rename field to not contain the type 2021-07-08 09:28:34 -04:00
kodiakhq[bot] 69e4786fc7
Merge branch 'main' into crepererum/str_arcs 2021-07-08 13:20:49 +00:00
Marco Neumann 18893e76e0 refactor: convert some table name and part. key String to Arcs
This has the (somewhat nice) side effect that it shrinks the in-mem
catalog a bit as well because now `ParquetChunk` is a bit smaller making
the chunk stage enum smaller as well.
2021-07-08 14:34:28 +02:00
Edd Robinson 7ff8ae4ce5 refactor: tidy up sort key rep 2021-07-08 12:48:41 +01:00
Edd Robinson f811bf1e5e refactor: log compaction activity 2021-07-08 12:48:41 +01:00
Andrew Lamb 33bc85ad18
feat: Infrastructure for persistence (#1925)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 11:14:38 +00:00
Andrew Lamb 7602bde850
chore: Update datafusion deps (#1799)
* chore: Update datafusion deps + rework code

* refactor: remove workaround as it has been contributed upstream

* fix: Update query/src/exec/split.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 10:58:32 +00:00
Marco Neumann 24056d7bfc test: ensure that table schemas are recovered from pres. catalog 2021-07-08 10:01:42 +02:00
Marco Neumann a746cd45c5 test: check for schema change errors 2021-07-08 09:51:49 +02:00
Marco Neumann bd22dd38ea docs: fix typos
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-07-08 09:18:09 +02:00
Marco Neumann b528ac2b55 feat: store schemas per table
This way we can:

- check for schema matches even for writes going into different
  partitions
- solve #1768 and #1884 in some future PR

Closes #1897.
2021-07-08 09:18:09 +02:00
Marco Neumann 5ca9760c94 test: make partitioning in DB tests consistent w/ DB rules 2021-07-08 09:18:09 +02:00
Marco Neumann ed3ebdcbd2 refactor: use sync locks w/ better metrics 2021-07-08 09:18:09 +02:00
Marco Neumann 5936452895 feat: add infra to check table-wide schemas 2021-07-08 09:18:09 +02:00
Nga Tran 5c722af0fa fix: remove comments 2021-07-07 16:50:53 -04:00
Nga Tran d3c4f8c249 fix: store sort key correctly in the schema. Update tests to reflect it 2021-07-07 15:55:23 -04:00
Paul Dix cc350bb1ea fix: don't update last write time on failed writes
Fixes #1905
2021-07-07 14:50:03 -04:00
Andrew Lamb e6d995cbd8
chore: Update to Rust 1.53.0 (#1922)
* chore: Update to Rust 1.53.0

* fix: Update to latest clippy standards

* fix: bad refactor

* fix: Update escaping

* test: update test output

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 18:02:03 +00:00
Andrew Lamb 957c6245e3
docs: Note that rollover_partition is not automatically called (#1910)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 12:14:31 +00:00
Marko Mikulicic 25e3a304ed
chore: Log partition rollover (#1907)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-07 11:48:16 +00:00
Nga Tran 8dfc3bb6bc fix: Thanks Andrew for helping fix the compile problem and avoid using Arc<Mutex> 2021-07-06 18:05:59 -04:00
Nga Tran 76789e5902 feat: store sort key into the chunk schema of RUB 2021-07-06 17:00:35 -04:00
Marco Neumann b6185982f7 refactor: make `ProviderBuilder` a build-time-checked builder
It's safer and also avoids cloning / copying state around.
2021-07-06 18:20:05 +02:00
Marco Neumann 4f5fe62428
feat: add DB name to lifecycle logs (#1900)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-06 16:14:28 +00:00
Marco Neumann 09b7405b20
docs: spelling fixes
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-07-06 17:46:36 +02:00
Marco Neumann 3d644b63a1 feat: add `Replay` state to DB init 2021-07-06 14:24:39 +02:00
Marco Neumann 4ca2d3e148 chore: move persistence windows related code into own crate
The entire persistence windows data structures (including the
checkpoints) have nothing to do with the mutable buffer per se. So let's
move them into their own crate. This also makes `parquet_file` no
longer depend on `mutable_buffer`.
2021-07-05 10:23:58 +02:00
Marco Neumann cdab1bed05 feat: persist part+db checkpoint in parquets and catalog
This will be required for replay on server startup.
2021-07-05 09:42:46 +02:00
kodiakhq[bot] bcf43a3de5
Merge branch 'main' into crepererum/db_state_in_grpc 2021-07-05 07:21:48 +00:00
Nga Tran 405a6a691b feat: initial implementation of #1886: avoid resort if appropriate 2021-07-02 17:57:48 -04:00
Raphael Taylor-Davies b4534883fe
refactor: remove table name from upsert_table (#1882)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-02 15:22:41 +00:00