Commit Graph

1362 Commits (36e87d7b2e15a42907c4487a765035d5085c06c4)

Author SHA1 Message Date
Carol (Nichols || Goulding) d5ab29711e fix: Serialize relative db object store paths to the server config
So that they can be deserialized, without parsing, to create a new
iox object store from the location listed in the server config.

Notably, the locations serialized don't start with the object storage's
prefix like "s3:" or "file:". The location is the same object storage as
the server configuration that was just read from object storage. Having
the server config on one type of object storage and the database files
on another type is not supported.
2021-10-18 08:37:36 -04:00
Carol (Nichols || Goulding) 26484309e0 fix: Re-export prost errors instead of wrapping them 2021-10-15 13:44:53 -04:00
Carol (Nichols || Goulding) 2253a7ba62 fix: Use a map in the server config protobuf 2021-10-15 10:52:59 -04:00
Carol (Nichols || Goulding) afd6e826e5 feat: Write out server config files listing database name and locations 2021-10-15 09:46:20 -04:00
Carol (Nichols || Goulding) 5348c9e503 refactor: Move ProstError to root of generated_types to be useful elsewhere 2021-10-15 09:46:20 -04:00
Carol (Nichols || Goulding) 4365dda6cc test: Remove server init that isn't needed; it gets restarted before anything's checked 2021-10-15 09:46:20 -04:00
Carol (Nichols || Goulding) 42824c30ec fix: This error is listing databases, not rules 2021-10-15 09:46:20 -04:00
Raphael Taylor-Davies d6b7b56f16
refactor: pull lifecycle out of Db (#2242) (#2831)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-15 13:08:00 +00:00
Marco Neumann 2850487877 feat: make trace collector in Kafka consumer optional
The whole application might not have a trace collector configured in
which case we don't wanna produce any spans.
2021-10-15 09:20:40 +02:00
Raphael Taylor-Davies bdd6d67e7a
refactor: split out mutable_batch crate (#2841)
* refactor: split out mutable_batch crate

* refactor: restore chunk module for better diffs

* chore: fmt

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-14 16:46:54 +00:00
kodiakhq[bot] 993c6173d1
Merge branch 'main' into ntran/grpc_storage 2021-10-14 15:28:05 +00:00
Nga Tran faf65f38cc refactor: address review comments 2021-10-14 11:23:20 -04:00
Marco Neumann 28195b9c0c chore: new `parquet_catalog` crate 2021-10-14 14:34:59 +02:00
Edd Robinson 96e05726ee refactor: expose negated_predicate API for columns_names 2021-10-14 13:08:56 +01:00
kodiakhq[bot] 61ec559eee
Merge branch 'main' into crepererum/write_buffer_span_ctx 2021-10-14 11:50:07 +00:00
Raphael Taylor-Davies 4087d094b1
refactor: rework write buffer compaction as integration test (#2830)
* refactor: rework write buffer compaction as integration test

* chore: fix lint

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-14 11:04:44 +00:00
Raphael Taylor-Davies e911cf9ac1
refactor: make WriteBufferConfigFactory interior mutable (#2829)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-14 10:30:59 +00:00
Marco Neumann 5e06519afb feat: propagate trace information through write buffer 2021-10-14 11:07:41 +02:00
Raphael Taylor-Davies d752b79cbe
fix: disable persistence during replay (#2812)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-14 08:41:10 +00:00
Nga Tran 8dd9dcce01 test: verify if all scenarios are created correctly and add a few delete tests for read_filter 2021-10-13 17:21:03 -04:00
kodiakhq[bot] a6ca469876
Merge branch 'main' into crepererum/cleanup_pres_catalog_interace 2021-10-13 14:49:02 +00:00
Raphael Taylor-Davies ba829436d7
feat: restart Database (#2822) (#2825)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-13 14:23:20 +00:00
Marco Neumann 1523e0edcd refactor: clean up preserved catalog interface
1. Remove `new_empty` logic. It's a leftover from the time when the
   `PreservedCatalog` owned the in-memory catalog.
2. Make `db_name` a part of the `PreservedCatalogConfig`.
2021-10-13 13:58:11 +02:00
Raphael Taylor-Davies d390dfa280
feat: rework delete predicate preservation as integration test (#2820)
* feat: rework delete predicate preservation as integration test

* chore: review feedback

* chore: fix lint
2021-10-13 10:40:17 +00:00
Raphael Taylor-Davies f7f6965b65
feat: don't panic if `Db::compact_chunks` with no matching chunks (#2818) 2021-10-12 21:54:43 +00:00
Raphael Taylor-Davies 8a82f92c5d
refactor: add TimeProvider abstraction (#2722) (#2815)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-12 21:19:03 +00:00
Raphael Taylor-Davies 5b69bb0d72
feat: reduce lifecycle lock scope (#2242) (#2810)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-12 17:34:16 +00:00
Raphael Taylor-Davies 8414e6edbb
feat: migrate preserved catalog to TimeProvider (#2722) (#2808)
* feat: migrate preserved catalog to TimeProvider (#2722)

* fix: deterministic catalog prune tests

* fix: failing test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-12 14:43:05 +00:00
Raphael Taylor-Davies 3dfe400e6b
feat: migrate write path to TimeProvider (#2722) (#2807) 2021-10-12 12:09:08 +00:00
Raphael Taylor-Davies 0554173684
feat: migrate write buffer to TimeProvider (#2722) (#2804)
* feat: migrate write buffer to TimeProvider (#2722)

* chore: review feedback

Co-authored-by: Marco Neumann <marco@crepererum.net>

2021-10-12 10:32:34 +00:00
Raphael Taylor-Davies b39e01f7ba
feat: migrate PersistenceWindows to TimeProvider (#2722) (#2798) 2021-10-11 20:40:00 +00:00
Raphael Taylor-Davies 06c2c23322
refactor: create PreservedCatalogConfig struct (#2793)
* refactor: create PreservedCatalogConfig struct

* chore: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-11 15:43:05 +00:00
kodiakhq[bot] 05fe4701c1
Merge branch 'main' into dependabot/cargo/cache_loader_async-0.1.2 2021-10-11 15:20:28 +00:00
Marco Neumann ad41b74a03 fix: adjust code to `cache_loader_async` 0.1.2 2021-10-11 17:12:08 +02:00
dependabot[bot] 49c63d35b1
chore(deps): bump cache_loader_async from 0.1.1 to 0.1.2
Bumps [cache_loader_async](https://github.com/ZeroTwo-Bot/cache-loader-async-rs) from 0.1.1 to 0.1.2.
- [Release notes](https://github.com/ZeroTwo-Bot/cache-loader-async-rs/releases)
- [Changelog](https://github.com/ZeroTwo-Bot/cache-loader-async-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/ZeroTwo-Bot/cache-loader-async-rs/commits)

---
updated-dependencies:
- dependency-name: cache_loader_async
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-11 15:02:12 +00:00
Marco Neumann ae0acf0024 refactor: remove `db_name` param from `select_persistable_chunks`
This was only used for logging but is already part of `ChunkAddr`.
2021-10-11 17:01:28 +02:00
Marco Neumann 8185feddb9 fix: do not break chunk ordering during persistence
Fixes #2729.
2021-10-11 17:01:28 +02:00
Carol (Nichols || Goulding) 5da2f7b1b0
Merge branch 'main' into cn/less-database-name 2021-10-11 10:35:42 -04:00
Marco Neumann c4a2641764 refactor: remove `time_closed`
The "time closed" is a leftover from an old lifecycle system, where
chunks moved through the system (open=>closed=>persisted) without being
merged. Now we have the compaction as well as the split query for
persistence that can merge chunks, so a single "time closed" doesn't
make sense any longer. So in fact it is `None` for many chunks and is
also not persisted. Also the current lifecycle policy doesn't use this
value. So let's just remove it.

Closes #1846.
2021-10-11 15:49:34 +02:00
Raphael Taylor-Davies afe34751e7
refactor: split out schema crate (#2781)
* refactor: split out schema crate

* chore: fix doc
2021-10-11 09:45:08 +00:00
Raphael Taylor-Davies f35a49edd0
refactor: move Sequence to data_types (#2780)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-11 09:23:00 +00:00
Carol (Nichols || Goulding) 8407735e00 fix: Pass the database name into PreservedCatalog 2021-10-08 15:25:10 -04:00
Carol (Nichols || Goulding) 276aef69c9 refactor: Move PreservedCatalog test helper functions to test helpers and use them more 2021-10-08 15:25:10 -04:00
Marco Neumann f8b5c0ee50 fix: remove obsolete TODO 2021-10-08 12:36:23 +02:00
Marco Neumann d3de6bb6e4 refactor: `max_persisted_timestamp` => `flush_timestamp`
There might be data left before this timestamp that wasn't persisted
(e.g. incoming data while the persistence was running).
2021-10-08 12:36:23 +02:00
Marco Neumann 63a932fa37 refactor: "min unpersisted ts" => "max persisted ts"
Store the "maximum persisted timestamp" instead of the "minimum
unpersisted timestamp". This avoids the need to calculate the next
timestamp from the current one (which was done via "max TS + 1ns").

The old calculation was prone to overflow panics. Since the
timestamps in this calculation originate from user-provided data (and
not the wall clock), this was an easy DoS vector that could be triggered
via the following line protocol:

```text
table_1 foo=1 <i64::MAX>
```

which is

```text
table_1 foo=1 9223372036854775807
```

Bonus points: the timestamp persisted in the partition
checkpoints is now the very same that was used by the split query during
persistence. Consistency FTW!

Fixes #2225.
2021-10-08 11:52:49 +02:00
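
The overflow described in the commit above can be sketched as follows. This is a minimal illustration, not the IOx implementation: the function names `min_unpersisted_ts` and `is_after_max_persisted` are hypothetical, and timestamps are assumed to be plain `i64` nanoseconds.

```rust
/// Old approach (sketch): derive the "minimum unpersisted timestamp" by adding
/// 1ns to the maximum persisted one. `checked_add` makes the overflow visible
/// here; a plain `+ 1` would panic in debug builds for `i64::MAX`.
fn min_unpersisted_ts(max_persisted_ts: i64) -> Option<i64> {
    max_persisted_ts.checked_add(1)
}

/// New approach (sketch): store the maximum persisted timestamp itself and only
/// compare against it, so no arithmetic on user-provided values is needed.
fn is_after_max_persisted(row_ts: i64, max_persisted_ts: i64) -> bool {
    row_ts > max_persisted_ts
}
```

With the comparison-only form, a row carrying `i64::MAX` as its timestamp is handled like any other value.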
kodiakhq[bot] 001ed36da4
Merge branch 'main' into crepererum/issue2627 2021-10-08 07:35:08 +00:00
Carol (Nichols || Goulding) 15b396c720 fix: Set a UUID if none was read from object storage on server startup
This enables a smoother transition to this state when this gets
deployed.
2021-10-07 10:19:17 -04:00
Carol (Nichols || Goulding) 169f2499d3 fix: Use bytes for UUID in the protobuf instead of string 2021-10-07 10:19:17 -04:00
Carol (Nichols || Goulding) ae7d893199 feat: Add UUID to databases
Connects to #2675.

When a database is created, assign it a UUID and serialize the UUID to
object storage by wrapping the database rules in a new
`PersistedDatabaseRules` type that also contains the UUID.

All APIs to the end user involving rules should continue using only
`DatabaseRules` so the UUID is an internal implementation detail.
2021-10-07 10:19:14 -04:00
Carol (Nichols || Goulding) 27e7a1f925 refactor: Organize use statements 2021-10-07 10:17:19 -04:00
Marco Neumann 81c75eec7e fix: interaction of preservation and delete predicates
This is the second part of #2627.
2021-10-07 11:38:09 +02:00
Marco Neumann 57b3be3b2d fix: interaction of compaction and delete predicates
- predicates that existed before the compaction can be forgotten since
  they are materialized during compaction
- predicates that are added while the compaction is running must be
  included into the new chunk

This is the first half of #2627.
2021-10-07 11:24:58 +02:00
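
The two rules in the commit above amount to a set difference. The sketch below assumes delete predicates can be compared and ordered (strings stand in for them here) and uses a hypothetical helper name, `predicates_for_new_chunk`; it is not the actual IOx code.

```rust
use std::collections::BTreeSet;

/// Returns the predicates that must be attached to the compacted chunk:
/// everything known at the end of the compaction minus what was already
/// known at its start (those were materialized by the compaction itself).
fn predicates_for_new_chunk(
    at_compaction_start: &BTreeSet<String>,
    at_compaction_end: &BTreeSet<String>,
) -> BTreeSet<String> {
    at_compaction_end
        .difference(at_compaction_start)
        .cloned()
        .collect()
}
```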
kodiakhq[bot] 7d6be3f500
Merge branch 'main' into crepererum/issue2748 2021-10-07 09:04:18 +00:00
Marco Neumann 8b06d72a58 fix: address review comments 2021-10-07 10:24:19 +02:00
Marco Neumann 63d74be490 refactor: make `ChunkId` a UUID 2021-10-07 10:23:27 +02:00
Marco Neumann 0374ba2284 fix: re-enable no longer flaky part of `delete_predicate_preservation`
Fix #2748.
2021-10-07 10:15:49 +02:00
Raphael Taylor-Davies 39157828b1
feat: remove remaining usages of Instant (#2722) (#2749)
* feat: remove remaining usages of Instant (#2722)

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-06 16:44:02 +00:00
Nga Tran 05387f5a70 test: disable running query after re-create db in delete_predicate_preservation to avoid flaky test 2021-10-06 11:17:02 -04:00
Raphael Taylor-Davies ce5b24e65d
refactor: use DateTime<Utc> in PersistenceWindows (#2722) (#2743)
* refactor: use DateTime<Utc> in PersistenceWindows (#2722)

* chore: fix benchmark

* chore: fmt

* chore: review feedback
2021-10-06 09:39:32 +00:00
Marco Neumann d322069dd4 refactor: move delete predicate persistence into background job 2021-10-06 08:05:38 +02:00
kodiakhq[bot] d72a494198
Merge branch 'main' into crepererum/in_mem_expr_part5 2021-10-05 16:20:24 +00:00
Raphael Taylor-Davies d0929e3a34
feat: persist no chunks (#2712) (#2718)
* feat: persist no chunks (#2712)

* fix: persist partition

* fix: chunk ordering test

* chore: fix logical conflict

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-05 15:18:35 +00:00
Raphael Taylor-Davies 2a584420b3
refactor: make data_types optional dependency (#2739)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-05 15:07:45 +00:00
Marco Neumann bb7a27e5ed refactor: use proper sets during delete predicate collection
We no longer need hacky pointer tricks to de-duplicate delete predicates
when collecting them for catalog checkpoints. This was once required
when the delete predicates didn't implement `Eq` and `Hash` but now it's
all way easier.
2021-10-05 10:37:34 +02:00
Marco Neumann 28ccf2a8c3 refactor: `TransactionHandle::delete_predicate` cannot fail 2021-10-05 09:41:46 +02:00
Marco Neumann 10c1a72402 refactor: remove unused fields from `DeletePredicate` 2021-10-05 09:29:24 +02:00
kodiakhq[bot] f6fc148fe5
Merge branch 'main' into crepererum/issue2633a 2021-10-04 15:50:48 +00:00
Raphael Taylor-Davies 742a1065a1
feat: don't auto-increment background worker now (#2719)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-04 15:50:09 +00:00
Marco Neumann 97881079e8 refactor: make `ChunkOrder` non-zero
This will make it easier to handle missing values.

Helps with #2633.
2021-10-04 17:49:12 +02:00
Marco Neumann 75ac6e8646 refactor: make `DeletePredicate::range` non-optional 2021-10-04 16:36:20 +02:00
Marco Neumann 5a5a929b9e refactor: introduce `DeletePredicate`
`DeletePredicate` is a simpler version of `Predicate` that is based on
IOx `DeleteExpr` instead of the full-blown DataFusion `Expr`. This will
allow us to do a couple of things (in follow-up changes):

- Order and de-duplicate delete predicates
- Normalize predicates
- Infallible serialization
- Smaller memory footprint

Note that this change only affects delete expressions. Query expressions
that are supported via the API are not changed. The query subsystem also
still uses the full-featured expressions/predicates (delete
expressions/predicates are converted to the more powerful DataFusion
version on-the-fly).
2021-10-04 16:36:20 +02:00
kodiakhq[bot] 181145eca1
Merge branch 'main' into dependabot/cargo/arrow-5.5.0 2021-10-04 13:10:42 +00:00
Edd Robinson 7ab10daa19
Merge branch 'main' into dependabot/cargo/arrow-5.5.0 2021-10-04 12:58:29 +01:00
Edd Robinson f8c72d611c
Merge branch 'main' into dependabot/cargo/parquet-5.5.0 2021-10-04 12:56:54 +01:00
Edd Robinson d0384f60d0 test: update server tests 2021-10-04 12:39:35 +01:00
Raphael Taylor-Davies 9cc6b18205
refactor: simplify delete_predicate_preservation test (#2714) 2021-10-04 11:38:03 +00:00
Raphael Taylor-Davies e8eab2cc97
feat: allow compaction and persistence to return no chunk (#2664) (#2700)
* feat: allow compaction and persistence to return no chunk (#2664)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-04 10:54:47 +00:00
dependabot[bot] d1f5209869
chore(deps): bump arrow from 5.4.0 to 5.5.0
Bumps [arrow](https://github.com/apache/arrow-rs) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/5.5.0/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/5.4.0...5.5.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-04 08:55:38 +00:00
Nga Tran bd22c73b8a chore: Merge branch 'main' into ntran/delete_endpoint 2021-10-01 13:33:39 -04:00
Nga Tran ee94e9038a test: finalize coding up delete http endpoints and end-to-end tests 2021-10-01 12:15:00 -04:00
Raphael Taylor-Davies b402423e9e
feat: remove move lifecycle action (#2674)
* feat: remove move_chunk lifecycle action

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-30 16:58:05 +00:00
Edd Robinson 003f72ba00
Merge branch 'main' into er/fix/read_buffer/pred_validate 2021-09-29 14:50:12 +01:00
Edd Robinson a52b86e070 fix: fallback to no predicate if it can't be validated
Closes: #1603

If a predicate cannot be executed against a read buffer chunk because of schema conflicts then fall back to applying no predicate and let the query engine apply predicates in the Filter step of the plan.
2021-09-29 14:42:56 +01:00
Carol (Nichols || Goulding) 92583aee82 fix: Remove streaming API since we're not streaming anyway 2021-09-29 08:19:32 -04:00
Carol (Nichols || Goulding) d05528bcfd refactor: Use s3_request for put requests
Which meant we also needed to change the byte stream to be a closure
that can generate a byte stream
2021-09-29 08:19:32 -04:00
Raphael Taylor-Davies 1534ae9edf
refactor: store chunks in iteration order (#2634)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-27 11:59:27 +00:00
Carol (Nichols || Goulding) cf83a325f2
fix: Await on freeze handles instead of try_freeze/returning Transition errors (#2570)
* fix: Await on a freeze handle instead of returning TransitionInProgress

* fix: Await on freeze handle in skip_replay

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-26 10:44:59 +00:00
Edd Robinson 621b26166c feat: validate predicates on read_filter 2021-09-24 14:52:16 +01:00
Marco Neumann 6d85700e3e
docs: mention return type.
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-09-24 07:23:09 +00:00
Marco Neumann 4a0cda188e refactor: make `Partition::force_drop_chunk` similar to `Partition::drop_chunk`
- Bubbles up "not found" error, the caller should reason about it
- Returns deleted chunk
2021-09-23 18:37:54 +02:00
Marco Neumann e842733c5b refactor: `CatalogChunk::add_delete_predicate` cannot fail 2021-09-23 09:55:31 +02:00
kodiakhq[bot] b16e7ea91a
Merge branch 'main' into crepererum/issue2518c 2021-09-22 16:09:04 +00:00
Andrew Lamb d38648952c
chore: Update datafusion (#2602)
* chore: Update datafusion + other deps

* refactor: update query crate for new async interfaces

* refactor: update server crate for new async interface

* refactor: update query_tests crate for new async interfaces

* refactor: update ioxd and server to use new async interface

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-22 10:33:25 +00:00
Marco Neumann 0d7bb69dd3 feat: teach `Db` to preserve delete predicates 2021-09-22 09:43:37 +02:00
Marco Neumann 981ee0c6df refactor: accept unknown chunks in persisted delete predicates
Due to the timing of the "persist" lifecycle action and that delete
predicates might arrive at any time + the fact that we don't wanna hold
transaction locks for too long, we should accept delete predicates for
chunks that are currently "persisting" even though that lifecycle action
might fail.
2021-09-22 09:29:50 +02:00
Nga Tran b4b33c378e test: turn all delete tests on 2021-09-21 15:23:41 -04:00
Marco Neumann fb7299a169
fix: bubble up write errors (#2598)
Fixes #2538.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-21 11:08:37 -04:00
Marco Neumann 015dfb3b16 test: do not (ab)use the panic hook for replay tests
The old construct uses a single assert-statement for both:

- "bubble-up" scenario, where a panic should fire
- a check, where a panic should not fire

That makes it easy to add new tests. However we need two rather
questionable things to make that work:

- catch panic: to convert an assertion to a check
- a custom panic hook: to make tests not overly verbose (aka caught
  panics should not show up on stdout)

Esp. the custom panic hook doesn't work too well w/ multi-threaded tests
since it might swallow error messages from unrelated tests and makes
debugging of CI failures hard.

So instead of using assertions for checks, we now implement a proper
assertion and a check for each test. That's a bit more code per check
but easier and probably more stable.
2021-09-21 12:00:37 +02:00
kodiakhq[bot] 77d84ca5ab
Merge branch 'main' into crepererum/chunk_id 2021-09-20 13:39:05 +00:00
kodiakhq[bot] c7e6fffaaa
Merge branch 'main' into ntran/delete_scan 2021-09-20 13:29:47 +00:00
Marco Neumann cef5aeee52 refactor: introduce `ChunkId` type 2021-09-20 13:10:41 +02:00
kodiakhq[bot] 140c71eaf0
Merge branch 'main' into crepererum/issue2518a 2021-09-20 09:16:39 +00:00
Raphael Taylor-Davies f62d0eab3c
feat: disable bytes serde (#2580)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-20 09:07:12 +00:00
Marco Neumann acf698c366 fix: delete predicate sorting 2021-09-20 10:48:32 +02:00
Marco Neumann 492d991f49 feat: delete catalog pres. catalog <=> in-mem catalog API
First step towards #2518. Creates the Rust API to communicate delete
predicates between the preserved catalog and the in-memory catalog and
adds tests ensuring that the in-mem catalog produces the wanted errors
as well as correct checkpoints (similar to how this is done for the
parquet file tracking already).

**This does NOT contain the actual preservation!**
2021-09-20 10:48:32 +02:00
dependabot[bot] 876bb10cf8
chore(deps): bump rand_distr from 0.4.1 to 0.4.2
Bumps [rand_distr](https://github.com/rust-random/rand) from 0.4.1 to 0.4.2.
- [Release notes](https://github.com/rust-random/rand/releases)
- [Changelog](https://github.com/rust-random/rand/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-random/rand/compare/rand_distr-0.4.1...rand_distr-0.4.2)

---
updated-dependencies:
- dependency-name: rand_distr
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-20 08:39:39 +00:00
Marco Neumann 0f5198c88d test: fix tests dealing w/ parquet metadata sizes
Sizes now depend on the actual content and therefore we need
deterministic timestamps.
2021-09-20 09:42:53 +02:00
Marco Neumann e15631002e test: allow test code to specify exact parquet creation timestamp
This is required for deterministic sizes since different timestamps lead
to different compression ratios.
2021-09-20 09:42:52 +02:00
Nga Tran 364d245eae feat: apply negated delete predicates during scan 2021-09-17 16:20:42 -04:00
Carol (Nichols || Goulding) 51a40b31bf feat: Add a --detailed option to the database list CLI
That will list both active and deleted databases with their generations.

Closes #2462.
2021-09-17 15:27:23 -04:00
Carol (Nichols || Goulding) 44a89cdf75 refactor: Change DeletedDatabase to DetailedDatabase
So this info can be reused for active databases in detailed database
lists.
2021-09-17 15:27:22 -04:00
kodiakhq[bot] 23cc980d9e
Merge branch 'main' into cn/restore 2021-09-17 17:52:56 +00:00
Nga Tran 60a866ddcb refactor: merge delete predicates into select predicate 2021-09-17 07:52:33 -04:00
Nga Tran 0444d1b4fd chore: merge main to branch 2021-09-16 17:28:37 -04:00
Nga Tran 6cfeeb352b refactor: address review comments 2021-09-16 17:21:06 -04:00
Nga Tran cf4fd500b9 refactor: remove tests moved to query_tests 2021-09-16 15:05:48 -04:00
Marco Neumann ec943081c7 refactor: `Arc<Vec<...>>` => `Vec<Arc<...>>` for del predicates
The motivations are:

1. The API uses a SINGLE predicate and adds that to many chunks. With
   `Arc<Vec<...>>` you gain nothing, with `Vec<Arc<...>>` the predicate
   is only stored once (in many vectors)
2. While we currently add predicates blindly to all chunks, we can be way
   smarter in the future and prune out tables, partitions or even single
   chunks (based on statistics). With that, it will be rare that many
   chunks share the exact same set of predicates.
3. It would be nice if we could de-duplicate predicates when writing them
   to the preserved catalog without needing to repeat the pruning
   discussed in point 2. This is way easier to implement when chunks
   exist in `Arc`s.
4. As a side-note: the `Arc<Vec<...>>` wasn't really cloned around but
   instead was created many times. So the new version should be more
   memory efficient out of the box.
2021-09-16 17:16:09 +02:00
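
Point 1 of the commit above can be illustrated with a small sketch. `DeletePredicate` and `Chunk` here are simplified, hypothetical stand-ins rather than the real IOx types; the point is only that with `Vec<Arc<...>>` each chunk's vector holds a pointer to the same single predicate.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a delete predicate.
#[derive(Debug, PartialEq)]
struct DeletePredicate {
    expr: String,
}

/// Hypothetical stand-in for a chunk with its own list of shared predicates.
#[derive(Debug, Default)]
struct Chunk {
    delete_predicates: Vec<Arc<DeletePredicate>>,
}

/// Adds one predicate to every chunk. Only the `Arc` pointer is cloned, so the
/// predicate itself is stored once, no matter how many chunks reference it.
fn add_to_all(chunks: &mut [Chunk], predicate: Arc<DeletePredicate>) {
    for chunk in chunks.iter_mut() {
        chunk.delete_predicates.push(Arc::clone(&predicate));
    }
}
```

Because every chunk owns its own vector, chunks are also free to end up with different predicate sets later, which is the situation point 2 anticipates.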
Andrew Lamb ce224bd37f
fix: Capture query execution traces for storage gRPC queries as well (#2553)
* fix: Capture query execution traces for storage gRPC queries as well

* refactor: remove debugging droppings

* refactor: do not Box::pin within TracedStream

* refactor: Use Futures::TryStreamExt rather than custom collect function

* fix: remove wild println

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-16 14:45:20 +00:00
kodiakhq[bot] 33cd1cffad
Merge branch 'main' into ntran/delete_read 2021-09-16 13:22:50 +00:00
Marco Neumann bb17b4e2c2 test: even more time-related lifting
Lift a few `Utc::now()` calls further and narrow down checks in tests.
Also avoid a few `<` comparisons which might not always hold.
2021-09-16 11:05:10 +02:00
Marco Neumann 2820db5583 refactor: split preserved catalog `api` into `core` and `interface`
This makes it clearer which traits and functions users of the preserved
catalog must implement. This also splits the error types into smaller
enums that are easier to understand.

This change should make it easier to implement new functionality (like
capturing delete predicates).
2021-09-16 10:30:11 +02:00
Carol (Nichols || Goulding) 91fd32d506 fix: Reset restored db's state instead of restarting background worker 2021-09-15 19:04:05 -04:00
Nga Tran 61e1eac135 fix: fix the cases of multiple expressions in delete predicate 2021-09-15 17:00:21 -04:00
Carol (Nichols || Goulding) d70d94100e refactor: Extract a type alias 2021-09-15 17:00:09 -04:00
Carol (Nichols || Goulding) 81feced9d6 fix: Restart database background worker when it's restored 2021-09-15 16:59:49 -04:00
Carol (Nichols || Goulding) 7c81c280cf fix: Shut down a database when it's deleted 2021-09-15 16:59:49 -04:00
Carol (Nichols || Goulding) 7b6d8f9327 feat: Add an API for restoring a database that was marked deleted 2021-09-15 16:59:37 -04:00
Raphael Taylor-Davies c66095cad1
feat: remove metrics crate (#2552) 2021-09-15 19:43:33 +00:00
Raphael Taylor-Davies 6e7fa3e574
feat: migrate http ingest metrics (#2542)
* feat: migrate http ingest metrics

* chore: review feedback

* refactor: RAII entry ingest recorder
2021-09-15 19:01:10 +00:00
Nga Tran 7175488133 chore: add some comments 2021-09-15 14:45:04 -04:00
Nga Tran 3486cc8b38 fix: should not send an empty delete predicate which means delete everything (no time range) 2021-09-15 14:14:26 -04:00
Raphael Taylor-Davies 1ea4335ff3
fix: report correct DatabaseStateCode (#2543)
* fix: report correct DatabaseStateCode

* chore: fix lint

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-15 18:02:54 +00:00
Raphael Taylor-Davies 6f2301e16c
feat: migrate write buffer metrics (#2536)
* feat: migrate write buffer metrics

* feat: update server/src/write_buffer.rs

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-15 17:41:55 +00:00
Raphael Taylor-Davies 27b0c46f79
feat: shutdown WriteBufferConsumer on drop (#2533)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-15 11:48:56 +00:00
Marco Neumann 86031b03dc refactor: make time_of_write a parameter
Instead of depending on the `chrono` clock implicitly via `Utc::now()`
we should make it an explicit parameter. This is essential for testing:

1. We have many tests guessing around this value by taking `Utc::now()`
   directly before or after the write (this commit doesn't fix that, but
   allows us to fix it in a follow-up).
2. We have some tests that ignore the `time_of_write` values in some
   comparisons because they cannot control that value (fix not included
   here but left as a follow-up).
3. The upcoming compression (#2528) needs to control timestamps within
   the compressed payload (and `time_of_write` is embedded in the
   parquet metadata) because the compressed size depends on it (even if
   the uncompressed size is stable).

In general I argue that a "clock" is always data and should be passed
(either as a value or as a "now"-function) from the API layer. Hidden
clock checks just make mocking and tests a nightmare (we've seen this w/
replay tests as well).
2021-09-15 11:39:15 +02:00
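
The "clock is data" argument in the commit above can be sketched like this. The example uses `std::time::SystemTime` instead of `chrono` to stay dependency-free, and `WriteSummary`, `record_write`, `record_write_with` and `fixed_time` are hypothetical names chosen for illustration only.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical record of a write; only the timestamp matters for this sketch.
#[derive(Debug, PartialEq)]
struct WriteSummary {
    time_of_write: SystemTime,
}

/// Variant 1: the caller passes the time as a value.
fn record_write(time_of_write: SystemTime) -> WriteSummary {
    WriteSummary { time_of_write }
}

/// Variant 2: the caller passes a "now"-function, which is invoked once.
fn record_write_with(now: impl Fn() -> SystemTime) -> WriteSummary {
    WriteSummary { time_of_write: now() }
}

/// Test helper: a fixed point in time, `secs` seconds after the Unix epoch.
fn fixed_time(secs: u64) -> SystemTime {
    UNIX_EPOCH + Duration::from_secs(secs)
}
```

Production code would pass `SystemTime::now()` (or `SystemTime::now` as the function), while a test passes `fixed_time(42)` and can then assert on the exact value.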
Nga Tran 63cc7b3fb0 test: more tests to discover what still need to be done 2021-09-14 17:57:30 -04:00
Nga Tran f4f140d3b7 chore: merge main to branch 2021-09-14 13:25:32 -04:00
Raphael Taylor-Davies 939c9ca038
fix: reset Database background worker on deletion (#2530)
* fix: reset Database background worker on deletion

* chore: update server/src/database.rs

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: add database delete restore test

* chore: fix logical conflicts

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-14 16:57:33 +00:00
kodiakhq[bot] d60aa5940b
Merge branch 'main' into crepererum/chunk_order_type 2021-09-14 16:25:17 +00:00
Raphael Taylor-Davies c33e5c22e6
feat: pull WriteBuffer consumer out of Db and onto Database (#2243) (#2525)
* feat: pull WriteBuffer consumer out of Db and onto Database (#2243)

* chore: restore WritingOnlyAllowedThroughWriteBuffer error

* refactor: remove WriteBufferConfig

* chore: fix docs

* chore: move WriteBufferConsumer tests out of db.rs

* chore: document WriteBufferFactory member functions

* chore: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-14 16:04:58 +00:00
Marco Neumann bfaba78dc3 refactor: move `predicate` into its own crate
Two reasons:

1. I wanna decouple `parquet_file` from `query` (nearly done, needs a
   small follow-up PR).
2. `predicate` will have more and more features (like serialization)
   which justifies a new home
2021-09-14 17:13:02 +02:00
Marco Neumann becef1c75f refactor: introduce `ChunkOrder` type 2021-09-14 17:10:23 +02:00
Marco Neumann aaeb67ae5d refactor: make chunk iterations sorted by `order, ID` 2021-09-14 13:00:55 +02:00
Marco Neumann 804790711b refactor: isolate `sort_chunks` 2021-09-14 13:00:55 +02:00
Marco Neumann c28f38309a docs: improve chunk ordering docs 2021-09-14 13:00:55 +02:00
Marco Neumann 96618af6a2 fix: respect chunk order when invoking lifecycle actions 2021-09-14 13:00:55 +02:00
Marco Neumann 1b788732da fix: order chunks correctly during query processing
The query processing was implicitly relying on the order provided by the
catalog. This had two issues:

- this ordering was not defined in the API contract (neither via docs
  nor via typing)
- the order was based on chunk IDs which is not adequate in some cases
  (e.g. when chunks are created while a persistence operation is in
  progress)

Now we explicitly sort chunks by `(order, ID)`.

Fixes #1963.
2021-09-14 13:00:55 +02:00
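
Sorting by `(order, ID)` as described in the commit above could look like the following sketch. `ChunkMeta` with two `u32` fields is a simplified, hypothetical stand-in for the real chunk types.

```rust
/// Hypothetical, simplified chunk metadata.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ChunkMeta {
    order: u32,
    id: u32,
}

/// Sorts chunks explicitly by `(order, id)` rather than relying on whatever
/// order the catalog happens to return them in.
fn sort_chunks(chunks: &mut [ChunkMeta]) {
    chunks.sort_by_key(|c| (c.order, c.id));
}
```

The tuple key compares `order` first and falls back to `id` only for chunks with the same order.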
Marco Neumann 8a531be05b feat: expose chunk order via API and in system table 2021-09-14 13:00:55 +02:00
Marco Neumann 45cb00d8c0 refactor: track chunk order in chunks 2021-09-14 13:00:55 +02:00
Marco Neumann 3f2e46c397 feat: prune old transactions from preserved catalog 2021-09-14 12:08:17 +02:00
Nga Tran 042a78e5a7 feat: apply delete predicate during query to eliminate deleted data 2021-09-13 18:02:55 -04:00
Andrew Lamb 5eef76c868
chore: Update dependencies (including datafusion) (#2521)
* chore: Update datafusion deps to pre-release

* refactor: Update IOx to use new datafusion Statistics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 21:30:44 +00:00
Raphael Taylor-Davies f3bcafcfea
feat: migrate http metrics to metric crate (#2508)
* feat: migrate http metrics

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 18:56:20 +00:00
kodiakhq[bot] e76d70ea36
Merge branch 'main' into ntran/delete_pred_chunks 2021-09-13 16:15:57 +00:00
Nga Tran 40499b222e chore: merge main to branch 2021-09-13 12:15:16 -04:00
Nga Tran 8292c4d2e4 refactor: address review comments 2021-09-13 11:44:18 -04:00
Jake Goulding 0b6e577da5 fix: Return same error when querying deleted vs uncreated database
Closes #2446
2021-09-13 11:43:07 -04:00
Raphael Taylor-Davies 574149d644
feat: migrate remaining catalog metrics to new crate (#2490)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-13 14:42:14 +00:00
Raphael Taylor-Davies 20143e4f4e
feat: migrate chunk pruning metrics (#2516) 2021-09-13 13:13:47 +00:00
Nga Tran 3798ca09bb feat: save delete predicates in chunks 2021-09-10 17:16:18 -04:00
Raphael Taylor-Davies b8f7319704
feat: migrate read buffer metrics to metric crate (#2510)
* feat: migrate read buffer metrics to metric crate

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-10 19:51:43 +00:00
kodiakhq[bot] f6c0d94991
Merge branch 'main' into crepererum/rust_155 2021-09-10 10:59:59 +00:00
Andrew Lamb ec63321bb0
feat: Fewer errors on update_database_rules (#2433)
* fix: serialize concurrent database rules updates

* fix: second attempt

* docs: Apply suggestions from code review

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-09-10 10:46:26 +00:00
Marco Neumann 368f0369ee chore: Rust 1.55 2021-09-10 12:36:49 +02:00
Raphael Taylor-Davies eed81e752d
feat: remove deprecated catalog metrics (#2489)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-10 10:12:04 +00:00
kodiakhq[bot] faa05f394b
Merge branch 'main' into ntran/parse_delete_2 2021-09-09 18:28:39 +00:00
Raphael Taylor-Davies 44918e4afc
feat: migrate chunk metrics (#2491)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-09 16:02:16 +00:00
kodiakhq[bot] 76271a141a
Merge branch 'main' into crepererum/remove_process_clock 2021-09-09 15:08:40 +00:00
Marco Neumann 4d6ec4bfe6 refactor: remove process clock
The process clock is a leftover from the pre-Kafka write buffer design
and is no longer required.
2021-09-09 16:55:48 +02:00
kodiakhq[bot] a9e2ed4c14
Merge branch 'main' into crepererum/fix_job_metrics 2021-09-09 14:53:00 +00:00
Raphael Taylor-Davies 3cee899f77
feat: migrate catalog timestamp summary to `metric` crate (#2486)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-09 14:52:25 +00:00
Marco Neumann a5d4d954fb fix: increase job duration histogram range
The default upper limit of 10s is too tight for many jobs. This now
increases the histogram range to 5000s (no joke, we've seen jobs w/ over
40min run time, even though that shouldn't happen).
2021-09-09 16:48:21 +02:00
Marco Neumann 40d3f53aee feat: add DB and table name to job metrics 2021-09-09 16:37:44 +02:00
Marco Neumann 0a31f5f2e5 fix: fix job metrics naming
For duration histograms, the exporter takes care of the correct suffix
depending on the resolution used by it. For example the prometheus
exporter will use a `..._seconds` metric to encode the histogram. The
IOx internal metric should therefore NOT append any resolution. This
then removes the `_nanoseconds` suffix, renaming the externally visible
metric from

```text
influxdb_iox_job_completed_{cpu,wall}_nanoseconds_seconds
```

to

```text
influxdb_iox_job_completed_{cpu,wall}_seconds
```
2021-09-09 16:37:44 +02:00
Raphael Taylor-Davies 9de12745e7
feat: migrate lock metrics to metric crate (#2481)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-09 08:56:19 +00:00
Nga Tran 00df7b064c feat: finally have the delete predicate parsed 2021-09-08 17:30:10 -04:00
Marco Neumann 801cf08be7 feat: auto-creation of sequencers by write buffer
For Kafka, that basically means that we create a topic if it doesn't
exist yet.

Closed #2455.
Fixes #2189.
2021-09-07 18:24:57 +02:00
Marco Neumann d5662328b0 refactor: `n_sequencers` should be non-zero 2021-09-07 18:18:20 +02:00
Nga Tran dbe4bcff22 chore: merge main to branch 2021-09-07 10:54:59 -04:00
Marco Neumann 31cbb646b9 feat: skip individual rows during replay based on timestamp 2021-09-07 11:44:52 +02:00
Marco Neumann fe0df2ab0c fix: job metric race condition 2021-09-06 14:33:59 +02:00
Marco Neumann 998bafcd85 fix: typo
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2021-09-06 13:39:22 +02:00
Marco Neumann 77287ad228 feat: rework job metrics to be push-based, add wall/cpu time histograms 2021-09-06 13:39:22 +02:00
Marco Neumann e6f12f965c feat: expose job metrics
Closes #2416.
2021-09-06 13:39:22 +02:00
kodiakhq[bot] f6e040df3d
Merge branch 'main' into dependabot/cargo/tokio-1.11.0 2021-09-06 09:18:55 +00:00
Raphael Taylor-Davies a4b0cbc0e7
feat: migrate jemalloc metrics to `metric` crate (#2435)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-06 09:18:27 +00:00
dependabot[bot] b67610d9b9
chore(deps): bump tokio from 1.10.1 to 1.11.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.10.1 to 1.11.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.10.1...tokio-1.11.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 09:11:38 +00:00
Nga Tran 9de3b79a90 refactor: more cleanup 2021-09-06 01:45:47 -04:00
Nga Tran de0bd80c3d refactor: cleanup 2021-09-06 01:07:07 -04:00
Nga Tran 4801b2c238 feat: Have the ParseDelete message and its corresponding ProvidedParseDelete struct ready for building delete parser 2021-09-06 00:13:59 -04:00
dependabot[bot] b1bb390893
chore(deps): bump parking_lot from 0.11.1 to 0.11.2
Bumps [parking_lot](https://github.com/Amanieu/parking_lot) from 0.11.1 to 0.11.2.
- [Release notes](https://github.com/Amanieu/parking_lot/releases)
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Amanieu/parking_lot/compare/0.11.1...0.11.2)

---
updated-dependencies:
- dependency-name: parking_lot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-06 01:18:24 +00:00
kodiakhq[bot] 2d41fd519f
Merge branch 'main' into cn/list-soft-deleted 2021-09-03 15:16:32 +00:00
Marco Neumann 3c968ac092 feat: correctly account MUB sizes
Fixes #1565.
2021-09-03 09:15:49 +02:00
Nga Tran a85d95d2e9 refactor: cleanup 2021-09-02 17:25:41 -04:00
Nga Tran e2274a9f41 feat: parser for delete predicate 2021-09-02 17:02:05 -04:00
Carol (Nichols || Goulding) ce6030a3cb feat: Wire list deleted databases through gRPC and CLI APIs 2021-09-02 15:48:07 -04:00
Marco Neumann ecf1f99ddb refactor: more flexible write buffer config
This allows:

- different types (instead of guessing through the connection URL)
- sequencer counts (not used yet but will be by #2455)
- extensible configs (e.g. to configure Kafka in a more granular way,
  not wired up yet)
- future extensions (since we use a message now instead of a single
  string)

**BREAKING: This requires changes for deployed systems / existing DBs!**
2021-09-02 16:41:35 +02:00
kodiakhq[bot] b3d04b3e26
Merge branch 'main' into cn/server-startup-delete 2021-09-01 13:32:34 +00:00
Carol (Nichols || Goulding) c89ad70d07
test: Ensure we deleted some tombstone file
To guard against forgetting to change this test if we change the tombstone file.

Co-authored-by: Marco Neumann <marco@crepererum.net>
2021-09-01 09:04:25 -04:00
kodiakhq[bot] e183ecb3e7
Merge branch 'main' into cn/list-but-not-deleted-databases 2021-09-01 13:03:09 +00:00
Marco Neumann 06833110ab test: allow creation of less complex parquet chunks 2021-09-01 11:26:05 +02:00
Nga Tran a4183de411 feat: more progress on the delete flow from grpc API to catalog chunks 2021-08-31 17:42:07 -04:00
Marco Neumann 79ad48ac3a chore: rename "labels" to "attributes" 2021-08-31 11:31:15 +02:00
Nga Tran f962d0ef2e feat: first step to add delete_predicate into chunk catalog 2021-08-30 17:16:08 -04:00
Carol (Nichols || Goulding) 396bc6a3ad test: Database startup error when there are multiple active generations
Fixes #2196.
2021-08-30 15:49:12 -04:00
Carol (Nichols || Goulding) c4693a08a5 fix: Remove an unnecessary clone 2021-08-30 14:14:23 -04:00
Carol (Nichols || Goulding) e67624dd37 fix: Assert on which error getting a deleted database returns 2021-08-30 11:29:25 -04:00
Carol (Nichols || Goulding) 442a26bb99 fix: Remove some unneeded snafu-related allocations 2021-08-30 10:49:20 -04:00
Carol (Nichols || Goulding) 01103002f4 fix: Return an error if we can't find an iox object store to write a tombstone file in 2021-08-30 10:42:46 -04:00
Carol (Nichols || Goulding) d688678464 feat: Add an iox_object_store API for writing the tombstone file
Connects to #1871.
2021-08-30 10:42:45 -04:00
Marco Neumann 96b0026203 fix: make "persist partition" a bit more stable
- add longer wait times to tests
- exclude chunks that have active lifecycle actions early (instead of
  failing the whole set)
- properly catch the "no chunks" case

Fixes #2434.
2021-08-30 13:11:12 +02:00
Andrew Lamb 2f49e47a23
feat: return DatabaseRules for ListDatabases request (#2431) 2021-08-28 10:53:24 +00:00
Andrew Lamb 779b027271
feat: Store only the database rules sent by the client (do not store default values) (#2430)
* feat: add omit_default to protobuf definition

* feat: Persist only the client provided rules

* fix: Remove race conditions

* fix: merge conflict

* refactor: do not use macro

* refactor: restore use of glob import

* fix: review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-28 10:26:32 +00:00
Nga Tran 499af57299 chore: merge main to branch and resolve conflicts 2021-08-27 17:51:07 -04:00
Nga Tran b79eaa34d1 refactor: address review comments 2021-08-27 15:53:27 -04:00
kodiakhq[bot] 400ee89e70
Merge branch 'main' into crepererum/refactor_catalog_crate 2021-08-27 14:16:14 +00:00
Marco Neumann a2efe3299d refactor: restructure catalog code in `parquet_file`
No functional change (except for slightly changing error messages). This
will make it easier to add more functionality.
2021-08-27 15:06:31 +02:00
Raphael Taylor-Davies fcec394a28
feat: connect up new metrics (#2428) 2021-08-27 12:55:35 +00:00
Edd Robinson 6c7f8d6630 feat: add delete to crate Read Buffer API 2021-08-27 12:30:20 +01:00
Nga Tran bcd39e225c feat: Management API for delete 2021-08-26 17:31:21 -04:00
Raphael Taylor-Davies e3e801d29a
feat: propagate span context into storage RPC queries (#2407)
* feat: propagate span context into storage RPC queries

* refactor: create ExecutionContextProvider trait

* chore: cleanup imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
Carol (Nichols || Goulding) 7cf7fb02ed refactor: Rename database ObjectStore state types to DatabaseObjectStore 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 6d0959fbc3 fix: Move IOx object store creation logic into Database state machine 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 199d212b18 refactor: Move find-or-create IoxObjectStore logic into tests
This is the only place this logic is used; it's not appropriate for
production usage as we only ever want to either find and error or create
and error in real life.
2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) c7eceac8a3 refactor: Have server determine database generation from object store 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 5e1b57de9a refactor: Borrow arcs instead of as_ref 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) cee2f21d47 feat: Add a find_or_create object store function for tests 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) 18ba3b5c59 feat: Create database directories with a generation ID 2021-08-26 09:14:22 -04:00
Marco Neumann 026202a05c fix: correctly account for parquet metadata size
We need to hold the parquet metadata in memory so that we're able to
create catalog checkpoints. We used to do that by holding the decoded
structure (provided by the upstream `parquet` crate) in memory and
serializing that data on demand to Apache Thrift.

There are two drawbacks:

1. We did not account for the memory usage of the decoded structures (or
   at least not fully).
2. We actually don't need the decoded data in-memory, since for the
   checkpoint creation we only need to write the serialized data.

So this PR changes our wrapper so it holds the serialized data which is
then only decoded when it's really necessary. Since the serialized data
is a simple byte vector, we can also easily account for the size.

Note that this makes the accounted size of parquet chunks larger.
However this data was always there, we just ignored it up until now. If
the size of the parquet metadata really becomes an issue, we could trade
some CPU time for memory by compressing it.
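
As a rough illustration of the approach described above (hold the serialized bytes, decode only on demand, account for size via the byte vector), here is a sketch. `SerializedMetadata`, its methods, and the caller-supplied `decoder` closure are hypothetical names for illustration, not the actual `parquet_file` wrapper or the `parquet` crate API.

```rust
use std::mem::size_of;

/// Hypothetical wrapper that keeps parquet metadata in its serialized
/// (e.g. Thrift-encoded) form instead of the decoded structure.
struct SerializedMetadata {
    data: Vec<u8>,
}

impl SerializedMetadata {
    fn new(data: Vec<u8>) -> Self {
        Self { data }
    }

    /// Memory accounting is straightforward: the struct itself plus
    /// the allocated capacity of the byte vector.
    fn size(&self) -> usize {
        size_of::<Self>() + self.data.capacity()
    }

    /// Checkpoint creation only needs the serialized bytes as-is.
    fn as_bytes(&self) -> &[u8] {
        &self.data
    }

    /// Decode only when really necessary; the decoder is passed in by
    /// the caller so this sketch stays independent of any real format.
    fn decode<T, E>(&self, decoder: impl FnOnce(&[u8]) -> Result<T, E>) -> Result<T, E> {
        decoder(&self.data)
    }
}
```

The decoded structure is never cached in this sketch, so each `decode` call pays the CPU cost again in exchange for lower resident memory.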
2021-08-26 13:24:32 +02:00
kodiakhq[bot] b1ecf1bfed
Merge branch 'main' into crepererum/job_start_time_in_system_table 2021-08-26 08:04:10 +00:00
Andrew Lamb ddf6c6362e
chore: update DataFusion again (#2411)
* chore: update datafusion ref

* chore: run cargo update

* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
Marco Neumann 558aa54aa3 feat: add start time to `operations` system table 2021-08-26 10:00:29 +02:00
Edd Robinson 69329b0b38
Merge branch 'main' into er/refactor/read_buffer/rle_entries 2021-08-25 12:08:44 +01:00
Edd Robinson 11e88877f4 fix: correct size estimation of RLE encoding 2021-08-25 12:03:04 +01:00
Edd Robinson f3c57c47fa
Merge branch 'main' into er/refactor/read_buffer/table_arg 2021-08-25 10:30:12 +01:00
kodiakhq[bot] c98723e3b3
Merge branch 'main' into crepererum/rub_shrink_rle 2021-08-25 08:58:22 +00:00
Marco Neumann 2ad9843e5f feat: make `RLE` a bit smaller by capacity-based allocation
For some demo data this reduced the overall chunk size from

195049367 bytes
to
191088095 bytes
2021-08-25 10:22:43 +02:00
kodiakhq[bot] 5d97acb2f3
Merge branch 'main' into crepererum/issue2372 2021-08-25 07:08:15 +00:00
Edd Robinson 5648817285 refactor: remove redundant argument 2021-08-24 22:26:17 +01:00
Raphael Taylor-Davies f7792aafe6
feat: query tracing (#2273) (#2391)
* feat: query tracing (#2273)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 17:35:59 +00:00
Marco Neumann 363d202202 feat: stop application executor in one dedicated place 2021-08-24 14:46:36 +02:00
Raphael Taylor-Davies a6c9cc2bf2
refactor: rework exec module (#2384)
* refactor: rework exec module

* chore: update docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-24 08:39:54 +00:00
Andrew Lamb 35cf560c9f
fix: do not error if partition has no chunks (#2383)
* fix: do not error if partition has no chunks

* fix: do not panic

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 17:33:54 +00:00
Raphael Taylor-Davies 0946ffe916
refactor: reuse IOxExecutionContext (#2373)
* refactor: reuse IOxExecutionContext

* fix: orphaned comment

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-23 15:47:15 +00:00
kodiakhq[bot] ec0152714e
Merge branch 'main' into catalog-test-determinism 2021-08-19 17:53:04 +00:00
Raphael Taylor-Davies b0e8b75a8a fix: TestCatalogState unique chunk ID 2021-08-19 17:19:12 +01:00
kodiakhq[bot] 47431148d5
Merge branch 'main' into er/refactor/read_buffer/bitmap_size 2021-08-18 21:20:13 +00:00
Raphael Taylor-Davies e81b82c0a4
feat: split db worker loop (#2337)
* feat: split db worker loop

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-18 17:33:13 +00:00
Carol (Nichols || Goulding) 61263c8774 feat: Add a debugging-suitable way to get the object storage path of a database 2021-08-18 11:32:39 -04:00