Commit Graph

277 Commits (f75bd0771ad0a9859fab5f513e5a06090a95632b)

Author SHA1 Message Date
Marco Neumann f75bd0771a refactor: sort file list in build script 2021-10-21 10:09:29 +02:00
kodiakhq[bot] f4eb3884f3
Merge branch 'main' into cn/write-db-owner-file 2021-10-20 13:51:45 +00:00
Marco Neumann 2079fb689e chore: remove unused `generated_types` => `thiserror` dep 2021-10-19 14:45:56 +02:00
Carol (Nichols || Goulding) 10912c2b97 feat: Write owner info in the database's object store directory
Use this as an extra check that the database thinks the current server
is its owner when a server tries to initialize and own a database.

If the owner info doesn't match the current server, which could happen
if a server's config was updated but the database's owner info wasn't,
put the database in an error state that requires operator intervention.
2021-10-18 21:13:20 -04:00
Carol (Nichols || Goulding) d5ab29711e fix: Serialize relative db object store paths to the server config
So that they can be deserialized, without parsing, to create a new
iox object store from the location listed in the server config.

Notably, the locations serialized don't start with the object storage's
prefix like "s3:" or "file:". The location is the same object storage as
the server configuration that was just read from object storage. Having
the server config on one type of object storage and the database files
on another type is not supported.
2021-10-18 08:37:36 -04:00
Carol (Nichols || Goulding) 26484309e0 fix: Re-export prost errors instead of wrapping them 2021-10-15 13:44:53 -04:00
Carol (Nichols || Goulding) def65cfea0 docs: Add examples of the expected database locations 2021-10-15 11:10:10 -04:00
Carol (Nichols || Goulding) 2253a7ba62 fix: Use a map in the server config protobuf 2021-10-15 10:52:59 -04:00
Carol (Nichols || Goulding) afd6e826e5 feat: Write out server config files listing database name and locations 2021-10-15 09:46:20 -04:00
Carol (Nichols || Goulding) 5348c9e503 refactor: Move ProstError to root of generated_types to be useful elsewhere 2021-10-15 09:46:20 -04:00
Raphael Taylor-Davies 074ae40382
feat: migrate entry to use bytes::Bytes (#2842)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-14 10:55:51 +00:00
Raphael Taylor-Davies 8a82f92c5d
refactor: add TimeProvider abstraction (#2722) (#2815)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-12 21:19:03 +00:00
Raphael Taylor-Davies 3dfe400e6b
feat: migrate write path to TimeProvider (#2722) (#2807) 2021-10-12 12:09:08 +00:00
Marco Neumann c4a2641764 refactor: remove `time_closed`
The "time closed" is a leftover from an old lifecycle system, where
chunks moved through the system (open=>closed=>persisted) without being
merged. Now we have the compaction as well as the split query for
persistence that can merge chunks, so a single "time closed" doesn't
make sense any longer. So in fact it is `None` for many chunks and is
also not persisted. Also the current lifecycle policy doesn't use this
value. So let's just remove it.

Closes #1846.
2021-10-11 15:49:34 +02:00
dependabot[bot] 1327735537
chore(deps): bump thiserror from 1.0.29 to 1.0.30
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.29 to 1.0.30.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.29...1.0.30)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-11 08:01:21 +00:00
Marco Neumann d3de6bb6e4 refactor: `max_persisted_timestamp` => `flush_timestamp`
There might be data left before this timestamp that wasn't persisted
(e.g. incoming data while the persistence was running).
2021-10-08 12:36:23 +02:00
Marco Neumann 63a932fa37 refactor: "min unpersisted ts" => "max persisted ts"
Store the "maximum persisted timestamp" instead of the "minimum
unpersisted timestamp". This avoids the need to calculate the next
timestamp from the current one (which was done via "max TS + 1ns").

The old calculation was prone to overflow panics. Since the
timestamps in this calculation originate from user-provided data (and
not the wall clock), this was an easy DoS vector that could be triggered
via the following line protocol:

```text
table_1 foo=1 <i64::MAX>
```

which is

```text
table_1 foo=1 9223372036854775807
```

Bonus points: the timestamp persisted in the partition
checkpoints is now the very same that was used by the split query during
persistence. Consistence FTW!

Fixes #2225.
2021-10-08 11:52:49 +02:00
Carol (Nichols || Goulding) 15b396c720 fix: Set a UUID if none was read from object storage on server startup
This enables a smoother transition to this state when this gets
deployed.
2021-10-07 10:19:17 -04:00
Carol (Nichols || Goulding) 169f2499d3 fix: Use bytes for UUID in the protobuf instead of string 2021-10-07 10:19:17 -04:00
Carol (Nichols || Goulding) ae7d893199 feat: Add UUID to databases
Connects to #2675.

When a database is created, assign it a UUID and serialize the UUID to
object storage by wrapping the database rules in a new
`PersistedDatabaseRules` type that also contains the UUID.

All APIs to the end user involving rules should continue using only
`DatabaseRules` so the UUID is an internal implementation detail.
2021-10-07 10:19:14 -04:00
Carol (Nichols || Goulding) 27e7a1f925 refactor: Organize use statements 2021-10-07 10:17:19 -04:00
Marco Neumann 63d74be490 refactor: make `ChunkId` a UUID 2021-10-07 10:23:27 +02:00
Nga Tran eebbb9a165 chore: clea up dead code 2021-10-05 13:08:40 -04:00
Raphael Taylor-Davies 2a584420b3
refactor: make data_types optional dependency (#2739)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-10-05 15:07:45 +00:00
Nga Tran 825ddaa34c chore: merge main to branch 2021-10-05 10:12:17 -04:00
Marco Neumann b8aa4c33ce refactor: use protobuf bytes for transaction UUIDs 2021-10-05 12:27:48 +02:00
Marco Neumann 10c1a72402 refactor: remove unused fields from `DeletePredicate` 2021-10-05 09:29:24 +02:00
Nga Tran ff8e037971 feat: handle delete response 2021-10-04 20:37:53 -04:00
Marco Neumann 97881079e8 refactor: make `ChunkOrder` non-zero
This will make it easier to handle missing values.

Helps with #2633.
2021-10-04 17:49:12 +02:00
Marco Neumann a0bbbdb197 refactor: use dedicated in-memory `Expr` type during IO
Place a dedicated `Expr` type between datafusion and protobuf. While
this type is currently only used during serialization, it will be used
within `Predicate` in a follow-up change to enable de-duplication of
predicates and avoid the double parsing of datafusion expressions (we
are already somewhat parsing them when we check for valid delete
predicates).

This change looks larger than it is. In practice it just separates the
conversion "datafusion => protobuf" into "datafusion => IOx => protobuf"
and "protobuf => datafusion" into "protobuf => IOx => datafusion". So
(apart from the error types) this is functionally the same.
2021-10-04 09:59:14 +02:00
Raphael Taylor-Davies b402423e9e
feat: remove move lifecycle action (#2674)
* feat: remove move_chunk lifecycle action

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-30 16:58:05 +00:00
Raphael Taylor-Davies db22b4bf60
chore: expand protobuf documentation (#2659)
* chore: expand protobuf documentation

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-29 21:22:03 +00:00
Andrew Lamb 451135ac31
fix: remove unused imports in proto (#2654)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-29 08:42:32 +00:00
Raphael Taylor-Davies 86cee568d5
feat: use upstream pbjson (#2650)
* feat: use upstream pbjson

* chore: fmt
2021-09-28 16:29:26 +00:00
Marco Neumann 24c1563535 chore: remove unused `generated_types` => `futures` dep 2021-09-22 11:34:36 +02:00
Marco Neumann 26edaa4d42 chore: `generated_types` uses `num_cpus` only for tests 2021-09-22 11:31:22 +02:00
Marco Neumann 16a257e7b9 chore: remove unused `generated_types` => `serde_json` dep 2021-09-22 11:19:19 +02:00
Marco Neumann 6682178d6f feat: teach preserved catalog to handle delete predicates 2021-09-20 15:51:14 +02:00
Marco Neumann cef5aeee52 refactor: introduce `ChunkId` type 2021-09-20 13:10:41 +02:00
Raphael Taylor-Davies f62d0eab3c
feat: disable bytes serde (#2580)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-20 09:07:12 +00:00
Marco Neumann 9c80d32af5 refactor: use normal google timestamps in parquet metadata again
We changed from Google timestamp (which use variable-sized integers) to
our own fixed-sized integer timestamps so that the size of the parquet
metadata does not depend on the timestamp. However with the introduction
of compression this is the case anyways (since slightly different
timestamps lead to different compression results) and we need now
derministic timestamps for tests. So there is now point in using our own
timestamp type. Switching back to the variable-sized type also shrinks
the post-compression results a bit.
2021-09-20 09:34:03 +02:00
Marco Neumann afc507ae14 feat: compress encoded parquet metadata
Depending on the number of columns, this should safe between 60% and
75%.
2021-09-20 09:33:18 +02:00
Marco Neumann 7e804db0a3 fix: use btree map for some protobuf messages for deterministic outputs 2021-09-20 09:33:18 +02:00
Carol (Nichols || Goulding) 51a40b31bf feat: Add a --detailed option to the database list CLI
That will list both active and deleted databases with their generations.

Closes #2462.
2021-09-17 15:27:23 -04:00
Carol (Nichols || Goulding) 44a89cdf75 refactor: Change DeletedDatabase to DetailedDatabase
So this info can be reused for active databases in detailed database
lists.
2021-09-17 15:27:22 -04:00
kodiakhq[bot] 23cc980d9e
Merge branch 'main' into cn/restore 2021-09-17 17:52:56 +00:00
Raphael Taylor-Davies 37b615f301
feat: migrate operations CLI to use pbjson (#2562)
* feat: migrate operations CLI to use pbjson

* fix: reserve removed field

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-16 19:15:24 +00:00
Raphael Taylor-Davies 1d55d9a1b5
feat: add pbjson support (#2468)
* feat: add pbjson support

* chore: fix test
2021-09-16 07:33:27 +00:00
Carol (Nichols || Goulding) 91fd32d506 fix: Reset restored db's state instead of restarting background worker 2021-09-15 19:04:05 -04:00
Carol (Nichols || Goulding) 7b6d8f9327 feat: Add an API for restoring a database that was marked deleted 2021-09-15 16:59:37 -04:00