Commit Graph

216 Commits (c745a0a94cfd6669cef89477e7441d8b8c8bc619)

Author SHA1 Message Date
Raphael Taylor-Davies 7d070f8582
feat: remove influxdata.iox.write.v1.WriteService (#3044)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-05 17:51:01 +00:00
Raphael Taylor-Davies 06dfdc4af8
feat: simplify sharding (#3031)
* feat: simplify sharding

* chore: fix lint
2021-11-04 16:57:19 +00:00
Marco Neumann 4840dc11cb feat: add basic CRUD operations for router configs 2021-11-04 12:03:14 +01:00
Carol (Nichols || Goulding) 62e7376394
fix: Restore database should only need UUID, not database name too 2021-11-03 15:01:26 -04:00
Raphael Taylor-Davies 08fcd87337
feat: use protobuf encoding in write buffer (#2724) (#3018) 2021-11-03 15:19:05 +00:00
Raphael Taylor-Davies 39a7f3069f
feat: add alternative string encodings to pbdata (#2724) (#3006)
* feat: add alternative string encodings to pbdata (#2724)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-02 18:14:34 +00:00
Marco Neumann 511621d3ca refactor: use stripped-down shard config for router
This only implements table-based sharding. We can add value-based
sharding later.
2021-11-02 11:12:20 +01:00
Marco Neumann 0d0c0cb42b refactor: move write buffer configs to new home
Write buffer configs will partially be shared by database and router
nodes, so lets move them into a shared home.
2021-11-02 10:17:01 +01:00
Marco Neumann 6644048eec refactor: remove partitioning from router config
Routers are not live yet so this ain't a breaking change. Also w/ write
buffer format rework that is going to be finished soon we don't need
this.
2021-11-02 10:14:00 +01:00
Marco Neumann 4c9570b519 refactor: move `catalog` protobuf to `preserved_catalog`
This makes it clearer what's going since the contained messages are
only for the preserved part, not the in-mem catalog and its management.
2021-11-01 18:07:25 +01:00
Marco Neumann 011af2d6ba feat: add `DeploymentService`
Ref #2980.
2021-11-01 17:55:40 +01:00
kodiakhq[bot] 8c3446ac87
Merge branch 'main' into crepererum/issue2980b 2021-11-01 16:39:32 +00:00
Marco Neumann c6c07879ee docs: explain deprecation 2021-11-01 11:38:46 +01:00
Marco Neumann bcd66c555a feat: add `RemoteService`
Ref #2980.
2021-11-01 11:38:46 +01:00
Marco Neumann 83e4514a43 feat: add `DeleteService`
Ref #2980.
2021-10-29 11:57:31 +02:00
Carol (Nichols || Goulding) 0bccd35f8c
fix: List details about databases only prints active dbs and their UUIDs
We can no longer list deleted databases because the server no longer
knows about them, and we now have UUIDs that are useful to know about,
so change the detailed listing of databases to return UUID instead of
possible deleted at time.
2021-10-28 13:20:29 -04:00
Carol (Nichols || Goulding) ce55cf401c
feat: Return UUID when creating a database
Seems polite and is useful for some test setup
2021-10-28 13:20:29 -04:00
Carol (Nichols || Goulding) 4077d3a20a
fix: Take a UUID when restoring a database 2021-10-28 13:20:29 -04:00
Carol (Nichols || Goulding) 95ba84edca
fix: Return UUID when a database is deleted
So that it can be immediately used to restore elsewhere.
2021-10-28 13:20:29 -04:00
Carol (Nichols || Goulding) f160712f5e
fix: Remove the concept of generations
From iox_object_store and everywhere that fails to compile as a result.

Also make path handling in iox_object_store a bit more consistent.
2021-10-28 13:20:28 -04:00
kodiakhq[bot] c23037fb33
Merge branch 'main' into crepererum/router_draft 2021-10-25 15:23:00 +00:00
Carol (Nichols || Goulding) 948cf92aaa fix: Remove the ability to list deleted databases
Once we're relying on the server config file to know about databases,
IOx shouldn't be listing files in object storage to try and find deleted
databases.

When the disown API is implemented for the floating database design, it
will return the UUID of the database just disowned, so that it can
immediately be used in an adopt API call.
2021-10-22 14:50:06 -04:00
Marco Neumann 7db09316f1 refactor: move router into its own service 2021-10-21 10:35:09 +02:00
Marco Neumann c27f096377 feat: rework router config 2021-10-21 10:09:29 +02:00
Marco Neumann 4f78981958 refactor: factor out partition cfg into its own proto file 2021-10-21 10:09:29 +02:00
Marco Neumann 1fc00095f5 refactor: factor out write buffer cfg into its own proto file 2021-10-21 10:09:29 +02:00
Carol (Nichols || Goulding) 10912c2b97 feat: Write owner info in the database's object store directory
Use this as an extra check that the database thinks the current server
is its owner when a server tries to initialize and own a database.

If the owner info doesn't match the current server, which could happen
if a server's config was updated but the database's owner info wasn't,
put the database in an error state that requires operator intervention.
2021-10-18 21:13:20 -04:00
Carol (Nichols || Goulding) d5ab29711e fix: Serialize relative db object store paths to the server config
So that they can be deserialized, without parsing, to create a new
iox object store from the location listed in the server config.

Notably, the locations serialized don't start with the object storage's
prefix like "s3:" or "file:". The location is the same object storage as
the server configuration that was just read from object storage. Having
the server config on one type of object storage and the database files
on another type is not supported.
2021-10-18 08:37:36 -04:00
Carol (Nichols || Goulding) def65cfea0 docs: Add examples of the expected database locations 2021-10-15 11:10:10 -04:00
Carol (Nichols || Goulding) 2253a7ba62 fix: Use a map in the server config protobuf 2021-10-15 10:52:59 -04:00
Carol (Nichols || Goulding) afd6e826e5 feat: Write out server config files listing database name and locations 2021-10-15 09:46:20 -04:00
Marco Neumann c4a2641764 refactor: remove `time_closed`
The "time closed" is a leftover from an old lifecycle system, where
chunks moved through the system (open=>closed=>persisted) without being
merged. Now we have the compaction as well as the split query for
persistence that can merge chunks, so a single "time closed" doesn't
make sense any longer. So in fact it is `None` for many chunks and is
also not persisted. Also the current lifecycle policy doesn't use this
value. So let's just remove it.

Closes #1846.
2021-10-11 15:49:34 +02:00
Marco Neumann d3de6bb6e4 refactor: `max_persisted_timestamp` => `flush_timestamp`
There might be data left before this timestamp that wasn't persisted
(e.g. incoming data while the persistence was running).
2021-10-08 12:36:23 +02:00
Marco Neumann 63a932fa37 refactor: "min unpersisted ts" => "max persisted ts"
Store the "maximum persisted timestamp" instead of the "minimum
unpersisted timestamp". This avoids the need to calculate the next
timestamp from the current one (which was done via "max TS + 1ns").

The old calculation was prone to overflow panics. Since the
timestamps in this calculation originate from user-provided data (and
not the wall clock), this was an easy DoS vector that could be triggered
via the following line protocol:

```text
table_1 foo=1 <i64::MAX>
```

which is

```text
table_1 foo=1 9223372036854775807
```

Bonus points: the timestamp persisted in the partition
checkpoints is now the very same that was used by the split query during
persistence. Consistence FTW!

Fixes #2225.
2021-10-08 11:52:49 +02:00
Carol (Nichols || Goulding) 169f2499d3 fix: Use bytes for UUID in the protobuf instead of string 2021-10-07 10:19:17 -04:00
Carol (Nichols || Goulding) ae7d893199 feat: Add UUID to databases
Connects to #2675.

When a database is created, assign it a UUID and serialize the UUID to
object storage by wrapping the database rules in a new
`PersistedDatabaseRules` type that also contains the UUID.

All APIs to the end user involving rules should continue using only
`DatabaseRules` so the UUID is an internal implementation detail.
2021-10-07 10:19:14 -04:00
Marco Neumann 63d74be490 refactor: make `ChunkId` a UUID 2021-10-07 10:23:27 +02:00
Nga Tran eebbb9a165 chore: clea up dead code 2021-10-05 13:08:40 -04:00
Nga Tran 825ddaa34c chore: merge main to branch 2021-10-05 10:12:17 -04:00
Marco Neumann b8aa4c33ce refactor: use protobuf bytes for transaction UUIDs 2021-10-05 12:27:48 +02:00
Marco Neumann 10c1a72402 refactor: remove unused fields from `DeletePredicate` 2021-10-05 09:29:24 +02:00
Nga Tran ff8e037971 feat: handle delete response 2021-10-04 20:37:53 -04:00
Marco Neumann a0bbbdb197 refactor: use dedicated in-memory `Expr` type during IO
Place a dedicated `Expr` type between datafusion and protobuf. While
this type is currently only used during serialization, it will be used
within `Predicate` in a follow-up change to enable de-duplication of
predicates and avoid the double parsing of datafusion expressions (we
are already somewhat parsing them when we check for valid delete
predicates).

This change looks larger than it is. In practice it just separates the
conversion "datafusion => protobuf" into "datafusion => IOx => protobuf"
and "protobuf => datafusion" into "protobuf => IOx => datafusion". So
(apart from the error types) this is functionally the same.
2021-10-04 09:59:14 +02:00
Raphael Taylor-Davies b402423e9e
feat: remove move lifecycle action (#2674)
* feat: remove move_chunk lifecycle action

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-30 16:58:05 +00:00
Raphael Taylor-Davies db22b4bf60
chore: expand protobuf documentation (#2659)
* chore: expand protobuf documentation

* chore: review feedback

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-29 21:22:03 +00:00
Andrew Lamb 451135ac31
fix: remove unused imports in proto (#2654)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-29 08:42:32 +00:00
Marco Neumann 6682178d6f feat: teach preserved catalog to handle delete predicates 2021-09-20 15:51:14 +02:00
Marco Neumann 9c80d32af5 refactor: use normal google timestamps in parquet metadata again
We changed from Google timestamp (which use variable-sized integers) to
our own fixed-sized integer timestamps so that the size of the parquet
metadata does not depend on the timestamp. However with the introduction
of compression this is the case anyways (since slightly different
timestamps lead to different compression results) and we need now
derministic timestamps for tests. So there is now point in using our own
timestamp type. Switching back to the variable-sized type also shrinks
the post-compression results a bit.
2021-09-20 09:34:03 +02:00
Marco Neumann afc507ae14 feat: compress encoded parquet metadata
Depending on the number of columns, this should safe between 60% and
75%.
2021-09-20 09:33:18 +02:00
Carol (Nichols || Goulding) 51a40b31bf feat: Add a --detailed option to the database list CLI
That will list both active and deleted databases with their generations.

Closes #2462.
2021-09-17 15:27:23 -04:00