Don't match prefixes as plain strings; compare whole path parts instead,
so that `foo/bar/` is a prefix of `foo/bar/x` but NOT of `foo/bar_baz/y`.
This also removes some heuristics during the cloud storage parsing that
assumed that file names always contain a dot but directories don't.
Technically we should now always be able to know whether a path points
to a file or a directory:
- Rust (manually constructed): we use `DirsAndFileName` which knows the
difference (i.e. if `file_name` is set)
- in-mem store: we also use `DirsAndFileName`
- file system: this was fixed by #1523
(ccd094dfcf and 464667d8b8)
- cloud: cloud storage doesn't know about directories, so all paths that
these APIs return and that end with a `/` are directories (can only occur
in `list_with_delimiter`); everything else is a file
Path string representations now act accordingly (i.e. they always end
with a `/` if they point to a directory).
Fixes #3226.
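As a rough illustration of the two rules above (part-wise prefix matching, and a trailing `/` for directories), here is a minimal sketch. `PartsPath` and its methods are hypothetical names made up for this example; this is not the actual `DirsAndFileName` code, only a simplified stand-in that assumes `/` as the delimiter.

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for a parts-based path type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct PartsPath {
    dirs: Vec<String>,
    file_name: Option<String>,
}

impl PartsPath {
    /// Parse a cloud-style key: a trailing `/` marks a directory,
    /// anything else is treated as a file.
    fn parse(raw: &str) -> Self {
        let mut parts: Vec<String> = raw
            .split('/')
            .filter(|p| !p.is_empty())
            .map(str::to_string)
            .collect();
        let file_name = if raw.ends_with('/') { None } else { parts.pop() };
        Self { dirs: parts, file_name }
    }

    /// A path is a directory exactly when no file name is set.
    fn is_dir(&self) -> bool {
        self.file_name.is_none()
    }

    /// All parts: directories first, then the file name (if any).
    fn parts(&self) -> Vec<&str> {
        self.dirs
            .iter()
            .map(String::as_str)
            .chain(self.file_name.as_deref())
            .collect()
    }

    /// Part-wise prefix check: every part of `prefix` must equal the
    /// corresponding part of `self`, so `bar` never matches `bar_baz`.
    fn prefix_matches(&self, prefix: &PartsPath) -> bool {
        let own = self.parts();
        let pre = prefix.parts();
        own.len() >= pre.len() && own.iter().zip(pre.iter()).all(|(a, b)| a == b)
    }
}

impl fmt::Display for PartsPath {
    /// Directories are always rendered with a trailing `/`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for dir in &self.dirs {
            write!(f, "{}/", dir)?;
        }
        if let Some(name) = &self.file_name {
            write!(f, "{}", name)?;
        }
        Ok(())
    }
}
```

With this sketch, `PartsPath::parse("foo/bar/x")` matches the prefix `PartsPath::parse("foo/bar/")`, while `PartsPath::parse("foo/bar_baz/y")` does not, even though the raw string `foo/bar_baz/y` starts with `foo/bar`.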
This removes 3 "nonexisting region" tests that were testing very
specific error behavior that no local emulator (minio and localstack)
replicates and that don't add much value. It's better to test our AWS
code at all than to be too picky.
Otherwise the whole thing blows up when starting a server that has many
DBs registered, because we potentially create 1 connection per DB (e.g.
to read out the preserved catalog).
Fixes #3336.
* feat: enable reconfiguration of in-use throttled store
This is handy for tests in which one part should run normally and
another should be throttled or blocked.
* feat: keep track of the number of tasks within a `DedicatedExecutor`
* test: ensure query cancellation (somewhat) works
We cannot really test that query cancellation finishes all subtasks
because _tokio_ doesn't provide sufficient stats / inspection, at least
as long as we don't want to rely heavily on _tokio_ tracing. So let's at
least check that tasks from the dedicated executors are pruned properly.
For all other regressions we need to add unit tests to the affected
components. See for example:
- https://github.com/apache/arrow-datafusion/issues/1103
- https://github.com/apache/arrow-datafusion/pull/1105
- https://github.com/apache/arrow-datafusion/pull/1112
- https://github.com/apache/arrow-datafusion/pull/1121
Closes #2027.
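One way to keep track of the number of tasks, as described above, is a shared counter plus a guard that decrements it on drop. The sketch below uses only the standard library; `TaskCounter` and `TaskGuard` are hypothetical names for this example and are not taken from the `DedicatedExecutor` code. The assumption is that the guard is moved into the task, so that a finished or cancelled (dropped) task is removed from the count either way.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical counter shared between an executor and its tasks.
#[derive(Debug, Clone, Default)]
struct TaskCounter {
    active: Arc<AtomicUsize>,
}

impl TaskCounter {
    /// Register one task; the returned guard unregisters it when dropped.
    fn register(&self) -> TaskGuard {
        self.active.fetch_add(1, Ordering::SeqCst);
        TaskGuard {
            active: Arc::clone(&self.active),
        }
    }

    /// Number of tasks that are currently registered.
    fn tasks(&self) -> usize {
        self.active.load(Ordering::SeqCst)
    }
}

/// Lives inside the task; dropping it (completion or cancellation)
/// removes the task from the count.
struct TaskGuard {
    active: Arc<AtomicUsize>,
}

impl Drop for TaskGuard {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::SeqCst);
    }
}
```

A test can then register work, cancel it, and assert that `tasks()` returns to zero, which is the "pruned properly" check without needing any runtime inspection.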
So that they can be deserialized, without parsing, to create a new
iox object store from the location listed in the server config.
Notably, the locations serialized don't start with the object storage's
prefix like "s3:" or "file:". The location is the same object storage as
the server configuration that was just read from object storage. Having
the server config on one type of object storage and the database files
on another type is not supported.
The implementation of list_with_delimiter for the in-memory object
storage assumed that paths returned from the BTreeMap keys that sorted
greater than the prefix given to list_with_delimiter and for which
prefix_matches returned true would also have parts after the prefix.
This didn't account for paths that started with the prefix but didn't
immediately have the delimiter after the prefix: that is,
    prefix = 1/database_name
would match the in-memory paths:
    1/database_name/0/rules.pb
    1/database_name_and_another_thing/0/rules.pb
The first path here *would* return some parts_after_prefix, but the
second path would not and the previously existing code would panic for
the added path in the list_with_delimiter test case.
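The failure mode and a non-panicking alternative can be sketched as follows. This is a simplified model, not the real in-memory store: keys are plain `String`s in a `BTreeMap`, and `Listing`, `parts_after_prefix` and this `list_with_delimiter` signature are assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical result type: direct children of the prefix.
#[derive(Debug, Default, PartialEq)]
struct Listing {
    common_prefixes: BTreeSet<String>,
    objects: Vec<String>,
}

/// Parts of `path` after `prefix`, compared part by part, or `None`
/// if `prefix` is not a part-wise prefix of `path`.
fn parts_after_prefix<'a>(path: &'a str, prefix: &str) -> Option<Vec<&'a str>> {
    let mut path_parts = path.split('/').filter(|p| !p.is_empty());
    for want in prefix.split('/').filter(|p| !p.is_empty()) {
        if path_parts.next() != Some(want) {
            return None;
        }
    }
    Some(path_parts.collect())
}

fn list_with_delimiter(store: &BTreeMap<String, Vec<u8>>, prefix: &str) -> Listing {
    let mut listing = Listing::default();
    // Keys are sorted, so start at the first key >= prefix.
    for key in store.range(prefix.to_string()..).map(|(k, _)| k) {
        if !key.starts_with(prefix) {
            break; // past every key that shares the string prefix
        }
        match parts_after_prefix(key, prefix) {
            // String prefix matched, but not on a part boundary
            // (e.g. `database_name_and_another_thing`): skip, don't panic.
            None => continue,
            Some(rest) => match rest.as_slice() {
                // The key is the prefix itself: nothing below it to list.
                [] => {}
                // Exactly one part left: a file directly under the prefix.
                [_] => listing.objects.push(key.clone()),
                // More parts left: report only the next "directory" level.
                [dir, ..] => {
                    let base = prefix.trim_end_matches('/');
                    listing.common_prefixes.insert(format!("{}/{}/", base, dir));
                }
            },
        }
    }
    listing
}
```

For the two paths above and `prefix = 1/database_name`, this sketch reports the single common prefix `1/database_name/0/` and silently skips `1/database_name_and_another_thing/0/rules.pb`.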