209 Commits (67209fce7c7fc7da3d2a3aa526d59492c27e503d)

Author SHA1 Message Date
Carol (Nichols || Goulding) 82588d5c72 fix: Don't return Result from test functions 2021-04-07 12:40:00 -04:00
Marko Mikulicic 8bd7a39607 fix: Use InstanceMetadataProvider directly
Rusoto's ChainProvider swallows the error message produced by the underlying InstanceMetadataProvider
(see d59d716f09/rusoto/credential/src/lib.rs (L397)) making it hard for us to know why it's not working in our staging cluster.
2021-03-15 13:55:43 +00:00
Marko Mikulicic 3c8a58266a
feat: Make S3 auth parameters optional
so that if not present, the aws client library can use builtin auth providers,
such as the InstanceMetadataProvider, which is commonly used to get the credentials
granted to the AWS VM via a cloud-native mechanism.
2021-03-15 10:11:01 +01:00
Carol (Nichols || Goulding) d67a03f616 fix: Improved tests found another unimplemented hiding 2021-03-04 16:18:31 -05:00
Carol (Nichols || Goulding) ca2f74063e fix: Test using the object store wrapper interface 2021-03-04 16:18:11 -05:00
Carol (Nichols || Goulding) 16df4c542b fix: Rename AZURE_STORAGE_MASTER_KEY to AZURE_STORAGE_ACCESS_KEY 2021-03-04 10:15:46 -05:00
Carol (Nichols || Goulding) e9fedfae17 fix: Use AWS_DEFAULT_REGION instead of AWS_REGION 2021-03-04 10:15:30 -05:00
Carol (Nichols || Goulding) 37746173d9 feat: Change azure object store to only get config from args, not env 2021-03-04 10:14:42 -05:00
Carol (Nichols || Goulding) 02d981451d feat: Implement Google Cloud Storage-related CLI arguments 2021-03-04 10:14:42 -05:00
Carol (Nichols || Goulding) ef13c1023e feat: Change google cloud object store to get config from args, not env 2021-03-04 10:14:42 -05:00
Carol (Nichols || Goulding) 06236e796b feat: Change aws object store to get config from args, not env 2021-03-04 10:14:42 -05:00
Carol (Nichols || Goulding) c7ef18337c feat: Consolidate all bucket config into one option/env var
Fixes #869.
2021-02-25 15:53:20 -05:00
Marko Mikulicic 12b768b8f1 fix: Escape empty string PathPart
Empty directory names are silently ignored and can lead to very surprising effects
such as directory layouts missing a level. This makes it hard to reason about directory structures.

A sane object store path API should either disallow empty names or deal with them gracefully.

Since we already have to escape file/directory names using the minimum common denominator valid character
set for known cloud providers, it feels quite natural to treat this empty dir/file name problem as an encoding problem.
2021-02-24 21:13:56 +00:00
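Note on 12b768b8f1: a minimal sketch of treating an empty path part as an encoding problem, as the message above describes. The function names, the `%EMPTY%` sentinel, and the set of characters left unescaped are all hypothetical and chosen only for illustration; the commit message does not state which encoding the crate uses.

```rust
/// Hypothetical sentinel for an empty part. The encoder below always writes
/// '%' followed by two hex digits, so "%EM..." can never be produced by it.
const EMPTY_SENTINEL: &str = "%EMPTY%";

/// Encode one path part. Characters outside a conservative safe set
/// (an assumption, not the crate's real set) are percent-encoded.
fn encode_part(part: &str) -> String {
    if part.is_empty() {
        // Without this, joining parts would yield "a//b", which many
        // stores would collapse or silently ignore.
        return EMPTY_SENTINEL.to_string();
    }
    let mut out = String::new();
    for b in part.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' => out.push(b as char),
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Join encoded parts with '/', so every input part, including an empty
/// one, keeps its own level in the resulting layout.
fn encode_path(parts: &[&str]) -> String {
    parts
        .iter()
        .map(|p| encode_part(p))
        .collect::<Vec<_>>()
        .join("/")
}
```

With this sketch, a path built from the parts `a`, an empty name, and `b` keeps all three levels instead of losing the middle one.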
Marko Mikulicic 9e521a2ea1 feat: Plug GCS list_with_delimiter impl
And fix env.example. Now GCS can be used to persist snapshots.
2021-02-23 21:06:53 +00:00
Marko Mikulicic 9860def4b2 feat: Add S3 plumbing to iodx config 2021-02-23 14:55:55 +00:00
Carol (Nichols || Goulding) cff12da3a1 fix: Upgrade to released version of cloud_storage
Fixes #801.
2021-02-22 13:01:06 -05:00
Carol (Nichols || Goulding) a42103f436 Merge remote-tracking branch 'origin/main' into cn/google-list-with-delimiter 2021-02-22 12:53:46 -05:00
Marko Mikulicic 81739bf486 docs: Fix typo 2021-02-22 10:35:23 +00:00
Carol (Nichols || Goulding) cc6738c6f3 fix: Check for AZURE_STORAGE_MASTER_KEY in the test macro too 2021-02-18 16:53:06 -05:00
Carol (Nichols || Goulding) fcd4f91909 feat: Implement list_with_delimiter for Azure storage 2021-02-18 16:37:23 -05:00
Carol (Nichols || Goulding) 57942b51b7 feat: Update to latest Azure sdk to get delimiter support
Needed these PRs:
  - https://github.com/Azure/azure-sdk-for-rust/pull/176
  - https://github.com/Azure/azure-sdk-for-rust/pull/179

Also needed to enable the queue feature to get the azure_storage crate
compiling; at the moment, the code is still being reorganized and the
features aren't independent yet:
https://github.com/Azure/azure-sdk-for-rust/issues/177
2021-02-18 14:59:06 -05:00
Jake Goulding 484adcc257 chore: fix typo in an error message 2021-02-18 14:57:38 -05:00
Marko Mikulicic 536c1724bd feat: Allow to put streams of unknown length to objectstore
Addresses the API aspect of #818

Adds a utility module that helps compute the length of a stream while buffering it
for later replay (in memory or spilling to a temporary file).
2021-02-18 16:49:18 +00:00
Carol (Nichols || Goulding) f934a21efe test: Update tests to match new cloud storage error behavior 2021-02-17 14:23:34 -05:00
Carol (Nichols || Goulding) ef54131afb feat: Gets google cloud list_with_delimiter tests passing 2021-02-17 14:23:33 -05:00
Edd Robinson 2b642a8da6 refactor: add arc clone lint 2021-02-15 12:38:19 +00:00
Jake Goulding dad426d02e
fix: Report a failure to parse an AWS datetime (#794)
* fix: Report a failure to parse an AWS datetime

* refactor: use SNAFU context selectors instead of enum variants
2021-02-12 15:10:49 +00:00
Raphael Taylor-Davies c7e8a68fbe
fix: enable tokio::fs for object_store crate (#788)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-11 12:12:57 +00:00
Carol (Nichols || Goulding) c17feb998a feat: Implement display on FilePath 2021-02-08 15:13:25 -05:00
Carol (Nichols || Goulding) 076f67285d
Merge branch 'main' into cn+jg/file-delimiter 2021-02-05 09:45:34 -05:00
Carol (Nichols || Goulding) fbf776c6b3
chore: Clean up Cargo.tomls (#754)
* fix: test_helpers crate should only be a dev-dep

* fix: object_store no longer has a build script, so no longer needs a build dep

* chore: Alphabetize all Cargo.tomls
2021-02-04 18:56:02 -05:00
Carol (Nichols || Goulding) 8b18003e19 test: Don't check file metadata because SystemTime is not monotonic
See https://doc.rust-lang.org/std/time/struct.SystemTime.html
2021-02-04 15:46:11 -05:00
Carol (Nichols || Goulding) fa8594327d test: Add a better failure message to aid debugging 2021-02-04 15:12:33 -05:00
Carol (Nichols || Goulding) abbd29aeeb fix: Use Self in From impls 2021-02-04 13:40:33 -05:00
Carol (Nichols || Goulding) 3f1434e0e4 refactor: Remove redundant parts of error variant names
This info is now conveyed by the module each error comes from.
2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) af7a5fa952 fix: Use walkdir::Errors within disk::Errors
I didn't want the object store lib Error to have to know about walkdir,
but I feel better about it now that this error type is scoped to the
disk module. The walkdir errors might have a bit more information.
2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) 80581c9084 fix: Remove vestigial error types 2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) f8fb24b88c refactor: Extract In-memory memory::Error 2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) 8e6a06ebb2 refactor: Extract Azure azure::Error 2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) 841f4ee314 refactor: Extract AWS S3 aws::Error 2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) 1492e52e57 refactor: Extract Google Cloud Storage gcp::Error 2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) f795c56c8d refactor: Start splitting up the object store error type; extract disk::Error
It's starting to get out of control. Time to fix that.
2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) f9454fb57f feat: Implement list_with_delimiter for File object store
Fixes #688.
2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) 5b18f7dbea feat: Hook DirsAndFileName push_part_as_dir to FilePath 2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) a7cd8a2796 feat: Add a way to unset a file name in an object store path 2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) c722188c5a feat: Connect parts_after_prefix from DirsAndFileName to FilePath
This will be useful in the File object store's list_with_delimiter.
2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) c66efa80d1 feat: Implement PartialOrd and Ord for FilePath
This allows storing FilePaths in a BTreeSet and ordering FilePaths.
2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) fd39315388 refactor: Improve error name
I thought this only had to do with the InMemory Put request, but it's a
bit more general than that. Hopefully clarifying the purpose of this
error.
2021-02-04 13:04:52 -05:00
Jake Goulding 678044e08a fix: test isn't a special extension recognized by object storage
This test was invalid because there are cases in which we use the
assumption that all file names in object storage should end with
`.json`, `.parquet`, or `.segment`.
2021-02-04 13:04:52 -05:00
Carol (Nichols || Goulding) 5d1c7dfe82 docs: Improve descriptive code comments as suggested in review 2021-02-01 14:56:49 -05:00
Carol (Nichols || Goulding) 5c8b351f57 fix: Address clippy suggestions 2021-02-01 14:56:49 -05:00
Carol (Nichols || Goulding) f9539f2b74 fix: Remove blanket trait impl now causing a stack overflow 2021-02-01 14:56:49 -05:00
Carol (Nichols || Goulding) ff6955a433 refactor: Extract a trait for ObjectStoreApi with associated path
This is the promised cleanup. This structure gets rid of a lot of
intermediate structures and encodes through associated types how the
object stores and path types are related.

The enums are still necessary to avoid having generics leak all over
the place, but the object store variants and path variants should always
match because they'll always come from the object store trait
implementations that use the associated types.
2021-02-01 14:56:47 -05:00
Carol (Nichols || Goulding) c40205b37e test: Move DirsAndFileName functionality tests with the definition 2021-02-01 14:39:18 -05:00
Carol (Nichols || Goulding) 596a73f56a refactor: Extract a FilePath type for use in file storage
Enforces that on-disk storage will only ever use file paths.

More cleanup coming!
2021-02-01 14:39:18 -05:00
Carol (Nichols || Goulding) d39131ab49 refactor: Extract a CloudPath type for use in cloud storage
This is the start of using the type system to enforce that only
CloudPaths will be used with S3, GCS, and Azure.

Still some mess in here, cleanup coming.
2021-02-01 14:39:16 -05:00
Carol (Nichols || Goulding) 7d3b4db234 fix: InMemory doesn't need pagination 2021-02-01 14:35:47 -05:00
Carol (Nichols || Goulding) fdbe602e57 refactor: Always get a path to build from the object store 2021-02-01 14:30:21 -05:00
Andrew Lamb f3bd8bd0e3
chore: update deps (tokio 1.0 and ecosystem) (#707)
* chore: Update arrow + tokio deps

* chore: Use bleeding edge azure

* chore: Update aws + other deps

* fix: fmt

* fix: Switch to in-house version of routerify

* fix: Upgrade to hyper 0.14

The hyper::error module is now private; hyper::Error is the public
re-export

* fix: Upgrade cloud storage to get tokio upgrade

* fix: Upgrade open_telemetry

* fix: Do not call `panic::set_hook` during another panic

Doing so leads to a double panic which aborts the process.

* fix: new h2 error who dis

Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2021-01-29 16:11:55 -05:00
Carol (Nichols || Goulding) 6bb91653c1
refactor: Some tiny cleanups (#680)
* refactor: Remove import of unimplemented macro that's in the prelude

* refactor: Remove allowing of dead code that isn't dead anymore

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-01-21 07:27:17 -05:00
Carol (Nichols || Goulding) 979458034a refactor: Extract PathPart to its own module
And be more explicit about the privacy of the inner String
2021-01-15 15:29:01 -05:00
Carol (Nichols || Goulding) 004fb8f803 refactor: Extract a file module for file-based paths 2021-01-15 15:26:35 -05:00
Carol (Nichols || Goulding) ac22d2bb44 refactor: Extract a cloud module to deal with cloud paths 2021-01-15 15:24:49 -05:00
Carol (Nichols || Goulding) f8956dfbe8 refactor: Extract a module for the parsed DirsAndFileName structure 2021-01-15 15:24:48 -05:00
Carol (Nichols || Goulding) 813092649d fix: Make file behave the same as other object stores with paths 2021-01-15 10:25:05 -05:00
Carol (Nichols || Goulding) 0415d4a186 fix: Only treat PathBuf ending parts that match one of our extensions as file names 2021-01-15 10:15:00 -05:00
Carol (Nichols || Goulding) 25cc396f5e test: Handle macOS temp directories that start with . better 2021-01-15 10:15:00 -05:00
Carol (Nichols || Goulding) 4acf0f6ea9 fix: Take prefix's file name into account in
This makes the in memory object store behave consistently with the cloud
object stores.
2021-01-13 09:57:13 -05:00
Carol (Nichols || Goulding) ea719724e9 test: Add another filename to clarify behavior in this test 2021-01-13 09:56:26 -05:00
Carol (Nichols || Goulding) 383e3abfce fix: Remove dbgs 2021-01-11 16:57:38 -05:00
Carol (Nichols || Goulding) 8570c88689 test: Add a partial file name scenario to list_with_delimiter tests 2021-01-11 16:57:38 -05:00
Carol (Nichols || Goulding) f0ab0e25a0 fix: Match partial directory names with prefix 2021-01-11 16:57:38 -05:00
Carol (Nichols || Goulding) 7c457710ee fix: Implement parts_after_prefix; InMemory now passes 2021-01-11 16:57:38 -05:00
Carol (Nichols || Goulding) 06f1358e2d feat: Change ObjectStorePath API to be more explicit
Now you have to designate whether you're adding a directory or a file
name, with some assumptions based on paths coming from a cloud object
storage or the file system.

A notable difference: checking to see if "apple/b" is a prefix of
"apple/bear/cow.json" will now say no; only whole directories are
matched.
2021-01-11 16:57:37 -05:00
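Note on 06f1358e2d: the whole-directory prefix rule described in that message can be sketched as follows. This is an illustration over plain strings, not the crate's `ObjectStorePath` API; the function names are hypothetical, and the sketch assumes `/` as the delimiter and ignores empty parts.

```rust
/// Split a path into its parts on '/', dropping empty parts
/// (a simplification made for this sketch).
fn parts(path: &str) -> Vec<&str> {
    path.split('/').filter(|p| !p.is_empty()).collect()
}

/// Returns true only if every part of `prefix` is exactly equal to the
/// corresponding leading part of `path`. Partial part names do not match.
fn prefix_matches(prefix: &str, path: &str) -> bool {
    let prefix_parts = parts(prefix);
    let path_parts = parts(path);
    prefix_parts.len() <= path_parts.len()
        && prefix_parts
            .iter()
            .zip(path_parts.iter())
            .all(|(a, b)| a == b)
}
```

Under this rule, `prefix_matches("apple/b", "apple/bear/cow.json")` is false because `b` is not a whole part, while `prefix_matches("apple/bear", "apple/bear/cow.json")` is true.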
Karsten Jeschkies 2cd383af6f feat: Azure support for object store
Closes #528

This patch adds support for Microsoft Azure Blob storage. The
implementation requires an account, a key and container name. They can
be configured via the environment variables `AZURE_STORAGE_ACCOUNT`,
`AZURE_STORAGE_MASTER_KEY` and `AZURE_STORAGE_CONTAINER`.
2021-01-08 16:27:17 +01:00
Carol (Nichols || Goulding) cef0bb7c98 feat: Implement starts_with on ObjectStorePath 2021-01-07 16:51:32 -05:00
Carol (Nichols || Goulding) 535e65c02a refactor: Use itertools' extend with the iter instead of collecting 2021-01-07 16:02:23 -05:00
Carol (Nichols || Goulding) 91c4e26628 feat: Disallow parts of paths to be only one or two dots 2021-01-07 16:02:20 -05:00
Carol (Nichols || Goulding) 23782dc9b7 test: Add some tests around path building and encoding 2021-01-07 16:02:10 -05:00
Carol (Nichols || Goulding) 164c0e7357 fix: Use DELIMITER to create DELIMITER_BYTE 2021-01-07 15:23:51 -05:00
Carol (Nichols || Goulding) 37056a1753 feat: Decode PathPart's values when Displaying 2021-01-07 15:23:32 -05:00
Carol (Nichols || Goulding) b421de77c4 feat: Encode characters GCS recommends avoiding 2021-01-07 14:58:58 -05:00
Carol (Nichols || Goulding) 18ee1b561b feat: Use ObjectStorePath everywhere to feel out the API needed 2021-01-07 10:48:22 -05:00
Carol (Nichols || Goulding) e58607f015 fix: Add more info to a test failure message
I had stuff in my Google storage from when I was manually testing out
how paths are handled; it was hard to see that was the problem without
this extra failure text.
2021-01-07 09:21:00 -05:00
Carol (Nichols || Goulding) 44fb5b2b72 fix: Un-nest test modules
Now that the code is separated into modules, we don't need the modules
inside the test modules. So before this commit, the test names looked
like this:

```
test aws::tests::amazon_s3::s3_test_put_nonexistent_bucket ... ok
test gcp::test::google_cloud_storage::gcs_test ... ok
test disk::tests::file::length_mismatch_is_an_error ... ok
test memory::tests::in_memory::length_mismatch_is_an_error ... ok
```

and after this commit, the test names look like this:

```
test aws::tests::s3_test_put_nonexistent_bucket ... ok
test gcp::test::gcs_test ... ok
test disk::tests::length_mismatch_is_an_error ... ok
test memory::tests::length_mismatch_is_an_error ... ok
```
2021-01-07 09:20:57 -05:00
Carol (Nichols || Goulding) 55d64182f6 fix: Use @domodwyer's macro trick instead of conditional compilation 2021-01-07 09:20:09 -05:00
Carol (Nichols || Goulding) 5387499888 refactor: Reorganize imports 2021-01-07 09:20:04 -05:00
Carol (Nichols || Goulding) e36a7d3595 fix: Get Google Cloud tests compiling again 2021-01-07 09:19:58 -05:00
Paul Dix cf56c1ba9e feat: Add object store path abstraction 2021-01-07 09:19:50 -05:00
Paul Dix c1f8e89bf0 feat: Add list_with_delimiter to in memory object store
This adds the list_with_delimiter function to the in-memory object store. It also updates the function signature to require a prefix since it will always only want to list either the objects in the dir or the common prefixes.
2021-01-07 09:19:22 -05:00
Paul Dix 4b40d11e60 feat: Add list_with_delimiter to object store
This adds a new function list_with_delimiter to the object store. This commit contains just the implementation for S3, leaving the others to be completed in follow-on commits.

This has a fixed delimiter to ensure a directory structure is created. This delimiter should be dependent on platform and which object store is used. For any of the cloud object stores or in memory, the delimiter should be /. For the future disk based implementation it should be dependent on whether you're running on Windows or Linux.

I didn't use Stream for the return type because I found it difficult to work with and I don't think it actually added anything useful. The return ListResult struct has the next token and I prefer that the caller explicitly makes calls that go over the network so they're more aware of what's going on, where a Stream abstracts that away so it's hidden behind the scenes. We can easily add a Stream based version on top of this existing API if we want.
2021-01-07 09:19:15 -05:00
Paul Dix d1ab5c0ee9 chore: refactor object_store crate
This pulls the different backing implementations into their own modules. They're about to get more complex so it felt like it was time to separate them out rather than building towards a single multi-thousand line lib.rs. The error type is only defined in lib and imported by the individual modules, which I think makes it easier to work with.
2021-01-07 09:19:07 -05:00
Carol (Nichols || Goulding) b11896b7e9
fix: Compiler errors missed in aws object store tests because CI wasn't checking them (#564) 2020-12-15 12:28:42 -05:00
Dom 4c35253fd5 style: unmangle wrapped diagrams
Adds #[rustfmt::skip] to comment blocks containing diagrams to skip wrapping.
2020-12-14 13:14:36 +00:00
Dom 6f473984d0 style: wrap comments
Runs rustfmt with the new config.
2020-12-11 18:22:26 +00:00
Dom c9a101ecae
Merge branch 'main' into brandonsov/add-bucket-location-to-object-store-errors 2020-12-10 18:14:27 +00:00
Brandon Sov 568065d63f style: rename location_string to location_copy 2020-12-10 09:41:24 -08:00
Brandon Sov 6247a01144 test: update typos 2020-12-10 09:24:42 -08:00
huming a5a3cd149d chore: some minor comments and rename 2020-12-10 10:48:57 +08:00
Brandon Sov 146bf59d8d test: simplify test error matching 2020-12-09 11:36:49 -08:00
Brandon Sov d179fe68d3 refactor: replace bucket_name clones with references 2020-12-09 11:03:19 -08:00
Brandon Sov af8569378f test: move common variable and function to general test usage 2020-12-09 11:01:51 -08:00
Brandon Sov 625542c310 fix: Update s3 error function to correct pattern 2020-12-09 10:14:50 -08:00
Brandon Sov 4be47b1ccc fix: Move functions to the conditional compilation flag to pass linter 2020-12-08 23:42:41 -08:00
Brandon Sov 62c14de2bc fix: Update pattern match to detect String 2020-12-08 23:42:33 -08:00
Brandon Sov 1a4b2eac26 fix: Report bucket/location when relevant with object store errors 2020-12-08 22:29:28 -08:00
Paul Dix fa3ecbd4ed
feat: Implement write buffer to Parquet snapshotting (#526)
* feat: Implement write buffer to Parquet snapshotting

This introduces snapshot to the server packages to manage snapshotting. It also introduces a new trait for representing a Partition. There is a very crude API wired up in http_routes for testing purposes. Follow-on work will bring the server package into http_routes and rework the snapshot API.
2020-12-08 14:20:43 -05:00
Carol (Nichols || Goulding) 085f91000d fix: dotenv is both a build and a dev dependency 2020-11-20 13:18:28 -05:00
Andrew Lamb a3b88d5506
refactor: rename delorean_object_store --> object_store (#413) 2020-11-05 08:56:30 -05:00