Commit Graph

49167 Commits (b555ddf18b19cb57ddf2f71596cd3409354caf2e)

Author SHA1 Message Date
Michael Gattozzi b555ddf18b
feat: Add different output support to queries (#24616)
This commit adds the ability to choose the output format of a query via
the v3 api so that a user can choose, whether by Accept headers or the
format url param, how the data will be returned to them.

Prior to this commit the default was a pretty printed text format, but
that instead has been changed to json as the default.

There are multiple formats one can choose:

1. json
2. csv
3. pretty printed text
4. parquet

I've tested each of these and they work well. In particular the
parquet output is exciting, as users will be able to perform a query and
receive back parquet data that they can then load into, say, a Python
script or something else to work on it. As we extend what
data can be queried, as well as persisting it, what people will be able
to do with Edge will be really cool and I'm interested to see how users
will end up using this functionality in the future.
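
A minimal sketch of how the format selection described above could be
resolved. It is not taken from the commit: the `QueryFormat` enum, the
parameter spellings, the MIME types, and the choice to let the `format`
URL param win over the `Accept` header are all assumptions made for
illustration. Only the four formats and the JSON default come from the
message.

```rust
/// The four output formats named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QueryFormat {
    Json,
    Csv,
    Pretty,
    Parquet,
}

impl QueryFormat {
    /// Parse a `format` URL parameter value (spellings are assumed).
    pub fn from_param(value: &str) -> Option<Self> {
        match value.to_ascii_lowercase().as_str() {
            "json" => Some(Self::Json),
            "csv" => Some(Self::Csv),
            "pretty" => Some(Self::Pretty),
            "parquet" => Some(Self::Parquet),
            _ => None,
        }
    }

    /// Map an `Accept` header value to a format (MIME types are assumed).
    pub fn from_accept(value: &str) -> Option<Self> {
        // Drop parameters such as `;q=0.9` and take the first media type
        // in the list that maps to a known format.
        value
            .split(',')
            .map(|part| part.split(';').next().unwrap_or("").trim())
            .find_map(|mime| match mime {
                "application/json" => Some(Self::Json),
                "text/csv" => Some(Self::Csv),
                "text/plain" => Some(Self::Pretty),
                "application/vnd.apache.parquet" => Some(Self::Parquet),
                _ => None,
            })
    }
}

/// Assumed precedence: URL param first, then header, then the JSON default.
pub fn resolve_format(param: Option<&str>, accept: Option<&str>) -> QueryFormat {
    param
        .and_then(QueryFormat::from_param)
        .or_else(|| accept.and_then(QueryFormat::from_accept))
        .unwrap_or(QueryFormat::Json)
}
```

With this sketch, a request carrying neither a param nor a header falls
back to JSON, matching the new default described above.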
2024-02-12 12:04:05 -05:00
Michael Gattozzi 8a68ae3f11
fix: Remove nightly CI build from Circle CI runs (#24637)
Prior to this change we've had CI fail nightly because we can't push the
image to CI due to permissions issues. The problem is that
influxdata/influxdb_iox is the one that actually has access to push that
data to quay.

This commit removes the nightly build and references to it as this image
is built nightly by the IOx team. If things break we have access to fix
it, but I don't think it'll be an issue.
2024-02-12 10:21:15 -05:00
Trevor Hilton 397ee6e73b
fix: add rust-analyzer to toolchain file (#24636)
* fix: add rust-analyzer to toolchain file

Added the rust-analyzer component to the rust-toolchain.toml
file so that the correct version of rust-analyzer is installed
on Apple Silicon.

This will allow the LSP to work on Apple Silicon machines.

* chore: update deps for cargo deny
2024-02-06 16:04:03 -05:00
Michael Gattozzi ff567cd33f
chore(deps): Update arrow and datafusion to 49.0.0 (#24605)
* chore(deps): Update arrow and datafusion to 49.0.0

This commit copies in our dependency code from influxdb_iox in order for
us to be able to upgrade from a forked version of 46.0.0 to 49.0.0 of
both arrow and datafusion. Most of the important changes were around how
we consumed the crates in influxdb3(_server/_write). Those diffs are
particularly worth looking at as the rest was a straight copy and we
don't touch those crates in our development currently for influxdb3
edge.

* fix: regenerate workspace hack crate

* fix: Protobuf issues with incompatibility labels

* fix: Broken CI yaml

* fix: buf version

* fix: Only check IOx repo

* fix: Remove protobuf lint

* fix: Comment out call to protobuf-lint
2024-01-31 19:18:51 -05:00
Michael Gattozzi 001a2a6653
feat: Implement Persister for PersisterImpl (#24588)
* feat: Implement Catalog r/w for persister

This commit implements reading and writing the Catalog to the object
store. This was already stubbed out functionality, but it just needed an
implementation. Saving it to the object store is pretty straightforward
as it just serializes it to JSON and writes it to the object store. For
loading, it finds the most recently added Catalog based on the file name
and returns it from the object store to the caller in its deserialized
form.
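
As a rough illustration of "most recent based on the file name", the
sketch below picks the newest entry from a listing. The naming scheme
(a zero-padded sequence number followed by `.json`) and the function
name `latest_catalog_file` are hypothetical; the commit message does not
say how the real file names are built.

```rust
/// Return the file name with the highest numeric sequence prefix.
/// Assumes hypothetical names such as `00000000000000000042.json`;
/// anything that does not match that shape is skipped.
pub fn latest_catalog_file<'a>(names: &[&'a str]) -> Option<&'a str> {
    names
        .iter()
        .copied()
        .filter_map(|name| {
            let stem = name.strip_suffix(".json")?;
            let seq = stem.parse::<u64>().ok()?;
            Some((seq, name))
        })
        // The largest sequence number is treated as the most recent file.
        .max_by_key(|(seq, _)| *seq)
        .map(|(_, name)| name)
}
```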

This commit also adds some tests to make sure that the above
functionality works as intended.

* feat: Implement Segment r/w for persister

This commit continues the work on the persister by implementing the
persist_segment and load_segment functions for the persister. Much like
the Catalog implementation, it's serialized to JSON before being
persisted to the object store in persist_segment. This is pretty
straightforward. For the loading though we need to find the most recent
n segment files and so we need to list them and then return the most
recent n. This is a little more complicated to do, but there are
comments in the code to make it easier to grok.
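
One way to keep only the most recent n entries while walking a listing
is a bounded min-heap, sketched below. This is an assumption about how
such a selection could be done, not the code from the commit; segment
files are reduced to plain `u64` ids here, and `most_recent_n` is a
hypothetical name.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Keep the `n` largest (most recent) segment ids from a listing,
/// returned newest first. Memory stays bounded by `n` even when the
/// listing itself is long.
pub fn most_recent_n(segment_ids: impl IntoIterator<Item = u64>, n: usize) -> Vec<u64> {
    // `Reverse` turns the max-heap into a min-heap, so `pop` removes the
    // oldest id currently held.
    let mut heap: BinaryHeap<Reverse<u64>> = BinaryHeap::with_capacity(n.saturating_add(1));
    for id in segment_ids {
        heap.push(Reverse(id));
        if heap.len() > n {
            heap.pop();
        }
    }
    let mut newest: Vec<u64> = heap.into_iter().map(|Reverse(id)| id).collect();
    newest.sort_unstable_by(|a, b| b.cmp(a));
    newest
}
```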

We also implement more tests to make sure that this part of the
persister works as expected.

* feat: Implement Parquet r/w to persister

This commit does a few things:

- First we add methods to the persister trait for reading and writing
  parquet files as these were not stubbed out in prior commits
- Secondly we add a method to serialize a SendableRecordBatchStream into
  Parquet bytes
- With these in place implementing the trait methods is pretty
  straightforward: hand a path in and a stream and get back some
  metadata about the file persisted and also get the bytes back if
  loading from the store

Of course we also add more tests to make sure this all works as
expected. Do note that this does nothing to make sure that we bound how
much memory is used or if this is the most efficient way to write
parquet files. This is mostly to get things working with the
understanding that future refinement on the approach might be needed.

* fix: Update smallvec for crate advisory

* fix: Implement better filename handling

* feat: Handle loading > 1000 Segment Info files
2024-01-25 14:31:57 -05:00
Michael Gattozzi e13cc476bb
feat: Add paths module to influxdb3_write (#24579)
This commit introduces 4 new types in the paths module for the
influxdb3_write crate. They are:

- ParquetFilePath
- CatalogFilePath
- SegmentInfoFilePath
- SegmentWalFilePath

Each of these corresponds to an object store path (or, for the WAL file,
an on disk path) that we can use to address the needed files in a
consistent way without duplicating path construction.

These types also Deref/AsRef to the object_store::path::Path type (or the
std::path::Path type for the Wal) so that they can be used in places that
expect that type, such as various object_store/std::fs functions, and so
that we can use the underlying type's methods without needing to implement
them for each type, as they are just a thin wrapper around those types.
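
The thin-wrapper pattern described here can be sketched with the
standard library alone. Only the type name `SegmentWalFilePath` and the
Deref/AsRef idea come from the message; the constructor, the
`{:010}.wal` naming scheme, and the `file_name_of` helper are
assumptions for illustration.

```rust
use std::ops::Deref;
use std::path::{Path, PathBuf};

/// Newtype around `PathBuf` so WAL segment paths are built in one place.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SegmentWalFilePath(PathBuf);

impl SegmentWalFilePath {
    /// Build the path from a WAL directory and a segment number.
    /// The zero-padded `.wal` file name is a hypothetical scheme.
    pub fn new(dir: impl AsRef<Path>, segment_id: u32) -> Self {
        Self(dir.as_ref().join(format!("{segment_id:010}.wal")))
    }
}

/// `Deref` exposes every `Path` method on the wrapper without
/// re-implementing it.
impl Deref for SegmentWalFilePath {
    type Target = Path;

    fn deref(&self) -> &Path {
        &self.0
    }
}

/// `AsRef<Path>` lets the wrapper be passed to APIs such as `std::fs`
/// functions that accept `impl AsRef<Path>`.
impl AsRef<Path> for SegmentWalFilePath {
    fn as_ref(&self) -> &Path {
        &self.0
    }
}

/// Stand-in for any function that takes `impl AsRef<Path>`.
pub fn file_name_of(path: impl AsRef<Path>) -> Option<String> {
    path.as_ref()
        .file_name()
        .map(|name| name.to_string_lossy().into_owned())
}
```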

This commit adds some tests to make sure that the path construction
works as intended and also updates the `wal.rs` file to use the new
`SegmentWalFilePath` instead of just a `PathBuf`.

Closes: #24578
2024-01-19 10:57:54 -05:00
François Martin 58bec1d819
docs: rename influxdb_iox to influxdata (#24577) 2024-01-16 13:34:23 -05:00
Paul Dix 02b4d28637
feat: add basic wal implementation for Edge (#24570)
* feat: add basic wal implementation for Edge

This WAL implementation uses some of the code from the wal crate, but departs pretty significantly from it in many ways. For now it uses simple JSON encoding for the serialized ops, but we may want to switch that to Protobuf at some point in the future. This version of the wal doesn't have its own buffering. That will be implemented higher up in the BufferImpl, which will use the wal and SegmentWriter to make data in the buffer durable.

The write flow will be that writes will come into the buffer and validate/update against an in memory Catalog. Once validated, writes will get buffered up in memory and then flushed into the WAL periodically (likely every 10-20ms). After being flushed to the wal, the entire batch of writes will be put into the in memory queryable buffer. After that responses will be sent back to the clients. This should reduce the write lock pressure on the in-memory buffer considerably.

In this PR:
- Update the Wal, WalSegmentWriter, and WalSegmentReader traits to line up with new design/understanding
- Implement wal (mainly just a way to identify segment files in a directory)
- Implement WalSegmentWriter (write header, op batch with crc, and track sequence number in segment, re-open existing file)
- Implement WalSegmentReader
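
To make "op batch with crc" and "track sequence number" concrete, here is a std-only sketch of framing one batch and checking it on read. The frame layout (sequence number, payload length, CRC-32, payload, all little-endian) and the function names are assumptions for illustration; the PR's actual on-disk format is not described in this message. The payload stands in for the JSON-encoded ops.

```rust
/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320).
pub fn crc32(bytes: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in bytes {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical frame: sequence (u32 LE), payload length (u32 LE),
/// CRC-32 of the payload (u32 LE), then the payload bytes.
pub fn encode_batch(sequence: u32, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(12 + payload.len());
    frame.extend_from_slice(&sequence.to_le_bytes());
    frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    frame.extend_from_slice(&crc32(payload).to_le_bytes());
    frame.extend_from_slice(payload);
    frame
}

/// Read one frame back, rejecting truncated or corrupted batches.
pub fn decode_batch(frame: &[u8]) -> Result<(u32, &[u8]), String> {
    if frame.len() < 12 {
        return Err("frame shorter than header".to_string());
    }
    let read_u32 = |at: usize| {
        u32::from_le_bytes([frame[at], frame[at + 1], frame[at + 2], frame[at + 3]])
    };
    let sequence = read_u32(0);
    let len = read_u32(4) as usize;
    let expected = read_u32(8);
    let payload = frame
        .get(12..)
        .and_then(|rest| rest.get(..len))
        .ok_or_else(|| "truncated payload".to_string())?;
    if crc32(payload) != expected {
        return Err("crc mismatch".to_string());
    }
    Ok((sequence, payload))
}
```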

* refactor: make Wal return impl reader/writer

* refactor: clean up wal segment open

* fix: WriteBuffer and Wal usage

Turn wal and write buffer references into a concrete type, rather than dyn.

* fix: have wal loading ignore invalid files
2024-01-12 11:52:28 -05:00
Michael Gattozzi 028a05fbde
fix: remove deploy step for images (#24566)
We currently don't need or want to deploy influxdb as we're still
building out the Edge product. Maybe later for a demo, but for now it
just breaks CI and so this removes it.
2024-01-10 13:51:19 -05:00
Michael Gattozzi 89d28ade42
fix: change circle-ci config from iox to influxdb3 (#24564)
This commit changes the circle-ci config to use influxdb3 rather than
iox in our ci config script, as the repo is influxdb, not influxdb_iox.
While we could probably strip out a lot more here, as a first attempt to
get this to build release images and push them on main this will do just
fine.
2024-01-10 12:36:38 -05:00
Michael Gattozzi 9d81c73785
fix: set Dockerfile to build influxdb3 not IOx (#24563)
Now that we're transitioning the repo code to have influxdb3 Edge not
IOx be what's here, we can update the Dockerfile to build influxdb3.
This is mostly just updating which version of Rust to use, changing the
command that's run when docker runs the container to serve, and changing
influxdb_iox to influxdb3 everywhere in the file.
2024-01-09 15:19:21 -05:00
Michael Gattozzi 8ee13bca48
fix: Failing CI on main (#24562)
* fix: build, upgrade rustc, and deps

This commit upgrades Rust to 1.75.0, the latest release. We also
upgraded our dependencies to stay up to date and to clear out any
unneeded deps from the lockfile. In order to make sure everything works
this also fixes the build by upgrading the workspace-hack crate using
cargo hakari and removing the `workspace.lint` that was in
influxdb3_write that didn't need to be there, probably from a merge
issue.

With this we can build influxdb3 as our default on main, but this alone
is not enough to fix CI and will be addressed in future commits.

* fix: warnings for influxdb3 build

This commit fixes the warnings emitted by `cargo build` when compiling
influxdb3. Mainly it adds needed lifetimes and removes unnecessary
imports and function calls.

* fix: all of the clippy lints

This for the most part just applies suggested fixes by clippy with a few
exceptions:

- Generated type crates had additional allows added since we can't
  control what code gets made
- Things that couldn't be automatically fixed were fixed manually, in
  particular adding a Send bound for traits that created a Future that
  should be Send
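
For readers unfamiliar with the Send-bound fix in the last bullet, the
sketch below shows the general shape using a boxed future. The
`Persister` trait here, its `load` method, and the helpers are
hypothetical and are not the traits touched by the commit; the point is
only that the `+ Send` in the return type is what lets callers move the
future across threads.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait whose method returns a future that must be `Send`.
pub trait Persister {
    fn load(&self) -> Pin<Box<dyn Future<Output = u64> + Send + '_>>;
}

pub struct InMemory(pub u64);

impl Persister for InMemory {
    fn load(&self) -> Pin<Box<dyn Future<Output = u64> + Send + '_>> {
        let value = self.0;
        Box::pin(async move { value })
    }
}

/// Compiles only when `T: Send`; dropping `+ Send` from the trait's
/// return type would make a call with that future fail to build.
pub fn require_send<T: Send>(value: T) -> T {
    value
}

/// Waker that does nothing, enough to poll a ready future once.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future a single time without pulling in an async runtime.
pub fn poll_once<F: Future + Unpin>(mut fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match Pin::new(&mut fut).poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}
```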

We also had to fix a build issue by adding a feature for tokio-compat
due to the upgrade of deps. The workspace crate was updated accordingly.

* fix: failing test due to rust panic message change

Between rustc 1.72 and rustc 1.75 the way that error messages were
displayed when panicking changed. One of our tests depended on the output
of that behavior and this commit updates the error message to the new
form so that tests will pass.

* fix: broken cargo doc link

* fix: cargo formatting run

* fix: add workspace-hack to influxdb3 crates

This was the last change needed to make sure that the workspace-hack
crate CI lint would pass.

* fix: remove tests that can not run anymore

We removed iox code from this code base and as a result some tests
cannot be run anymore and so this commit removes them from the code base
so that we can get a green build.
2024-01-09 15:11:35 -05:00
Paul Dix 5831cf8cee
feat: Add basic Edge server structure (#24552)
* WIP: basic influxdb3 command and http server

* WIP: write lp, buffer, query out

* WIP: test write & query on influxdb3_server, fix warnings

* WIP: pull write buffer and catalog into separate crate

* WIP: sketch out types used for write: buffer, wal, persister

* WIP: remove a bunch of old IOx stuff and fmt
2024-01-08 11:50:59 -05:00
Joshua Powers acfef87659
chore: Sync and release v1.0.1 of influxdb-line-protocol (#24527)
* chore: Backport influxdb line protocol changes, release v1.0.1

* chore: Update influxdb_line_protocol to 2.0

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2023-12-22 15:12:41 -05:00
Joshua Powers 2fabaf98a4
chore: Run cargo fmt --all (#24528) 2023-12-20 14:11:04 -07:00
Jamie Strandboge bb6a5c0bf6
chore: ignore Go in .github/dependabot.yml, take 3 (#24439)
Update to use the documented dependency-name: "*" methodology rather
than an undocumented example.

References:
- https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
2023-11-02 08:16:13 -04:00
Jeffrey Smith II 560c2ef846
chore: Update dependabot.yml (#24391)
Add the v3 label to dependabot Rust PRs
2023-11-02 08:10:04 -04:00
Jamie Strandboge 4baa25e56f
chore: ignore Go in .github/dependabot.yml, take 2 (#24438) 2023-11-02 07:29:35 -04:00
Jamie Strandboge 361a82a84a
chore: ignore Go in .github/dependabot.yml (#24430)
Before switching to Rust-based IOx, influxdb was a Go project which
dependabot tracked. After the switch, dependabot would issue alerts for
Go files that no longer exist. Tell dependabot to ignore "gomod".
2023-10-26 10:53:42 -05:00
dependabot[bot] d34fc59217
chore(deps): Bump rustix from 0.38.8 to 0.38.19 (#24421)
Bumps [rustix](https://github.com/bytecodealliance/rustix) from 0.38.8 to 0.38.19.
- [Release notes](https://github.com/bytecodealliance/rustix/releases)
- [Commits](https://github.com/bytecodealliance/rustix/compare/v0.38.8...v0.38.19)

---
updated-dependencies:
- dependency-name: rustix
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-19 16:24:15 -05:00
Rick Spencer 2f0e2dcb9d
chore: Update README.md (#24379)
Update README to include link to Flux repo
2023-09-26 09:59:23 -04:00
Paul Dix 7b7475983b chore: update README with new v3 details 2023-09-21 09:31:53 -04:00
Paul Dix aa458ed166 Merge branch 'iox-repo' 2023-09-21 09:22:15 -04:00
Paul Dix cafe37bd1f Merge branch 'pd/influxdb3-oss' 2023-09-21 09:15:41 -04:00
Dom 427daa82b0
Merge pull request #8788 from influxdata/dependabot/cargo/tokio-util-0.7.9
chore(deps): Bump tokio-util from 0.7.8 to 0.7.9
2023-09-21 14:04:16 +01:00
Paul Dix de835d8c33 feat: remove everything to make way for 3.0, the last database rewrite you'll ever need. 2023-09-21 09:03:38 -04:00
Dom 25f3147dc7
Merge branch 'main' into dependabot/cargo/tokio-util-0.7.9 2023-09-21 13:37:57 +01:00
Dom 008b60cffb
Merge pull request #8790 from influxdata/dependabot/cargo/insta-1.32.0
chore(deps): Bump insta from 1.31.0 to 1.32.0
2023-09-21 13:36:11 +01:00
Dom 75d4d8c55b
Merge branch 'main' into dependabot/cargo/insta-1.32.0 2023-09-21 12:52:56 +01:00
Dom e61a35e396
Merge branch 'main' into dependabot/cargo/tokio-util-0.7.9 2023-09-21 12:52:23 +01:00
kodiakhq[bot] b5c0ecd140
Merge pull request #8767 from influxdata/savage/respect-ingest-system-state-during-wal-replay
feat(ingester): Respect `IngestState` during WAL replay
2023-09-21 10:19:37 +00:00
kodiakhq[bot] 12b02359aa
Merge branch 'main' into savage/respect-ingest-system-state-during-wal-replay 2023-09-21 10:13:27 +00:00
dependabot[bot] 82382b9b3a
chore(deps): Bump insta from 1.31.0 to 1.32.0
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.31.0 to 1.32.0.
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.31.0...1.32.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 10:10:26 +00:00
Dom bd1b668dbb
Merge branch 'main' into dependabot/cargo/tokio-util-0.7.9 2023-09-21 11:04:42 +01:00
Dom 29462d0fe5
Merge pull request #8789 from influxdata/dependabot/cargo/smallvec-1.11.1
chore(deps): Bump smallvec from 1.11.0 to 1.11.1
2023-09-21 11:04:28 +01:00
dependabot[bot] 37d37f3626
chore(deps): Bump smallvec from 1.11.0 to 1.11.1
Bumps [smallvec](https://github.com/servo/rust-smallvec) from 1.11.0 to 1.11.1.
- [Release notes](https://github.com/servo/rust-smallvec/releases)
- [Commits](https://github.com/servo/rust-smallvec/compare/v1.11.0...v1.11.1)

---
updated-dependencies:
- dependency-name: smallvec
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 02:02:29 +00:00
dependabot[bot] 661acc77f0
chore(deps): Bump tokio-util from 0.7.8 to 0.7.9
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.8 to 0.7.9.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.8...tokio-util-0.7.9)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 02:01:19 +00:00
Brandon Pfeifer b3b982d746
chore: update MacOS executor to M1 (#24372) 2023-09-20 14:30:21 -04:00
kodiakhq[bot] 3e5196bbda
Merge pull request #8782 from influxdata/cn/independent-refactorings
refactor: Improvements made during UpsertSchema work that are actually independent
2023-09-20 16:56:41 +00:00
Carol (Nichols || Goulding) d1f355bb58
fix: Remove an Arc wrapping that's no longer needed 2023-09-20 11:14:20 -04:00
Carol (Nichols || Goulding) 7c31771c64
refactor: Implement NamespaceSchema proto conversion as From 2023-09-20 11:13:53 -04:00
Carol (Nichols || Goulding) 11f916eee1
refactor: Extract test helper functions to improve readability 2023-09-20 10:42:26 -04:00
Fraser Savage cb7a26cb65
refactor(ingester): Revert `IngestState` methods to crate public
These methods should not be used at all outside the ingester, and only
the type itself needs to be accessible in the WAL replay benchmark.
2023-09-20 15:03:39 +01:00
Carol (Nichols || Goulding) eda4ccdf1a
test: Clean up existing test for consistency with current test style
- Extract some shared values
- Remove an unneeded Arc::clone
- Change expects that don't provide much clarity to unwraps
- Give the test a more distinctive and less redundant name
2023-09-20 10:01:06 -04:00
Carol (Nichols || Goulding) 257a6d2552
fix: Generating proto doesn't need ownership of an Arc 2023-09-20 10:01:06 -04:00
Carol (Nichols || Goulding) 265941f1a8
refactor: Only return NamespaceSchema proto so it can be reused in different responses 2023-09-20 09:57:06 -04:00
Dom 28c3637c01
Merge pull request #8780 from influxdata/dom/enable-merkle-tracking
feat(router): init anti-entropy merkle search tree
2023-09-20 14:41:30 +01:00
Dom d6f87cc569
Merge branch 'main' into dom/enable-merkle-tracking 2023-09-20 14:25:45 +01:00
Marco Neumann 5269285250
refactor: isolate V1 ingester->querier client (#8778)
Isolate the actual client from the query planning parts
(`Ingester{Chunk,Partition}`) so we can hook up the V2 client in #8350.

The PR looks large, but it just moves code around and decouples the
error handling.
2023-09-20 12:55:30 +00:00
Dom Dwyer 39768fa989
feat(router): init anti-entropy merkle search tree
Adds initialisation code to the routers to instantiate an
AntiEntropyActor, pre-populate the Merkle Search Tree during schema
warmup, and maintain it at runtime.
2023-09-20 13:47:16 +02:00