This PR introduces a new type `SerdeVecHashMap` that can be used in places where we need a HashMap with the following properties:
1. When serialized, it is serialized as a list of key-value pairs, instead of a map
2. When deserialized, it assumes the serialization format from (1.) and deserializes from a list of key-value pairs to a map
3. Does not allow for duplicate keys on deserialization
This is useful in places where we need to create map types that map from an identifier (integer) to some value, and need to serialize that data. For example: in the WAL when serializing write batches, and in the catalog when serializing the database/table schema.
This PR refactors the code in `influxdb3_wal` and `influxdb3_catalog` to use the new type for maps that use `DbId` and `TableId` as the key. Follow on work can give the same treatment to `ColumnId` based maps once that is fully worked out.
## Explanation
If we have a `HashMap<u32, String>`, `serde_json` will serialize it in the following way:
```json
{"0": "foo", "1": "bar"}
```
i.e., the integer keys are serialized as strings, since JSON doesn't support any other type of key in maps.
`SerdeVecHashMap<u32, String>` will be serialized by `serde_json` in the following way:
```json,
[[0, "foo"], [1, "bar"]]
```
and will deserialize from that vector-based structure back to the map. This allows serialization/deserialization to run directly off of the `HashMap`'s `Iterator`/`FromIterator` implementations.
## The Controversial Part
One thing I also did in this PR was switch the catalog from using a `BTreeMap` for tables to using the new `HashMap` type. This breaks the deterministic ordering of the database schema's `tables` map and therefore wrecks the snapshot tests we were using. I had to comment those parts of their respective tests out, because there isn't an easy way to make the underlying hashmap have a deterministic ordering just in tests that I am aware of.
If we think that using `BTreeMap` in the catalog is okay over a `HashMap`, then I think it would be okay to roll a similar `SerdeVecBTreeMap` type specifically for the catalog. Coincidentally, this may actually be a good use case for [`indexmap`](https://docs.rs/indexmap/latest/indexmap/), since it holds supposedly similar lookup performance characteristics to hashmap, while preserving order and _having faster iteration_ which could be a win for WAL serialization speed. It also accepts different hashing algorithms so could be swapped in with FNV like `HashMap` can.
## Follow-up work
Use the `SerdeVecHashMap` for column data in the WAL following https://github.com/influxdata/influxdb/issues/25461
This commit updates us to rustc 1.80. There are three significant changes
here:
1. LazyLock and LazyCell have been stabilized meaning we can replace our
usage of Lazy from the once_cell crate with the std lib versions
2. Lints were added to handle unknown cfg directives. `tokio_unstable`
is affected by this and while we do have the flags in our
.cargo/config.toml Cargo still output a lint for it so we supress
that warning now in our Cargo.toml for the workspace
3. clippy now throws a new warning about priority levels for lints. It's
quite frankly a thing that doesn't make sense to me and should be
something cargo fixes, but here we are.
Besides that it was a painless upgrade and now we're on the latest and
greatest.
This fixes new lints that have come up in the latest edition of clippy and moves
.cargo/config to .cargo/config.toml as the previous filename is now deprecated.
* chore: Update to Rust 1.77.0
This is a fairly quiet upgrade. The only changes are some lints around
`OpenOptions` that were added to clippy between 1.75 and this version
and they're small changes that either remove unecessary function calls
or add a needed function call.
* fix: cargo-deny by using the --locked flag
* fix: add rust-analyzer to toolchain file
Added the rust-analyzer component to the rust-toolchain.toml
file so that the correct version of rust-analyzer is installed
on Apple Silicone.
This will allow the LSP to work on Apple Silicone machines.
* chore: update deps for cargo deny
* fix: build, upgrade rustc, and deps
This commit upgrades Rust to 1.75.0, the latest release. We also
upgraded our dependencies to stay up to date and to clear out any
uneeded deps from the lockfile. In order to make sure everything works
this also fixes the build by upgrading the workspace-hack crate using
cargo hikari and removing the `workspace.lint` that was in
influxdb3_write that didn't need to be there, probably from a merge
issue.
With this we can build influxdb3 as our default on main, but this alone
is not enough to fix CI and will be addressed in future commits.
* fix: warnings for influxdb3 build
This commit fixes the warnings emitted by `cargo build` when compiling
influxdb3. Mainly it adds needed lifetimes and removes uneccesary
imports and functions calls.
* fix: all of the clippy lints
This for the most part just applies suggested fixes by clippy with a few
exceptions:
- Generated type crates had additional allows added since we can't
control what code gets made
- Things that couldn't be automatically fixed were done so manually in
particular adding a Send bound for traits that created a Future that
should be Send
We also had to fix a build issue by adding a feature for tokio-compat
due to the upgrade of deps. The workspace crate was updated accordingly.
* fix: failing test due to rust panic message change
Inbetween rustc 1.72 and rustc 1.75 the way that error messages were
displayed when panicing changed. One of our tests depended on the output
of that behavior and this commit updates the error message to the new
form so that tests will pass.
* fix: broken cargo doc link
* fix: cargo formatting run
* fix: add workspace-hack to influxdb3 crates
This was the last change needed to make sure that the workspace-hack
crate CI lint would pass.
* fix: remove tests that can not run anymore
We removed iox code from this code base and as a result some tests
cannot be run anymore and so this commit removes them from the code base
so that we can get a green build.
* chore: Upgrade to Rust 1.68
* fix: Remove unnecessary into_iter, thanks Clippy!
* fix: Use the size of the type, not a reference to the type... oops.
Thanks clippy!
* fix: Return block directly instead of creating a variable
Thanks clippy!
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: Upgrade to Rust 1.64
* fix: Use iter find instead of a for loop, thanks clippy
* fix: Remove some needless borrows, thanks clippy
* fix: Use then_some rather than then with a closure, thanks clippy
* fix: Use iter retain rather than filter collect, thanks clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>