This PR introduces a new type `SerdeVecHashMap` that can be used in places where we need a HashMap with the following properties:
1. When serialized, it is serialized as a list of key-value pairs, instead of a map
2. When deserialized, it assumes the serialization format from (1.) and deserializes from a list of key-value pairs to a map
3. Does not allow for duplicate keys on deserialization
This is useful in places where we need to create map types that map from an identifier (integer) to some value, and need to serialize that data. For example: in the WAL when serializing write batches, and in the catalog when serializing the database/table schema.
This PR refactors the code in `influxdb3_wal` and `influxdb3_catalog` to use the new type for maps that use `DbId` and `TableId` as the key. Follow on work can give the same treatment to `ColumnId` based maps once that is fully worked out.
## Explanation
If we have a `HashMap<u32, String>`, `serde_json` will serialize it in the following way:
```json
{"0": "foo", "1": "bar"}
```
i.e., the integer keys are serialized as strings, since JSON doesn't support any other type of key in maps.
`SerdeVecHashMap<u32, String>` will be serialized by `serde_json` in the following way:
```json,
[[0, "foo"], [1, "bar"]]
```
and will deserialize from that vector-based structure back to the map. This allows serialization/deserialization to run directly off of the `HashMap`'s `Iterator`/`FromIterator` implementations.
## The Controversial Part
One thing I also did in this PR was switch the catalog from using a `BTreeMap` for tables to using the new `HashMap` type. This breaks the deterministic ordering of the database schema's `tables` map and therefore wrecks the snapshot tests we were using. I had to comment those parts of their respective tests out, because there isn't an easy way to make the underlying hashmap have a deterministic ordering just in tests that I am aware of.
If we think that using `BTreeMap` in the catalog is okay over a `HashMap`, then I think it would be okay to roll a similar `SerdeVecBTreeMap` type specifically for the catalog. Coincidentally, this may actually be a good use case for [`indexmap`](https://docs.rs/indexmap/latest/indexmap/), since it holds supposedly similar lookup performance characteristics to hashmap, while preserving order and _having faster iteration_ which could be a win for WAL serialization speed. It also accepts different hashing algorithms so could be swapped in with FNV like `HashMap` can.
## Follow-up work
Use the `SerdeVecHashMap` for column data in the WAL following https://github.com/influxdata/influxdb/issues/25461
* refactor: roll back addition of DatabaseSchemaProvider trait
* refactor: make parquet metrics optional in telemetry for pro
* refactor: make ParquetFileId Hash
* refactor: test harness logging
Allow the endpoint for telemetry to be passed in via the cli args, e.g
```
--telemetry-endpoint "https://somehost/test/"
```
and the actual endpoint always appends `v3` to it. So, above URL becomes
"https://somehost/test/v3"
* feat(circleci): add inclusivity checks
* chore(circleci): adjust package-validation for inclusive language
* chore: update tests for inclusive language
- Introduced traits, `ParquetMetrics` and `SystemInfoProvider` to enable
writing easier tests
- Uses mockito for code that depends on reqwest::Client and also uses
mockall to generally mock any traits like `SystemInfoProvider`
- Minor updates to docs
- added mechanism within PersistedFile to expose parquet file related
metrics. The details are updated when new snapshot is generated and
also when all snapshots are loaded when the process starts up
- at the point of creating the telemetry payload these parquet metrics
are looked up before sending it to the server.
Closes: https://github.com/influxdata/influxdb/issues/25418
- instrumented code to get read and write measurement
- introduced EventsBucket for collection of reads/writes
- sampler now samples every minute for all metrics (including
reads/writes)
- other tidy ups
closes: https://github.com/influxdata/influxdb/issues/25372
- moved num samples for cpu/mem to be usize from u64
- generic float function to round to 2 decimal places added
- removed unnecessary cast of f32 to f64