Commit Graph

49264 Commits (8f01e9c62a93f5c27d4d40d74f4690f98dc2b525)

Author SHA1 Message Date
peterbarnett03 8f01e9c62a
chore: Update README for InfluxDB main repo ()
* chore: update README content

* chore: update README content

* fix: updating Cargo.lock and semantics

* fix: adding dark mode logo with dynamic picture instead of img tag

* fix: adding dynamic picture instead of img tag

* fix: adding updated dark mode logo

* fix: limiting logo size to 600px

* fix: limiting logo size to 600px via width tag

---------

Co-authored-by: Peter Barnett <peterbarnett@Peters-MacBook-Pro.local>
2024-06-27 12:50:05 -04:00
Paul Dix b30729e36a
refactor: move persisted files out of segment state ()
* refactor: move persisted files out of segment state

This refactors persisted parquet files out of the SegmentState into a new struct, PersistedParquetFiles. The intention is for SegmentState to hold only the active write buffer that has yet to be persisted to Parquet files in object storage.

Persisted files will then be accessible throughout the system without having to touch the active in-flight write buffer.

* refactor: pr feedback cleanup
2024-06-27 11:46:03 -04:00
Trevor Hilton aa28302cdd
feat: store last cache config in the catalog ()
* feat: store last cache info in the catalog

* test: test series key in catalog serialization

* test: add test for last cache catalog serialization

* chore: cargo update

* chore: remove outdated snapshot
2024-06-26 14:19:48 -04:00
Lorrens Pantelis 8b6c2a3b3d
refactor: Replace use of `std::HashMap` with `hashbrown::HashMap` ()
* refactor: use hashbrown with entry_ref api

* refactor: use hashbrown hashmap instead of std hashmap in places that would benefit from the `entry_ref` API

* chore: Cargo update to pass CI
2024-06-26 12:43:35 -04:00
Trevor Hilton 7cfaa6aeaf
chore: clean up log statements in query_executor ()
* chore: clean up log statements in query_executor

There were several tracing statements that were making the log output
for each query rather verbose. This reduces the number of info!
statements by converting them to debug!, and clarifies some of the logged
messages.

The type of query is also logged, i.e., "sql" vs. "influxql", which was
not being done before.

* refactor: switch back important log to info
2024-06-26 11:51:12 -04:00
Paul Dix 2ddef9f8da
feat: track buffer memory usage and persist ()
* feat: track buffer memory usage and persist

This is a bit light on the test coverage, but I expect there is going to be some big refactoring coming to segment state and some of these other pieces that track parquet files in the system. However, I wanted to get this in so that we can keep things moving along. Big changes here:

* Create a persister module in the write_buffer
* Check the size of the buffer (all open segments) every 10s and predict its size in 5 minutes based on growth rate
* If the projected growth rate is over the configured limit, either close segments that haven't received writes in a minute, or persist the largest tables (oldest 90% of their data)
* Added functions to table buffer to split a table based on 90% older timestamp data and 10% newer timestamp data, to persist the old and keep the new in memory
* When persisting, write the information in the WAL
* When replaying from the WAL, clear out the buffer of the persisted data
* Updated the object store path for persisted parquet files in a segment to have a file number since we can now have multiple parquet files per segment
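The growth-rate projection described above can be sketched as follows. This is a minimal, hypothetical illustration (the constants and function name are assumptions, not the actual influxdb3 code): sample the total buffer size every 10s and extrapolate 5 minutes ahead.

```rust
// Hypothetical sketch of the buffer growth projection: sample every 10s
// and predict the size 5 minutes out based on the observed growth rate.

const SAMPLE_INTERVAL_SECS: f64 = 10.0;
const PROJECTION_HORIZON_SECS: f64 = 300.0;

/// Project the buffer size 5 minutes out, given two samples taken
/// `SAMPLE_INTERVAL_SECS` apart.
fn projected_size(prev_bytes: u64, current_bytes: u64) -> u64 {
    let growth_per_sec =
        current_bytes.saturating_sub(prev_bytes) as f64 / SAMPLE_INTERVAL_SECS;
    current_bytes + (growth_per_sec * PROJECTION_HORIZON_SECS) as u64
}

fn main() {
    // Buffer grew from 100 MiB to 110 MiB in 10s -> +1 MiB/s -> +300 MiB in 5 min.
    let mib = 1024 * 1024;
    let projected = projected_size(100 * mib, 110 * mib);
    assert_eq!(projected, 410 * mib);
    println!("projected size: {} MiB", projected / mib);
}
```

If the projected value exceeds the configured limit, the commit's logic then either closes idle segments or persists the largest tables.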

* refactor: PR feedback
2024-06-25 10:10:37 -04:00
Jean Arhancet b6718e59e3
feat: add csv influx v1 ()
* feat: add csv influx v1

* fix: clippy error

* fix: cargo.lock

* fix: apply feedbacks

* test: add csv integration test

* fix: cargo audit
2024-06-25 08:45:55 -04:00
Trevor Hilton 5cb7874b2c
feat: v3 write API with series key ()
Introduce the experimental series key feature to monolith, along with the new `/api/v3/write` API which accepts the new line protocol to write to tables containing a series key.

Series key
* The series key is supported in the `schema::Schema` type by the addition of a metadata entry that stores the series key members in their correct order. Writes that are received to `v3` tables must have the same series key for every single write.

Series key columns are `NOT NULL`
* Nullability of columns is enforced in the core `schema` crate based on a column's membership in the series key. So, when building a `schema::Schema` using `schema::SchemaBuilder`, the arrow `Field`s that are injected into the schema will have `nullable` set to false for columns that are part of the series key, as well as the `time` column.
* The `NOT NULL` _constraint_, if you can call it that, is enforced in the buffer (see [here](https://github.com/influxdata/influxdb/pull/25066/files#diff-d70ef3dece149f3742ff6e164af17f6601c5a7818e31b0e3b27c3f83dcd7f199R102-R119)) by ensuring there are no gaps in data buffered for series key columns.
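The nullability rule above can be sketched with plain structs. This is an illustration only; the real implementation works through `schema::SchemaBuilder` and arrow `Field`s, and the names below are stand-ins:

```rust
// Illustrative sketch: series key columns and `time` become NOT NULL.

#[derive(Debug, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Build fields where members of the series key, and the time column,
/// are marked non-nullable; all other columns stay nullable.
fn build_fields(columns: &[&str], series_key: &[&str]) -> Vec<Field> {
    columns
        .iter()
        .map(|col| Field {
            name: col.to_string(),
            nullable: !(series_key.contains(col) || *col == "time"),
        })
        .collect()
}

fn main() {
    let fields = build_fields(&["region", "host", "usage", "time"], &["region", "host"]);
    // Only the plain field column remains nullable.
    let nullable: Vec<_> = fields
        .iter()
        .filter(|f| f.nullable)
        .map(|f| f.name.as_str())
        .collect();
    assert_eq!(nullable, vec!["usage"]);
}
```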

Series key columns are still tags
* Columns in the series key are annotated as tags in the arrow schema, which for now means that they are stored as Dictionaries. This was done to avoid having to support a new column type for series key columns.

New write API
* This PR introduces the new write API, `/api/v3/write`, which accepts the new `v3` line protocol. Currently, the only part of the new line protocol proposed in https://github.com/influxdata/influxdb/issues/24979 that is supported is the series key. New data types are not yet supported for fields.

Split write paths
* To support the existing write path alongside the new write path, a new module was set up to perform validation in the `influxdb3_write` crate (`write_buffer/validator.rs`). This re-uses the existing write validation logic, and replicates it with needed changes for the new API. I refactored the validation code to use a state machine over a series of nested function calls to help distinguish the fallible validation/update steps from the infallible conversion steps.
* The code in that module could potentially be refactored to reduce code duplication.
2024-06-17 14:52:06 -04:00
Michael Gattozzi 5c146317aa
chore: Update Rust to 1.79.0 ()
Fairly quiet update for us. The only change was around using the numeric
constants now built into the primitives rather than the ones from `std`

https://rust-lang.github.io/rust-clippy/master/index.html#/legacy_numeric_constants

Release post: https://blog.rust-lang.org/2024/06/13/Rust-1.79.0.html
2024-06-13 13:56:39 -04:00
Jean Arhancet 62d1c67b14
refactor: remove arrow_batchtes_to_json ()
* refactor: remove arrow_batchtes_to_json

* test: query v3 json format
2024-06-10 15:25:12 -04:00
Draco 84b38f5a06
fix: buffer size typo () 2024-06-07 12:48:04 -04:00
Brandon Pfeifer a5eba2f8f2
fix: only execute "build_dev" on non-fork branches () 2024-06-05 15:06:52 -04:00
Trevor Hilton 039dea2264
refactor: add dedicated type for serializing catalog tables ()
Remove reliance on data_types::ColumnType

Introduce TableSnapshot for serializing table information in the catalog.

Remove the columns BTree from the TableDefinition and use the schema
directly. BTrees are still used to ensure column ordering when tables are
created, or columns added to existing tables.

The custom Deserialize impl on TableDefinition used to block duplicate
column definitions in the serialized data. This preserves that behaviour
using serde_with and extends it to the other types in the catalog, namely
InnerCatalog and DatabaseSchema.

The serialization test for the catalog was extended to include multiple
tables in a database and multiple columns spanning the range of available
types in each table.

Snapshot testing was introduced using the insta crate to check the
serialized JSON form of the catalog, and help catch breaking changes
when introducing features to the catalog.

Added a test that verifies the no-duplicate key rules when deserializing
the map components in the Catalog
2024-06-04 11:38:43 -04:00
Trevor Hilton faab7a0abc
fix: writes with incorrect schema should fail ()
* test: add reproducer for 
* fix: validate schema of lines in lp and return error for invalid fields
2024-05-29 09:48:50 -04:00
Paul Dix 2ac986ae8a
feat: Add last_write_time and table buffer size ()
This adds tracking of the instant of the last write to open buffer segment and methods to the table buffer to compute the estimated memory size of it.

These will be used by a background task that will continuously check to see if tables should be persisted ahead of time to free up buffer memory space.

Originally, I had hoped to have the size tracking happen as the buffer was built so that returning the size would be zero cost (i.e. just returning a value), but I found in different kinds of testing that I wasn't able to get something that was even close to accurate. So for now it will use this more expensive computed method and we'll check on this periodically (every couple of seconds) to see when to persist.
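The tracking described above can be sketched in a few lines. This is a simplified stand-in for the real table buffer (types and fields are illustrative): the last write instant is updated on each write, while the size is computed on demand by walking the buffered data.

```rust
use std::time::{Duration, Instant};

// Illustrative sketch (not the actual influxdb3 types) of tracking the
// last write time and computing an estimated buffer size on demand.
struct TableBuffer {
    rows: Vec<(i64, f64)>, // (timestamp, value) pairs, stand-ins for real columns
    last_write: Instant,
}

impl TableBuffer {
    fn new() -> Self {
        Self { rows: Vec::new(), last_write: Instant::now() }
    }

    fn write(&mut self, ts: i64, value: f64) {
        self.rows.push((ts, value));
        self.last_write = Instant::now();
    }

    /// Walk the buffered data and estimate its memory footprint. This is
    /// the "more expensive computed method" the commit describes, run
    /// periodically rather than maintained incrementally.
    fn estimated_size_bytes(&self) -> usize {
        self.rows.len() * std::mem::size_of::<(i64, f64)>()
    }

    /// How long since the last write, used to decide whether to persist
    /// a table ahead of time.
    fn idle_for(&self) -> Duration {
        self.last_write.elapsed()
    }
}

fn main() {
    let mut buf = TableBuffer::new();
    buf.write(1, 1.0);
    buf.write(2, 2.0);
    assert_eq!(buf.estimated_size_bytes(), 2 * 16);
    println!("idle for {:?}", buf.idle_for());
}
```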
2024-05-21 10:45:35 -04:00
Trevor Hilton 220e1f4ec6
refactor: expose system tables by default in edge/pro () 2024-05-17 12:39:08 -04:00
Trevor Hilton 0201febd52
feat: add the `system.queries` table ()
The system.queries table is now accessible when queries are initiated
in debug mode. Debug mode is not currently enabled via the HTTP API, so
for now the table is only accessible via the gRPC interface.

The system.queries table lists all queries in the QueryLog on the
QueryExecutorImpl.
2024-05-17 12:04:25 -04:00
Trevor Hilton 1cb3652692
feat: add SystemSchemaProvider to QueryExecutor ()
A shell for the `system` table provider was added to the QueryExecutorImpl
which currently does not do anything, but will enable us to tie the
different system table providers into it.

The QueryLog was elevated from the `Database`, i.e., namespace provider,
to the QueryExecutorImpl, so that it lives across queries.
2024-05-17 11:21:01 -04:00
Michael Gattozzi 2381cc6f1d
fix: make DB Buffer use the up to date schema ()
Alternate Title: The DB Schema only ever has one table

This is a story of subtle bugs, gnashing of teeth, and hair pulling.
Gather round as I tell you the tale of an Arc that pointed to an
outdated schema.

In  we introduced an Index for the database as this will allow us
to perform faster queries. When we added that code this check was added:

```rust
if !self.table_buffers.contains_key(&table_name) {
    // TODO: this check shouldn't be necessary. If the table doesn't exist in the catalog
    // and we've gotten here, it means we're dropping a write.
    if let Some(table) = self.db_schema.get_table(&table_name) {
        self.table_buffers.insert(
            table_name.clone(),
            TableBuffer::new(segment_key.clone(), &table.index_columns()),
        );
    } else {
        return;
    }
}
```

Adding the return there let us continue on with our day and make the
tests pass. However, just because these tests passed didn't mean the
code was correct as I would soon find out. With a follow up ticket of
 created we merged the changes and I began to debug the issue.

Note we had the assumption of dropping a single write due to limits
because the limits test is what failed. What began was a chase of a few
days to prove that the limits weren't what was failing. This was a bit
long but the conclusion was that the limits weren't causing it, but it
did expose the fact that a Database only ever had one table which was
weird.

I then began to dig into this a bit more. Why would there only be one
table? We weren't just dropping one write, we were dropping all but
*one* write or so it seemed. Many printlns/hours later it became clear
that we were actually updating the schema! It existed in the Catalog,
but not in the pointer to the schema in the DatabaseBuffer struct so
what gives?

Well we need to look at [another piece of code](8f72bf06e1/influxdb3_write/src/write_buffer/mod.rs (L540-L541)).

In the `validate_or_insert_schema_and_partitions` function for the
WriteBuffer we have this bit of code:

```rust
// The (potentially updated) DatabaseSchema to return to the caller.
let mut schema = Cow::Borrowed(schema);
```

As we pass in a reference to the schema in the catalog. However, when we
[go a bit further down](8f72bf06e1/influxdb3_write/src/write_buffer/mod.rs (L565-L568))
we see this code:

```rust
    let schema = match schema {
        Cow::Owned(s) => Some(s),
        Cow::Borrowed(_) => None,
    };
```

What this means is that if we make a change we clone the original and
update it. We *aren't* making a change to the original schema. When we
go back up the call stack we get to [this bit of code](8f72bf06e1/influxdb3_write/src/write_buffer/mod.rs (L456-L460)):

```rust
    if let Some(schema) = result.schema.take() {
        debug!("replacing schema for {:?}", schema);


        catalog.replace_database(sequence, Arc::new(schema))?;
    }
```

We are updating the catalog with the new schema, but how does that work?

```rust
        inner.databases.insert(db.name.clone(), db);
```

Oh. Oh no. We're just overwriting it. Which means that the
DatabaseBuffer has an Arc to the *old* schema, not the *new* one. Which
means that the buffer will get the first copy of the schema with the
first new table, but *none* of the other ones. The solution is to make
sure that the buffer is passed the current schema so that it can use the most
up to date version from the catalog. This commit makes those changes
to make sure it works.

This was a very very subtle mutability/pointer bug given the
intersection of valid borrow checking and some writes making it in, but
luckily we caught it. It does mean though that until this fix is in, we
can consider changes between the Index PR and now to be subtly broken, and
they shouldn't be used for anything beyond writing to a single table per DB.

TL;DR We should ask the Catalog what the schema is as it contains the up
to date version of it.
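The whole bug can be reproduced in miniature. The sketch below uses simplified stand-in types (a `HashMap` catalog and an `add_table` helper that are illustrative, not the real influxdb3 code) to show how the Cow-based update replaces the Arc in the catalog while an earlier clone of the Arc keeps pointing at the old schema:

```rust
use std::borrow::Cow;
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-in for the real DatabaseSchema.
#[derive(Clone, Debug)]
struct DatabaseSchema {
    tables: Vec<String>,
}

/// Mimics the validate/replace flow: mutate via Cow, and if a clone was
/// made (Cow::Owned), overwrite the catalog entry with a *new* Arc.
fn add_table(catalog: &mut HashMap<String, Arc<DatabaseSchema>>, db: &str, table: &str) {
    let new_schema: Option<DatabaseSchema> = {
        let mut schema = Cow::Borrowed(catalog[db].as_ref());
        if !schema.tables.contains(&table.to_string()) {
            schema.to_mut().tables.push(table.to_string());
        }
        match schema {
            Cow::Owned(s) => Some(s),
            Cow::Borrowed(_) => None,
        }
    };
    if let Some(updated) = new_schema {
        // Just overwriting: any previously cloned Arc still sees the old schema.
        catalog.insert(db.to_string(), Arc::new(updated));
    }
}

fn main() {
    let mut catalog: HashMap<String, Arc<DatabaseSchema>> = HashMap::new();
    catalog.insert(
        "db".to_string(),
        Arc::new(DatabaseSchema { tables: vec!["cpu".to_string()] }),
    );

    // The buffer holds its own Arc to the schema, as DatabaseBuffer did.
    let buffer_schema = Arc::clone(&catalog["db"]);

    add_table(&mut catalog, "db", "mem");

    // The catalog sees the new table, but the buffer's Arc is stale.
    assert_eq!(catalog["db"].tables.len(), 2);
    assert_eq!(buffer_schema.tables.len(), 1);
}
```

The fix amounts to what the TL;DR says: don't cache the Arc, ask the catalog for the current schema.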

Closes 
2024-05-16 11:08:43 -04:00
Trevor Hilton 4901982c45
refactor: cleanup unused methods in Bufferer trait () 2024-05-16 09:34:08 -04:00
Trevor Hilton adeb1a16e3
chore: sync latest core () 2024-05-16 09:09:47 -04:00
Trevor Hilton 8f72bf06e1
chore: use latest `influxdb3_core` changes ()
Introduction of the `TokioDatafusionConfig` clap block for configuring the DataFusion runtime - this exposes many new `--datafusion-*` options on start, including `--datafusion-num-threads`

To accommodate the renaming of `QueryNamespaceProvider` to `QueryDatabase` in `influxdb3_core`, I renamed the `QueryDatabase` type to `Database`.

Fixed tests that broke as a result of sync.
2024-05-13 12:33:50 -04:00
Jamie Strandboge 6f3d6b1b7e
chore(README): add preliminary and basic usage instructions ()
* chore(README): add preliminary and basic usage instructions

* chore(README): remove references to _series_id. Thanks hiltontj
2024-05-10 14:41:57 -05:00
Michael Gattozzi 7a2867b98b
feat: Store precision in WAL for replayability ()
Up to this point we assumed that the precision for everything was nanoseconds.
While we do write and persist data as nanoseconds, we made this assumption for
the WAL. However, we store the original line protocol data. If we want it to be
replayable we need to include the precision and use it when loading the
WAL from disk. This commit changes the code to do that, and we can see that the
data is definitely persisted as the WAL is now bigger in the tests.
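Why the stored precision matters for replay can be shown with a small sketch. The enum name and variants below are assumptions for illustration, not the actual WAL types:

```rust
// Illustrative sketch: the same raw line-protocol timestamp means
// different instants depending on the precision it was written with,
// so the WAL must record the precision to be replayable.

#[derive(Clone, Copy, Debug)]
enum Precision {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Normalize a raw line-protocol timestamp to nanoseconds.
fn to_nanos(ts: i64, precision: Precision) -> i64 {
    match precision {
        Precision::Second => ts * 1_000_000_000,
        Precision::Millisecond => ts * 1_000_000,
        Precision::Microsecond => ts * 1_000,
        Precision::Nanosecond => ts,
    }
}

fn main() {
    // Replaying "cpu usage=1 1718000000" gives an instant in 2024 if it
    // was written with second precision...
    assert_eq!(to_nanos(1_718_000_000, Precision::Second), 1_718_000_000_000_000_000);
    // ...but an instant shortly after the 1970 epoch if nanosecond
    // precision is wrongly assumed.
    assert_eq!(to_nanos(1_718_000_000, Precision::Nanosecond), 1_718_000_000);
}
```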
2024-05-08 13:05:24 -04:00
Trevor Hilton 9354c22f2c
chore: remove _series_id ()
Removed the _series_id column that stored a SHA256 hash of the tag set
for each write.

Updated all test assertions that made reference to it.

Corrected the limits on columns to no longer account for the additional
_series_id column.
2024-05-08 12:28:49 -04:00
Trevor Hilton 09fe268419
chore: clean up heappy, pprof, and jemalloc ()
* chore: clean up heappy, pprof, and jemalloc

Setup the use of jemalloc as default allocator using tikv-jemallocator
crate instead of tikv-jemalloc-sys.

Removed heappy and pprof, and also cleaned up all the mutually exclusive
compiler flags for using heappy as the allocator.

* chore: remove heappy from ci
2024-05-06 15:21:18 -04:00
Paul Dix 8e79667776
feat: Implement index for buffer ()
* feat: Implement index for buffer

This implements an index for the data in the table buffers. For now, by default, it indexes all tags, keeping a mapping of tag key/value pair to the row ids that it has in the buffer. When queries ask for record batches from the table buffer, the filter expression is evaluated to determine if a record batch can be built on the fly using only the row ids that match the index. If we don't have it in the index, the entire record batch from the buffer will be returned.

This also updates the logic in segment state to only request a record batch with the projection. The query executor was updated so that it pushes the filter and projection down to the request to get table chunks.

While implementing this, I believe I uncovered a bug where when limits are hit, a write still attempts to get buffered. I'll log a follow up to look at that.
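The index shape described above (tag key/value pair to row ids) can be sketched as follows. The type and method names are illustrative stand-ins, not the actual TableBuffer code:

```rust
use std::collections::HashMap;

// Simplified sketch of the buffer's tag index: a map from tag
// key/value pairs to the row ids that hold them.

#[derive(Default)]
struct TagIndex {
    rows_by_tag: HashMap<(String, String), Vec<usize>>,
}

impl TagIndex {
    /// Record that `row_id` has `value` for tag `key`.
    fn insert(&mut self, key: &str, value: &str, row_id: usize) {
        self.rows_by_tag
            .entry((key.to_string(), value.to_string()))
            .or_default()
            .push(row_id);
    }

    /// Row ids matching `key = value`, or None if the pair isn't in the
    /// index, in which case the caller falls back to returning the
    /// entire record batch from the buffer.
    fn matching_rows(&self, key: &str, value: &str) -> Option<&[usize]> {
        self.rows_by_tag
            .get(&(key.to_string(), value.to_string()))
            .map(Vec::as_slice)
    }
}

fn main() {
    let mut index = TagIndex::default();
    index.insert("host", "a", 0);
    index.insert("host", "b", 1);
    index.insert("host", "a", 2);
    assert_eq!(index.matching_rows("host", "a"), Some(&[0, 2][..]));
    assert_eq!(index.matching_rows("region", "us"), None);
}
```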

* refactor: Update for PR feedback

* chore: cargo update to address deny failure
2024-05-06 12:59:50 -04:00
Michael Gattozzi c88cb5f093
feat: build binaries and Docker images in CI ()
For releases we need to have Docker images and binary images available for the
user to actually run influxdb3. These CI changes will build the binaries on a
release tag and the Docker image as well, test, sign, and publish them and make
them available for download.

Co-Authored-By: Brandon Pfeifer <bpfeifer@influxdata.com>
2024-05-03 16:39:42 -04:00
Michael Gattozzi 7138019636
chore: Upgrade to Rust 1.78.0 ()
This fixes new lints that have come up in the latest edition of clippy and moves
.cargo/config to .cargo/config.toml as the previous filename is now deprecated.
2024-05-02 13:39:20 -04:00
Michael Gattozzi 43368981c7
feat: implement parquet cache persistence ()
* feat: use concrete type for Persister

Up to this point we'd been using a generic `Persister` trait, however,
in practice even for tests we only use one singular type, the
`PersisterImpl`. In order to share the `MemoryPool` between it and the
upcoming `ParquetCache` we need it to be the concrete type. This
simplifies the code to grok as well by removing unneeded generic bounds.

* fix: new_with_partition_key fn name typo

* feat: implement parquet cache persistence

* fix: incorporate feedback and don't hold across await
2024-04-29 14:34:32 -04:00
Jure Bajic db8c8d5cc4
feat: Add `with_params_from` method to client's query request builder ()
Closes 
2024-04-29 13:08:51 -04:00
Trevor Hilton 0d5b591ec9
chore: point at latest core ()
Minor core update to bring in security updates and cargo optimizations from core.
2024-04-23 12:55:30 -04:00
Trevor Hilton eb80b96a2c
feat: QoL improvements to the load generator and analysis tools ()
* feat: add seconds to generated load files

This adds seconds to the time string portion of the generated files from
load generation runs. Previously, if the generator was run more than once
in the same minute, latter runs would fail because the results files
already exist.

* refactor: make query/write/system graphs optional based on run

Made the analysis tool have optional graphs based on what was actually
generated.

* refactor: change the time string format in generated load files
2024-04-15 10:58:36 -04:00
Michael Gattozzi 4afbebc73e
feat: Add and hook in an in memory Parquet cache ()
This adds an in memory Parquet cache to the WriteBuffer. With this we
now have a cache that Parquet files will be queried from when a query
does come in. Note this change *does not* actually let us persist any
data. This merely adds the cache. Future changes will add the ability
to cache the data as well as the logic around what should be cached.

As this doesn't allow any data to be cached or queried, a test has not
been added at this time, but one will be in future PRs.
2024-04-10 15:02:03 -04:00
Paul Dix 8f59f935c5
feat: add basic load generator comparison app ()
* feat: add basic load generator comparison app

* refactor: PR feedback
2024-04-05 11:44:08 -04:00
Trevor Hilton 557b939b15
refactor: make end argument common to query and write load generation ()
* refactor: make end common to load generation tool

Made the --end argument common to both the query and write load generation
runners.

A panic message was also added in the table buffer where unwraps were
causing panics.

* refactor: load gen print statements for consistency
2024-04-04 10:13:08 -04:00
Trevor Hilton 51ff5ebbaf
feat: add the `full` sub-command to load generator ()
* refactor: query/write load gen arg interface

Refactored the argument interface for the query and write load gen
commands to make them easier to unify in a new `full` command.

In summary:
- remove the query sampling interval
- make short-form of --querier-count 'q' instead of 'Q'
- remove the short-form for --query-format
- remove --spec-path in favour of --querier-spec and --writer-spec
  for specifying spec path of the `query` and `write` loads, resp.

* feat: produce error on 0s sampling interval

* refactor: split out query/write command configs

Refactored the query and write command clap configurations to make
them composable for the full command

* refactor: expose query and write runner for composability

Refactored the query and write runners so that they can be
composed into the full runner.

* feat: add the full load generator sub-command

Implement a new sub-command for the load generator: full

This runs both the query and write loads simultaneously, and exposes
the unified CLI of the two commands, respectively.

* chore: cargo update to fix audit
2024-04-03 20:09:12 -04:00
Michael Gattozzi 2291ebeae7
feat: sort and dedupe on persist ()
When persisting parquet files we now will sort and dedupe on persist using the
COMPACT operation implemented in IOx Query. Note that right now we don't choose
any column to sort on and default to no column. This means that we dedupe and
sort on whatever the default behavior is for the COMPACT operation. Future
changes can figure out what columns to sort by when compacting the data.
2024-04-03 15:13:36 -04:00
Trevor Hilton 1982244e65
chore: update to latest core ()
* chore: update to latest core
2024-04-03 09:36:28 -04:00
Trevor Hilton 2dde602995
feat: report system stats in load generator ()
* feat: report system stats in load generator

Added the mechanism to report system stats during load generation. The
following stats are saved in a CSV file:

- cpu_usage
- disk_written_bytes
- disk_read_bytes
- memory
- virtual_memory

This only works when running the load generator against a local instance
of influxdb3, i.e., one that is running on your machine.

Generating system stats is done by passing the --system-stats flag to the
load generator.
2024-04-02 17:16:17 -04:00
Paul Dix 1b3d279d70
fix: make write load_generators wait () 2024-04-02 17:08:44 -04:00
Trevor Hilton cc55685886
feat: improved results directory structure for load generation ()
* feat: add new clap args for results gen

Added the results_dir and configuration_name args
to the common load generator config which will be
used in generating the results directory structure.

* feat: load gen results directory structure

Write and query load generation runners will now setup files in a
results directory, using a specific structure. Users of the load tool
can specify a `results_dir` to save these results, or the tool will
pick a `results` folder in the current directory, by default.

Results will be saved in files using the following path convention:

results/<s>/<c>/<write|query|system>_<time>.csv

- <s>: spec name
- <c>: configuration name, specified by user with the `config-name`
  arg, or by default, will use the revision SHA of the running server
- <write|query|system>: which kind of results file
- <time>: a timestamp in the form 'YYYY-MM-DD-HH-MM'
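The path convention above can be sketched as a small helper. The function and parameter names are hypothetical, for illustration only:

```rust
// Hypothetical helper assembling the results path convention:
// results/<spec>/<config>/<write|query|system>_<time>.csv

fn results_path(
    results_dir: &str,
    spec_name: &str,
    config_name: &str,
    kind: &str,      // "write", "query", or "system"
    timestamp: &str, // 'YYYY-MM-DD-HH-MM'
) -> String {
    format!("{results_dir}/{spec_name}/{config_name}/{kind}_{timestamp}.csv")
}

fn main() {
    let path = results_path("results", "one_mil", "local-dev", "write", "2024-04-02-14-06");
    assert_eq!(path, "results/one_mil/local-dev/write_2024-04-02-14-06.csv");
}
```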

The setup code was unified for both write and query commands, in
preparation for the creation of a system stats file, as well as for
the capability to run both query and write at the same time; however,
those remain unimplemented as of this commit.

* feat: /ping API support on influxdb3_client::Client
2024-04-02 14:06:51 -04:00
Trevor Hilton e0465843be
feat: `/ping` API to serve version and revision ()
* feat: /ping API to serve version

The /ping API was added, which is served at GET and
POST methods. The API responds with a JSON body
containing the version and revision of the build.

A new crate was added, influxdb3_process, which
takes the process_info.rs module from the influxdb3
crate, and puts it in a separate crate so that other
crates (influxdb3_server) can depend on it. This was
needed in order to have access to the version and
revision values, which are generated at build time,
in the HTTP API code of influxdb3_server.

An E2E test was added to check that /ping works.

E2E TestServer can now have logs emitted using the
TEST_LOG environment variable.
2024-04-01 16:57:10 -04:00
Paul Dix 696456b280
refactor: buffer using Arrow builders ()
* refactor: Buffer to use Arrow builders

This refactors the TableBuffer to use the Arrow builders for the data. This also removes cloning from the table buffer in favor of yielding record batches. This is part of a test to see if querying the buffer will be faster with this method, since it avoids a bunch of data copies.

* fix: adding columns when data is in buffer

This fixes a bug where the Arrow schema in the Catalog wouldn't get updated when columns are added to a table. Also fixes bug in the buffer where a new column wouldn't have the correct number of rows in it (now fixed by adding in nulls for previous rows).

* refactor: PR feedback in buffer_segment
2024-03-29 15:29:00 -04:00
Trevor Hilton b55bfba475
feat: initial query load generator ()
Implement the query load generator. The design follows that of the existing write load generator.

A QuerySpec is defined that will be used by the query command to generate a set of queriers to perform queries against a running server in parallel.
2024-03-29 14:58:03 -04:00
Trevor Hilton 8d49b5e776
fix: tests broken by recent merge () 2024-03-28 15:41:49 -04:00
Michael Gattozzi 9f7940d56f
fix: set default logging level to info ()
When running influxdb3 we did not have a default log level. As a result we
couldn't even tell if the program was running. This change provides a
default level unless a user-supplied one is given.
2024-03-28 14:15:57 -04:00
Trevor Hilton 7784749bca
feat: support v1 and v2 write APIs ()
feat: support v1 and v2 write APIs

This adds support for two APIs: /write and /api/v2/write. These implement the v1 and v2 write APIs, respectively. In general, the difference between these and the new /api/v3/write_lp API is in the request parsing. We leverage the WriteRequestUnifier trait from influxdb3_core to handle parsing of v1 and v2 HTTP requests, to keep the error handling at that level consistent with distributed versions of InfluxDB 3.0. Specifically, we use the SingleTenantRequestUnifier implementation of the trait.

Changes:
- Addition of two new routes to the route_request method in influxdb3_server::http to serve /write and /api/v2/write requests.
- Database name validation was updated to handle cases where retention policies may be passed in /write requests, and to also reject empty names. A unit test was added to verify the validate_db_name function.
- HTTP request authorization in the router will extract the full Authorization header value, and store it in the request extensions; this is used in the write request parsing from the core iox_http crate to authorize write requests.
- E2E tests to verify correct HTTP request parsing / response behaviour for both /write and /api/v2/write APIs
- E2E tests to check that data sent in through /write and /api/v2/write can be queried back
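The database name validation rules described above (split off an optional retention policy from `/write` requests, reject empty names) can be sketched like this. This is an illustration of the rules, not the actual `validate_db_name` implementation:

```rust
// Sketch: /write requests may pass "db/rp"; empty names are rejected.

fn validate_db_name(name: &str) -> Result<(String, Option<String>), String> {
    let (db, rp) = match name.split_once('/') {
        Some((db, rp)) => (db, Some(rp)),
        None => (name, None),
    };
    if db.is_empty() {
        return Err("database name cannot be empty".to_string());
    }
    Ok((db.to_string(), rp.map(str::to_string)))
}

fn main() {
    assert_eq!(
        validate_db_name("mydb/autogen"),
        Ok(("mydb".to_string(), Some("autogen".to_string())))
    );
    assert_eq!(validate_db_name("mydb"), Ok(("mydb".to_string(), None)));
    assert!(validate_db_name("").is_err());
}
```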
2024-03-28 13:33:17 -04:00
Trevor Hilton c79821b246
feat: add `_series_id` to tables on write ()
feat: add _series_id to tables on write

New _series_id column is added to tables; this stores a 32 byte SHA256 hash of the tag set of a line of Line Protocol. The tag set is checked for sort order, then sorted if not already, before producing the hash.

Unit tests were added to check hashing and sorting functions work.

Tests that performed queries needed to be modified to account for the new _series_id column; in general, SELECT * queries were altered to use a select clause with specific column names.

The Column limit was increased to 501 internally, to account for the new _series_id column, but the user-facing limit is still 500
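The sort-then-hash step described above can be sketched as follows. Note the assumption: the commit uses a 32-byte SHA256, but to stay dependency-free this sketch substitutes std's `DefaultHasher`; the structure (check sort order, sort only if needed, hash the ordered tag set) is what matters:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a series id from a tag set. The real code produces a SHA256
/// digest; DefaultHasher here is a stand-in for illustration.
fn series_id(tags: &[(&str, &str)]) -> u64 {
    let mut tags = tags.to_vec();
    // Check sort order first and sort only if necessary.
    if !tags.windows(2).all(|w| w[0].0 <= w[1].0) {
        tags.sort_by(|a, b| a.0.cmp(b.0));
    }
    let mut hasher = DefaultHasher::new();
    for (k, v) in &tags {
        k.hash(&mut hasher);
        v.hash(&mut hasher);
    }
    hasher.finish()
}

fn main() {
    // The same tag set in any order yields the same series id.
    let a = series_id(&[("host", "a"), ("region", "us")]);
    let b = series_id(&[("region", "us"), ("host", "a")]);
    assert_eq!(a, b);
    // A different tag set yields a different id.
    assert_ne!(a, series_id(&[("host", "b"), ("region", "us")]));
}
```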
2024-03-26 15:22:19 -04:00
Paul Dix 12636ca759
fix: loader error with single wal file ()
Fixes a bug where the loader would error out if there was a wal segment file for a previous segment that hadn't been persisted, and a new wal file had to be created for the new open segment. This would show up as an error if you started the server and then stopped and restarted it without writing any data.
2024-03-25 15:40:21 -04:00