Paul Dix 3265960010
refactor: implement new wal and refactor write buffer (#25196)
* feat: refactor WAL and WriteBuffer

There is a ton going on here, but here are the high-level points. This implements a new WAL, which is backed entirely by object store. It then updates the WriteBuffer to work with the new WAL, which also required an update to how the Catalog is modified and persisted.

The concept of Segments has been removed. Previously there was a separate WAL per segment of time. Instead, there is now a single WAL that all writes and updates flow into. Data within the write buffer is organized into chunks within each table, based on the timestamp of the row data. These chunks are known as the Level0 files, which will be persisted as Parquet into object store. The default chunk duration for Level0 files is 10 minutes.
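
As a rough illustration of timestamp-based chunking, the sketch below maps a row timestamp to the start of its 10-minute chunk. The names `CHUNK_DURATION_NANOS` and `chunk_start` are hypothetical and are not taken from the actual code in this commit; nanosecond timestamps are also an assumption.

```rust
/// Assumed default from the commit message: Level0 chunks span 10 minutes.
/// (Hypothetical constant name; timestamps are assumed to be nanoseconds.)
const CHUNK_DURATION_NANOS: i64 = 10 * 60 * 1_000_000_000;

/// Returns the start time of the chunk that a row with the given timestamp
/// falls into, by rounding the timestamp down to a chunk boundary.
/// `rem_euclid` keeps the rounding direction correct for negative timestamps.
fn chunk_start(timestamp_nanos: i64) -> i64 {
    timestamp_nanos - timestamp_nanos.rem_euclid(CHUNK_DURATION_NANOS)
}
```

Under these assumptions, all rows whose timestamps round down to the same boundary would land in the same chunk of a table.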

The WAL is written as single files that get created at the configured WAL flush interval (1s by default). After a certain number of files have been created, the server will attempt to snapshot the WAL (default is to snapshot the first 600 files of the WAL after we have 900 total, i.e. snapshot 10 minutes of WAL data).
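
The snapshot arithmetic can be sketched as follows. This is only an illustration of the defaults described above, not the code in this commit: the names `SNAPSHOT_SIZE`, `SNAPSHOT_TRIGGER`, `files_to_snapshot`, and `snapshot_span` are hypothetical, and treating "after we have 900 total" as "at least 900" is an assumption.

```rust
use std::time::Duration;

/// Assumed defaults from the commit message (hypothetical constant names).
const SNAPSHOT_SIZE: u32 = 600; // number of oldest WAL files to snapshot
const SNAPSHOT_TRIGGER: u32 = 900; // total WAL files that triggers a snapshot

/// Returns how many of the oldest WAL files to snapshot, or `None` if the
/// trigger has not been reached yet.
fn files_to_snapshot(wal_file_count: u32) -> Option<u32> {
    if wal_file_count >= SNAPSHOT_TRIGGER {
        Some(SNAPSHOT_SIZE)
    } else {
        None
    }
}

/// Time span covered by one snapshot, given one WAL file per flush interval.
fn snapshot_span(flush_interval: Duration) -> Duration {
    flush_interval * SNAPSHOT_SIZE
}
```

With the default 1s flush interval, `snapshot_span` gives 600 seconds, which matches the 10 minutes of WAL data mentioned above.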

The design goal with this is to persist 10 minute chunks of data that are no longer receiving writes, while clearing out old WAL files. This works if data is being written around "now" with no more than 5 minutes of delay. If we continue to have delayed writes, a snapshot of all data will be forced in order to clear out the WAL and free up memory in the buffer.

Overall, this structure of a single WAL, with flushes, snapshots, and chunks in the queryable buffer, led to a simpler setup for the write buffer. I was able to clear out quite a bit of code related to the old segment organization.

Fixes #25142 and fixes #25173

* refactor: address PR feedback

* refactor: wal to replay and background flush on new

* chore: remove stray println
2024-08-01 15:04:15 -04:00
.cargo chore: Upgrade to Rust 1.78.0 (#24953) 2024-05-02 13:39:20 -04:00
.circleci fix: only execute "build_dev" on non-fork branches (#25044) 2024-06-05 15:06:52 -04:00
.github chore: Remove dependabot for our repo (#24693) 2024-02-26 13:38:20 -05:00
assets chore: Update README for InfluxDB main repo (#25101) 2024-06-27 12:50:05 -04:00
docker fix: Add docker folder back for CI (#24720) 2024-02-29 16:47:41 -05:00
influxdb3 refactor: implement new wal and refactor write buffer (#25196) 2024-08-01 15:04:15 -04:00
influxdb3_client fix: last cache catalog configuration tracks explicit vs. non-explicit value columns (#25185) 2024-07-24 11:00:40 -04:00
influxdb3_load_generator feat: QoL improvements to the load generator and analysis tools (#24914) 2024-04-15 10:58:36 -04:00
influxdb3_process chore: Upgrade to rustc 1.80 (#25193) 2024-07-25 11:38:18 -04:00
influxdb3_server refactor: implement new wal and refactor write buffer (#25196) 2024-08-01 15:04:15 -04:00
influxdb3_wal refactor: implement new wal and refactor write buffer (#25196) 2024-08-01 15:04:15 -04:00
influxdb3_write refactor: implement new wal and refactor write buffer (#25196) 2024-08-01 15:04:15 -04:00
iox_query_influxql_rewrite feat: extend InfluxQL rewriter for SELECT and EXPLAIN (#24726) 2024-03-05 15:40:16 -05:00
.editorconfig chore: editor config spacing for shell scripts 2022-12-13 11:12:11 +01:00
.gitattributes feat: implement jaeger-agent protocol directly (#2607) 2021-09-22 17:30:37 +00:00
.gitignore chore: clean up heappy, pprof, and jemalloc (#24967) 2024-05-06 15:21:18 -04:00
.kodiak.toml chore: Set default to squash 2022-01-25 15:57:10 +01:00
CONTRIBUTING.md docs: rename influxdb_iox to influxdata (#24577) 2024-01-16 13:34:23 -05:00
Cargo.lock refactor: implement new wal and refactor write buffer (#25196) 2024-08-01 15:04:15 -04:00
Cargo.toml refactor: implement new wal and refactor write buffer (#25196) 2024-08-01 15:04:15 -04:00
Dockerfile feat: build binaries and Docker images in CI (#24751) 2024-05-03 16:39:42 -04:00
Dockerfile.dockerignore fix: Readd the Dockerfile for the main branch (#24719) 2024-02-29 16:33:36 -05:00
LICENSE-APACHE fix: Add LICENSE (#430) 2020-11-10 12:10:07 -05:00
LICENSE-MIT fix: Add LICENSE (#430) 2020-11-10 12:10:07 -05:00
PROFILING.md docs: `PROFILING.md` (#25075) 2024-07-24 11:01:36 -04:00
README.md chore: Update README for InfluxDB main repo (#25101) 2024-06-27 12:50:05 -04:00
SECURITY.md chore: tweak wording and don't reference gpg key in SECURITY.md (#24838) 2024-03-25 14:34:36 -05:00
deny.toml chore: upgrade to sqlx 0.7.1 (#8266) 2023-07-19 12:18:57 +00:00
rust-toolchain.toml chore: Upgrade to rustc 1.80 (#25193) 2024-07-25 11:38:18 -04:00
rustfmt.toml chore: use Rust edition 2021 2021-10-25 10:58:20 +02:00

README.md


InfluxDB is the leading open source time series database for metrics, events, and real-time analytics.

Project Status

This main branch contains InfluxDB v3 in pre-release and under active development. Build artifacts are not yet generally available and official installation instructions will be coming later this year. For now, a Dockerfile is provided and can be adapted or used for inspiration by intrepid users.

Learn InfluxDB

Documentation | Community Forum | Community Slack | Blog | InfluxDB University | YouTube

Try InfluxDB Cloud for free and get started fast with no local setup required. Click here to start building your application on InfluxDB Cloud.

Installation

We have nightly and versioned Docker images, Debian packages, RPM packages, and tarballs of InfluxDB available on the InfluxData downloads page. We also provide the InfluxDB command line interface (CLI) client as a separate binary available at the same location.

If you are interested in building from source, see the building from source guide for contributors.

To begin using InfluxDB, visit our Getting Started with InfluxDB documentation.

License

The open source software we build is licensed under the permissive MIT and Apache 2 licenses. We've long held the view that our open source code should be truly open and our commercial code should be separate and closed.

Interested in joining the team building InfluxDB?

Check out current job openings at www.influxdata.com/careers today!