influxdb

Michael Gattozzi 4e2cb630b3 fix: Prevent Catalog UUID races for new nodes (#26160)	2025-03-18 11:25:08 -04:00
..
src	fix: Prevent Catalog UUID races for new nodes (#26160)	2025-03-18 11:25:08 -04:00
Cargo.toml	feat: catalog checkpoints (#26126)	2025-03-11 18:20:36 -04:00

fix: Prevent Catalog UUID races for new nodes (#26160)

When starting up a new cluster in Enterprise, multiple nodes might
start at the same time. This can leave us with multiple catalogs that
have different UUIDs in their in-memory representations.

For example:
- Let's say we have node0 and node1
- node0 and node1 start at the same time and both check object storage
  to see if there is a catalog to load
- They both see there is no catalog
- They both create a new one by generating a UUID and persisting it to
  object storage
- Whichever is written second is now the one with the correct UUID in
  its in-memory representation, while the other will likely not have
  the correct one until it is restarted
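
The interleaving above can be replayed deterministically. The following
is a minimal sketch, not the code from this change: the `HashMap` is a
stand-in for object storage, and `CATALOG_PATH`, the placeholder UUID
strings and `replay_overwrite_race` are hypothetical names chosen for
illustration.

```rust
use std::collections::HashMap;

// Hypothetical object key for the catalog; the real path is not given here.
const CATALOG_PATH: &str = "catalog";

/// Replays the steps listed above with an overwriting put.
/// Returns (node0 in-memory UUID, node1 in-memory UUID, persisted UUID).
fn replay_overwrite_race() -> (String, String, String) {
    // Stand-in for object storage: key -> persisted catalog UUID.
    let mut store: HashMap<String, String> = HashMap::new();

    // Both nodes check for a catalog before either has written one.
    let node0_sees_catalog = store.contains_key(CATALOG_PATH);
    let node1_sees_catalog = store.contains_key(CATALOG_PATH);
    assert!(!node0_sees_catalog && !node1_sees_catalog);

    // Each node generates its own UUID (placeholder strings here) and
    // persists it with a plain overwrite.
    let node0_id = String::from("uuid-from-node0");
    store.insert(CATALOG_PATH.to_string(), node0_id.clone());
    let node1_id = String::from("uuid-from-node1");
    store.insert(CATALOG_PATH.to_string(), node1_id.clone());

    // The second write wins in storage; node0 still holds its own UUID.
    let persisted = store[CATALOG_PATH].clone();
    (node0_id, node1_id, persisted)
}

fn main() {
    let (node0, node1, persisted) = replay_overwrite_race();
    assert_ne!(node0, persisted);
    assert_eq!(node1, persisted);
    println!("node0 holds {}, storage holds {}", node0, persisted);
}
```

In this replay node0 keeps a UUID that no longer matches what is in
storage, which is the stale state described in the last step.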

This isn't an issue in practice today, as Trevor notes in
https://github.com/influxdata/influxdb_pro/issues/600, but it could be
once we start using `--cluster-id` for licensing purposes. To prevent
this, we instead make the write to object storage use the Put mode. If
the catalog already exists, the write fails and the node that lost the
race just loads the other node's catalog.

For example, if node1 wins the race, then node0 will load the catalog
created by node1 and use that UUID instead.
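
A create-only write followed by a fallback load might look roughly like
the sketch below. It uses only the standard library: `InMemoryStore`,
`put_create`, `load_or_create_catalog` and the placeholder UUID strings
are hypothetical and are not the API touched by this change. In the Rust
`object_store` crate, the comparable behaviour is a put with
`PutMode::Create`, which returns an `AlreadyExists` error when the
object is already present; that this is the mode meant here is an
assumption.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical object key for the catalog.
const CATALOG_PATH: &str = "catalog";

#[derive(Debug, PartialEq)]
enum PutError {
    AlreadyExists,
}

/// Hypothetical in-memory stand-in for an object store.
#[derive(Default)]
struct InMemoryStore {
    objects: Mutex<HashMap<String, String>>,
}

impl InMemoryStore {
    /// Create-only put: succeeds only if nothing is stored at `path`.
    fn put_create(&self, path: &str, value: &str) -> Result<(), PutError> {
        let mut objects = self.objects.lock().unwrap();
        if objects.contains_key(path) {
            return Err(PutError::AlreadyExists);
        }
        objects.insert(path.to_string(), value.to_string());
        Ok(())
    }

    fn get(&self, path: &str) -> Option<String> {
        self.objects.lock().unwrap().get(path).cloned()
    }
}

/// Returns the catalog UUID this node ends up holding in memory.
fn load_or_create_catalog(store: &InMemoryStore, candidate_id: &str) -> String {
    if let Some(existing) = store.get(CATALOG_PATH) {
        return existing;
    }
    match store.put_create(CATALOG_PATH, candidate_id) {
        // This node won the race and keeps the UUID it generated.
        Ok(()) => candidate_id.to_string(),
        // This node lost the race and loads the winner's catalog.
        Err(PutError::AlreadyExists) => store
            .get(CATALOG_PATH)
            .expect("catalog must exist after a failed create"),
    }
}

/// Starts two nodes concurrently and returns the UUID each one holds.
fn race_two_nodes() -> (String, String) {
    let store = Arc::new(InMemoryStore::default());
    let handles: Vec<_> = vec!["uuid-from-node0", "uuid-from-node1"]
        .into_iter()
        .map(|id| {
            let store = Arc::clone(&store);
            thread::spawn(move || load_or_create_catalog(&store, id))
        })
        .collect();
    let mut ids = handles.into_iter().map(|h| h.join().unwrap());
    (ids.next().unwrap(), ids.next().unwrap())
}

fn main() {
    let (node0, node1) = race_two_nodes();
    assert_eq!(node0, node1);
    println!("both nodes use catalog UUID {}", node0);
}
```

Whichever thread wins, both nodes finish with the same UUID. The sketch
gets its atomicity from a mutex, whereas the real guarantee has to come
from the object store backend.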

This is hard to test because it requires a race condition to actually
occur, so I have not included a test: we could never really be sure the
race was exercised, and we rely on the underlying object store to
handle this for us. It's also unlikely to happen, since it can only
occur when a new cluster is initiated for the first time.
