* feat: This PR adds the -tsmfile flag to export
Adds the ability to use influx_inspect export to export data from a single TSM file, for example: `influx_inspect export -out - -tsmfile 000000006-000000002.tsm.bad -database thermo -retention autogen`.
* chore: add scaffolding for naive solution
* feat: test case scaffolding
* fix: implement check for series key before proceeding
* fix: add validation for ReadSeriesKeyMeasurement usage
* refactor: explicit use of series key len
* feat: add remaining check to index
* feat: add check to remaining files
Because the Len function is used as part of parseSeriesKey, the nil return from parseSeriesKey also needs to be accounted for, as that function is used in several different contexts.
* feat: expand test cases
* chore: go fmt
* chore: update test failure message
* chore: impl feedback on unnecessary sz checks
* feat: expand test cases
* fix: nil series key check
In both sections of index.go there is a pre-existing length check against the series key which should catch invalid values; perhaps this explains why it hasn't cropped up in the reported panics. For even more safety, we can also skip a nil key, because we know that subsequent calls that attempt to use this key will panic.
* fix: remove nil tags check
A key with no tags is valid, so we should not require BOTH the key and the tags to be nil. A key could be nil, which is invalid, yet still have tags; such a key would slip past the combined check, which we do not want.
* feat: extend test cases from feedback
* fix: extend checks for CompareSeriesKeys
* feat: add nilKeyHandler for shared key checking logic
* fix: logical error in nilKeyHandler
Prior to this, execution always fell through to the else at the end of the conditional branch, which caused unexpected behaviour and the failure of a number of tests.
* fix: return tags keep nil data
In a recent change to this, we agreed on a simple name == nil check for the actual data. As a follow-on, I realised that we don't actually want to return nil for the tags, even if they're not checked, because having no tags is a valid input, so we can simply return whatever we were passed unchanged.
* fix: use len == 0 for extra safety
* feat: extra test for blank series key
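The key check described in the messages above can be sketched roughly as follows. The function name `checkSeriesKey` and its signature are hypothetical and are not taken from the actual change (the real helper is nilKeyHandler, whose signature is not shown here). The sketch only illustrates rejecting a nil or empty key while passing the tags through unchanged.

```go
package main

import "fmt"

// checkSeriesKey is a hypothetical helper, not the real nilKeyHandler.
// It reports whether a series key name is safe to use. The tags are
// returned unchanged, because a key without tags is valid input.
func checkSeriesKey(name []byte, tags [][]byte) ([]byte, [][]byte, bool) {
	// len(name) == 0 covers both a nil key and an empty, non-nil key.
	if len(name) == 0 {
		return name, tags, false
	}
	return name, tags, true
}

func main() {
	// A nil key is rejected even though tags are present.
	_, _, ok := checkSeriesKey(nil, [][]byte{[]byte("host=a")})
	fmt.Println(ok) // false

	// A key with no tags is accepted.
	_, _, ok = checkSeriesKey([]byte("cpu"), nil)
	fmt.Println(ok) // true
}
```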
Some series files which are smaller than the standard
sizes cause SIGBUS in influx_inspect and influxd, because
entry iteration walks onto mapped memory not backed by the
file. Avoid walking off the end of the file while
iterating series entries in oddly sized files.
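A minimal sketch of bounded entry iteration, using a plain byte slice in place of a memory-mapped file. The entry layout (one flag byte followed by an 8-byte big-endian id) and the name `readIDs` are assumptions made for illustration; they are not the real series file format or API.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// entrySize is an assumed layout: 1 flag byte + 8-byte big-endian id.
const entrySize = 9

// readIDs walks fixed-size entries in data and stops as soon as fewer than
// entrySize bytes remain, so it never reads past the end of the slice.
func readIDs(data []byte) []uint64 {
	var ids []uint64
	for len(data) >= entrySize {
		// Assumed convention: a zero flag marks the end of written entries.
		if data[0] == 0 {
			break
		}
		ids = append(ids, binary.BigEndian.Uint64(data[1:entrySize]))
		data = data[entrySize:]
	}
	return ids
}

func main() {
	var buf []byte
	for _, id := range []uint64{7, 9} {
		entry := make([]byte, entrySize)
		entry[0] = 1
		binary.BigEndian.PutUint64(entry[1:], id)
		buf = append(buf, entry...)
	}
	// A truncated trailing entry, as in an oddly sized file.
	buf = append(buf, 1, 0, 0)
	fmt.Println(readIDs(buf)) // [7 9]
}
```

The loop condition compares the remaining length with the entry size before every read, which is the property that matters when the data is shorter than the expected segment size.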
closes https://github.com/influxdata/influxdb/issues/24508
Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
Absolute file paths in influx_inspect check-schema
cause an 'Invalid Argument' error. This was caused by
fs.WalkDir using fs.ValidPath. Replacing it with
filepath.WalkDir permits absolute paths.
closes https://github.com/influxdata/influxdb/issues/23987
This change writes the message
skipped missing file: /path/to/tsm.tsm
to stderr instead of stdout (or the output file if `-out` has been provided)
(cherry picked from commit a9bf1d54c1)
closes https://github.com/influxdata/influxdb/issues/23866
Co-authored-by: Ben Tasker <88340935+btasker@users.noreply.github.com>
Instead of writing out the complete fields.idx
file when it changes, write out incremental
changes that will be applied to the file on
close and startup.
closes https://github.com/influxdata/influxdb/issues/23653
Adds two commands, "check-schema" and
"merge-schema", to influx_inspect.
"check-schema" tests for field type
conflicts in all fields.idx files beneath
a directory, and "merge-schema" merges the
derived schemas if "check-schema" has been
run multiple times on different directories.
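A rough sketch of detecting field type conflicts while merging several derived schemas. The flat `schema` type keyed by "measurement.field" and the function name `mergeSchemas` are assumptions for illustration, not the data structures used by the actual commands.

```go
package main

import (
	"fmt"
	"sort"
)

// schema maps "measurement.field" to a field type name.
// This flat shape is an assumption made for the sketch.
type schema map[string]string

// mergeSchemas combines the schemas and returns, for each field, the sorted
// set of types seen, plus the sorted list of fields with more than one type.
func mergeSchemas(schemas ...schema) (merged map[string][]string, conflicts []string) {
	seen := map[string]map[string]bool{}
	for _, s := range schemas {
		for field, typ := range s {
			if seen[field] == nil {
				seen[field] = map[string]bool{}
			}
			seen[field][typ] = true
		}
	}
	merged = map[string][]string{}
	for field, types := range seen {
		for typ := range types {
			merged[field] = append(merged[field], typ)
		}
		sort.Strings(merged[field])
		if len(types) > 1 {
			conflicts = append(conflicts, field)
		}
	}
	sort.Strings(conflicts)
	return merged, conflicts
}

func main() {
	a := schema{"cpu.usage": "float", "cpu.count": "integer"}
	b := schema{"cpu.usage": "integer"}
	_, conflicts := mergeSchemas(a, b)
	fmt.Println(conflicts) // [cpu.usage]
}
```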
feat: estimate Cloud2 cardinality on 1.X databases
To ease migrations to Cloud 2 installations from
1.X databases, estimate Cloud 2 cardinality for
a data node (or OSS system).
closes https://github.com/influxdata/influxdb/issues/23356
influx_inspect verify -dir will no longer append the "/data" path to the dir. Files are checked recursively, so this will still include files in the "/data" path as well as other subdirectories.
closes https://github.com/influxdata/influxdb/issues/22572
Add a special value to the -out flag, a hyphen, to write to stdout.
While writing to stdout, send status messages to stderr instead of
stdout (the current behavior).
Closes https://github.com/influxdata/influxdb/issues/20974
When applied, this patch will add the -lponly flag to the export command
which instructs influx_inspect to only output line protocol without
comments and other out-of-band data.
The series index looks at a set of tombstones when querying the id for
a given key, but it does not look when asking for the offset for some
id, even if that id is deleted.
Update the verify tooling to check that the index agrees with the
deleted status of the id, but skip doing the extra checks if the
id is deleted.
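The verification rule can be sketched with a toy in-memory index. The `index` type, its method names, and `verifyEntry` are hypothetical stand-ins for illustration and do not mirror the real series index API.

```go
package main

import (
	"errors"
	"fmt"
)

// index is a toy stand-in for the series index.
type index struct {
	ids        map[string]uint64
	offsets    map[uint64]int64
	tombstones map[uint64]struct{}
}

func (x *index) isDeleted(id uint64) bool { _, ok := x.tombstones[id]; return ok }

// findIDByKey consults the tombstones and returns 0 for a deleted series.
func (x *index) findIDByKey(key string) uint64 {
	id := x.ids[key]
	if x.isDeleted(id) {
		return 0
	}
	return id
}

// findOffsetByID does not consult the tombstones.
func (x *index) findOffsetByID(id uint64) int64 { return x.offsets[id] }

// verifyEntry first checks that the index agrees on the deleted status,
// then skips the remaining checks when the id is deleted.
func verifyEntry(x *index, key string, id uint64, offset int64, deleted bool) error {
	if x.isDeleted(id) != deleted {
		return errors.New("index disagrees on deleted status")
	}
	if deleted {
		return nil
	}
	if x.findIDByKey(key) != id {
		return errors.New("index returned wrong id for key")
	}
	if x.findOffsetByID(id) != offset {
		return errors.New("index returned wrong offset for id")
	}
	return nil
}

func main() {
	x := &index{
		ids:        map[string]uint64{"cpu": 1, "mem": 2},
		offsets:    map[uint64]int64{1: 100, 2: 200},
		tombstones: map[uint64]struct{}{2: {}},
	}
	fmt.Println(verifyEntry(x, "cpu", 1, 100, false)) // <nil>
	fmt.Println(verifyEntry(x, "mem", 2, 200, true))  // <nil>
}
```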
If there is a significant amount of data in the WAL, then building the
TSI index can be problematic without being able to set the max cache
size to something larger.
This commit adds an option to set the maximum cache size.
This commit fixes an issue with the series file compaction process
where tombstones are lost after compaction and series existence
checks are not correct. This commit also fixes some smaller flushing
issues within the series file that are mainly related to testing.