Commit Graph

34 Commits (main-2.x)

Author SHA1 Message Date
WeblWabl 06ab224516
fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578) (#25622)
* fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578)

* fix(influxd): update xxhash, avoid stringtoslicebyte in cache

This commit does 3 things:

* it updates xxhash from v1 to v2; v2 includes a assembly arm version of
  Sum64
* it changes the cache storer to write with a string key instead of a
  byte slice. The cache only reads the key which WriteMulti already has
as a string so we can avoid a host of allocations when converting back
and forth from immutable strings to mutable byte slices. This includes
updating the cache ring and ring partition to write with a string key
* it updates the xxhash for finding the cache ring partition to use
Sum64String which uses unsafe pointers to directly use a string as a
byte slice since it only reads the string. Note: this now uses an
assembly version because of the v2 xxhash update. Go 1.22 included new
compiler ability to recognize calls of Method([]byte(myString)) and not
make a copy but from looking at the call sites, I'm not sure the
compiler would recognize it as the conversion to a byte slice was
happening several calls earlier.

That's what this change set does. If we are uncomfortable with any of
these, we can do fewer of them (for example, not upgrade xxhash; and/or
not use the specialized Sum64String, etc).

For the performance issue in maz-rr, I see converting string keys to
byte slices taking between 3-5% of cpu usage on both the primary and
secondary. So while this pr doesn't address directly the increased cpu
usage on the secondary, it makes cpu usage less on both which still
feels like a win. I believe these changes are easier to review that
switching to a byte slice pool that is likely needed in other places as
the compiler provides nearly all of the correctness checks we need (we
are relying also on xxhash v2 being correct).

* helps #550

* chore: fix tests/lint

* chore: don't use assembly version; should inline

This 2 line change causes xxhash to use a purego Sum64 implementation
which allows the compiler to see that Sum64 only read the byte slice
input which them means is can skip the string to byte slice allocation
and since it can skip that, it should inline all the calls to
getPartitionStringKey and Sum64 avoiding 1 call to Sum64String which
isn't inlined.

* chore: update ci build file

the ci build doesn't use the make file!!!

* chore: revert "chore: update ci build file"

This reverts commit 94be66fde03e0bbe18004aab25c0e19051406de2.

* chore: revert "chore: don't use assembly version; should inline"

This reverts commit 67d8d06c02e17e91ba643a2991e30a49308a5283.

(cherry picked from commit 1d334c679ca025645ed93518b7832ae676499cd2)

* feat: need to update go sum

---------

Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
2024-12-05 16:57:26 -06:00
Jack 976ef20a32
fix: panic index out of range for invalid series keys [Port to main-2.x] (#24597)
* fix: cherry-pick to main-2.x
2024-01-23 17:09:10 +00:00
davidby-influx 081f95147e
fix: avoid SIGBUS when reading non-std series segment files (#24509) (#24520)
Some series files which are smaller than the standard
sizes cause SIGBUS in influx_inspect and influxd, because
entry iteration walks onto mapped memory not backed by the
the file.  Avoid walking off the end of the file while
iterating series entries in oddly sized files.

closes https://github.com/influxdata/influxdb/issues/24508

Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
(cherry picked from commit 969abf3da2)

closes https://github.com/influxdata/influxdb/issues/24511
2023-12-19 15:02:34 -08:00
Daniel Moran df448c654b
feat(tsi): optimize series iteration (#22316)
When using queries like 'select count(_seriesKey) from bigmeasurement`, we
should iterate over the tsi structures to serve the query instead of loading
all the series into memory up front.


Co-authored-by: Sam Arnold <sarnold@influxdata.com>
2021-08-27 09:59:23 -04:00
sans 7dcaf5c639
fix: typos (#19734) 2020-10-13 09:50:32 -07:00
Stuart Carnie dee8977d2c
chore: move v2/v1/tsdb → v2/tsdb 2020-08-26 10:46:47 -07:00
Jacob Marble 26ca766459
refactor(tsdb): move series file to its own package (#17224)
* refactor(storage): move type ByTagKey to the only package that uses it

* refactor(tsdb): use types in tsdb/cursors

* refactor(tsdb): remove unused type SeriesIDElems

* refactor(tsdb): inline only use of tsdb.ReadAllSeriesIDIterator

* refactor(tsdb): move series file to its own package

* refactor(storage): remove platform->influxdb aliases
2020-03-12 11:32:52 -07:00
Ben Johnson 627b6f86bb feat(storage): Series file compaction 2020-03-11 19:31:58 -06:00
Jacob Marble aa5c77409d
backport: Fix open/close race in SeriesFile (#13837) 2019-05-08 11:39:24 -07:00
Jacob Marble 8c269e0153
chore(log): Put trace_id back in logs (#13712)
* chore(log): Put trace_id back in logs

* fix tests
2019-04-30 18:51:22 -07:00
Todd Persen 138c17f22c Fix typos in tsdb package 2019-04-17 12:55:38 -07:00
Ben Johnson 307bb6af9c
Improve bulk series file writes. 2019-04-05 14:38:58 -06:00
zhulongcheng 2554f1c5dd storage: add SeriesOffsetSize constant 2019-03-12 10:51:22 +08:00
Jacob Marble 603a1f26e0 use tracing.StartSpanFromContext 2019-03-07 12:12:31 -07:00
Jacob Marble b9c7ec439e
feat(influxd): Tracing refactor (#12318)
* feat(launcher): Tracing to log disabled by default

* remove traceLogger and use opentracing directly

* add Jaeger tracing

* go vet && go fmt
2019-03-04 11:48:11 -08:00
Jeff Wendling 0fae44e219 storage: fix problems with keeping resources alive
This commit adds the pkg/lifecycle.Resource to help manage opening,
closing, and leasing out references to some resource. A resource
cannot be closed until all acquired references have been released.
If the debug_ref tag is enabled, all resource acquisitions keep
track of the stack trace that created them and have a finalizer
associated with them to print on stderr if they are leaked. It also
registers a handler on SIGUSR2 to dump all of the currently live
resources.

Having resources tracked in a uniform way with a data type allows us
to do more sophisticated tracking with the debug_ref tag, as well.
For example, we could panic the process if a resource cannot be
closed within a certain time frame, or attempt to figure out the
DAG of resource ownership dynamically.

This commit also fixes many issues around resources, correctness
during error scenarios, reporting of errors, idempotency of
close, tracking of memory for some data structures, resource leaks
in tests, and out of order dependency closes in tests.
2019-02-28 10:22:01 -07:00
Mark Rushakoff d73d73c0d4 chore: rename imports from platform to influxdb
I did this with a dumb editor macro, so some comments changed too.

Also rename root package from platform to influxdb.

In interest of minimizing risk, anyone importing the root package has
now aliased it to "platform" so that no changes beyond imports were
necessary in those files.

Lastly, replace the old platform module to local path /dev/null so that
nobody can accidentally reintroduce a platform dependency while
migrating platform code to influxdb.
2019-01-09 20:51:47 -08:00
Edd Robinson 6b63a3def7 Add option to disable sfile metrics 2018-12-10 14:36:28 +00:00
Edd Robinson bff655786f Ensure tsdb metrics properly registered 2018-12-07 14:32:34 +00:00
Edd Robinson e0c10227d0 Fix metric issue in series file 2018-12-07 14:32:34 +00:00
Edd Robinson 7960ccc320 Add TSI index metrics 2018-12-07 14:32:34 +00:00
Edd Robinson 55caa0fe54 Add RHH metrics 2018-12-07 14:32:34 +00:00
Edd Robinson d1fe2bc188 Add series file metrics 2018-12-07 14:32:34 +00:00
Mark Rushakoff 985c260af7 chore(storage,tsdb): fix megacheck errors 2018-11-01 12:54:46 -07:00
Ben Johnson d856116b00
Add tsi1 measurement cardinality stats. 2018-10-17 08:38:41 -06:00
Jeff Wendling 810833f33f chore: refactor reads service and make it consumable externally
This pulls in the code that allows doing reads with flux into the
platform repo, and removes extra.go.

The reusable portion is under storage/reads, where the concrete
implementation for one of the platform's engines is in
storage/readservice.

In order to make this more reusable, the cursors had to move into
their own package, decoupling it from all of the other code in the
tsdb package. tsdb/cursors is this new package, and type/function
aliases have been added to the tsdb package to point at it.

The models package already is very light on transitive dependencies
and so it was allowed to be depended on in a concrete way in the
cursors package.

Finally, the protobuf definitions for issuing GRPC reads has been
moved into its own package for two reasons:
    1. It's a clean separation, and helps keep it that way.
    2. Many/most consumers will not be using GRPC. We just
       use the datatypes to express the API which helps making
       a GRPC server easier.
It is left up to future refactorings (specifically ones that involve
GPRC) to determine if these types should remain, or if there is a
cleaner way.

There's still some dependencies on both github.com/influxdata/influxql
and github.com/influxdata/influxdb/logger that we can hopefully remove
in future refactorings.
2018-10-09 09:51:13 -06:00
Edd Robinson 981b2cdbea Skeleton storage engine 2018-10-04 10:24:43 +01:00
Edd Robinson 3385f389f7 Update tsdb package from OSS 2018-10-01 12:08:37 +01:00
Edd Robinson e4cca868f4 Get TSI tests passing 2018-10-01 12:08:37 +01:00
Edd Robinson fb0db04bc1 Initial import pkg package 2018-10-01 12:03:20 +01:00
Edd Robinson 04818c7859 Initial import of models package 2018-10-01 12:03:19 +01:00
Jeff Wendling d44b583c4d remove code as reported by the unused tool 2018-10-01 12:03:19 +01:00
Jeff Wendling b0a317a34c remove and document some things 2018-10-01 12:03:19 +01:00
Jeff Wendling 992884ab6c initial import of tsdb package 2018-10-01 12:03:19 +01:00