Commit Graph

14918 Commits (db/6263/compaction-debug-logging)

Author SHA1 Message Date
davidby-influx af00cb7bbd
Merge pull request #20063 from influxdata/DSB_SnapshotInProgress_master-1.x
fix(tsm1): "snapshot in progress" error during backup: restore loop with backoff
2020-11-17 10:06:46 -08:00
davidby-influx 0faac1a478 chore(tsm1): fix formatting
Failed to format code before commit.
2020-11-16 21:25:26 -08:00
davidby-influx b3724581bc fix(tsm1): "snapshot in progress" error during backup
Loop with backoff in (*Engine).CreateSnapshot() to retry
(*Engine).WriteSnapshot() up to 3 times if
ErrSnapshotInProgress is returned.  Then continue
on no error or on SnapshotInProgress if skipCacheOk is
true.

https://github.com/influxdata/plutonium/issues/3227
(cherry picked from commit dfa6aa8cea)
2020-11-16 21:23:00 -08:00
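
The retry behaviour described in commit b3724581bc can be sketched roughly as follows. This is a minimal illustration, not the actual tsm1 code: errSnapshotInProgress, writeSnapshotWithRetry, and the doubling delay are hypothetical stand-ins, and the real backoff schedule is not stated in the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errSnapshotInProgress is a hypothetical stand-in for the engine's sentinel error.
var errSnapshotInProgress = errors.New("snapshot in progress")

// writeSnapshotWithRetry calls write up to attempts times, sleeping with a
// doubling delay (an assumed schedule) while the snapshot-in-progress error
// persists. It returns nil on success, and also when the last error is
// snapshot-in-progress and skipCacheOk is true.
func writeSnapshotWithRetry(write func() error, attempts int, delay time.Duration, skipCacheOk bool) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = write()
		if !errors.Is(err, errSnapshotInProgress) {
			break // success or an unrelated error: stop retrying
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	if errors.Is(err, errSnapshotInProgress) && skipCacheOk {
		return nil // the caller accepts a backup without a fresh cache snapshot
	}
	return err
}

func main() {
	calls := 0
	busy := func() error { calls++; return errSnapshotInProgress }
	fmt.Println(writeSnapshotWithRetry(busy, 3, time.Millisecond, true), calls)
	fmt.Println(writeSnapshotWithRetry(busy, 3, time.Millisecond, false))
}
```

With skipCacheOk set to false the final snapshot-in-progress error is returned to the caller, which matches the Export() behaviour described in the related commits.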
davidby-influx cc1e70baf4
Merge pull request #19869 from influxdata/DSB_SnapshotInProgress_3227
fix(tsm1): "snapshot in progress" error during backup
2020-11-12 15:42:22 -08:00
davidby-influx 0dcff81f56 fix(tsm1): "snapshot in progress" error during backup
Test the skipCacheOk flag to tsdb.Shard.CreateSnapshot() and
tsdb.Engine.CreateSnapshot()
A value of true allows the backup to proceed even if a cache
snapshot cannot be taken.

https://github.com/influxdata/plutonium/issues/3227
2020-11-05 16:50:51 -08:00
davidby-influx 6ec446f422 fix(tsm1): "snapshot in progress" error during backup
This fix adds a skipCacheOk flag to
tsdb.Store.CreateShardSnapshot() and tsdb.Shard.CreateSnapshot()
to pass to tsdb.Engine.CreateSnapshot()
A value of true allows the backup to proceed even if a cache snapshot
cannot be taken.
This flag is set to true in tsm1.Engine.Backup(), the OSS backup code path
This flag is set to false in tsm1.Engine.Export()

https://github.com/influxdata/plutonium/issues/3227
2020-11-05 11:08:08 -08:00
Ayan George 225bcecd73
fix: Upgrade version of jwt-go package to v4.0.0 (#19893)
* fix: Upgrade version of jwt-go package to v4.0.0

This commit updates the dependencies for influxdb to require v4.0.0-preview1 of
the jwt-go package.  This required updating the go.mod and go.sum files as well
as any source file that directly imported that package.

Prior to this commit, the TestHandler_Query_Auth() tests would fail as they
checked for specific error strings returned by the jwt-go package.

Version 4.0.0-preview1 of the package changed the verbiage of those errors a
bit.  This patch updates the test to detect the new error string.
2020-11-05 10:55:24 -05:00
davidby-influx 23be20bf1b fix(tsm1): "snapshot in progress" error during backup
When an InfluxDB database is very busy writing new points, the backup
process can fail because it cannot write a new snapshot.
The error is: operation timed out with error: create snapshot: snapshot in progress.
This happens because the high number of ingested points causes InfluxDB
to snapshot the cache almost continuously.
The fix for this was https://github.com/influxdata/influxdb/pull/16627
but it was for OSS only, and was not in the code path for backups
in clusters.
This fix adds a skipCacheOk flag to tsdb.Engine.CreateSnapshot().
A value of true allows the backup to proceed even if a cache snapshot
cannot be taken.
This flag is set to true in tsm1.Engine.Backup(), the OSS backup code path
and in tsdb.Shard.CreateSnapshot(), the cluster backup code path.
This flag is set to false in tsm1.Engine.Export()

https://github.com/influxdata/plutonium/issues/3227
2020-10-30 10:37:36 -07:00
Ayan George f7eb697dd3
refactor: Use filepath.Walk (#19514)
Prior to this commit, we had our own recursive file walker which
required a condition based on if s.Config.TypesDB pointed to a directory
or a regular file.

This commit replaces our own readdir() with filepath.Walk() and treats
recursing directories and loading one file as a single case.  This
simplifies the code quite a bit.
2020-10-21 10:29:48 -04:00
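
The point of commit f7eb697dd3 is that filepath.Walk visits the root path itself, so a root that is a single file and a root that is a directory need no separate branch. A minimal sketch of that idea follows; collectFiles and demo are hypothetical names, and the real code loads types.db files rather than merely listing them.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// collectFiles returns every regular file found at or under root.
// filepath.Walk calls the function for root first, so a single regular
// file and a directory tree go through the same code path.
func collectFiles(root string) ([]string, error) {
	var files []string
	err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if info.Mode().IsRegular() {
			files = append(files, path)
		}
		return nil
	})
	return files, err
}

// demo builds a temporary tree with two files and reports how many files
// are found when walking the directory and when walking one file directly.
func demo() (int, int, error) {
	dir, err := os.MkdirTemp("", "typesdb")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	sub := filepath.Join(dir, "sub")
	if err := os.Mkdir(sub, 0o755); err != nil {
		return 0, 0, err
	}
	first := filepath.Join(dir, "a.db")
	for _, p := range []string{first, filepath.Join(sub, "b.db")} {
		if err := os.WriteFile(p, []byte("x"), 0o644); err != nil {
			return 0, 0, err
		}
	}

	all, err := collectFiles(dir)
	if err != nil {
		return 0, 0, err
	}
	one, err := collectFiles(first)
	if err != nil {
		return 0, 0, err
	}
	return len(all), len(one), nil
}

func main() {
	fmt.Println(demo())
}
```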
Ayan George b1def70670
feat: generate modern profiles (#19655)
* feat: generate modern profiles

Prior to this commit, influxd was writing legacy profiling data which
often (always?) required an accompanying executable to use.

This commit instructs influxd to write profiles in the new format which
can be examined without a binary.

While we're at it, this commit also adds the allocs and threadcreate
profiles.

Finally, this patch also changes the format of the downloaded tar in the
following ways:

* The profiles are added to the profile/ directory -- so instead of
  extracting the profiles into your current directory, they're placed in
  a "profiles" directory.

* This commit adds the .pb.gz extension to each of the files since
  they're gzipped protobuf files and not .txt.
2020-10-21 09:26:15 -04:00
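
As a rough illustration of commit b1def70670: in Go's runtime/pprof package, calling Profile.WriteTo with debug set to 0 produces the gzip-compressed protobuf format. The sketch below writes two profiles into a tar stream; writeProfiles and archiveNames are hypothetical helpers, and the "profiles/" prefix follows the directory name quoted in the commit message rather than any confirmed influxd code.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"runtime/pprof"
)

// writeProfiles writes each named runtime profile into a tar stream as
// profiles/<name>.pb.gz. debug=0 selects the gzip-compressed protobuf format.
func writeProfiles(w io.Writer, names []string) error {
	tw := tar.NewWriter(w)
	for _, name := range names {
		p := pprof.Lookup(name)
		if p == nil {
			return fmt.Errorf("unknown profile %q", name)
		}
		var buf bytes.Buffer
		if err := p.WriteTo(&buf, 0); err != nil {
			return err
		}
		hdr := &tar.Header{
			Name: "profiles/" + name + ".pb.gz",
			Mode: 0o644,
			Size: int64(buf.Len()),
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if _, err := tw.Write(buf.Bytes()); err != nil {
			return err
		}
	}
	return tw.Close()
}

// archiveNames builds the archive in memory and lists the entry names.
func archiveNames(names []string) ([]string, error) {
	var archive bytes.Buffer
	if err := writeProfiles(&archive, names); err != nil {
		return nil, err
	}
	var out []string
	tr := tar.NewReader(&archive)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, hdr.Name)
	}
}

func main() {
	fmt.Println(archiveNames([]string{"allocs", "threadcreate"}))
}
```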
David Norton 8e57d701bd
Merge pull request #19691 from influxdata/dn-disable-compaction-per-shard
feat: allow disable compaction per shard
2020-10-15 09:49:13 -04:00
David Norton 3d92eef720 feat: allow disable compaction per shard
This feature allows compaction to be disabled on a per-shard basis by
creating a file named do_not_compact in a shard's directory. When
disabled, a message is logged every 15 minutes with the reason for
compaction being disabled (existence of the file). This makes it easy to
know if compaction has been disabled for any shards by searching the log
for "compaction disabled" or running "find path/to/data -type f -name
do_not_compact".
2020-10-06 10:58:07 -04:00
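
The marker-file check described in commit 3d92eef720 might look something like the sketch below. Only the file name do_not_compact comes from the commit message; compactionDisabled and demo are hypothetical, and the periodic logging is omitted.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// compactionDisabled reports whether a file named do_not_compact exists
// in the given shard directory.
func compactionDisabled(shardDir string) (bool, error) {
	_, err := os.Stat(filepath.Join(shardDir, "do_not_compact"))
	if err == nil {
		return true, nil
	}
	if os.IsNotExist(err) {
		return false, nil
	}
	return false, err // some other failure, such as a permission error
}

// demo checks a temporary shard directory before and after the marker
// file is created.
func demo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "shard")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	before, err := compactionDisabled(dir)
	if err != nil {
		return false, false, err
	}
	if err := os.WriteFile(filepath.Join(dir, "do_not_compact"), nil, 0o644); err != nil {
		return false, false, err
	}
	after, err := compactionDisabled(dir)
	return before, after, err
}

func main() {
	fmt.Println(demo())
}
```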
Pavel Závora b8ca6f9298
Merge pull request #19631 from influxdata/fix/CORS_allows_patch_v1
fix(CORS): allow PATCH
2020-09-24 13:36:40 +02:00
Pavel Zavora e8f7b78d68 fix(CORS): allow PATCH 2020-09-24 11:55:22 +02:00
David Norton fb98ce63ec
Merge pull request #19420 from influxdata/fix-unlocked-map-access
fix: lock map before writes
2020-09-22 11:28:46 -04:00
Ayan George 431f073b9e
feat: Add -lponly flag to export sub-command (#19609)
When applied, this patch will add the -lponly flag to the export command
which instructs influx_inspect to only output line protocol without
comments and other out-of-band data.
2020-09-22 10:09:09 -04:00
Ayan George 42873d4424
chore: Quiet static analysis tools (#19509)
* Remove redundant type in slice/array declarations.
* Call t.Fatal() from test-functions, not non-test go-routines.
* Remove unnecessary empty value operator from ranges.
* Call defer .Close() methods only after checking for error on Open().
2020-09-05 12:43:29 -04:00
Ayan George 4ef4fe9aef fix(tsi1): Acquire a lock when modifying measurement map
This patch protects an internal map for concurrent use.

(*LogFile).Writes() method calls
(*LogFile).createMeasurementIfNotExists() which writes to a shared map.

(*LogFile).Writes() acquires a read-lock which leaves
createMeasurementIfNotExists() open to concurrent writes to its shared
map.

This commit adds the ExecEntries method to the *LogFile type so that we
can properly lock calls to (*LogFile).appendEntry() using defer.

(*LogFile).ExecEntries() is used to mostly replace the body of
(*LogFile).Writes() and incurs another function call since ExecEntries()
can't be inlined.  Below is the output of build with "-m -m -m" gcflags:

  ./log_file.go:1076:6: cannot inline (*LogFile).ExecEntries: unhandled op DEFER

The performance impact of the additional function call should be
negligible and is outweighed by the safety and simplicity of using
defer.
2020-08-31 12:52:54 -04:00
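
The locking pattern from commit 4ef4fe9aef, reduced to a toy example: the method that may write to the shared map takes the write lock and releases it with defer. The logFile type here is a simplified, hypothetical stand-in for tsi1's LogFile and keeps only a counter per measurement.

```go
package main

import (
	"fmt"
	"sync"
)

// logFile is a toy stand-in with a shared measurement map.
type logFile struct {
	mu           sync.RWMutex
	measurements map[string]int
}

// createMeasurementIfNotExists writes to the shared map, so callers must
// hold the write lock, not just a read lock.
func (f *logFile) createMeasurementIfNotExists(name string) {
	if _, ok := f.measurements[name]; !ok {
		f.measurements[name] = 0
	}
}

// execEntries takes the write lock for the whole batch and relies on
// defer to release it on every return path.
func (f *logFile) execEntries(names []string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, name := range names {
		f.createMeasurementIfNotExists(name)
		f.measurements[name]++
	}
}

// count reads one counter under the read lock.
func (f *logFile) count(name string) int {
	f.mu.RLock()
	defer f.mu.RUnlock()
	return f.measurements[name]
}

// run starts the given number of concurrent writers and returns the
// final count for one measurement.
func run(workers int) int {
	f := &logFile{measurements: make(map[string]int)}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			f.execEntries([]string{"cpu", "mem"})
		}()
	}
	wg.Wait()
	return f.count("cpu")
}

func main() {
	fmt.Println(run(50))
}
```

Running a program like this with the race detector enabled (go run -race) is one way to confirm that the map is no longer written concurrently.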
Ayan George 1ffe13894d
chore: Use latest version of influxql package (#19460)
This commit updates our influxql dependency to hash 65d3ef77.
2020-08-28 11:31:50 -04:00
Ayan George 6297ede3d9
fix(tsdb): return error on nonexistent shard id (#17060)
Have Store.DeleteShard() return a useful error if it cannot find the
requested shard.

Fixes #17059
2020-08-24 14:34:44 +00:00
Ayan George 3436db4ebb
refactor: Use binary.Read() instead of io.ReadFull() (#19323)
The original version of verifyVersion() reads into a byte slice,
manually ensures its byte order, then converts it to a type comparable
with Version and MagicNumber.

This patch hides those details by calling binary.Read() and reading
values into properly typed variables.

This adds a bit of overhead but this code isn't in the hot-path and this
patch greatly simplifies the code.

verifyVersion() originally accepted an io.ReadSeeker.  It is only called
in one place and that function immediately calls Seek() after
verifyVersion(), therefore it is probably safe to call Seek() BEFORE
verifyVersion().

The benefit is that verifyVersion() is easier to test since we can pass
it a bytes.Buffer.

This patch adds a test for verifyVersion() as well as a benchmark.

benchmark                    old ns/op     new ns/op     delta
BenchmarkVerifyVersion-8     73.5          123           +67.35%

Finally, this commit moves verifyVersion() from writer.go to reader.go
which is where it is actually used.
2020-08-13 14:54:18 -04:00
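
A sketch of the approach in commit 3436db4ebb: binary.Read fills typed variables directly, and accepting an io.Reader lets a test pass a bytes.Buffer. The constant values, the big-endian byte order, and the header layout (a uint32 magic number followed by a one-byte version) are assumptions made for illustration, as is the header helper.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// Assumed header values for illustration only.
const (
	magicNumber uint32 = 0x16D116D1
	version     byte   = 1
)

// verifyVersion reads the magic number and version into typed variables
// and compares them with the expected constants.
func verifyVersion(r io.Reader) error {
	var magic uint32
	if err := binary.Read(r, binary.BigEndian, &magic); err != nil {
		return fmt.Errorf("read magic number: %w", err)
	}
	if magic != magicNumber {
		return errors.New("bad magic number")
	}
	var v byte
	if err := binary.Read(r, binary.BigEndian, &v); err != nil {
		return fmt.Errorf("read version: %w", err)
	}
	if v != version {
		return fmt.Errorf("unsupported version %d", v)
	}
	return nil
}

// header builds an in-memory header so verifyVersion can be exercised
// without a file on disk.
func header(magic uint32, v byte) *bytes.Buffer {
	var buf bytes.Buffer
	binary.Write(&buf, binary.BigEndian, magic) // writes to a bytes.Buffer do not fail
	binary.Write(&buf, binary.BigEndian, v)
	return &buf
}

func main() {
	fmt.Println(verifyVersion(header(magicNumber, version)))
	fmt.Println(verifyVersion(header(0xDEADBEEF, version)))
}
```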
Ayan George 6ce0e11738
feat: Collect values written stats (#19187)
* feat(engine/tsm1): Add WritePointsWithContext()

Add WritePointsWithContext() and make WritePoints() a thin wrapper for
it.

The purpose is to add statistics context values that we'll use to
propagate the number of fields and points written to calls up the call
chain.

* feat(tsdb): Add WriteToShardWithContext()

When applied, this patch adds WriteToShardWithContext() and wraps it
with WriteToShard() to preserve the API.

The purpose of this addition is to propagate a context.Context value
to Shard.WritePointsWithContext().

* feat(tsdb/shard): Add WritePointsWithContext()

The purpose of adding WritePointsWithContext() is to propagate context
values down to engine code and propagate statistics via the context.Value
up to callers.

This patch also adds values written statistics to the shard.

* feat(http): Gather values written stats

WritePointsWithContext() was added to propagate context values down to
the engine and communicate stats to the caller.

* feat(http): Gather values written stats

WritePointsWithContext() was added to propagate context values down to
the engine and communicate stats to the caller.

* refactor: Change MetricKey to ContextKey

This patch gives the type we're using for context keys a better name.
2020-08-12 11:26:12 -04:00
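
One plausible way to read commit 6ce0e11738 is that the caller places a counter in the context and the write path updates it, while the plain method stays a thin wrapper. The sketch below assumes exactly that mechanism; contextKey, statValuesWritten, and the pointer-in-context design are illustrative guesses, not the confirmed influxdb implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

// contextKey is a private key type so keys cannot collide with other packages.
type contextKey int

// statValuesWritten is a hypothetical key for a *int64 counter.
const statValuesWritten contextKey = iota

// writePointsWithContext pretends to write the points and, if the caller
// supplied a counter in ctx, adds the number of field values written.
func writePointsWithContext(ctx context.Context, points [][]float64) error {
	var n int64
	for _, fields := range points {
		n += int64(len(fields))
	}
	if counter, ok := ctx.Value(statValuesWritten).(*int64); ok {
		atomic.AddInt64(counter, n)
	}
	return nil
}

// writePoints preserves the original API as a thin wrapper.
func writePoints(points [][]float64) error {
	return writePointsWithContext(context.Background(), points)
}

// valuesWritten shows the caller side: it installs a counter in the
// context and reads it back after the write returns.
func valuesWritten(points [][]float64) (int64, error) {
	var written int64
	ctx := context.WithValue(context.Background(), statValuesWritten, &written)
	err := writePointsWithContext(ctx, points)
	return atomic.LoadInt64(&written), err
}

func main() {
	pts := [][]float64{{1, 2}, {3}, {4, 5, 6}}
	n, err := valuesWritten(pts)
	fmt.Println(n, err, writePoints(pts))
}
```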
David Norton 8eade84355
Merge pull request #19252 from influxdata/dn-revert-disable-series-id-set-cache-size
fix(tsdb): revert disable series id set cache size by default
2020-08-07 14:44:17 -04:00
David Norton 94a4a3474d fix(tsdb): revert disable series id set cache size by default
This reverts commit 9c41e12ee4.
2020-08-07 14:06:03 -04:00
David Norton 619f0ab78e
Merge pull request #18667 from influxdata/new-http-headers
feat(service/httpd): Add user configurable HTTP headers
2020-07-08 13:59:59 -04:00
Tristan Su 6910c53440
feat(prometheus): update prometheus remote protocol (#17814)
Fetched up-to-date protocol from prometheus project
2020-07-08 07:12:52 -07:00
Jacob Marble 3f3b7b5160
chore: update some dependencies (#18786)
Helps #18528

This change bumps a couple of dependencies to prepare for something like #17814 which
updates many dependencies at once. Turns out that change is based on an
old commit, so several things have already been updated.

After this, we should do a separate commit to update prometheus per #18528
2020-07-06 14:34:55 -07:00
Ayan George 04536858d7
Merge branch 'master-1.x' into new-http-headers 2020-06-30 11:03:04 -04:00
Ayan George dde8231d5c feat(http): Allow user supplied HTTP headers
This patch adds the [http.headers] subsection to the configuration file
that allows users to supply headers that will be returned in all HTTP
responses.

Applying this patch will:

* Add code to implement new configuration items.
* Add test to ensure configuration is properly parsed.
* Add test to ensure http response headers are set
* Update sample configuration file
2020-06-30 10:59:25 -04:00
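
The effect of commit dde8231d5c, returning user-supplied headers in every HTTP response, can be approximated with a small middleware. This is a sketch only: withHeaders, responseHeader, and ping are hypothetical names, and the parsing of the [http.headers] configuration section is not shown.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withHeaders wraps next so that every response carries the configured headers.
func withHeaders(headers map[string]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for name, value := range headers {
			w.Header().Set(name, value)
		}
		next.ServeHTTP(w, r)
	})
}

// ping is a trivial handler used for the demonstration.
var ping = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusNoContent)
})

// responseHeader sends one in-memory request through h and returns the
// value of the named response header.
func responseHeader(h http.Handler, name string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/ping", nil))
	return rec.Result().Header.Get(name)
}

func main() {
	h := withHeaders(map[string]string{"X-Example": "demo"}, ping)
	fmt.Println(responseHeader(h, "X-Example"))
}
```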
Pavel Závora d46c6a89e3
Merge pull request #18410 from influxdata/18391/cors_v1
fix(handler): allow CORS in v2 compatibility endpoints
2020-06-26 11:05:40 +02:00
Pavel Závora 2ff73114a8
Merge branch 'master-1.x' into 18391/cors_v1 2020-06-26 06:29:30 +02:00
Ben Johnson 4a1a8c0041
Merge pull request #18689 from influxdata/batch-write-tombstones-when-deleting
perf(tsi1): batch write tombstone entries when dropping/deleting
2020-06-25 08:15:12 -06:00
Ben Johnson be0edf5d75
Merge pull request #18695 from influxdata/epoch-wait-when-dropping-shard
fix(tsi1): wait deleting epoch before dropping shard
2020-06-25 08:14:53 -06:00
Ayan George a9d02e7ab7
fix: Handle snapshot related errors (#18710)
When applied this patch will:

* log snapshot directory removal errors

  Prior to this patch, errors when removing temporary snapshot
  directories happened silently.

  This patch ensures that errors are logged when os.RemoveAll() fails.

* refactor tsm1: Declare error value in condition

  Saves a line of code and limits the scope of an error value.

* refactor tsm1: Add MakeSnapshotLinks()

  This commit adds (*FileStore).MakeSnapshotLinks().  The code in this
  function was originally part of CreateSnapshot().

  That code was hoisted out and into MakeSnapshotLinks() because there
  are two points of failure that require cleanup -- we have to delete a
  temporary directory on failure.

  Placing the code in one function allows us to check its returned error
  value and perform cleanup in only one place.

  In short, we hoisted code out of CreateSnapshot() to simplify error
  handling.

  On error, we remove any directories we created.
2020-06-25 10:05:04 -04:00
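
The error-handling shape described in commit a9d02e7ab7 can be sketched as follows: the fallible steps live in one helper, the caller checks its single returned error, cleans up in one place, and logs any failure from os.RemoveAll. The function names and signatures are simplified assumptions, not the real (*FileStore) methods.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// makeSnapshotLinks creates destDir and hard-links each file into it.
// Both steps can fail, and either failure leaves destDir needing cleanup.
func makeSnapshotLinks(destDir string, files []string) error {
	if err := os.MkdirAll(destDir, 0o777); err != nil {
		return err
	}
	for _, f := range files {
		if err := os.Link(f, filepath.Join(destDir, filepath.Base(f))); err != nil {
			return err
		}
	}
	return nil
}

// createSnapshot performs cleanup in exactly one place and logs, rather
// than silently ignores, a failure to remove the temporary directory.
func createSnapshot(tmpDir string, files []string) error {
	if err := makeSnapshotLinks(tmpDir, files); err != nil {
		if rmErr := os.RemoveAll(tmpDir); rmErr != nil {
			log.Printf("failed to remove snapshot directory %s: %v", tmpDir, rmErr)
		}
		return err
	}
	return nil
}

// demo reports whether a good snapshot was linked and whether a failed
// snapshot had its temporary directory removed.
func demo() (bool, bool, error) {
	base, err := os.MkdirTemp("", "snap")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(base)

	src := filepath.Join(base, "000001.tsm")
	if err := os.WriteFile(src, []byte("data"), 0o644); err != nil {
		return false, false, err
	}

	good := filepath.Join(base, "good.tmp")
	if err := createSnapshot(good, []string{src}); err != nil {
		return false, false, err
	}
	_, statErr := os.Stat(filepath.Join(good, "000001.tsm"))
	linked := statErr == nil

	bad := filepath.Join(base, "bad.tmp")
	failErr := createSnapshot(bad, []string{src, filepath.Join(base, "missing.tsm")})
	_, statErr = os.Stat(bad)
	return linked, failErr != nil && os.IsNotExist(statErr), nil
}

func main() {
	fmt.Println(demo())
}
```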
dengzhi.ldz 42dba6487a fix(tsi1): wait deleting epoch before dropping shard 2020-06-24 09:37:13 -06:00
dengzhi.ldz 331569bc11 perf(tsi1): batch write tombstone entries when dropping/deleting 2020-06-24 09:26:09 -06:00
Tristan Su 57ea78e984
fix(httpd): add option to authenticate prometheus remote read (#18429) 2020-06-23 15:03:19 -07:00
David Norton 78a05d1119
Merge pull request #17596 from foobar/optimize-sorted-merge-iterator
improvement(query): performance improvement for sorted merge iterator
2020-06-23 12:50:10 -04:00
Tristan Su 1e7a2e234a
fix(test): use go 1.13 in test scripts (#18529)
With https://github.com/influxdata/influxdb/pull/17530, go 1.13 is
required.
2020-06-22 15:29:17 -07:00
Tristan Su b31bf6a861
chore: fix missing eol (#18379)
It seems these files were created on a non-Unix platform, so the EOL is missing.
This is not an issue, but for consistency with other files, it's better
to add the EOL.
2020-06-22 15:25:12 -07:00
Tristan Su f24f644510
chore: fix code format (#18013) 2020-06-22 15:23:36 -07:00
Tristan Su 17d192e062
chore(dumptsm): clean up dead code (#17381) 2020-06-22 15:02:53 -07:00
Tristan Su 7be913de6e
chore(tsdb): clean up unused ShardID in EngineOptions (#17243) 2020-06-22 15:01:32 -07:00
Ben Johnson 7f08d1f99f
Merge pull request #18456 from influxdata/bj-parallelize-field-iterator-planning
feat(query): Parallelize field iterator planning
2020-06-11 12:54:31 -06:00
Ben Johnson 5263070632 feat(query): Parallelize field iterator planning 2020-06-11 08:01:14 -06:00
Pavel Zavora fe150dc768 chore: update changelog 2020-06-10 19:50:30 +02:00
Pavel Zavora 0c6940d0a3 fix(handler): allow CORS in v2 compatibility endpoints 2020-06-09 09:09:30 +02:00
Pavel Zavora 0e458e7175 fix(handler): add User-Agent to allowed CORS headers 2020-06-09 07:09:27 +02:00
Jakub Bednář 61606289db chore(handler): add 2.0 compatible health endpoint from v1.8 (#17252) 2020-06-09 06:53:32 +02:00
Ben Johnson ab00f36a32
Merge pull request #18203 from influxdata/fix-series-id-set-iterator-merge-retention
fix(tsdb): Defer closing of underlying SeriesIDSetIterators
2020-05-26 08:37:01 -06:00