Commit Graph

110 Commits (7e0ed1f5e51f92edc97983895898b3a05bb85c1e)

Author SHA1 Message Date
Jason Wilder 675d7c9d65 Merge branch '1.2' into jw-merge12 2017-03-06 11:09:05 -07:00
Ben Johnson dffd12319c Add point.UnmarshalBinary() bounds checking. 2017-03-01 12:01:25 -07:00
Jason Wilder a024003f2c Merge branch '1.2' into jw-merge-12 2017-02-22 12:13:29 -07:00
Ben Johnson 78a9bb2527 Remove Tags.shouldCopy, replace with forceCopy on series creation.
Previously, tags had a `shouldCopy` flag to indicate if those tags
referenced an underlying buffer and should be copied to allow GC.
Unfortunately, this prevented tags from being copied that were
created and referenced the mmap which caused segfaults.

This change removes the `shouldCopy` flag and replaces it with a
`forceCopy` argument in `CreateSeriesIfNotExists()`. This allows
the write path to indicate that tags must be cloned on insert.
2017-02-21 11:13:35 -07:00
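A hedged sketch of the pattern this commit describes, assuming simplified stand-ins for `models.Tags` and the series index; only `CreateSeriesIfNotExists` and the `forceCopy` argument come from the message above:

```go
package main

import "fmt"

// Tag and Tags loosely mirror the shape of models.Tag / models.Tags.
type Tag struct{ Key, Value []byte }
type Tags []Tag

// Clone copies every key and value so the result no longer references the
// caller's underlying buffer (an HTTP body, or an mmap-ed TSM file).
func (a Tags) Clone() Tags {
	if len(a) == 0 {
		return nil
	}
	out := make(Tags, len(a))
	for i, t := range a {
		out[i] = Tag{
			Key:   append([]byte(nil), t.Key...),
			Value: append([]byte(nil), t.Value...),
		}
	}
	return out
}

// Index stands in for the in-memory series index.
type Index struct {
	series map[string]Tags
}

// CreateSeriesIfNotExists retains tags only for a new series. The write path
// passes forceCopy=true so tags parsed from a request body are cloned before
// being retained; callers whose tags already own their memory pass false.
func (idx *Index) CreateSeriesIfNotExists(key string, tags Tags, forceCopy bool) {
	if _, ok := idx.series[key]; ok {
		return // existing series: nothing new is retained
	}
	if forceCopy {
		tags = tags.Clone()
	}
	idx.series[key] = tags
}

func main() {
	idx := &Index{series: map[string]Tags{}}
	idx.CreateSeriesIfNotExists("cpu,host=a",
		Tags{{Key: []byte("host"), Value: []byte("a")}}, true)
	fmt.Println(len(idx.series)) // 1
}
```

Cloning only on insert keeps the hot existing-series path allocation-free while still detaching newly retained tags from their source buffer.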
Edd Robinson fb7388cdfc Remove dead code from various pkgs 2017-01-17 09:47:34 -08:00
Joe LeGasse cd00085e9e Adjust Tags cloning
This change delays Tag cloning until a new series is found, and will
only clone Tags acquired from `ParsePoints...` and not those referencing
the mmap-ed files (TSM) that are created on startup.
2017-01-13 13:15:36 -05:00
Mark Rushakoff cdbdd156f3 Fix memory leak of retained HTTP write payloads
This leak seems to have been introduced in 8aa224b22d,
present in 1.1.0 and 1.1.1.

When points were parsed from HTTP payloads, their tags and fields
referred to subslices of the request body; if any tag set introduced a
new series, then those tags then were stored in the in-memory series
index objects, preventing the HTTP body from being garbage collected. If
there were no new series in the payload, then the request body would be
garbage collected as usual.

Now, we clone the tags before we store them in the index. This is an
imperfect fix because the Point still holds references to the original
tags, and the Point's field iterator also refers to the payload buffer.
However, the current write code path does not retain references to the
Point or its fields; and this change will likely be obsoleted when TSI
is introduced.

This change likely fixes #7827, #7810, #7778, and perhaps others.
2017-01-12 16:16:54 -08:00
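The retention mechanism behind this leak is easy to demonstrate in isolation: a subslice keeps its whole backing array reachable. A self-contained illustration (not InfluxDB code; names are made up):

```go
package main

import "fmt"

var index = map[string][]byte{}

// storeTag keeps a reference to tag. If tag is a subslice of a large request
// body, the whole body stays reachable and cannot be garbage collected.
func storeTag(key string, tag []byte) {
	index[key] = tag
}

// storeTagCloned copies the bytes first, so only the small copy is retained
// and the original buffer can be collected once the request finishes.
func storeTagCloned(key string, tag []byte) {
	index[key] = append([]byte(nil), tag...)
}

func main() {
	body := make([]byte, 1<<20) // pretend this is a 1 MiB HTTP write payload
	copy(body, "host=server01 ...")

	storeTag("leaky", body[:13])       // retains the full 1 MiB backing array
	storeTagCloned("clean", body[:13]) // retains only the 13-byte copy
	fmt.Println(len(index["leaky"]), len(index["clean"]))
	// The capacities show what is actually being held onto (≈1 MiB vs a few bytes).
	fmt.Println(cap(index["leaky"]), cap(index["clean"]))
}
```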
Mark Rushakoff a135906b43 Merge pull request #7747 from influxdata/mr-lint-cleanup
Miscellaneous lint cleanup
2017-01-10 08:22:00 -08:00
Mark Rushakoff 6a94d200c8 Merge remote-tracking branch 'influx/master' into mr-godoc 2017-01-04 13:27:36 -08:00
Mark Rushakoff 4aedd29b02 Merge pull request #7788 from influxdata/mr-point-data-methods
Remove unused methods from Point: Data, SetData
2017-01-04 13:15:38 -08:00
Mark Rushakoff 3e473dd262 Remove unused methods from Point: Data, SetData
These are not called anywhere in the TICK stack that I can see.
2017-01-03 16:11:40 -08:00
Cory LaNou 3c518f8927 panicking is bad -> error returns are good 2017-01-03 14:28:29 -06:00
Mark Rushakoff 07b87f2630 Miscellaneous lint cleanup 2017-01-03 09:47:32 -08:00
Mark Rushakoff 7b5b3189dd Update godoc for package models 2016-12-30 18:02:52 -08:00
Mark Rushakoff b896c56b5a Add test for marshalling a point without fields 2016-12-28 12:09:30 -08:00
Mark Rushakoff f6a3ffecaf Require fields when marshalling Point
I haven't been able to reproduce creating a point without any fields,
but we've seen points in the wild that have been marshalled with no
fields - that is, the length header for fields is uint32(0) and a
well-formed encoded time follows.

Attempting to unmarshal points via NewPointFromBytes returns
ErrPointMustHaveAField, so it seems better to fail earlier with the same
error, rather than allowing those points to be serialized in the first
place.
2016-12-23 19:43:15 -08:00
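A minimal sketch of the early check described here; `ErrPointMustHaveAField` is the error named above, while the `point` struct and its encoding are simplified placeholders:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPointMustHaveAField mirrors the error named in the commit message;
// the exact wording here is illustrative.
var ErrPointMustHaveAField = errors.New("point without fields is unsupported")

type point struct {
	key    []byte
	fields []byte // encoded field section; empty means no fields
	ts     []byte
}

// MarshalBinary refuses to serialize a point with no fields, so the same
// error surfaces at write time instead of later in NewPointFromBytes.
func (p *point) MarshalBinary() ([]byte, error) {
	if len(p.fields) == 0 {
		return nil, ErrPointMustHaveAField
	}
	// ... a length-prefixed encoding of key, fields, and timestamp would follow.
	return append(append(append([]byte(nil), p.key...), p.fields...), p.ts...), nil
}

func main() {
	_, err := (&point{key: []byte("cpu")}).MarshalBinary()
	fmt.Println(err) // point without fields is unsupported
}
```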
kun de4436e9d9 fix scan tag value panic 2016-12-20 08:39:18 +08:00
Mark Rushakoff 15ba7958f8 Use strings.Replacer to escape string field
benchmark                                        old ns/op     new ns/op     delta
BenchmarkEscapeStringField_Plain-4               167           65.3          -60.90%
BenchmarkEscapeString_Quotes-4                   167           165           -1.20%
BenchmarkEscapeString_Backslashes-4              211           184           -12.80%
BenchmarkEscapeString_QuotesAndBackslashes-4     413           397           -3.87%

BenchmarkExportTSMStrings_100s_250vps-4     33833611      27381442           -19.07%
BenchmarkExportWALStrings_100s_250vps-4     34977761      29222717           -16.45%

benchmark                                        old allocs     new allocs     delta
BenchmarkEscapeStringField_Plain-4               4              1              -75.00%
BenchmarkEscapeString_Quotes-4                   4              3              -25.00%
BenchmarkEscapeString_Backslashes-4              5              3              -40.00%
BenchmarkEscapeString_QuotesAndBackslashes-4     9              5              -44.44%

BenchmarkExportTSMStrings_100s_250vps-4     201605          76938              -61.84%
BenchmarkExportWALStrings_100s_250vps-4     225371         100728              -55.31%

benchmark                                        old bytes     new bytes     delta
BenchmarkEscapeStringField_Plain-4               56            16            -71.43%
BenchmarkEscapeString_Quotes-4                   56            48            -14.29%
BenchmarkEscapeString_Backslashes-4              104           80            -23.08%
BenchmarkEscapeString_QuotesAndBackslashes-4     208           160           -23.08%

BenchmarkExportTSMStrings_100s_250vps-4     10872629       6062048           -44.24%
BenchmarkExportWALStrings_100s_250vps-4     10094933       5269980           -47.80%
2016-12-17 23:46:58 -08:00
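The likely shape of such a change, as a hedged sketch: a single package-level `strings.Replacer`, built once, replaces per-call escaping. Function and variable names here are illustrative rather than taken from the repo:

```go
package main

import (
	"fmt"
	"strings"
)

// A single package-level Replacer is built once and reused, which is what
// makes the plain (no-escape-needed) case so much cheaper.
var escaper = strings.NewReplacer(`"`, `\"`, `\`, `\\`)

// escapeStringField escapes double quotes and backslashes so the value can
// be embedded as a line-protocol string field.
func escapeStringField(in string) string {
	return escaper.Replace(in)
}

func main() {
	fmt.Println(escapeStringField(`say "hi"`)) // say \"hi\"
	fmt.Println(escapeStringField(`c:\temp`))  // c:\\temp
}
```

A `Replacer` scans the input once and never re-examines text it has inserted, so the backslash it adds in front of a quote is not itself re-escaped.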
Jason Wilder 2f776ea9e1 Fix string fields w/ trailing slashes
A string field w/ a trailing backslash before the closing quote would parse
incorrectly because the quote would be seen as escaped.  We have to treat \\ as an
escape sequence within strings in order to handle this.
2016-12-01 15:24:11 -07:00
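A hedged sketch of the scanning rule this describes, with an illustrative `scanStringField` helper rather than the repo's parser: consuming `\\` as a single unit keeps a field value that ends in a backslash from hiding the closing quote.

```go
package main

import (
	"errors"
	"fmt"
)

// scanStringField returns the index just past the closing quote of a string
// field that starts at buf[i] (buf[i] must be '"'). Treating a backslash as
// the start of a two-byte escape (\\ or \") is what keeps a trailing
// backslash from masking the closing quote.
func scanStringField(buf []byte, i int) (int, error) {
	i++ // skip opening quote
	for i < len(buf) {
		switch buf[i] {
		case '\\':
			i += 2 // consume the whole escape sequence as one unit
		case '"':
			return i + 1, nil // closing quote found
		default:
			i++
		}
	}
	return i, errors.New("unbalanced quotes")
}

func main() {
	// The field value ends in a single (escaped) backslash.
	buf := []byte(`value="dir\\" 1000`)
	end, err := scanStringField(buf, 6)
	fmt.Println(string(buf[6:end]), err) // "dir\\" <nil>
}
```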
Jonathan A. Sternberg b4db76cee2 Introduce syntax for marking a partial response with chunking
The `partial` tag has been added to the JSON response of a series and
the result so that a client knows when more of the series or result will
be sent in a future JSON chunk.

This helps interactive clients who don't want to wait for all of the
data to know if it is done processing the current series or the current
result. Previously, the client had to guess if the next chunk would
refer to the same result or a new result and it had to match the name
and tags of the two series to know if they were the same series. Now,
the client just needs to check the `partial` field included with the
response to know if it should expect more.

Fixed `max-row-limit` so it counts rows instead of results and it
truncates the response when the `max-row-limit` is reached.
2016-11-22 11:16:22 -06:00
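A sketch of the response shape this implies, assuming simplified `Row` and `Result` structs (field sets trimmed to what matters here):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Row mirrors the shape of a series in the JSON response; Partial is the
// hint that more rows for this same series follow in a later chunk.
type Row struct {
	Name    string            `json:"name"`
	Tags    map[string]string `json:"tags,omitempty"`
	Columns []string          `json:"columns"`
	Values  [][]interface{}   `json:"values"`
	Partial bool              `json:"partial,omitempty"`
}

// Result carries its own partial flag so a client knows whether the next
// chunk continues this result or starts a new one.
type Result struct {
	Series  []Row `json:"series"`
	Partial bool  `json:"partial,omitempty"`
}

func main() {
	r := Result{
		Series: []Row{{
			Name:    "cpu",
			Columns: []string{"time", "value"},
			Values:  [][]interface{}{{1000, 0.5}},
			Partial: true, // this series is continued in the next chunk
		}},
		Partial: true, // and so is the enclosing result
	}
	out, _ := json.Marshal(r)
	fmt.Println(string(out))
}
```

A client draining chunks only has to check `partial` at both levels instead of comparing series names and tags across chunks.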
Jason Wilder 8fce6bba48 Add tag value cardinality limit 2016-10-10 11:42:15 -06:00
Jason Wilder 2ae6b5e1ed Replace uses of newFieldsFromBinary with FieldIterator 2016-10-03 16:30:21 -06:00
Joe LeGasse 743946fafb models: Add FieldIterator type
The FieldIterator is used to scan over the fields of a point, providing
information about each field while delaying parsing/decoding of the value
until it is needed.
This change uses this new type to avoid the allocation of a map for the
fields which is then thrown away as soon as the points get converted
into columns within the datastore.
2016-10-03 16:30:21 -06:00
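An illustrative, much-simplified iterator in the same spirit (the real type works over the point's encoded bytes and supports every field type; this sketch splits a small encoded field section and decodes floats lazily):

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// fieldIterator walks an encoded field section such as
// "usage_user=2.5,usage_system=1.25" without building a map; values are
// decoded only when the caller asks for them. It assumes well-formed
// key=value pairs.
type fieldIterator struct {
	fields [][]byte // remaining key=value pairs
	cur    []byte   // current key=value pair
}

func newFieldIterator(encoded []byte) *fieldIterator {
	return &fieldIterator{fields: bytes.Split(encoded, []byte(","))}
}

func (it *fieldIterator) Next() bool {
	if len(it.fields) == 0 {
		return false
	}
	it.cur, it.fields = it.fields[0], it.fields[1:]
	return true
}

func (it *fieldIterator) FieldKey() []byte {
	return it.cur[:bytes.IndexByte(it.cur, '=')]
}

// FloatValue decodes the current value lazily, and only for callers that
// actually need a float.
func (it *fieldIterator) FloatValue() (float64, error) {
	v := it.cur[bytes.IndexByte(it.cur, '=')+1:]
	return strconv.ParseFloat(string(v), 64)
}

func main() {
	it := newFieldIterator([]byte("usage_user=2.5,usage_system=1.25"))
	for it.Next() {
		v, _ := it.FloatValue()
		fmt.Printf("%s=%v\n", it.FieldKey(), v)
	}
}
```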
joelegasse bc4282ad99 Merge pull request #7347 from influxdata/2016-09-22-zero-alloc-strconv-parse-numbers
Zero-alloc wrappers for strconv.Parse{Int,Float}. Thanks @rw
2016-09-26 15:33:30 -04:00
rw 3155ff2a27 Implement and use zero-alloc FNV64a.
+ Remove a heap alloc in (Point).HashID() and (Row).tagsHash()
  (According to `-gcflags -m`).
+ Direct port from the stdlib.
+ Fuzz test for equivalence to stdlib version.
+ Save one alloc per line when writing with the bulk protocol.
2016-09-26 11:43:27 -07:00
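The port itself is small; a sketch of an inlined FNV-64a that avoids the `hash.Hash64` interface (and therefore the heap allocation at call sites):

```go
package main

import "fmt"

// FNV-64a constants, as used by hash/fnv in the standard library.
const (
	fnvOffset64 uint64 = 14695981039346656037
	fnvPrime64  uint64 = 1099511628211
)

// fnv64a hashes b without going through hash.Hash64, so nothing escapes to
// the heap and HashID-style call sites stay allocation-free.
func fnv64a(b []byte) uint64 {
	h := fnvOffset64
	for _, c := range b {
		h ^= uint64(c)
		h *= fnvPrime64
	}
	return h
}

func main() {
	fmt.Printf("%#x\n", fnv64a([]byte("cpu,host=server01")))
}
```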
rw 16c3bd2093 Happy-path zero-alloc num parsing fuzz tests. 2016-09-26 11:41:31 -07:00
rw 6906fe7240 Zero-alloc wrappers for strconv.Parse{Int,Float}
+ Reduces short-lived heap allocs during value parsing.
+ Fuzz tests to verify equivalence to stdlib functions.
2016-09-26 11:41:31 -07:00
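The idea, sketched with the `unsafe.String`/`unsafe.SliceData` helpers available in current Go (the 2016 wrappers predate these and used an older unsafe conversion; the function names below are illustrative):

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

// parseIntBytes parses b as an integer without copying it into a new
// string. unsafe.String aliases the bytes, so b must not be modified while
// the string is in use; strconv does not retain it past the call.
func parseIntBytes(b []byte, base int, bitSize int) (int64, error) {
	s := unsafe.String(unsafe.SliceData(b), len(b))
	return strconv.ParseInt(s, base, bitSize)
}

// parseFloatBytes is the same no-copy trick for floats.
func parseFloatBytes(b []byte, bitSize int) (float64, error) {
	s := unsafe.String(unsafe.SliceData(b), len(b))
	return strconv.ParseFloat(s, bitSize)
}

func main() {
	i, _ := parseIntBytes([]byte("42"), 10, 64)
	f, _ := parseFloatBytes([]byte("2.5"), 64)
	fmt.Println(i, f)
}
```

The aliased string lives only for the duration of the `strconv` call, which is what makes the trick safe here and keeps the short-lived allocation out of the parse path.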
Jason Wilder ac9a7d520b Pre-allocated Points slice when parsing points
Over a longer period of writes, this allocation shows up quite
a bit in profiles since the slice needs to be resized frequently.

This scans the slice to count how many lines are going to be parsed
in order to pre-allocate the slice capacity.  It's slightly slower,
but creates less garbage in the long run.
2016-09-26 12:19:15 -06:00
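A minimal sketch of the pre-allocation, assuming a simplified parser: count newlines first, then size the result slice once.

```go
package main

import (
	"bytes"
	"fmt"
)

// parsePoints pre-sizes the result by counting newlines up front. The extra
// scan is slightly slower, but it avoids repeated slice growth (and the
// garbage it leaves behind) while appending.
func parsePoints(buf []byte) []string {
	n := bytes.Count(buf, []byte{'\n'}) + 1
	points := make([]string, 0, n)
	for _, line := range bytes.Split(buf, []byte{'\n'}) {
		if len(bytes.TrimSpace(line)) == 0 {
			continue // blank lines are allowed in the line protocol
		}
		points = append(points, string(line)) // stand-in for real point parsing
	}
	return points
}

func main() {
	body := []byte("cpu,host=a value=1 1000\ncpu,host=b value=2 1000\n")
	fmt.Println(len(parsePoints(body))) // 2
}
```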
Cory LaNou 4a2d23da4a fix failing vet issue in test 2016-09-26 11:29:02 -05:00
Joe LeGasse 0d2b339d7c models: Added AppendString, PointSize, and Round to Point
This change also updates the UDP client to take advantage of these
improvements, as well as some code review changes.
2016-09-23 13:22:30 -04:00
Joe LeGasse ee6816756a udp client: large points will now be split, if possible
The v2 UDP client will attempt to split points that exceed the
configured payload size. It will only do this for points that have a
timestamp specified.
2016-09-23 13:22:30 -04:00
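A hedged sketch of the splitting rule, assuming a simplified line-protocol representation (`splitPoint` and its arguments are illustrative, not the client's API): fields are regrouped into lines that share the series key and the explicit timestamp, each kept under the payload limit.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPoint breaks one oversized point into several smaller lines that share
// the same series key and explicit timestamp, so each fits in payloadSize.
// Without an explicit timestamp the split is unsafe (the server would stamp
// the pieces with different times), so callers only attempt it when ts != "".
func splitPoint(key string, fields []string, ts string, payloadSize int) []string {
	var out, cur []string
	flush := func() {
		if len(cur) > 0 {
			out = append(out, key+" "+strings.Join(cur, ",")+" "+ts)
			cur = nil
		}
	}
	for _, f := range fields {
		candidate := key + " " + strings.Join(append(append([]string(nil), cur...), f), ",") + " " + ts
		if len(cur) > 0 && len(candidate) > payloadSize {
			flush() // current line is full; start a new one with the same key and ts
		}
		cur = append(cur, f)
	}
	flush()
	return out
}

func main() {
	lines := splitPoint("cpu,host=server01",
		[]string{"user=1.5", "system=0.25", "idle=98.0"}, "1000000000", 48)
	for _, l := range lines {
		fmt.Println(l, len(l))
	}
}
```

A single field that is itself larger than the limit is still emitted on its own line, which matches the "if possible" caveat above.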
Vladimir Sagan 0e33af50a9 UDP client: split metrics into chunks when writing 2016-09-23 13:22:30 -04:00
Jonathan A. Sternberg dc2527ce86 Merge branch '1.0' 2016-08-31 14:45:57 -05:00
Jonathan A. Sternberg 0d63889847 Allow blank lines in the line protocol input 2016-08-30 09:25:55 -05:00
Jonathan A. Sternberg a2b65f2b86 Remove unused Err field on models.Row 2016-08-29 12:38:29 -05:00
Jonathan A. Sternberg 8b234546a8 Merge pull request #7204 from influxdata/1.0
Merge 1.0 branch to master
2016-08-25 15:20:30 -05:00
Jonathan A. Sternberg 10029caf2f Support negative timestamps in the query engine
Negative timestamps are now supported. We also now reject the two
nanosecond values at the edge of the minimum time window. The first is
rejected because MinInt64 is needed for some internal comparisons in the
TSM engine, and accepting it caused an underflow when we subtracted one
from the minimum time. The second is reserved as a default minimum that
nobody can write to, so we can implicitly rewrite the timestamp on
aggregate queries while still honoring an explicit timestamp if the user
gives us one; we can't tell whether the minimum was user-provided or
implicit unless those two values differ.

If the default minimum time is used with an aggregate query, we rewrite
the time to be the epoch for backwards compatibility since we believe
that's more important than supporting that extra nanosecond.
2016-08-25 12:52:41 -05:00
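In effect the valid range starts two nanoseconds above `math.MinInt64`; a hedged sketch with illustrative constant and function names:

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// math.MinInt64 and math.MinInt64+1 are reserved: one for internal
// comparisons in the TSM engine (so subtracting one from the minimum can't
// underflow), the other as the implicit "no lower bound" default that
// aggregate queries may rewrite. User-written timestamps therefore start
// at math.MinInt64+2.
const minNanoTime = int64(math.MinInt64) + 2

var errTimeOutOfRange = errors.New("time outside valid range")

func checkTime(ns int64) error {
	if ns < minNanoTime {
		return errTimeOutOfRange
	}
	return nil
}

func main() {
	fmt.Println(checkTime(-1))                // negative timestamps are accepted
	fmt.Println(checkTime(math.MinInt64 + 1)) // reserved sentinel: rejected
}
```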
Edd Robinson 6cafdbc604 Ensure we don't mutate provided statistics tags 2016-08-24 11:40:13 +01:00
rw fd53330824 gofmt 2016-08-23 11:53:17 -07:00
rw 8d7d89d8ed Fix "nil map" panic in statistics collection. 2016-08-23 11:45:12 -07:00
Edd Robinson 90ff713f21 Fix base64 encoding issue in stats
Fixes #7177.
2016-08-22 15:21:31 +01:00
Ben Johnson 8aa224b22d reduce memory allocations in index
This commit changes the index to point to index data in the shards
instead of keeping it in-memory on the heap.
2016-08-16 14:09:00 -06:00
Jason Wilder d432aaa84d Fix panic with parsing empty key
Fixes #6990
2016-07-28 18:38:17 -06:00
Cory LaNou 968d322d6d finish tsm file exporter 2016-07-21 17:20:51 -05:00
Jonathan A. Sternberg 837a9804cf Refactoring the monitor service to avoid expvar
Truncate the timestamps emitted by the monitor service to even interval
boundaries rather than aligning them to whole minutes measured from the
start time. This normalizes the output from the monitor service.
2016-07-07 11:13:58 -05:00
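The normalization amounts to truncating emitted timestamps to the interval; a tiny sketch (names illustrative):

```go
package main

import (
	"fmt"
	"time"
)

// Truncating to the interval makes every emitted statistics timestamp land
// on an even boundary (e.g. :00, :10, :20 for a 10s interval) regardless of
// when the service started.
func intervalTime(now time.Time, interval time.Duration) time.Time {
	return now.Truncate(interval)
}

func main() {
	now := time.Date(2016, 7, 7, 11, 13, 58, 0, time.UTC)
	fmt.Println(intervalTime(now, 10*time.Second)) // 2016-07-07 11:13:50 +0000 UTC
}
```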
Jonathan A. Sternberg 497db2a6d3 Removing dead code from every package except influxql
The tsdb package had a substantial amount of dead code related to the
old query engine still in there. It is no longer used, so it was removed
since it was left unmaintained. There is likely still more code that is
the same, but wasn't found as part of this code cleanup.

influxql has dead code show up because of the code generation so it is
not included in this pruning.
2016-06-20 22:41:07 -05:00
Jonathan A. Sternberg 8e1b036b0a Modify the max nanosecond time to be one nanosecond less
The highest nanosecond timestamp needs to be usable as an exclusive range
bound, so the maximum allowed time is now one less than the largest value
representable by an int64; otherwise we would lose a point written at
that one time.

This previously worked in the open source version because the timestamp
used to find a shard was truncated by the retention policy, so the lookup
time never rested on the truncation boundary and never hit this edge
case. Since that point didn't really belong in that shard group and was
placed there by mistake, it's best to fix this bug: the timestamp used to
create a shard group should be capable of retrieving it.
2016-06-16 12:15:41 -05:00
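A hedged sketch of the reasoning, with illustrative names: capping the writable maximum one below `math.MaxInt64` leaves room to form the exclusive upper bound of a shard's time range without overflow.

```go
package main

import (
	"fmt"
	"math"
)

// Shard lookup treats its time range as [min, max+1), so the largest
// writable timestamp is capped at math.MaxInt64-1; otherwise forming the
// exclusive upper bound for a point at MaxInt64 would overflow and the
// point could never be read back out of its shard group.
const maxNanoTime = int64(math.MaxInt64) - 1

func contains(min, max, ts int64) bool {
	return ts >= min && ts < max+1 // max+1 must not overflow
}

func main() {
	fmt.Println(contains(0, maxNanoTime, maxNanoTime)) // true
	fmt.Println(maxNanoTime+1 == math.MaxInt64)        // headroom for the exclusive bound
}
```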
Jonathan A. Sternberg 3bd9425edb Fix the point validation parser to identify and sort tags correctly
Fixes #6771.
2016-06-13 09:45:10 -05:00
Jason Wilder ff2475bf7c Prevent allocation in unescapeTag 2016-05-27 10:30:08 -06:00
Jason Wilder 97ad5fd2e6 Add ParseKey benchmark 2016-05-27 10:30:08 -06:00