With this change, the query engine code gathers information about
shards and tagsets by working with individual shards, collating the
information, and returning it to the client. It does not assume that
any particular shard is local, and accesses all shards through
abstracted Mappers, of which there are two types: one for raw
queries and one for aggregate queries. Each type of Mapper has a
corresponding Executor, but both types of Executors share the same
interface.
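A rough sketch of the shape of these abstractions (the names and
method signatures here are illustrative, not the exact interfaces):

    package query

    // Row is a collated result row returned to the client.
    type Row struct {
        Name   string
        Values [][]interface{}
    }

    // Mapper abstracts access to a single shard, local or remote.
    // One implementation handles raw queries, another aggregates.
    type Mapper interface {
        Open() error
        NextChunk() (interface{}, error)
        Close() error
    }

    // Executor drains a set of Mappers, collates their output, and
    // streams merged rows to the client. The raw and aggregate
    // executors both satisfy this one interface.
    type Executor interface {
        Execute() <-chan *Row
    }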
Writing points that were not sorted by time could cause very high
CPU usage and increased latencies because each point inserted would
cause the in-memory cache to be re-sorted. The worst case would be
writing a large batch of N points in reverse time order, which would
invoke N sorts of the slice.
This patch keeps track of which slices need to be sorted and sorts
them once at the end, so in the previous example the N sorts become
one. There is still a pathological case that requires N/2 sorts: for
example, 10000 points split across 5000 series, where each series
has two points in reverse time order, would still incur 5000 sorts.
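A simplified sketch of the mechanism (tracking bare timestamps here;
the real cache stores full points):

    package tsdb

    import "sort"

    // cache is a simplified write cache: values holds each series'
    // timestamps, and dirty marks slices that received an
    // out-of-order point and so need a single sort at flush time.
    type cache struct {
        values map[string][]int64
        dirty  map[string]bool
    }

    func (c *cache) write(key string, ts int64) {
        vals := c.values[key]
        // Only flag the slice as needing a sort; don't sort on
        // every insert.
        if n := len(vals); n > 0 && ts < vals[n-1] {
            c.dirty[key] = true
        }
        c.values[key] = append(vals, ts)
    }

    // sortDirty runs once at the end of a batch, so a reverse-ordered
    // batch costs one sort per affected series instead of one per
    // point.
    func (c *cache) sortDirty() {
        for key := range c.dirty {
            v := c.values[key]
            sort.Slice(v, func(i, j int) bool { return v[i] < v[j] })
            delete(c.dirty, key)
        }
    }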
Fixes #3159
These are not supported types, but previously they would cause the
point.Fields() func to panic. This change prevents the panic so the
values can be ignored if needed.
When creating a point manually, the field values are interface{},
which allows unsupported types to be passed in. Previously, the
code would panic. It now defaults to the string representation of
the value if it is not a known type.
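A minimal sketch of the fallback behavior (encodeValue is a
hypothetical helper, not the actual codec):

    package tsdb

    import "fmt"

    // encodeValue passes known types through unchanged; anything
    // else becomes its string representation instead of causing a
    // panic.
    func encodeValue(v interface{}) interface{} {
        switch v := v.(type) {
        case int64, float64, bool, string:
            return v
        default:
            return fmt.Sprintf("%v", v)
        }
    }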
This commit adds a write ahead log to the shard. Entries are cached
in memory and periodically flushed back into the index. The WAL and
the cache are both partitioned into buckets so that flushing doesn't
stop the world for as long.
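A rough sketch of the partitioning idea (the bucket count, hash, and
types are illustrative):

    package tsdb

    import "sync"

    // partition is one bucket of the WAL cache. Each bucket has its
    // own lock, so flushing one bucket back into the index only
    // blocks writers that hash to that same bucket.
    type partition struct {
        mu      sync.RWMutex
        entries map[string][][]byte // series key -> encoded entries
    }

    const partitionCount = 8 // illustrative bucket count

    type wal struct {
        partitions [partitionCount]*partition
    }

    // partitionFor hashes the series key to pick a bucket.
    func (w *wal) partitionFor(key string) *partition {
        var h uint32
        for i := 0; i < len(key); i++ {
            h = h*31 + uint32(key[i])
        }
        return w.partitions[h%partitionCount]
    }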
Field values that were out of range for the type would panic the database
when being inserted because the parser would allow them as valid points.
This change prevents those invalid values from being parsed and instead
returns an error.
An alternative fix considered was to handle the error and clamp the
value to the min/max value for the type. This would treat numeric
range errors slightly differently from other type errors, which
might lead to confusion.
The simplest fix with the current parser would be to convert each
field to the type at parse time. Unfortunately, this adds extra
memory allocations and lowers throughput significantly. Since
out-of-range values are less common than in-range values, some
heuristics are used to determine when the more expensive type
parsing and range checking is performed. Essentially, we only take
the slow path when we cannot determine that the value is within the
acceptable range for the type.
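One such heuristic, sketched for integers (the helper name and
structure are illustrative):

    package tsdb

    import (
        "fmt"
        "strconv"
    )

    // maxInt64Digits is the length of "9223372036854775807"
    // (math.MaxInt64). Any numeric token shorter than this cannot
    // overflow an int64, even with a leading sign.
    const maxInt64Digits = 19

    // checkIntRange runs the expensive parse and range check only
    // for tokens long enough to possibly overflow.
    func checkIntRange(token []byte) error {
        if len(token) < maxInt64Digits {
            return nil // fast path: too short to be out of range
        }
        if _, err := strconv.ParseInt(string(token), 10, 64); err != nil {
            return fmt.Errorf("unable to parse integer %s: %s", token, err)
        }
        return nil
    }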
Fixes #3127
A field consisting of just a numeric value would be accepted by the
line protocol parser, but the number would be set as the field name
and the value would be nil. Instead, return an error, because all
field values need a field name.
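For example (hypothetical lines):

    cpu,host=serverA 23.5          <- rejected: no field name
    cpu,host=serverA value=23.5    <- accepted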
Statements were only being normalized if a default database was
included in the query (usually via the query param 'db'). However,
if no default database was included, and none was an explicit part
of the measurement name, no database-existence check was run. This
resulted in a later panic during wildcard expansion.
Fixes #2960
Integers were incorrectly written back to line protocol using
strconv.FormatFloat. Large integers are written in scientific
notation, which causes their type to change to a float when parsed
back.
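A small self-contained demonstration of the failure mode:

    package main

    import (
        "fmt"
        "math"
        "strconv"
    )

    func main() {
        v := int64(math.MaxInt64)
        // Round-tripping through FormatFloat loses precision and
        // emits scientific notation, which parses back as a float:
        fmt.Println(strconv.FormatFloat(float64(v), 'g', -1, 64))
        // Output: 9.223372036854776e+18
        // FormatInt preserves the exact digits:
        fmt.Println(strconv.FormatInt(v, 10))
        // Output: 9223372036854775807
    }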
Supported boolean values are now t, T, true, TRUE, f, F, false, and
FALSE. This is what the strconv.ParseBool function supports, with
the exception of 1 and 0, which would be parsed as ints in the line
protocol.
Previously, any non-true value would be parsed as false, e.g.
value=blah would parse to false. This now returns an error at
parsing time.
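A sketch of the accepted tokens (illustrative, not the parser's
actual code):

    package tsdb

    import "fmt"

    // parseBool accepts exactly the tokens listed above; anything
    // else, including the 1 and 0 that strconv.ParseBool would
    // allow, is an error rather than a silent false.
    func parseBool(tok string) (bool, error) {
        switch tok {
        case "t", "T", "true", "TRUE":
            return true, nil
        case "f", "F", "false", "FALSE":
            return false, nil
        }
        return false, fmt.Errorf("invalid boolean value %q", tok)
    }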
Adds more tests for invalid numbers such as 0.1a and -2.-4, as well
as tests for the supported formats for negative and positive
integers/floats and for scientific notation.
Fixes #2869
When adding a new field to an existing measurement,
Shard.validateSeriesAndFields would also encode the fields as a
side effect. In the case of a new field that needed to be created,
the encoding would fail because the field type had not yet been
created for the measurement. The fields are re-encoded after
validateSeriesAndFields returns and after the field encoding has
been set up properly, so this additional encoding during validation
isn't necessary.
Fixes a panic on writes because the field value was not parsed
correctly.
panic: unsupported value type during encode fields: <nil>
goroutine 117 [running]:
github.com/influxdb/influxdb/tsdb.(*FieldCodec).EncodeFields(0xc2081c4020, 0xc2081dc180, 0x0, 0x0, 0x0, 0x0, 0x0)
/Users/jason/go/src/github.com/influxdb/influxdb/tsdb/shard.go:573 +0x8e3
* Add deleteMeasurement to store and shard
* Add DropMeasurement to DatabaseIndex
* Update ErrMeasurementNotFound and ErrDatabaseNotFound to not include the first line of the stack trace.
* Pulled over updates to ast and parser from master
* Updated store and shard to be able to drop series
* Pulled updates to database.go from master into tsdb/meta.go
Fixes an issue where queries couldn't match anything because the
index doesn't load until the shard is open.
Fix an issue where field codecs weren't populated in the shard when loading.
Uses a structure like:
/root/
    /db1/rp1/1
            /2
    /db2/rp2/3
If a write is assigned to a shard on the local node but the shard
has not been created, create it when the write returns an error
and retry the write.
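A sketch of the retry, with hypothetical names:

    package cluster

    import "errors"

    var errShardNotFound = errors.New("shard not found")

    type point struct{} // placeholder for a parsed point

    type shardStore interface {
        WriteToShard(shardID uint64, points []point) error
        CreateShard(shardID uint64) error
    }

    // writeToShard retries once: if the write fails because the
    // local shard doesn't exist yet, create the shard and write
    // again.
    func writeToShard(s shardStore, shardID uint64, points []point) error {
        err := s.WriteToShard(shardID, points)
        if errors.Is(err, errShardNotFound) {
            if err = s.CreateShard(shardID); err != nil {
                return err
            }
            err = s.WriteToShard(shardID, points)
        }
        return err
    }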
This allows the new write path to be hooked up if you start the
server with `INFLUXDB_ALPHA1=1`. When set, writes will go through
the coordinator and be stubbed out to write to a single local data
node with one shard. The write will be logged and written to disk.
The env var gate keeps the current write path from being broken,
which would break many of the tests that depend on writes.
Note that queries are not currently working with this change.