Parsing the line protocol again on the receiving side of the remote
write consumes a lot of CPU. This uses a different marshaling format
that is much faster to parse, since the point was already parsed on
the write side.
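As a rough illustration of the idea only (not the actual wire format used by the cluster service), the sketch below encodes a hypothetical already-parsed point as length-prefixed key and field blocks plus a timestamp, so the receiver can decode it with a few fixed reads instead of re-running the line-protocol parser:

```go
// Illustrative only; simplePoint and the layout below are hypothetical.
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// simplePoint stands in for a point that was already parsed on the write side.
type simplePoint struct {
	Key       string // measurement plus encoded tags
	Fields    string // encoded field set
	Timestamp int64
}

func encodePoint(p simplePoint) []byte {
	var buf bytes.Buffer
	for _, s := range []string{p.Key, p.Fields} {
		binary.Write(&buf, binary.BigEndian, uint32(len(s)))
		buf.WriteString(s)
	}
	binary.Write(&buf, binary.BigEndian, p.Timestamp)
	return buf.Bytes()
}

func decodePoint(b []byte) (p simplePoint, err error) {
	r := bytes.NewReader(b)
	read := func() (string, error) {
		var n uint32
		if err := binary.Read(r, binary.BigEndian, &n); err != nil {
			return "", err
		}
		out := make([]byte, n)
		if _, err := io.ReadFull(r, out); err != nil {
			return "", err
		}
		return string(out), nil
	}
	if p.Key, err = read(); err != nil {
		return p, err
	}
	if p.Fields, err = read(); err != nil {
		return p, err
	}
	err = binary.Read(r, binary.BigEndian, &p.Timestamp)
	return p, err
}

func main() {
	enc := encodePoint(simplePoint{Key: "cpu,host=a", Fields: "value=0.5", Timestamp: 1443700000000000000})
	dec, _ := decodePoint(enc)
	fmt.Printf("%+v\n", dec)
}
```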
Under highly concurrent write load, the coordinating node would
create a connection to any other node that is part of the replica
group. Since each connection can be expensive, OOM situations could
occur because there was no bound on the number of new connections
that would be created. If writes on a remote node were slow, connections
could pile up and exacerbate the problem.
This switches the pool to be bounded and adds a checkout that blocks
with a timeout. If a connection is available, it's returned immediately.
If the pool still has room for more connections, it will create one if needed.
Otherwise, the call blocks until a connection becomes available or
the timeout expires. In the case of a timeout, the error is propagated back up
to the PointsWriter, which determines what to return to the client.
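A minimal sketch of that checkout behavior, assuming a hypothetical dial function and pool type rather than the actual implementation in the cluster package:

```go
// Illustrative only; this is the shape of a bounded pool with a blocking,
// timeout-aware checkout, not the real cluster pool.
package pool

import (
	"errors"
	"net"
	"time"
)

// ErrTimeout is returned when no connection frees up in time; the caller
// (the PointsWriter in the description above) decides what the client sees.
var ErrTimeout = errors.New("timed out waiting for free connection")

type BoundedPool struct {
	conns chan net.Conn            // idle connections ready for checkout
	slots chan struct{}            // tokens bounding the total number of connections
	dial  func() (net.Conn, error) // how new connections are created
}

func NewBoundedPool(capacity int, dial func() (net.Conn, error)) *BoundedPool {
	p := &BoundedPool{
		conns: make(chan net.Conn, capacity),
		slots: make(chan struct{}, capacity),
		dial:  dial,
	}
	for i := 0; i < capacity; i++ {
		p.slots <- struct{}{}
	}
	return p
}

// Get returns an idle connection immediately if one is available, dials a
// new one if the pool is below capacity, and otherwise blocks until a
// connection is checked back in or the timeout expires.
func (p *BoundedPool) Get(timeout time.Duration) (net.Conn, error) {
	select {
	case c := <-p.conns:
		return c, nil
	default:
	}

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case c := <-p.conns:
		return c, nil
	case <-p.slots: // still room to grow: create a new connection
		c, err := p.dial()
		if err != nil {
			p.slots <- struct{}{} // return the slot on failure
			return nil, err
		}
		return c, nil
	case <-timer.C:
		return nil, ErrTimeout
	}
}

// Put checks a healthy connection back in so a blocked Get can proceed.
func (p *BoundedPool) Put(c net.Conn) {
	select {
	case p.conns <- c:
	default:
		c.Close() // defensive: cannot happen while Get/Put stay balanced
	}
}
```

Using a buffered channel both as the idle list and as a capacity token keeps the blocking, bounding, and timeout logic in a single select.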
Go style -- and existing runtime stats -- do not use underscores, but
instead use camel case. This change makes the internal stats adhere to
that convention.
NaN float values are not supported by the existing engine or the tsm1
engine. This changes NewPoint to return an error if a field value
contains a NaN. It also allows us to validate fields to prevent
other unsupported types from sneaking in through other input plugins.
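A minimal sketch of that validation, using a hypothetical newPoint constructor rather than the real models.NewPoint signature:

```go
// Illustrative only; field types and error text are hypothetical.
package main

import (
	"fmt"
	"math"
)

type point struct {
	name   string
	fields map[string]interface{}
}

// newPoint rejects field values the storage engines cannot represent,
// such as NaN floats, before the point is ever written.
func newPoint(name string, fields map[string]interface{}) (*point, error) {
	for k, v := range fields {
		switch x := v.(type) {
		case float64:
			if math.IsNaN(x) {
				return nil, fmt.Errorf("NaN is an unsupported value for field %q", k)
			}
		case int64, string, bool:
			// supported field types; nothing to check here
		default:
			return nil, fmt.Errorf("unsupported field type %T for field %q", v, k)
		}
	}
	return &point{name: name, fields: fields}, nil
}

func main() {
	if _, err := newPoint("cpu", map[string]interface{}{"value": math.NaN()}); err != nil {
		fmt.Println(err) // NaN is an unsupported value for field "value"
	}
}
```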
Since INTO queries need to have absolute information about the database
to work, we need to create a loopback interface back to the cluster
in order to perform them.
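A rough sketch of what such a loopback could look like, with purely illustrative type names:

```go
// Illustrative only; type names are hypothetical.
package cluster

// Point stands in for a parsed point produced while executing an INTO query.
type Point interface{}

// IntoWriteRequest carries the output of an INTO query back into the cluster.
type IntoWriteRequest struct {
	Database        string
	RetentionPolicy string
	Points          []Point
}

// IntoWriter is the loopback the query engine is handed so that INTO
// results are routed through the cluster's normal write path rather than
// being written to local shards directly.
type IntoWriter interface {
	WriteInto(req IntoWriteRequest) error
}
```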
Issue: Enable golint on the code base #4098 (changes only for the cluster subpackage)
- [ ] CHANGELOG.md updated
- [X] Rebased/mergable
- [X] Tests pass
- [X] Sign [CLA](http://influxdb.com/community/cla.html) (if not already signed)
Point was accessed from multiple goroutines and there was a race on the internal
cachedFields and cachedName fields. Accessing these fields is unnecessary work, as it
requires the point to be unmarshaled into Go types and then remarshaled back into protobuf
types. Instead, just send the line protocol version already available on the point via
the protobuf. This avoids accessing these cached fields and eliminates some extra work.
Possible fix for #4069
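A minimal sketch of the approach, with illustrative names standing in for the actual protobuf messages:

```go
// Illustrative only; writeShardRequest stands in for the real protobuf
// write request, and linePoint for the real point type.
package main

import "fmt"

// linePoint is an already-parsed point that still carries its original
// line-protocol representation.
type linePoint struct {
	raw string
}

// String returns the point in line protocol, without touching any
// lazily-populated cached fields.
func (p linePoint) String() string { return p.raw }

// writeShardRequest mirrors the idea of the message sent to the remote
// node: raw line protocol instead of decomposed name/field structures.
type writeShardRequest struct {
	ShardID uint64
	Points  []string
}

func newWriteRequest(shardID uint64, points []linePoint) writeShardRequest {
	req := writeShardRequest{ShardID: shardID}
	for _, p := range points {
		// The receiver parses this once; nothing is unmarshaled into Go
		// types and remarshaled back into protobuf types on the sender.
		req.Points = append(req.Points, p.String())
	}
	return req
}

func main() {
	req := newWriteRequest(1, []linePoint{{raw: "cpu,host=a value=0.5 1443700000000000000"}})
	fmt.Println(req.ShardID, req.Points[0])
}
```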
Immediately return once the required number of writes are completed;
otherwise, requests running with relaxed consistency levels (e.g. any
or one) would be blocked unexpectedly, for instance waiting for dead
nodes to respond.
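A minimal sketch of that early return, with hypothetical helper names rather than the actual PointsWriter code:

```go
// Illustrative only; writer functions stand in for per-node shard writes.
package main

import (
	"errors"
	"fmt"
	"time"
)

var ErrTimeout = errors.New("write timed out")

// writeToReplicas fires a write to every owner but unblocks the caller as
// soon as `required` of them succeed, instead of waiting on every node.
func writeToReplicas(writers []func() error, required int, timeout time.Duration) error {
	results := make(chan error, len(writers)) // buffered so laggards never block
	for _, w := range writers {
		go func(w func() error) { results <- w() }(w)
	}

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	var ok, failed int
	for ok < required {
		select {
		case err := <-results:
			if err != nil {
				failed++
				// Too many failures means the level can never be met.
				if failed > len(writers)-required {
					return fmt.Errorf("write failed on %d replicas", failed)
				}
				continue
			}
			ok++
		case <-timer.C:
			return ErrTimeout
		}
	}
	return nil // required writes done; do not wait for dead or slow nodes
}

func main() {
	slow := func() error { time.Sleep(time.Hour); return nil } // simulates a dead node
	fast := func() error { return nil }
	err := writeToReplicas([]func() error{fast, slow, slow}, 1, time.Second)
	fmt.Println(err) // <nil>: consistency level "one" is satisfied immediately
}
```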
This commit converts meta.ShardInfo.OwnerIDs from a slice of IDs
to a slice of objects, in order to support per-node statuses for a
shard. For example, a node may have a shard assigned to it but still
be copying the shard, and therefore not yet be ready to serve data
for it.
The old `OwnerIDs` field is marked as deprecated; however, the code
still supports loading from older protobuf-encoded data.
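A rough sketch of the new shape, with illustrative field and status names (the real meta package defines its own):

```go
// Illustrative only; names and statuses are hypothetical.
package meta

// ShardStatus describes whether a node can serve data for a shard it owns.
type ShardStatus int

const (
	// ShardStatusCopying means the shard is assigned but still being copied.
	ShardStatusCopying ShardStatus = iota
	// ShardStatusReady means the node can serve data for the shard.
	ShardStatusReady
)

// ShardOwner pairs a node with its status for one shard.
type ShardOwner struct {
	NodeID uint64
	Status ShardStatus
}

// ShardInfo now carries owner objects; OwnerIDs remains only so older
// protobuf-encoded data can still be loaded.
type ShardInfo struct {
	ID       uint64
	Owners   []ShardOwner
	OwnerIDs []uint64 // Deprecated: use Owners.
}
```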
Running SHOW MEASUREMENTS in a partially replicated cluster produces inconsistent
results due to the connection pooling. When running remote meta-data queries,
the cluster service ends up keeping the map shard request open but still checks the connection
back into the pool. This causes inconsistent results because data from the last request
interferes with the new request.
This removes the connection pool, which fixes the issue. It also has the side effect of fixing
a node's pooled connections that have gone bad when that node restarts. For example, in a 3-node cluster
that has been responding to queries correctly, restarting 1 node will cause all the others to fail
to query that node indefinitely. This is now fixed as well.
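A minimal sketch of dialing per request instead of checking a connection out of a shared pool, with hypothetical names:

```go
// Illustrative only; names do not match the cluster package exactly.
package cluster

import (
	"net"
	"time"
)

// shardMapper stands in for the component that runs remote meta-data
// queries such as SHOW MEASUREMENTS against other nodes.
type shardMapper struct {
	timeout time.Duration
}

// dial opens a fresh connection for each remote request instead of
// reusing a pooled one, so data left on a half-read connection can never
// bleed into the next query, and connections broken by a node restart are
// never reused.
func (s *shardMapper) dial(addr string) (net.Conn, error) {
	return net.DialTimeout("tcp", addr, s.timeout)
}
```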
* Capitalize first letter of message
* Log all services starting consistently
* Remove some extraneous log statements in meta.Store
* Log data dirs for meta, data and hinted handoff
With this change remote mapping no longer uses HTTP, as the HTTP ports
exposed by nodes on the cluster are not known cluster wide. The TCP
ports exposed by the cluster service are, so this change uses that
functionality. Each RemoteMapper has its own dedicated connection pool
for each node, and remote mapping TCP connections are in no way coupled
with query TCP connections.
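A rough sketch of the per-node pooling idea, with hypothetical names:

```go
// Illustrative only; the real RemoteMapper and pool types differ.
package cluster

import (
	"net"
	"sync"
	"time"
)

// connPool is whatever bounded pool implementation is in use; remote
// mapping connections never share a pool with query connections.
type connPool interface {
	Get(timeout time.Duration) (net.Conn, error)
	Put(c net.Conn)
}

// remoteMapper keeps one dedicated pool per node, keyed by node ID and
// built from the node's cluster-service TCP address, which (unlike the
// HTTP port) is known cluster-wide.
type remoteMapper struct {
	mu      sync.Mutex
	pools   map[uint64]connPool
	newPool func(tcpAddr string) connPool
}

// pool returns the mapper's dedicated pool for a node, creating it on
// first use from the node's TCP address.
func (m *remoteMapper) pool(nodeID uint64, tcpAddr string) connPool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.pools == nil {
		m.pools = make(map[uint64]connPool)
	}
	p, ok := m.pools[nodeID]
	if !ok {
		p = m.newPool(tcpAddr)
		m.pools[nodeID] = p
	}
	return p
}
```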