The following benchmarks measure reading 10,000,000 points via TCP, yamux and gRPC.
These were performed using a VM and host machine on the same physical hardware.
iperf reported about 390 MB/s as the maximum throughput between the machines.
influxd running on the VM, client on the host.
All protocols used the same protobuf message (to provide framing) with an embedded []byte
array to serialize batches of points – large arrays of structures for points are simply too slow.
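As a rough illustration of the packed []byte approach described above, the sketch below serializes a batch of points into one contiguous buffer that could then be carried in a single bytes field of a protobuf message. The Point type, the 16-byte layout (8-byte timestamp plus 8-byte float64 value), the little-endian byte order, and the function names are all assumptions made for illustration; the actual encoding used in these benchmarks is not shown here.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Point is a hypothetical (timestamp, value) pair used only for this sketch.
type Point struct {
	Timestamp int64
	Value     float64
}

// pointSize is an assumed fixed layout: 8-byte timestamp + 8-byte float64.
const pointSize = 16

// EncodeBatch packs the points into one contiguous byte slice, so the
// protobuf message only has to carry a single bytes field instead of a
// large repeated field of point structures.
func EncodeBatch(pts []Point) []byte {
	buf := make([]byte, len(pts)*pointSize)
	for i, p := range pts {
		off := i * pointSize
		binary.LittleEndian.PutUint64(buf[off:], uint64(p.Timestamp))
		binary.LittleEndian.PutUint64(buf[off+8:], math.Float64bits(p.Value))
	}
	return buf
}

// DecodeBatch reverses EncodeBatch and rejects buffers whose length is not
// a whole number of points.
func DecodeBatch(buf []byte) ([]Point, error) {
	if len(buf)%pointSize != 0 {
		return nil, fmt.Errorf("batch length %d is not a multiple of %d", len(buf), pointSize)
	}
	pts := make([]Point, len(buf)/pointSize)
	for i := range pts {
		off := i * pointSize
		pts[i].Timestamp = int64(binary.LittleEndian.Uint64(buf[off:]))
		pts[i].Value = math.Float64frombits(binary.LittleEndian.Uint64(buf[off+8:]))
	}
	return pts, nil
}

func main() {
	batch := EncodeBatch([]Point{{Timestamp: 1, Value: 0.5}, {Timestamp: 2, Value: 1.5}})
	pts, err := DecodeBatch(batch)
	fmt.Println(len(batch), pts, err)
}
```

The point of the design is that the protobuf layer only sees one opaque field per batch, which keeps the message graph small.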
The underlying storage engine cursor can read 10,000,000 points in about 230ms, so the overhead for each protocol is as follows:
TCP → 470ms
yamux → 620ms
gRPC → 970ms
Maximum transfer rates are therefore:
TCP → 340 MB/s or ~21e6 points / sec
yamux → 258 MB/s or ~16e6 points / sec
gRPC → 164 MB/s or ~10e6 points / sec
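The rates above follow from dividing the 10,000,000 points by each protocol's overhead. The sketch below reproduces that arithmetic. The 16 bytes per point is an assumption inferred from the quoted figures (340 MB/s over 470ms is roughly 160 MB, or 16 bytes for each of the 10,000,000 points), not something stated in the measurements. The function names are hypothetical. With that assumption the computed MB/s values land within about 1 MB/s of the numbers listed.

```go
package main

import "fmt"

const (
	numPoints     = 10_000_000
	bytesPerPoint = 16 // assumption inferred from the quoted MB/s figures
)

// PointsPerSec returns the point rate implied by a protocol overhead in milliseconds.
func PointsPerSec(overheadMs float64) float64 {
	return numPoints / (overheadMs / 1000)
}

// MBPerSec returns the transfer rate in MB/s (1 MB = 1e6 bytes) implied by
// a protocol overhead in milliseconds.
func MBPerSec(overheadMs float64) float64 {
	return numPoints * bytesPerPoint / 1e6 / (overheadMs / 1000)
}

func main() {
	protocols := []struct {
		name       string
		overheadMs float64
	}{
		{"TCP", 470},
		{"yamux", 620},
		{"gRPC", 970},
	}
	for _, p := range protocols {
		fmt.Printf("%-5s %6.1f MB/s  %.1fe6 points/sec\n",
			p.name, MBPerSec(p.overheadMs), PointsPerSec(p.overheadMs)/1e6)
	}
}
```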
It is worth noting that I have not tested Go's network libraries to determine their maximum throughput; however, I suspect it may be close to the TCP maximum. Whilst we will benchmark using independent machines in AWS, these tests helped me understand the relative performance of the various transports and the impact different serialization mechanisms have on our throughput. Protobuf is OK as long as we keep the message graph small, meaning we customize the serialization of the points.
I also tested client and server on localhost, to compare the protocols without the network stack overhead. gRPC was very inconsistent, varying anywhere from 463ms to 793ms, so its result represents the average of a number of runs.
Overhead
TCP → 95ms
yamux → 108ms
gRPC → 441ms
These numbers bring TCP and yamux within 13ms (about 14%) of each other. The majority of the difference between TCP and yamux is due to the additional frames sent by yamux to manage flow control, which add latency. If that overhead is a concern, we may need to tune the flow control algorithm.
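For anyone wanting to repeat the localhost comparison, a minimal sketch of timing a raw TCP stream over loopback with Go's standard library is shown below. It is not the harness used for the numbers above: the StreamLoopback name, the zero-filled payload, and the single write are assumptions made to keep the example short, and it measures only raw socket transfer with no protobuf framing or storage engine involved.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// StreamLoopback sends n zero bytes over a loopback TCP connection and
// returns the number of bytes the client received and the time it took
// to read them all.
func StreamLoopback(n int) (int64, time.Duration, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the OS pick a free port
	if err != nil {
		return 0, 0, err
	}
	defer ln.Close()

	// Server side: accept one connection, write the payload, then close.
	errc := make(chan error, 1)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			errc <- err
			return
		}
		defer conn.Close()
		_, err = conn.Write(make([]byte, n))
		errc <- err
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return 0, 0, err
	}
	defer conn.Close()

	// Client side: read until the server closes the connection.
	start := time.Now()
	got, err := io.Copy(io.Discard, conn)
	elapsed := time.Since(start)
	if err != nil {
		return got, elapsed, err
	}
	return got, elapsed, <-errc
}

func main() {
	const payload = 160_000_000 // 10,000,000 points at an assumed 16 bytes each
	got, elapsed, err := StreamLoopback(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read %d bytes in %v (%.0f MB/s)\n", got, elapsed, float64(got)/1e6/elapsed.Seconds())
}
```

A single run like this is noisy, so averaging several runs, as was done for gRPC, gives a more stable figure.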