The following benchmarks measure reading 10,000,000 points via TCP, yamux and gRPC. They were performed using a VM and host machine on the same physical hardware; `iperf` reported about 390 MB/s as the maximum throughput between the machines. `influxd` ran on the VM, the client on the host. All protocols used the same protobuf message (to provide framing) with an embedded `[]byte` array to serialize batches of points – large arrays of point structures are simply too slow.

The underlying storage engine cursor can read 10,000,000 points in about 230ms, so the overhead for each protocol is as follows:

```
TCP   → 470ms
yamux → 620ms
gRPC  → 970ms
```

Maximum transfer rates are therefore:

```
TCP   → 340 MB/s or ~21e6 points / sec
yamux → 258 MB/s or ~16e6 points / sec
gRPC  → 164 MB/s or ~10e6 points / sec
```

It is worth noting that I have not tested Go's network libraries to determine their maximum throughput; however, I suspect it is close to the TCP maximum. While we will benchmark using independent machines in AWS, these tests helped me understand the relative performance of the various transports and the impact different serialization mechanisms have on our throughput. Protobuf is OK as long as we keep the message graph small, meaning we customize the serialization of the points.

---

As a comparison, I also tested the client and server on localhost to compare the protocols without the network stack overhead. gRPC was very inconsistent, varying anywhere from 463ms to 793ms, so its result is the average of a number of runs.

Overhead:

```
TCP   → 95ms
yamux → 108ms
gRPC  → 441ms
```

These numbers bring TCP and yamux to within about 10% of each other. The majority of the difference between TCP and yamux is due to the additional frames yamux sends to manage flow control, which add latency. If that overhead is a concern, we may need to tune the flow control algorithm.
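
To illustrate the "embedded `[]byte` batch" approach, here is a minimal sketch of packing points into a single byte slice that would then be carried in one `bytes` field of the protobuf framing message, rather than as a repeated message type. The `Point` layout is my assumption (an 8-byte timestamp plus an 8-byte value, i.e. 16 bytes per point, which is roughly consistent with the ~160 MB moved for 10,000,000 points); the real encoding may differ.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Point is a hypothetical minimal point: an 8-byte timestamp and an
// 8-byte float value, so 16 bytes per point on the wire.
type Point struct {
	Timestamp int64
	Value     float64
}

// encodeBatch packs a batch of points into a single []byte. The result
// would be placed in one `bytes` field of the protobuf framing message.
func encodeBatch(points []Point) []byte {
	buf := make([]byte, 0, len(points)*16)
	var scratch [16]byte
	for _, p := range points {
		binary.LittleEndian.PutUint64(scratch[0:8], uint64(p.Timestamp))
		binary.LittleEndian.PutUint64(scratch[8:16], math.Float64bits(p.Value))
		buf = append(buf, scratch[:]...)
	}
	return buf
}

// decodeBatch is the mirror image on the receiving side.
func decodeBatch(buf []byte) []Point {
	points := make([]Point, 0, len(buf)/16)
	for i := 0; i+16 <= len(buf); i += 16 {
		points = append(points, Point{
			Timestamp: int64(binary.LittleEndian.Uint64(buf[i : i+8])),
			Value:     math.Float64frombits(binary.LittleEndian.Uint64(buf[i+8 : i+16])),
		})
	}
	return points
}

func main() {
	batch := encodeBatch([]Point{{Timestamp: 1, Value: 0.5}, {Timestamp: 2, Value: 1.5}})
	fmt.Println(len(batch), decodeBatch(batch))
}
```

This keeps the protobuf graph at a single message with one `bytes` field regardless of batch size, which is what "keep the graph small" amounts to in practice.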
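
On the flow-control point, and assuming the transport is `hashicorp/yamux`, the knob to experiment with is `Config.MaxStreamWindowSize`: a larger per-stream receive window means fewer window-update frames for a large one-directional transfer of point batches. A minimal client-side sketch (the address and the 16 MB window are illustrative values, not something I have measured):

```go
package main

import (
	"log"
	"net"

	"github.com/hashicorp/yamux"
)

func main() {
	// Dial the raw TCP connection; the address is illustrative.
	conn, err := net.Dial("tcp", "influxd-host:8082")
	if err != nil {
		log.Fatal(err)
	}

	// Start from the yamux defaults and raise the per-stream receive window
	// so fewer window-update frames are exchanged while streaming batches.
	cfg := yamux.DefaultConfig()
	cfg.MaxStreamWindowSize = 16 * 1024 * 1024

	session, err := yamux.Client(conn, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	stream, err := session.OpenStream()
	if err != nil {
		log.Fatal(err)
	}
	defer stream.Close()

	// ... read point batches from stream ...
}
```

The corresponding config would be passed to `yamux.Server` on the `influxd` side; whether this actually closes the remaining ~10% gap to raw TCP would need to be measured.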