This can happen, though it is very unlikely. If this node receives encoded
data to be written to disk, and is queried for that data before its
metastore is updated, there will be no field mapping for the data during
decode. This is possible because data is encoded by the node that first
received the write request, not the node that actually writes the data to
disk. So if this happens, skip the data.
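A minimal sketch of the skip-on-missing-mapping behavior; the `Metastore`, `Codec`, and `readPoints` names are illustrative assumptions, not the actual API:

```go
package shard

import "log"

// Point is an illustrative decoded data point.
type Point struct {
	Fields map[string]float64
}

// Codec decodes raw bytes into a Point; it stands in for the
// per-measurement field codec kept in the metastore.
type Codec interface {
	Decode(b []byte) Point
}

// Metastore looks up the codec for a measurement; it returns nil when
// the field mapping has not yet been replicated to this node.
type Metastore interface {
	FieldCodec(measurement string) Codec
}

// readPoints decodes raw values, skipping them when the field mapping
// is missing because the local metastore lags the node that encoded
// the write.
func readPoints(ms Metastore, measurement string, encoded [][]byte) []Point {
	codec := ms.FieldCodec(measurement)
	if codec == nil {
		// The data was encoded by the node that first received the
		// write; until our metastore update lands we cannot decode
		// it, so skip it rather than fail the query.
		log.Printf("no field mapping for %q; skipping %d values", measurement, len(encoded))
		return nil
	}
	points := make([]Point, 0, len(encoded))
	for _, b := range encoded {
		points = append(points, codec.Decode(b))
	}
	return points
}
```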
This pull request adds recovery to messaging.Topic when opening. If
any partial messages are found, the file is truncated at that point
and writing resumes from there. Partial messages can occur when a
server shuts down ungracefully, leaving half-written messages at the
end of segments.
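A hedged sketch of the recovery step on open, assuming a hypothetical `decodeOne` message decoder that fails on a partial trailing message (the real messaging.Topic internals differ):

```go
package topic

import (
	"io"
	"os"
)

// recoverSegment scans a segment file for complete messages and
// truncates any trailing partial message left by an ungraceful
// shutdown. decodeOne is a hypothetical stand-in for the topic's
// message decoder; it returns an error on a partial message.
func recoverSegment(path string, decodeOne func(r io.Reader) error) error {
	f, err := os.OpenFile(path, os.O_RDWR, 0666)
	if err != nil {
		return err
	}
	defer f.Close()

	var valid int64 // offset just past the last fully decoded message
	for {
		if err := decodeOne(f); err == io.EOF {
			break // clean end of file; nothing to truncate
		} else if err != nil {
			// Half-written trailing message: truncate it away so the
			// topic can resume writing from the last valid offset.
			return f.Truncate(valid)
		}
		if valid, err = f.Seek(0, io.SeekCurrent); err != nil {
			return err
		}
	}
	return nil
}
```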
When a data node starts up, the broker URLs were not set before
they were used. The call to client.Open() triggers the raft
streamer and heartbeat, which try to connect to the broker. If
those started before the subsequent client.SetURLs() call, you
would see the following error in the logs at startup:
[messaging] 2015/04/01 11:59:22 reconnecting to broker: url={ <nil> /messaging/messages index=2&streaming=true&topicID=0 }, err=Get /messaging/messages?index=2&streaming=true&topicID=0: unsupported protocol scheme ""
Fixing this race uncovered another bug where the join URLs would be
cleared the first time the broker was started. In this case, the
join URLs should be left alone, since they were already set properly
via SetURLs.
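A hedged sketch of the ordering fix, with a hypothetical client interface standing in for the messaging client: the URLs must be in place before Open() starts the streamer and heartbeat goroutines.

```go
package node

import "net/url"

// Client is an illustrative stand-in for the messaging client. Open
// starts background goroutines (raft streamer, heartbeat) that dial
// whatever broker URLs are configured at that moment.
type Client interface {
	SetURLs(urls []url.URL)
	Open() error
}

// startDataNode shows the corrected startup order: set the broker
// URLs first, then open the client, so the streamer and heartbeat
// never dial an empty URL and fail with
// `unsupported protocol scheme ""`.
func startDataNode(c Client, brokerURLs []url.URL) error {
	c.SetURLs(brokerURLs) // must happen before Open()
	return c.Open()
}
```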
Fixes #2152
This is a prerequisite for #1934. When running separate
broker and data nodes, you currently need to know which role
a host is performing. This complicates cluster setup: you
must configure separate broker URLs and data node URLs.
This change allows a broker-only node to redirect data node endpoints
to a valid data node, and a data-only node to redirect broker
endpoints to a valid broker.
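A minimal sketch of the redirect behavior, assuming a hypothetical handler that knows one valid peer URL for the other role (not the actual InfluxDB handler):

```go
package httpd

import (
	"net/http"
	"net/url"
)

// redirectTo returns a handler that forwards requests for endpoints
// this node does not serve (for example, data node endpoints on a
// broker-only node) to a peer that does, preserving path and query.
func redirectTo(peer url.URL) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u := peer
		u.Path = r.URL.Path
		u.RawQuery = r.URL.RawQuery
		// 307 preserves the method and body so writes survive the hop.
		http.Redirect(w, r, u.String(), http.StatusTemporaryRedirect)
	})
}
```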
Refactored the query engine to use a different processing pipeline
for raw queries. This lets queries with a large offset avoid keeping
everything in memory, and it means queries against raw data with a
limit will only process up to that limit and then bail out.

Raw data queries will only read up to a certain point in the map
phase before yielding to the engine for further processing.
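A rough sketch of the idea, under the assumption of a chunked, limit-aware map phase; the `point`, `next`, and `mapRaw` names are illustrative, not the engine's real API:

```go
package engine

// point is an illustrative raw data point.
type point struct {
	Time  int64
	Value float64
}

// mapRaw is a sketch of a limit-aware map phase: it reads at most
// chunkSize points per call and stops for good once limit points
// have been emitted, instead of materializing the whole series in
// memory. next stands in for the underlying series cursor.
func mapRaw(next func() (point, bool), limit, chunkSize int, emitted *int) []point {
	out := make([]point, 0, chunkSize)
	for len(out) < chunkSize && *emitted < limit {
		p, ok := next()
		if !ok {
			break // cursor exhausted
		}
		out = append(out, p)
		*emitted++
	}
	// The chunk is yielded to the engine for further processing; the
	// engine calls mapRaw again only while emitted < limit, so no
	// work is done past the LIMIT.
	return out
}
```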
Fixes #2029 and fixes #2030