Update readme with pointers to influxdb.org site. Move design stuff over to a notes page
parent
5eae652bf5
commit
dfcc083efa
83
README.md
83
README.md
|
@ -1,83 +1,10 @@
|
|||
chronosdb [![Build Status](https://travis-ci.org/influxdb/influxdb.png?branch=master)](https://travis-ci.org/influxdb/influxdb)
|
||||
InfluxDB [![Build Status](https://travis-ci.org/influxdb/influxdb.png?branch=master)](https://travis-ci.org/influxdb/influxdb)
|
||||
=========
|
||||
|
||||
Scalable datastore for metrics, events, and real-time analytics
|
||||
InfluxDB is an open source distributed time series database tha has no external dependencies. It's useful for metrics, events, and analytics with a built in HTTP API so you don't have to write any server side code to get up and running. InfluxDB is designed to answer queries in real-time. That means every data point is indexed as it comes in and is immediately available in queries that should return in < 100ms. It's designed to be scalabe, simple to install and manage, and fast to get data in and out.
|
||||
|
||||
Requirements
|
||||
------------
|
||||
Read an [overview of the design goals and reasons for the project](http://influxdb.org/overview/).
|
||||
|
||||
* horizontal scalable
|
||||
* http interface
|
||||
* udp interface (low priority)
|
||||
* persistent
|
||||
* metadata for time series
|
||||
* perform functions quickly (count, unique, sum, etc.)
|
||||
* group by time intervals (e.g. count ticks every 5 minutes)
|
||||
* joining multiple time series to generate new timeseries
|
||||
* schema-less
|
||||
* sql-like query language
|
||||
* support multiple databases with read/write api key
|
||||
* single time series should scale horizontally (no hot spots)
|
||||
* dynamic cluster changes and data balancing
|
||||
* pubsub layer
|
||||
* continuous queries (keep connection open and return new points as they arrive)
|
||||
* Delete ranges of points from any number of timeseries (that should reflect in disk space usage)
|
||||
* querying should support one or more timeseries (possibly with regex to match on)
|
||||
Check out the [getting started guide](http://influxdb.org/docs/) to read about how to install InfluxDB, start writing data, and issuing queries in just a few minutes.
|
||||
|
||||
New Requirements
|
||||
----------------
|
||||
* Easy to backup and restore
|
||||
* Large time range queries with one column ?
|
||||
* Optimize for HDD access ?
|
||||
* What are the common use cases that we should optimize for ?
|
||||
|
||||
Modules
|
||||
-------
|
||||
|
||||
|
||||
+--------------------+ +--------------------+
|
||||
| | | |
|
||||
| WebConsole/docs | | Http API |
|
||||
| | | |
|
||||
+------------------+-+ +-+------------------+
|
||||
| |
|
||||
| |
|
||||
+-----+-------+-----------+
|
||||
| |
|
||||
| Lang. Bindings |
|
||||
| |
|
||||
+-----------------+ |
|
||||
| | |
|
||||
| Query Engine | |
|
||||
| | |
|
||||
+-----------------+-------+
|
||||
| |
|
||||
+----+ Coordinator (consensus) +-----+
|
||||
| | | |
|
||||
| +-------------------------+ |
|
||||
| |
|
||||
| |
|
||||
+--------+-----------+ +-------+------------+
|
||||
| | | |
|
||||
| Storage Engine | | Storage Engine |
|
||||
| | | |
|
||||
+--------+-----------+ +-------+------------+
|
||||
|
||||
Replication & Concensus Notes
|
||||
-----------------------------
|
||||
|
||||
Single raft cluster for which machines are in cluster and who owns which locations.
|
||||
1. When a write comes into a server, figure out which machine owns the data, proxy out to that.
|
||||
2. The machine proxies to the server, which assigns a sequence number
|
||||
3. Each machine in the cluster asks the other machines that own hash ring locations what their latest sequence number is every 10 seconds (this is read repair)
|
||||
|
||||
For example, take machines A, B, and C. Say B and C own ring location #2. If a write comes into A it will look up the configuration and pick B or C at random to proxy the write to. Say it goes to B. B assigns a sequence number of 1. It keeps a log for B2 of the writes. It will also keep a log for C2's writes. It then tries to write #1 to C.
|
||||
|
||||
If the write is marked as a quorum write, then B won't return a success to A until the data has been written to both B and C. Every so often both B and C will ask each other what their latest writes are.
|
||||
|
||||
Taking the example further, if we had server D that also owned ring location 2. B would ask C for writes to C2. If C is down it will ask D for writes to C2. This will ensure that if C fails no data will be lost.
|
||||
|
||||
Coding Style
|
||||
------------
|
||||
|
||||
1. Public functions should be at the top of the file, followed by a comment `// private functions` and all private functions.
|
||||
See the [list of libraries for different langauges](http://influxdb.org/docs/libraries/javascript.html). Or see the [HTTP API documentation to start writing a library for your favorite language](http://influxdb.org/docs/api/http.html).
|
||||
|
|
|
@ -0,0 +1,82 @@
|
|||
Just some notes about requirements, design, and clustering.
|
||||
|
||||
Scalable datastore for metrics, events, and real-time analytics
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
* horizontally scalable
|
||||
* http interface
|
||||
* udp interface (low priority)
|
||||
* persistent
|
||||
* metadata for time series (low priority)
|
||||
* perform functions quickly (count, unique, sum, etc.)
|
||||
* group by time intervals (e.g. count ticks every 5 minutes)
|
||||
* joining multiple time series to generate new timeseries
|
||||
* schema-less
|
||||
* sql-like query language
|
||||
* support multiple databases with authentication
|
||||
* single time series should scale horizontally (no hot spots)
|
||||
* dynamic cluster changes and data balancing
|
||||
* pubsub layer
|
||||
* continuous queries (keep connection open and return new points as they arrive)
|
||||
* Delete ranges of points from any number of timeseries (that should reflect in disk space usage)
|
||||
* querying should support one or more timeseries (possibly with regex to match on)
|
||||
|
||||
New Requirements
|
||||
----------------
|
||||
* Easy to backup and restore
|
||||
* Large time range queries with one column ?
|
||||
* Optimize for HDD access ?
|
||||
* What are the common use cases that we should optimize for ?
|
||||
|
||||
Modules
|
||||
-------
|
||||
|
||||
|
||||
+--------------------+ +--------------------+
|
||||
| | | |
|
||||
| WebConsole/docs | | Http API |
|
||||
| | | |
|
||||
+------------------+-+ +-+------------------+
|
||||
| |
|
||||
| |
|
||||
+-----+-------+-----------+
|
||||
| |
|
||||
| Lang. Bindings |
|
||||
| |
|
||||
+-----------------+ |
|
||||
| | |
|
||||
| Query Engine | |
|
||||
| | |
|
||||
+-----------------+-------+
|
||||
| |
|
||||
+----+ Coordinator (consensus) +-----+
|
||||
| | | |
|
||||
| +-------------------------+ |
|
||||
| |
|
||||
| |
|
||||
+--------+-----------+ +-------+------------+
|
||||
| | | |
|
||||
| Storage Engine | | Storage Engine |
|
||||
| | | |
|
||||
+--------+-----------+ +-------+------------+
|
||||
|
||||
Replication & Concensus Notes
|
||||
-----------------------------
|
||||
|
||||
Single raft cluster for which machines are in cluster and who owns which locations.
|
||||
1. When a write comes into a server, figure out which machine owns the data, proxy out to that.
|
||||
2. The machine proxies to the server, which assigns a sequence number
|
||||
3. Each machine in the cluster asks the other machines that own hash ring locations what their latest sequence number is every 10 seconds (this is read repair)
|
||||
|
||||
For example, take machines A, B, and C. Say B and C own ring location #2. If a write comes into A it will look up the configuration and pick B or C at random to proxy the write to. Say it goes to B. B assigns a sequence number of 1. It keeps a log for B2 of the writes. It will also keep a log for C2's writes. It then tries to write #1 to C.
|
||||
|
||||
If the write is marked as a quorum write, then B won't return a success to A until the data has been written to both B and C. Every so often both B and C will ask each other what their latest writes are.
|
||||
|
||||
Taking the example further, if we had server D that also owned ring location 2. B would ask C for writes to C2. If C is down it will ask D for writes to C2. This will ensure that if C fails no data will be lost.
|
||||
|
||||
Coding Style
|
||||
------------
|
||||
|
||||
1. Public functions should be at the top of the file, followed by a comment `// private functions` and all private functions.
|
Loading…
Reference in New Issue