Update readme with pointers to influxdb.org site. Move design stuff over to a notes page

2013-10-31 13:36:56 -04:00 · 2013-10-31 13:36:56 -04:00 · dfcc083efa
parent 5eae652bf5
commit dfcc083efa
2 changed files with 87 additions and 78 deletions
--- a/README.md
+++ b/README.md
@@ -1,83 +1,10 @@
-chronosdb  [![Build Status](https://travis-ci.org/influxdb/influxdb.png?branch=master)](https://travis-ci.org/influxdb/influxdb)
+InfluxDB [![Build Status](https://travis-ci.org/influxdb/influxdb.png?branch=master)](https://travis-ci.org/influxdb/influxdb)
 =========

-Scalable datastore for metrics, events, and real-time analytics
+InfluxDB is an open source distributed time series database that has no external dependencies. It's useful for metrics, events, and analytics, with a built-in HTTP API so you don't have to write any server-side code to get up and running. InfluxDB is designed to answer queries in real time. That means every data point is indexed as it comes in and is immediately available in queries that should return in < 100ms. It's designed to be scalable, simple to install and manage, and fast to get data in and out.

-Requirements
------------
+Read an [overview of the design goals and reasons for the project](http://influxdb.org/overview/).

-* horizontal scalable
-* http interface
-* udp interface (low priority)
-* persistent
-* metadata for time series
-* perform functions quickly (count, unique, sum, etc.)
-* group by time intervals (e.g. count ticks every 5 minutes)
-* joining multiple time series to generate new timeseries
-* schema-less
-* sql-like query language
-* support multiple databases with read/write api key
-* single time series should scale horizontally (no hot spots)
-* dynamic cluster changes and data balancing
-* pubsub layer
-* continuous queries (keep connection open and return new points as they arrive)
-* Delete ranges of points from any number of timeseries (that should reflect in disk space usage)
-* querying should support one or more timeseries (possibly with regex to match on)
+Check out the [getting started guide](http://influxdb.org/docs/) to read about how to install InfluxDB, start writing data, and issue queries in just a few minutes.

-New Requirements
----------------
-* Easy to backup and restore
-* Large time range queries with one column ?
-* Optimize for HDD access ?
-* What are the common use cases that we should optimize for ?
-
-Modules
-------
-
-
-           +--------------------+   +--------------------+
-           |                    |   |                    |
-           |  WebConsole/docs   |   |      Http API      |
-           |                    |   |                    |
-           +------------------+-+   +-+------------------+
-                              |       |
-                              |       |
-                        +-----+-------+-----------+
-                        |                         |
-                        |  Lang. Bindings         |
-                        |                         |
-                        +-----------------+       |
-                        |                 |       |
-                        |   Query Engine  |       |
-                        |                 |       |
-                        +-----------------+-------+
-                        |                         |
-                   +----+ Coordinator (consensus) +-----+
-                   |    |                         |     |
-                   |    +-------------------------+     |
-                   |                                    |
-                   |                                    |
-          +--------+-----------+                +-------+------------+
-          |                    |                |                    |
-          |   Storage Engine   |                |   Storage Engine   |
-          |                    |                |                    |
-          +--------+-----------+                +-------+------------+
-
-Replication & Consensus Notes
-----------------------------
-
-A single Raft cluster tracks which machines are in the cluster and which machine owns which hash ring locations.
-1. When a write comes into a server, it figures out which machine owns the data and proxies the write to that machine.
-2. The owning server assigns the write a sequence number.
-3. Every 10 seconds, each machine asks the other machines that own the same hash ring locations for their latest sequence number (this is read repair).
-
-For example, take machines A, B, and C. Say B and C own ring location #2. If a write comes into A it will look up the configuration and pick B or C at random to proxy the write to. Say it goes to B. B assigns a sequence number of 1. It keeps a log for B2 of the writes. It will also keep a log for C2's writes. It then tries to write #1 to C.
-
-If the write is marked as a quorum write, then B won't return a success to A until the data has been written to both B and C. Every so often both B and C will ask each other what their latest writes are.
-
-Taking the example further, suppose we had a server D that also owned ring location 2. B would ask C for writes to C2. If C is down, it will ask D for writes to C2. This ensures that no data is lost if C fails.
-
-Coding Style
------------
-
-1. Public functions should be at the top of the file, followed by a comment `// private functions` and all private functions.
+See the [list of libraries for different languages](http://influxdb.org/docs/libraries/javascript.html). Or see the [HTTP API documentation to start writing a library for your favorite language](http://influxdb.org/docs/api/http.html).
--- a/design_notes.md
+++ b/design_notes.md
@@ -0,0 +1,82 @@
+Just some notes about requirements, design, and clustering.
+
+Scalable datastore for metrics, events, and real-time analytics
+
+Requirements
+------------
+
+* horizontally scalable
+* http interface
+* udp interface (low priority)
+* persistent
+* metadata for time series (low priority)
+* perform functions quickly (count, unique, sum, etc.)
+* group by time intervals (e.g. count ticks every 5 minutes)
+* joining multiple time series to generate new timeseries
+* schema-less
+* sql-like query language
+* support multiple databases with authentication
+* single time series should scale horizontally (no hot spots)
+* dynamic cluster changes and data balancing
+* pubsub layer
+* continuous queries (keep connection open and return new points as they arrive)
+* Delete ranges of points from any number of timeseries (that should reflect in disk space usage)
+* querying should support one or more timeseries (possibly with regex to match on)
+
+New Requirements
+----------------
+* Easy to backup and restore
+* Large time range queries with one column ?
+* Optimize for HDD access ?
+* What are the common use cases that we should optimize for ?
+
+Modules
+-------
+
+
+           +--------------------+   +--------------------+
+           |                    |   |                    |
+           |  WebConsole/docs   |   |      Http API      |
+           |                    |   |                    |
+           +------------------+-+   +-+------------------+
+                              |       |
+                              |       |
+                        +-----+-------+-----------+
+                        |                         |
+                        |  Lang. Bindings         |
+                        |                         |
+                        +-----------------+       |
+                        |                 |       |
+                        |   Query Engine  |       |
+                        |                 |       |
+                        +-----------------+-------+
+                        |                         |
+                   +----+ Coordinator (consensus) +-----+
+                   |    |                         |     |
+                   |    +-------------------------+     |
+                   |                                    |
+                   |                                    |
+          +--------+-----------+                +-------+------------+
+          |                    |                |                    |
+          |   Storage Engine   |                |   Storage Engine   |
+          |                    |                |                    |
+          +--------+-----------+                +-------+------------+
+
+Replication & Consensus Notes
+-----------------------------
+
+A single Raft cluster tracks which machines are in the cluster and which machine owns which hash ring locations.
+1. When a write comes into a server, it figures out which machine owns the data and proxies the write to that machine.
+2. The owning server assigns the write a sequence number.
+3. Every 10 seconds, each machine asks the other machines that own the same hash ring locations for their latest sequence number (this is read repair).
+
+For example, take machines A, B, and C. Say B and C own ring location #2. If a write comes into A it will look up the configuration and pick B or C at random to proxy the write to. Say it goes to B. B assigns a sequence number of 1. It keeps a log for B2 of the writes. It will also keep a log for C2's writes. It then tries to write #1 to C.
+
+If the write is marked as a quorum write, then B won't return a success to A until the data has been written to both B and C. Every so often both B and C will ask each other what their latest writes are.
+
+Taking the example further, suppose we had a server D that also owned ring location 2. B would ask C for writes to C2. If C is down, it will ask D for writes to C2. This ensures that no data is lost if C fails.
+
+Coding Style
+------------
+
+1. Public functions should be at the top of the file, followed by a comment `// private functions` and all private functions.
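
The write path and read repair described in the replication notes of `design_notes.md` can be sketched roughly as follows. This is only an illustration under stated assumptions, not InfluxDB code: the names `Node`, `Cluster`, `write`, and `read_repair` are hypothetical, a plain dict stands in for the Raft-managed ownership configuration, and "quorum" is taken to mean a majority of owners (which, with two owners, matches the "both B and C" case in the notes).

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical server. Logs are keyed by (sequencing server, ring location)."""
    name: str
    up: bool = True
    logs: dict = field(default_factory=dict)

    def append(self, origin, location, seq, value):
        self.logs.setdefault((origin, location), []).append((seq, value))

    def latest_seq(self, origin, location):
        entries = self.logs.get((origin, location), [])
        return max((seq for seq, _ in entries), default=0)


class Cluster:
    def __init__(self, owners):
        # owners: ring location -> list of Node (assumed stand-in for the Raft config)
        self.owners = owners

    def write(self, location, value, quorum=False, primary=None):
        """Proxy a write to one owner, which sequences it and forwards it to the others."""
        owners = self.owners[location]
        if primary is None:
            live = [n for n in owners if n.up]
            if not live:
                raise RuntimeError("no live owner for this ring location")
            primary = random.choice(live)  # "pick B or C at random"
        seq = primary.latest_seq(primary.name, location) + 1
        primary.append(primary.name, location, seq, value)
        acked = 1
        for replica in owners:
            if replica is not primary and replica.up:
                replica.append(primary.name, location, seq, value)
                acked += 1
        # Assumption: a quorum write needs a majority of owners to acknowledge.
        if quorum and acked < len(owners) // 2 + 1:
            raise RuntimeError("quorum write not acknowledged")
        return primary.name, seq

    def read_repair(self, location):
        """Each live owner pulls the entries it is missing from every other owner's log."""
        owners = self.owners[location]
        for node in owners:
            if not node.up:
                continue
            for origin in owners:
                if origin is node:
                    continue
                # Ask the origin first; if it is down, ask another owner holding a copy.
                fallbacks = [o for o in owners if o is not origin and o is not node]
                source = next((s for s in [origin] + fallbacks if s.up), None)
                if source is None:
                    continue
                have = node.latest_seq(origin.name, location)
                for seq, value in sorted(source.logs.get((origin.name, location), [])):
                    if seq > have:
                        node.append(origin.name, location, seq, value)


if __name__ == "__main__":
    b, c = Node("B"), Node("C")
    cluster = Cluster({2: [b, c]})
    print(cluster.write(2, "cpu=0.5", quorum=True))
```

In the notes the repair runs every 10 seconds; here it is a plain method so the caller decides when to invoke it. A failed quorum write in this sketch leaves the entry in the sequencing server's log, which a real implementation would have to handle explicitly.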