docs: make figure 1 smaller

pull/24376/head
Nga Tran 2021-12-13 14:22:02 -05:00
parent 5b920fb877
commit 6e44daabdc
1 changed files with 39 additions and 41 deletions


@@ -5,53 +5,51 @@
## Data Organization
`IOx Server` is a database management system (DBMS) whose data organization is illustrated in Figure 1. An IOx server can hold isolated datasets from one or many organizations or users, each represented by a `database`. For example, the IOx Server in Figure 1 consists of `p` databases. Each database can have as many `tables` as needed. The data of each table can be split into many `partitions` based on a specified partition key, which is an expression over the table's column(s). In Figure 1, `Table 1` is partitioned by date, which is an expression on a time column of `Table 1`. Partition data can be further split into many chunks depending on the table's flow of ingested data, which is described in the next section, Data Life Cycle. Each chunk contains a subset of the rows of a table partition on a subset of the table's columns. For example, `Chunk 1` has 2 rows of data on columns `col1`, `col2`, and `col3`, while `Chunk 2` includes 3 rows on `col1` and `col4`. Since chunks can contain data for the same or different columns, each chunk has its own `schema` defined with it. `Chunk 1`'s schema is {`col1`, `col2`, `col3`} (and their corresponding data types) and `Chunk 2`'s schema is {`col1`, `col4`}. A column name that appears in more than one chunk, such as `col1`, refers to the same column of the table and must have the same data type in every chunk.
```text
                                                               ┌───────────────┐                            IOx Server
                                                               │  IOx Server   │
                                                               └───────────────┘
                                                  ┌────────────────────┼────────────────────────┐
                                                  │                    │                        │
                                                  ▼                    ▼                        ▼
                                          ┌───────────────┐                             ┌───────────────┐
                                          │  Database 1   │           ...               │  Database p   │   Databases
                                          └───────────────┘                             └───────────────┘
                             ┌────────────────────┼────────────────────┐
                             │                    │                    │
                             ▼                    ▼                    ▼
                     ┌───────────────┐                         ┌───────────────┐                            Tables
                     │    Table 1    │           ...           │    Table n    │
                     └───────────────┘                         └───────────────┘
                             │                                         │
       ┌─────────────────────┼────────────────────┐                    │
       │                     │                    │                    │
       ▼                     ▼                    ▼                    ▼
┌────────────┐                             ┌────────────┐                                                   Partitions
│Partition 1 │              ...            │Partition m │             ...
│(2021-12-10)│                             │(2021-12-20)│
└────────────┘                             └────────────┘
       │                                          │
       │                         ┌────────────────┼────────────────┐
       │                         │                │                │
       ▼                         ▼                ▼                ▼
                          ┌──────────────┐                   ┌───────────┐                                  Chunks
      ...                 │   Chunk 1    │       ...         │  Chunk 2  │
                          │              │                   │           │
                          │col1 col2 col3│                   │ col1 col4 │
                          │---- ---- ----│                   │ ---- ---- │
                          │---- ---- ----│                   │ ---- ---- │
                          └──────────────┘                   │ ---- ---- │
                                                             └───────────┘
```text
                                                            ┌───────────┐
                                                            │IOx Server │                   IOx Server
                                                            └───────────┘
                                                   ┌──────────────┼────────────────┐
                                                   ▼              ▼                ▼
                                             ┌───────────┐                  ┌────────────┐
                                             │Database 1 │       ...        │ Database p │  Databases
                                             └───────────┘                  └────────────┘
                                    ┌──────────────┼─────────────┐
                                    ▼              ▼             ▼
                              ┌──────────┐                 ┌──────────┐
                              │ Table 1  │        ...      │ Table n  │                     Tables
                              └──────────┘                 └──────────┘
                                    │                            │
                     ┌──────────────┼──────────────┐             │
                     ▼              ▼              ▼             ▼
              ┌────────────┐                ┌────────────┐
              │Partition 1 │       ...      │Partition m │      ...                         Partitions
              │(2021-12-10)│                │(2021-12-20)│
              └────────────┘                └──────┬─────┘
                     │                             │
       ┌─────────────┼─────────────┐               │
       ▼             ▼             ▼               ▼
┌──────────────┐             ┌───────────┐
│   Chunk 1    │    ...      │  Chunk 2  │        ...                                       Chunks
│              │             │           │
│col1 col2 col3│             │ col1 col4 │
│---- ---- ----│             │ ---- ---- │
│---- ---- ----│             │ ---- ---- │
└──────────────┘             │ ---- ---- │
                             └───────────┘
Figure 1: Data organization in an IOx Server
```
A chunk is considered the smallest unit of data in IOx and is the central topic of the rest of this document. Chunks can include duplicate rows, identified by their table's primary key. Duplicate rows are deduplicated at Query and Compaction time[^1].
[^1]: Duplication and deduplication are part of a large topic that is outside the scope of this document.
A chunk is considered the smallest unit of data in IOx and is the central topic of the rest of this document. Chunks can include rows with duplicate table primary keys, which are deduplicated at Query and Compaction time[^dup]. IOx does not (yet) support direct data modification but does allow deletion[^del], which means a modification can be done through a deletion followed by an ingestion.
[^dup]: `Duplication` and `deduplication` are part of a large topic that is outside the scope of this document.
[^del]: `Deletion` is another large topic that deserves its own document.
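
The hierarchy described in this section can be sketched as a handful of Rust types. This is only an illustrative model and not IOx's actual implementation: the type names, the simplified `DataType` enum, the `merged_schema` helper, and the data types assigned to `col1` through `col4` are all assumptions made for the example. It shows each chunk carrying its own schema, plus a check that a column name shared by several chunks has the same data type in each.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified set of column data types (not IOx's real type system).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataType {
    Integer,
    Float,
    Text,
    Timestamp,
}

/// A chunk's schema: column name -> data type.
pub type Schema = BTreeMap<String, DataType>;

/// A chunk holds a subset of a partition's rows on a subset of the table's columns.
#[derive(Debug)]
pub struct Chunk {
    pub id: u32,
    pub schema: Schema,
    pub row_count: usize,
}

/// A partition groups the chunks that share one partition key value.
#[derive(Debug)]
pub struct Partition {
    pub key: String,
    pub chunks: Vec<Chunk>,
}

impl Partition {
    /// Union of all chunk schemas. Returns an error if a column name
    /// appears in two chunks with different data types.
    pub fn merged_schema(&self) -> Result<Schema, String> {
        let mut merged = Schema::new();
        for chunk in &self.chunks {
            for (name, dtype) in &chunk.schema {
                // Insert the column if it is new; otherwise read the type already recorded.
                let existing = *merged.entry(name.clone()).or_insert(*dtype);
                if existing != *dtype {
                    return Err(format!(
                        "column `{}` has conflicting types: {:?} vs {:?}",
                        name, existing, dtype
                    ));
                }
            }
        }
        Ok(merged)
    }
}

/// Build a schema from (name, type) pairs.
fn schema(cols: &[(&str, DataType)]) -> Schema {
    cols.iter().map(|(n, t)| (n.to_string(), *t)).collect()
}

/// The two chunks drawn in Figure 1. The column data types are made up for this example.
pub fn figure1_partition() -> Partition {
    Partition {
        key: "2021-12-10".to_string(),
        chunks: vec![
            Chunk {
                id: 1,
                schema: schema(&[
                    ("col1", DataType::Integer),
                    ("col2", DataType::Float),
                    ("col3", DataType::Text),
                ]),
                row_count: 2,
            },
            Chunk {
                id: 2,
                schema: schema(&[("col1", DataType::Integer), ("col4", DataType::Timestamp)]),
                row_count: 3,
            },
        ],
    }
}
```

Calling `figure1_partition().merged_schema()` yields the four columns `col1` through `col4`. If `col1` in `Chunk 2` were given a different type from `col1` in `Chunk 1`, the call would return an error instead.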
### Chunk Types
### Chunk Stages
## Data Life Cycle