[skip ci]Format markdown for milvus_create_collection_en.md (#10332)

Signed-off-by: ruiyi.jiang <ruiyi.jiang@zilliz.com>
pull/10344/head
ryjiang 2021-10-21 10:49:08 +08:00 committed by GitHub
parent b099179ac0
commit c34391f378
1 changed files with 21 additions and 14 deletions

@ -2,12 +2,12 @@
`Milvus 2.0` uses `Collection` to represent a set of data, like a `Table` in a traditional database. Users can create or drop a `Collection`. Altering the `Schema` of a `Collection` is not supported yet. This article introduces the execution path of `CreateCollection`. By the end of it, you should know which components are involved in `CreateCollection`.
The execution flow of `CreateCollection` is shown in the following figure:
![create_collection](./graphs/dml_create_collection.png)
1. First, the `SDK` sends a `CreateCollection` request to the `Proxy` via `gRPC`. The `proto` is defined as follows:
```proto
service MilvusService {
...
@ -23,9 +23,9 @@ message CreateCollectionRequest {
// Not useful for now
string db_name = 2;
// The unique collection name in milvus.(Required)
string collection_name = 3;
// The serialized `schema.CollectionSchema`(Required)
bytes schema = 4;
// Once set, no modification is allowed (Optional)
// https://github.com/milvus-io/milvus/issues/6690
int32 shards_num = 5;
@ -41,6 +41,7 @@ message CollectionSchema {
```
2. On receiving the `CreateCollection` request, the `Proxy` wraps it into a `CreateCollectionTask` and pushes the task into the `DdTaskQueue`. The `Proxy` then calls the `WaitToFinish` method to wait until the task is finished.
```go
type task interface {
TraceCtx() context.Context
@ -70,20 +71,24 @@ type createCollectionTask struct {
```
3. A background service in the `Proxy` takes the `CreateCollectionTask` from the `DdTaskQueue` and executes it in three phases.
- `PreExecute`: performs static checks, such as whether the `Collection Name` and `Field Name` are legal and whether there are duplicate columns.
- `Execute`: the `Proxy` sends a `CreateCollection` request to `RootCoord` via `gRPC` and waits for the response. The `proto` is defined as follows:
```proto
service RootCoord {
...
rpc CreateCollection(milvus.CreateCollectionRequest) returns (common.Status){}
...
}
```
- `PostExecute`: `CreateCollectionTask` does nothing at this phase and returns directly.
4. `RootCoord` wraps the `CreateCollection` request into a `CreateCollectionReqTask` and then calls the function `executeTask`. `executeTask` does not return until the `context` is done or `CreateCollectionReqTask.Execute` returns.
```go
type reqTask interface {
Ctx() context.Context
@ -105,6 +110,7 @@ type CreateCollectionReqTask struct {
7. `RootCoord` allocates a timestamp from `TSO` before writing the `Collection`'s meta into `metaTable`. This timestamp is treated as the point at which the collection was created.
8. Finally, `RootCoord` sends a `CreateCollectionRequest` message into `MsgStream`, and other components that have subscribed to the `MsgStream` are notified. The `proto` of `CreateCollectionRequest` is defined as follows:
```proto
message CreateCollectionRequest {
common.MsgBase base = 1;
@ -124,7 +130,8 @@ message CreateCollectionRequest {
9. After all these operations, `RootCoord` updates its internal timestamp and returns, at which point the `Proxy` receives the response.
_Notes:_
1. In the `Proxy`, every `DDL` request is wrapped into a `task` and pushed into `DdTaskQueue`. The background service reads a new `task` from `DdTaskQueue` only when the previous one is finished, so all `DDL` requests are executed serially on the `Proxy`.
2. In the `RootCoord`, all `DDL` requests will be wrapped into `reqTask`, but there is no task queue, so the `DDL` requests will be executed in parallel on `RootCoord`.