Commit Graph

1560 Commits (c4b37fb2853cdfc0e69c1f1e711bbafb50183881)

Author SHA1 Message Date
yihao.dai 0fe5e90e8b
enhance: Remove import v1 (#31403)
Remove all code and logic related to import v1.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-22 15:29:09 +08:00
Chun Han c3264ca3e3
feat: support segment pruner (#31003)
related: #30376
2024-03-22 13:57:06 +08:00
groot c81909bfab
enhance: Support MinIO TLS connection (#31311)
issue: https://github.com/milvus-io/milvus/issues/30709
pr: #31292

Signed-off-by: yhmo <yihua.mo@zilliz.com>
Co-authored-by: Chen Rao <chenrao317328@163.com>
2024-03-21 11:15:20 +08:00
jaime db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
wei liu 147a3b8bdc
fix: Grpcclient return unrecoverable error (#31256)
issue: #31222

grpcclient's `call` func return a unrecoverable error, then the caller's
retry policy also breaks due to this unrecoverable error.

This PR introduce `retry.Handle`, the new func use `func() (bool,
error)` as input parameters, which return `shouldRetry` directly, to
avoid grpcclient return a unrecoverable error

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-15 10:03:05 +08:00
Buqian Zheng 3c80083f51
feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630)
add sparse float vector support to different milvus components,
including proxy, data node to receive and write sparse float vectors to
binlog, query node to handle search requests, index node to build index
for sparse float column, etc.

https://github.com/milvus-io/milvus/issues/29419

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-13 14:32:54 -07:00
yihao.dai 87b3c25b15
fix: Fix binlog import (#31205)
1. File type validation is omitted during binlog import.
2. System fields are appended to the schema during binlog import.

issue: https://github.com/milvus-io/milvus/issues/28521

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-13 10:35:04 +08:00
cai.zhang de2c95d00c
enhance: Constraint dynamic field as key-value format (#31183)
issue: #31051

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-12 12:45:03 +08:00
aoiasd d19a82e3f6
enhance: support stream call for grpc client (#30013)
relate: https://github.com/milvus-io/milvus/issues/29719

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-03-07 17:45:01 +08:00
wei liu fa73520b57
fix: Make datacoord client retry on index api (#30654)
issue: #20553

This PR add retry on all interface which belong to indexcoord in milvus
2.2 and. move to data coord in milvus 2.3, to prevent meet
`unimplemented` error during rolling upgrade from milvus 2.2 to 2.3.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-04 15:32:59 +08:00
yihao.dai 3775464b7c
enhance: Support varchar autoid for bulkinsertV1 (#30896)
This PR is a supplement to PR
https://github.com/milvus-io/milvus/pull/30377.

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-28 21:20:59 +08:00
chyezh 77477d6340
fix: wrong context passing into NewClient, error handling lost in session_util (#30817)
issue: #30799

Signed-off-by: chyezh <chyezh@outlook.com>
2024-02-28 10:40:09 +08:00
yiwangdr 32cff25f97
enhance: decrease coordinator init time (#29822)
This PR mainly improve two items:
1. Target observer should refresh loading status during init time. An
uninitialized loading status blocks search/query. Currently, the target
observer refreshes every 10 seconds, i.e. we'd need to wait for 10s for
no reason. That's also the reason why we constantly see false log
"collection unloaded" upon mixcoord restarts.
2. Delete session when service is stopped. So that the new service
doesn't need to wait for the previous session to expire (~10s).

Item 1 is the major improvement of this PR, which should speed up init
time by 10s.
Item 2 is not a big concern in most cases as coordinators usually shut
down after stop(). In those cases, coordinator restart triggers serverID
change which further triggers an existing logic that deletes expired
session. This PR only fixes rare cases where serverID doesn't change.

integration test:
`go test -tags dynamic -v -coverprofile=profile.out -covermode=atomic
tests/integration/coordrecovery/coord_recovery_test.go -timeout=20m`
Performance after the change:
Average init time of coordinators: 10s
Hardware: M2 Pro
Test setup: 1000 collections with 1000 rows (dim=128) per collection.


issue: #29409

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-02-05 14:00:12 +08:00
smellthemoon 6bc10f9fdd
enhance: support varchar autoid when bulkinsert (#30377)
support varchar autoid when bulkinsert

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-02-01 19:45:09 +08:00
xige-16 060c8603a3
fix: Support mvcc with hybrid serach (#30114)
issue: https://github.com/milvus-io/milvus/issues/29656
/kind bug

Signed-off-by: xige-16 <xi.ge@zilliz.com>

---------

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-02-01 16:03:03 +08:00
yihao.dai c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks, a import task will be divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
cai.zhang f619d792c0
enhance: Break down the granularity of collection info cache expired (#29977)
issue: #29772 

1. `DropPartition` only invalidates the cache related to the partition.
2. `CreateAlias` does not invalidate the cache.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-30 16:45:02 +08:00
chyezh 211143c5e6
enhance: add basic information of milvus into metrics (#29665)
add basic build information and runtime component dependency into
metrics.

issue: #29664

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-29 15:47:02 +08:00
smellthemoon 9512af357b
enhance: reduce memory when read data (#30284)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-26 20:49:00 +08:00
SimFG aa7014a360
enhance: move the cgo code in the pkg dir to interal dir (#30261)
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-01-25 15:15:01 +08:00
PowderLi 08ca0a2ca5
feat: support etcd authentication (#30226)
issue: #28895
add 3 configuration for ETCD config

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-01-24 11:35:00 +08:00
Patrick Weizhi Xu 0907d76253
enhance: pass partition key scalar info if enabled when build vector index (#29931)
issue: #29892 

Pass optional scalar IVF offsets to Cardinal

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-01-24 00:04:55 +08:00
chyezh 5ee9f734c1
fix: Use determined order to lock in BlockAll to avoid deadlock (#29246)
issue: #29104

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-22 14:50:56 +08:00
yihao.dai ddd741a5d4
fix: Fix closing closed chan in proxy watcher (#30143)
issue: https://github.com/milvus-io/milvus/issues/30142

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-19 23:02:54 +08:00
congqixia 10acdbbe8e
enhance: free CString in InitTraceConfig (#30055)
`C.CString` result needs to be freed after usage

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-17 15:15:03 +08:00
congqixia c0f0548702
fix: use SafeChan preventing close channel multiple times (#30022)
See also #29935

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-16 17:34:54 +08:00
Bingyi Sun e1258b8cad
feat: integrate storagev2 into loading segment (#29336)
issue: #29335

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-12 18:10:51 +08:00
wayblink 1df3f90696
feat: Implement DescribeAlias and ListAliases interfaces (#29641)
#22882
/kind feature

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-01-11 19:12:51 +08:00
Xu Tong e429965f32
Add float16 approve for multi-type part (#28427)
issue:https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
Cai Yudong cb9d9ec0f0
enhance: Correct sampleFraction's type to float (#29810)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-01-10 13:18:50 +08:00
yihao.dai 3d07b6682c
feat: Add import reader for numpy (#29253)
This PR implements a new numpy reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-08 19:42:49 +08:00
yihao.dai 156a0dd450
feat: Add import reader for Parquet (#29618)
This PR implements a Parquet reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-07 19:38:49 +08:00
yihao.dai 23183ffb0f
feat: Add import reader for json (#29252)
This PR implements a new json reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 18:12:48 +08:00
smellthemoon 1c1f2a1371
enhance:change some logs (#29579)
related #29588

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-05 16:12:48 +08:00
yihao.dai 3561586edf
feat: Add import reader for binlog (#28910)
This PR defines the new import reader interfaces and implement a binlog
reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 11:48:47 +08:00
Jiquan Long 3f46c6d459
feat: support inverted index (#28783)
issue: https://github.com/milvus-io/milvus/issues/27704

Add inverted index for some data types in Milvus. This index type can
save a lot of memory compared to loading all data into RAM and speed up
the term query and range query.

Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL`
and `VARCHAR`.

Not supported: `ARRAY` and `JSON`.

Note:
- The inverted index for `VARCHAR` is not designed to serve full-text
search now. We will treat every row as a whole keyword instead of
tokenizing it into multiple terms.
- The inverted index don't support retrieval well, so if you create
inverted index for field, those operations which depend on the raw data
will fallback to use chunk storage, which will bring some performance
loss. For example, comparisons between two columns and retrieval of
output fields.

The inverted index is very easy to be used.

Taking below collection as an example:

```python
fields = [
		FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
		FieldSchema(name="int8", dtype=DataType.INT8),
		FieldSchema(name="int16", dtype=DataType.INT16),
		FieldSchema(name="int32", dtype=DataType.INT32),
		FieldSchema(name="int64", dtype=DataType.INT64),
		FieldSchema(name="float", dtype=DataType.FLOAT),
		FieldSchema(name="double", dtype=DataType.DOUBLE),
		FieldSchema(name="bool", dtype=DataType.BOOL),
		FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
		FieldSchema(name="random", dtype=DataType.DOUBLE),
		FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```

Then we can simply create inverted index for field via:

```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```

Then, term query and range query on the field can be speed up
automatically by the inverted index:

```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-12-31 19:50:47 +08:00
cai.zhang c45f8a2946
fix: Import data from parquet file in streaming way (#29514)
issue: #29292

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-27 15:30:46 +08:00
XuanYang-cn 7a6aa8552a
fix: add back existing datanode metrics (#29360)
See also: #29204

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-12-22 14:20:43 +08:00
congqixia f699be79f7
fix: grpc client check session skipped due to role not match (#29356)
Related to #28815

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-21 10:12:51 +08:00
wei liu e41fd6fbde
enhance: Move proxy client manager to util package (#28955)
issue:  #28898

This PR move the `ProxyClientManager` to util package, in case of
reusing it's implementation in querycoord

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-20 19:22:42 +08:00
wayblink 2274aa3b50
fix: bulkinsert binlog didn't consider ts order when processing delta data (#29163)
#29162

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-12-14 14:36:40 +08:00
Bingyi Sun ad866d2889
feat: integrate storagev2 into index build process (#28995)
issue: https://github.com/milvus-io/milvus/issues/28994

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-13 17:24:38 +08:00
wei liu fe1eeae2aa
enhance: Use mockery to replace manual mock code (#29074)
issue: #29043
This PR remove mannul mock code for proxy and data coord

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-13 10:46:44 +08:00
cai.zhang 49b8657f95
enhance: Support implicit type conversion for parquet (#29046)
issue: #29019

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-12 16:14:44 +08:00
congqixia 1fe5f12bd5
enhance: Add client connect wrapper to keep connection alive (#29058)
See also #29057
Add wrapper to maintain client&connection
When reset operation is needed, `Close` method shall wait until all
on-going request return

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-11 17:20:38 +08:00
cai.zhang 2b05460ef9
enhance: Make import-related error message clearer (#28978)
issue: #28976

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-08 10:12:38 +08:00
wayblink 6736f65345
feat: skip some empty ttMsg in Datanode flowgraph (#28756)
/kind feature

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-12-07 01:00:37 +08:00
yihao.dai d26b563a8b
feat: Define import API and metadata (#28731)
Define the new rpc and metadata for ImportV2.

see also: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-12-04 19:56:35 +08:00
Bingyi Sun 45e6801ce4
feat: Add checker activation service interfaces (#28850)
issue: #28610

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-04 17:38:37 +08:00
cai.zhang f5f4f0872e
enhance: Support importing data with parquet file (#28608)
issue: #28272

Numpy does not support array type import. 
Array type data is imported through parquet.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-11-29 20:52:27 +08:00