Commit Graph

1534 Commits (c1b0562d210d34e9cc6f48ee34dc18b489e960c0)

Author SHA1 Message Date
Bingyi Sun e1258b8cad
feat: integrate storagev2 into loading segment (#29336)
issue: #29335

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-12 18:10:51 +08:00
wayblink 1df3f90696
feat: Implement DescribeAlias and ListAliases interfaces (#29641)
#22882
/kind feature

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-01-11 19:12:51 +08:00
Xu Tong e429965f32
Add float16 approve for multi-type part (#28427)
issue:https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
Cai Yudong cb9d9ec0f0
enhance: Correct sampleFraction's type to float (#29810)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-01-10 13:18:50 +08:00
yihao.dai 3d07b6682c
feat: Add import reader for numpy (#29253)
This PR implements a new numpy reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-08 19:42:49 +08:00
yihao.dai 156a0dd450
feat: Add import reader for Parquet (#29618)
This PR implements a Parquet reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-07 19:38:49 +08:00
yihao.dai 23183ffb0f
feat: Add import reader for json (#29252)
This PR implements a new json reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 18:12:48 +08:00
smellthemoon 1c1f2a1371
enhance: change some logs (#29579)
related #29588

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-05 16:12:48 +08:00
yihao.dai 3561586edf
feat: Add import reader for binlog (#28910)
This PR defines the new import reader interfaces and implements a binlog
reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 11:48:47 +08:00
Jiquan Long 3f46c6d459
feat: support inverted index (#28783)
issue: https://github.com/milvus-io/milvus/issues/27704

Add an inverted index for some data types in Milvus. This index type can
save a lot of memory compared to loading all data into RAM, and it speeds up
term queries and range queries.

Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL`
and `VARCHAR`.

Not supported: `ARRAY` and `JSON`.

Note:
- The inverted index for `VARCHAR` is not yet designed to serve full-text
search. Every row is treated as a whole keyword instead of being
tokenized into multiple terms.
- The inverted index doesn't support retrieval well, so if you create an
inverted index on a field, operations that depend on the raw data
fall back to chunk storage, which brings some performance
loss. Examples are comparisons between two columns and retrieval of
output fields.
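
To make the notes above concrete, here is a minimal toy model of an inverted index in plain Python. It is only a sketch of the general technique, not the implementation added by this PR: the class name `ToyInvertedIndex` and its methods are hypothetical. It maps each distinct value to the row ids that hold it, answers term queries by direct lookup and range queries by scanning sorted keys, and treats every string as one whole keyword, as the note on `VARCHAR` describes.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ToyInvertedIndex:
    """Hypothetical toy index: distinct value -> list of row ids holding it."""

    def __init__(self, values):
        postings = defaultdict(list)
        for row_id, value in enumerate(values):
            postings[value].append(row_id)
        self.postings = dict(postings)
        # Sorted distinct values let range queries use binary search.
        self.sorted_keys = sorted(self.postings)

    def term_query(self, terms):
        """Row ids whose value equals any of the terms (like `field in [...]`)."""
        rows = set()
        for term in terms:
            rows.update(self.postings.get(term, []))
        return sorted(rows)

    def range_query(self, low=None, high=None):
        """Row ids whose value v satisfies low < v < high; a bound may be None."""
        start = 0 if low is None else bisect_right(self.sorted_keys, low)
        stop = len(self.sorted_keys) if high is None else bisect_left(self.sorted_keys, high)
        rows = []
        for key in self.sorted_keys[start:stop]:
            rows.extend(self.postings[key])
        return sorted(rows)


int_index = ToyInvertedIndex([3, 1, 4, 1, 5, 9, 2, 6])
print(int_index.term_query([1, 2, 3]))       # rows matching `v in [1, 2, 3]`
print(int_index.range_query(low=1, high=5))  # rows matching `1 < v < 5`

# Strings are indexed as whole keywords, so "apple" does not match "red apple".
text_index = ToyInvertedIndex(["red apple", "apple", "red apple"])
print(text_index.term_query(["apple"]))
```

The toy index stores only values and row ids. That also hints at why operations needing other raw data, such as output fields, have to read from a separate store.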

The inverted index is easy to use.

Take the collection below as an example:

```python
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

# Assumes a connection to Milvus has already been opened.
dim = 128  # assumed vector dimension; the original snippet leaves `dim` undefined

# One scalar field per supported type, plus a vector field.
fields = [
    FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
    FieldSchema(name="int8", dtype=DataType.INT8),
    FieldSchema(name="int16", dtype=DataType.INT16),
    FieldSchema(name="int32", dtype=DataType.INT32),
    FieldSchema(name="int64", dtype=DataType.INT64),
    FieldSchema(name="float", dtype=DataType.FLOAT),
    FieldSchema(name="double", dtype=DataType.DOUBLE),
    FieldSchema(name="bool", dtype=DataType.BOOL),
    FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```

Then create an inverted index on each scalar field:

```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```

Term queries and range queries on these fields are then sped up
automatically by the inverted index:

```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-12-31 19:50:47 +08:00
cai.zhang c45f8a2946
fix: Import data from parquet file in streaming way (#29514)
issue: #29292

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-27 15:30:46 +08:00
XuanYang-cn 7a6aa8552a
fix: add back existing datanode metrics (#29360)
See also: #29204

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-12-22 14:20:43 +08:00
congqixia f699be79f7
fix: grpc client check session skipped due to role not match (#29356)
Related to #28815

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-21 10:12:51 +08:00
wei liu e41fd6fbde
enhance: Move proxy client manager to util package (#28955)
issue:  #28898

This PR moves the `ProxyClientManager` to the util package so that
its implementation can be reused in querycoord.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-20 19:22:42 +08:00
wayblink 2274aa3b50
fix: bulkinsert binlog didn't consider ts order when processing delta data (#29163)
#29162

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-12-14 14:36:40 +08:00
Bingyi Sun ad866d2889
feat: integrate storagev2 into index build process (#28995)
issue: https://github.com/milvus-io/milvus/issues/28994

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-13 17:24:38 +08:00
wei liu fe1eeae2aa
enhance: Use mockery to replace manual mock code (#29074)
issue: #29043
This PR removes manual mock code for proxy and data coord.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-13 10:46:44 +08:00
cai.zhang 49b8657f95
enhance: Support implicit type conversion for parquet (#29046)
issue: #29019

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-12 16:14:44 +08:00
congqixia 1fe5f12bd5
enhance: Add client connect wrapper to keep connection alive (#29058)
See also #29057
Add a wrapper to maintain the client and its connection.
When a reset is needed, the `Close` method waits until all
ongoing requests return.
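
The draining behavior described above can be sketched in Python as follows. This is an illustration of the general pattern only, not the Go code from the PR: `ClientWrapper`, `FakeClient`, and `slow_request` are hypothetical names, and the sketch assumes the wrapped client exposes a `close()` method.

```python
import threading


class ClientWrapper:
    """Hypothetical wrapper that counts in-flight calls so close() can drain them."""

    def __init__(self, client):
        self._client = client
        self._cond = threading.Condition()
        self._in_flight = 0
        self._closed = False

    def call(self, fn):
        """Run fn(client), refusing new work once the wrapper is closed."""
        with self._cond:
            if self._closed:
                raise RuntimeError("client is closed")
            self._in_flight += 1
        try:
            return fn(self._client)
        finally:
            with self._cond:
                self._in_flight -= 1
                self._cond.notify_all()

    def close(self):
        """Reject new calls, then block until every in-flight call has returned."""
        with self._cond:
            self._closed = True
            while self._in_flight > 0:
                self._cond.wait()
        self._client.close()  # assumption: the wrapped client has close()


class FakeClient:
    """Stand-in for a real client, used only for this demonstration."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


events = []
started = threading.Event()
release = threading.Event()


def slow_request(client):
    started.set()   # signal that the request is in flight
    release.wait()  # simulate a long-running request
    events.append("request done")


def close_and_record():
    wrapper.close()
    events.append("closed")


fake = FakeClient()
wrapper = ClientWrapper(fake)
worker = threading.Thread(target=wrapper.call, args=(slow_request,))
worker.start()
started.wait()
closer = threading.Thread(target=close_and_record)
closer.start()
release.set()
worker.join()
closer.join()
print(events)
```

Because `close()` waits on the in-flight counter, the request always finishes before the underlying client is closed, regardless of thread scheduling.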

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-11 17:20:38 +08:00
cai.zhang 2b05460ef9
enhance: Make import-related error message clearer (#28978)
issue: #28976

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-08 10:12:38 +08:00
wayblink 6736f65345
feat: skip some empty ttMsg in Datanode flowgraph (#28756)
/kind feature

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-12-07 01:00:37 +08:00
yihao.dai d26b563a8b
feat: Define import API and metadata (#28731)
Define the new rpc and metadata for ImportV2.

see also: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-12-04 19:56:35 +08:00
Bingyi Sun 45e6801ce4
feat: Add checker activation service interfaces (#28850)
issue: #28610

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-04 17:38:37 +08:00
cai.zhang f5f4f0872e
enhance: Support importing data with parquet file (#28608)
issue: #28272

Numpy does not support array type import. 
Array type data is imported through parquet.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-11-29 20:52:27 +08:00
cai.zhang 1b7a503f89
enhance: Revert import support csv format (#28760)
Revert import support csv format.
issue: #28778

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-11-28 14:32:27 +08:00
cai.zhang c29b60e18e
enhance: Support Array DataType for bulk_insert (#28341)
issue: #28272 
Support array DataType for bulk_insert with json, binlog files.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-11-27 13:50:27 +08:00
MrPresent-Han fc30d291be
fix createCollection failed occasionally (#28592) (#28712)
fix: createCollection occasionally fails #28592

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-11-27 11:10:25 +08:00
wayblink da339535d5
enhance: Merge flowgraph goroutines into 1 (#28654)
/kind enhancement
#24826

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-11-23 19:52:25 +08:00
smellthemoon 73f2bab454
enhance: add some logs when creating client and getting component states (#28160)
/kind improvement

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-11-22 09:12:22 +08:00
PowderLi c238bff9fb
fix: symbol 'GetStorageMetrics' and 'enableDynamicField' (#28580)
/kind bug
to #28579 #28504

1. replace enableDynamic with enableDynamicField
2. cgo directly link to milvus_storage

Signed-off-by: PowderLi <min.li@zilliz.com>
2023-11-21 10:20:22 +08:00
Bingyi Sun d7145e2c06
enhance: Update golangci_lint version (#28535)
Update golangci lint and fix some warnings

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-11-21 10:04:21 +08:00
PowderLi a1c505dbd5
add internal storage metrics (#28278)
/kind improvement
issue: #28277

Signed-off-by: PowderLi <min.li@zilliz.com>
2023-11-19 17:22:25 +08:00
XuanYang-cn 40d5c902b6
Enable getting multiple segments in plan result (#28350)
Previously, a compaction plan result contained one segment per plan. Because L0
compaction writes to multiple segments, this PR expands the number of segments
in plan results and renames some types for readability.

- Name refactoring:
  - CompactionStateResult -> CompactionPlanResult
  - CompactionResult -> CompactionSegment

See also: #27606

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-14 15:56:19 +08:00
smellthemoon 0aa90de141
Reduce the goroutine in flowgraph to 2 (#28233)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-11-13 10:50:17 +08:00
wei liu bce1054f92
Fix retry when proxy stopped (#28264)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-09 18:58:21 +08:00
groot 3f6b203018
Fix bulkinsert bug that segments are compacted after import (#28192)
Signed-off-by: yhmo <yihua.mo@zilliz.com>
2023-11-07 15:14:26 +08:00
wei liu 5b45a138b1
disable auto balance when old node exists (#28191)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-07 14:02:20 +08:00
wei liu da41a5b51e
fix check grpc error logic (#28182)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-07 11:54:18 +08:00
Xiaofan da19e49daf
Support purge old session for standalone (#28184)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-11-06 21:21:42 +08:00
wei liu 68a86471ba
fix grpc client retry on node server not match error (#28169)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-03 23:42:16 +08:00
wayblink 00ae019ff0
Use go 1.20 csv_reader to keep milvus go=1.18 limitation (#28080)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-11-03 10:40:16 +08:00
wei liu ecec5dfcfd
fix retry on offline node (#28079)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-03 10:14:16 +08:00
groot abd5b199cc
Bulkinsert support pure list json (#27990)
Signed-off-by: yhmo <yihua.mo@zilliz.com>
2023-11-01 19:02:13 +08:00
Enwei Jiao 8ae9c947ae
Use OpenDAL to access object store (#25642)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-11-01 09:00:14 +08:00
KumaJie e88212ba4b
Add CSV file import function (#27149)
Signed-off-by: kuma <675613722@qq.com>
Co-authored-by: kuma <675613722@qq.com>
2023-10-31 22:47:23 +08:00
yah01 9658367a3c
Refine chunk manager errors (#27590)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-31 12:18:15 +08:00
Enwei Jiao 4a33391b8f
rename createindex (#27903)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-10-27 10:12:14 +08:00
Filip Haltmayer 6b1a106a31
Moving etcd client into session (#27069)
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
2023-10-27 07:36:12 +08:00
zhagnlu 6060dd7ea8
Add chunk manager request timeout (#27692)
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2023-10-23 20:08:08 +08:00
SimFG 9b0ecbdca7
Support to replicate the mq message (#27240)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-10-20 14:26:09 +08:00