milvus

Commit Graph

Author	SHA1	Message	Date
SimFG	5016038781	enhance: release the record in delete codec and add some log for compaction (#34454 ) /kind improvement Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-07-09 15:40:17 +08:00
congqixia	2f691f1e67	enhance: Unify DeleteLog parsing code (#34009 ) See also #33787 Delete log parsing is scattered across many places, which is not recommended and hard to maintain. This PR abstracts the common parsing logic into the `DeleteLog.Parse` method to unify the implementation and make it easier to replace the JSON parsing lib. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-21 16:54:01 +08:00
shaoting-huang	5f02e52561	enhance: Refactor data codec deserialize (#33923 ) #33922 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-06-20 11:17:59 +08:00
smellthemoon	2a1356985d	enhance: support null in go payload (#32296 ) #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-06-19 17:08:00 +08:00
shaoting-huang	8cdc0e6233	fix: fix data codec writer close (#33818 ) issue:#33813 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-06-18 13:59:57 +08:00
congqixia	f993b2913b	enhance: Reserve space of payload writer when serialize data (#33817 ) See also #33561 #33562 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-17 12:06:04 +08:00
XuanYang-cn	f67b6dc2b0	fix: DeleteData merge wrong data causing data loss (#33820 ) See also: #33819 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-06-14 17:57:56 +08:00
shaoting-huang	0ecd694305	enhance: legacy code clean up (#33838 ) issue: #33839 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-06-14 14:25:56 +08:00
congqixia	512ea6be5f	enhance: Avoid merging insert data when buffering insert msgs (#33562 ) See also #33561 This PR: - Use zero copy when buffering insert messages - Make `storage.InsertCodec` support serialize multiple insert data chunk into same batch binlog files Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-13 11:15:56 +08:00
congqixia	b39dfc25dc	enhance: Use fastjson lib for unmarshal delete log (#33787 ) ``` goos: linux goarch: amd64 GOMAXPROC=1 cpu: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz BenchmarkJsonSerdeStd 343872 3568 ns/op 1335 B/op 25 allocs/op BenchmarkJsonSerdeFastjson 5124177 234.9 ns/op 16 B/op 1 allocs/op ``` --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-12 20:41:57 +08:00
cai.zhang	6ea7633bd5	enhance: Add memory size for binlog (#33025 ) issue: #33005 1. add `MemorySize` field for insert binlog. 2. `LogSize` means the file size in the storage object. 3. `MemorySize` means the size of the data in the memory. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2024-05-15 12:59:34 +08:00
Buqian Zheng	3c80083f51	feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630 ) add sparse float vector support to different milvus components, including proxy, data node to receive and write sparse float vectors to binlog, query node to handle search requests, index node to build index for sparse float column, etc. https://github.com/milvus-io/milvus/issues/29419 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-03-13 14:32:54 -07:00
Ted Xu	71adafa933	enhance: adding a streaming deserialize reader for binlogs (#30860 ) See #30863 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-03-04 19:31:09 +08:00
Ted Xu	12acaf3e4f	enhance: Adding a generic stream payload reader (#30682 ) See: #30404 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-02-21 17:10:52 +08:00
wayblink	f976385421	enhance: replace binlogIO with io.BinlogIO in datanode (#29725 ) #30633 Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-02-20 14:38:51 +08:00
aoiasd	a0537156c0	enhance: delete codec deserialize data by stream batch (#30407 ) relate: https://github.com/milvus-io/milvus/issues/30404 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-02-06 17:04:25 +08:00
XuanYang-cn	d744962aa1	fix: Correct Size calculation of DeleteData (#30397 ) This PR would correct the actual deltalog size See also: #30191 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-02-02 10:47:04 +08:00
Xu Tong	e429965f32	Add float16 approve for multi-type part (#28427 ) issue：https://github.com/milvus-io/milvus/issues/22837 Add bfloat16 vector, add the index part of float16 vector. Signed-off-by: Writer-X <1256866856@qq.com>	2024-01-11 15:48:51 +08:00
Jiquan Long	3f46c6d459	feat: support inverted index (#28783 ) issue: https://github.com/milvus-io/milvus/issues/27704 Add inverted index for some data types in Milvus. This index type can save a lot of memory compared to loading all data into RAM and speed up the term query and range query. Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL` and `VARCHAR`. Not supported: `ARRAY` and `JSON`. Note: - The inverted index for `VARCHAR` is not designed to serve full-text search now. We will treat every row as a whole keyword instead of tokenizing it into multiple terms. - The inverted index doesn't support retrieval well, so if you create an inverted index for a field, operations that depend on the raw data will fall back to chunk storage, which brings some performance loss. For example, comparisons between two columns and retrieval of output fields. The inverted index is easy to use. Taking the collection below as an example: ```python fields = [ FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100), FieldSchema(name="int8", dtype=DataType.INT8), FieldSchema(name="int16", dtype=DataType.INT16), FieldSchema(name="int32", dtype=DataType.INT32), FieldSchema(name="int64", dtype=DataType.INT64), FieldSchema(name="float", dtype=DataType.FLOAT), FieldSchema(name="double", dtype=DataType.DOUBLE), FieldSchema(name="bool", dtype=DataType.BOOL), FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000), FieldSchema(name="random", dtype=DataType.DOUBLE), FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim), ] schema = CollectionSchema(fields) collection = Collection("demo", schema) ``` Then we can simply create an inverted index for each field via: ```python index_type = "INVERTED" collection.create_index("int8", {"index_type": index_type}) collection.create_index("int16", {"index_type": index_type}) collection.create_index("int32", {"index_type": index_type}) collection.create_index("int64", {"index_type": index_type}) collection.create_index("float", {"index_type": index_type}) collection.create_index("double", {"index_type": index_type}) collection.create_index("bool", {"index_type": index_type}) collection.create_index("varchar", {"index_type": index_type}) ``` Then, term queries and range queries on the field are sped up automatically by the inverted index: ```python result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"]) result = collection.query(expr='int64 < 5', output_fields=["pk"]) result = collection.query(expr='int64 > 2997', output_fields=["pk"]) result = collection.query(expr='1 < int64 < 5', output_fields=["pk"]) ``` --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2023-12-31 19:50:47 +08:00
XuanYang-cn	aae7e62729	feat: Add levelzero compaction in DN (#28470 ) See also: #27606 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-11-30 14:30:28 +08:00
congqixia	8a9ab69369	fix: Skip statslog generation flushing empty L0 segment (#28733 ) See also #27675 When an L0 segment contains only delta data, the merged statslog shall be skipped when performing the sync task --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-25 15:10:25 +08:00
yah01	cc952e0486	enhance: optimize forwarding level0 deletions by respecting partition (#28456 ) - Cache the level 0 deletions after loading level0 segments - Divide the level 0 deletions by partition related: #27349 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-21 18:24:22 +08:00
yah01	ece592a42f	Deliver L0 segments delete records (#27722 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-07 01:44:18 +08:00
XuanYang-cn	2f16339aac	Enhance InsertData and FieldData (#27436 ) 1. Add NewInsertData 2. Add GetRowNum(), GetMemorySize(), and, Append() for InsertData 3. Add AppendRow() for FieldData for compaction Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-10-17 17:36:11 +08:00
SimFG	26f06dd732	Format the code (#27275 ) Signed-off-by: SimFG <bang.fu@zilliz.com>	2023-09-21 09:45:27 +08:00
Xu Tong	9166011c4a	Add float16 vector (#25852 ) Signed-off-by: Writer-X <1256866856@qq.com>	2023-09-08 10:03:16 +08:00
congqixia	2770ac4df5	Fix nilness linter errors (#26218 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-08-09 11:31:15 +08:00
xige-16	94d6cbb238	Fix querynode panic when binlog ts wrong (#25635 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2023-07-18 10:41:20 +08:00
PowderLi	3f4356df10	fix the spelling of `field` (#25008 ) Signed-off-by: PowderLi <min.li@zilliz.com>	2023-06-21 14:00:42 +08:00
congqixia	41af0a98fa	Use go-api/v2 for milvus-proto (#24770 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-06-09 01:28:37 +08:00
Enwei Jiao	d3af451d92	Upgrade golangci-lint (#24707 ) Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>	2023-06-07 19:34:36 +08:00
aoiasd	c84bdcea49	merge stats log when segment flushing or compacting (#23570 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2023-05-29 10:21:28 +08:00
Enwei Jiao	967a97b9bd	Support json & array types (#23408 ) Signed-off-by: yah01 <yang.cen@zilliz.com> Co-authored-by: yah01 <yang.cen@zilliz.com>	2023-04-20 11:32:31 +08:00
jaime	c9d0c157ec	Move some modules from internal to public package (#22572 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2023-04-06 19:14:32 +08:00
yah01	081572d31c	Refactor QueryNode (#21625 ) Signed-off-by: yah01 <yang.cen@zilliz.com> Co-authored-by: Congqi Xia <congqi.xia@zilliz.com> Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>	2023-03-27 00:42:00 +08:00
Xiaofan	949d5d078f	Fix memory calculation in dataCodec (#21800 ) Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>	2023-01-28 11:09:52 +08:00
congqixia	f745d7f489	Fix compaction target segment rowNum is always 0 (#20937 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2022-12-01 20:33:17 +08:00
Xiaofan	633a749880	Reduce IndexCodec Load Memory (#20621 ) Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com> Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>	2022-11-18 10:47:08 +08:00
Xiaofan	2bfecf5b4e	Refine bloomfilter and memory usage (#20168 ) Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com> Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>	2022-10-31 17:41:34 +08:00
SimFG	a55f739608	Separate public proto files (#19782 ) Signed-off-by: SimFG <bang.fu@zilliz.com> Signed-off-by: SimFG <bang.fu@zilliz.com>	2022-10-16 20:49:27 +08:00
SimFG	d7f38a803d	Separate some proto files (#19218 ) Signed-off-by: SimFG <bang.fu@zilliz.com> Signed-off-by: SimFG <bang.fu@zilliz.com>	2022-09-16 16:56:49 +08:00
xige-16	4de1bfe5bc	Add cpp data codec (#18538 ) Signed-off-by: xige-16 <xi.ge@zilliz.com> Co-authored-by: zhagnlu lu.zhang@zilliz.com Signed-off-by: xige-16 <xi.ge@zilliz.com>	2022-09-09 22:12:34 +08:00
xige-16	e40061b864	Update binlog event format (#18347 ) Signed-off-by: xige-16 <xi.ge@zilliz.com> Signed-off-by: xige-16 <xi.ge@zilliz.com>	2022-08-11 14:06:38 +08:00
yah01	70f8bea4b4	Avoid growing slice as deserializing binlogs (#17421 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2022-06-08 11:46:06 +08:00
yah01	7af02fa531	Improve load performance, load binlogs concurrently per file, deserialize binlogs concurrently per field/segment (#16514 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2022-04-25 15:57:47 +08:00
godchen	bb7a0766fe	Add dependency factory (#16204 ) Signed-off-by: godchen0212 <qingxiang.chen@zilliz.com>	2022-04-07 22:05:32 +08:00
xige-16	99984b88e1	Support delete varChar value (#16229 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2022-04-02 17:43:29 +08:00
Jiquan Long	ba37531456	Add support for loading multiple indexes (#16138 ) Signed-off-by: dragondriver <jiquan.long@zilliz.com>	2022-03-30 21:11:28 +08:00
Xiaofan	801eeffbcc	Replace cgo parquet reader to go parquet reader (#16199 ) Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>	2022-03-30 15:21:28 +08:00
xige-16	205c92e54b	Support insert string data (#15993 ) Signed-off-by: xige-16 <xi.ge@zilliz.com>	2022-03-25 14:27:25 +08:00

132 Commits (cc8f7aa11013041f85cd04a6a4c52657ed07443c)