Commit Graph

17 Commits (a2b517523dbecde8d96514fb3b7ef3b217747ecf)

congqixia de8a266d8a
enhance: Enable linux code checker (#35084)
See also #34483

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-30 15:53:51 +08:00
wei liu c45f38aa61
enhance: Update protobuf-go to protobuf-go v2 (#34394)
issue: #34252

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-29 11:31:51 +08:00
congqixia 531092c031
enhance: Add lint rule to forbid gogo protobuf (#34594)
github.com/gogo/protobuf is deprecated and could be error prone after
upgrading protobuf messages to v2.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-12 10:19:35 +08:00
smellthemoon 2a1356985d
enhance: support null in go payload (#32296)
#31728
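
For context, a minimal pymilvus sketch of how null values surface on the client side once the storage payload can persist them; it assumes a client and server version that expose the `nullable` field option, and the collection and field names here are purely illustrative:

```python
from pymilvus import (
    CollectionSchema, Collection, DataType, FieldSchema, connections,
)

connections.connect(host="localhost", port="19530")

# "age" is declared nullable, so rows may carry no value for it.
fields = [
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="age", dtype=DataType.INT64, nullable=True),
    FieldSchema(name="vec", dtype=DataType.FLOAT_VECTOR, dim=8),
]
collection = Collection("null_demo", CollectionSchema(fields))

# Row-based insert: None is stored as a null entry in the payload.
collection.insert([
    {"pk": 1, "age": 25, "vec": [0.1] * 8},
    {"pk": 2, "age": None, "vec": [0.2] * 8},
])
```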

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-06-19 17:08:00 +08:00
congqixia 512ea6be5f
enhance: Avoid merging insert data when buffering insert msgs (#33562)
See also #33561

This PR:
- Use zero copy when buffering insert messages
- Make `storage.InsertCodec` support serializing multiple insert data
chunks into the same batch of binlog files

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-13 11:15:56 +08:00
Cai Yudong 4004e4c545
enhance: Optimize bulk insert unittest (#33224)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-24 10:23:41 +08:00
Cai Yudong bcdbd1966e
feat: Support sparse float vector bulk insert for binlog/json/parquet (#32649)
Issue: #22837

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-05-07 18:43:30 +08:00
Cai Yudong 5fc439c600
feat: Bulk insert support fp16/bf16 (#32157)
Issue: #22837
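
As a rough illustration (not taken from this PR), a column-based numpy import for a float16 vector field could be prepared as below; the collection name, field names, and file paths are assumptions, and the files must live where the Milvus server's object storage can read them:

```python
import numpy as np
from pymilvus import connections, utility

connections.connect(host="localhost", port="19530")

dim, rows = 128, 1000

# One .npy file per field, named after the field. The vector file uses
# dtype float16 so it matches a FLOAT16_VECTOR field in the schema.
np.save("pk.npy", np.arange(rows, dtype=np.int64))
np.save("embeddings.npy", np.random.rand(rows, dim).astype(np.float16))

# Kick off a bulk-insert job for an existing collection.
job_id = utility.do_bulk_insert(
    collection_name="fp16_demo",
    files=["pk.npy", "embeddings.npy"],
)
print(utility.get_bulk_insert_state(job_id))
```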

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-22 10:05:22 +08:00
Buqian Zheng 3c80083f51
feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630)
Add sparse float vector support to the different Milvus components,
including the proxy, the data node (to receive and write sparse float
vectors to binlog), the query node (to handle search requests), the
index node (to build indexes for the sparse float column), etc.

https://github.com/milvus-io/milvus/issues/29419
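
A minimal end-to-end sketch of the client-side usage this enables (pymilvus 2.4-style API; the index parameters and names below are assumptions, not taken from this PR):

```python
from pymilvus import (
    CollectionSchema, Collection, DataType, FieldSchema, connections,
)

connections.connect(host="localhost", port="19530")

fields = [
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="sparse", dtype=DataType.SPARSE_FLOAT_VECTOR),
]
collection = Collection("sparse_demo", CollectionSchema(fields))

# Each sparse vector is a {dimension_index: value} dict.
collection.insert([
    {"pk": 1, "sparse": {12: 0.5, 320: 0.3}},
    {"pk": 2, "sparse": {7: 0.2, 12: 0.9, 4096: 0.1}},
])

collection.create_index(
    "sparse",
    {"index_type": "SPARSE_INVERTED_INDEX", "metric_type": "IP"},
)
collection.load()

results = collection.search(
    data=[{12: 0.4, 320: 0.1}],
    anns_field="sparse",
    param={"metric_type": "IP"},
    limit=3,
    output_fields=["pk"],
)
```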

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-13 14:32:54 -07:00
yihao.dai a434d33e75
feat: Add import scheduler and manager (#29367)
This PR introduces new manager roles for importv2:
1. ImportMeta: manages all the import tasks;
2. ImportScheduler: processes tasks and modifies their states;
3. ImportChecker: checks whether all tasks are complete and triggers the
relevant follow-up operations.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 18:31:02 +08:00
yihao.dai c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces new importv2 roles for the datanode:
1. Executor: executes tasks; an import task is divided into the following
steps: read data -> hash data -> sync data;
2. Manager: manages all the tasks.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
Xu Tong e429965f32
Add float16 approve for multi-type part (#28427)
issue: https://github.com/milvus-io/milvus/issues/22837

Add the bfloat16 vector type, and add the index part for float16 vectors.
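
A brief pymilvus sketch of what this enables on the client side (a hypothetical collection; bfloat16 is omitted here because plain numpy has no native bfloat16 dtype):

```python
import numpy as np
from pymilvus import (
    CollectionSchema, Collection, DataType, FieldSchema, connections,
)

connections.connect(host="localhost", port="19530")

dim = 64
fields = [
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="fp16_vec", dtype=DataType.FLOAT16_VECTOR, dim=dim),
]
collection = Collection("fp16_types_demo", CollectionSchema(fields))

# Vectors are passed as numpy float16 arrays, one per row.
rows = [
    {"pk": i, "fp16_vec": np.random.rand(dim).astype(np.float16)}
    for i in range(10)
]
collection.insert(rows)

# Index creation on the float16 vector field works like on float32 vectors.
collection.create_index(
    "fp16_vec",
    {"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 64}},
)
```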

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
yihao.dai 3d07b6682c
feat: Add import reader for numpy (#29253)
This PR implements a new numpy reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-08 19:42:49 +08:00
yihao.dai 23183ffb0f
feat: Add import reader for json (#29252)
This PR implements a new json reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 18:12:48 +08:00
yihao.dai 3561586edf
feat: Add import reader for binlog (#28910)
This PR defines the new import reader interfaces and implements a binlog
reader for import.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-05 11:48:47 +08:00
Jiquan Long 3f46c6d459
feat: support inverted index (#28783)
issue: https://github.com/milvus-io/milvus/issues/27704

Add inverted index support for some data types in Milvus. This index type
can save a lot of memory compared to loading all data into RAM, and it
speeds up term queries and range queries.

Supported: `INT8`, `INT16`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `BOOL`
and `VARCHAR`.

Not supported: `ARRAY` and `JSON`.

Note:
- The inverted index for `VARCHAR` is not designed to serve full-text
search for now; every row is treated as a whole keyword instead of being
tokenized into multiple terms.
- The inverted index does not support retrieval well, so if you create an
inverted index for a field, operations that depend on the raw data will
fall back to chunk storage, which brings some performance loss (for
example, comparisons between two columns and retrieval of output fields).

The inverted index is easy to use.

Take the collection below as an example:

```python
from pymilvus import (
    CollectionSchema, Collection, DataType, FieldSchema, connections,
)

# Connect to a running Milvus instance before creating the collection.
connections.connect(host="localhost", port="19530")

dim = 128
fields = [
		FieldSchema(name="pk", dtype=DataType.VARCHAR, is_primary=True, auto_id=False, max_length=100),
		FieldSchema(name="int8", dtype=DataType.INT8),
		FieldSchema(name="int16", dtype=DataType.INT16),
		FieldSchema(name="int32", dtype=DataType.INT32),
		FieldSchema(name="int64", dtype=DataType.INT64),
		FieldSchema(name="float", dtype=DataType.FLOAT),
		FieldSchema(name="double", dtype=DataType.DOUBLE),
		FieldSchema(name="bool", dtype=DataType.BOOL),
		FieldSchema(name="varchar", dtype=DataType.VARCHAR, max_length=1000),
		FieldSchema(name="random", dtype=DataType.DOUBLE),
		FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim),
]
schema = CollectionSchema(fields)
collection = Collection("demo", schema)
```

Then we can simply create an inverted index for each field via:

```python
index_type = "INVERTED"
collection.create_index("int8", {"index_type": index_type})
collection.create_index("int16", {"index_type": index_type})
collection.create_index("int32", {"index_type": index_type})
collection.create_index("int64", {"index_type": index_type})
collection.create_index("float", {"index_type": index_type})
collection.create_index("double", {"index_type": index_type})
collection.create_index("bool", {"index_type": index_type})
collection.create_index("varchar", {"index_type": index_type})
```

Then, term queries and range queries on these fields are automatically sped
up by the inverted index:

```python
result = collection.query(expr='int64 in [1, 2, 3]', output_fields=["pk"])
result = collection.query(expr='int64 < 5', output_fields=["pk"])
result = collection.query(expr='int64 > 2997', output_fields=["pk"])
result = collection.query(expr='1 < int64 < 5', output_fields=["pk"])
```

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-12-31 19:50:47 +08:00
XuanYang-cn 2f16339aac
Enhance InsertData and FieldData (#27436)
1. Add NewInsertData
2. Add GetRowNum(), GetMemorySize(), and Append() for InsertData
3. Add AppendRow() to FieldData for compaction

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-10-17 17:36:11 +08:00