milvus

Commit Graph

Author	SHA1	Message	Date
Spade A	8e1ce15146	fix: ngram index is mistakenly used for unsupported operations (#43955 ) issue: https://github.com/milvus-io/milvus/issues/43917 1. fix ngram index being mistakenly used for unsupported operations 2. fix potential use-after-free (UAF) problem --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-21 14:41:46 +08:00
zhagnlu	d904c4e677	enhance: optimize compare expr performance for pk field (#43154 ) #43153 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-21 10:59:46 +08:00
congqixia	7963b17ac1	fix: Revert "fix: Use `folly::SharedMutex` preventing starvation (#43937 )" (#43959 ) Related to #43958 This reverts commit `580350495a`. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-21 10:09:47 +08:00
Spade A	d6a428e880	feat: impl StructArray -- support create index for vector array (embedding list) and search on it (#43726 ) Ref https://github.com/milvus-io/milvus/issues/42148 This PR supports create index for vector array (now, only for `DataType.FLOAT_VECTOR`) and search on it. The index type supported in this PR is `EMB_LIST_HNSW` and the metric type is `MAX_SIM` only. The way to use it: ```python milvus_client = MilvusClient("xxx:19530") schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True) ... struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field") ... struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000) ... schema.add_struct_array_field(struct_schema) index_params = milvus_client.prepare_index_params() index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128}) ... milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params) ``` Note: This PR uses `Lims` to convey offsets of the vector array to knowhere where vectors of multiple vector arrays are concatenated and we need offsets to specify which vectors belong to which vector array. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-08-20 10:27:46 +08:00
Alexander Guzhva	cfdb17a088	enhance: Fix ArithHelperI64 for SVE in bitset (#43952 ) missing ArithHelperI64<ArithOpType::Div, CmpOp> Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>	2025-08-19 22:48:58 +08:00
Alexander Guzhva	e179a5635f	enhance: remove duplicate code in ArithHelperF32 in SVE for bitset (#43950 ) fixes a problem of https://github.com/milvus-io/milvus/pull/43949 Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>	2025-08-19 22:35:47 +08:00
liliu-z	7dd2b103b0	enhance: Fix template declaration order for ArithHelperF32 in SVE implementation (#43949 ) Signed-off-by: Li Liu <li.liu@zilliz.com>	2025-08-19 21:58:22 +08:00
congqixia	580350495a	fix: Use `folly::SharedMutex` preventing starvation (#43937 ) Related to #43936 This PR: - Use `folly::SharedMutex` instead of `std::shared_mutex` preventing starvation - Use `folly::SharedMutex::WriteHolder/ReadHolder` instead of std::shared_lock and std::unique_lock to get better performance Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-19 20:05:46 +08:00
aoiasd	dcf04a58b8	feat: support use score function on segment search and use filter (#43868 ) relate: https://github.com/milvus-io/milvus/issues/43867 Support boost function score, multiply by the weight if match filter. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-19 16:15:45 +08:00
Gao	b602b4187d	enhance: upgrade aws-sdk from 1.9.234 to 1.11.352 (#43916 ) issue: #43908 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-08-19 11:11:45 +08:00
yihao.dai	64ab3d2681	enhance: Improve error message when query vector and dim mismatch (#43835 ) /kind improvement Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-08-18 01:07:44 +08:00
foxspy	647c2bca2d	enhance: Support streaming read and write of vector index files (#43824 ) issue: #42032 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-08-15 23:41:43 +08:00
Alexander Guzhva	ebb10dfae0	enhance: better auto-detect of SVE options for the bitset library (#43833 ) Enables the compilation of SVE code for the bitset library if a C++ compiler supports it. There are two conditions for enabling the SVE code * a C++ compiler needs to support `-march=armv8-a+sve` * `arm_sve.h` header must be available AFAIK, `gcc 7` does not support SVE, `gcc 8` and `gcc 9` support SVE, but have no `arm_sve.h` file, and only `gcc 10` has both. Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>	2025-08-15 22:37:44 +08:00
sthuang	5e4eb4a6e0	enhance: [StorageV2] bump storage version (#43871 ) related: https://github.com/milvus-io/milvus/issues/43869 bump storage version. include the following feature: * https://github.com/milvus-io/milvus-storage/pull/231 * https://github.com/milvus-io/milvus-storage/pull/232 * https://github.com/milvus-io/milvus-storage/pull/233 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-08-15 17:59:43 +08:00
sthuang	c102fa8b0b	enhance: [StorageV2] zero copy for packed writer record batch (#43779 ) The Out of Memory (OOM) error occurs because a handler retains the entire ImportRecordBatch in memory. Consequently, even when child arrays within the batch are flushed, the memory for the complete batch is not released. We temporarily fixed by deep copying record batch in #43724. The proposed fix is to split the RecordBatch into smaller sub-batches by column group. These sub-batches will be transferred via CGO, then reassembled before being written to storage using the Storage V2 API. Thus we can achieve zero-copy and only transferring references in CGO. related: #43310 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-08-15 10:11:44 +08:00
congqixia	f032044125	enhance: Refine segcore param change callback (#43838 ) Related to #43230 This PR - Move segcore setup function to `initcore` package to remove cgo dependency from pkg - Register core callback only for components depends on segcore - Rectify `UpdateLogLevel` implementation Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-13 19:31:44 +08:00
zhagnlu	b7c7df9440	fix: fix delete consumer bug for concurrent R-W (#43831 ) #41570 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-12 11:37:42 +08:00
Gao	81a0915c29	enhance: add milvus-common module to decouple knowhere & segcore (#43624 ) issue: https://github.com/milvus-io/milvus/issues/42032 https://github.com/milvus-io/milvus/issues/41435 based on pr: https://github.com/milvus-io/milvus/pull/42124 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Co-authored-by: xianliang.li <xianliang.li@zilliz.com>	2025-08-11 14:09:42 +08:00
zhagnlu	5b83975d39	enhance:convert multi not equal to not in (#43690 ) #43689 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-08 10:37:40 +08:00
sparknack	169be30a76	enhance: cachinglayer: reserve resource for inevictable cachecell (#43602 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-08 10:35:49 +08:00
zhagnlu	c04d678ad4	enhance: make segcore params effective without restarting milvus (#43231 ) #43230 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-08-08 10:33:48 +08:00
congqixia	1561a4ae8c	enhance: [StorageV2] Avoid create local parent dir if fs remote (#43790 ) Related to #43752 milvus-storage pr: milvus-io/milvus-storage#230 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-08 10:19:40 +08:00
congqixia	b6199acb05	enhance: Utilize `search_batch_pks` for `search_ids` of PkTerm (#43751 ) Related to #43660 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-07 14:19:40 +08:00
congqixia	d414f6bd4d	enhance: Add assertion preventing reload same field (#43736 ) Related to #43725 This patch adds an assertion preventing a segment from reloading the same field column. It also improves the message when the pk already exists. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-05 19:35:39 +08:00
yihao.dai	cb7be8885d	enhance: Deep copy arrow array (#43724 ) Deep copy arrow array and make a new RecordBatch with the copied array. issue: https://github.com/milvus-io/milvus/issues/43310 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-08-05 00:31:38 +08:00
Chun Han	d826d6ac91	fix: try to get span raw data for variable length data type(#43544 ) (#43705 ) related: #43544 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-08-04 11:15:38 +08:00
aoiasd	4f02b06abc	enhance: support set lindera dict build dir and download url in yaml (#43541 ) relate: https://github.com/milvus-io/milvus/issues/43120 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-04 09:47:38 +08:00
congqixia	4aff581007	enhance: Pass callback in search batch pks to avoid large result (#43695 ) Related to #43660 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-02 17:57:37 +08:00
Buqian Zheng	01baf582d5	fix: GroupChunkTranslator to correctly identify vector field (#43706 ) issue: #43653 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-08-02 00:49:37 +08:00
Bingyi Sun	b59bc5e2c0	fix: make json path index non exists offsets compatible with 2.5 (#43691 ) issue: https://github.com/milvus-io/milvus/issues/43666 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-08-01 23:22:23 +08:00
Buqian Zheng	b0226ef47c	fix: added more comprehensive container limit detection (#43693 ) issue: #41435 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-08-01 20:37:37 +08:00
Xianhui Lin	0f0edff7f0	fix: increment offset for null data rows in JsonKeyStats (#43679 ) fix: increment offset for null data rows in JsonKeyStatsInvertedIndex issue: https://github.com/milvus-io/milvus/issues/43151 Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-08-01 15:53:37 +08:00
congqixia	5f2f4eb3d6	enhance: Ignore entry with same ts when DeleteRecord search pks (#43669 ) Related to #43660 This patch drops unwanted offset & ts entries that have the same timestamp as the delete record. Under a large number of upserts, these false hits could greatly increase memory usage while applying deletes. The next step could be passing a callback to `search_pk_func_` to handle hit entries in a streaming fashion. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-01 10:15:36 +08:00
Ted Xu	e37cd19da2	enhance: enable storage v2 by default (#43652 ) Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-08-01 08:59:36 +08:00
zhagnlu	239f743a18	fix: add enable_mmap key to load config (#43672 ) #43670 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-31 21:35:37 +08:00
sparknack	4aabe23a45	enhance: update flat_hash_map.hpp to a modified version (#43506 ) issue: #41435 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-31 20:09:36 +08:00
congqixia	f29964bd17	fix: Add padding for sorted index preventing 0 length mmap (#43663 ) Related to #43655 This patch adds padding when writing the mmap file for ScalarSortedIndex, in case mmap fails due to a 0 mmap length. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-31 18:53:36 +08:00
zhagnlu	708e426bb3	enhance: using set element for string term type (#43049 ) issue: #43048 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-31 10:35:37 +08:00
zhagnlu	31801f5937	fix: fix pk in [..] skip next batch when using multi-chunk segment (#43618 ) #43494 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-07-31 10:15:37 +08:00
congqixia	089f02bcca	fix: [StorageV2] Align null bitmap offset for fixed-length datatype (#43654 ) Related to #43626 Similar to previous pr #43321, null bitmap could be dislocated if the bitset ptr does not count the offset of array Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-31 09:55:36 +08:00
congqixia	6a74a7de66	enhance: Make DeleteRecord search pks by batch and PinAll (#43640 ) Related to #43592 When delete records are large, searching pks one by one results in many `Pincells` calls, which create lots of futures. This patch makes pk search execute in batches to reduce this cost. It also adds the `GetAllChunks` API to utilize `PinAllCells` to reduce pins. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-30 19:15:36 +08:00
sthuang	a2c7ed2780	fix: [StorageV2] sort field binlogs paths for packed reader and writer (#43585 ) key changes: * fix unstable storage v2 compaction unit test by guaranteeing the order of paths during sync. * bump milvus-storage version, include https://github.com/milvus-io/milvus-storage/pull/222 https://github.com/milvus-io/milvus-storage/pull/223 https://github.com/milvus-io/milvus-storage/pull/224 https://github.com/milvus-io/milvus-storage/pull/225 https://github.com/milvus-io/milvus-storage/pull/226 * Also fix the below related oom issue. related: https://github.com/milvus-io/milvus/issues/43310 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-30 08:09:36 +08:00
congqixia	4fe55e3008	fix: [StorageV2] Use separate channel for `get_cells` (#43632 ) Related to #43584 There might be concurrent calls on `translator.get_cells`. The channel cannot be shared among these calls, otherwise the logic will break. This patch creates a new channel for each `get_cells` invocation to avoid a data race. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-29 20:59:38 +08:00
foxspy	d57890449f	enhance: update knowhere version (#43528 ) issue: #42937 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-29 17:21:36 +08:00
Buqian Zheng	052fb6c562	feat: add time based eviction to data managed by cachinglayer (#43490 ) issue: https://github.com/milvus-io/milvus/issues/41435 also added disk capacity protection --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-07-29 16:17:35 +08:00
Bingyi Sun	a765cd1eaa	enhance: unlink mmap file when chunk and index are destructed (#43524 ) issue: https://github.com/milvus-io/milvus/issues/41636 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-07-29 16:05:36 +08:00
congqixia	268f1cdace	fix: Hold field shared_ptr in case of being released (#43614 ) Related to #43584 Directly accessing `fields_` in `get_raw_data` may race if load vec index happens concurrently while getting raw data. This PR makes `bulk_subscript` hold a shared_ptr of the field column to prevent the field column from being released while it is read. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-29 12:15:36 +08:00
aoiasd	c9412434c8	enhance: add char group tokenizer (#42793 ) relate: https://github.com/milvus-io/milvus/issues/42792 Add char group tokenizer which supports using a custom char group or some built-in char groups as delimiters. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-29 11:11:35 +08:00
congqixia	f666d89919	fix: [StorageV2] Access future result to get exception if any (#43613 ) Related to #43584 When `LoadWithStrategy` throws an exception, the exception is wrapped in the returned future. If the future is not handled, this exception is ignored. This patch adds `future.get()` to get the exception, if any. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-07-28 22:33:35 +08:00
Spade A	864d1b93b1	enhance: enable stlsort with mmap support (#43359 ) issue: https://github.com/milvus-io/milvus/issues/43358 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-28 15:32:55 +08:00
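
Note on f666d89919 (#43613): the commit message says an exception thrown by `LoadWithStrategy` stays wrapped in the returned future and is ignored unless the future is read. The following is a minimal Python sketch of the same pitfall using `concurrent.futures`, not the Milvus C++ code. `load_with_strategy`, `load_ignoring_future`, and `load_checking_future` are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def load_with_strategy(fail):
    # Hypothetical stand-in for a loader that may fail.
    if fail:
        raise RuntimeError("load failed")
    return "loaded"


def load_ignoring_future(fail):
    # The exception is captured inside the future and never surfaces,
    # because nobody reads the future's result.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(load_with_strategy, fail)
    return "ok"


def load_checking_future(fail):
    # Reading the result re-raises any exception stored in the future,
    # analogous to calling future.get() in the commit.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(load_with_strategy, fail)
        return future.result()
```

With `fail=True`, the first function returns normally while the second raises `RuntimeError`, which is the behavior the fix relies on.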

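Note on f29964bd17 (#43663): the commit message describes padding the mmap file so that mapping never sees a 0 length. The sketch below reproduces the general problem in Python, where mapping an empty file raises `ValueError`. It is an illustration under assumptions, not the ScalarSortedIndex code: the one-byte `PADDING` value and the helper names `write_index_file` and `map_index_file` are hypothetical.

```python
import mmap

PADDING = b"\0"  # hypothetical pad; the real padding size is not stated in the commit


def write_index_file(path, payload, pad):
    # Write the payload and, optionally, trailing padding so the file is never empty.
    with open(path, "wb") as f:
        f.write(payload)
        if pad:
            f.write(PADDING)


def map_index_file(path):
    # Map the whole file read-only and return the mapped length.
    # mmap.mmap raises ValueError when the file is empty.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return len(m)
```

A reader of a padded file has to know the real payload length separately, since the mapped length now includes the padding.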

2114 Commits (master)