milvus

Commit Graph

Author	SHA1	Message	Date
congqixia	cb7f2fa6fd	enhance: Use v2 package name for pkg module (#39990 ) Related to #39095 https://go.dev/doc/modules/version-numbers Update pkg version according to golang dep version convention --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-22 23:15:58 +08:00
Bingyi Sun	b59555057d	feat: support json index (#36750 ) https://github.com/milvus-io/milvus/issues/35528 This PR adds json index support for json and dynamic fields. Now you can only do unary query like 'a["b"] > 1' using this index. We will support more filter type later. basic usage: ``` collection.create_index("json_field", {"index_type": "INVERTED", "params": {"json_cast_type": DataType.STRING, "json_path": 'json_field["a"]["b"]'}}) ``` There are some limits to use this index: 1. If a record does not have the json path you specify, it will be ignored and there will not be an error. 2. If a value of the json path fails to be cast to the type you specify, it will be ignored and there will not be an error. 3. A specific json path can have only one json index. 4. If you try to create more than one json indexes for one json field, sdk(pymilvus<=2.4.7) may return immediately because of internal implementation. This will be fixed in a later version. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-15 14:06:15 +08:00
congqixia	b2d6dbf836	fix: Return early when skip load pk index (#39762 ) Previous PR #39437 only print log and add index while load operation is still executed. This PR return early when segment decides not to load PK index. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-11 11:04:50 +08:00
aoiasd	74890dabc9	enhance: skip load bm25 sparse row data (#39078 ) Skip load row data of bm25 sparse field, because it's forbidden to return. --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-02-06 18:40:44 +08:00
congqixia	fb5761af8d	enhance: Skip loading pk index for sorted segment in loader (#39437 ) Related to #39339 Previous PR #39389 only skips append index into segment Also related to #39428 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-20 18:17:10 +08:00
congqixia	57e5652f1a	enhance: Log error instead of panicking if load lock wait timeout (#39308 ) Related to #39205 Previous PR #39206 This PR change wait timeout behavior to log error and return to avoid making other collection read failure in only some collections have deadlock Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-16 02:31:02 +08:00
yihao.dai	ce41778fe6	enhance: Optimize GetLocalDiskSize and segment loader mutex (#38599 ) 1. Make the segment loader lock protect only the resource. 2. Optimize GetDiskUsage to avoid excessive overhead. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-15 15:45:01 +08:00
yihao.dai	ec2e77b5d7	enhance: Reduce memory usage of BF in DataNode and QueryNode (#38129 ) 1. DataNode: Skip generating BF during the insert phase (BF will be regenerated during the sync phase). 2. QueryNode: Skip generating or maintaining BF for growing segments; deletion checks will be handled in the segcore. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-15 01:59:01 +08:00
congqixia	d89768f9e0	enhance: Unify LoadStateLock RLock & PinIf (#39206 ) Related to #39205 This PR merge `RLock` & `PinIfNotReleased` into `PinIf` function preventing segment being released before any Read operation finished. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-14 18:38:59 +08:00
Zhen Ye	3e788f0fbd	enhance: record memory size (uncompressed) item for index (#38770 ) issue: #38715 - Current milvus use a serialized index size(compressed) for estimate resource for loading. - Add a new field `MemSize` (before compressing) for index to estimate resource. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-14 10:33:06 +08:00
Zhen Ye	bb8d1ab3bf	enhance: make new go package to manage proto (#39114 ) issue: #39095 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-10 10:49:01 +08:00
Bingyi Sun	c6a7f48123	enhance: Fix using wrong pool for warmup (#38941 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-03 17:50:54 +08:00
Bingyi Sun	aa0a87eda7	fix: Block warmup submit if pool full in sync mode (#38690 ) https://github.com/milvus-io/milvus/issues/38692 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-02 15:04:58 +08:00
Zhen Ye	5395ec19ad	enhance: add mem size for index file and fallback to multiply with serialized size (#38716 ) issue: #38715 Signed-off-by: chyezh <chyezh@outlook.com>	2024-12-25 14:46:49 +08:00
congqixia	618f0cb728	enhance: Put release segment and other misc cgo call into pool (#38186 ) Related to #30273 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-05 11:04:40 +08:00
Zhen Ye	c6dcef7b84	enhance: move segcore codes of segment into one package (#37722 ) issue: #33285 - move most cgo opeartions related to search/query into segcore package for reusing for streamingnode. - add go unittest for segcore operations. Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-29 10:22:36 +08:00
congqixia	1ed686783f	enhance: Use `PrimaryKeys` to replace interface slice for segment delete (#37880 ) Related to #35303 Reduce temporary memory usage for PK interface for segment delete. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-22 11:52:33 +08:00
congqixia	92e6ee6285	enhance: Use load pool for `CreateTextIndex` (#37898 ) Related to #37895 Only resolves the starving issue which caused goroutine leakage Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-22 10:06:33 +08:00
congqixia	ee54a98578	enhance: Add cgo call metrics for load/write API (#37405 ) Cgo API cost is not observerable since not metrics is related to them. This PR add metrics for some sync cgo call related to load & write --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-07 10:06:25 +08:00
congqixia	3106384fc4	enhance: Return deltadata for `DeleteCodec.Deserialize` (#37214 ) Related to #35303 #30404 This PR change return type of `DeleteCodec.Deserialize` from `storage.DeleteData` to `DeltaData`, which reduces the memory usage of interface header. Also refine `storage.DeltaData` methods to make it easier to usage. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-10-29 12:04:24 +08:00
congqixia	7774b7275e	enhance: Replace PrimaryKey slice with PrimaryKeys saving memory (#37127 ) Related to #35303 Slice of `storage.PrimaryKey` will have extra interface cost for each element, which may cause notable memory usage when delta row count number is large. This PR replaces PrimaryKey slice with PrimaryKeys interface saving the extra interface cost. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-10-28 10:29:30 +08:00
foxspy	d7b2ffe5aa	enhance: add an unify vector index config checker (#36844 ) issue: #34298 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-10-28 10:11:37 +08:00
foxspy	3de57ec4fa	enhance: add vector index mgr to remove vector index type dependency (#36843 ) issue: #34298 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-10-17 22:15:25 +08:00
Bingyi Sun	a75bb85f3a	feat: support chunked column for sealed segment (#35764 ) This PR splits sealed segment to chunked data to avoid unnecessary memory copy and save memory usage when loading segments so that loading can be accelerated. To support rollback to previous version, we add an option `multipleChunkedEnable` which is false by default. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-12 15:04:52 +08:00
aoiasd	db34572c56	feat: support load and query with bm25 metric (#36071 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-10-11 10:23:20 +08:00
SimFG	130a923dec	enhance: the estimate method when loading the collection (#36307 ) - issue: #36530 --------- Signed-off-by: SimFG <bang.fu@zilliz.com> Signed-off-by: xianliang.li <xianliang.li@zilliz.com> Co-authored-by: xianliang.li <xianliang.li@zilliz.com>	2024-10-09 17:35:19 +08:00
cai.zhang	ecb2b242e2	enhance: Add sorted for segment info (#36469 ) issue: #33744 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-09-30 10:01:16 +08:00
yihao.dai	763fd0dfc5	enhance: Use a separate mmap config for chunk cache (#36276 ) issue: https://github.com/milvus-io/milvus/issues/35273 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-09-15 16:23:09 +08:00
Jiquan Long	89bf226f0b	feat: support keyword text match (#35923 ) fix: #35922 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-09-10 15:11:08 +08:00
zhagnlu	208c8a2328	fix:support config index offsetcache and fix create same index again (#35985 ) #35971 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-09-08 18:23:05 +08:00
cai.zhang	2c9bb4dfa3	feat: Support stats task to sort segment by PK (#35054 ) issue: #33744 This PR includes the following changes: 1. Added a new task type to the task scheduler in datacoord: stats task, which sorts segments by primary key. 2. Implemented segment sorting in indexnode. 3. Added a new field `FieldStatsLog` to SegmentInfo to store token index information. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-09-02 14:19:03 +08:00
yihao.dai	f2b83d316b	enhance: Support memory mode chunk cache (#35347 ) Chunk cache supports loading raw vectors into memory. issue: https://github.com/milvus-io/milvus/issues/35273 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-08-25 15:42:58 +08:00
Zhen Ye	a773836b89	enhance: optimize milvus core building (#35610 ) issue: #35549,#35611,#35633 - remove milvus_segcore milvus_indexbuilder..., add libmilvus_core - core building only link once - move opendal compilation into cmake - fix odr --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-08-23 12:35:02 +08:00
SimFG	731d45abbe	enhance: provide more general configuration to control mmap behavior (#35359 ) - issue: #35273 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-08-21 00:22:54 +08:00
zhagnlu	4b553b0333	enhance: revert remove duplicated pk function (#35103 ) issue: #34778 Revert "fix: fix query count(*) concurrently" Revert "enhance: mark duplicated pk as deleted " Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-08-05 10:48:17 +08:00
zhenshan.cao	aa247f192d	enhance: remove unused code for StorageV2 (#35132 ) issue: https://github.com/milvus-io/milvus/issues/34168 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-08-01 12:08:13 +08:00
congqixia	de8a266d8a	enhance: Enable linux code checker (#35084 ) See also #34483 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-30 15:53:51 +08:00
wei liu	c45f38aa61	enhance: Update protobuf-go to protobuf-go v2 (#34394 ) issue: #34252 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-07-29 11:31:51 +08:00
zhagnlu	804dd5409a	enhance: mark duplicated pk as deleted (#34586 ) fix #34247 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-07-16 14:25:39 +08:00
chyezh	259a682673	enhance: async search and retrieve in cgo (#33228 ) issue: #30926, #33132 related pr: #33133 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-22 09:38:02 +08:00
wei liu	ab93d9c23d	enhance: Use BatchPkExist to reduce bloom filter func call cost (#33611 ) issue:#33610 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-06-13 17:57:56 +08:00
chyezh	8ca5ced821	fix: async warmup will be blocked by state lock (#33686 ) issue: #33685 Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-10 21:59:53 +08:00
wei liu	c6a1c49e02	enhance: Use Blocked Bloom Filter instead of basic bloom fitler impl. (#33405 ) issue: #32995 To speed up the construction and querying of Bloom filters, we chose a blocked Bloom filter instead of a basic Bloom filter implementation. WARN: This PR is compatible with old version bf impl, but if fall back to old milvus version, it may causes bloom filter deserialize failed. In single Bloom filter test cases with a capacity of 1,000,000 and a false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times faster than the basic Bloom filter in both querying and construction, at the cost of a 30% increase in memory usage. - Block BF construct time {"time": "54.128131ms"} - Block BF size {"size": 3021578} - Block BF Test cost {"time": "55.407352ms"} - Basic BF construct time {"time": "210.262183ms"} - Basic BF size {"size": 2396308} - Basic BF Test cost {"time": "192.596229ms"} In multi Bloom filter test cases with a capacity of 100,000, an FPR of 0.001, and 100 Bloom filters, we reuse the primary key locations for all Bloom filters to avoid repeated hash computations. As a result, the blocked Bloom filter is also 5 times faster than the basic Bloom filter in querying. - Block BF TestLocation cost {"time": "529.97183ms"} - Basic BF TestLocation cost {"time": "3.197430181s"} --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-31 17:49:45 +08:00
Jiquan Long	0c5d8660aa	feat: support inverted index for array (#33452 ) issue: https://github.com/milvus-io/milvus/issues/27704 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-05-31 09:47:47 +08:00
jaime	0d3272ed6d	enhance: refine logs of cgo pool (#33373 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-05-27 19:06:11 +08:00
jaime	58ee613fea	enhance: remove repeated stats of loaded entity (#33255 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-05-27 01:49:41 +08:00
yihao.dai	760223f80a	fix: use seperate warmup pool and disable warmup by default (#33348 ) 1. use a small warmup pool to reduce the impact of warmup 2. change the warmup pool to nonblocking mode 3. disable warmup by default 4. remove the maximum size limit of 16 for the load pool issue: https://github.com/milvus-io/milvus/issues/32772 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Co-authored-by: xiaofanluan <xiaofan.luan@zilliz.com>	2024-05-27 01:25:40 +08:00
Bingyi Sun	0f8c6f49ff	enhance: mmap load raw data if scalar index does not have raw data (#33175 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-05-21 11:53:39 +08:00
cai.zhang	6ea7633bd5	enhance: Add memory size for binlog (#33025 ) issue: #33005 1. add `MemorySize` field for insert binlog. 2. `LogSize` means the file size in the storage object. 3. `MemorySize` means the size of the data in the memory. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2024-05-15 12:59:34 +08:00
chyezh	96489b814d	fix: remove busy log (#33042 ) issue: #32963 Signed-off-by: chyezh <chyezh@outlook.com>	2024-05-14 14:20:32 +08:00

1 2 3

128 Commits (eb046863485fdf3e130fc60484485c901b81276b)