1932 Commits (2.5)

Author SHA1 Message Date
cai.zhang 5b8288a0ef
enhance: Refine geometry cache with offsets (#44432)
issue: #43427

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-18 20:24:02 +08:00
cai.zhang 124a1b3ce4
fix: Fix geometry bugs and add cache for create Geometry (#44376)
issue: #44102, #44079, #44075

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-17 15:24:03 +08:00
cai.zhang 7ef76058d5
enhance: Support gis filter operator st_dwithin (#44392)
issue: #43427

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-16 22:44:03 +08:00
congqixia 02d12619e2
fix: [2.5] Update 2.5 branch format (#44096)
Cherry-pick from master
pr: #44077
Related to #44076

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-15 11:16:00 +08:00
aoiasd cb0bb7b31f
enhance: [2.5] forbid panic when tantivy index path not exist (#44136)
pr: https://github.com/milvus-io/milvus/pull/44135

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-15 10:30:00 +08:00
cqy123456 ec4442d39b
enhance: update knowhere version (#44292)
issue: https://github.com/milvus-io/milvus/issues/42937 
master pr:https://github.com/milvus-io/milvus/pull/44294

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-11 15:05:58 +08:00
cai.zhang 877e68f851
enhance: Support R-Tree index for geometry datatype (#44069)
issue: #43427
pr: #37417

Support R-Tree index for geometry datatype.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
2025-09-11 14:19:58 +08:00
zhagnlu 802026569d
enhance:add param to modify delete snapshot size (#44213)
pr: #44215

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-05 14:31:56 +08:00
cqy123456 c17ce3cf90
enhance:[2.5]minhash support and add autoindex config (#44015)
master pr: https://github.com/milvus-io/milvus/pull/44186

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-03 17:39:54 +08:00
zhagnlu 4ff9e49a99
fix:expand lock range for dump_snapshot (#44131)
cherry-pick from pr: #44130

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-01 16:25:54 +08:00
ZhuXi cd931a0388
feat:Geospatial Data Type and GIS Function support for milvus (#43661)
issue: #43427
pr: #37417

The main goal of this PR is to merge #37417 into Milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to display geospatial data
5. Support using GIS functions like ST_EQUALS in query

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Currently, only brute-force search is
supported.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.
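
The WKT-to-WKB conversion described in the data pipeline step can be illustrated with a minimal sketch. This is not the Milvus proxy code (which relies on go-geom and GDAL); the helpers `point_wkt_to_wkb` and `point_wkb_to_wkt` are hypothetical and handle only a 2D `POINT`, using the standard WKB layout of a byte-order flag, a geometry type, and two float64 coordinates.

```python
import re
import struct

# Hypothetical helpers for illustration only; not part of Milvus or pymilvus.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*([-+0-9.eE]+)\s+([-+0-9.eE]+)\s*\)\s*$", re.IGNORECASE
)


def point_wkt_to_wkb(wkt: str) -> bytes:
    """Convert a 2D POINT in WKT text form to little-endian WKB bytes."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"unsupported or malformed WKT: {wkt!r}")
    x, y = float(match.group(1)), float(match.group(2))
    # Layout: 1 byte order flag (1 = little-endian), uint32 geometry type
    # (1 = Point), then the two coordinates as float64 values.
    return struct.pack("<BIdd", 1, 1, x, y)


def point_wkb_to_wkt(wkb: bytes) -> str:
    """Convert little-endian 2D Point WKB back to WKT text."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", wkb)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("only little-endian 2D Point WKB is handled here")
    return f"POINT ({x:g} {y:g})"
```

In this sketch the client-facing side would work with the text form, while everything downstream of the conversion would carry the 21-byte binary form.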

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: cai.zhang <cai.zhang@zilliz.com>
2025-08-26 19:11:55 +08:00
zhagnlu 6c29689ca2
enhance: support expr result cache (#43882)
cherry-pick from pr: #43923

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-26 11:19:57 +08:00
cqy123456 a1ff6c89be
enhance:[2.5] Make build ratio of interim index configurable (#43938)
issue: https://github.com/milvus-io/milvus/issues/43993
master pr: https://github.com/milvus-io/milvus/pull/43939

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-08-25 16:01:52 +08:00
Alexander Guzhva 5903f049fb
enhance: Fix ArithHelperI64 for SVE in bitset (#43953)
pr: #43952

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-08-19 22:49:43 +08:00
Alexander Guzhva 84b7ec880d
enhance: remove duplicate code in ArithHelperF32 in SVE for bitset (#43951)
pr: #43950

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-08-19 22:38:44 +08:00
liliu-z bd9fd42310
enhance: Fix template declaration order for ArithHelperF32 in SVE implementation (#43948)
pr: #43949

Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-08-19 22:00:39 +08:00
liliu-z a6bfa25054
enhance: Cp sve support for bitset (#43928)
pr: #43833

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
Signed-off-by: Li Liu <li.liu@zilliz.com>
Co-authored-by: Alexander Guzhva <alexanderguzhva@gmail.com>
2025-08-19 16:33:47 +08:00
sparknack b57d104742
enhance: [2.5] add write rate limit for disk file writer (#43856)
issue: https://github.com/milvus-io/milvus/issues/43040
pr: #43912

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-18 23:33:46 +08:00
Bingyi Sun 26883919de
fix: Fix wrong null offsets for json path index (#43823)
pr: #43390
issue: https://github.com/milvus-io/milvus/issues/43315

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-08-18 14:47:46 +08:00
yihao.dai 1644d0b288
enhance: [2.5] Improve error message when query vector and dim mismatch (#43836)
/kind improvement

pr: https://github.com/milvus-io/milvus/pull/43835

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-18 10:23:45 +08:00
zhagnlu 6d86aade6c
fix: fix delete consumer bug for concurrency R-W (#43831) (#43855)
pr: #43831

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-14 10:19:43 +08:00
sparknack 4d944aecf7
enhance: add disk file writer with Direct IO support (#43692)
issue: #43040
pr: #42665 

This patch introduces a disk file writer that supports Direct IO.

Currently, it is exclusively utilized during the QueryNode load process.

Its parameters are listed below:

1. `common.diskWriteMode` This parameter controls the write mode of the
local disk, which is used to write temporary data downloaded from remote
storage. Currently, only QueryNode uses 'common.diskWrite*' parameters.
Support for other components will be added in the future.
The options include 'direct' and 'buffered'. The default value is
'buffered'.

2. `common.diskWriteBufferSizeKb` Disk write buffer size in KB, only
used when disk write mode is 'direct', default is 64KB.
Current valid range is [4, 65536]. If the value is not aligned to 4KB,
it will be rounded up to the nearest multiple of 4KB.

3. `common.diskWriteNumThreads` This parameter controls the number of
writer threads used for disk write operations. The valid range is [0,
hardware_concurrency]. It is designed to limit the maximum concurrency
of disk write operations to reduce the impact on disk read performance.
For example, if you want to limit the maximum concurrency of disk write
operations to 1, you can set this parameter to 1.
The default value is 0, which means the caller will perform write
operations directly without using an additional writer thread pool. In
this case, the maximum concurrency of disk write operations is
determined by the caller's thread pool size.

Both parameters can be updated during runtime.
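
The rounding rule for `common.diskWriteBufferSizeKb` can be sketched as follows. The function `normalize_buffer_size_kb` is hypothetical and not taken from the Milvus source; in particular, rejecting out-of-range values with an exception is an assumption, since the description above only states the valid range.

```python
BLOCK_KB = 4                 # alignment unit stated in the description
MIN_KB, MAX_KB = 4, 65536    # valid range stated in the description


def normalize_buffer_size_kb(requested_kb: int) -> int:
    """Round a buffer size up to the nearest multiple of 4 KB.

    Raising on out-of-range input is an assumption of this sketch.
    """
    if not MIN_KB <= requested_kb <= MAX_KB:
        raise ValueError(f"buffer size must be in [{MIN_KB}, {MAX_KB}] KB")
    # Ceiling division, then scale back up to a multiple of BLOCK_KB.
    return -(-requested_kb // BLOCK_KB) * BLOCK_KB
```

Because the upper bound 65536 is itself a multiple of 4, rounding up never pushes a valid value out of range.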

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-08 12:13:41 +08:00
Spade A a3c5e2e3c3
feat: support phrase match query for 2.5 (#43716)
pr: https://github.com/milvus-io/milvus/pull/38869
issue: https://github.com/milvus-io/milvus/issues/38930

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-08 11:35:41 +08:00
aoiasd 305524f99a
fix: jieba tokenizer cause panic when dict word was empty string (#43337) (#43718)
pr: https://github.com/milvus-io/milvus/pull/43337
relate: https://github.com/milvus-io/milvus/issues/42779

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-04 20:57:48 +08:00
Chun Han f033294dc1
fix: try to get span raw data for variable length data type(#43544) (#43703)
related: #43544
pr: https://github.com/milvus-io/milvus/pull/43705

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-04 11:25:39 +08:00
Xianhui Lin e44df1c583
fix: increment offset for invalid data rows in JsonKeyStatsInvertedIndex (#43688)
fix: increment offset for null data rows in JsonKeyStatsInvertedIndex
issue: https://github.com/milvus-io/milvus/issues/43151
pr:https://github.com/milvus-io/milvus/pull/43679

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-08-03 13:11:38 +08:00
Bingyi Sun cc21855174
enhance: unlink mmap file when chunk and index are destructed (#43546)
pr: https://github.com/milvus-io/milvus/pull/43524
issue: https://github.com/milvus-io/milvus/issues/41636

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-08-01 11:17:37 +08:00
zhagnlu ea7307747a
fix: fix pk in [..] skip next batch when using multi-chunk segment (#43619)
pr: #43618

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 16:59:37 +08:00
zhagnlu 4b8e8bd9fd
enhance: using set element for string term type (#43393)
pr: #43049

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 10:43:38 +08:00
foxspy bb528ba065
enhance: [2.5]update knowhere version (#43623)
issue: #42937 
related: #43528 https://github.com/zilliztech/knowhere/pull/1278
pr: #43528

Signed-off-by: xianliang <xianliang.li@zilliz.com>
2025-07-29 17:27:38 +08:00
Chun Han a8c28d174f
fix: fail to get string views due to chunk bound empty loop(#41300) (#43482)
related: #41300
pr: https://github.com/milvus-io/milvus/pull/41452

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-25 10:16:54 +08:00
aoiasd fe39128021
enhance: update lindera version (#43457)
relate: https://github.com/milvus-io/milvus/issues/43120
pr: https://github.com/milvus-io/milvus/pull/43121

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-22 19:56:53 +08:00
Chun Han ebb1ff35bb
fix: refine judgement for batch views(#38736) (#43479)
related: #38736
pr: https://github.com/milvus-io/milvus/pull/43481

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-22 14:54:53 +08:00
congqixia b657c076a4
fix: "fix: [2.5] Align null bitmap offset when loading multi-chunk (#43342)" (#43411)
Related to #43389
alternative of #43395

This reverts commit e54d92447c.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-18 17:14:52 +08:00
Spade A ecf011aeb1
fix: update tantivy for fixing dir removing race condition #43399 (#43401)
pr: https://github.com/milvus-io/milvus/pull/43399
issue: https://github.com/milvus-io/milvus/issues/43258

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-18 16:56:52 +08:00
foxspy a6b2284de6
enhance: [2.5]update knowhere version (#43398)
issue: #42937 
/kind branch-feature

Signed-off-by: xianliang <xianliang.li@zilliz.com>
2025-07-18 16:20:52 +08:00
congqixia e54d92447c
fix: [2.5] Align null bitmap offset when loading multi-chunk (#43321) (#43342)
Cherry-pick from master
pr: #43321
Related to #43262

This patch fixes the following logic bugs:
- When multiple chunks are loaded and a chunk's size is not divisible by
8, simply appending uint8_t bytes as the bitmap misaligns the null bitmap
- `null_bitmap_data()` points to the start of the whole row group, which
may not correspond to the current `arrow::Array`

The current solution is:
- Reorganize the null_bitmap with the correct size & offset
- Pass `array->offset()` in the tuple to convey the current offset
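
The misalignment can be reproduced with a small bit-level sketch. It is not the Milvus C++ code: `merge_validity` and `naive_concat` are hypothetical helpers, and each chunk is modelled as a `(bitmap, offset, length)` tuple with LSB-first bit order, as in Arrow validity bitmaps.

```python
def get_bit(bitmap: bytes, index: int) -> bool:
    """Read bit `index` from an LSB-first packed bitmap."""
    return bool((bitmap[index // 8] >> (index % 8)) & 1)


def pack_bits(bits) -> bytes:
    """Pack a sequence of booleans into an LSB-first bitmap."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def merge_validity(chunks) -> bytes:
    """Merge (bitmap, offset, length) chunks bit by bit, honoring offsets."""
    merged = []
    for bitmap, offset, length in chunks:
        merged.extend(get_bit(bitmap, offset + i) for i in range(length))
    return pack_bits(merged)


def naive_concat(chunks) -> bytes:
    """Byte-wise append: wrong whenever a chunk length is not a multiple of 8."""
    return b"".join(bitmap for bitmap, _, _ in chunks)


# Chunk A: 5 valid rows. Chunk B: 3 rows, the first of which is null.
chunks = [(b"\x1f", 0, 5), (b"\x06", 0, 3)]
```

With these two chunks, the naive result places chunk B's bits at rows 8 to 10 instead of rows 5 to 7, so row 6 is read as null even though it is valid; the bit-level merge keeps all eight rows in a single byte.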

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-17 12:16:52 +08:00
zhagnlu 385112e7e3
fix:fix text_match bug because of not adapting to multi-chunk model (#43297)
pr: #43303

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-17 11:22:53 +08:00
Alexander Guzhva 71c0f64a16
fix: [2.5] fix incorrect bitset for the division comparison when the right is < 0 (#43180)
issue: #42900 
pr: #43179 

Upd: also handles Inf and NaN values, and the division by zero case for
fp32 and fp64
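
One way a negative right operand can break a division comparison is sketched below. This is an assumption about the general class of bug, not a description of the actual bitset code: if `x / d > v` is evaluated as `x > v * d`, the inequality must flip when `d` is negative. Both function names are hypothetical, and Inf, NaN, and division by zero are deliberately left out of the sketch.

```python
def div_greater_than(x: float, divisor: float, value: float) -> bool:
    """Evaluate (x / divisor) > value by comparing x against value * divisor."""
    if divisor == 0:
        raise ValueError("division by zero is outside the scope of this sketch")
    rhs = value * divisor
    # Multiplying both sides of an inequality by a negative number flips it.
    return x < rhs if divisor < 0 else x > rhs


def naive_div_greater_than(x: float, divisor: float, value: float) -> bool:
    """Same rewrite without the sign check; wrong for negative divisors."""
    return x > value * divisor
```

For `x = 6`, `divisor = -2`, `value = -4`, the true answer is that `-3 > -4` holds; the sign-aware version agrees, while the naive rewrite compares `6 > 8` and reports false.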

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-07-16 15:36:52 +08:00
Spade A 6ccf1aa9b8
fix: avoid copy when getting json chunk #43183 (#43202)
master https://github.com/milvus-io/milvus/pull/43183
fix: https://github.com/milvus-io/milvus/issues/43182

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-16 15:34:51 +08:00
Spade A 21d6866a56
fix: fix text match index / json key stats index leak when segment released (#43308)
pr: https://github.com/milvus-io/milvus/pull/42655
issue: https://github.com/milvus-io/milvus/issues/42626

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-15 15:22:18 +08:00
Chun Han 4fe8011a70
enhance: refine variable-length-type memory usage(#38736) (#43093)
related: #38736
pr: https://github.com/milvus-io/milvus/pull/39578

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-04 18:26:45 +08:00
Chun Han bfa9688da3
enhance: supporting separate chunk cache pool(#42803) (#42901)
related: #42803

1. Add a new set of thread pools based on folly::CPUThreadPoolExecutor,
named FThreadPools
2. Reading vectors from the chunk cache uses the separate
CHUNKCACHE_POOL to avoid being affected by collection loading
3. Note: for safety on the cloud side in 2.5.x, only read-chunk-cache
operations use the newly created thread pools; other thread pool call
sites will be migrated in the near future
4. The master branch doesn't need this PR, as the caching layer has
unified the chunk cache behaviour
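
The idea of isolating chunk cache reads from load work can be sketched with two independent executors. This is an analogy in Python, not the folly-based implementation; the pool sizes and the functions `load_collection` and `read_vectors_from_chunk_cache` are hypothetical.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor

# Two independent pools: a backlog in one cannot starve the other.
# Sizes are arbitrary values chosen for this sketch.
load_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="load")
chunk_cache_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="chunkcache")


def current_pool_name() -> str:
    """Return the pool prefix of the worker thread running the task."""
    return threading.current_thread().name.split("_")[0]


def load_collection(name: str) -> Future:
    """Submit (simulated) load work to the load pool."""
    return load_pool.submit(lambda: (current_pool_name(), name))


def read_vectors_from_chunk_cache(ids) -> Future:
    """Submit (simulated) chunk cache reads to their dedicated pool."""
    return chunk_cache_pool.submit(lambda: (current_pool_name(), list(ids)))
```

Each future resolves to the name of the pool that ran it, which makes it easy to confirm that reads never queue behind load tasks.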

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-06-26 15:52:43 +08:00
foxspy 47f265864b
enhance: [2.5] update knowhere version for milvus 2.5.14 (#42939)
issue: #42937 
pr: #42938

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-06-25 21:06:43 +08:00
zhagnlu fe05970eba
fix is_not_in for trie index (#42886)
pr: #42716

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-25 17:30:43 +08:00
Spade A 017eb9ffe2
fix: fix some bugs discovered by chaos tests --- cherry pick (#42909)
master: https://github.com/milvus-io/milvus/pull/42906
issue: https://github.com/milvus-io/milvus/issues/42870

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-24 17:10:42 +08:00
Bingyi Sun 1a1d46d695
fix: Remove cached null expr result (#42783)
pr: https://github.com/milvus-io/milvus/pull/42818
issue: #42698
cached result may be changed in caller so there is no need to cache it

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-23 16:02:42 +08:00
Spade A 79ffc17bfe
enhance: optimize tantivy cargo config cherry-pick (#42881)
master: https://github.com/milvus-io/milvus/pull/42880
issue: https://github.com/milvus-io/milvus/issues/42879

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-20 16:18:11 +08:00
Spade A b837fb2d1e
fix: update tantivy for fixing threads explosion in file watcher (#42828)
master pr: #41450

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-18 15:40:40 +08:00
Chun Han c403f01e6d
enhance: avoid using threadpool when column is ready in chunk cache(#42803) (#42804)
related: #42803

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-06-18 02:44:40 +08:00