Commit Graph

1620 Commits (350dde666dc1d99f6d5a257b595b2be5d74b3fda)

Author SHA1 Message Date
yihao.dai 8cda48a96a
enhance: Use mmap.scalarIndex config for text index (#36400)
issue: https://github.com/milvus-io/milvus/issues/35273

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-24 12:21:13 +08:00
sre-ci-robot 167e4fb10d
[automated] Update Knowhere Commit (#36352)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-09-19 01:01:10 +08:00
Bingyi Sun 23b95aeba3
fix: remove element type check (#35828)
https://github.com/milvus-io/milvus/issues/36275
The Array's element type is not the same as the schema's: it is INT32 for
INT16 and INT8.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-09-18 11:37:10 +08:00
jaime 2ff3765058
enhance: catch std::stoi exception and improve error msg (#36267)
issue: #36255
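
As a rough illustration of the kind of handling this title describes (not the code from the PR): `std::stoi` throws `std::invalid_argument` when no conversion can be performed and `std::out_of_range` when the value does not fit in an `int`. The helper name `parse_int_checked` and the message wording below are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse `text` as an int. On failure, return false and
// fill `error` with a descriptive message instead of letting the exception
// escape to the caller.
bool parse_int_checked(const std::string& text, int& value, std::string& error) {
    try {
        std::size_t consumed = 0;
        int parsed = std::stoi(text, &consumed);
        if (consumed != text.size()) {
            // std::stoi stops at the first non-digit; treat leftovers as an error.
            error = "trailing characters after integer in \"" + text + "\"";
            return false;
        }
        value = parsed;
        return true;
    } catch (const std::invalid_argument&) {
        error = "not an integer: \"" + text + "\"";
    } catch (const std::out_of_range&) {
        error = "integer out of range: \"" + text + "\"";
    }
    return false;
}
```

Returning a message that quotes the offending input is one way to make the error easier to trace back to a bad config value or expression.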

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-09-14 16:17:08 +08:00
zhagnlu 489087d18b
enhance: refactor executor framework V2 (#35251)
#32636

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-13 20:57:09 +08:00
congqixia 58d3200986
enhance: Filter out non-hit delete records during load delta (#36207)
Related to #35303

This PR uses the pk index in the segment to exclude non-hit delete records
while loading delete records. This ability is crucial when the l0/delete
forward policy relies only on the segment itself (without BF filtering).

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-13 19:05:08 +08:00
Jiquan Long f0f2fb4cf0
enhance: span tracing of c++ part (#36205)
fix: https://github.com/milvus-io/milvus/issues/36204

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-13 11:19:09 +08:00
zhagnlu 5e5e87cc2f
enhance: rename some params and reduce default bitmapCardinalityLimit… (#36138)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-12 12:09:08 +08:00
Jiquan Long 89bf226f0b
feat: support keyword text match (#35923)
fix: #35922

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-10 15:11:08 +08:00
Bingyi Sun 53a8a24554
fix: fix empty indices of sparse float (#35403)
https://github.com/milvus-io/milvus/issues/35401

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-09-10 14:23:07 +08:00
congqixia 851f3b9883
fix: Make legacy non-lexicographic branch break switch (#36125)
Related to #35941
Previous PR: #36034

This patch corrects the switch branching logic and makes the unit
test work for cases that do not select the whole dataset.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-10 10:15:07 +08:00
congqixia 3123093dd7
enhance: Use `MARISA_LABEL_ORDER` when building trie index (#36034)
Related to #35941
Previous PR: #35943

This PR makes the `Trie` index use `MARISA_LABEL_ORDER`, which makes
predictive search iterate in lexicographic order.

When the trie index is built in label order, the lexicographic order can be
used to accelerate `Range` operations.

However, according to the official documentation, using `MARISA_LABEL_ORDER`
will make "exact match lookup, common prefix search, and predictive
search" slower.
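
To show why lexicographic iteration order helps range operations, here is a minimal sketch that does not use marisa at all: a plain `std::vector<std::string>` stands in for the keys a predictive search would yield. The function names and the `visited` counter are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Keys arrive in lexicographic order: the scan for [lower, upper) can stop
// at the first key that is >= upper, because nothing later can match.
std::size_t count_range_ordered(const std::vector<std::string>& keys,
                                const std::string& lower,
                                const std::string& upper,
                                std::size_t& visited) {
    std::size_t hits = 0;
    visited = 0;
    for (const auto& key : keys) {
        ++visited;
        if (key >= upper) break;
        if (key >= lower) ++hits;
    }
    return hits;
}

// Keys arrive in arbitrary order: every key has to be checked.
std::size_t count_range_unordered(const std::vector<std::string>& keys,
                                  const std::string& lower,
                                  const std::string& upper,
                                  std::size_t& visited) {
    std::size_t hits = 0;
    visited = 0;
    for (const auto& key : keys) {
        ++visited;
        if (key >= lower && key < upper) ++hits;
    }
    return hits;
}
```

Both functions return the same count for the same set of keys; the ordered variant simply visits fewer of them when the upper bound falls early in the key space.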

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-09 14:29:05 +08:00
congqixia a103dd5eb3
enhance: Fix SearchOnSealed clang-format lint (#36056)
Related to #36008

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-06 16:47:04 +08:00
smellthemoon 21b135c7c2
fix: not append valid data when transfer to insert record (#36027)
Fix valid data not being appended when transferring to the insert record,
and add a small check for the groupBy field.
#35924

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-09-06 14:53:04 +08:00
SimFG 5247631289
fix: fill the metric type field in the LoadMetaInfo object (#35962)
- issue: #35960

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-05 20:50:23 -07:00
Jiquan Long 11325d9ed5
fix: binary arith expression on inverted index (#35945)
issue: https://github.com/milvus-io/milvus/issues/35946

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-05 20:01:05 +08:00
cqy123456 560e8e70b0
enhance: reduce mmap_rss after chunkcache warmup (#35974)
related pr: https://github.com/milvus-io/milvus/pull/35965

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-09-05 18:07:05 +08:00
congqixia c61eea737b
enhance: Fix trace.cpp lint format issue (#36004)
Introduced by #35928

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-05 16:33:04 +08:00
congqixia 7b21032d19
fix: Check all values for `trie.predictive_search` (#35943)
Related to #35941

With the default behavior of marisa trie `predictive_search`, values are
not iterated in lexicographic order.

This PR is a brute-force fix that makes the range operator return correct
values.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-05 15:01:04 +08:00
congqixia 9e96ed4873
fix: Fix tracing config update logic (#35928)
Related to #35927

This PR addresses several issues:
- Use the `ResetTraceConfig` method instead of the init method in the update event handler
- Implement a dynamic stats.Handler to receive tracing config update events
- Update the `enable_trace` flag when `ResetTraceConfig` is invoked
- Change `enable_trace` to `std::atomic<bool>` to avoid a data race
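
The last two points can be sketched roughly as follows. This is not the Milvus implementation: only the flag name `enable_trace` comes from the commit message, while the `sketch` namespace and both function names are hypothetical.

```cpp
#include <atomic>

namespace sketch {

// A plain bool written by a config-update thread and read by worker threads
// would be a data race; std::atomic<bool> makes each access well-defined.
std::atomic<bool> enable_trace{true};

// Hypothetical stand-in for the reset path: called when a tracing config
// update arrives, possibly on a different thread from the readers.
void reset_trace_config(bool enabled) {
    enable_trace.store(enabled, std::memory_order_release);
}

// Hypothetical reader used on hot paths to decide whether to start a span.
bool tracing_enabled() {
    return enable_trace.load(std::memory_order_acquire);
}

}  // namespace sketch
```

The atomic only protects the flag itself; any other tracer state that is swapped during a reset would need its own synchronization.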

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-05 14:27:04 +08:00
Abdullah Ahmed cc02dc0a55
fix: Handle Input/Output Errors in vsnprintf and snprintf (#35898)
Fix for Issue: #35897
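
For context, `vsnprintf` returns a negative value on an output or encoding error, and a value greater than or equal to the buffer size when the output was truncated. A minimal sketch of checking both cases follows; the helper name `format_checked` and the 64-byte buffer are assumptions for illustration, not code from the PR.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format into `out` and report failure through the
// return value instead of ignoring what vsnprintf returns.
bool format_checked(std::string& out, const char* fmt, ...) {
    char buffer[64];
    va_list args;
    va_start(args, fmt);
    int written = std::vsnprintf(buffer, sizeof(buffer), fmt, args);
    va_end(args);

    if (written < 0) {
        return false;  // output or encoding error
    }
    if (static_cast<std::size_t>(written) >= sizeof(buffer)) {
        return false;  // output did not fit and was truncated
    }
    out.assign(buffer, static_cast<std::size_t>(written));
    return true;
}
```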
2024-09-04 08:15:04 +08:00
foxspy 9da86529a7
enhance: Add disk filemanager parallel load control to reduce the memory consumption (#35281)
issue: #35280 
add parallel control to limit the memory consumption during index file
loading

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-09-03 18:01:03 +08:00
Zhen Ye f68df9a11e
fix: SkipIndex cause segment fault (#35907)
issue: #35882

Signed-off-by: chyezh <chyezh@outlook.com>
2024-09-03 17:15:03 +08:00
zhagnlu 74048ce34f
fix: rename mmap file path to avoid directory conflict (#35810)
#35784

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-03 16:05:03 +08:00
Chun Han 4641fd9195
enhance: make search groupby stop when reaching topk groups (#35814)
related: #33544

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-09-02 18:25:03 +08:00
Zhen Ye b2eb9fe2a7
fix: memory leak in unittest and open the USE_ASAN option when build unittest (#35855)
issue: #35854

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-09-02 15:59:04 +08:00
cai.zhang 2c9bb4dfa3
feat: Support stats task to sort segment by PK (#35054)
issue: #33744 

This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-02 14:19:03 +08:00
zhagnlu 576ac2bbed
fix: Fix the reference to a variable after it has been moved (#35875)
#35607

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-09-02 10:05:02 +08:00
Jiquan Long 5ea2454fdf
feat: tantivy tokenizer binding (#35801)
fix: #35800

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-01 17:13:03 +08:00
zhagnlu 671112d17b
enhance: add more info to hybrid index log (#35808)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-29 21:07:04 +08:00
smellthemoon a3f2f044d6
fix: not set nullable when stream writer write headers (#35799)
#35802

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-29 20:59:00 +08:00
Patrick Weizhi Xu b3089b5bdc
feat: support range search pagination retains order (#35738)
issue: #35464

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-08-29 14:09:00 +08:00
smellthemoon b51b4a2838
fix: try get not exist file after upgrade (#35740)
https://github.com/milvus-io/milvus/issues/35741

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-29 11:09:01 +08:00
Zhen Ye 9b96841ae9
fix: wrong construction in evalctx (#35772)
issue: #35771

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-28 19:21:00 +08:00
Jiquan Long a52ba3d09d
enhance: allow many segments for inverted index (#35616)
fix: https://github.com/milvus-io/milvus/issues/35615

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-08-28 11:30:59 +08:00
Zhen Ye 98866205fa
fix: munmap deallocate too much memory (#35725)
issue: #35693

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-27 17:18:59 +08:00
zhagnlu 4d2f96c760
enhance: support bitmap mmap (#35399)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-27 16:34:59 +08:00
sre-ci-robot 6ddfd02f01
[automated] Update Knowhere Commit (#35688)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-08-26 01:04:57 +08:00
cai.zhang 615a653988
fix: Fix offset out of range for creating Trie index (#35553)
issue: #35550

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-08-25 15:50:57 +08:00
yihao.dai f2b83d316b
enhance: Support memory mode chunk cache (#35347)
Chunk cache supports loading raw vectors into memory.

issue: https://github.com/milvus-io/milvus/issues/35273

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-25 15:42:58 +08:00
zhagnlu 42f7800b5b
enhance: add bitmap offset cache to speed up retrieve raw data (#35498)
#35458

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-08-24 01:40:58 +08:00
Zhen Ye 75da36d1aa
enhance: enable asan for milvus (#35627)
issue: #35626

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-23 21:06:58 +08:00
Zhen Ye a773836b89
enhance: optimize milvus core building (#35610)
issue: #35549,#35611,#35633

- remove milvus_segcore milvus_indexbuilder..., add libmilvus_core
- core building only link once
- move opendal compilation into cmake
- fix odr

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-23 12:35:02 +08:00
zhagnlu 3107701fe8
enhance: optimize retrieve on dynamic field (#35580)
#35514

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
Co-authored-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-22 14:24:56 +08:00
presburger 024eccbde0
enhance: add pkg-config for knowhere (#35433)
Signed-off-by: yusheng.ma <yusheng.ma@zilliz.com>
2024-08-22 09:56:56 +08:00
congqixia 3491608256
fix: Match int8_t and int16_t in Array::get_data (#35579)
Related to #35578

Previously, the int16/int8 bitmap index might read an int32 array as int16,
which could build the index with half of the data (if the array is full) and
half zeros. This caused the BITMAP index to lose information.

This PR matches int8_t & int16_t in `get_data` when building the index.
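
The failure mode described above can be reproduced in isolation. The sketch below is not the Milvus `Array::get_data` code; both function names are hypothetical, and the exact garbage produced by the raw read depends on byte order (on a little-endian machine, `{1, 2, 3, 4}` comes back as `{1, 0, 2, 0}`).

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Wrong: treat int32 storage as an int16 buffer with the same element count.
// Only the first half of the bytes is read, so half of the values are lost
// and the rest are interleaved with the zero high halves.
std::vector<int16_t> read_int16_raw(const std::vector<int32_t>& stored) {
    std::vector<int16_t> out(stored.size());
    if (!out.empty()) {
        std::memcpy(out.data(), stored.data(), out.size() * sizeof(int16_t));
    }
    return out;
}

// Right: read each int32 element and narrow it to the schema's type.
std::vector<int16_t> read_int16_converted(const std::vector<int32_t>& stored) {
    std::vector<int16_t> out;
    out.reserve(stored.size());
    for (int32_t value : stored) {
        out.push_back(static_cast<int16_t>(value));
    }
    return out;
}
```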

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-20 16:10:56 +08:00
Chun Han 337e065902
fix: querynode hang when failing to allocate disk space for mmap(#35184) (#35187)
related: #35184

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-08-19 15:30:55 +08:00
smellthemoon 80dbe87759
enhance: support null value in index (#35238)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-08-16 15:30:54 +08:00
Buqian Zheng f4a91e135b
enhance: Allow empty sparse row (#34700)
issue: #29419

* If a sparse vector with no non-zero values is inserted, no ANN search on
this sparse vector field will return it as a result. The user may still
retrieve this row via a scalar query or an ANN search on another vector field.
* If the user uses an empty sparse vector as the query vector for an ANN
search, no neighbors will be returned.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-08-16 14:14:54 +08:00
Alexander Guzhva b896143965
enhance: Improve bitset performance for AVX512 (#35479)
see #35478

optimized functions take 20%+ less time to run

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2024-08-16 07:44:53 +08:00