21053 Commits (bump_milvus_commit_7dd6651124e87ee6d211199c5b4cd11f96efe0c0)

Author SHA1 Message Date
github-actions[bot] 9cdb5e7010 Bump milvus version to v2.4.14 2024-10-29 09:14:35 +00:00
congqixia 7dd6651124
fix: [2.5] Ref collection meta when load l0 segment meta only ()
Cherry-pick from master
pr: 
Related to 

Previous PR 

Collection meta is not ref-ed when loading L0 segments under the
`RemoteLoad` policy, which causes the collection meta to be released
when many L0 segments are released.
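
A minimal Python sketch of the ref-counting issue described above, under stated assumptions: `CollectionManager`, `ref`, `unref` and `simulate` are hypothetical names for illustration, not the querynode API. It shows how releasing segments that never took a reference can drop collection meta that another segment still needs.

```python
from collections import defaultdict


class CollectionManager:
    """Hypothetical ref-counted store of collection meta (illustration only)."""

    def __init__(self):
        self._meta = {}                # collection_id -> meta
        self._refs = defaultdict(int)  # collection_id -> reference count

    def put(self, collection_id, meta):
        self._meta[collection_id] = meta

    def ref(self, collection_id):
        self._refs[collection_id] += 1

    def unref(self, collection_id):
        self._refs[collection_id] -= 1
        if self._refs[collection_id] <= 0:
            # Last reference gone: drop the meta.
            self._refs.pop(collection_id, None)
            self._meta.pop(collection_id, None)

    def has(self, collection_id):
        return collection_id in self._meta


def simulate(ref_on_l0_load):
    """Load and release two L0 segments while one sealed segment stays loaded."""
    mgr = CollectionManager()
    mgr.put(1, {"name": "demo"})
    mgr.ref(1)                  # the sealed segment holds one reference
    for _ in range(2):          # load two L0 segments
        if ref_on_l0_load:
            mgr.ref(1)
    for _ in range(2):          # release both L0 segments
        mgr.unref(1)
    return mgr.has(1)           # is the meta still there for the sealed segment?
```

With `ref_on_l0_load=False` the meta is gone after the releases even though the sealed segment is still loaded; with `True` it survives.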

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 16:38:22 +08:00
smellthemoon 86b9c3ef4a
fix: to just check null in group by field only ()


Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-29 15:38:30 +08:00
Zhen Ye 889434691c
enhance: enable asan for e2e ()
issue: 

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-10-29 14:14:24 +08:00
congqixia 3106384fc4
enhance: Return deltadata for `DeleteCodec.Deserialize` ()
Related to  

This PR changes the return type of `DeleteCodec.Deserialize` from
`storage.DeleteData` to `DeltaData`, which
reduces the memory used by interface headers.

It also refines the `storage.DeltaData` methods to make them easier to use.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 12:04:24 +08:00
congqixia 0f59bfdf30
enhance: Use middleware to observe restful v2 in/out rpc stats ()
Related to 

Previous PR  added a grpc interceptor to observe rpc stats. Using the same
strategy, this PR adds a gin middleware to observe restful v2 rpc stats.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 11:22:24 +08:00
congqixia 5a0135727d
fix: Check resource when loading deltalogs ()
Related to 

The `LoadDeltaLogs` API did not check memory usage. When the system is
under heavy delete load, this could result in an OOM exit.

This PR adds a resource check for `LoadDeltaLogs` actions and separates
the internal deltalog loading function from the public one.
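
The split between a checked public entry point and an unchecked internal loader could look roughly like the Python sketch below. The 0.9 threshold, the dictionary shape of a delta log and every function name here are assumptions for illustration, not the querynode implementation.

```python
class InsufficientMemoryError(RuntimeError):
    """Raised when a load would exceed the assumed memory budget."""


def check_memory(required_bytes, used_bytes, total_bytes, threshold=0.9):
    # Refuse the load if predicted usage passes threshold * total (assumed policy).
    if used_bytes + required_bytes > total_bytes * threshold:
        raise InsufficientMemoryError(
            f"need {required_bytes} bytes, {used_bytes} of {total_bytes} in use"
        )


def _load_delta_logs(delta_logs):
    # Internal loader: performs no resource check, callers must have done it.
    return [log["name"] for log in delta_logs]


def load_delta_logs(delta_logs, used_bytes, total_bytes):
    # Public entry point: estimate the size, check resources, then delegate.
    required = sum(log["size"] for log in delta_logs)
    check_memory(required, used_bytes, total_bytes)
    return _load_delta_logs(delta_logs)
```

Keeping the check in the public function lets internal callers that have already reserved resources reuse `_load_delta_logs` without checking twice.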

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 10:04:25 +08:00
congqixia 224d797f94
fix: Use singleton delete pool and avoid goroutine leakage ()
Related to 

Previously, creating a new pool per request caused goroutine
leakage. This PR changes this behavior by using a singleton delete pool.
This change also provides better concurrency control over delete
memory usage.
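
The singleton-pool idea can be sketched in Python with a lazily created, shared `ThreadPoolExecutor`. This is only an analogy for the goroutine pool mentioned above; `get_delete_pool`, `handle_delete` and the worker count of 4 are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_pool = None
_pool_lock = threading.Lock()


def get_delete_pool(max_workers=4):
    """Return the one shared pool, creating it on first use."""
    global _pool
    with _pool_lock:
        if _pool is None:
            _pool = ThreadPoolExecutor(max_workers=max_workers)
        return _pool


def handle_delete(batches):
    # Every request reuses the same pool, so worker threads are bounded by
    # max_workers instead of growing with the number of requests.
    pool = get_delete_pool()
    return list(pool.map(len, batches))
```

Because all requests share one pool, its size also acts as a single knob for how much delete work runs concurrently.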

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 10:02:24 +08:00
XuanYang-cn 26028f4137
fix: Exclude L0 compaction when clustering is executing ()
Also remove the conflict check when executing L0. Exclusivity is already
guaranteed by the scheduler

See also: 

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-29 06:28:24 +08:00
sre-ci-robot 1e75a42053 Update all contributors
Signed-off-by: sre-ci-robot <sre-ci-robot@zilliz.com>
2024-10-28 12:00:48 +00:00
zhuwenxing 4c108b1564
test: update jieba tokenizer in test ()
/kind improvement

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-10-28 19:22:22 +08:00
yellow-shine f75660456d
enhance: [skip e2e]update mergify ()
Signed-off-by: Yellow Shine <sammy.huang@zilliz.com>
2024-10-28 19:18:23 +08:00
congqixia d8c1bd24f2
enhance: Utilize proxy metacache for `HasCollection` ()
Related to 

Utilize the proxy metacache for `HasCollection` requests: if the
collection exists in the metacache, it can be deduced that the
collection must exist in the system.
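
A small Python sketch of that fast path, with assumptions called out: `Proxy`, `cache_collection` and the `coordinator` callable are hypothetical, and falling back to the coordinator on a cache miss is an assumption, since a miss alone does not prove the collection is absent.

```python
class Proxy:
    """Hypothetical proxy with a local meta cache (illustration only)."""

    def __init__(self, coordinator):
        self._coordinator = coordinator  # authoritative lookup: name -> bool
        self._meta_cache = {}            # collection name -> cached meta
        self.coordinator_calls = 0       # counts the slow-path lookups

    def cache_collection(self, name, meta):
        self._meta_cache[name] = meta

    def has_collection(self, name):
        # Cache hit: the collection must exist, no remote call needed.
        if name in self._meta_cache:
            return True
        # Cache miss proves nothing, so ask the authoritative source.
        self.coordinator_calls += 1
        return self._coordinator(name)
```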

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-28 18:54:23 +08:00
Patrick Weizhi Xu fc69df44a1
fix: set guarantee ts for search/query iterator ()
issue: 

Return the GuaranteeTS so that subsequent requests follow the
correct TS.

BeginTS is the current timestamp when the task is created.
GuaranteeTS is derived from both the consistency level and
BeginTS, in PreExecute of the task on the Proxy.
The delegator waits until GuaranteeTS is met.
In PostExecute of the task on the Proxy, the TS of the first iterator
request is returned to the SDK, which adds it to subsequent
requests.
Hence, if the default consistency level is Eventually or Bounded, the
order of TS will be
> GuaranteeTS < BeginTS

If BeginTS is returned instead, the second request has to catch up,
adding up to 200ms of extra latency, which results in something like

| Call | Latency |
| --- | --- |
| first call on `Next()` | 30ms |
| second call on `Next()` | 210ms |
| third call on `Next()` | 10ms |
| fourth call on `Next()` | 11ms |
| ... | ... |

where we expect

| Call | Latency |
| --- | --- |
| first call on `Next()` | 30ms |
| second call on `Next()` | 10ms |
| third call on `Next()` | 10ms |
| fourth call on `Next()` | 11ms |
| ... | ... |
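
The timestamp arithmetic behind the two tables can be sketched in Python. Everything numeric here is an assumption for illustration: timestamps are plain milliseconds, the bounded staleness is taken as 5000 ms, Eventually is modeled as a minimal timestamp of 1, and the delegator's serviceable timestamp is assumed to lag 200 ms behind BeginTS. `guarantee_ts` and `wait_ms` are hypothetical helper names.

```python
def guarantee_ts(begin_ts, consistency, staleness_ms=5000):
    """Derive the guarantee timestamp from BeginTS and the consistency level."""
    if consistency == "Strong":
        return begin_ts
    if consistency == "Bounded":
        return begin_ts - staleness_ms  # assumed staleness window
    if consistency == "Eventually":
        return 1                        # assumed: no waiting at all
    raise ValueError(f"unknown consistency level: {consistency}")


def wait_ms(request_ts, serviceable_ts):
    """How long the delegator waits before request_ts becomes serviceable."""
    return max(0, request_ts - serviceable_ts)


begin_ts = 10_000
serviceable_ts = begin_ts - 200  # assumed lag of the delegator

# Follow-up request carrying BeginTS: has to wait for the delegator.
wait_with_begin_ts = wait_ms(begin_ts, serviceable_ts)
# Follow-up request carrying GuaranteeTS: already serviceable.
wait_with_guarantee_ts = wait_ms(guarantee_ts(begin_ts, "Bounded"), serviceable_ts)
```

Under these assumptions the follow-up request waits 200 ms when it carries BeginTS and 0 ms when it carries GuaranteeTS, which matches the difference between the two tables.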

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-10-28 15:57:35 +08:00
congqixia f87acdf2a2
fix: Ref collection meta when load l0 segment meta only ()
Related to 

Previous PR 

Collection meta is not ref-ed when loading L0 segments under the
`RemoteLoad` policy, which causes the collection meta to be released
when many L0 segments are released.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-28 15:49:38 +08:00
jaime 33b0b8df80
fix: may exceed max txn in etcd operations ()
issue: 

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-10-28 15:37:30 +08:00
cai.zhang 86687bd8ed
enhance: Refine code for get_deleted_bitmap ()
issue:  

Check whether the PK is truly sorted in debug mode.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-10-28 15:19:30 +08:00
XuanYang-cn 4926021c02
fix: Skip mark compaction timeout for mix and l0 compaction ()
Timeout is a bad design for long-running tasks, especially with a
static timeout config. We should monitor execution progress and fail the
task if the progress has been stale for a long time.
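
The progress-based alternative suggested above is not part of this patch, but the idea can be sketched in Python. `ProgressWatchdog`, its method names and the 60-second staleness window are hypothetical.

```python
class ProgressWatchdog:
    """Fail a task when its progress stops moving, not when it runs long."""

    def __init__(self, stale_after_s):
        self.stale_after_s = stale_after_s
        self._last_progress = None
        self._last_change_at = None

    def report(self, progress, now):
        # Only a real advance in progress resets the staleness clock.
        if self._last_progress is None or progress > self._last_progress:
            self._last_progress = progress
            self._last_change_at = now

    def is_stale(self, now):
        if self._last_change_at is None:
            return False
        return now - self._last_change_at > self.stale_after_s
```

A task that has run for a long time but reported progress recently is left alone; only a task whose progress has not advanced for longer than the window is considered stale.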

This PR is a small patch that stops DC from marking compaction tasks
as timed out while it is still waiting for DN to finish, a design that
conflicts with itself. After this PR, mix and L0 compaction are no longer
controlled by the DC timeout, but clustering is still under timeout control.

The compaction queue capacity has grown larger for priority calculation, so
compaction timeouts appear more often; when one task times out, the queued
tasks time out too, and no compaction succeeds afterwards.

See also: , 

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-28 14:33:29 +08:00
Bingyi Sun b81f162f6a
fix: fix several bugs and refactor some code related to chunked segment ()
issue: https://github.com/milvus-io/milvus/issues/37147

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-28 14:17:30 +08:00
zhuwenxing cdee149191
test: fix testcases for verification after chaos ()
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-10-28 10:33:29 +08:00
zhuwenxing c8dd665bf6
test: supplementing case for text match ()
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-10-28 10:31:40 +08:00
congqixia 7774b7275e
enhance: Replace PrimaryKey slice with PrimaryKeys saving memory ()
Related to 

A slice of `storage.PrimaryKey` has an extra interface cost for each
element, which may cause notable memory usage when the delta row count
is large.

This PR replaces the PrimaryKey slice with the PrimaryKeys interface,
saving the extra interface cost.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-28 10:29:30 +08:00
jaime 9d16b972ea
feat: add tasks page into management WebUI ()
issue: 

1. Add API to access task runtime metrics, including:
  - build index task
  - compaction task
  - import task
  - balance (including load/release of segments/channels and some leader
    tasks on querycoord)
  - sync task
2. Add a debug mode to the webpage: use debug=true or debug=false
in the URL query parameters to enable or disable it.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-10-28 10:13:29 +08:00
foxspy d7b2ffe5aa
enhance: add a unified vector index config checker ()
issue: 

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-10-28 10:11:37 +08:00
zhagnlu eeb67a3845
fix: reset default auto index type for scalar ()


Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-10-27 16:19:29 +08:00
Bingyi Sun a2f0092e39
fix: check sparse float before calling get_dim ()
https://github.com/milvus-io/milvus/issues/37146

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-26 16:25:29 +08:00
aoiasd fd72151037
fix: merge datanode bm25 error after reload growing segment with no data ()
A segment with 0 rows does not init bm25 stats, which causes flush with
bm25 stats to fail.
relate: https://github.com/milvus-io/milvus/issues/37150

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-26 07:43:28 +08:00
congqixia 05f880708d
enhance: Make skip load work for all branches ()
Related to 

Skip load logic used to work only when there were multiple segment load
info entries in the load request. In the continuous delete case, the
delegator still loaded L0 segments, which occupied a lot of memory.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-25 23:37:29 +08:00
yihao.dai ed37c27bda
fix: Fix collection leak in querynode ()
Unref the removed L0 segment count.

issue: https://github.com/milvus-io/milvus/issues/36918

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 19:59:29 +08:00
yellow-shine 139f4e5ab5
enhance: [codecov]split code coverage into components ()
Signed-off-by: Yellow Shine <sammy.huang@zilliz.com>
2024-10-25 17:42:14 +08:00
smellthemoon 44ddcb5a63
fix: not check has_value before get value in JSON ()
https://github.com/milvus-io/milvus/issues/36236
also: https://github.com/milvus-io/milvus/issues/37113

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-25 17:19:28 +08:00
yihao.dai d7b2906318
enhance: Make dataNode.import.maxConcurrentTaskNum dynamic ()
Resize the import execution pool when the config
`dataNode.import.maxConcurrentTaskNum` is updated.
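
One way to picture a dynamically resizable pool is a concurrency limiter whose limit changes from a config-update callback. The Python sketch below is an analogy only: `ResizableLimiter` and `on_config_update` are hypothetical names, and only the config key comes from the commit message.

```python
import threading

CONFIG_KEY = "dataNode.import.maxConcurrentTaskNum"


class ResizableLimiter:
    """Caps concurrent tasks; the cap can change while tasks are running."""

    def __init__(self, limit):
        self._limit = limit
        self._active = 0
        self._lock = threading.Lock()

    def resize(self, new_limit):
        with self._lock:
            self._limit = new_limit

    def try_acquire(self):
        with self._lock:
            if self._active < self._limit:
                self._active += 1
                return True
            return False

    def release(self):
        with self._lock:
            self._active -= 1


def on_config_update(limiter, key, value):
    # Hypothetical callback invoked by a config watcher on every change.
    if key == CONFIG_KEY:
        limiter.resize(int(value))
```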

issue: https://github.com/milvus-io/milvus/issues/37095

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 16:51:29 +08:00
SimFG 1cc9cb49ad
enhance: allow to delete data when disk quota exhausted ()
- issue: 

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-10-25 16:47:29 +08:00
cqy123456 ff0b7ea0ef
enhance: build interim index for mmapped vector in ChunkedSealedSegment ()
issue:https://github.com/milvus-io/milvus/issues/36392
related pr: https://github.com/milvus-io/milvus/pull/36391

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-10-25 15:55:28 +08:00
Yinzuo Jiang 3628593d20
feat: Implement custom function module in milvus expr ()
OSPP 2024 project:
https://summer-ospp.ac.cn/org/prodetail/247410235?list=org&navpage=org

Solutions:

- parser (planparserv2)
    - add CallExpr in planparserv2/Plan.g4
    - update parser_visitor and show_visitor
- grpc protobuf
    - add CallExpr in plan.proto
- execution (`core/src/exec`)
    - add `CallExpr`, `ValueExpr` and `ColumnExpr` (both logical and
      physical) for function call and function parameters
- function factory (`core/src/exec/expression/function`)
    - create a global hashmap when starting milvus (see server.go)
    - the global hashmap stores function signatures and their function
      pointers; the CallExpr in the execution engine can get the function
      pointer by function signature.
- custom functions
    - empty(string)
    - starts_with(string, string)
- add cpp/go unittests and E2E tests
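
The function factory described in the list above, a global map from function signature to implementation, can be sketched in Python as follows. Only the two function names `empty` and `starts_with` come from the commit message; `register`, `lookup`, `call` and the `"string"` type tag are hypothetical.

```python
# Global registry: (function name, parameter types) -> implementation.
_registry = {}


def register(name, param_types, fn):
    _registry[(name, tuple(param_types))] = fn


def lookup(name, param_types):
    """Return the implementation for this exact signature, or None."""
    return _registry.get((name, tuple(param_types)))


def call(name, param_types, *args):
    fn = lookup(name, param_types)
    if fn is None:
        raise LookupError(f"no function {name}{tuple(param_types)}")
    return fn(*args)


# Populate the registry once at startup.
register("empty", ["string"], lambda s: len(s) == 0)
register("starts_with", ["string", "string"],
         lambda s, prefix: s.startswith(prefix))
```

Keying on the full signature rather than the name alone lets the same name be registered for different parameter types, and an unknown signature fails at lookup time rather than at execution.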

closes: 

Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
2024-10-25 15:25:30 +08:00
yihao.dai b45cf2d49f
enhance: Add max length check for csv import ()
1. Add max length check for csv import.
2. Tidy import options.
3. Tidy common import util functions.

issue: https://github.com/milvus-io/milvus/issues/34150

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 14:37:29 +08:00
Buqian Zheng 088d5d7d76
fix: optimize BM25 err message ()
issue: https://github.com/milvus-io/milvus/issues/37022

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-25 14:35:45 +08:00
smellthemoon 84d48b498b
enhance: support upsert autoid==true in Restful API ()
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-25 14:33:39 +08:00
yihao.dai 6e90f9e8d9
enhance: Support db for bulkinsert ()
issue: https://github.com/milvus-io/milvus/issues/31273

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 14:31:39 +08:00
aoiasd 22b917a1e6
enhance: Add collection name label for some metric ()
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-25 14:29:47 +08:00
zhuwenxing ac2858d418
test: add full text search checker in test ()
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-10-25 14:09:29 +08:00
smellthemoon 6ef014d931
fix: get correct size when sealed segment chunked ()


Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-25 12:01:31 +08:00
qixuan 80aa9ab4d6
test: Add insert and upsert related cases for null and default value support ()
issue: 

---------

Signed-off-by: qixuan <673771573@qq.com>
2024-10-25 11:03:29 +08:00
binbin c285853de8
test: Add test cases about expr for null and default support ()
issue: 

Signed-off-by: binbin lv <binbin.lv@zilliz.com>
2024-10-25 11:01:30 +08:00
Gao ad2df904c6
fix: correctly set ExecTermArrayVariableInField bitset result ()
issue: https://github.com/milvus-io/milvus/issues/37110

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-10-24 18:52:02 -07:00
Bingyi Sun bf956a3ec2
fix: fix string field has invalid utf-8 ()
issue: https://github.com/milvus-io/milvus/issues/37083
We used a vector of string_view to hold the data temporarily, but the real
string data is released after the record batch is destructed.
Change it to a vector of string to avoid memory corruption.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-10-24 18:33:47 -07:00
yellow-shine 0dbf94822f
enhance: [skip e2e]update mergify ()
Signed-off-by: Yellow Shine <sammy.huang@zilliz.com>
2024-10-24 19:23:29 +08:00
smellthemoon 2b3f5bec07
fix: panic when create index on all none data ()


Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-10-24 17:09:28 +08:00
sre-ci-robot 53836f320a
[automated] Update Pytest image changes ()
Update Pytest image changes
See changes:
3b024f9b36
Signed-off-by: sre-ci-robot <sre-ci-robot@users.noreply.github.com>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-10-24 16:53:28 +08:00
congqixia b086ef6b19
enhance: Skip load delta data in delegator when using RemoteLoad ()
Related to 

Delta data is not needed when using the `RemoteLoad` L0 forward policy.
Skipping the delta data load eases memory pressure when the L0 segment
size or number is large.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-24 16:21:37 +08:00