Commit Graph

239 Commits (2.4-hotfix)

Author SHA1 Message Date
jaime 5b45debb28
enhance: refine sync memory watermark configuration (#32138)
pr: #32140

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-04-11 20:07:24 +08:00
yihao.dai 39d988cf8d
enhance: Use an individual buffer size parameter for imports (#31833) (#31937)
Use an individual buffer size parameter for imports and set buffer size
to 64MB.

issue: https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31833

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-08 21:05:17 +08:00
Jiquan Long b2a79a0570
enhance: add more metrics (#31271) (#31861)
/kind improvement
fix: #31272 
pr: #31271 

This PR adds more metrics:
- Slow query count, where the duration considered slow is configurable;
- Number of deleted entities;
- Number of entities imported;
- Number of entities per collection;
- Number of loaded entities per collection;
- Number of indexed entities;
- Number of indexed entities, per collection, per index, and by whether it's
a vector index;
- Quota states (LongTimeTickDelay, MemoryExhausted, DiskQuotaExhausted)
per database.
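
As an illustration of the first metric, a slow query counter with a configurable threshold could look roughly like the sketch below. The type SlowQueryCounter and its methods are hypothetical names made up for this example and are not taken from the Milvus code base; the real metric is presumably exported through the metrics system rather than a bare atomic counter.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// SlowQueryCounter counts queries whose duration reaches a configurable
// threshold. Hypothetical type for illustration only, not a Milvus API.
type SlowQueryCounter struct {
	threshold atomic.Int64 // threshold in nanoseconds
	count     atomic.Int64 // number of slow queries observed so far
}

// NewSlowQueryCounter creates a counter with the given initial threshold.
func NewSlowQueryCounter(threshold time.Duration) *SlowQueryCounter {
	c := &SlowQueryCounter{}
	c.threshold.Store(int64(threshold))
	return c
}

// SetThreshold changes the threshold at runtime, e.g. after a config refresh.
func (c *SlowQueryCounter) SetThreshold(d time.Duration) {
	c.threshold.Store(int64(d))
}

// Observe records one finished query and counts it if it was slow.
func (c *SlowQueryCounter) Observe(d time.Duration) {
	if int64(d) >= c.threshold.Load() {
		c.count.Add(1)
	}
}

// Count returns the number of slow queries observed so far.
func (c *SlowQueryCounter) Count() int64 { return c.count.Load() }

func main() {
	c := NewSlowQueryCounter(5 * time.Second)
	c.Observe(2 * time.Second) // fast, not counted
	c.Observe(7 * time.Second) // slow, counted
	fmt.Println(c.Count())     // prints 1
}
```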

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-04-05 10:09:22 +08:00
yihao.dai 808a944f93
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629) (#31733)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
2. Enhanced input file length check for binlog import.
3. Removal of duplicated manager in datanode.
4. Renaming of executor to scheduler in datanode.
5. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31629

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:45:13 +08:00
Xiaofan be834638d3
fix: etcd not connectable when auth enabled (#31668)
Fix the etcd config source not respecting the auth-enabled setting.
Also removed the Pulsar recoverable error when Pulsar returns ConsumerBusy.
It could happen that Pulsar has not yet detected that the original consumer
is dead, and recovery takes some time.
fix: #31631
pr: #31633

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-04-01 15:23:18 +08:00
Bingyi Sun af7da00488
enhance: use mmap prefix to define all mmap related configs (#31436) (#31572)
related pr: #31436

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-27 16:35:19 +08:00
cai.zhang b8f849e98e
enhance: Support auto index for scalar index (#31593)
issue: #29309 
master pr: #31255

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-27 14:57:10 +08:00
groot b0cbddae8d
fix: minio ssl compatible issue (#31618)
issue: https://github.com/milvus-io/milvus/issues/30709
pr: https://github.com/milvus-io/milvus/pull/31607

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2024-03-27 14:41:20 +08:00
yihao.dai f1a108c97b
enhance: Add max file num limit and max file size limit for import (#31497) (#31542)
The max number of import files per request should not exceed 1024 by
default (configurable).
The import file size allowed for importing should not exceed 16GB by
default (configurable).
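
A minimal sketch of such a request check is shown below. The names ImportLimits and CheckImportFiles are hypothetical, and treating 16GB as 16 << 30 bytes is an assumption; the actual Milvus parameter names and units are not given in this commit message.

```go
package main

import "fmt"

// ImportLimits holds the two configurable limits described above.
// Hypothetical type; the field names are not Milvus parameter names.
type ImportLimits struct {
	MaxFileNum  int   // maximum number of files per import request
	MaxFileSize int64 // maximum size of a single file, in bytes
}

// DefaultImportLimits mirrors the defaults from the commit message,
// assuming "16GB" means 16 * 2^30 bytes.
var DefaultImportLimits = ImportLimits{MaxFileNum: 1024, MaxFileSize: 16 << 30}

// CheckImportFiles rejects a request with too many files or an oversized file.
func CheckImportFiles(fileSizes []int64, lim ImportLimits) error {
	if len(fileSizes) > lim.MaxFileNum {
		return fmt.Errorf("too many import files: %d > %d", len(fileSizes), lim.MaxFileNum)
	}
	for i, size := range fileSizes {
		if size > lim.MaxFileSize {
			return fmt.Errorf("import file %d is too large: %d > %d bytes", i, size, lim.MaxFileSize)
		}
	}
	return nil
}

func main() {
	fmt.Println(CheckImportFiles([]int64{1 << 20, 2 << 20}, DefaultImportLimits)) // prints <nil>
	fmt.Println(CheckImportFiles([]int64{17 << 30}, DefaultImportLimits) != nil)  // prints true
}
```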

issue: https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31497

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-25 14:33:07 +08:00
yihao.dai 1e0bf5acd2
enhance: Remove import v1 (#31403) (#31535)
Remove all code and logic related to import v1.

issue: https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31403

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-24 21:51:07 +08:00
groot a0535edb67
enhance: Support MinIO TLS connection (#31396)
issue: https://github.com/milvus-io/milvus/issues/30709
pr: https://github.com/milvus-io/milvus/pull/31292

Signed-off-by: yhmo <yihua.mo@zilliz.com>
Co-authored-by: Chen Rao <chenrao317328@163.com>
2024-03-21 11:15:20 +08:00
Xiaofan b2b107a774
fix: [cherry-pick] get compaction failure when datanode is actually live (#31356)
If getting the compaction result fails, skip processing the compaction.
pr: #31353
see also #31352

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-03-18 22:53:05 +08:00
Bingyi Sun e7b053817d
feat: Add global mmap enable configuration (#31267) (#31373)
https://github.com/milvus-io/milvus/issues/31279
related pr: #31267

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-18 21:57:05 +08:00
congqixia 89cff29b6a
enhance: [Cherry-pick] Use different interval for gc scan (#31364)
Cherry-pick from master
pr: #31363
See also #31362

This PR makes the datacoord garbage collection scan operation use a
different interval from other operations.

This interval is a newly added param item whose default value is 7*24
hours.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-18 21:43:05 +08:00
aoiasd b724753137
enhance: Add runtime config to paramtable (#31006)
relate: https://github.com/milvus-io/milvus/issues/30806
Avoid using string conversion or format functions when getting some
runtime parameters.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-03-15 11:07:06 +08:00
wei liu 06b191b164
fix: Balance channel stuck forever due to logic dead lock (#31202)
issue: #30816

Cause: balance channel waits until the leader view catches up with the
current target before it unsubscribes the old delegator, which ensures that
the new delegator can serve search before the old delegator is released.
However, another rule in segment_checker skips loading segments during
balance channel. So if a query node crashes during balance channel, the new
delegator can never catch up with the target, and the balance is stuck forever.

This PR removes the rule that skips loading segments during balance channel
to avoid this logical deadlock.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-13 15:05:04 +08:00
Chun Han 3298e64bd3
enhance: cache config values for saving cpu cycles to parse config item (#30947)
related: #30958

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-03-12 11:09:04 +08:00
XuanYang-cn ff80d2fd8c
enhance: Enable L0 by default (#30998)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-08 15:53:02 +08:00
wei liu c8efed6562
fix: Balance param use duplicated key (#31112)
issue: #31115
This PR fixes the balance check interval param using a duplicated key.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-08 12:07:00 +08:00
yihao.dai c411cb4a49
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941)
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.
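
The first two adjustments can be sketched as follows: keep at most one pending update per vchannel, then send the pending updates in batches. CPUpdater, ChannelCP, Submit and DrainBatches are hypothetical names for this illustration, and letting the newest submission replace the older one is an assumption about which task is retained.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// ChannelCP is a simplified checkpoint for one vchannel (hypothetical type).
type ChannelCP struct {
	VChannel  string
	Timestamp uint64
}

// CPUpdater retains at most one pending checkpoint per vchannel.
type CPUpdater struct {
	mu      sync.Mutex
	pending map[string]ChannelCP
}

func NewCPUpdater() *CPUpdater {
	return &CPUpdater{pending: make(map[string]ChannelCP)}
}

// Submit stores the checkpoint, replacing any pending one for the same
// vchannel, so tasks cannot pile up per channel.
func (u *CPUpdater) Submit(cp ChannelCP) {
	u.mu.Lock()
	defer u.mu.Unlock()
	u.pending[cp.VChannel] = cp
}

// DrainBatches removes all pending checkpoints and returns them grouped
// into batches of at most batchSize, ordered by vchannel name.
func (u *CPUpdater) DrainBatches(batchSize int) [][]ChannelCP {
	if batchSize <= 0 {
		batchSize = 1
	}
	u.mu.Lock()
	defer u.mu.Unlock()
	names := make([]string, 0, len(u.pending))
	for name := range u.pending {
		names = append(names, name)
	}
	sort.Strings(names)
	var batches [][]ChannelCP
	for start := 0; start < len(names); start += batchSize {
		end := start + batchSize
		if end > len(names) {
			end = len(names)
		}
		batch := make([]ChannelCP, 0, end-start)
		for _, name := range names[start:end] {
			batch = append(batch, u.pending[name])
		}
		batches = append(batches, batch)
	}
	u.pending = make(map[string]ChannelCP)
	return batches
}

func main() {
	u := NewCPUpdater()
	for i := 0; i < 300; i++ {
		name := fmt.Sprintf("vchan-%03d", i)
		u.Submit(ChannelCP{VChannel: name, Timestamp: 1})
		u.Submit(ChannelCP{VChannel: name, Timestamp: 2}) // replaces the first
	}
	for _, batch := range u.DrainBatches(128) {
		fmt.Println(len(batch)) // prints 128, 128, 44
	}
}
```

Each returned batch would correspond to one UpdateChannelCheckpoint RPC in this sketch.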

issue: https://github.com/milvus-io/milvus/issues/30004

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-03-07 20:39:02 +08:00
Jiquan Long a88c896733
enhance: purge client infos periodically (#31037)
https://github.com/milvus-io/milvus/issues/31007

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-03-06 12:50:59 +08:00
congqixia 8c2615f840
enhance: Add unit(seconds) for new added connection manager param (#31023)
See also #31007 #31008 #31009

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-05 14:50:59 +08:00
congqixia 1936aa4caa
enhance: Check channel cp lag before generate compaction task (#30997)
See also #30996

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-05 13:39:01 +08:00
congqixia 3b5ce73ded
enhance: Change proxy connection manager to concurrent safe (#31008)
See also #31007

This PR:
- Add param item for connection manager behavior: TTL & check interval
- Change clientInfo map to concurrent map

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-05 10:39:00 +08:00
yihao.dai a434d33e75
feat: Add import scheduler and manager (#29367)
This PR introduces novel managerial roles for importv2:
1. ImportMeta: To manage all the import tasks;
2. ImportScheduler: To process tasks and modify their states;
3. ImportChecker: To ascertain the completion of all tasks and instigate
relevant operations.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 18:31:02 +08:00
groot 85de56e894
fix: Clean kafka default configuration (#30924)
issue: #30917

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2024-03-01 18:17:03 +08:00
chyezh dd957cf9e3
enhance: add configurable memory index load predict memory usage factor (#30561)
related pr: https://github.com/milvus-io/milvus/pull/30475

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-01 15:23:00 +08:00
aoiasd 3633923bb7
enhance: clean invalid pipeline excluded segment info (#30429)
relate: https://github.com/milvus-io/milvus/issues/30281

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-03-01 10:43:01 +08:00
congqixia 36d78e3dd0
fix: Use localStorage path to check disk cap (#30944)
See also #30943

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-01 10:17:01 +08:00
MrPresent-Han 17a2fd048e
feat: support set up knowhere-build-pool-size on querynode(#29650) (#30922)
related: #29650

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-02-29 18:15:00 +08:00
chyezh 0c7474d7e8
enhance: add graceful stop timeout to avoid node stop hang under extreme cases (#30317)
1. add a coordinator graceful stop timeout of 5s
2. change the stop order of the datacoord components
3. change the querynode graceful stop timeout to 900s; we should
potentially change this to 600s once graceful stop is smooth
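
The general pattern behind a graceful stop timeout can be sketched as below: run the stop routine in a goroutine and give up waiting after the timeout. stopWithTimeout is a hypothetical helper, not the function used in Milvus.

```go
package main

import (
	"fmt"
	"time"
)

// stopWithTimeout runs stop in the background and waits at most timeout
// for it to finish. It returns true if stop completed in time.
// Hypothetical helper for illustration only.
func stopWithTimeout(stop func(), timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		stop()
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		// The stop routine is still running; the caller can now force exit.
		return false
	}
}

func main() {
	quick := func() { time.Sleep(10 * time.Millisecond) }
	hang := func() { select {} } // simulates a stop routine that never returns

	fmt.Println(stopWithTimeout(quick, time.Second))          // prints true
	fmt.Println(stopWithTimeout(hang, 50*time.Millisecond))   // prints false
}
```

Note that the hung goroutine is not cancelled in this sketch; the timeout only bounds how long the caller waits before it proceeds.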

issue: #30310
also see pr: #30306

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-02-29 17:01:50 +08:00
PowderLi 50a78b682e
fix: set proxy.http.acceptTypeAllowInt64: true as default (#30720)
issue: #30680

Also make the parameter item refreshable.

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-02-29 09:59:07 +08:00
groot ba6d33cd57
fix: Support TLS for kafka connection (#30468)
#27977

Add extra configurations in milvus.yaml to pass certificates for kafka.

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2024-02-28 18:43:07 +08:00
Bingyi Sun ece9d273a7
enhance: some patches for #30636 (#30664)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-02-26 11:42:55 +08:00
congqixia 1346b57433
enhance: Add deltalog expansion rate in segment loader (#30704)
See also #30191

It turned out that in auto ID and batch delete scenarios, the actual memory
size of a deltalog may be much larger than the deltalog file size. This PR
adds a configurable expansion rate for deltalog memory usage to prevent
out-of-memory panics while loading deltalogs.
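
The idea can be sketched as a simple estimate: predicted memory is the file size multiplied by the expansion rate, and loading is refused if the prediction exceeds free memory. The function names and the rate value of 50 used below are assumptions for illustration; the commit message does not state the default rate.

```go
package main

import "fmt"

// estimateDeltalogMemory predicts in-memory size from on-disk size.
// Hypothetical helper; expansionRate would come from configuration.
func estimateDeltalogMemory(fileSize int64, expansionRate float64) int64 {
	return int64(float64(fileSize) * expansionRate)
}

// canLoadDeltalogs reports whether the predicted memory usage of all
// deltalog files fits into the given free memory.
func canLoadDeltalogs(fileSizes []int64, expansionRate float64, freeMemory int64) bool {
	var total int64
	for _, size := range fileSizes {
		total += estimateDeltalogMemory(size, expansionRate)
	}
	return total <= freeMemory
}

func main() {
	const rate = 50 // assumed value for this example only
	files := []int64{4 << 20, 8 << 20} // 4 MiB and 8 MiB on disk
	fmt.Println(estimateDeltalogMemory(files[0], rate)) // prints 209715200
	fmt.Println(canLoadDeltalogs(files, rate, 1<<30))   // prints true
	fmt.Println(canLoadDeltalogs(files, rate, 256<<20)) // prints false
}
```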

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-21 11:26:52 +08:00
aoiasd bbff9193d9
enhance: support clean paramtable config event in test (#30534)
relate: https://github.com/milvus-io/milvus/issues/30441

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-02-20 14:16:51 +08:00
XuanYang-cn 44d436d0b6
enhance: Add force trigger (#30641)
1. Increase maxCount of L0 compaction tasks to 30

This could reduce the number of L0 compaction tasks by 30% for
frequently generated small L0 segments, with the maximum size of 64MB
unchanged, so that L0 segments accumulate more slowly and the memory
pressure that L0 segments put on QueryNode decreases.

2. Add force trigger for later manual, timely L0 compaction triggers.

See also: #30191, #30556

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-19 18:40:50 +08:00
Bingyi Sun 564b12c661
enhance: make balance cost threshold configurable (#30636)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-02-19 15:24:50 +08:00
chyezh 941dc755df
feat: add collection level flush rate control (#29567)
Add flush rate control at the collection level to avoid generating too many
segments.
0.1 qps by default.
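
A minimal sketch of a per-collection rate limit is shown below: at 0.1 qps, each collection may flush at most once every 10 seconds. FlushLimiter and its methods are hypothetical names, and the minimum-interval approach is a simplification; the real implementation may use a token-bucket limiter instead.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// FlushLimiter allows at most one flush per collection per interval.
// Hypothetical type for illustration, not the Milvus implementation.
type FlushLimiter struct {
	mu       sync.Mutex
	interval time.Duration
	last     map[int64]time.Time // collection ID -> time of last allowed flush
}

// NewFlushLimiter builds a limiter from a rate in flushes per second.
// qps must be greater than zero; 0.1 qps yields a 10 second interval.
func NewFlushLimiter(qps float64) *FlushLimiter {
	return &FlushLimiter{
		interval: time.Duration(float64(time.Second) / qps),
		last:     make(map[int64]time.Time),
	}
}

// AllowAt reports whether the collection may flush at the given time.
func (l *FlushLimiter) AllowAt(collectionID int64, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if prev, ok := l.last[collectionID]; ok && now.Sub(prev) < l.interval {
		return false
	}
	l.last[collectionID] = now
	return true
}

// Allow is AllowAt with the current wall-clock time.
func (l *FlushLimiter) Allow(collectionID int64) bool {
	return l.AllowAt(collectionID, time.Now())
}

func main() {
	l := NewFlushLimiter(0.1)
	start := time.Now()
	fmt.Println(l.AllowAt(1, start))                     // prints true
	fmt.Println(l.AllowAt(1, start.Add(3*time.Second)))  // prints false
	fmt.Println(l.AllowAt(2, start.Add(3*time.Second)))  // prints true
	fmt.Println(l.AllowAt(1, start.Add(11*time.Second))) // prints true
}
```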

issue: #29477

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-02-18 15:32:50 +08:00
congqixia 91b02b5d22
enhance: Add param item for datanode l0 batch/linear mode memory ratio (#30523)
See also #27606

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-18 13:02:50 +08:00
Bingyi Sun 715f042965
feat: add a balancer based on both of row count and segment count (#30188)
issue: https://github.com/milvus-io/milvus/issues/30039

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-02-06 17:15:50 +08:00
congqixia d4100d5442
enhance: Change update channel cp magic number to param item (#30555)
See also #28817

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-06 16:02:00 +08:00
XuanYang-cn e6eb6f2c78
enhance: Speed up L0 compaction (#30410)
This PR changes the following to speed up L0 compaction and
prevent OOM:

1. Lower deltabuf limit to 16MB by default, so that each L0 segment
would be 4X smaller than before.
2. Add BatchProcess, use it if memory is sufficient
3. The iterator deserializes when HasNext is called, to avoid a massive
memory peak
4. Add tracing in splitDelta

See also: #30191

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-04 10:49:05 +08:00
xige-16 6d7061824b
enhance: Opt maxVectorFieldNum param check (#30440)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-02-02 14:51:05 +08:00
XuanYang-cn e0ed5647b3
fix: Limit L0 Compaction segment size and count (#30374)
See also: #30191

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-01 20:39:03 +08:00
yihao.dai c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks; an import task is divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;
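
The three executor steps can be sketched as a small pipeline. Everything below is hypothetical: the Row type, the FNV hash on the primary key, and the function names are assumptions chosen to show the read -> hash -> sync shape, not the actual datanode code.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Row is a simplified imported record (hypothetical type).
type Row struct {
	PK      string
	Payload string
}

// readData stands in for reading rows from an import file.
func readData() []Row {
	return []Row{{"a", "x"}, {"b", "y"}, {"c", "z"}, {"a", "w"}}
}

// hashData distributes rows across shards by hashing the primary key,
// so rows with the same key always land in the same shard.
func hashData(rows []Row, numShards int) [][]Row {
	shards := make([][]Row, numShards)
	for _, row := range rows {
		h := fnv.New32a()
		h.Write([]byte(row.PK))
		idx := int(h.Sum32() % uint32(numShards))
		shards[idx] = append(shards[idx], row)
	}
	return shards
}

// syncData stands in for writing each shard to storage and returns the
// number of rows written.
func syncData(shards [][]Row) int {
	written := 0
	for i, shard := range shards {
		fmt.Printf("shard %d: %d rows\n", i, len(shard))
		written += len(shard)
	}
	return written
}

func main() {
	rows := readData()
	shards := hashData(rows, 2)
	fmt.Println("rows synced:", syncData(shards))
}
```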

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
congqixia fc0d007bd1
enhance: Add `MemoryHighSyncPolicy` back to write buffer manager (#29997)
See also #27675

This PR adds back the MemoryHighSyncPolicy implementation. It also changes
MinSegmentSize & CheckInterval to configurable param items.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-31 19:03:04 +08:00
cai.zhang 47af347d0e
enhance: Limit index pool size of standalone server (#30170)
issue: #29926

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-30 16:47:03 +08:00
chyezh 211143c5e6
enhance: add basic information of milvus into metrics (#29665)
add basic build information and runtime component dependency into
metrics.

issue: #29664

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-29 15:47:02 +08:00
Bingyi Sun 406bf14e84
enhance: Add growing row count weight (#30271)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-29 14:05:02 +08:00