Commit Graph

8869 Commits (e3d50a192d434844721819cbee3b254436f582da)

Author SHA1 Message Date
smellthemoon c61fb1eff5
enhance: do check when add not empty logpath (#33640)
meta only store logid

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-06-07 10:19:51 +08:00
SimFG ecee7d90d4
enhance: try to speed up the loading of small collections (#33570)
- issue: #33569

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-06-07 08:25:53 +08:00
cai.zhang 27cc9f2630
enhance: Support analyze data (#33651)
issue: #30633

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: chasingegg <chao.gao@zilliz.com>
2024-06-06 17:37:51 +08:00
cai.zhang cfea3f43cf
fix: Don't sync L0 segments to channel watcher (#33664)
issue: #33540

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-06 15:59:50 +08:00
XuanYang-cn 4dd0c54ca0
fix: Fix l0 compactor may cause DN from OOM (#33554)
See also: #33547

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-06 14:33:52 +08:00
congqixia f6e251514f
fix: Write back dbid modification for nonDB id collection (#33641)
See also #33608

Make `fixDefaultDBIDConsistency` also write back collection dbid
modification when nonDB id collection is found.

This fix shall prevent dropped collections of this kind show up again
after dropping and restart.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-06 14:29:53 +08:00
wei liu b69740c8f3
enhance: Remove unnecessary log info during load segment (#33663)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-06 14:13:50 +08:00
XuanYang-cn 68c9e7db8c
fix: Sync dropped segment for dropped partition (#33331)
See also: #33330

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-06 10:25:52 +08:00
sre-ci-robot fd191dd7db
[automated] Update Knowhere Commit (#33655)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-06-06 01:53:50 +08:00
cai.zhang feeb869ff9
enhance: Remove compaction plans on the datanode (#33548)
issue: #33546

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-05 20:27:51 +08:00
cqy123456 703fc73f71
enhance: disk index support binary vector (#33631)
issue:https://github.com/milvus-io/milvus/issues/22837
related https://github.com/milvus-io/milvus/pull/33575

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-05 19:37:57 +08:00
cai.zhang 412ccfbb20
enhance: Refine IndexNode code and ensure compatibility (#33458)
issue: #33432 , #33183

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-05 19:35:52 +08:00
zhagnlu 8ad26093ba
fix: fix load failure (#33599)
issue: #33533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-05 19:19:51 +08:00
aoiasd b9bafc76b4
enhance: access log support print with consistency level (#33503)
relate: https://github.com/milvus-io/milvus/issues/33563

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-06-05 17:13:50 +08:00
yihao.dai 01ce32caa1
enhance: Print more disk quota info (#33596)
/kind enhancement

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 16:51:56 +08:00
yihao.dai bbdf99a45e
fix: Fix import segment size is uneven (#33605)
The data coordinator computed the appropriate number of import segments,
thus when importing in the data node, one can randomly select a segment.

issue: https://github.com/milvus-io/milvus/issues/33604

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 15:41:51 +08:00
Gao 545d4725fb
fix: correct get vector data size for bf16/fp16/binary vector (#33377)
related #22837

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-06-05 14:31:57 +08:00
congqixia 597f4c5e03
enhance: Make hasMoreResult accurate when hit number larger than limit (#33609)
See also milvus-io/milvus-sdk-go#756

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-05 11:51:51 +08:00
yihao.dai 35532a3e7d
fix: Fill stats log id and check validity (#33477)
1. Fill log ID of stats log from import
2. Add a check to validate the log ID before writing to meta

issue: https://github.com/milvus-io/milvus/issues/33476

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 11:17:56 +08:00
congqixia 25080bb455
enhance: Print `UseDefaultConsistency` param in read requests (#33617)
`UseDefaultConsistency` param is crucial for debugging slow query
problems. It could be confusing when guarantee timestamp is 1 while this
param is not logged

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-05 10:27:50 +08:00
zhenshan.cao ac4f3997ce
enhance: Reconstructing Compaction to possess persistence capability (#33265)
issue #33586

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-06-05 10:17:50 +08:00
aoiasd 387b7cd7f4
enhance:avoid maintain checkpoint info in sync manager (#33413)
relate: https://github.com/milvus-io/milvus/issues/32915

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-06-05 10:05:50 +08:00
jaime 8858fcb40a
fix: fix loaded entity num is inaccurate (#33521)
issue: #33520

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-04 20:09:54 +08:00
zhagnlu c6f8a73bb2
enhance: optimize some cache to reduce memory usage (#33534)
#33533

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-04 14:09:47 +08:00
congqixia 7f4698f4a7
enhance: Use map PK to timestamp in buffer insert (#33566)
Related to #27675

Store pk to minimal timestamp in `inData` instead of bloom filter to
check whether some delete entry hit current insert batch

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-04 10:07:48 +08:00
sre-ci-robot d25c755480
[automated] Update Knowhere Commit (#33573)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-06-04 01:55:46 +08:00
SimFG 44d7e03e56
fix: reset the RootCoordQuotaStates metric before recording this metric (#33553)
- issue: #33539

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-06-03 21:09:47 +08:00
congqixia 2b285e5573
fix: Wrap init segcore tracing with golang timeout (#33494)
See also #33483

Wrap `C.InitTrace` & `C.SetTrace` with timeout preventing otlp
initializtion hangs forever when endpoint is not set correctly

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-03 19:25:51 +08:00
wei liu 34c6a989ab
enhance: Avoid load bf in delegator when qn worker has no more memory (#33557)
query coord send load request to delegator, delegator load bf first,
then forward load request to qn worker. but when qn worker has no more
memory, it will return load failed immediatelly. then delegator roll
back the loaded bf. query coord wil retry the load request, and
delegator will load and roll back bf again and again.

this PR delay the loading bf step until load segment succeed in worker.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-03 19:23:45 +08:00
yiwangdr 180d754158
fix: speed up segment lookup via channel name in datacoord (#33530)
issue: #33342

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-06-03 14:47:47 +08:00
yihao.dai 7a2127b09f
enhance: Avoid redundant meta operations of import (#33518)
issue: https://github.com/milvus-io/milvus/issues/33513

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-03 14:15:53 +08:00
XuanYang-cn 0382628668
enhance: Add more tracing for l0 compactor (#33435)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-03 10:19:49 +08:00
smellthemoon 2c7bb0b8ac
fix: replace removeWithPrefix with remove to avoid delete redundantly (#33328)
#33288

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-05-31 18:05:45 +08:00
wei liu c6a1c49e02
enhance: Use Blocked Bloom Filter instead of basic bloom fitler impl. (#33405)
issue: #32995
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with old version bf impl, but if fall back
to old milvus version, it may causes bloom filter deserialize failed.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times
faster than the basic Bloom filter in both querying and construction, at
the cost of a 30% increase in memory usage.

- Block BF construct time	{"time": "54.128131ms"}
- Block BF size	                {"size": 3021578}
- Block BF Test cost	        {"time": "55.407352ms"}
- Basic BF construct time	{"time": "210.262183ms"}
- Basic BF size	                {"size": 2396308}
- Basic BF Test cost	        {"time": "192.596229ms"}

In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is also 5 times faster than the basic Bloom filter
in querying.

- Block BF TestLocation cost    {"time": "529.97183ms"}
- Basic BF TestLocation cost	{"time": "3.197430181s"}

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-31 17:49:45 +08:00
wei liu 322a4c5b8c
enhance: Remove StringPrimaryKey to reduce unnecessary copy and function call cost (#33486)
issue: #33497

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-31 15:41:45 +08:00
aoiasd 0cf225fa75
enhance: access log support restful api (#33155)
relate: https://github.com/milvus-io/milvus/issues/31823

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-05-31 11:13:44 +08:00
Buqian Zheng 4171414222
enhance: update knowhere version (#33490)
issue: https://github.com/milvus-io/milvus/issues/33489
update knowhere version to latest. remove usage of `seed_ef` as it be
replaced by existing `ef`.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-05-31 10:17:50 +08:00
Jiquan Long 0c5d8660aa
feat: support inverted index for array (#33452)
issue: https://github.com/milvus-io/milvus/issues/27704

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-31 09:47:47 +08:00
Chun Han 416a2cf507
fix: query iterator lack results(#33137) (#33422)
related: #33137 
adding has_more_result_tag for various level's reduce to rectify
reduce_stop_for_best

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-05-30 17:51:44 +08:00
cai.zhang 77637180fa
enhance: Periodically synchronize segments to datanode watcher (#33420)
issue: #32809

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-05-30 13:37:44 +08:00
zhagnlu 589d4dfd82
enhance: optimize bitmap index (#33358)
#32900

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-05-30 13:09:43 +08:00
SimFG 8f46a20957
fix: show empty collection when has granted the all privilege (#33445)
- issue: #33382

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-05-29 20:51:44 +08:00
congqixia 54797b4286
enhance: Refine frequent log in datacoord (#33449)
This PR changes:
- Frequent `ListIndexes` success log to debug level
- Aggregate collection missing log after collection dropped in
`meta.GetCollectionIndexFilesSize`

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-29 19:15:43 +08:00
smellthemoon 08b94ea81d
enhance:change wrong log (#33447)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-05-29 18:35:44 +08:00
congqixia a26d6cdf23
fix: Remove group checker when closing qn pipeline (#33443)
See also #33442

This fix shall prevent group checker keep printing "some node(s) haven't
received input" err message after collection released

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-29 14:07:44 +08:00
wei liu b13932bb55
enhance: Enable database level replica num and resource groups for loading collection (#33052)
issue: #30040

This PR introduce two database level props:
1. database.replica.number
2. database.resource_groups

User can set those two database props by AlterDatabase API, then can
load collection without specified replica_num and resource groups. then
it will use database level load param when try to load collections.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-29 10:59:43 +08:00
congqixia 73c9b80a7d
enhance: Store locations for largest K in `LocationCache` (#33429)
See also #32642

`LocationCache` used map to store different locations for different K
which may cause lots of CPU time when get locations many times.

This PR change the implementation of LocationCache to store only the
location for the largest K used to totally remove the map access
operation.

See pprof from test of @XuanYang-cn 

![image](https://github.com/milvus-io/milvus/assets/84113973/ad17cff8-62ad-4d78-9bb0-f6df0512f4ea)

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-29 10:05:42 +08:00
jaime 3d29907b6e
enhance: decrease cpu overhead during filter segments on datacoord (#33130)
issue: #33129

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-05-28 19:17:43 +08:00
congqixia e71b7c7cc9
enhance: Reduce datanode metacache frequent scan range (#33400)
See also #32165

There were some frequent scan in metacache:
- List all segments whose start positions not synced
- List compacted segments

Those scan shall cause lots of CPU time when flushed segment number is
large meanwhile `Flushed` segments can be skipped in those two scenarios

This PR make:
- Add segment state shortcut in metacache
- List start positions state before `Flushed`
- Make compacted segments state to be `Dropped` and use `Dropped` state
while scanning them

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-28 14:19:42 +08:00
XuanYang-cn 5e39aa9272
enhance: Make channel meta able to writer 200k plus segments (#33279)
See also: #33125

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-28 12:33:42 +08:00