962 Commits (2.4-hotfix)

Author SHA1 Message Date
yihao.dai b68af208bc
fix: Use pk from binlog during import (#32118) (#32194)
During binlog import, even if the primary key's autoID is set to true,
the primary key from the binlog should be used instead of being
reassigned.

issue: https://github.com/milvus-io/milvus/discussions/31943,
https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/32118

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-15 10:31:19 +08:00
yihao.dai 9cb640fe78
fix: Fix import hanging and improve logging output (#32166) (#32167)
Fix import hanging when the previous import task failed, and improve
parquet import logging output.

issue: https://github.com/milvus-io/milvus/issues/31834

pr: https://github.com/milvus-io/milvus/pull/32166

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-13 22:03:23 +08:00
congqixia e18ddfc06d
enhance: [Cherry-pick] Make write buffer memory check do until safe (#32172) (#32201)
Cherry-pick from master
pr: #32172
See also #27675 #26177

Make memory check evict memory buffer until memory water level is safe.
Also make `EvictBuffer` wait until sync task done.
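
The "evict until safe" loop can be sketched roughly as below. This is only an illustration: the `buffer` type, the `evictUntilSafe` name, and the largest-first eviction order are assumptions, not the actual write buffer manager code, and waiting for the sync task is not modeled.

```go
package main

import "sort"

// buffer is a hypothetical stand-in for one write buffer.
type buffer struct {
	ID   int64
	Size int64
}

// evictUntilSafe keeps evicting buffers until the total size is at or
// below safeLevel. Evicting the largest buffer first is an assumption
// made for this sketch.
func evictUntilSafe(buffers []buffer, safeLevel int64) (evicted []int64, remaining int64) {
	// Work on a copy so the caller's slice order is untouched.
	sorted := append([]buffer(nil), buffers...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Size > sorted[j].Size })

	for _, b := range sorted {
		remaining += b.Size
	}
	for _, b := range sorted {
		if remaining <= safeLevel {
			break // memory water level is safe, stop evicting
		}
		evicted = append(evicted, b.ID)
		remaining -= b.Size
	}
	return evicted, remaining
}
```

The point of the loop is that a single check-and-evict pass may leave usage above the water level, so the check repeats after every eviction.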

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-12 16:03:20 +08:00
jaime 5b45debb28
enhance: refine sync memory watermark configuration (#32138)
pr: #32140

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-04-11 20:07:24 +08:00
yihao.dai 39d988cf8d
enhance: Use an individual buffer size parameter for imports (#31833) (#31937)
Use an individual buffer size parameter for imports and set buffer size
to 64MB.

issue: https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31833

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-08 21:05:17 +08:00
yihao.dai 215d314118
fix: Return err for conc.Future in sync manager (#31790) (#31936)
Should not return `err, nil` when using conc.Future, as the error will
be lost/ignored when using `AwaitAll` to wait for the future.
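
A minimal sketch of why the error gets lost, using a hypothetical `Future` type written for this example (it is not the real `conc.Future` API; `Go`, `Await`, and this `AwaitAll` are simplified stand-ins):

```go
package main

import "errors"

// Future is a hypothetical, minimal future used only for illustration.
type Future struct {
	done chan struct{}
	val  any
	err  error
}

// Go runs fn in a goroutine and records its (value, error) pair.
func Go(fn func() (any, error)) *Future {
	f := &Future{done: make(chan struct{})}
	go func() {
		defer close(f.done)
		f.val, f.err = fn()
	}()
	return f
}

// Await blocks until fn has finished and returns its result.
func (f *Future) Await() (any, error) {
	<-f.done
	return f.val, f.err
}

// AwaitAll waits for every future and returns the first error it sees.
// It inspects only the error slot, never the value slot.
func AwaitAll(futures ...*Future) error {
	var first error
	for _, f := range futures {
		if _, err := f.Await(); err != nil && first == nil {
			first = err
		}
	}
	return first
}

var errSync = errors.New("sync failed")

// buggyTask puts the error in the value slot, so AwaitAll cannot see it.
func buggyTask() (any, error) { return errSync, nil }

// fixedTask puts the error in the error slot, where AwaitAll looks.
func fixedTask() (any, error) { return nil, errSync }
```

With these definitions, `AwaitAll(Go(buggyTask))` returns `nil` while `AwaitAll(Go(fixedTask))` returns `errSync`.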

issue: https://github.com/milvus-io/milvus/issues/31788

pr: https://github.com/milvus-io/milvus/pull/31790

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-08 14:41:16 +08:00
congqixia 638823c8c3
fix: [Cherry-pick] Make FlushTs Sync Policy apply to all buffers (#31839) (#31865)
Cherry-pick from master
pr: #31839
See also #30552

FlushTS policy was originally designed for flushed/L0 segments only, but
in some edge cases a new growing segment buffer would bypass the flush
request and hold a buffer from before the flush ts, which caused a flush timeout.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-03 17:45:15 +08:00
XuanYang-cn b0c26e565c
fix: [cherry-pick]Skip changing meta if nodeID not match with channel (#31666)
See also: #31648
pr: #31672
pr: #31694

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-03 17:39:15 +08:00
congqixia 732f0ace11
enhance: [Cherry-pick] Add back unit test for compactor and fix some TODOs (#31829) (#31876)
Cherry-pick from master
pr: #31829
This PR adds back the compactor "Unhandled" data type unit test and fixes
the behavior behind some TODOs.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-03 17:19:15 +08:00
XuanYang-cn 3002f94e04
fix: [cherry-pick]Using zero serverID for metrics (#31519)
Fixes: #31516
pr: #31518

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-02 10:35:13 +08:00
yihao.dai 808a944f93
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629) (#31733)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
2. Enhanced input file length check for binlog import.
3. Removal of duplicated manager in datanode.
4. Renaming of executor to scheduler in datanode.
5. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31629

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:45:13 +08:00
yihao.dai a060d40b7c
enhance: Release blobs in sync task once sync is completed (#31661) (#31678)
Once the synchronization of the sync task is completed, it's necessary
to release the blob within the sync task, as the caller may continue to
reference it.

issue: https://github.com/milvus-io/milvus/issues/31545

pr: #31661

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-29 16:17:11 +08:00
yihao.dai 35664fa302
enhance: Support retrieving file size from importutilv2.Reader (#31533) (#31594)
To reduce the overhead caused by listing the S3 objects, add an
interface to importutil.Reader to retrieve file sizes.

issue: https://github.com/milvus-io/milvus/issues/31532,
https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31533

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-26 10:09:08 +08:00
yihao.dai f1a108c97b
enhance: Add max file num limit and max file size limit for import (#31497) (#31542)
The max number of import files per request should not exceed 1024 by
default (configurable).
The import file size allowed for importing should not exceed 16GB by
default (configurable).
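
A rough sketch of such a request check follows. The names (`importFile`, `checkImportLimits`) are hypothetical and not taken from the Milvus code base, and reading "16GB" as 16 GiB is an assumption.

```go
package main

import "fmt"

// Defaults from the commit message; 16GB is interpreted as 16 GiB here,
// which is an assumption.
const (
	defaultMaxFileNum  = 1024
	defaultMaxFileSize = int64(16) << 30
)

// importFile is a hypothetical description of one file in an import request.
type importFile struct {
	Path string
	Size int64
}

// checkImportLimits rejects a request that has too many files or that
// contains a file larger than the per-file limit.
func checkImportLimits(files []importFile, maxFileNum int, maxFileSize int64) error {
	if len(files) > maxFileNum {
		return fmt.Errorf("too many import files: %d > %d", len(files), maxFileNum)
	}
	for _, f := range files {
		if f.Size > maxFileSize {
			return fmt.Errorf("import file %s is too large: %d > %d bytes", f.Path, f.Size, maxFileSize)
		}
	}
	return nil
}
```

Both limits are passed in as parameters so that a configured value can replace the defaults.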

issue: https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31497

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-25 14:33:07 +08:00
yihao.dai 1e0bf5acd2
enhance: Remove import v1 (#31403) (#31535)
Remove all code and logic related to import v1.

issue: https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31403

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-24 21:51:07 +08:00
congqixia 99774548f2
enhance: [Cherry-pick] Add AllPartitionsID const to replace InvalidPartitionID (#31438) (#31515)
Cherry-pick from master
pr: #31438

"-1" as `InvalidPartitionID` was previously also used as the "all partitions"
placeholder in delete cases. It is confusing and hard to maintain when a const
var has more than one meaning.

This PR adds `AllPartitionsID` to replace these usages in delete
scenarios.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-22 16:37:08 +08:00
congqixia 3254a14319
fix: [Cherry-pick] Cleanup write buffer when flowgraph released (#31377)
Cherry-pick from master
pr: #31376
See also #30137

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-18 23:47:06 +08:00
yihao.dai 3c3be49cf4
fix: Fix binlog import (#31310) (#31330)
Fix binlog import functionality by removing the existing check and
refining the size retrieval process.

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

pr: https://github.com/milvus-io/milvus/pull/31310

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-18 15:41:05 +08:00
yihao.dai 811316d2ba
fix: Fix binlog import and refine error reporting (#31241)
1. Fix binlog import with partition key.
2. Refine binlog import error reporting.
3. Avoid division by zero when retrieving import progress.

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-15 10:55:05 +08:00
jaime db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
XuanYang-cn a52a52064d
fix: Use lock and map instead of concurrentMap (#31212)
See also: #31209

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-14 18:39:04 +08:00
Buqian Zheng 3c80083f51
feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630)
add sparse float vector support to different milvus components,
including proxy, data node to receive and write sparse float vectors to
binlog, query node to handle search requests, index node to build index
for sparse float column, etc.

https://github.com/milvus-io/milvus/issues/29419

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-13 14:32:54 -07:00
yihao.dai b5c67948b7
enhance: Enhance and modify the return content of ImportV2 (#31192)
1. The Import APIs now provide detailed progress information for each
imported file, including details such as file name, file size, progress,
and more.
2. The APIs now return the collection name and the completion time.
3. Other modifications include changing jobID to jobId and other similar
adjustments.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-13 19:51:03 +08:00
congqixia 937f2440ab
fix: TestBlock case use different segment id in testcase (#31173)
Resolves: #31172

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-11 17:51:03 +08:00
congqixia ff1e967e89
enhance: Add segment id short cut for WithSegmentID filter (#31144)
See also #31143

This PR adds a shortcut for the datanode metacache `WithSegmentIDs` filter,
which can fetch segments directly from the map using the provided segmentIDs.
It also adds a benchmark comparing the new implementation with the old one.
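
The idea behind the shortcut can be sketched as follows. The `metaCache` and `segmentInfo` types here are simplified, hypothetical stand-ins for the datanode metacache, not its real definitions.

```go
package main

// segmentInfo is a hypothetical, simplified segment record.
type segmentInfo struct {
	ID   int64
	Rows int64
}

// metaCache is a hypothetical cache keyed by segment ID.
type metaCache struct {
	segments map[int64]*segmentInfo
}

// scanByIDs visits every cached segment and keeps the matching ones.
// Cost grows with the number of cached segments.
func (c *metaCache) scanByIDs(ids ...int64) []*segmentInfo {
	want := make(map[int64]struct{}, len(ids))
	for _, id := range ids {
		want[id] = struct{}{}
	}
	var out []*segmentInfo
	for _, seg := range c.segments {
		if _, ok := want[seg.ID]; ok {
			out = append(out, seg)
		}
	}
	return out
}

// lookupByIDs fetches each requested ID straight from the map.
// Cost grows only with the number of requested IDs.
func (c *metaCache) lookupByIDs(ids ...int64) []*segmentInfo {
	out := make([]*segmentInfo, 0, len(ids))
	for _, id := range ids {
		if seg, ok := c.segments[id]; ok {
			out = append(out, seg)
		}
	}
	return out
}
```

Both functions return the same segments; the direct lookup simply avoids touching segments that were never asked for.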

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-11 10:55:02 +08:00
Ted Xu 987d9023a5
enhance: Enable binlog deserialize reader in datanode compaction (#31036)
See #30863

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-03-08 18:25:02 +08:00
yihao.dai c411cb4a49
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941)
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.
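
Points 1 and 2 can be illustrated with a small sketch. The types and names (`checkpoint`, `pendingUpdates`, `Drain`) are hypothetical, and the sketch is not concurrency-safe; a real updater would guard the map with a lock.

```go
package main

// checkpoint is a hypothetical, simplified channel checkpoint.
type checkpoint struct {
	VChannel  string
	Timestamp uint64
}

// pendingUpdates keeps at most one pending checkpoint per vchannel.
type pendingUpdates struct {
	latest map[string]checkpoint
	order  []string // vchannels in first-seen order, for stable batches
}

func newPendingUpdates() *pendingUpdates {
	return &pendingUpdates{latest: make(map[string]checkpoint)}
}

// Add replaces any pending checkpoint for the same vchannel, so tasks
// for one vchannel never pile up.
func (p *pendingUpdates) Add(cp checkpoint) {
	if _, ok := p.latest[cp.VChannel]; !ok {
		p.order = append(p.order, cp.VChannel)
	}
	p.latest[cp.VChannel] = cp
}

// Drain returns the pending checkpoints split into batches of at most
// batchSize entries and clears the pending set.
func (p *pendingUpdates) Drain(batchSize int) [][]checkpoint {
	if batchSize <= 0 {
		batchSize = 1
	}
	var batches [][]checkpoint
	var cur []checkpoint
	for _, ch := range p.order {
		cur = append(cur, p.latest[ch])
		if len(cur) == batchSize {
			batches = append(batches, cur)
			cur = nil
		}
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	p.latest = make(map[string]checkpoint)
	p.order = nil
	return batches
}
```

Each returned batch would correspond to one batched RPC call, with `Drain(128)` matching the default batch size mentioned above.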

issue: https://github.com/milvus-io/milvus/issues/30004

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-03-07 20:39:02 +08:00
yihao.dai 0a2c255630
enhance: Reduce the memory usage of the timeTickSender (#30968)
In the cache of the timeTickSender, retain only the latest stats instead
of storing stats for every time tick.

issue: https://github.com/milvus-io/milvus/issues/30967

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-02 10:13:01 +08:00
yihao.dai a434d33e75
feat: Add import scheduler and manager (#29367)
This PR introduces novel managerial roles for importv2:
1. ImportMeta: To manage all the import tasks;
2. ImportScheduler: To process tasks and modify their states;
3. ImportChecker: To ascertain the completion of all tasks and instigate
relevant operations.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 18:31:02 +08:00
wayblink f3c56c83ab
fix: Fix binlog_io metric name conflict (#30689)
follow: #29725

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-03-01 18:13:02 +08:00
XuanYang-cn 2867f50fcc
fix: Clear DN unknown compaction tasks (#30850)
If DC restarts, those unknown compaction tasks
will never get a callback in DN, so the segments in the compaction
task stay locked and cannot be synced or compacted again, blocking cp
advance and compaction execution.

See also: #30137

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-01 11:31:00 +08:00
yihao.dai a5a4ca8459
enhance: Remove debug log (#30955)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 10:02:59 +08:00
chyezh 0c7474d7e8
enhance: add graceful stop timeout to avoid node stop hang under extreme cases (#30317)
1. add coordinator graceful stop timeout to 5s
2. change the order of datacoord component while stop
3. change querynode grace stop timeout to 900s, and we should
potentially change this to 600s when graceful stop is smooth

issue: #30310
also see pr: #30306

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-02-29 17:01:50 +08:00
yiwangdr c6665c2a4c
test: support multiple data/querynodes in integration test (#30618)
issue: https://github.com/milvus-io/milvus/issues/29507

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-02-21 11:54:53 +08:00
wayblink f976385421
enhance: replace binlogIO with io.BinlogIO in datanode (#29725)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-20 14:38:51 +08:00
wayblink 6c89609de7
enhance: Reduce unnecessary log in binlog_io (#30625)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-18 16:50:51 +08:00
congqixia 3c2e0375df
fix: make compactor inject done called no more than once (#30603)
See also #30571

When `compactionExecutor` stops a compaction task, the `stop` method
causes `injectDone` to be called.

However, in `executeTask`, when the `compact` method returns an error, it
also invokes `injectDone`. That is the reason the `Unlock of unlocked
RWMutex` panic happened.

This PR adds sync.Once to make sure that `injectDone` is called only
once. We did not remove any of the `injectDone` calls, since removing any of
those invocations may cause logic problems.
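
A minimal sketch of the `sync.Once` guard, using a hypothetical `compactionTask` type rather than the real compactor:

```go
package main

import "sync"

// compactionTask is a hypothetical stand-in for a datanode compaction task.
type compactionTask struct {
	mu       sync.RWMutex
	once     sync.Once
	unlocked int // counts real unlocks, for illustration only
}

// start takes the lock that injectDone is expected to release.
func (t *compactionTask) start() {
	t.mu.Lock()
}

// injectDone may be reached from both the stop path and the error path.
// sync.Once guarantees the unlock runs a single time, so a second call
// cannot trigger an "Unlock of unlocked RWMutex" panic.
func (t *compactionTask) injectDone() {
	t.once.Do(func() {
		t.unlocked++
		t.mu.Unlock()
	})
}
```

Calling `injectDone` twice after `start` unlocks once and then does nothing, which keeps every existing call site in place.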

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-18 14:08:49 +08:00
congqixia 91b02b5d22
enhance: Add param item for datanode l0 batch/linear mode memory ratio (#30523)
See also #27606

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-18 13:02:50 +08:00
congqixia b111f3b110
enhance: Use RWMutex and change WLock to RLock (#30557)
Related to #27675

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-06 17:13:56 +08:00
congqixia d4100d5442
enhance: Change update channel cp magic number to param item (#30555)
See also #28817

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-06 16:02:00 +08:00
congqixia a68b32134a
fix: Verify sync task target segment and retry if not match (#30500)
See also #27675 #30469

For a sync task, the segment could be compacted while the task is running. In
the previous implementation, the sync task held only the old segment
id as KeyLock, in which case compaction on the compacted-to segment may run
in parallel with the delta sync of this sync task.

This PR introduces sync target segment verification logic. It checks the
target segment lock it is holding before running the actual syncing logic.
If this check fails, the sync task returns an `errTargetSegementNotMatch`
error and makes the manager re-fetch the current target segment id.
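
The verify-and-retry flow might look roughly like the sketch below. Everything here is hypothetical (`errTargetMismatch`, `syncOnce`, `syncWithRetry`, and the `currentTarget` callback), and locking is reduced to a comment.

```go
package main

import "errors"

// errTargetMismatch is a hypothetical error for this sketch.
var errTargetMismatch = errors.New("target segment does not match")

// syncOnce verifies that the segment whose lock is held is still the
// current target before doing any real sync work.
func syncOnce(locked int64, currentTarget func() int64) error {
	if currentTarget() != locked {
		return errTargetMismatch
	}
	// The actual sync logic would run here.
	return nil
}

// syncWithRetry re-fetches the target segment id whenever verification
// fails, up to maxAttempts times.
func syncWithRetry(currentTarget func() int64, maxAttempts int) (int64, error) {
	for i := 0; i < maxAttempts; i++ {
		target := currentTarget() // re-fetch the current target
		// A real implementation would take the key lock for target here.
		err := syncOnce(target, currentTarget)
		if err == nil {
			return target, nil
		}
		if !errors.Is(err, errTargetMismatch) {
			return 0, err
		}
	}
	return 0, errTargetMismatch
}
```

The check happens after the lock is taken, so a compaction that changed the target in between is detected instead of being raced.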

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-05 11:33:43 +08:00
yihao.dai 18b979d9b4
enhance: Extend support for varchar autoID to BulkInsertV2 (#30477)
issue: https://github.com/milvus-io/milvus/issues/30476

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-04 16:57:05 +08:00
XuanYang-cn e6eb6f2c78
enhance: Speed up L0 compaction (#30410)
This PR changes the following to speed up L0 compaction and
prevent OOM:

1. Lower deltabuf limit to 16MB by default, so that each L0 segment
would be 4X smaller than before.
2. Add BatchProcess, use it if memory is sufficient
3. Iterator deserializes when HasNext is called, to avoid a massive memory
peak
4. Add tracing in spiltDelta

See also: #30191

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-04 10:49:05 +08:00
yihao.dai 7ce876a072
fix: Decoupling importing segment from flush process (#30402)
This PR decouples the importing segment from the flush process by:
1. Excluding the importing segment from the flush policy. This approach
avoids notifying the datanode to flush the importing segment, which may
not exist.
2. When RootCoord calls Flush, DataCoord directly sets the importing
segment state to `Flushed`.

issue: https://github.com/milvus-io/milvus/issues/30359

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-03 13:01:12 +08:00
congqixia 1ab851d73f
enhance: Remove useless frequent log in Mintimestamp (#30471)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-02 20:39:05 +08:00
XuanYang-cn d744962aa1
fix: Correct Size calculation of DeleteData (#30397)
This PR would correct the actual deltalog size

See also: #30191

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-02 10:47:04 +08:00
XuanYang-cn fb5e09d94d
fix: call injectDone after compaction failed (#30277)
syncMgr.Block() will lock the segment when executing compaction.

The previous implementation was unable to Unblock those segments when
compaction failed. If the next compaction of the same segments arrives,
it gets stuck forever and blocks all later compaction tasks.

This PR makes sure the compaction executor Unblocks these segments
after a failed compaction.

Apart from that, this PR also refines some logs and cleans up some code in
compaction and compactor:

1. Log segment count instead of segmentIDs to avoid logging too many
segments
2. Flush RPC returns L1 segments only, skip L0 and L2
3. CompactionType is checked in `Compaction`, no need to check again
inside compactor
4. Use a lighter method to replace `getSegmentMeta`
5. Log information for L0 compaction when it encounters an error

See also: #30213

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-01 14:25:04 +08:00
congqixia be8831b311
enhance: Reduce get segments scan during l0 compaction (#30408)
See also #27606

Previously l0 linear compaction will scan all target segment id from
metacache for each line of delta entry, which is not needed since
compaction target segments shall be all immutable.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-01 10:59:03 +08:00
yihao.dai c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks; an import task will be divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
congqixia fc0d007bd1
enhance: Add `MemoryHighSyncPolicy` back to write buffer manager (#29997)
See also #27675

This PR adds back MemoryHighSyncPolicy implementation. Also change
MinSegmentSize & CheckInterval to configurable param item.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-31 19:03:04 +08:00