85 Commits (eb046863485fdf3e130fc60484485c901b81276b)

Author SHA1 Message Date
sthuang 90acc8a58f
enhance: upgrade go arrow version from 12.0.1 to 17.0.0 (#39916)
related: https://github.com/milvus-io/milvus/issues/39915

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-02-25 10:30:02 +08:00
congqixia cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Ted Xu 8562a102ec
enhance: API integration with storage v2 in mix-compactions (#40008)
See #39173

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-02-22 14:23:54 +08:00
smellthemoon 8b974c5742
enhance: support compact if lack of binlog (#40000)
https://github.com/milvus-io/milvus/issues/39718

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-02-22 10:51:56 +08:00
Ted Xu 2978b0890e
enhance: iterative download data during compaction to reduce memory cost (#39724)
See #37234

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-02-13 10:36:47 +08:00
Ted Xu 53a4207f46
enhance: improve sort performance by writing with batch record (#39685)
See #37234

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-02-12 15:08:47 +08:00
XuanYang-cn 1f14053c70
enhance: Enable to observe write amplification (#39661)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-02-08 18:38:43 +08:00
Ted Xu 427b6a4c94
enhance: reduce stats task cost by skipping ser/de (#39568)
See #37234

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-02-06 17:14:45 +08:00
congqixia 6d8441ad7e
enhance: Use mockery pkg config for datacoord&datanode (#39567)
Related to #38339

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-24 14:25:06 +08:00
XuanYang-cn b8fca4f5c1
fix: Clustering compaction ignoring deltalogs (#39132)
See also: #39131

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-01-10 14:07:05 +08:00
Zhen Ye bb8d1ab3bf
enhance: make new go package to manage proto (#39114)
issue: #39095

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:49:01 +08:00
XuanYang-cn 4df444ef25
fix: Avoid add negative missing count (#38748)
See also: #34665

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-26 18:58:56 +08:00
XuanYang-cn c731357538
enhance: Add missing delete metrics (#38634)
Add 2 counter metrics:
- Total delete entries from deltalog:
milvus_datanode_compaction_delete_count
- Total missing deletes: milvus_datanode_compaction_missing_delete_count

See also: #34665

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-25 10:24:50 +08:00
tinswzy 27229f7907
enhance: refine existing log print with ctx (#38080)
issue: #35917
Refines existing log printing with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
cai.zhang 6ffc57c8dc
fix: Fix sorting buffer in clustering compaction (#38417)
issue: #28410

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-13 10:12:49 +08:00
Ted Xu dc85d8e968
enhance: improve mix compaction performance by removing max segment limitations (#38344)
See #37234

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-12-11 20:38:42 +08:00
cai.zhang 41b19c6b1d
enhance: Determine the number of buffers based on the resource limits of the DataNode (#38209)
issue: #28410

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-08 18:02:40 +08:00
cai.zhang dae4160466
enhance: Whether to enable mergeSort mode when performing mixCompaction (#37664)
issue: #37579

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-11-19 11:28:31 +08:00
yihao.dai 81879425e1
enhance: Optimize the performance of stats task (#37374)
1. Increase the writer's `batchSize` to avoid multiple serialization
operations.
2. Perform asynchronous upload of binlog files to prevent blocking the
data processing flow.
3. Reduce multiple calls to `writer.Flush()`.

issue: https://github.com/milvus-io/milvus/issues/37373

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-08 10:08:27 +08:00
Ted Xu bc9562feb1
enhance: avoid memory copy and serde in mix compaction (#37479)
See: #37234

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-11-07 16:30:57 -08:00
aoiasd b4c749dcd5
fix: merge sort segment loss data (#37400)
relate: https://github.com/milvus-io/milvus/issues/37238

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-07 11:18:26 +08:00
Ted Xu b792b199d7
enhance: load deltalogs on demand when doing compactions (#37310)
See #37234

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-11-01 16:40:21 +08:00
Ted Xu 262a994d6d
enhance: generally improve the performance of mix compactions (#37163)
See #37234

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-10-29 18:12:20 +08:00
cai.zhang 04c306e63f
fix: Fix clustering compaction task leak (#36800)
issue: #36686 

bug reason:
- The clustering compaction tasks on the datanode were never cleaned up.
- The clustering compaction task contains a mapping from clustering key
to buffer, which caused a large memory leak.

fix:
- datacoord cleans up the tasks on the datanode when clustering compaction
finishes.
- reset the mapping from clustering key to buffer on the datanode when
clustering finishes.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-10-17 20:43:30 +08:00
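
The leak and fix described in #36800 above follow a common pattern: finished tasks must be removed from the worker's registry, and any large per-task maps must be released. The Go sketch below shows that pattern only. `clusteringTask`, `taskRegistry`, and their methods are hypothetical names, not the types used in Milvus.

```go
package main

import "sync"

// clusteringTask is a hypothetical stand-in for a clustering compaction task.
type clusteringTask struct {
	planID  int64
	buffers map[string][]byte // clustering key -> buffer
}

// cleanup drops the key-to-buffer mapping so the buffers can be collected.
func (t *clusteringTask) cleanup() {
	t.buffers = nil
}

// taskRegistry tracks the tasks held by a worker node.
type taskRegistry struct {
	mu    sync.Mutex
	tasks map[int64]*clusteringTask
}

func newTaskRegistry() *taskRegistry {
	return &taskRegistry{tasks: make(map[int64]*clusteringTask)}
}

func (r *taskRegistry) add(t *clusteringTask) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.tasks[t.planID] = t
}

// remove is what the coordinator would trigger once a task has finished:
// it releases the task's buffers and forgets the task itself.
func (r *taskRegistry) remove(planID int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if t, ok := r.tasks[planID]; ok {
		t.cleanup()
		delete(r.tasks, planID)
	}
}

func (r *taskRegistry) size() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.tasks)
}
```

Without the `remove` call, every finished task and its buffers would stay reachable through the registry map for the lifetime of the process.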
aoiasd 5ec4163d0f
feat: support bm25 logs mixcompaction (#36072)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-14 16:57:22 +08:00
CharlesFeng 7c8b71e26c
fix: BinlogDeserializeReader leak in mix_compactor.go (#36270)
https://github.com/milvus-io/milvus/issues/36269

Signed-off-by: fengjun2016 <jornfeng@gmail.com>
2024-10-11 15:41:20 +08:00
XuanYang-cn 290ceb4e84
enhance: Add more info in logs (#36731)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-10-10 17:51:25 +08:00
wayblink 00a5025949
enhance: support clustering compaction on null value (#36372)
issue: #36055

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-09-30 14:33:17 +08:00
cai.zhang 2adca8b754
fix: Fix data race for clustering compaction (#36440)
issue: #36438

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-28 17:19:21 +08:00
aoiasd 139787371e
feat: support embedding bm25 sparse vector and flush bm25 stats log (#36036)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-09-19 10:57:12 +08:00
cai.zhang 8395c8a8db
enhance: Update stats task to optional (#35947)
issue: #33744

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-12 20:37:08 +08:00
smellthemoon 3f75bf1f20
fix: clustering compact not support null (#36152)
#36055

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-09-11 14:49:06 +08:00
XuanYang-cn 2687747278
fix: Set an empty segment if compaction deleted all inserts (#36044)
See also: #36038

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-09-09 14:23:05 +08:00
Chun Han e480b103bd
feat: supporting hybrid search group_by (#35982)
related: #35096

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-09-08 17:09:04 +08:00
SimFG 5247631289
fix: fill the metric type field in the LoadMetaInfo object (#35962)
- issue: #35960

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-09-05 20:50:23 -07:00
cai.zhang 90bdb171ab
fix: Fix data race for clustering compaction writer (#35957)
issue: #35950

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-05 04:07:10 +08:00
yihao.dai 6fd33285e1
fix: Fix compile error (#35901)
/kind improvement

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-02 14:50:35 +08:00
cai.zhang 2c9bb4dfa3
feat: Support stats task to sort segment by PK (#35054)
issue: #33744 

This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-02 14:19:03 +08:00
yihao.dai 1413ffe9b1
enhance: Rename preAllocatedSegments (#35871)
Rename `preAllocatedSegments` to `preAllocatedSegmentIDs` to avoid
confusion.

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-01 17:09:01 +08:00
XuanYang-cn 323400c190
enhance: Enable to write multiple segments in mix compactor (#35705)
Prevent segments from being written larger than maxSize * expansionRate

See also: #35584

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-08-30 11:29:01 +08:00
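
The size cap mentioned in #35705 above (`maxSize * expansionRate`) can be sketched as a writer that rolls over to a new segment before the current one would exceed the cap. This is an illustration under assumptions, not the mix compactor's code: `multiSegmentWriter`, `segmentWriter`, and the byte-count-only `Write` are hypothetical.

```go
package main

// segmentWriter is a hypothetical per-segment writer that only tracks bytes.
type segmentWriter struct {
	id   int64
	size uint64
}

// multiSegmentWriter starts a new segment whenever the next row would push
// the current segment past limit.
type multiSegmentWriter struct {
	limit    uint64 // maxSize * expansionRate
	nextID   int64
	current  *segmentWriter
	finished []*segmentWriter
}

func newMultiSegmentWriter(maxSize uint64, expansionRate float64) *multiSegmentWriter {
	return &multiSegmentWriter{limit: uint64(float64(maxSize) * expansionRate)}
}

// Write accounts for one row of rowSize bytes. A single row larger than
// limit still gets a segment of its own in this sketch.
func (w *multiSegmentWriter) Write(rowSize uint64) {
	if w.current != nil && w.current.size+rowSize > w.limit {
		w.finished = append(w.finished, w.current)
		w.current = nil
	}
	if w.current == nil {
		w.current = &segmentWriter{id: w.nextID}
		w.nextID++
	}
	w.current.size += rowSize
}

// Close seals the last open segment and returns every segment written.
func (w *multiSegmentWriter) Close() []*segmentWriter {
	if w.current != nil {
		w.finished = append(w.finished, w.current)
		w.current = nil
	}
	return w.finished
}
```

With `maxSize` 100 and `expansionRate` 1.25 the cap is 125 bytes, so writing ten 50-byte rows yields five segments of 100 bytes each.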
congqixia ab532ae199
enhance: Add back BF lazy load logic for datanode watch channel (#35646)
Add back lazy loading of statslog when watching the DML channel on datanode.

Related to #22994 #27675

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-22 19:42:57 +08:00
Ted Xu 41646c8439
feat: integrate new deltalog format (#35522)
See #34123

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-08-20 19:06:56 +08:00
XuanYang-cn 967f38672a
enhance: Add integration tests for l0 (#35429)
See also: #34796

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-08-19 10:56:54 +08:00
cai.zhang 1bbf7a3c0e
enhance: Optimize the use of locks and avoid double flush clustering buffer writer (#35486)
issue: #35436

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-08-16 02:24:58 +08:00
cai.zhang 196b343a94
fix: Fix data race for clustering compaction (#35435)
issue: #35436

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-08-13 17:10:20 +08:00
cai.zhang aaab827a16
fix: Fix the issue of missing stats log after clustering compaction (#35266)
issue: #35265

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-08-08 14:24:17 +08:00
yihao.dai a4439cc911
enhance: Implement flusher in streamingNode (#34942)
- Implement flusher to:
  - Manage the pipelines (creation, deletion, etc.)
  - Manage the segment write buffer
  - Manage sync operation (including receive flushMsg and execute flush)
- Add a new `GetChannelRecoveryInfo` RPC in DataCoord.
- Reorganize packages: `flushcommon` and `datanode`.

issue: https://github.com/milvus-io/milvus/issues/33285

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-02 18:30:23 +08:00
wayblink 95462668ca
enhance: unify time in clustering compaction task to unix (#35167)
#34495

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-08-02 10:30:19 +08:00
zhenshan.cao aa247f192d
enhance: remove unused code for StorageV2 (#35132)
issue: https://github.com/milvus-io/milvus/issues/34168

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
cai.zhang 9412002d7d
fix: Fix data race for clustering buffer writer (#35145)
issue: #34495

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-08-01 11:20:13 +08:00