Commit Graph

100 Commits (ba625835bc2571270cc0e09729b5f2547943db9e)

Author SHA1 Message Date
congqixia 0feee53631
enhance: Add back unit test for compactor and fix some TODOs (#31829)
This PR adds back compactor "Unhandled" data type unit test and fixes
some TODOs behvaior

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-02 20:35:14 +08:00
Buqian Zheng 3c80083f51
feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630)
add sparse float vector support to different milvus components,
including proxy, data node to receive and write sparse float vectors to
binlog, query node to handle search requests, index node to build index
for sparse float column, etc.

https://github.com/milvus-io/milvus/issues/29419

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-13 14:32:54 -07:00
Ted Xu 987d9023a5
enhance: Enable binlog deserialize reader in datanode compaction (#31036)
See #30863

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-03-08 18:25:02 +08:00
wayblink f976385421
enhance: replace binlogIO with io.BinlogIO in datanode (#29725)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-20 14:38:51 +08:00
congqixia 3c2e0375df
fix: make compactor inject done called no more than once (#30603)
See also #30571

When `compactionExecutor` stops one compaction task, the `stop` method
will case `injectDone` called.

However in `executeTask` when `compact` method returns error, it shall
also invoke `injectDone` as well. That the reason `Unlock of unlocked
RWMutex` panicking happened.

This PR add sync.Once to make sure that `injectDone` is called only
once. We did not remove any of the `injectDone` since removal any of
those invocation may cause logic problem.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-18 14:08:49 +08:00
XuanYang-cn fb5e09d94d
fix: call injectDone after compaction failed (#30277)
syncMgr.Block() will lock the segment when executing compaction.

Previous implementation was unable to Unblock thoese segments when
compaction failed. If next compaction of the same segments arrives,
it'll stuck forever and block all later compation tasks.

This PR makes sure compaction executor would Unblock these segments
after a failure compaction.

Apart form that, this PR also refines some logs and clean some codes of
compaction, compactor:

1. Log segment count instead of segmentIDs to avoid logging too many
segments
2. Flush RPC returns L1 segments only, skip L0 and L2
3. CompactionType is checked in `Compaction`, no need to check again
inside compactor
4. Use ligter method to replace `getSegmentMeta`
5. Log information for L0 compaction when encounters an error

See also: #30213

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-01 14:25:04 +08:00
congqixia 0c7a96b48d
enhance: Make compaction log has traceID (#30338)
See also #30167

After support open telemetry tracing, we want to have traceID as well,
this PR adds util functions to set traceID with span & propagate traceID
between different context.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-30 10:09:03 +08:00
congqixia 6a73860815
enhance: Add open telemetry tracing for compaction (#30168)
Resolves #30167

This PR add tracing for all compaction from the task start in datacoord
and execution procedures in datanode.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-23 10:37:00 +08:00
XuanYang-cn 86f48861c1
fix: Add more throughput in related metrics (#30038)
This PR also fixes bugs in l0 compactor where
l0 results would never be removed from datanode

See also: #30099

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-19 11:34:54 +08:00
smellthemoon e52ce370b6
enhance:don't store logPath in meta to reduce memory (#28873)
don't store logPath in meta to reduce memory, when service get
segmentinfo, generate logpath from logid.
#28885

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-18 22:06:31 +08:00
SimFG d9edd50f97
fix: the delete msg disorder issue (#29915)
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-01-14 10:26:52 +08:00
congqixia ed89c6a2ee
enhance: make compactor use actual buffer size to decide when to sync (#29945)
See also: #29657

Datanode Compactor use estimated row number from schema to decide when
to sync the batch of data when executing compaction. This est value
could go way from actual size when the schema contains variable field(
say VarChar, JSON, etc.)

This PR make compactor able to check the actual buffer data size and
make it possible to sync when buffer is actually beyong max binglog
size.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-13 01:32:52 +08:00
Xu Tong e429965f32
Add float16 approve for multi-type part (#28427)
issue:https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
yah01 a8a0aa9357
fix: missing to support compact for Array type (#29505)
the array type can't be compacted, the system could continue with the
inserted segments, but these segments can be never compacted

fix #29503

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-12-28 15:42:48 +08:00
XuanYang-cn fe04598900
enhance: Add compaction type label to metrics (#29485)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-12-27 15:56:48 +08:00
XuanYang-cn aae7e62729
feat: Add levelzero compaction in DN (#28470)
See also: #27606

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-30 14:30:28 +08:00
XuanYang-cn 321c5c32e3
fix: Separate schedule and check results loop (#28692)
This PR:

- Separates compaction scheduler and check results loop So that slow in
check-loop doesn't influence execution.

- Cleans compaction tasks when drop a vchannel so dropped-channel's
compaction tasks won't be checked over and over again.

  - Skips meta change when meta's already changed, avoid panic
  - Remove not inuse injectDone(bool) parameter

See also: #28628, #28209

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-29 10:50:29 +08:00
congqixia 2b3fa8f67b
fix: Add length check for `storage.NewPrimaryKeyStats` (#28576)
See also #28575
Add zero-length check for `storage.NewPrimaryKeyStats`. This function
shall return error when non-positive rowNum passed.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-21 10:28:21 +08:00
congqixia bed7467f20
enhance: Remove commented code and fix naming issue (#28450)
This PR removes all the commented code and files from PR #28320

For naming issue:
- Renaming `MinCheckpoint` to `EarliestPosition`, see #28320 comment
- Renaming `writebuffer.Mananger` to `BufferMananger`, see #27874
comment

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-16 00:22:20 +08:00
congqixia 0b905078e7
Use writebuffer, sync manager refactory in datanode (#28320)
See also #27675
This PR make previously merged refactory of datanode go online
- Use write node to replace insert/delete node
- Use write buffer manager to control all buffers
- Use sync manager to control sync tasks instead of flush manager

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-15 15:24:18 +08:00
XuanYang-cn 40d5c902b6
Enable getting multiple segments in plan result (#28350)
Compaction plan result contained one segment for one plan. For l0
compaction would write to multiple segments, this PR expand the segments
number in plan results and refactor some names for readibility.

- Name refactory: - CompactionStateResult -> CompactionPlanResult -
CompactionResult -> CompactionSegment

See also: #27606

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-14 15:56:19 +08:00
XuanYang-cn 294ff74ca5
Add new compaction types (#27608)
See also: #27606

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-10-11 06:49:32 +08:00
congqixia 1d76565894
Add metrics for garbage collection (#27303)
Also fix second metrics usage in compaction

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-22 18:47:25 +08:00
SimFG 26f06dd732
Format the code (#27275)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
Xiaofan 6635398a6d
Fix Bin log concurrency by adding a pool (#27189)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-09-19 18:05:22 +08:00
Xu Tong 9166011c4a
Add float16 vector (#25852)
Signed-off-by: Writer-X <1256866856@qq.com>
2023-09-08 10:03:16 +08:00
XuanYang-cn b2e7cbdf4b
Remove TimeTravel in compactor (#26785)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-09-04 17:41:48 +08:00
XuanYang-cn 8d54509e54
Fix CompactionLatency metrics (#26747)
- Refine compactor logs

See also: #26743

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-08-31 17:35:03 +08:00
wayblink 21eeb37ce9
Save binlog timestampFrom, timestampTo meta when compact (#26203)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-08-08 21:17:21 +08:00
xige-16 33c2012675
Add more metrics (#25081)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-06-26 17:52:44 +08:00
PowderLi 3f4356df10
fix the spelling of `field` (#25008)
Signed-off-by: PowderLi <min.li@zilliz.com>
2023-06-21 14:00:42 +08:00
jaime b75dcf90c0
Fix Flush hang after SyncSegments timeout (#24953)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-06-20 10:20:41 +08:00
Xiaofan f57fe6d70b
Add compaction log (#24976)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-06-19 14:18:41 +08:00
congqixia 41af0a98fa
Use go-api/v2 for milvus-proto (#24770)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-06-09 01:28:37 +08:00
aoiasd c84bdcea49
merge stats log when segment flushing or compacting (#23570)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2023-05-29 10:21:28 +08:00
yah01 99cc24b283
Fix compaction doesn't support JSON data (#24151)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-05-17 19:09:22 +08:00
jaime c9d0c157ec
Move some modules from internal to public package (#22572)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-04-06 19:14:32 +08:00
XuanYang-cn 93bc805933
Enhance ID allocator in DataNode (#22905)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-03-23 19:43:57 +08:00
SimFG 4a90490a67
Fix the `segment not found` error (#22772)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-03-17 17:27:56 +08:00
smellthemoon 8799b06dbc
Fix the data is disappeared when upsert (#22762)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-03-14 19:11:54 +08:00
Enwei Jiao 697dedac7e
Use cockroachdb/errors to replace other error pkg (#22390)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-02-26 11:31:49 +08:00
wei liu 0ce70b4f10
fix compact task data race (#22299)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-02-21 16:10:26 +08:00
Xiaofan 949d5d078f
Fix memory calculation in dataCodec (#21800)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-01-28 11:09:52 +08:00
Xiaofan e977e014a9
Fix flush didn't respect binaryvector and other schemas (#21120)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>

Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2022-12-12 10:33:26 +08:00
Enwei Jiao 89b810a4db
Refactor all params into ParamItem (#20987)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>

Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2022-12-07 18:01:19 +08:00
zhenshan.cao 9724ae5e27
Refine log level (#20904)
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2022-12-01 20:07:15 +08:00
Bingyi Sun 2390095232
Fix load uses compacted segments' binlogs (#20655)
Signed-off-by: sunby <bingyi.sun@zilliz.com>

Signed-off-by: sunby <bingyi.sun@zilliz.com>
Co-authored-by: sunby <bingyi.sun@zilliz.com>
2022-11-17 20:37:10 +08:00
MrPresent-Han 21b54709a2
enable delbuf auto flush function to avoid OOM when del msg accumulating (#20213)
issue: #19713

Signed-off-by: MrPresent-Han <jamesharden11122@gmail.com>

Signed-off-by: MrPresent-Han <jamesharden11122@gmail.com>
Co-authored-by: MrPresent-Han <zilliz@zillizdeMacBook-Pro-2.local>
2022-11-07 18:49:02 +08:00
Enwei Jiao 956c5e1b9d
Make Params singleton (#20088)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>

Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2022-11-04 14:25:38 +08:00
Xiaofan 2bfecf5b4e
Refine bloomfilter and memory usage (#20168)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>

Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2022-10-31 17:41:34 +08:00