In most cases, data is distributed almost evenly across channels, so we
can use the channel count to optimize search params in queryHook.
Signed-off-by: chasingegg <chao.gao@zilliz.com>
issue: #34715
If a collection's segment list no longer changes, the next target is
empty most of the time. Segment balancing checks whether a segment
exists in both the current and next targets, so balancing could be
blocked because the next target is empty.
This PR permits a segment to be moved if the next target is empty, to
avoid balancing getting stuck.
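The relaxed check can be sketched as follows. This is a minimal illustration, not the query coord code: `canMoveSegment` and the two map parameters are hypothetical stand-ins for the current and next targets, and it assumes the segment must still be present in the current target.

```go
package main

// canMoveSegment reports whether a balance task may move the segment.
// current and next are hypothetical stand-ins for the current and next
// targets, keyed by segment ID.
func canMoveSegment(segmentID int64, current, next map[int64]struct{}) bool {
	// Assumption: the segment must still exist in the current target.
	if _, ok := current[segmentID]; !ok {
		return false
	}
	// An empty next target no longer blocks the move.
	if len(next) == 0 {
		return true
	}
	// Otherwise the segment must exist in the next target as well.
	_, ok := next[segmentID]
	return ok
}
```

With an empty next target, a segment in the current target is movable; with a non-empty next target, both targets are still consulted.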
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Growing slices and map.growWork may cost a lot when the segment number
is large for big-K queries. This PR pre-allocates space for the reduce
methods to avoid this cost.
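A minimal sketch of the pre-allocation idea, assuming a simplified reduce that merges per-segment ID lists. `reduceIDs` is a hypothetical helper, not one of the actual reduce methods.

```go
package main

// reduceIDs merges per-segment ID lists and drops duplicates.
// The name and shape are illustrative, not the Milvus reduce code.
func reduceIDs(perSegment [][]int64) []int64 {
	total := 0
	for _, ids := range perSegment {
		total += len(ids)
	}
	// Size both containers once up front so append and map inserts
	// should not need to grow them while merging.
	merged := make([]int64, 0, total)
	seen := make(map[int64]struct{}, total)
	for _, ids := range perSegment {
		for _, id := range ids {
			if _, dup := seen[id]; dup {
				continue
			}
			seen[id] = struct{}{}
			merged = append(merged, id)
		}
	}
	return merged
}
```

The upper bound `total` may over-allocate when there are many duplicates; the sketch trades that memory for fewer regrow steps.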
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Add the prometheus dependency for the monitor module; otherwise some
compilers may report a compilation failure.
issue: #35077
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
1. fix compaction tasks not being cleaned up correctly
2. add a new parameter to control the compaction gc loop interval
3. remove some useless configs of clustering compaction
bug: #34764
Signed-off-by: wayblink <anyang.wang@zilliz.com>
After a component is manually stopped through the management RESTful
API, `healthz` may return an unhealthy state. k8s may then restart the
pod to recover from the unhealthy state, so the manual stop operation
gets an unexpected result.
To solve this, we make the `healthz` API skip manually stopped components.
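The skip rule can be sketched like this. `componentState` and `overallHealthy` are hypothetical names for illustration; the real handler and state types are not shown in this PR description.

```go
package main

// componentState is a hypothetical view of one component's status.
type componentState struct {
	Healthy         bool
	ManuallyStopped bool
}

// overallHealthy returns false only if a component that was NOT
// manually stopped is unhealthy.
func overallHealthy(components map[string]componentState) bool {
	for _, c := range components {
		if c.ManuallyStopped {
			continue // a manual stop is intentional, not a failure
		}
		if !c.Healthy {
			return false
		}
	}
	return true
}
```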
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
#34778 #34849
Fix two problems:
1. count(*) is incorrect if a growing segment inserts duplicated (pk,
timestamp) pairs whose pk and timestamp are both the same; only one
such pair needs to be kept.
2. count(*) may core dump if the get_real_count interface takes a
snapshot and does mvcc in an inconsistent state, which mainly happens
under concurrency.
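The first problem can be illustrated with a small sketch. The actual fix lives in segcore (C++); this Go version only shows the counting rule, and `pkTs` and `countDistinctRows` are hypothetical names.

```go
package main

// pkTs is a hypothetical (pk, timestamp) pair.
type pkTs struct {
	PK        int64
	Timestamp uint64
}

// countDistinctRows counts rows while keeping only one copy of each
// pair whose pk and timestamp are both identical.
func countDistinctRows(rows []pkTs) int {
	seen := make(map[pkTs]struct{}, len(rows))
	for _, r := range rows {
		seen[r] = struct{}{}
	}
	return len(seen)
}
```

Rows that share a pk but differ in timestamp are still counted separately here; only exact duplicates collapse.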
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
issue: #33583
The old policy permits a datanode to have at most 2 more channels than
other datanodes. So if milvus has 2 datanodes and 2 channels, both
channels will be assigned to 1 datanode, leaving the other datanode empty.
This PR refines the balance policy to solve channel imbalance on datanodes.
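To see why a tolerance of 2 leaves one datanode empty, consider this sketch. `isBalanced` is a hypothetical helper, and the tighter tolerance of 1 used in the usage note is an assumption for illustration; the description does not state the refined rule.

```go
package main

// isBalanced reports whether the gap between the busiest and the
// idlest node is within the given tolerance.
func isBalanced(channelsPerNode []int, tolerance int) bool {
	if len(channelsPerNode) == 0 {
		return true
	}
	lo, hi := channelsPerNode[0], channelsPerNode[0]
	for _, n := range channelsPerNode[1:] {
		if n < lo {
			lo = n
		}
		if n > hi {
			hi = n
		}
	}
	return hi-lo <= tolerance
}
```

With 2 datanodes and 2 channels, the layout `[2, 0]` passes a tolerance of 2, so nothing triggers a rebalance. Under an assumed tolerance of 1 the same layout fails, while `[1, 1]` passes.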
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #34746
This PR adds a segment level field to the responses of
`GetPersistentSegmentInfo` and `GetQuerySegmentInfo`
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #33285
- register streaming coord service into datacoord.
- add new streaming node role.
- add global static switch to enable streaming service or not.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #33285
- add specialized mutable and immutable messages for type safety.
- add version based constructor and type.
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #34972
Fix a segv when string type data is filled with memcpy without enough
memory being allocated in advance.
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
issue: #34798
After we removed the task priority on query coord, to avoid load/release
segment tasks being blocked by too many balance tasks, we limit the
number of balance tasks in each round. At the same time, we reduce the
balance interval to trigger balance more frequently.
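The per-round cap can be sketched as a simple truncation. `balanceTask` and `limitPerRound` are hypothetical names, and the sketch assumes tasks beyond the cap are simply left for a later round.

```go
package main

// balanceTask is a hypothetical balance task descriptor.
type balanceTask struct {
	SegmentID int64
}

// limitPerRound keeps at most maxPerRound tasks for this round;
// the remainder is expected to be regenerated in a later round.
func limitPerRound(tasks []balanceTask, maxPerRound int) []balanceTask {
	if maxPerRound < 0 {
		maxPerRound = 0
	}
	if len(tasks) <= maxPerRound {
		return tasks
	}
	return tasks[:maxPerRound]
}
```

A shorter balance interval then makes up for the smaller batch, so load/release tasks get a chance to run between rounds.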
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
1. support reading and writing null in segcore
valid_data is stored in fieldData (using the uint8_t type to save memory).
2. support loading null
The binlog reader reads and writes data into the column (sealed segment)
or insertRecord (growing segment). In a sealed segment, valid_data is
stored directly. In a growing segment, considering the prior
implementation and code readability, uint8_t is converted to
fbvector<bool>, which may be optimized in the future.
3. retrieve valid_data
valid_data is parsed in search/query.
#31728
---------
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
issue: #34595
PR #34596 added an overloaded factor to segments in the delegator, which
causes the same segment to get different scores in the delegator and the
worker. This may cause segments to bounce between delegator and worker.
This PR uses the average score to compute the delegator overloaded
factor, to avoid segments bouncing between delegator and worker.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #34685
knowhere needs a new json param `range_search_k` for RangeSearch to
early terminate the iterator.
Signed-off-by: min.tian <min.tian.cn@gmail.com>
issue: #33285
- make message builder and message conversion type safe
- add adaptor type and function to adapt old msgstream msgpack and
message interface
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #34781
When a segment balance hasn't finished yet, query coord may find 2 loaded
copies of the segment and generate a task to deduplicate them, which may
cancel the balance task. The old copy has then been released while the
new copy is canceled before it is ready, so search fails due to the
missing segment.
This PR sets the deduplicate segment task's priority to low, to avoid
the balance segment task being canceled by the deduplicate task.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
1. Move the common modules of streamingNode and dataNode to flushcommon
2. Add new GetVChannels interface for rootcoord
issue: https://github.com/milvus-io/milvus/issues/33285
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Related to #33235
The querynode pipeline builds a map and calls ProcessInsert even when
there are no write messages, so querynodes have high CPU usage even when
there is no workload.
This PR checks the message length before composing the data struct and
calling the method.
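The early-exit pattern looks roughly like this. `insertMsg` and `groupRowsBySegment` are hypothetical names; the sketch only shows skipping the map allocation when there is nothing to process.

```go
package main

// insertMsg is a hypothetical, simplified insert message.
type insertMsg struct {
	SegmentID int64
	Rows      int
}

// groupRowsBySegment returns nil without allocating when there are no
// messages, so callers can skip the downstream processing call.
func groupRowsBySegment(msgs []insertMsg) map[int64]int {
	if len(msgs) == 0 {
		return nil // nothing to do: no map, no further work
	}
	rows := make(map[int64]int, len(msgs))
	for _, m := range msgs {
		rows[m.SegmentID] += m.Rows
	}
	return rows
}
```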
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Add scalar filtering and vector search latency metrics to distinguish
the cost of scalar filtering.
To add metrics in the query chain, add a monitor module and move the
metric files from the original storage module.
issue: #34780
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
Seals the largest growing segment if the total size of growing segments
of each shard exceeds the size threshold (default 4GB). Introducing this
policy can help keep the size of growing segments within a suitable
level, alleviating the pressure on the delegator.
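The policy above can be sketched as follows. `growingSegment` and `pickSegmentToSeal` are hypothetical names, and the sketch assumes the input already contains only the growing segments of a single shard.

```go
package main

// growingSegment is a hypothetical view of one growing segment.
type growingSegment struct {
	ID        int64
	SizeBytes int64
}

// pickSegmentToSeal returns the ID of the largest growing segment when
// the total size of all growing segments exceeds thresholdBytes.
func pickSegmentToSeal(segs []growingSegment, thresholdBytes int64) (int64, bool) {
	var total int64
	largest := -1
	for i, s := range segs {
		total += s.SizeBytes
		if largest < 0 || s.SizeBytes > segs[largest].SizeBytes {
			largest = i
		}
	}
	if largest < 0 || total <= thresholdBytes {
		return 0, false
	}
	return segs[largest].ID, true
}
```

For example, with a 4GB threshold, segments of 3GB and 2GB total 5GB, so the 3GB segment is picked for sealing.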
issue: https://github.com/milvus-io/milvus/issues/34554
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #34357
Go Parquet uses dictionary encoding by default, and it will fall back to
plain encoding if the dictionary size exceeds the dictionary page size
limit. Users can specify a custom fallback encoding by using
`parquet.WithEncoding(ENCODING_METHOD)` in writer properties. However,
Go Parquet [falls back to plain
encoding](e65c1e295d/go/parquet/file/column_writer_types.gen.go.tmpl (L238))
rather than the custom encoding method users provide. Therefore, this patch
only turns off dictionary encoding for the primary key.
With a 5 million auto ID primary key benchmark, the parquet file size
shrinks from 13.93 MB to 8.36 MB when dictionary encoding is turned
off, reducing primary key storage space by 40%.
Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
See also #34670
This PR adds a quota configuration for the number of l0 segment entries
per collection. If l0 compaction cannot keep up with the
insertion/upsertion rate, this feature can apply back pressure to the
related rate.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR removes the dependency of compaction on the ID allocator by
pre-allocating the logID and segmentID.
issue: https://github.com/milvus-io/milvus/issues/33957
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #33285
- add idAlloc interface
- fix binary unsafe bug for message
- fix service discovery being lost when an address repeats with a
different server id
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #34595
When consuming insert data on the delegator node, QueryCoord will move
out some sealed segments to manage its memory usage. After the growing
segment gets flushed, some sealed segments from other workers will be
moved back to the delegator node. To avoid the frequent movement of
segments, we estimate the maximum growing row count and preserve a
fixed-size memory in the delegator node.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #33285
- add two grpc resolvers (by session and by streaming coord assignment
service)
- add one grpc balancer (by serverID and roundrobin)
- add lazy conn to avoid blocking on the first service discovery
- add some utility functions for the streaming service
Signed-off-by: chyezh <chyezh@outlook.com>
The nodeID for compaction task initialization is 0. This PR adjusts the
task reassignment conditions to allow new compaction tasks to be
reassigned and executed.
issue: https://github.com/milvus-io/milvus/issues/34460
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
See also #34574
Add jitter to the segment seal proportion to avoid a burst of seal
operations in a short period of time.
This PR also fixes the license header in the paramtable pkg.
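The seal proportion jitter can be sketched as a random downward scaling. The formula, `jitteredProportion`, and `newRNG` are assumptions for illustration; the description does not give the exact jitter formula.

```go
package main

import "math/rand"

// newRNG returns a seeded generator so the sketch is reproducible.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// jitteredProportion scales proportion by a random factor in
// (1-jitter, 1], so segments reach their seal point at slightly
// different sizes instead of all at once.
func jitteredProportion(proportion, jitter float64, rng *rand.Rand) float64 {
	if jitter <= 0 {
		return proportion
	}
	if jitter > 1 {
		jitter = 1
	}
	return proportion * (1 - jitter*rng.Float64())
}
```

With a proportion of 0.12 and a jitter of 0.1, each call yields a value between roughly 0.108 and 0.12.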
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
github.com/gogo/protobuf is deprecated and could be error-prone after
upgrading protobuf messages to v2.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
related: #33544
The main changes are in three aspects:
1. enable setting group_size for the group by function
2. separate normal reduce and group by reduce
3. eliminate unnecessary padding in search results for reducing
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
issue: #34545
Print a warn log instead of failing the health check if orphan channel cp
meta is found in a health check request.
Signed-off-by: jaime <yun.zhang@zilliz.com>
Related to #34508
The padding bytes shall be written only at the end of the mmap file, not
after the chunk of each field data file.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #34500
The sort in `GetLevel0Deletions` breaks the correspondence between pks
and tss, and the pks and tss will be sorted in the segment.Delete()
interface anyway.
This PR removes this unnecessary and incorrect sort step, to avoid
queries returning deleted records.
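The failure mode is that pks and tss are parallel slices: sorting one without the other detaches each pk from its timestamp. The sketch below shows a pair-preserving sort for contrast. It is not what this PR does (the PR removes the sort), and `deletion` and `sortDeletionPairs` are hypothetical names.

```go
package main

import "sort"

// deletion is a hypothetical (pk, ts) delete record.
type deletion struct {
	PK int64
	Ts uint64
}

// sortDeletionPairs sorts by pk while moving each ts together with its
// pk, so pks[i] still belongs to tss[i] afterwards.
// It assumes len(pks) == len(tss).
func sortDeletionPairs(pks []int64, tss []uint64) ([]int64, []uint64) {
	pairs := make([]deletion, len(pks))
	for i := range pks {
		pairs[i] = deletion{PK: pks[i], Ts: tss[i]}
	}
	sort.SliceStable(pairs, func(i, j int) bool { return pairs[i].PK < pairs[j].PK })
	outPKs := make([]int64, len(pairs))
	outTss := make([]uint64, len(pairs))
	for i, p := range pairs {
		outPKs[i], outTss[i] = p.PK, p.Ts
	}
	return outPKs, outTss
}
```

Sorting only `pks` in the same input would leave `tss` in its original order, so a delete could be applied with the wrong timestamp.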
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #34483
Some lint issues were introduced due to the lack of a static check run. This PR
fixes these problems.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #33285
- implement producing and consuming server of message
- implement management operation for streaming node server
---------
Signed-off-by: chyezh <chyezh@outlook.com>
related: #30376
fix: partitionIDs lost when no partitions are set
enhance: refine metrics for segment prune
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
issue: #34304
Cosine is more widely used for float vectors, and cosine and hamming
distance are 'metrics' that have good geometric properties.
Signed-off-by: chasingegg <chao.gao@zilliz.com>
The import depends on syncTask, which in turn relies on the
allocator. This PR pre-allocates the necessary IDs for the import syncTask.
issue: https://github.com/milvus-io/milvus/issues/33957
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #34123
Benchmark case: The benchmark runs the Go benchmark function
`BenchmarkDeltalogFormat`, which is included in the Files changed. It tests
the performance of serializing and deserializing two different data
formats on a 10 million delete log dataset.
Metrics: The benchmarks measure the average time taken per operation
(ns/op), memory allocated per operation (MB/op), and the number of
memory allocations per operation (allocs/op).
| Test Name | Avg Time (ns/op) | Time Comparison | Memory Allocation (MB/op) | Memory Comparison | Allocation Count (allocs/op) | Allocation Comparison |
|---|---|---|---|---|---|---|
| one_string_format_reader | 2,781,990,000 | Baseline | 2,422 | Baseline | 20,336,539 | Baseline |
| pk_ts_separate_format_reader | 480,682,639 | -82.72% | 1,765 | -27.14% | 20,396,958 | +0.30% |
| one_string_format_writer | 5,483,436,041 | Baseline | 13,900 | Baseline | 70,057,473 | Baseline |
| pk_and_ts_separate_format_writer | 798,591,584 | -85.43% | 2,178 | -84.34% | 30,270,488 | -56.78% |
Both read and write operations show significant improvements in both
speed and memory allocation.
Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
Check if the segment exists during FlushSegments and add some key logs
in write path.
issue: https://github.com/milvus-io/milvus/issues/34255
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Added retry method and unit test cases for retrying etcd server start.
New to open source and Go language. Please reject if this is not the
requirement/ specify the modifications needed in code.
issue : #17569
Signed-off-by: Charles Kakumanu <charles_kakumanu@apple.com>
Co-authored-by: Charles Kakumanu <charles_kakumanu@apple.com>
issue: #31224 #34374
for query api:
1. param filter is not required
2. param limit is useless while outputFields = [count(*)]
add a hook for grpc calls
---------
Signed-off-by: PowderLi <min.li@zilliz.com>
related: #30376
1. support more complex exprs
2. add more unit tests for unrelated fields
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
Correct the update logic of timerecorder in the flowgraph to avoid false
failure: "some node(s) haven't received input".
issue: https://github.com/milvus-io/milvus/issues/34337
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
The log is confusing when a `Reassign` channel operation fails, because
both the success and failure logs are printed in a row. This PR continues
the loop to avoid this output.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Some lint issues were not detected due to a recent static check pipeline
issue. This PR fixes these problems and the Go milvusclient testcases.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR makes the varchar & string array field max length exceeded error
message clearer. It also fixes a minor issue where the error string format
and argument number do not match.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #34234
`LoadPartitions` does not guarantee that the current target has the
loading partitions if some partitions were already loaded before.
This PR checks that the current target contains the partition to load
when advancing the loading percentage to 100.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #32252
This PR tries to pre-allocate FieldData for Reduce operations in the Query
chain using typeutil.PrepareResultFieldData, to avoid the overhead of
dynamically growing the slice during the appendFieldData process.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Trying to update the index for an l0 segment fails with `index not found`.
This PR skips the index update for l0 segments.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #34095
When a new query node comes online, the segment_checker,
channel_checker, and balance_checker simultaneously attempt to allocate
segments to it. If this occurs during the execution of a load task and
the distribution of the new query node hasn't been updated, the query
coordinator may mistakenly view the new query node as empty. As a
result, it assigns segments or channels to it, potentially overloading
the new query node with more segments or channels than expected.
This PR measures the workload of the executing tasks on the target query
node to prevent assigning an excessive number of segments to it.
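The idea of counting in-flight work can be sketched as follows. `pickLeastLoaded` and its parameters are hypothetical; the sketch assumes load is measured as a plain segment count per node.

```go
package main

// pickLeastLoaded chooses the node with the smallest effective load,
// where effective load = segments already reported in the distribution
// + segments carried by tasks still executing on that node.
func pickLeastLoaded(nodes []int64, distributed, executing map[int64]int) (int64, bool) {
	if len(nodes) == 0 {
		return 0, false
	}
	best := nodes[0]
	bestLoad := distributed[best] + executing[best]
	for _, n := range nodes[1:] {
		if load := distributed[n] + executing[n]; load < bestLoad {
			best, bestLoad = n, load
		}
	}
	return best, true
}
```

A freshly started node with an empty distribution but 5 segments in flight no longer looks emptier than a node that already holds 3 segments.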
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #33285
- use reader instead of consumer for pulsar
- advanced test framework
- move some streaming related package into pkg
---------
Signed-off-by: chyezh <chyezh@outlook.com>