issue: #39551
This PR remove querycoord's scheduling of l0 segments:
- only load l0 segment when watch channel
- only release l0 segment when release channel or sync data distribution
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
https://github.com/milvus-io/milvus/issues/35528
This PR adds json index support for json and dynamic fields. Now you can
only do unary query like 'a["b"] > 1' using this index. We will support
more filter type later.
basic usage:
```
collection.create_index("json_field", {"index_type": "INVERTED",
"params": {"json_cast_type": DataType.STRING, "json_path":
'json_field["a"]["b"]'}})
```
There are some limits to use this index:
1. If a record does not have the json path you specify, it will be
ignored and there will not be an error.
2. If a value of the json path fails to be cast to the type you specify,
it will be ignored and there will not be an error.
3. A specific json path can have only one json index.
4. If you try to create more than one json indexes for one json field,
sdk(pymilvus<=2.4.7) may return immediately because of internal
implementation. This will be fixed in a later version.
---------
Signed-off-by: sunby <sunbingyi1992@gmail.com>
1. Make the segment loader lock protect only the resource.
2. Optimize GetDiskUsage to avoid excessive overhead.
issue: https://github.com/milvus-io/milvus/issues/37630
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #33285
- move most cgo opeartions related to search/query into segcore package
for reusing for streamingnode.
- add go unittest for segcore operations.
Signed-off-by: chyezh <chyezh@outlook.com>
Related to #35303#30404
This PR change return type of `DeleteCodec.Deserialize` from
`storage.DeleteData` to `DeltaData`, which
reduces the memory usage of interface header.
Also refine `storage.DeltaData` methods to make it easier to usage.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #36887
`LoadDeltaLogs` API did not check memory usage. When system is under
high delete load pressure, this could result into OOM quit.
This PR add resource check for `LoadDeltaLogs` actions and separate
internal deltalog loading function with public one.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #35303
Slice of `storage.PrimaryKey` will have extra interface cost for each
element, which may cause notable memory usage when delta row count
number is large.
This PR replaces PrimaryKey slice with PrimaryKeys interface saving the
extra interface cost.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #33744
This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #33005
1. add `MemorySize` field for insert binlog.
2. `LogSize` means the file size in the storage object.
3. `MemorySize` means the size of the data in the memory.
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
issue: #32206, #32801
- search failure with some assertion, segment not loaded and resource
insufficient.
- segment leak when query segments
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue:#32899
This PR fix the wrong metric value of load index, which introduced by
pr#32567, use wrong time unit for load index metrics
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #32530
when try to match segment bloom filter with pk, we can reuse the hash
locations. This PR maintain the max hash Func, and compute hash location
once for all segment, reuse hash location can speed up bf access
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #32748
This PR:
- Add `metautil.Channel` utiltiy which convert virtual name to physical
channel name, collectionID and shard idx
- Add channel mapper interface & implementation to convert limited
physical channel name into int index
- Apply `metautil.Channel` filter in querynode segment manager logic
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #32663
- Use new param to control request resource timeout for lazy load.
- Remove the timeout parameter of `Do`, remove `DoWait`. use `context`
to control the timeout.
- Use `VersionedNotifier` to avoid notify event lost and broadcast,
remove the redundant goroutine in cache.
related dev pr: #32684
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #30361
- Delete may be lost when segment is not data-loaded status in lru
cache. skip filtering to fix it.
- `stats_` and `variable_fields_avg_size_` should be reset when
`ReleaseData`
- Remove repeat load delta log operation in lru.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
related: #31959
1. reset segment index status after evicting to lazyload=true
2. reset num_rows to null_opt
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>