Milvus Metrics Dashboard

Milvus outputs a list of detailed time-series metrics during runtime. You can use Prometheus and Grafana to visualize the metrics. This topic introduces the monitoring metrics displayed in the Grafana Milvus Dashboard.

We recommend reading the Milvus monitoring framework overview first to understand Prometheus metrics.

All time values in this topic are in milliseconds.

In this topic, "99th percentile" means the value below which 99 percent of the observed measurements fall.
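
To make the percentile definition concrete, the sketch below estimates a quantile from cumulative histogram buckets, in the spirit of the `histogram_quantile` function used by the p99 queries in this topic. It is a simplified illustration rather than Prometheus's actual implementation: the function name `estimate_quantile` and the bucket values are hypothetical, and the lowest bucket is assumed to start at 0.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs, the same
    shape as the `le`-labelled `_bucket` series of a Prometheus histogram.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total  # number of observations at or below the quantile
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile lies in the +Inf bucket: report the highest
                # finite upper bound instead of interpolating.
                return prev_bound
            if count == prev_count:
                return bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical latency buckets in milliseconds: 99 of 100 requests took <= 100 ms.
example = [(10, 50), (50, 90), (100, 99), (500, 100), (math.inf, 100)]
p99 = estimate_quantile(0.99, example)
```

Because the result is interpolated within a bucket, its precision depends on how finely the bucket boundaries are spaced around the quantile.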

Proxy
Panel Panel description PromQL (Prometheus query language) The Milvus metrics used Milvus metrics description
Search Vector Count Rate The average number of vectors queried per second by each proxy within the past two minutes. sum(increase(milvus_proxy_search_vectors_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id) milvus_proxy_search_vectors_count The accumulated number of vectors queried.
Insert Vector Count Rate The average number of vectors inserted per second by each proxy within the past two minutes. sum(increase(milvus_proxy_insert_vectors_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id) milvus_proxy_insert_vectors_count The accumulated number of vectors inserted.
Search Latency The average latency and the 99th percentile of the latency of receiving search and query requests by each proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_sq_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)
milvus_proxy_sq_latency The latency of search and query requests.
Wait Search Result Latency The average latency and the 99th percentile of the latency between sending search and query requests and receiving results by proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_wait_result_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_sq_wait_result_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_wait_result_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)
milvus_proxy_sq_wait_result_latency The latency between sending search and query requests and receiving results.
Reduce Search Result Latency The average latency and the 99th percentile of the latency of aggregating search and query results by proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_reduce_result_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_sq_reduce_result_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_reduce_result_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)
milvus_proxy_sq_reduce_result_latency The latency of aggregating search and query results returned by each query node.
Decode Search Result Latency The average latency and the 99th percentile of the latency of decoding search and query results by proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_proxy_sq_decode_result_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_sq_decode_result_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type) / sum(increase(milvus_proxy_sq_decode_result_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, query_type)
milvus_proxy_sq_decode_result_latency The latency of decoding each search and query result.
Msg Stream Object Num The average, maximum, and minimum number of the msgstream objects created by each proxy on its corresponding physical topic within the past two minutes. avg(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) max(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) min(milvus_proxy_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_proxy_msgstream_obj_num The number of msgstream objects created on each physical topic.
Mutation Req Latency The average latency and the 99th percentile of the overall latency of receiving insertion or deletion requests by each proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, msg_type, pod, node_id) (rate(milvus_proxy_mutation_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_mutation_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type) / sum(increase(milvus_proxy_mutation_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type)
milvus_proxy_mutation_latency The latency of insertion or deletion requests.
Mutation Send Latency The average latency and the 99th percentile of the latency of sending insertion or deletion requests by each proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, msg_type, pod, node_id) (rate(milvus_proxy_mutation_send_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_mutation_send_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type) / sum(increase(milvus_proxy_mutation_send_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id, msg_type)
milvus_proxy_mutation_send_latency The latency of sending insertion or deletion requests.
Cache Hit Rate The average cache hit rate of operations including GetCollectionID, GetCollectionInfo, and GetCollectionSchema per second within the past two minutes. sum(increase(milvus_proxy_cache_hit_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace", cache_state="hit"}[2m])/120) by(cache_name, pod, node_id) / sum(increase(milvus_proxy_cache_hit_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(cache_name, pod, node_id) milvus_proxy_cache_hit_count The statistics of hit and failure rate of each cache reading operation.
Cache Update Latency The average latency and the 99th percentile of cache update latency by proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_cache_update_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_cache_update_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_cache_update_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)
milvus_proxy_cache_update_latency The latency of updating cache each time.
Sync Time The average, maximum, and minimum number of epoch time synced by each proxy in its corresponding physical channel. avg(milvus_proxy_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) max(milvus_proxy_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) min(milvus_proxy_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_proxy_tt_lag_ms Each physical channel's epoch time (Unix time, the milliseconds passed ever since January 1, 1970).
There is a default ChannelName apart from the physical channels.
Apply PK Latency The average latency and the 99th percentile of primary key application latency by each proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_apply_pk_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_apply_pk_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_apply_pk_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)
milvus_proxy_apply_pk_latency The latency of applying primary key.
Apply Timestamp Latency The average latency and the 99th percentile of timestamp application latency by each proxy within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_proxy_apply_timestamp_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_apply_timestamp_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id) / sum(increase(milvus_proxy_apply_timestamp_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (pod, node_id)
milvus_proxy_apply_timestamp_latency The latency of applying timestamp.
DQL Request Rate The status and number of DQL requests received per second by each proxy within the past two minutes.
DQL requests include DescribeCollection, DescribeIndex, GetCollectionStatistics, HasCollection, Search, Query, ShowPartitions, etc. This panel specifically shows the total number and the number of successful DQL requests.
sum(increase(milvus_proxy_dql_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(function_name, status, pod, node_id) milvus_proxy_dql_req_count The number of all types of DQL requests.
DML Request Rate The status and number of DML requests received per second by each proxy within the past two minutes.
DML requests include Insert, Delete, LoadCollection, HasCollection, ReleaseCollection, etc. This panel specifically shows the total number and the number of successful DML requests.
sum(increase(milvus_proxy_dml_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(function_name, status, pod, node_id) milvus_proxy_dml_req_count The number of all types of DML requests.
DDL Request Rate The status and number of DDL requests received per second by each proxy within the past two minutes.
DDL requests include CreateCollection, DropCollection, ShowCollection, CreatePartition, Flush, etc. This panel specifically shows the total number and the number of successful DDL requests.
sum(increase(milvus_proxy_ddl_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(function_name, status, pod, node_id) milvus_proxy_ddl_req_count The number of all types of DDL requests.
DQL Request Latency The average latency and the 99th percentile of the latency of successfully receiving DQL requests by each proxy in the past two minutes. p99:
histogram_quantile(0.99, sum by (le, function_name, pod, node_id) (rate(milvus_proxy_dql_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_dql_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name, pod, node_id) / sum(increase(milvus_proxy_dql_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name, pod, node_id)
milvus_proxy_dql_req_latency The latency of successful DQL requests.
DML Request Latency The average latency and the 99th percentile of the latency of successfully receiving DML requests by each proxy in the past two minutes. p99:
histogram_quantile(0.99, sum by (le, function_name, pod, node_id) (rate(milvus_proxy_dml_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_dml_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name, pod, node_id) / sum(increase(milvus_proxy_dml_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name, pod, node_id)
milvus_proxy_dml_req_latency The latency of successful DML requests excluding Insert and Delete requests.
For metrics of Insert and Delete requests, refer to milvus_proxy_mutation_latency.
DDL Request Latency The average latency and the 99th percentile of the latency of successfully receiving DDL requests by each proxy in the past two minutes. p99:
histogram_quantile(0.99, sum by (le, function_name, pod, node_id) (rate(milvus_proxy_ddl_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_proxy_ddl_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name, pod, node_id) / sum(increase(milvus_proxy_ddl_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name, pod, node_id)
milvus_proxy_ddl_req_latency The latency of successful DDL requests.
Insert/Delete Request Byte Rate The number of bytes of insert and delete requests received per second by proxy within the past two minutes. sum(increase(milvus_proxy_receive_bytes_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(pod, node_id) milvus_proxy_receive_bytes_count The number of bytes of insert and delete requests received.
Send Byte Rate The number of bytes per second sent back to the client while each proxy is responding to search and query requests within the past two minutes. sum(increase(milvus_proxy_send_bytes_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by(pod, node_id) milvus_proxy_send_bytes_count The number of bytes sent back to the client while each proxy is responding to search and query requests.
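
Most panels in the table above follow one of two patterns: a per-second rate, computed as `increase(...[2m])/120`, or an average latency, computed as the increase of a histogram's `_sum` series divided by the increase of its `_count` series. The sketch below reproduces that arithmetic on two counter samples taken two minutes apart. It is a simplified illustration: the function names and sample values are hypothetical, and it ignores the counter-reset handling and extrapolation that Prometheus's `increase` performs.

```python
WINDOW_SECONDS = 120  # the [2m] range used by the dashboard queries


def increase(first, last):
    """Growth of a counter between two samples (assumes no counter reset)."""
    return last - first


def per_second_rate(first, last, window=WINDOW_SECONDS):
    """Average per-second rate over the window, i.e. increase(...)/120."""
    return increase(first, last) / window


def average_latency(sum_first, sum_last, count_first, count_last):
    """Mean latency in the window: increase(_sum) / increase(_count)."""
    requests = increase(count_first, count_last)
    if requests == 0:
        return float("nan")  # no requests in the window, average undefined
    return increase(sum_first, sum_last) / requests


# Hypothetical samples: 600 more vectors searched over two minutes.
rate = per_second_rate(1000, 1600)

# Hypothetical samples: 150 requests that took 3000 ms in total.
avg_ms = average_latency(2000.0, 5000.0, 100, 250)
```

With these sample values, `rate` is 5 vectors per second and `avg_ms` is 20 milliseconds per request.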
Root coordinator
Panel Panel description PromQL (Prometheus query language) The Milvus metrics used Milvus metrics description
Proxy Node Num The number of proxies created. sum(milvus_rootcoord_proxy_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_rootcoord_proxy_num The number of proxies.
Sync Time The average, maximum, and minimum number of epoch time synced by each root coord in each physical channel (PChannel). avg(milvus_rootcoord_produce_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) max(milvus_rootcoord_produce_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) min(milvus_rootcoord_produce_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_rootcoord_produce_tt_lag_ms Each physical channel's epoch time (Unix time, the milliseconds passed ever since January 1, 1970).
DDL Request Rate The status and number of DDL requests per second within the past two minutes. sum(increase(milvus_rootcoord_ddl_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, function_name) milvus_rootcoord_ddl_req_count The total number of DDL requests including CreateCollection, DescribeCollection, DescribeSegments, HasCollection, ShowCollections, ShowPartitions, and ShowSegments.
DDL Request Latency The average latency and the 99th percentile of DDL request latency within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, function_name) (rate(milvus_rootcoord_ddl_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_rootcoord_ddl_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name) / sum(increase(milvus_rootcoord_ddl_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by (function_name)
milvus_rootcoord_ddl_req_latency The latency of all types of DDL requests.
Sync Timetick Latency The average latency and the 99th percentile of the time used by root coord to sync all timestamps to PChannel within the past two minutes. p99:
histogram_quantile(0.99, sum by (le) (rate(milvus_rootcoord_sync_timetick_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_rootcoord_sync_timetick_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_rootcoord_sync_timetick_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))
milvus_rootcoord_sync_timetick_latency The time used by root coord to sync all timestamps to PChannel.
ID Alloc Rate The number of IDs assigned by root coord per second within the past two minutes. sum(increase(milvus_rootcoord_id_alloc_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) milvus_rootcoord_id_alloc_count The accumulated number of IDs assigned by root coord.
Timestamp The latest timestamp of root coord. milvus_rootcoord_timestamp{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"} milvus_rootcoord_timestamp The latest timestamp of root coord.
Timestamp Saved The pre-assigned timestamps that root coord saves in meta storage. milvus_rootcoord_timestamp_saved{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"} milvus_rootcoord_timestamp_saved The pre-assigned timestamps that root coord saves in meta storage.
The timestamps are assigned 3 seconds earlier, and the timestamp is updated and saved in meta storage every 50 milliseconds.
Collection Num The total number of collections. sum(milvus_rootcoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_rootcoord_collection_num The total number of collections existing in Milvus currently.
Partition Num The total number of partitions. sum(milvus_rootcoord_partition_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_rootcoord_partition_num The total number of partitions existing in Milvus currently.
DML Channel Num The total number of DML channels. sum(milvus_rootcoord_dml_channel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_rootcoord_dml_channel_num The total number of DML channels existing in Milvus currently.
Msgstream Num The total number of msgstreams. sum(milvus_rootcoord_msgstream_obj_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_rootcoord_msgstream_obj_num The total number of msgstreams in Milvus currently.
Credential Num The total number of credentials. sum(milvus_rootcoord_credential_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_rootcoord_credential_num The total number of credentials in Milvus currently.
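
The queries in the tables above contain the Grafana template variables `$instance`, `$app_name`, and `$namespace`, which Grafana fills in before it sends a query to Prometheus. To run a panel query outside Grafana, you can substitute the variables yourself and call the Prometheus HTTP API endpoint `/api/v1/query`. The sketch below only builds the request URL. The helper name `build_query_url`, the base URL, and the variable values are assumptions for illustration, and it handles single-value variables only.

```python
from string import Template
from urllib.parse import urlencode

# Collection Num panel query from the Root coordinator table.
# string.Template uses the same $name placeholder syntax as Grafana variables.
COLLECTION_NUM_QUERY = Template(
    "sum(milvus_rootcoord_collection_num{"
    'app_kubernetes_io_instance=~"$instance", '
    'app_kubernetes_io_name="$app_name", '
    'namespace="$namespace"}) by (app_kubernetes_io_instance)'
)


def build_query_url(base_url, instance, app_name, namespace):
    """Return a Prometheus instant-query URL for the Collection Num panel."""
    promql = COLLECTION_NUM_QUERY.substitute(
        instance=instance, app_name=app_name, namespace=namespace
    )
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical deployment values; replace them with your own.
url = build_query_url("http://localhost:9090", "my-release", "milvus", "default")
```

The resulting URL can be fetched with any HTTP client; Prometheus returns the query result as JSON.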
Query coordinator
Panel Panel description PromQL (Prometheus query language) The Milvus metrics used Milvus metrics description
Collection Loaded Num The number of collections that are currently loaded into memory. sum(milvus_querycoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_querycoord_collection_num The number of collections that are currently loaded by Milvus.
Entity Loaded Num The number of entities that are currently loaded into memory. sum(milvus_querycoord_entity_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_querycoord_entity_num The number of entities that are currently loaded by Milvus.
Load Request Rate The number of load requests per second within the past two minutes. sum(increase(milvus_querycoord_load_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status) milvus_querycoord_load_req_count The accumulated number of load requests.
Release Request Rate The number of release requests per second within the past two minutes. sum(increase(milvus_querycoord_release_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status) milvus_querycoord_release_req_count The accumulated number of release requests.
Load Request Latency The average latency and the 99th percentile of load request latency within the past two minutes. p99:
histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_load_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querycoord_load_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_load_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))
milvus_querycoord_load_latency The time used to complete a load request.
Release Request Latency The average latency and the 99th percentile of release request latency within the past two minutes. p99:
histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_release_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querycoord_release_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_release_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))
milvus_querycoord_release_latency The time used to complete a release request.
Sub-Load Task The number of sub load tasks. sum(milvus_querycoord_child_task_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_querycoord_child_task_num The number of sub load tasks.
A query coord splits a load request into multiple sub load tasks.
Parent Load Task The number of parent load tasks. sum(milvus_querycoord_parent_task_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_querycoord_parent_task_num The number of parent load tasks.
Each load request corresponds to a parent task in the task queue.
Sub-Load Task Latency The average latency and the 99th percentile of the latency of a sub load task within the past two minutes. p99:
histogram_quantile(0.99, sum by (le) (rate(milvus_querycoord_child_task_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querycoord_child_task_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) / sum(increase(milvus_querycoord_child_task_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m]))
milvus_querycoord_child_task_latency The latency to complete a sub load task.
Query Node Num The number of query nodes managed by query coord. sum(milvus_querycoord_querynode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_querycoord_querynode_num The number of query nodes managed by query coord.
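
The Load Request Rate and Release Request Rate panels above aggregate with `by (status)`, so each status value gets its own per-second rate. The sketch below shows that grouping on a few labelled counter increases. It is an illustration only: the function name `rate_by_label`, the label values, and the numbers are hypothetical.

```python
from collections import defaultdict


def rate_by_label(increases, label, window_seconds=120):
    """Sum counter increases per label value, then convert to a per-second rate.

    `increases` is a list of (labels, increase) pairs, where `labels` is a
    dict of label names to values and `increase` is the counter growth over
    the window. This mirrors sum(increase(...[2m])/120) by (<label>).
    """
    totals = defaultdict(float)
    for labels, value in increases:
        totals[labels[label]] += value
    return {key: total / window_seconds for key, total in totals.items()}


# Hypothetical two-minute increases of milvus_querycoord_load_req_count.
samples = [
    ({"status": "success", "pod": "querycoord-0"}, 60),
    ({"status": "success", "pod": "querycoord-1"}, 180),
    ({"status": "fail", "pod": "querycoord-0"}, 12),
]
rates = rate_by_label(samples, "status")
```

Here the two `success` series are summed before dividing by the window length, so the pod label no longer appears in the result.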
Query node
Panel Panel description PromQL (Prometheus query language) The Milvus metrics used Milvus metrics description
Collection Loaded Num The number of collections loaded into memory by each query node. sum(milvus_querynode_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_collection_num The number of collections loaded by each query node.
Partition Loaded Num The number of partitions loaded into memory by each query node. sum(milvus_querynode_partition_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_partition_num The number of partitions loaded by each query node.
Segment Loaded Num The number of segments loaded into memory by each query node. sum(milvus_querynode_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_segment_num The number of segments loaded by each query node.
Queryable Entity Num The number of queryable and searchable entities on each query node. sum(milvus_querynode_entity_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_entity_num The number of queryable and searchable entities on each query node.
DML Virtual Channel The number of DML virtual channels watched by each query node. sum(milvus_querynode_dml_vchannel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_dml_vchannel_num The number of DML virtual channels watched by each query node.
Delta Virtual Channel The number of delta channels watched by each query node. sum(milvus_querynode_delta_vchannel_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_delta_vchannel_num The number of delta channels watched by each query node.
Consumer Num The number of consumers in each query node. sum(milvus_querynode_consumer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_consumer_num The number of consumers in each query node.
Search Request Rate The total number of search and query requests received per second by each query node and the number of successful search and query requests within the past two minutes. sum(increase(milvus_querynode_sq_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (query_type, status, pod, node_id) milvus_querynode_sq_req_count The accumulated number of search and query requests.
Search Request Latency The average latency and the 99th percentile of the time used in search and query requests by each query node within the past two minutes.
This panel displays the latency of search and query requests whose status is "success" or "total".
p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_sq_req_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_sq_req_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_req_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)
milvus_querynode_sq_req_latency The search request latency of query node.
Search in Queue Latency The average latency and the 99th percentile of the latency of search and query requests in queue within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id, query_type) (rate(milvus_querynode_sq_queue_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_sq_queue_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_queue_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)
milvus_querynode_sq_queue_latency The latency of the search and query requests received by query node.
Search Segment Latency The average latency and the 99th percentile of the time each query node takes to search and query a segment within the past two minutes.
The status of a segment can be sealed or growing.
p99:
histogram_quantile(0.99, sum by (le, query_type, segment_state, pod, node_id) (rate(milvus_querynode_sq_segment_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_sq_segment_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type, segment_state) / sum(increase(milvus_querynode_sq_segment_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type, segment_state)
milvus_querynode_sq_segment_latency The time each query node takes to search and query each segment.
Segcore Request Latency The average latency and the 99th percentile of the time each query node takes to search and query in segcore within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, query_type, pod, node_id) (rate(milvus_querynode_sq_core_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_sq_core_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_core_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)
milvus_querynode_sq_core_latency The time each query node takes to search and query in segcore.
Search Reduce Latency The average latency and the 99th percentile of the time used by each query node during the reduce stage of a search or query within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id, query_type) (rate(milvus_querynode_sq_reduce_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_sq_reduce_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type) / sum(increase(milvus_querynode_sq_reduce_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id, query_type)
milvus_querynode_sq_reduce_latency The time each search or query request spends in the reduce stage.
Load Segment Latency The average latency and the 99th percentile of the time each query node takes to load a segment within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_load_segment_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_load_segment_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_load_segment_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_querynode_load_segment_latency The time each query node takes to load a segment.
Flowgraph Num The number of flowgraphs in each query node. sum(milvus_querynode_flowgraph_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_flowgraph_num The number of flowgraphs in each query node.
Unsolved Read Task Length The length of the queue of unsolved read requests in each query node. sum(milvus_querynode_read_task_unsolved_len{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_read_task_unsolved_len The length of the queue of unsolved read requests.
Ready Read Task Length The length of the queue of read requests to be executed in each query node. sum(milvus_querynode_read_task_ready_len{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_read_task_ready_len The length of the queue of read requests to be executed.
Parallel Read Task Num The number of concurrent read requests currently executed in each query node. sum(milvus_querynode_read_task_concurrency{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_read_task_concurrency The number of concurrent read requests currently executed.
Estimate CPU Usage The CPU usage by each query node estimated by the scheduler. sum(milvus_querynode_estimate_cpu_usage{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_querynode_estimate_cpu_usage The CPU usage by each query node estimated by the scheduler.
A value of 100 means that one whole virtual CPU (vCPU) is used.
Search Group Size The average number and the 99th percentile of the search group size (i.e., the total number of original search requests in the combined search requests executed by each query node) within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_size_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_search_group_size_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_size_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_querynode_search_group_size The number of original search tasks among the combined search tasks from different buckets (i.e., the search group size).
Search NQ The average number and the 99th percentile of the number of queries (NQ) done while each query node executes search requests within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_nq_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_search_nq_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_nq_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_querynode_search_nq The number of queries (NQ) of search requests.
Search Group NQ The average number and the 99th percentile of NQ of search requests combined and executed by each query node within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_nq_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_search_group_nq_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_nq_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_querynode_search_group_nq The NQ of search requests combined from different buckets.
Search Top_K The average number and the 99th percentile of the Top_K of search requests executed by each query node within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_topk_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_search_topk_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_topk_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_querynode_search_topk The Top_K of search requests.
Search Group Top_K The average number and the 99th percentile of the Top_K of search requests combined and executed by each query node within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_querynode_search_group_topk_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_querynode_search_group_topk_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_querynode_search_group_topk_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_querynode_search_group_topk The Top_K of search requests combined from different buckets.
Evicted Read Requests Rate The number of read requests evicted per second by each query node within the past two minutes. sum(increase(milvus_querynode_read_evicted_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id) milvus_querynode_read_evicted_count The accumulated number of read requests evicted by query node due to traffic restriction.
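The p99 and avg queries in the table above all follow one pattern: the average divides the increase of the histogram's `_sum` series by the increase of its `_count` series, and the 99th percentile is interpolated from the cumulative `_bucket` series. The following Python sketch illustrates that arithmetic only. The function names and sample numbers are hypothetical, and it is a simplified approximation of what Prometheus does for classic histograms (linear interpolation inside a bucket, lowest bucket assumed to start at 0), not the Prometheus implementation itself.

```python
import math


def average_from_histogram(sum_increase, count_increase):
    """Mirror `increase(x_sum[2m]) / increase(x_count[2m])`."""
    if count_increase == 0:
        return math.nan  # no observations in the window
    return sum_increase / count_increase


def quantile_from_buckets(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, ending with the +Inf bucket. Simplified sketch: assumes the
    lowest bucket starts at 0 and interpolates linearly inside a bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # The rank falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            in_bucket = cumulative - lower_count
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return lower_bound


# Hypothetical window: 40 requests took 1200 ms in total.
avg_ms = average_from_histogram(1200.0, 40.0)

# Hypothetical cumulative bucket counts (upper bound in ms, count).
p90_ms = quantile_from_buckets(
    0.9, [(10.0, 20.0), (50.0, 80.0), (100.0, 100.0), (math.inf, 100.0)]
)
```

The dashboard feeds `rate(...)` of the buckets into `histogram_quantile` rather than raw counts. Because every bucket is scaled by the same factor, the interpolated result is the same. The estimate is only as precise as the bucket boundaries allow.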
Data coordinator
Panel Panel description PromQL (Prometheus query language) The Milvus metrics used Milvus metrics description
Data Node Num The number of data nodes managed by data coord. sum(milvus_datacoord_datanode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_datacoord_datanode_num The number of data nodes managed by data coord.
Segment Num The number of all types of segments recorded in metadata by data coord. sum(milvus_datacoord_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (segment_state) milvus_datacoord_segment_num The number of all types of segments recorded in metadata by data coord.
Segment types include dropped, flushed, flushing, growing, and sealed.
Collection Num The number of collections recorded in metadata by data coord. sum(milvus_datacoord_collection_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_datacoord_collection_num The number of collections recorded in metadata by data coord.
Stored Rows The accumulated number of rows of valid and flushed data in data coord. sum(milvus_datacoord_stored_rows_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_datacoord_stored_rows_num The accumulated number of rows of valid and flushed data in data coord.
Stored Rows Rate The average number of rows flushed per second within the past two minutes. sum(increase(milvus_datacoord_stored_rows_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (pod, node_id) milvus_datacoord_stored_rows_count The accumulated number of rows flushed by data coord.
Sync Time The average, maximum, and minimum epoch time synced by data coord across physical channels. avg(milvus_datacoord_consumer_datanode_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) max(milvus_datacoord_consumer_datanode_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) min(milvus_datacoord_consumer_datanode_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_datacoord_consumer_datanode_tt_lag_ms Each physical channel's epoch time (Unix time, the number of milliseconds elapsed since January 1, 1970).
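The rate panels, such as Stored Rows Rate above, divide `increase(...[2m])` by 120 to turn a two-minute increase of an accumulated counter into a per-second average. The following Python sketch shows that calculation on a hypothetical list of counter samples. The function names and sample values are made up for illustration, and the sketch ignores the extrapolation to the window boundaries that Prometheus applies in `increase()`.

```python
def counter_increase(samples):
    """Sum the increase of a monotonically growing counter.

    A drop between two samples is treated as a counter reset
    (for example, a pod restart), so counting resumes from zero.
    """
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            total += current  # reset: everything since zero is new
    return total


def per_second_rate(samples, window_seconds=120):
    """Mirror `increase(x[2m]) / 120` for samples taken inside the window."""
    return counter_increase(samples) / window_seconds


# Hypothetical scrapes of an accumulated row counter within two minutes.
rows_per_second = per_second_rate([1000.0, 1600.0, 2200.0, 3400.0])
```

Dividing the increase by the window length in seconds is the same arithmetic as PromQL's `rate()` over that window.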
Data node
Panel Panel description PromQL (Prometheus query language) The Milvus metrics used Milvus metrics description
Flowgraph Num The number of flowgraph objects that correspond to each data node. sum(milvus_datanode_flowgraph_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_datanode_flowgraph_num The number of flowgraph objects.
Each shard in a collection corresponds to a flowgraph object.
Msg Rows Consume Rate The number of rows of streaming messages consumed per second by each data node within the past two minutes. sum(increase(milvus_datanode_msg_rows_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (msg_type, pod, node_id) milvus_datanode_msg_rows_count The number of rows of streaming messages consumed.
Currently, streaming messages counted by data node only include insertion and deletion messages.
Flush Data Size Rate The size of each flushed message recorded per second by each data node within the past two minutes. sum(increase(milvus_datanode_flushed_data_size{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (msg_type, pod, node_id) milvus_datanode_flushed_data_size The size of each flushed message.
Currently, streaming messages counted by data node only include insertion and deletion messages.
Consumer Num The number of consumers created on each data node. sum(milvus_datanode_consumer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_datanode_consumer_num The number of consumers created on each data node.
Each flowgraph corresponds to a consumer.
Producer Num The number of producers created on each data node. sum(milvus_datanode_producer_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_datanode_producer_num The number of producers created on each data node.
Each shard in a collection corresponds to a delta channel producer and a timetick channel producer.
Sync Time The average, maximum, and minimum epoch time synced by each data node across all physical topics. avg(milvus_datanode_produce_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) max(milvus_datanode_produce_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) min(milvus_datanode_produce_tt_lag_ms{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_datanode_produce_tt_lag_ms The epoch time (Unix time, the number of milliseconds elapsed since January 1, 1970) of each physical topic on a data node.
Unflushed Segment Num The number of unflushed segments created on each data node. sum(milvus_datanode_unflushed_segment_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id) milvus_datanode_unflushed_segment_num The number of unflushed segments created on each data node.
Encode Buffer Latency The average latency and the 99th percentile of the time used to encode a buffer by each data node within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_encode_buffer_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_datanode_encode_buffer_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_encode_buffer_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_datanode_encode_buffer_latency The time each data node takes to encode a buffer.
Save Data Latency The average latency and the 99th percentile of the time used to write a buffer into the storage layer by each data node within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_save_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_datanode_save_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_save_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_datanode_save_latency The time each data node takes to write a buffer into the storage layer.
Flush Operate Rate The number of times each data node flushes a buffer per second within the past two minutes. sum(increase(milvus_datanode_flush_buffer_op_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id) milvus_datanode_flush_buffer_op_count The accumulated number of times a data node flushes a buffer.
Autoflush Operate Rate The number of times each data node auto-flushes a buffer per second within the past two minutes. sum(increase(milvus_datanode_autoflush_buffer_op_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id) milvus_datanode_autoflush_buffer_op_count The accumulated number of times a data node auto-flushes a buffer.
Flush Request Rate The number of times each data node receives a buffer flush request per second within the past two minutes. sum(increase(milvus_datanode_flush_req_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id) milvus_datanode_flush_req_count The accumulated number of times a data node receives a flush request from a data coord.
Compaction Latency The average latency and the 99th percentile of the time each data node takes to execute a compaction task within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_datanode_compaction_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_datanode_compaction_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_datanode_compaction_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_datanode_compaction_latency The time each data node takes to execute a compaction task.
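Every query in these tables contains the dashboard variables `$instance`, `$app_name`, and `$namespace`, which Grafana fills in from the dashboard's drop-down selectors. To try a query outside Grafana, the variables have to be replaced with concrete values first. The following Python sketch does this with the standard library's `string.Template`. The helper name and the values `my-release`, `milvus`, and `default` are hypothetical placeholders, and the sketch does not reproduce how Grafana expands multi-value selections.

```python
from string import Template

# The Flowgraph Num query from the data node table, kept as a template.
FLOWGRAPH_NUM = Template(
    'sum(milvus_datanode_flowgraph_num{app_kubernetes_io_instance=~"$instance", '
    'app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (pod, node_id)'
)


def render_query(template, instance, app_name, namespace):
    """Replace the three dashboard variables with concrete label values."""
    return template.substitute(
        instance=instance, app_name=app_name, namespace=namespace
    )


# Hypothetical label values; use the ones from your own deployment.
query = render_query(FLOWGRAPH_NUM, "my-release", "milvus", "default")
```

Because the instance label uses the regular-expression matcher `=~`, a value such as `my-release.*` would match several releases at once, while the other two labels must match exactly.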
Index coordinator
Panel Panel description PromQL (Prometheus query language) The Milvus metrics used Milvus metrics description
Index Request Rate The average number of index building requests received per second by index coord within the past two minutes. sum(increase(milvus_indexcoord_indexreq_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status) milvus_indexcoord_indexreq_count The number of index building requests received by index coord.
Index Task Count The count of all indexing tasks recorded by index coord in index metadata. sum(milvus_indexcoord_indextask_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (index_task_status) milvus_indexcoord_indextask_count The count of all indexing tasks recorded by index coord in index metadata.
Index Node Num The number of index nodes managed by index coord. sum(milvus_indexcoord_indexnode_num{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}) by (app_kubernetes_io_instance) milvus_indexcoord_indexnode_num The number of index nodes managed by index coord.
Index node
Panel Panel description PromQL (Prometheus query language) The Milvus metrics used Milvus metrics description
Index Task Rate The average number of index building tasks received by each index node per second within the past two minutes. sum(increase(milvus_indexnode_index_task_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])/120) by (status, pod, node_id) milvus_indexnode_index_task_count The number of index building tasks received.
Load Field Latency The average latency and the 99th percentile of the time used by each index node to load segment field data each time within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_load_field_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_indexnode_load_field_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_load_field_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_indexnode_load_field_latency The time used by index node to load segment field data.
Decode Field Latency The average latency and the 99th percentile of the time used by each index node to decode field data each time within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_decode_field_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_indexnode_decode_field_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_decode_field_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_indexnode_decode_field_latency The time used to decode field data.
Build Index Latency The average latency and the 99th percentile of the time used by each index node to build indexes within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_build_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_indexnode_build_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_build_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_indexnode_build_index_latency The time used to build indexes.
Encode Index Latency The average latency and the 99th percentile of the time used by each index node to encode index files within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_encode_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_indexnode_encode_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_encode_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_indexnode_encode_index_latency The time used to encode index files.
Save Index Latency The average latency and the 99th percentile of the time used by each index node to save index files within the past two minutes. p99:
histogram_quantile(0.99, sum by (le, pod, node_id) (rate(milvus_indexnode_save_index_latency_bucket{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])))
avg:
sum(increase(milvus_indexnode_save_index_latency_sum{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id) / sum(increase(milvus_indexnode_save_index_latency_count{app_kubernetes_io_instance=~"$instance", app_kubernetes_io_name="$app_name", namespace="$namespace"}[2m])) by(pod, node_id)
milvus_indexnode_save_index_latency The time used to save index files.