Clean up flow-control.md
parent 64b2336468
commit b8cc58a7a3
|
@ -792,14 +792,15 @@ your requests.
|
||||||
|
|
||||||
To detect whether requests are being rejected due to APF, check the following
|
To detect whether requests are being rejected due to APF, check the following
|
||||||
metrics:
|
metrics:
|
||||||
|
|
||||||
- apiserver_flowcontrol_rejected_requests_total: the total number of requests
|
- apiserver_flowcontrol_rejected_requests_total: the total number of requests
|
||||||
rejected per FlowSchema and PriorityLevelConfiguration.
|
rejected per FlowSchema and PriorityLevelConfiguration.
|
||||||
- apiserver_flowcontrol_current_inqueue_requests: the current number of requests
|
- apiserver_flowcontrol_current_inqueue_requests: the current number of requests
|
||||||
queued per FlowSchema and PriorityLevelConfiguration.
|
queued per FlowSchema and PriorityLevelConfiguration.
|
||||||
- apiserver_flowcontrol_request_wait_duration_seconds: the latency added to
|
- apiserver_flowcontrol_request_wait_duration_seconds: the latency added to
|
||||||
requests waiting in queues.
|
requests waiting in queues.
|
||||||
- apiserver_flowcontrol_priority_level_seat_utilization: the seat utilization
|
- apiserver_flowcontrol_priority_level_seat_utilization: the seat utilization
|
||||||
per PriorityLevelConfiguration.
|
per PriorityLevelConfiguration.
|
||||||
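As a rough illustration of the metrics listed above, the sketch below sums `apiserver_flowcontrol_rejected_requests_total` per priority level from metrics text in the Prometheus exposition format (for instance, the output of `kubectl get --raw /metrics`). The sample values, the label names (`flow_schema`, `priority_level`, `reason`), and the helper name `rejected_by_priority_level` are assumptions made for this example, not output captured from a real cluster.

```python
import re
from collections import defaultdict

# Illustrative sample only: label names and values are assumptions,
# not captured from a real kube-apiserver.
SAMPLE_METRICS = """\
# TYPE apiserver_flowcontrol_rejected_requests_total counter
apiserver_flowcontrol_rejected_requests_total{flow_schema="service-accounts",priority_level="workload-low",reason="queue-full"} 12
apiserver_flowcontrol_rejected_requests_total{flow_schema="service-accounts",priority_level="workload-low",reason="time-out"} 3
apiserver_flowcontrol_rejected_requests_total{flow_schema="global-default",priority_level="global-default",reason="queue-full"} 0
apiserver_flowcontrol_current_inqueue_requests{flow_schema="service-accounts",priority_level="workload-low"} 7
"""

REJECTED = "apiserver_flowcontrol_rejected_requests_total"
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)\{([^}]*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def rejected_by_priority_level(metrics_text):
    """Sum rejected-request counters, grouped by the priority_level label."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or match.group(1) != REJECTED:
            continue  # skip comments and unrelated metrics
        labels = dict(LABEL_RE.findall(match.group(2)))
        totals[labels.get("priority_level", "unknown")] += float(match.group(3))
    return dict(totals)


if __name__ == "__main__":
    for level, count in rejected_by_priority_level(SAMPLE_METRICS).items():
        print(f"{level}: {count:.0f} rejected")
```

A counter that keeps growing for one priority level suggests that level is the one running out of seats or queue space.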
|
|
||||||
### Workload modifications {#good-practice-workload-modifications}
|
### Workload modifications {#good-practice-workload-modifications}
|
||||||
|
|
||||||
|
@ -807,21 +808,21 @@ To prevent requests from queuing and adding latency or being dropped due to APF,
|
||||||
you can optimize your requests by:
|
you can optimize your requests by:
|
||||||
|
|
||||||
- Reducing the rate at which requests are executed. A fewer number of requests
|
- Reducing the rate at which requests are executed. A fewer number of requests
|
||||||
over a fixed period will result in a fewer number of seats being needed at a
|
over a fixed period will result in a fewer number of seats being needed at a
|
||||||
given time.
|
given time.
|
||||||
- Avoid issuing a large number of expensive requests concurrently. Requests can
|
- Avoid issuing a large number of expensive requests concurrently. Requests can
|
||||||
be optimized to use fewer seats or have lower latency so that these requests
|
be optimized to use fewer seats or have lower latency so that these requests
|
||||||
hold those seats for a shorter duration. List requests can occupy more than 1
|
hold those seats for a shorter duration. List requests can occupy more than 1
|
||||||
seat depending on the number of objects fetched during the request. Restricting
|
seat depending on the number of objects fetched during the request. Restricting
|
||||||
the number of objects retrieved in a list request, for example by using
|
the number of objects retrieved in a list request, for example by using
|
||||||
pagination, will use less total seats over a shorter period. Furthermore,
|
pagination, will use less total seats over a shorter period. Furthermore,
|
||||||
replacing list requests with watch requests will require lower total concurrency
|
replacing list requests with watch requests will require lower total concurrency
|
||||||
shares as watch requests only occupy 1 seat during its initial burst of
|
shares as watch requests only occupy 1 seat during its initial burst of
|
||||||
notifications. If using streaming lists in versions 1.27 and later, watch
|
notifications. If using streaming lists in versions 1.27 and later, watch
|
||||||
requests will occupy the same number of seats as a list request for its initial
|
requests will occupy the same number of seats as a list request for its initial
|
||||||
burst of notifications because the entire state of the collection has to be
|
burst of notifications because the entire state of the collection has to be
|
||||||
streamed. Note that in both cases, a watch request will not hold any seats after
|
streamed. Note that in both cases, a watch request will not hold any seats after
|
||||||
this initial phase.
|
this initial phase.
|
||||||
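The pagination advice above can be sketched as a loop that fetches a bounded number of objects per request. The Kubernetes list API supports `limit` and `continue` parameters for this; the `fetch_page` callable and the in-memory `make_fake_fetch` stand-in below are hypothetical and exist only so the sketch runs without a cluster.

```python
def list_in_pages(fetch_page, page_size=500):
    """Yield every object, requesting at most page_size objects per call.

    fetch_page is a hypothetical callable taking (limit, continue_token)
    and returning (items, next_token), where next_token is None on the
    last page.
    """
    token = None
    while True:
        items, token = fetch_page(limit=page_size, continue_token=token)
        yield from items
        if not token:
            break


def make_fake_fetch(objects):
    """Build an in-memory stand-in for a paginated list endpoint."""
    def fetch_page(limit, continue_token=None):
        start = int(continue_token) if continue_token else 0
        end = start + limit
        next_token = str(end) if end < len(objects) else None
        return objects[start:end], next_token
    return fetch_page


if __name__ == "__main__":
    fetch = make_fake_fetch([f"event-{i}" for i in range(7)])
    for name in list_in_pages(fetch, page_size=3):
        print(name)
```

Each call retrieves a small, bounded number of objects, so no single request has to fetch the whole collection at once.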
|
|
||||||
Keep in mind that queuing or rejected requests from APF could be induced by
|
Keep in mind that queuing or rejected requests from APF could be induced by
|
||||||
either an increase in the number of requests or an increase in latency for
|
either an increase in the number of requests or an increase in latency for
|
||||||
|
@ -840,33 +841,34 @@ objects or create new objects of these types to better accommodate your
|
||||||
workload.
|
workload.
|
||||||
|
|
||||||
APF settings can be modified to:
|
APF settings can be modified to:
|
||||||
|
|
||||||
- Give more seats to high priority requests.
|
- Give more seats to high priority requests.
|
||||||
- Isolate non-essential or expensive requests that would starve a concurrency
|
- Isolate non-essential or expensive requests that would starve a concurrency
|
||||||
level if it was shared with other flows.
|
level if it was shared with other flows.
|
||||||
|
|
||||||
#### Give more seats to high priority requests
|
#### Give more seats to high priority requests
|
||||||
|
|
||||||
1. If possible, the number of seats available across all priority levels for a
|
1. If possible, the number of seats available across all priority levels for a
|
||||||
particular `kube-apiserver` can be increased by increasing the values for the
|
particular `kube-apiserver` can be increased by increasing the values for the
|
||||||
`max-requests-inflight` and `max-mutating-requests-inflight` flags. Alternatively,
|
`max-requests-inflight` and `max-mutating-requests-inflight` flags. Alternatively,
|
||||||
horizontally scaling the number of `kube-apiserver` instances will increase the
|
horizontally scaling the number of `kube-apiserver` instances will increase the
|
||||||
total concurrency per priority level across the cluster assuming there is
|
total concurrency per priority level across the cluster assuming there is
|
||||||
sufficient load balancing of requests.
|
sufficient load balancing of requests.
|
||||||
2. You can create a new FlowSchema which references a PriorityLevelConfiguration
|
1. You can create a new FlowSchema which references a PriorityLevelConfiguration
|
||||||
with a larger concurrency level. This new PriorityLevelConfiguration could be an
|
with a larger concurrency level. This new PriorityLevelConfiguration could be an
|
||||||
existing level or a new level with its own set of nominal concurrency shares.
|
existing level or a new level with its own set of nominal concurrency shares.
|
||||||
For example, a new FlowSchema could be introduced to change the
|
For example, a new FlowSchema could be introduced to change the
|
||||||
PriorityLevelConfiguration for your requests from global-default to workload-low
|
PriorityLevelConfiguration for your requests from global-default to workload-low
|
||||||
to increase the number of seats available to your user. Creating a new
|
to increase the number of seats available to your user. Creating a new
|
||||||
PriorityLevelConfiguration will reduce the number of seats designated for
|
PriorityLevelConfiguration will reduce the number of seats designated for
|
||||||
existing levels. Recall that editing a default FlowSchema or
|
existing levels. Recall that editing a default FlowSchema or
|
||||||
PriorityLevelConfiguration will require setting the
|
PriorityLevelConfiguration will require setting the
|
||||||
`apf.kubernetes.io/autoupdate-spec` annotation to false.
|
`apf.kubernetes.io/autoupdate-spec` annotation to false.
|
||||||
3. You can also increase the NominalConcurrencyShares for the
|
1. You can also increase the NominalConcurrencyShares for the
|
||||||
PriorityLevelConfiguration which is serving your high priority requests.
|
PriorityLevelConfiguration which is serving your high priority requests.
|
||||||
Alternatively, for versions 1.26 and later, you can increase the LendablePercent
|
Alternatively, for versions 1.26 and later, you can increase the LendablePercent
|
||||||
for competing priority levels so that the given priority level has a higher pool
|
for competing priority levels so that the given priority level has a higher pool
|
||||||
of seats it can borrow.
|
of seats it can borrow.
|
||||||
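To see why creating a new PriorityLevelConfiguration reduces the seats designated for existing levels, the sketch below divides a server's total concurrency among levels in proportion to their nominal concurrency shares. It assumes the proportional rule from the API reference, `NominalCL(i) = ceil(ServerCL * NCS(i) / sum_ncs)`, with `ServerCL` taken as the sum of the two inflight flags. The share values and the level name `my-new-level` are hypothetical.

```python
def nominal_concurrency_limits(server_cl, shares):
    """Split server_cl seats across levels in proportion to their shares.

    Assumes NominalCL(i) = ceil(server_cl * shares[i] / sum(shares)),
    computed with integer arithmetic to avoid rounding surprises.
    """
    total = sum(shares.values())
    return {
        name: (server_cl * share + total - 1) // total
        for name, share in shares.items()
    }


if __name__ == "__main__":
    # Assumed flag values: max-requests-inflight=400,
    # max-mutating-requests-inflight=200.
    server_cl = 400 + 200

    # Illustrative shares only, not the full default configuration.
    before = {"workload-low": 100, "global-default": 20}
    after = dict(before, **{"my-new-level": 30})  # hypothetical new level

    print(nominal_concurrency_limits(server_cl, before))
    print(nominal_concurrency_limits(server_cl, after))
```

With these assumed numbers, adding the new level with 30 shares lowers `workload-low` from 500 to 400 seats and `global-default` from 100 to 80, while the new level receives 120.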
|
|
||||||
#### Isolate non-essential requests from starving other flows
|
#### Isolate non-essential requests from starving other flows
|
||||||
|
|
||||||
|
@ -886,17 +888,16 @@ Example FlowSchema object to isolate list event requests:
|
||||||
{{% code file="priority-and-fairness/list-events-default-service-account.yaml" %}}
|
{{% code file="priority-and-fairness/list-events-default-service-account.yaml" %}}
|
||||||
|
|
||||||
- This FlowSchema captures all list event calls made by the default service
|
- This FlowSchema captures all list event calls made by the default service
|
||||||
account in the default namespace. The matching precedence 8000 is lower than the
|
account in the default namespace. The matching precedence 8000 is lower than the
|
||||||
value of 9000 used by the existing service-accounts FlowSchema so these list
|
value of 9000 used by the existing service-accounts FlowSchema so these list
|
||||||
event calls will match list-events-default-service-account rather than
|
event calls will match list-events-default-service-account rather than
|
||||||
service-accounts.
|
service-accounts.
|
||||||
- The catch-all PriorityLevelConfiguration is used to isolate these requests.
|
- The catch-all PriorityLevelConfiguration is used to isolate these requests.
|
||||||
The catch-all priority level has a very small concurrency share and does not
|
The catch-all priority level has a very small concurrency share and does not
|
||||||
queue requests.
|
queue requests.
|
||||||
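The precedence rule described above can be sketched as choosing, among the FlowSchemas whose rules match a request, the one with the numerically lowest matching precedence. The function name `pick_flow_schema` is hypothetical, rule matching itself is not modeled, and breaking ties by name is an assumption of this sketch.

```python
def pick_flow_schema(matching_schemas):
    """Return the name of the winning FlowSchema.

    matching_schemas is a list of (name, matching_precedence) tuples for
    schemas already known to match the request. The lowest precedence
    wins; ties fall back to the name, which is an assumption here.
    """
    name, _ = min(matching_schemas, key=lambda fs: (fs[1], fs[0]))
    return name


if __name__ == "__main__":
    candidates = [
        ("service-accounts", 9000),
        ("list-events-default-service-account", 8000),
    ]
    print(pick_flow_schema(candidates))
```

Because 8000 is lower than 9000, the list event calls are routed to `list-events-default-service-account` and never reach `service-accounts`.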
|
|
||||||
## {{% heading "whatsnext" %}}
|
## {{% heading "whatsnext" %}}
|
||||||
|
|
||||||
|
|
||||||
For background information on design details for API priority and fairness, see
|
For background information on design details for API priority and fairness, see
|
||||||
the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
|
the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
|
||||||
You can make suggestions and feature requests via [SIG API Machinery](https://github.com/kubernetes/community/tree/master/sig-api-machinery)
|
You can make suggestions and feature requests via [SIG API Machinery](https://github.com/kubernetes/community/tree/master/sig-api-machinery)
|
||||||
|
|