diff --git a/content/zh-cn/docs/concepts/cluster-administration/flow-control.md b/content/zh-cn/docs/concepts/cluster-administration/flow-control.md
index a872913cd4..c6db70c5f3 100644
--- a/content/zh-cn/docs/concepts/cluster-administration/flow-control.md
+++ b/content/zh-cn/docs/concepts/cluster-administration/flow-control.md
@@ -36,7 +36,7 @@ The API Priority and Fairness feature (APF) is an alternative that improves upon
 aforementioned max-inflight limitations. APF classifies
 and isolates requests in a more fine-grained way. It also introduces
 a limited amount of queuing, so that no requests are rejected in cases
-of very brief bursts.  Requests are dispatched from queues using a
+of very brief bursts. Requests are dispatched from queues using a
 fair queuing technique so that, for example, a poorly-behaved
 {{< glossary_tooltip text="controller" term_id="controller" >}} need not
 starve others (even at the same priority level).
@@ -81,16 +81,17 @@ APF 适用于 **watch** 请求。当 APF 被禁用时，**watch** 请求不受 `
 
 <!--
 The API Priority and Fairness feature is controlled by a command-line flag
-and is enabled by default.  See 
+and is enabled by default. See
 [Options](/docs/reference/command-line-tools-reference/kube-apiserver/#options)
 for a general explanation of the available kube-apiserver command-line 
-options and how to enable and disable them.  The name of the 
-command-line option for APF is "--enable-priority-and-fairness".  This feature
+options and how to enable and disable them. The name of the
+command-line option for APF is "--enable-priority-and-fairness". This feature
 also involves an {{<glossary_tooltip term_id="api-group" text="API Group" >}} 
 with: (a) a stable `v1` version, introduced in 1.29, and 
 enabled by default (b) a `v1beta3` version, enabled by default, and
-deprecated in v1.29.  You can
+deprecated in v1.29. You can
 disable the API group beta version `v1beta3` by adding the
+following command-line flags to your `kube-apiserver` invocation:
 -->
 API 优先级与公平性（APF）特性由命令行标志控制，默认情况下启用。
 有关可用 kube-apiserver 命令行参数以及如何启用和禁用的说明，
@@ -101,6 +102,13 @@ APF 的命令行参数是 "--enable-priority-and-fairness"。
 (b) `v1beta3` 版本，默认被启用，在 1.29 中被弃用。
 你可以通过添加以下内容来禁用 Beta 版的 `v1beta3` API 组：
 
+<!--
+```shell
+kube-apiserver \
+--runtime-config=flowcontrol.apiserver.k8s.io/v1beta3=false \
+ # …and other flags as usual
+```
+-->
 ```shell
 kube-apiserver \
 --runtime-config=flowcontrol.apiserver.k8s.io/v1beta3=false \
@@ -164,7 +172,7 @@ from succeeding.
 <!--
 The concurrency limits of the priority levels are periodically
 adjusted, allowing under-utilized priority levels to temporarily lend
-concurrency to heavily-utilized levels.  These limits are based on
+concurrency to heavily-utilized levels. These limits are based on
 nominal limits and bounds on how much concurrency a priority level may
 lend and how much it may borrow, all derived from the configuration
 objects mentioned below.
@@ -184,10 +192,10 @@ word "seat" is used to mean one unit of concurrency, inspired by the
 way each passenger on a train or aircraft takes up one of the fixed
 supply of seats.
 
-But some requests take up more than one seat.  Some of these are **list**
+But some requests take up more than one seat. Some of these are **list**
 requests that the server estimates will return a large number of
-objects.  These have been found to put an exceptionally heavy burden
-on the server.  For this reason, the server estimates the number of objects
+objects. These have been found to put an exceptionally heavy burden
+on the server. For this reason, the server estimates the number of objects
 that will be returned and considers the request to take a number of seats
 that is proportional to that estimated number.
 -->
@@ -206,19 +214,19 @@ that is proportional to that estimated number.
 ### Execution time tweaks for watch requests
 
 API Priority and Fairness manages **watch** requests, but this involves a
-couple more excursions from the baseline behavior.  The first concerns
-how long a **watch**  request is considered to occupy its seat.  Depending
-on request parameters, the response to a **watch**  request may or may not
-begin with **create**  notifications for all the relevant pre-existing
-objects.  API Priority and Fairness considers a **watch**  request to be
+couple more excursions from the baseline behavior. The first concerns
+how long a **watch** request is considered to occupy its seat. Depending
+on request parameters, the response to a **watch** request may or may not
+begin with **create** notifications for all the relevant pre-existing
+objects. API Priority and Fairness considers a **watch** request to be
 done with its seat once that initial burst of notifications, if any,
 is over.
 
 The normal notifications are sent in a concurrent burst to all
-relevant **watch**  response streams whenever the server is notified of an
-object create/update/delete.  To account for this work, API Priority
+relevant **watch** response streams whenever the server is notified of an
+object create/update/delete. To account for this work, API Priority
 and Fairness considers every write request to spend some additional
-time occupying seats after the actual writing is done.  The server
+time occupying seats after the actual writing is done. The server
 estimates the number of notifications to be sent and adjusts the write
 request's number of seats and seat occupancy time to include this
 extra work.
@@ -266,7 +274,7 @@ many instances should authenticate with distinct usernames
 
 <!--
 After classifying a request into a flow, the API Priority and Fairness
-feature then may assign the request to a queue.  This assignment uses
+feature then may assign the request to a queue. This assignment uses
 a technique known as {{< glossary_tooltip term_id="shuffle-sharding"
 text="shuffle sharding" >}}, which makes relatively efficient use of
 queues to insulate low-intensity flows from high-intensity flows.
@@ -353,7 +361,7 @@ API 服务器的总并发量限制通过这些份额按例分配到现有 Priori
 <!--
 In the versions before `v1beta3` the relevant
 PriorityLevelConfiguration field is named "assured concurrency shares"
-rather than "nominal concurrency shares".  Also, in Kubernetes release
+rather than "nominal concurrency shares". Also, in Kubernetes release
 1.25 and earlier there were no periodic adjustments: the
 nominal/assured limits were always applied without adjustment.
 -->
@@ -365,12 +373,12 @@ nominal/assured limits were always applied without adjustment.
 <!--
 The bounds on how much concurrency a priority level may lend and how
 much it may borrow are expressed in the PriorityLevelConfiguration as
-percentages of the level's nominal limit.  These are resolved to
+percentages of the level's nominal limit. These are resolved to
 absolute numbers of seats by multiplying with the nominal limit /
-100.0 and rounding.  The dynamically adjusted concurrency limit of a
+100.0 and rounding. The dynamically adjusted concurrency limit of a
 priority level is constrained to lie between (a) a lower bound of its
 nominal limit minus its lendable seats and (b) an upper bound of its
-nominal limit plus the seats it may borrow.  At each adjustment the
+nominal limit plus the seats it may borrow. At each adjustment the
 dynamic limits are derived by each priority level reclaiming any lent
 seats for which demand recently appeared and then jointly fairly
 responding to the recent seat demand on the priority levels, within
@@ -473,7 +481,7 @@ https://play.golang.org/p/Gi0PLgVHiUg , which computes this table.
 
 {{< table caption = "混分切片配置示例" >}}
 <!-- HandSize | Queues | 1 elephant | 4 elephants | 16 elephants -->
-随机分片 | 队列数 | 1 个大象 | 4 个大象 | 16 个大象
+随机分片 | 队列数 | 1 头大象 | 4 头大象 | 16 头大象
 |----------|-----------|------------|----------------|--------------------|
 | 12 | 32 | 4.428838398950118e-09 | 0.11431348830099144 | 0.9935089607656024 |
 | 10 | 32 | 1.550093439632541e-08 | 0.0626479840223545 | 0.9753101519027554 |
@@ -512,7 +520,7 @@ with the highest `matchingPrecedence`. If multiple FlowSchemas with equal
 smaller `name` will win, but it's better not to rely on this, and instead to
 ensure that no two FlowSchemas have the same `matchingPrecedence`.
 -->
-对一个请求来说，只有首个匹配的 FlowSchema  才有意义。
+对一个请求来说，只有首个匹配的 FlowSchema 才有意义。
 如果一个入站请求与多个 FlowSchema 匹配，则将基于逻辑上最高优先级 `matchingPrecedence` 的请求进行筛选。
 如果一个请求匹配多个 FlowSchema 且 `matchingPrecedence` 的值相同，则按 `name` 的字典序选择最小，
 但是最好不要依赖它，而是确保不存在两个 FlowSchema 具有相同的 `matchingPrecedence` 值。
@@ -570,9 +578,9 @@ mandatory and suggested.
 ### Mandatory Configuration Objects
 
 The four mandatory configuration objects reflect fixed built-in
-guardrail behavior.  This is behavior that the servers have before
+guardrail behavior. This is behavior that the servers have before
 those objects exist, and when those objects exist their specs reflect
-this behavior.  The four mandatory objects are as follows.
+this behavior. The four mandatory objects are as follows.
 -->
 ### 强制的配置对象   {#mandatory-configuration-objects}
 
@@ -613,8 +621,8 @@ this behavior.  The four mandatory objects are as follows.
 ### Suggested Configuration Objects
 
 The suggested FlowSchemas and PriorityLevelConfigurations constitute a
-reasonable default configuration.  You can modify these and/or create
-additional configuration objects if you want.  If your cluster is
+reasonable default configuration. You can modify these and/or create
+additional configuration objects if you want. If your cluster is
 likely to experience heavy load then you should consider what
 configuration will work best.
 
@@ -660,9 +668,11 @@ The suggested configuration groups requests into six priority levels:
 <!--
 * The `workload-high` priority level is for other requests from built-in
   controllers.
+
 * The `workload-low` priority level is for requests from any other service
   account, which will typically include all requests from controllers running in
   Pods.
+
 * The `global-default` priority level handles all other traffic, e.g.
   interactive `kubectl` commands run by nonprivileged users.
 -->
@@ -712,10 +722,10 @@ inconsistent with the server's guardrail behavior.
 
 <!--
 Maintenance of suggested configuration objects is designed to allow
-their specs to be overridden.  Deletion, on the other hand, is not
-respected: maintenance will restore the object.  If you do not want a
+their specs to be overridden. Deletion, on the other hand, is not
+respected: maintenance will restore the object. If you do not want a
 suggested configuration object then you need to keep it around but set
-its spec to have minimal consequences.  Maintenance of suggested
+its spec to have minimal consequences. Maintenance of suggested
 objects is also designed to support automatic migration when a new
 version of the `kube-apiserver` is rolled out, albeit potentially with
 thrashing while there is a mixed population of servers.
@@ -729,9 +739,9 @@ thrashing while there is a mixed population of servers.
 <!--
 Maintenance of a suggested configuration object consists of creating
 it --- with the server's suggested spec --- if the object does not
-exist.  OTOH, if the object already exists, maintenance behavior
+exist. OTOH, if the object already exists, maintenance behavior
 depends on whether the `kube-apiservers` or the users control the
-object.  In the former case, the server ensures that the object's spec
+object. In the former case, the server ensures that the object's spec
 is what the server suggests; in the latter case, the spec is left
 alone.
 -->
@@ -743,16 +753,16 @@ alone.
 
 <!--
 The question of who controls the object is answered by first looking
-for an annotation with key `apf.kubernetes.io/autoupdate-spec`.  If
+for an annotation with key `apf.kubernetes.io/autoupdate-spec`. If
 there is such an annotation and its value is `true` then the
-kube-apiservers control the object.  If there is such an annotation
-and its value is `false` then the users control the object.  If
+kube-apiservers control the object. If there is such an annotation
+and its value is `false` then the users control the object. If
 neither of those conditions holds then the `metadata.generation` of the
-object is consulted.  If that is 1 then the kube-apiservers control
-the object.  Otherwise the users control the object.  These rules were
+object is consulted. If that is 1 then the kube-apiservers control
+the object. Otherwise the users control the object. These rules were
 introduced in release 1.22 and their consideration of
 `metadata.generation` is for the sake of migration from the simpler
-earlier behavior.  Users who wish to control a suggested configuration
+earlier behavior. Users who wish to control a suggested configuration
 object should set its `apf.kubernetes.io/autoupdate-spec` annotation
 to `false`.
 -->
@@ -786,7 +796,7 @@ nor suggested but are annotated
 
 The suggested configuration gives no special treatment to the health
 check requests on kube-apiservers from their local kubelets --- which
-tend to use the secured port but supply no credentials.  With the
+tend to use the secured port but supply no credentials. With the
 suggested config, these requests get assigned to the `global-default`
 FlowSchema and the corresponding `global-default` priority level,
 where other traffic can crowd them out.
@@ -808,7 +818,7 @@ requests from rate limiting.
 <!--
 Making this change also allows any hostile party to then send
 health-check requests that match this FlowSchema, at any volume they
-like.  If you have a web traffic filter or similar external security
+like. If you have a web traffic filter or similar external security
 mechanism to protect your cluster's API server from general internet
 traffic, you can configure rules to block any health check requests
 that originate from outside your cluster.
@@ -861,7 +871,7 @@ poorly-behaved workloads that may be harming system health.
   (cumulative since server start) of requests that were rejected,
   broken down by the labels `flow_schema` (indicating the one that
   matched the request), `priority_level` (indicating the one to which
-  the request was assigned), and `reason`.  The `reason` label will be
+  the request was assigned), and `reason`. The `reason` label will be
   one of the following values:
 -->
 * `apiserver_flowcontrol_rejected_requests_total` 是一个计数器向量，
@@ -939,6 +949,16 @@ poorly-behaved workloads that may be harming system health.
   因此你可以将一个优先级的所有 FlowSchema 的直方图相加，以得到分配给该优先级的请求的有效直方图。
   {{< /note >}}
 
+<!--
+* `apiserver_flowcontrol_nominal_limit_seats` is a gauge vector
+  holding each priority level's nominal concurrency limit, computed
+  from the API server's total concurrency limit and the priority
+  level's configured nominal concurrency shares.
+-->
+* `apiserver_flowcontrol_nominal_limit_seats` 是一个测量向量，
+  记录了每个优先级的额定并发限制。
+  此值是根据 API 服务器的总并发限制和优先级的配置额定并发份额计算得出的。
+
 <!--
 #### Maturity level ALPHA
 -->
@@ -949,7 +969,7 @@ poorly-behaved workloads that may be harming system health.
   high water marks of the number of queued requests, grouped by a
   label named `request_kind` whose value is `mutating` or `readOnly`.
   These high water marks describe the largest number seen in the one
-  second window most recently completed.  These complement the older
+  second window most recently completed. These complement the older
   `apiserver_current_inflight_requests` gauge vector that holds the
   last window's high water mark of number of requests actively being
   served.
@@ -976,7 +996,7 @@ poorly-behaved workloads that may be harming system health.
   nanosecond, of the number of requests broken down by the labels
   `phase` (which takes on the values `waiting` and `executing`) and
   `request_kind` (which takes on the values `mutating` and
-  `readOnly`).  Each observed value is a ratio, between 0 and 1, of
+  `readOnly`). Each observed value is a ratio, between 0 and 1, of
   the number of requests divided by the corresponding limit on the
   number of requests (queue volume limit for waiting and concurrency
   limit for executing).
@@ -1000,7 +1020,7 @@ poorly-behaved workloads that may be harming system health.
   histogram vector of observations, made at the end of each
   nanosecond, of the number of requests broken down by the labels
   `phase` (which takes on the values `waiting` and `executing`) and
-  `priority_level`.  Each observed value is a ratio, between 0 and 1,
+  `priority_level`. Each observed value is a ratio, between 0 and 1,
   of a number of requests divided by the corresponding limit on the
   number of requests (queue volume limit for waiting and concurrency
   limit for executing).
@@ -1015,13 +1035,13 @@ poorly-behaved workloads that may be harming system health.
 * `apiserver_flowcontrol_priority_level_seat_utilization` is a
   histogram vector of observations, made at the end of each
   nanosecond, of the utilization of a priority level's concurrency
-  limit, broken down by `priority_level`.  This utilization is the
-  fraction (number of seats occupied) / (concurrency limit).  This
+  limit, broken down by `priority_level`. This utilization is the
+  fraction (number of seats occupied) / (concurrency limit). This
   metric considers all stages of execution (both normal and the extra
   delay at the end of a write to cover for the corresponding
   notification work) of all requests except WATCHes; for those it
   considers only the initial stage that delivers notifications of
-  pre-existing objects.  Each histogram in the vector is also labeled
+  pre-existing objects. Each histogram in the vector is also labeled
   with `phase: executing` (there is no seat limit for the waiting
   phase).
 -->
@@ -1062,9 +1082,9 @@ poorly-behaved workloads that may be harming system health.
 
 <!--
 * `apiserver_flowcontrol_request_concurrency_limit` is the same as
-  `apiserver_flowcontrol_nominal_limit_seats`.  Before the
-  introduction of concurrency borrowing between priority levels, this
-  was always equal to `apiserver_flowcontrol_current_limit_seats`
+  `apiserver_flowcontrol_nominal_limit_seats`. Before the
+  introduction of concurrency borrowing between priority levels,
+  this was always equal to `apiserver_flowcontrol_current_limit_seats`
   (which did not exist as a distinct metric).
 -->
 * `apiserver_flowcontrol_request_concurrency_limit` 与
@@ -1087,8 +1107,8 @@ poorly-behaved workloads that may be harming system health.
 <!--
 * `apiserver_flowcontrol_demand_seats` is a histogram vector counting
   observations, at the end of every nanosecond, of each priority
-  level's ratio of (seat demand) / (nominal concurrency limit).  A
-  priority level's seat demand is the sum, over both queued requests
+  level's ratio of (seat demand) / (nominal concurrency limit).
+  A priority level's seat demand is the sum, over both queued requests
   and those in the initial phase of execution, of the maximum of the
   number of seats occupied in the request's initial and final
   execution phases.
@@ -1418,9 +1438,9 @@ FlowSchema 将这些列表调用与其他请求隔离开来。
 <!--
 - You can visit flow control [reference doc](/docs/reference/debug-cluster/flow-control/) to learn more about troubleshooting.
 - For background information on design details for API priority and fairness, see
-the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
+  the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
 - You can make suggestions and feature requests via [SIG API Machinery](https://github.com/kubernetes/community/tree/master/sig-api-machinery)
-or the feature's [slack channel](https://kubernetes.slack.com/messages/api-priority-and-fairness).
+  or the feature's [slack channel](https://kubernetes.slack.com/messages/api-priority-and-fairness).
 -->
 - 你可以查阅流控[参考文档](/zh-cn/docs/reference/debug-cluster/flow-control/)了解有关故障排查的更多信息。
 - 有关 API 优先级和公平性的设计细节的背景信息，