From 46d35e2baba34636fa8df502d978a9b02a19616e Mon Sep 17 00:00:00 2001
From: cici37
Date: Tue, 22 Mar 2022 13:06:07 -0700
Subject: [PATCH 1/3] Adding doc for transition rules, func library and resource constraints.

---
 .../custom-resource-definitions.md | 70 +++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
index df66a99281..af6886ec8e 100644
--- a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
+++ b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
@@ -716,6 +716,8 @@ CustomResourceDefinition schemas using the `x-kubernetes-validations` extension.
 The Rule is scoped to the location of the `x-kubernetes-validations` extension in the schema. And `self` variable in the CEL expression is bound to the scoped value.
 
+Note all the validation rules are scoped to the current object, no cross-object or stateful validation rules are supported.
+
 For example:
 
 ```yaml
@@ -994,7 +996,75 @@ Here is the declarations type mapping between OpenAPIv3 and CEL type:
 
 xref: [CEL types](https://github.com/google/cel-spec/blob/v0.6.0/doc/langdef.md#values), [OpenAPI types](https://swagger.io/specification/#data-types), [Kubernetes Structural Schemas](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#specifying-a-structural-schema).
 
+#### Function Library
+Functions available include:
+  - CEL standard functions, defined in the[list of standard definitions](https://github.com/google/cel-spec/blob/v0.7.0/doc/langdef.md#list-of-standard-definitions)
+  - CEL standard [macros](https://github.com/google/cel-spec/blob/v0.7.0/doc/langdef.md#macros)
+  - CEL [extended string function library](https://pkg.go.dev/github.com/google/cel-go@v0.11.2/ext#Strings)
+  - Kubernetes [CEL extension library](https://pkg.go.dev/k8s.io/apiextensions-apiserver@v0.24.0-alpha.4/pkg/apiserver/schema/cel/library#pkg-functions)
+
+#### Transition Rules
+
+A rule that contains an expression referencing the identifier `oldSelf` is implicitly considered a
+"transition rule". Transition rules allow schema authors to prevent certain transitions between two
+otherwise valid states. For example:
+
+```yaml
+type: string
+enum: ["low", "medium", "high"]
+x-kubernetes-validations:
+- rule: "!(self == 'high' && oldSelf == 'low') && !(self == 'low' && oldSelf == 'high')"
+  message: cannot transition directly between 'low' and 'high'
+```
+
+Unlike other rules, transition rules apply only to operations meeting the following criteria:
+
+- The operation updates an existing object. Transition rules never apply to create operations.
+
+- Both an old and a new value exist. It remains possible to check if a value has been added or
+  removed by placing a transition rule on the parent node. Transition rules are never applied to
+  custom resource creation. When placed on an optional field, a transition rule will not apply to
+  update operations that set or unset the field.
+
+- The path to the schema node being validated by a transition rule must resolve to a node that is
+  comparable between the old object and the new object. For example, list items and their
+  descendants (`spec.foo[10].bar`) can't necessarily be correlated between an existing object and a
+  later update to the same object.
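+
+As a minimal sketch of the parent-node approach mentioned in the list above, a transition rule placed
+on an object can check whether an optional child was removed. The property name `field` and the
+message text are hypothetical placeholders, and the parent is assumed to sit in a correlatable part of the schema:
+
+```yaml
+type: object
+properties:
+  field:
+    type: string
+x-kubernetes-validations:
+- rule: "!has(oldSelf.field) || has(self.field)"
+  message: field cannot be removed once it has been set
+```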
+
+Errors will be generated on CRD writes if a schema node contains a transition rule that can never be
+applied, e.g. "*path*: update rule *rule* cannot be set on schema because the schema or its parent
+schema is not mergeable".
+
+Transition rules are only allowed on "correlatable" portions of a schema.
+A portion of the schema is correlatable if all `array` parent schemas are of type `x-kubernetes-list-type=map`; any `set` or `atomic` array parent schemas make it impossible to unambiguously correlate a `self` with `oldSelf`.
+
+##### Use Cases
+
+| Use Case | Rule
+| -------- | --------
+| Immutability | `self.foo == oldSelf.foo`
+| Prevent modification/removal once assigned | `oldSelf != 'bar' \|\| self == 'bar'` or `!has(oldSelf.field) \|\| has(self.field)`
+| Append-only set | `self.all(element, element in oldSelf)`
+| If previous value was X, new value can only be A or B, not Y or Z | `oldSelf != 'X' \|\| self in ['A', 'B']`
+| Nondecreasing counters | `self >= oldSelf`
+
+#### Resource Constraints
+
+CEL expressions have the potential to consume unacceptable amounts of API server resources. We constrain the resource utilization in following ways:
+- Validation of CEL expression's "cost" when a CEL expression is written to a field in a CRD (at CRD creation/update time)
+- Runtime cost budget during CEL evaluation
+  - CEL validation might fail due to runtime cost budget exceed with error message `validation failed due to running out of cost budget, no further validation rules will be run`
+  - CEL validation might fail due to cost limit exceed per expression with message `operation cancelled: actual cost limit exceeded: no further validation rules will be run due to call cost exceeds limit for rule:{$rule}`
+- Go context cancellation to bound CEL expression evaluation to the request lifetime
+
+Guidelines for working with estimated limits:
+- Adding MaxItems, MaxProperties and MaxLength limits on all data accessed by CEL rules is the best practice.
+- O(n) - For simple rules, it is possible to iterate across a single map/list/string without exceeding the limit, but adding limits on all data accessed by CEL rules is the best practice
+- O(n^2)+ the product of the max lengths usually needs to be <1,000,000. E.g. 1000 for 2 levels of nesting, 100 for 3 levels of nesting
+- O(n^3) - should generally be avoided
+
+// TODO: edit info for cost estimation
 
 ### Defaulting
 
From e638ab5ee8618cec6796ebdbcb1b96fafc835531 Mon Sep 17 00:00:00 2001
From: Kermit Alexander
Date: Wed, 30 Mar 2022 01:40:17 +0000
Subject: [PATCH 2/3] Reword resource constraint section.

---
 .../custom-resource-definitions.md | 119 ++++++++++++++++--
 1 file changed, 107 insertions(+), 12 deletions(-)

diff --git a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
index af6886ec8e..0576a2c057 100644
--- a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
+++ b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
@@ -1051,20 +1051,115 @@ A portion of the schema is correlatable if all `array` parent schemas are of typ
 
 #### Resource Constraints
 
-CEL expressions have the potential to consume unacceptable amounts of API server resources.
We constrain the resource utilization in following ways:
-- Validation of CEL expression's "cost" when a CEL expression is written to a field in a CRD (at CRD creation/update time)
-- Runtime cost budget during CEL evaluation
-  - CEL validation might fail due to runtime cost budget exceed with error message `validation failed due to running out of cost budget, no further validation rules will be run`
-  - CEL validation might fail due to cost limit exceed per expression with message `operation cancelled: actual cost limit exceeded: no further validation rules will be run due to call cost exceeds limit for rule:{$rule}`
-- Go context cancellation to bound CEL expression evaluation to the request lifetime
+Resource consumption of validation rules is checked at CustomResourceDefinition creation and update time. If a rule is estimated to be prohibitively expensive to execute, it will result in a validation error. A similar
+system is used at runtime that observes the actions the interpreter takes. If the interpreter executes
+too many instructions, execution of the rule will be halted, and an error will result.
+Each CustomResourceDefinition is also allowed a certain amount of resources to finish executing all of
+its validation rules. If the sum total of its rules are estimated at creation time to go over that limit,
+then a validation error will also occur.
 
-Guidelines for working with estimated limits:
-- Adding MaxItems, MaxProperties and MaxLength limits on all data accessed by CEL rules is the best practice.
-- O(n) - For simple rules, it is possible to iterate across a single map/list/string without exceeding the limit, but adding limits on all data accessed by CEL rules is the best practice
-- O(n^2)+ the product of the max lengths usually needs to be <1,000,000. E.g.
1000 for 2 levels of nesting, 100 for 3 levels of nesting
-- O(n^3) - should generally be avoided
+In general, both systems will allow rules that do not need to iterate; these rules will
+always take the same amount of time regardless of how large their input is. `self.foo == 1` will be allowed.
+But if `foo` is a string and we instead have `self.foo.contains("someString")`, our rule will take
+longer to execute depending on how long `foo` is. Another example would be if `foo` was an array, and we
+had a rule `self.foo.all(x, x > 5)`. The cost system will always assume the worst-case scenario if
+a limit on the length of `foo` is not given, and this will happen for anything that can be iterated
+over (lists, maps, etc.).
 
-// TODO: edit info for cost estimation
+Because of this, it is considered best practice to put a limit via `maxItems`, `maxProperties`, and
+`maxLength` for anything that will be processed in a validation rule in order to prevent validation errors during cost estimation. For example, given this schema with one rule:
+
+```yaml
+openAPIV3Schema:
+  type: object
+  properties:
+    foo:
+      type: array
+      items:
+        type: string
+      x-kubernetes-validations:
+        - rule: "self.all(x, x.contains('a string'))"
+```
+
+The cost system will not allow this rule. Using `self.all` means calling `contains` on every string in `foo`,
+which in turn will check the given string to see if it contains `'a string'`. Without limits, this is a very
+expensive rule:
+
+```
+  spec.validation.openAPIV3Schema.properties[spec].properties[foo].x-kubernetes-validations[0].rule: Forbidden:
+  CEL rule exceeded budget by more than 100x (try simplifying the rule, or adding maxItems, maxProperties, and
+  maxLength where arrays, maps, and strings are used)
+```
+
+Without limits being set, the estimated cost of this rule will exceed the per-rule cost limit.
But if we
+add limits in the appropriate places, the rule will be allowed:
+
+```yaml
+openAPIV3Schema:
+  type: object
+  properties:
+    foo:
+      type: array
+      maxItems: 25
+      items:
+        type: string
+        maxLength: 10
+      x-kubernetes-validations:
+        - rule: "self.all(x, x.contains('a string'))"
+```
+
+The cost estimation system takes into account how many times the rule will be executed in addition to the
+estimated cost of the rule itself. For instance, the following rule will have the same estimated cost as the
+previous example (despite the rule now being defined on the individual array items):
+
+```yaml
+openAPIV3Schema:
+  type: object
+  properties:
+    foo:
+      type: array
+      maxItems: 25
+      items:
+        type: string
+        x-kubernetes-validations:
+          - rule: "self.contains('a string')"
+        maxLength: 10
+```
+
+If a list inside of a list has a validation rule that uses `self.all`, that is significantly more expensive
+than a non-nested list with the same rule. A rule that would have been allowed on a non-nested list might need lower limits set on both nested lists in order to be allowed. For example, even without having limits set,
+the following rule is allowed:
+
+```yaml
+openAPIV3Schema:
+  type: object
+  properties:
+    foo:
+      type: array
+      items:
+        type: integer
+      x-kubernetes-validations:
+        - rule: "self.all(x, x == 5)"
+```
+
+But the same rule on the following schema (with a nested array added) produces a validation error:
+
+```yaml
+openAPIV3Schema:
+  type: object
+  properties:
+    foo:
+      type: array
+      items:
+        type: array
+        items:
+          type: integer
+        x-kubernetes-validations:
+          - rule: "self.all(x, x == 5)"
+```
+
+This is because each item of `foo` is itself an array, and each subarray in turn calls `self.all`. Avoid nested
+lists and maps if possible where validation rules are used.
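+
+As a rough sketch of the advice above, limits can be set on both levels of the nested schema. The
+`maxItems` values here are illustrative assumptions rather than recommended settings, and whether a
+particular pair of limits fits within the budget is decided by the API server's cost estimation:
+
+```yaml
+openAPIV3Schema:
+  type: object
+  properties:
+    foo:
+      type: array
+      maxItems: 10
+      items:
+        type: array
+        maxItems: 10
+        items:
+          type: integer
+        x-kubernetes-validations:
+          - rule: "self.all(x, x == 5)"
+```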
 
 ### Defaulting
 
From 40157c8e05e735faae7d1bb0af07ba83ae6ba67f Mon Sep 17 00:00:00 2001
From: cici37
Date: Wed, 6 Apr 2022 16:08:12 -0700
Subject: [PATCH 3/3] Address comments

---
 .../custom-resource-definitions.md | 52 +++++++++++--------
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
index 0576a2c057..cfe5cb2ad4 100644
--- a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
+++ b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md
@@ -716,7 +716,7 @@ CustomResourceDefinition schemas using the `x-kubernetes-validations` extension.
 The Rule is scoped to the location of the `x-kubernetes-validations` extension in the schema. And `self` variable in the CEL expression is bound to the scoped value.
 
-Note all the validation rules are scoped to the current object, no cross-object or stateful validation rules are supported.
+All validation rules are scoped to the current object: no cross-object or stateful validation rules are supported.
 
 For example:
 
@@ -996,18 +996,18 @@ Here is the declarations type mapping between OpenAPIv3 and CEL type:
 
 xref: [CEL types](https://github.com/google/cel-spec/blob/v0.6.0/doc/langdef.md#values), [OpenAPI types](https://swagger.io/specification/#data-types), [Kubernetes Structural Schemas](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#specifying-a-structural-schema).
 
-#### Function Library
+#### Validation functions {#available-validation-functions}
 Functions available include:
-  - CEL standard functions, defined in the[list of standard definitions](https://github.com/google/cel-spec/blob/v0.7.0/doc/langdef.md#list-of-standard-definitions)
+  - CEL standard functions, defined in the [list of standard definitions](https://github.com/google/cel-spec/blob/v0.7.0/doc/langdef.md#list-of-standard-definitions)
   - CEL standard [macros](https://github.com/google/cel-spec/blob/v0.7.0/doc/langdef.md#macros)
   - CEL [extended string function library](https://pkg.go.dev/github.com/google/cel-go@v0.11.2/ext#Strings)
-  - Kubernetes [CEL extension library](https://pkg.go.dev/k8s.io/apiextensions-apiserver@v0.24.0-alpha.4/pkg/apiserver/schema/cel/library#pkg-functions)
+  - Kubernetes [CEL extension library](https://pkg.go.dev/k8s.io/apiextensions-apiserver@v0.24.0/pkg/apiserver/schema/cel/library#pkg-functions)
 
-#### Transition Rules
+#### Transition rules
 
 A rule that contains an expression referencing the identifier `oldSelf` is implicitly considered a
-"transition rule". Transition rules allow schema authors to prevent certain transitions between two
+_transition rule_. Transition rules allow schema authors to prevent certain transitions between two
 otherwise valid states. For example:
 
 ```yaml
@@ -1036,33 +1036,40 @@ Errors will be generated on CRD writes if a schema node contains a transition ru
 applied, e.g. "*path*: update rule *rule* cannot be set on schema because the schema or its parent
 schema is not mergeable".
 
-Transition rules are only allowed on "correlatable" portions of a schema.
+Transition rules are only allowed on _correlatable portions_ of a schema.
 A portion of the schema is correlatable if all `array` parent schemas are of type `x-kubernetes-list-type=map`; any `set` or `atomic` array parent schemas make it impossible to unambiguously correlate a `self` with `oldSelf`.
 
-##### Use Cases
+Here are some examples for transition rules:
 
+{{< table caption="Transition rules examples" >}}
 | Use Case | Rule
 | -------- | --------
 | Immutability | `self.foo == oldSelf.foo`
 | Prevent modification/removal once assigned | `oldSelf != 'bar' \|\| self == 'bar'` or `!has(oldSelf.field) \|\| has(self.field)`
 | Append-only set | `self.all(element, element in oldSelf)`
 | If previous value was X, new value can only be A or B, not Y or Z | `oldSelf != 'X' \|\| self in ['A', 'B']`
-| Nondecreasing counters | `self >= oldSelf`
+| Monotonic (non-decreasing) counters | `self >= oldSelf`
+{{< /table >}}
 
-#### Resource Constraints
+#### Resource use by validation functions
 
-Resource consumption of validation rules is checked at CustomResourceDefinition creation and update time. If a rule is estimated to be prohibitively expensive to execute, it will result in a validation error. A similar
-system is used at runtime that observes the actions the interpreter takes. If the interpreter executes
+When you create or update a CustomResourceDefinition that uses validation rules,
+the API server checks the likely impact of running those validation rules. If a rule is
+estimated to be prohibitively expensive to execute, the API server rejects the create
+or update operation, and returns an error message.
+A similar system is used at runtime that observes the actions the interpreter takes. If the interpreter executes
 too many instructions, execution of the rule will be halted, and an error will result.
 Each CustomResourceDefinition is also allowed a certain amount of resources to finish executing all of
 its validation rules. If the sum total of its rules are estimated at creation time to go over that limit,
 then a validation error will also occur.
 
-In general, both systems will allow rules that do not need to iterate; these rules will
-always take the same amount of time regardless of how large their input is. `self.foo == 1` will be allowed.
-But if `foo` is a string and we instead have `self.foo.contains("someString")`, our rule will take
-longer to execute depending on how long `foo` is. Another example would be if `foo` was an array, and we
-had a rule `self.foo.all(x, x > 5)`. The cost system will always assume the worst-case scenario if
+You are unlikely to encounter issues with the resource budget for validation if you only
+specify rules that always take the same amount of time regardless of how large their input is.
+For example, a rule that asserts that `self.foo == 1` does not by itself have any
+risk of rejection on validation resource budget grounds.
+But if `foo` is a string and you define a validation rule `self.foo.contains("someString")`, that rule takes
+longer to execute depending on how long `foo` is.
+Another example would be if `foo` were an array, and you specified a validation rule `self.foo.all(x, x > 5)`. The cost system always assumes the worst-case scenario if
 a limit on the length of `foo` is not given, and this will happen for anything that can be iterated
 over (lists, maps, etc.).
 
@@ -1081,17 +1088,18 @@ openAPIV3Schema:
         - rule: "self.all(x, x.contains('a string'))"
 ```
 
-The cost system will not allow this rule. Using `self.all` means calling `contains` on every string in `foo`,
-which in turn will check the given string to see if it contains `'a string'`. Without limits, this is a very
-expensive rule:
-
+then the API server rejects this rule on validation budget grounds with error:
 ```
   spec.validation.openAPIV3Schema.properties[spec].properties[foo].x-kubernetes-validations[0].rule: Forbidden:
   CEL rule exceeded budget by more than 100x (try simplifying the rule, or adding maxItems, maxProperties, and
   maxLength where arrays, maps, and strings are used)
 ```
 
-Without limits being set, the estimated cost of this rule will exceed the per-rule cost limit.
But if we
+The rejection happens because `self.all` implies calling `contains()` on every string in `foo`,
+which in turn will check the given string to see if it contains `'a string'`. Without limits, this is a very
+expensive rule.
+
+If you do not specify any validation limit, the estimated cost of this rule will exceed the per-rule cost limit. But if you
 add limits in the appropriate places, the rule will be allowed:
 
 ```yaml