Update implemented proposal with info about environment synchronization.
parent a20257a888
commit efe77c32d2
@@ -1,4 +1,4 @@
-# Your inspiring proposal title
+# Public Flake Rate Charts
 
 * First proposed: 2021-05-17
 * Authors: Andriy Dzikh (@andriyDev)
@@ -14,7 +14,7 @@ Please review this proposal with the following priorities:
 
 ## Summary
 
-As of June 2021, public users have no way to view the flake rates of integration tests. This can make it tricky to determine whether an individual PR is causing a new error, or if the test failure is just a flake, or if the test is entirely broken. While each test failure should be investigated, sometimes an unrelated test fails, and knowing that the test has been "flake-y" can increase confidence in a particular PR.
+As of June 2021, public users have no way to view the flake rates of integration tests. This can make it tricky to determine whether an individual PR is causing a new error, or if the test failure is just a flake, or if the test is entirely broken. While each test failure should be investigated, sometimes an unrelated test fails, and knowing that the test has been flaky can increase confidence in a particular PR.
 
 This proposal is for a system to inform users, both public and internal, of the flake rates of various tests on the master branch.
 
@@ -28,11 +28,11 @@ This proposal is for a system to inform users, both public and internal, of the
 ### Overview
 
 The full overview of the system is as follows:
-* Jenkins integration test jobs running on master use gopogh summaries (already created) and process them into a series of test run entries, including whether the test passed and its duration
-* These test run entries are appended into a full dataset of all test runs at `gs://minikube-flake-rate/data.csv`
+* The `minikube` Jenkins job builds all binaries for integration tests. On completion, it triggers `minikube_set_pending.sh`, which updates the PR status of integration tests to pending. In addition, `minikube_set_pending.sh` will upload the list of environments to wait for to `gs://minikube-builds/logs/<MINIKUBE_LOCATION>/<COMMIT>/started_environments_<minikube_BUILD_NUMBER>.txt`
+* Jenkins integration test jobs running on master generate gopogh summaries. Each job then triggers `Flake Rate Upload`, which appends the completed environment to `gs://minikube-builds/logs/<MINIKUBE_LOCATION>/<COMMIT>/finished_environments_<minikube_BUILD_NUMBER>.txt`
+* Once all started environments are present in finished environments, if running on master, all gopogh reports are processed through `upload_tests.sh` and appended into the dataset of all test runs at `gs://minikube-flake-rate/data.csv`. If running on a PR, the gopogh reports are used with `report_flakes.sh` to write a comment on the PR about the flake rates of all failed tests.
 * A Jenkins job runs regularly to compute the flake rates of tests in `gs://minikube-flake-rate/data.csv` and outputs the results into `gs://minikube-flake-rate/flake_rates.csv`, including the environment (e.g. `Docker_Linux`), the test name, the flake rate as a percentage, and the average duration
 * An HTML+JS file, hosted on `gs://minikube-flake-rate/flake_chart.html`, will read the full test data (`gs://minikube-flake-rate/data.csv`) and parse it into a chart displaying the daily flake rates and average durations of the requested tests (specified by URL query arguments)
-* When the Jenkins integration test jobs run on a PR, they will instead write a comment on the PR including all failed tests and their flake rates. These comments will also include a link to the flake charts for each test/environment
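The synchronization step in the new bullets can be sketched as follows. This is a minimal illustration, not the actual Jenkins scripts; the environment names and list contents are invented stand-ins for the started/finished environment files on GCS:

```shell
# Sketch of the synchronization check: an environment's reports are only
# processed once every started environment has reported as finished.
# (Lists are hypothetical; the real ones come from started_environments_*.txt
# and finished_environments_*.txt under gs://minikube-builds/logs/.)
started="Docker_Linux Docker_Linux_containerd KVM_Linux"
finished="Docker_Linux KVM_Linux"

all_done=true
for env in $started; do
  case " $finished " in
    *" $env "*) ;;          # this environment has reported in
    *) all_done=false ;;    # still waiting on this environment
  esac
done

echo "all environments finished: $all_done"
```

Only once every started environment appears in the finished list would `upload_tests.sh` (on master) or `report_flakes.sh` (on a PR) proceed.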
 
 ### Test Data Collection
 
@@ -44,13 +44,11 @@ On a regular schedule (every 4 hours for example), a Jenkins job named `Flake Ra
 
 ### Charts
 
-To allow users to see the daily "flakiness" of a test/environment combination, we will have an HTML file at `gs://minikube-flake-rate/flake_chart.html` and a JS file at `gs://minikube-flake-rate/flake_chart.js`. These will fetch `gs://minikube-flake-rate/data.csv` and parse it into Google Charts allowing us to visualize the "flakiness" over time. This can help track down exactly when a test became "flake-y" by telling us the commits associated with each test date. The flake rate charts will use two query parameters (e.g. `google.com?these=are&three=query&param=eters`): test which will control which test to view (`TestFunctional/parallel/LogsCmd`), and env which will control the environment to view (e.g. `Docker_Linux`). By hosting this in a GCS bucket, we can avoid needing to create actual servers to manage this. Since these files are incredibly lightweight, there is little concern over the workload of hosting these files.
+To allow users to see the daily "flakiness" of a test/environment combination, we will have an HTML file at `gs://minikube-flake-rate/flake_chart.html` and a JS file at `gs://minikube-flake-rate/flake_chart.js`. These will fetch `gs://minikube-flake-rate/data.csv` and parse it into Google Charts, allowing us to visualize the "flakiness" over time. This can help track down exactly when a test became flaky by telling us the commits associated with each test date. The flake rate charts will use two query parameters (e.g. `https://storage.googleapis.com/minikube-flake-rate/flake_chart.html?test=TestFunctional/parallel/LogsCmd&env=Docker_Linux`): `test`, which will control which test to view (`TestFunctional/parallel/LogsCmd`), and `env`, which will control the environment to view (e.g. `Docker_Linux`). If `test` is omitted, a chart describing all tests for `env` will be displayed. By hosting this in a GCS bucket, we can avoid needing to create actual servers to manage this. Since these files are incredibly lightweight, there is little concern over the workload of hosting these files.
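As a concrete illustration of the two query parameters (using only the `test` and `env` names described above), a chart URL for a given test/environment pair is just the hosted page plus those two parameters:

```shell
# Assemble a flake chart URL from the two query parameters described above.
base="https://storage.googleapis.com/minikube-flake-rate/flake_chart.html"
test_name="TestFunctional/parallel/LogsCmd"  # which test to view
env="Docker_Linux"                           # which environment to view
url="${base}?test=${test_name}&env=${env}"
echo "$url"
```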
 
 ### PR Comments
 
-As PRs can have many failures, it is useful to be told the flake rates of some of these tests. Some of our tests could be more stable, and knowing that a failed test is known to be unreliable can be informative for both the PR creators and the PR reviewers. To that end, whenever an integration test running on a PR completes, it will call a script named `report_flakes.sh`. This script will use a provided gopogh summary (for the test run that should be reported about) and the public `gs://minikube-flake-rate/flake_rates.csv` to comment on the PR about all failed tests, their flake rates, and links to the flake charts for the test and the environment the failure occurred on.
-
-An important note, as of June 2021, there is no system for synchronizing after all integration tests run. To workaround this, each integration test (from a set of "important" environments) will trigger `report_flakes.sh` on its own. This means that each environment's test failures will be reported in a separate list. The set of "important" environments should be kept as small as possible in order to prevent spam and keep PRs readable. This is merely a temporary solution - a more permanent design will need to be considered in the future.
+As PRs can have many failures, it is useful to be told the flake rates of some of these tests. Some of our tests could be more stable, and knowing that a failed test is known to be unreliable can be informative for both the PR creators and the PR reviewers. To that end, once all integration tests have finished, a script named `report_flakes.sh` is called. This script will use the gopogh summaries of all environments (for the test run that should be reported about) and the public `gs://minikube-flake-rate/flake_rates.csv` to comment on the PR about all failed tests, their flake rates, and links to the flake charts for the test and the environment the failure occurred on.
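A minimal sketch of the lookup such a comment requires, assuming `flake_rates.csv` carries the columns listed in the Overview (environment, test name, flake rate percentage, average duration); the sample rows are invented for illustration:

```shell
# Sample flake_rates.csv rows (values invented; columns assumed from the
# Overview: environment, test name, flake rate %, average duration).
cat > flake_rates.csv <<'EOF'
Docker_Linux,TestFunctional/parallel/LogsCmd,12.5,3.2
Docker_Linux,TestPause,0.8,45.1
EOF

failed_test="TestFunctional/parallel/LogsCmd"
environment="Docker_Linux"

# Look up the flake rate of the failed test on this environment.
line=$(awk -F, -v e="$environment" -v t="$failed_test" \
  '$1 == e && $2 == t { print t " flake rate: " $3 "%" }' flake_rates.csv)
echo "$line"
```

The real script would repeat this lookup for every failed test in the gopogh summaries and append a chart link per test/environment pair.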
 
 ### Additional Information
 
@@ -73,4 +71,4 @@ This optimization will be done in `optimize_data.sh`.
 
 Another optimization technique that can be used on `gs://minikube-flake-rate/data.csv` is to use a string table. The string table would be stored at `gs://minikube-flake-rate/data_strings.txt` and would contain an ordered list of unique strings. The index of each string can then be used in place of the actual text in `gs://minikube-flake-rate/data.csv`. The index into the string table will very likely be shorter than the text it represents, saving space. For non-consecutive strings, this can be a very big saving. For example, test names are repeated very often in `gs://minikube-flake-rate/data.csv`, but almost never consecutively. With this technique, the dataset can be compressed even further.
 
-The trouble with this technique is complexity - any users of the dataset would need to also manage the string table. More importantly, if a new string needs to be added to the string table, the order is critical, meaning synchronization can be a problem (since our integration tests run in parallel). Due to these concerns, this option was rejected (although this may be a more feasible option in the future depending on how integration test synchronization is handled).
+The trouble with this technique is complexity - any users of the dataset would need to also manage the string table. More importantly, if a new string needs to be added to the string table, the order is critical, meaning synchronization can be a problem (since our integration tests run in parallel). Due to these concerns, this option was rejected.
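As a toy illustration of the rejected encoding (the sample strings are invented): each unique string is written once to the table, a data row stores only the short index, and readers decode the index back into the full text by position:

```shell
# Ordered string table, one unique string per line (as the proposed
# data_strings.txt would hold); a row in data.csv would then store a
# short index instead of the repeated 31-character test name.
printf '%s\n' "TestFunctional/parallel/LogsCmd" "TestPause" > data_strings.txt

row_field=1                                         # index stored in the row (1-based here)
decoded=$(sed -n "${row_field}p" data_strings.txt)  # reader maps index back to the name
echo "$decoded"
```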