fix: updates write responses, suggests exponential backoffs (#6574)

* fix: updates write responses, suggests exponential backoffs, closes influxdata/DAR#557

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix cURL example to use $max_delay variable instead of hardcoded value (#6575)

* Initial plan

* Fix: use $max_delay variable instead of hardcoded 30 in cURL example

Co-authored-by: jstirnaman <212227+jstirnaman@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jstirnaman <212227+jstirnaman@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jstirnaman <212227+jstirnaman@users.noreply.github.com>
Co-authored-by: Jason Stirnaman <jstirnaman@influxdata.com>
pull/6325/head
Scott Anderson 2025-12-02 12:40:56 -07:00 committed by GitHub
parent 5b37cb7eba
commit 6b1905c1d5
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
5 changed files with 144 additions and 128 deletions


@ -130,21 +130,9 @@ paths:
schema:
$ref: '#/components/schemas/LineProtocolLengthError'
'429':
description: Token is temporarily over quota. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
type: integer
format: int32
description: Token is temporarily over quota or ingesters are resource constrained.
'503':
description: Server is temporarily unavailable to accept writes. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
type: integer
format: int32
description: Server is temporarily unavailable to accept writes due to too many concurrent requests or insufficient healthy ingesters.
default:
description: Internal server error
content:
@ -293,13 +281,7 @@ paths:
type: string
format: binary
'429':
description: Token is temporarily over quota. The Retry-After header describes when to try the read again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
type: integer
format: int32
description: Token is temporarily over quota or the querier is resource constrained.
default:
description: Error processing query
content:
@ -479,13 +461,7 @@ paths:
type: string
format: binary
'429':
description: Token is temporarily over quota. The Retry-After header describes when to try the read again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
type: integer
format: int32
description: Token is temporarily over quota or queriers are resource constrained.
default:
description: Error processing query
content:


@ -423,15 +423,8 @@ paths:
description: |
Service unavailable.
- Returns this error if
the server is temporarily unavailable to accept writes.
- Returns a `Retry-After` header that describes when to try the write again.
headers:
Retry-After:
description: Non-negative decimal integer indicating seconds to wait before retrying the request.
schema:
format: int32
type: integer
- Returns this error if the server is temporarily unavailable to accept writes due to concurrent request limits or insufficient healthy ingesters.
default:
$ref: '#/components/responses/GeneralServerError'
summary: Write data
@ -562,18 +555,10 @@ paths:
type: string
'429':
description: |
#### InfluxDB Cloud:
- returns this error if a **read** or **write** request exceeds your
plan's [adjustable service quotas](/influxdb3/cloud-dedicated/account-management/limits/#adjustable-service-quotas)
or if a **delete** request exceeds the maximum
[global limit](/influxdb3/cloud-dedicated/account-management/limits/#global-limits)
- returns `Retry-After` header that describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
format: int32
type: integer
Too many requests.
- Returns this error if a **read** or **write** request exceeds rate
limits or if queriers or ingesters are resource constrained.
default:
content:
application/json:
@ -719,21 +704,9 @@ paths:
The response body contains details about the [rejected points](/influxdb3/cloud-dedicated/write-data/troubleshoot/#troubleshoot-rejected-points).
'429':
description: Token is temporarily over quota. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
format: int32
type: integer
description: Token is temporarily over quota or ingesters are resource constrained.
'503':
description: Server is temporarily unavailable to accept writes. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
format: int32
type: integer
description: Server is temporarily unavailable to accept writes due to too many concurrent requests or insufficient healthy ingesters.
default:
content:
application/json:


@ -130,21 +130,9 @@ paths:
schema:
$ref: '#/components/schemas/LineProtocolLengthError'
'429':
description: Token is temporarily over quota. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
type: integer
format: int32
description: Token is temporarily over quota or ingesters are resource constrained.
'503':
description: Server is temporarily unavailable to accept writes. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
type: integer
format: int32
description: Server is temporarily unavailable to accept writes due to too many concurrent requests or insufficient healthy ingesters.
default:
description: Internal server error
content:
@ -274,13 +262,7 @@ paths:
type: string
format: binary
'429':
description: Token is temporarily over quota. The Retry-After header describes when to try the read again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
type: integer
format: int32
description: Token is temporarily over quota or the querier is resource constrained.
default:
description: Error processing query
content:
@ -441,13 +423,7 @@ paths:
type: string
format: binary
'429':
description: Token is temporarily over quota. The Retry-After header describes when to try the read again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
type: integer
format: int32
description: Token is temporarily over quota or queriers are resource constrained.
default:
description: Error processing query
content:


@ -419,27 +419,15 @@ paths:
'429':
description: |
Too many requests.
headers:
Retry-After:
description: Non-negative decimal integer indicating seconds to wait before retrying the request.
schema:
format: int32
type: integer
- Returns this error if ingesters are resource constrained.
'500':
$ref: '#/components/responses/InternalServerError'
'503':
description: |
Service unavailable.
- Returns this error if
the server is temporarily unavailable to accept writes.
- Returns a `Retry-After` header that describes when to try the write again.
headers:
Retry-After:
description: Non-negative decimal integer indicating seconds to wait before retrying the request.
schema:
format: int32
type: integer
- Returns this error if the server is temporarily unavailable to accept writes due to concurrent request limits or insufficient healthy ingesters.
default:
$ref: '#/components/responses/GeneralServerError'
summary: Write data
@ -570,13 +558,9 @@ paths:
type: string
'429':
description: |
Token is temporarily over quota. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
format: int32
type: integer
Too many requests.
- Returns this error if queriers are resource constrained.
default:
content:
application/json:
@ -678,21 +662,9 @@ paths:
$ref: '#/components/schemas/LineProtocolLengthError'
description: Write has been rejected because the payload is too large. Error message returns max size supported. All data in body was rejected and not written.
'429':
description: Token is temporarily over quota. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
format: int32
type: integer
description: Too many requests. The service may be temporarily unavailable or ingesters are resource constrained.
'503':
description: Server is temporarily unavailable to accept writes. The Retry-After header describes when to try the write again.
headers:
Retry-After:
description: A non-negative decimal integer indicating the seconds to delay after the response is received.
schema:
format: int32
type: integer
description: Server is temporarily unavailable to accept writes due to too many concurrent requests or insufficient healthy ingesters.
default:
content:
application/json:


@ -5,6 +5,7 @@ Learn how to avoid unexpected results and recover from errors when writing to {{
- [Troubleshoot failures](#troubleshoot-failures)
- [Troubleshoot rejected points](#troubleshoot-rejected-points)
- [Report write issues](#report-write-issues)
{{% show-in "cloud-dedicated,clustered" %}}- [Implement an exponential backoff strategy](#implement-an-exponential-backoff-strategy){{% /show-in %}}
## Handle write responses
@ -39,7 +40,7 @@ The `message` property of the response body may contain additional details about
| `404 "Not found"` | A requested **resource type** (for example, "database"), and **resource name** | A requested resource wasn't found |
| `422 "Unprocessable Entity"` | `message` contains details about the error | The data isn't allowed (for example, falls outside of the database's retention period). |
| `500 "Internal server error"` | Empty | Default status for an error |
| `503 "Service unavailable"` | Empty | The server is temporarily unavailable to accept writes. The `Retry-After` header contains the number of seconds to wait before trying the write again. |
| `503 "Service unavailable"` | Empty | The server is temporarily unavailable or the requested service is resource constrained. [Implement an exponential backoff strategy](#implement-an-exponential-backoff-strategy). |
{{% /show-in %}}
{{% show-in "cloud-serverless" %}}
@ -346,3 +347,121 @@ Include the support package when contacting InfluxData support through your stan
- Business context if the issue affects production systems
This comprehensive information will help InfluxData engineers identify root causes and provide targeted solutions for your write issues.
{{% show-in "cloud-dedicated,clustered" %}}
## Implement an exponential backoff strategy
Use exponential backoff with jitter for retrying requests that return `429` or `503`.
This reduces load spikes and avoids thundering-herd problems.
**Recommended parameters**:
- Base delay: 1s
- Multiplier: 2 (double each retry)
- Max delay: 30s
- Max retries: 5 (increase only with care)
- Jitter: use "full jitter" (random between 0 and computed delay)
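As a rough illustration of how the recommended parameters interact, the following sketch computes the delay ceiling for each attempt and draws a jittered delay below it. The names `backoff_ceiling` and `full_jitter_delay` are hypothetical helpers for this illustration, not part of any InfluxDB client library.

```python
import random

BASE_DELAY = 1.0   # seconds
MULTIPLIER = 2     # double the ceiling on each retry
MAX_DELAY = 30.0   # seconds
MAX_RETRIES = 5


def backoff_ceiling(attempt):
    """Return the capped exponential delay for a zero-based attempt number."""
    return min(BASE_DELAY * (MULTIPLIER ** attempt), MAX_DELAY)


def full_jitter_delay(attempt):
    """Return a random delay between 0 and the ceiling for this attempt."""
    return random.uniform(0, backoff_ceiling(attempt))


# Ceilings for the initial attempt plus MAX_RETRIES retries.
ceilings = [backoff_ceiling(a) for a in range(MAX_RETRIES + 1)]
print(ceilings)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

With these values, the 30-second cap only takes effect on the last attempt, where the uncapped delay would be 32 seconds.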
### Exponential backoff examples
{{< code-tabs-wrapper >}}
{{% code-tabs %}}
[cURL](#)
[Python](#)
[JavaScript](#)
{{% /code-tabs %}}
{{% code-tab-content %}}
<!--------------------------------- BEGIN cURL -------------------------------->
<!--pytest.mark.skip-->
```sh
base=1
max_delay=30
max_retries=5

for attempt in $(seq 0 "$max_retries"); do
  resp_code=$(curl -s -o /dev/null -w "%{http_code}" --request POST "https://{{< influxdb/host >}}/write?db=DB" ...)
  if [ "$resp_code" -eq 204 ]; then
    echo "Write succeeded"
    break
  fi
  if [ "$resp_code" -ne 429 ] && [ "$resp_code" -ne 503 ]; then
    echo "Non-retryable response: $resp_code"
    break
  fi
  # compute exponential delay and apply full jitter
  delay=$(awk -v b="$base" -v a="$attempt" -v m="$max_delay" 'BEGIN{d=b*(2^a); if(d>m) d=m; print d}')
  sleep_seconds=$(awk -v d="$delay" 'BEGIN{srand(); printf "%.3f", rand()*d}')
  sleep "$sleep_seconds"
done
```
<!---------------------------------- END cURL --------------------------------->
{{% /code-tab-content %}}
{{% code-tab-content %}}
<!-------------------------------- BEGIN Python ------------------------------->
<!--pytest.mark.skip-->
```python
import random
import time
import requests

# url, headers, and body are placeholders for your write endpoint,
# authorization headers, and line protocol payload.
base = 1.0
max_delay = 30.0
max_retries = 5

for attempt in range(max_retries + 1):
    r = requests.post(url, headers=headers, data=body, timeout=10)
    if r.status_code == 204:
        break
    if r.status_code not in (429, 503):
        raise RuntimeError(f"Non-retryable: {r.status_code} {r.text}")
    # exponential backoff with full jitter
    retry_delay = min(base * (2 ** attempt), max_delay)
    sleep = random.random() * retry_delay  # full jitter
    time.sleep(sleep)
else:
    # the loop finished without a successful write
    raise RuntimeError("Max retries exceeded")
```
<!--------------------------------- END Python -------------------------------->
{{% /code-tab-content %}}
{{% code-tab-content %}}
<!------------------------------ BEGIN JavaScript ----------------------------->
<!--pytest.mark.skip-->
```js
const base = 1000;
const maxDelay = 30000;
const maxRetries = 5;

async function sleep(ms) { return new Promise(r => setTimeout(r, ms)); }

for (let attempt = 0; attempt <= maxRetries; attempt++) {
  const res = await fetch(url, { method: 'POST', body });
  if (res.status === 204) break;
  if (![429, 503].includes(res.status)) throw new Error(`Non-retryable ${res.status}`);
  let delay = base * 2 ** attempt;
  delay = Math.min(delay, maxDelay);
  const sleepMs = Math.random() * delay; // full jitter
  await sleep(sleepMs);
}
```
<!------------------------------- END JavaScript ------------------------------>
{{% /code-tab-content %}}
{{< /code-tabs-wrapper >}}
### Exponential backoff best practices
- Retry a request only if repeating it is safe (for example, the request is idempotent).
- Retry only for `429` (Too Many Requests) and `503` (Service Unavailable).
- Do not retry on client errors like `400`, `401`, `404`, `422`.
- Cap the delay with `max_delay` to avoid excessively long waits.
- Limit total retries to avoid infinite loops and provide meaningful errors.
- Log retry attempts and backoff delays for observability and debugging.
- Combine backoff with bounded concurrency to avoid overwhelming the server.
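
The practices above can be sketched together in one helper. This is a minimal illustration under stated assumptions, not a definitive implementation: `write_with_backoff`, `MAX_IN_FLIGHT`, and the `send` callable are hypothetical names, and `send` is assumed to perform one write request and return its HTTP status code.

```python
import logging
import random
import threading
import time

logger = logging.getLogger("write_retry")

# Hypothetical limit on simultaneous in-flight writes; tune for your workload.
MAX_IN_FLIGHT = 4
_in_flight = threading.BoundedSemaphore(MAX_IN_FLIGHT)

RETRYABLE = {429, 503}


def write_with_backoff(send, base=1.0, max_delay=30.0, max_retries=5,
                       sleep=time.sleep):
    """Call send() until it returns 204, retrying 429 and 503 with full jitter.

    send is a caller-supplied function (an assumption of this sketch) that
    performs one write request and returns the HTTP status code.
    """
    for attempt in range(max_retries + 1):
        # Hold a semaphore slot only while the request is in flight,
        # so sleeping callers do not block other writers.
        with _in_flight:
            status = send()
        if status == 204:
            return status
        if status not in RETRYABLE:
            raise RuntimeError(f"Non-retryable response: {status}")
        if attempt == max_retries:
            break  # no retries left; skip the final sleep
        delay = random.uniform(0, min(base * 2 ** attempt, max_delay))
        logger.warning("Write returned %s; retry %d of %d in %.2fs",
                       status, attempt + 1, max_retries, delay)
        sleep(delay)
    raise RuntimeError(f"Max retries ({max_retries}) exceeded")
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets tests substitute a recorder for `time.sleep` so the retry logic can be exercised without waiting.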
{{% /show-in %}}