Retry wrapper for transient failures

The CLI already retries HTTP 429, 502, 503, and 504 responses up to 3 times with exponential backoff — see Built-in Resilience. For nearly every workflow that’s enough.

This recipe is for the cases where it isn’t:

  • A long-running batch job that hits the rate-limit envelope and needs more than 3 retries.
  • Network flakes where DNS or TLS itself misbehaves and never reaches the CLI’s retry layer.
  • A dependent system (CI runner, proxy) that occasionally rejects requests.

The built-in retry is silent on success. When the final retry also fails, you’ll see an error like:

Error: API request failed: 429 Too Many Requests

If you also pass --json, the structured error shape on stderr looks like:

{
  "error": {
    "code": "API_ERROR",
    "message": "API request failed: 429 Too Many Requests"
  }
}

That JSON shape is stable — see the Error Codes reference. If you see this and the operation is idempotent, retrying is safe.
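Since the shape is stable, you can pull the code out of stderr mechanically. A minimal sketch using `sed` against the sample payload above (with `jq` installed, `jq -r '.error.code'` does the same thing more robustly):

```shell
#!/bin/sh
# The structured stderr from the example above, captured into a variable.
err_json='{
  "error": {
    "code": "API_ERROR",
    "message": "API request failed: 429 Too Many Requests"
  }
}'

# Extract the stable .error.code field without requiring jq.
code=$(printf '%s\n' "$err_json" | sed -n 's/.*"code": *"\([^"]*\)".*/\1/p')
echo "$code"
```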

#!/bin/bash
# bb-retry.sh - Run a bb command with exponential backoff.
#
# Usage:
#   ./bb-retry.sh pr list -w myworkspace -r myrepo --json
#
# Env vars:
#   MAX_ATTEMPTS   default 5
#   INITIAL_DELAY  default 2 (seconds)
#   MAX_DELAY      default 60 (seconds)
set -euo pipefail

MAX_ATTEMPTS="${MAX_ATTEMPTS:-5}"
INITIAL_DELAY="${INITIAL_DELAY:-2}"
MAX_DELAY="${MAX_DELAY:-60}"

attempt=1
delay="$INITIAL_DELAY"

while true; do
  if bb "$@"; then
    exit 0
  else
    # Capture the status in the else branch: after `fi`, $? would be
    # the if-statement's own status (0), not bb's exit code.
    exit_code=$?
  fi
  if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
    echo "bb $* failed after $attempt attempts (exit $exit_code)" >&2
    exit "$exit_code"
  fi
  # Add jitter: ±25% of the current delay
  jitter=$(( RANDOM % (delay / 2 + 1) - delay / 4 ))
  sleep_time=$(( delay + jitter ))
  [ "$sleep_time" -lt 1 ] && sleep_time=1
  echo "bb $* failed (exit $exit_code); retry $((attempt + 1))/$MAX_ATTEMPTS in ${sleep_time}s" >&2
  sleep "$sleep_time"
  attempt=$(( attempt + 1 ))
  delay=$(( delay * 2 ))
  [ "$delay" -gt "$MAX_DELAY" ] && delay="$MAX_DELAY"
done
chmod +x bb-retry.sh
# Read-only: safe to retry freely.
./bb-retry.sh pr list -w myworkspace -r myrepo --json > prs.json
# Tune the cap for big jobs.
MAX_ATTEMPTS=10 INITIAL_DELAY=5 ./bb-retry.sh pr view 42 --json
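With the defaults (INITIAL_DELAY=2, MAX_DELAY=60, MAX_ATTEMPTS=5), the base waits follow a simple doubling-and-cap rule. A quick sketch of the pre-jitter schedule:

```shell
#!/bin/sh
# Print the pre-jitter wait before each retry, using the same doubling
# rule as bb-retry.sh with its defaults (5 attempts -> 4 waits).
delay=2
max_delay=60
for attempt in 1 2 3 4; do
  echo "wait before attempt $((attempt + 1)): ${delay}s"
  delay=$(( delay * 2 ))
  if [ "$delay" -gt "$max_delay" ]; then
    delay="$max_delay"
  fi
done
```

With MAX_ATTEMPTS=10 the cap matters: the base delay doubles until it hits MAX_DELAY and stays there, so total worst-case wall time grows roughly linearly after the first few attempts.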

Recipe: retry only on a specific error code


If you only want to retry on rate-limit errors and not on, say, authentication failures, parse the structured error:

bb-retry-on-rate-limit.sh
#!/bin/bash
set -euo pipefail

MAX_ATTEMPTS=5
attempt=1
delay=2

while true; do
  err_file=$(mktemp)
  if bb "$@" 2> "$err_file"; then
    rm -f "$err_file"
    exit 0
  fi
  err_msg=$(cat "$err_file")
  rm -f "$err_file"
  # Retry only on rate-limit / transient gateway errors.
  if echo "$err_msg" | grep -qE '429|502|503|504'; then
    if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
      echo "$err_msg" >&2
      exit 1
    fi
    echo "Transient error, retrying in ${delay}s..." >&2
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
    continue
  fi
  # Non-transient: fail fast.
  echo "$err_msg" >&2
  exit 1
done

The wrappers above are the right answer when the built-in retry has been exhausted. They are not the right answer for:

  • Auth failures (Authentication required) — retrying won’t help; fix the credentials.
  • 404 / not-found errors — these don’t get more found over time.
  • Validation errors (requireOption() failures) — these are bugs in your script, not transient.

The CLI exits with code 1 for all of these, same as for transient errors. Use the JSON error shape on stderr (.error.code) to distinguish them rather than retrying everything.
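To branch on `.error.code` in a wrapper, a predicate function keeps the retry decision in one place. A minimal sketch: `API_ERROR` comes from the example earlier, but the non-retryable arm is a catch-all because the other code names are not listed here — consult the Error Codes reference for the real set:

```shell
#!/bin/sh
# Return 0 (retry) only for the transient code; everything else fails
# fast. "AUTH_REQUIRED" below is a hypothetical placeholder code.
should_retry() {
  case "$1" in
    API_ERROR) return 0 ;;
    *)         return 1 ;;
  esac
}

if should_retry "API_ERROR"; then echo "retry"; fi
if should_retry "AUTH_REQUIRED"; then echo "retry"; else echo "fail fast"; fi
```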