Error Handling

duckflux provides fine-grained control over what happens when a step fails. Error handling is configured through the onError field and can be set at two levels: on the participant (default behavior) and in the flow (per-invocation override). The flow always wins.


The onError field accepts one of four values:

| Value | Behavior |
| --- | --- |
| fail | Stops the workflow immediately. This is the global default. |
| skip | Marks the step as skipped and continues the flow. |
| retry | Re-executes the step according to the retry configuration. |
| <participant> | Redirects execution to another participant as a fallback. |

Set onError on a participant to define its default error behavior wherever it is used in the flow:

participants:
  build:
    type: exec
    run: npm run build
    onError: retry
    retry:
      max: 3
      backoff: 2s
  tests:
    type: exec
    run: npm test
    onError: skip
  deploy:
    type: exec
    run: ./deploy.sh
    onError: fail

In this example:

  • build retries up to 3 times with a 2-second wait between attempts.
  • tests is marked as skipped on failure and the workflow continues.
  • deploy stops the whole workflow if it fails.

Any onError (or retry) defined on a participant can be overridden at the point of invocation in the flow. The flow-level value always takes precedence:

participants:
  coder:
    type: exec
    run: ./generate.sh
    onError: retry
    retry:
      max: 3
      backoff: 2s
flow:
  - coder:
      onError: skip # overrides the participant-level retry
  - reviewer

Here coder will skip on error in this invocation, ignoring the retry defined on the participant.
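The precedence rule can be modeled in a few lines. The sketch below is a hypothetical Python model, not duckflux's actual implementation: the function name `resolve_on_error` and the dictionary shapes are assumptions made purely for illustration.

```python
GLOBAL_DEFAULT = "fail"  # the text above states that fail is the global default


def resolve_on_error(participant, flow_override=None):
    """Return the effective onError value for one invocation.

    participant:   participant-level settings (hypothetical dict shape).
    flow_override: settings given at the flow step, or None if absent.
    """
    if flow_override and "onError" in flow_override:
        return flow_override["onError"]  # the flow-level value always wins
    return participant.get("onError", GLOBAL_DEFAULT)


coder = {"type": "exec", "run": "./generate.sh", "onError": "retry"}
print(resolve_on_error(coder, {"onError": "skip"}))  # skip
print(resolve_on_error(coder))                       # retry
print(resolve_on_error({"type": "exec"}))            # fail
```

The three calls cover the three cases: a flow override, a participant default, and neither being set.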


When onError is set to skip, a failed step is marked with status: skipped and the workflow moves on to the next step. Subsequent steps can read <step>.status to react accordingly:

participants:
  notify:
    type: http
    url: https://hooks.example.com/notify
    method: POST
    onError: skip
flow:
  - build
  - notify
  - deploy:
      when: notify.status == "success"

When onError is set to retry, the runner re-executes the failed step according to the retry configuration:

participants:
  fetchData:
    type: http
    url: https://api.example.com/data
    method: GET
    onError: retry
    retry:
      max: 3 # maximum number of attempts (required)
      backoff: 2s # wait between attempts (default: 0s)
      factor: 2 # backoff multiplier (default: 1)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max | integer | (none) | Maximum number of attempts. Required when onError: retry. |
| backoff | duration | 0s | Initial wait interval between attempts. |
| factor | number | 1 | Multiplier applied to backoff on each attempt. |

With backoff: 2s and factor: 2, the wait intervals grow exponentially:

| Attempt | Wait before retry |
| --- | --- |
| 1st retry | 2s |
| 2nd retry | 4s |
| 3rd retry | 8s |

If all attempts fail, the step is treated as a final failure and the workflow stops (unless overridden at flow level).
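The wait schedule in the table follows from multiplying backoff by factor once per retry. The following Python sketch reproduces that arithmetic; `retry_waits` is a hypothetical helper, and it assumes durations have already been converted to seconds and that max counts retries, as the table suggests.

```python
def retry_waits(max_retries, backoff_seconds=0, factor=1):
    """Wait (in seconds) before each retry.

    The wait before retry n (1-based) is backoff * factor ** (n - 1).
    Assumption: max counts retries, matching the table above.
    """
    return [backoff_seconds * factor ** n for n in range(max_retries)]


print(retry_waits(3, backoff_seconds=2, factor=2))  # [2, 4, 8]
print(retry_waits(3, backoff_seconds=2))            # [2, 2, 2]
```

With the default factor of 1 the wait stays constant, which is why a plain backoff: 2s yields a fixed 2-second pause between attempts.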


The onError field also accepts the name of another participant. When the step fails, execution is redirected to that participant instead of stopping or skipping:

participants:
  deploy:
    type: exec
    run: ./deploy.sh
    onError: notify_failure
  notify_failure:
    type: http
    url: https://hooks.example.com/failure
    method: POST
    onError: skip

When deploy fails, notify_failure is called. This allows building fallback chains and cleanup paths without branching the entire flow.

Fallback participants can themselves have an onError policy, forming a chain:

participants:
  primary:
    type: exec
    run: ./primary.sh
    onError: secondary
  secondary:
    type: exec
    run: ./secondary.sh
    onError: notify
  notify:
    type: http
    url: https://hooks.example.com/alert
    method: POST
    onError: skip

If primary fails → runs secondary. If secondary also fails → runs notify. If notify fails → skips (workflow continues).
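One way to reason about a chain is to follow the onError references until a built-in strategy is reached. The Python sketch below does that for the worst case where every participant in the chain fails. It is an illustrative model only: `fallback_chain` is a hypothetical name, and the cycle check is an added assumption, since the text does not say how duckflux treats circular fallbacks.

```python
BUILT_IN = {"fail", "skip", "retry"}


def fallback_chain(participants, start):
    """Follow onError references from `start` until a built-in strategy.

    Returns (participant names in the order they would run, final strategy).
    """
    chain, seen = [], set()
    name = start
    while True:
        if name in seen:
            # Assumption: a circular chain is treated as a configuration error.
            raise ValueError(f"fallback cycle at {name!r}")
        seen.add(name)
        chain.append(name)
        on_error = participants[name].get("onError", "fail")
        if on_error in BUILT_IN:
            return chain, on_error
        name = on_error  # any other value names the next fallback participant


participants = {
    "primary": {"onError": "secondary"},
    "secondary": {"onError": "notify"},
    "notify": {"onError": "skip"},
}
print(fallback_chain(participants, "primary"))
# (['primary', 'secondary', 'notify'], 'skip')
```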


When a step exceeds its configured timeout, it is treated as a failure and the onError strategy applies normally. This means you can use retry or fallback participants to handle timeouts the same way as other errors:

participants:
  slowApi:
    type: http
    url: https://slow.example.com/endpoint
    method: GET
    timeout: 10s
    onError: retry
    retry:
      max: 2
      backoff: 5s

See Timeout and Working Directory for the full timeout precedence rules.
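To illustrate why no special case is needed for timeouts, here is a minimal Python sketch of a retry loop that treats a timeout like any other failure. It is not duckflux's runner: `run_with_retry`, its parameters, and the use of Python's `TimeoutError` are all assumptions for illustration, and max is again taken to count retries.

```python
import time


def run_with_retry(step, max_retries, backoff=0, factor=1, sleep=time.sleep):
    """Call step(); on any exception, retry up to max_retries times.

    Returns ("success", result) or ("failure", last_error).
    """
    wait = backoff
    attempt = 0
    while True:
        try:
            return "success", step()
        except Exception as err:  # TimeoutError is handled like any other error
            if attempt == max_retries:
                return "failure", err
            sleep(wait)
            wait *= factor
            attempt += 1


calls = []

def slow_api():
    """Hypothetical step that times out twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("exceeded 10s")
    return "ok"


waits = []  # record the waits instead of actually sleeping
status, value = run_with_retry(slow_api, max_retries=2, backoff=5, sleep=waits.append)
print(status, value, waits)  # success ok [5, 5]
```

Passing `sleep` as a parameter keeps the example fast to run and makes the wait schedule easy to inspect.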


A deployment pipeline that combines retry, skip, fallback participants, and flow-level overrides:

id: deploy-pipeline
name: Deployment Pipeline
version: "1"
defaults:
  timeout: 5m
inputs:
  env:
    type: string
    default: staging
participants:
  build:
    type: exec
    run: npm run build
    timeout: 3m
    onError: retry
    retry:
      max: 2
      backoff: 10s
      factor: 2
  tests:
    type: exec
    run: npm test
    timeout: 2m
    onError: skip
  deploy:
    type: exec
    run: ./deploy.sh
    timeout: 10m
    onError: rollback
  rollback:
    type: exec
    run: ./rollback.sh
    timeout: 5m
    onError: notify_failure
  notify_success:
    type: http
    url: https://hooks.example.com/success
    method: POST
    onError: skip
  notify_failure:
    type: http
    url: https://hooks.example.com/failure
    method: POST
    onError: skip
flow:
  - build
  - tests:
      onError: fail # override: fail instead of skip in this pipeline
  - deploy
  - if:
      condition: deploy.status == "success"
      then:
        - notify_success
      else:
        - notify_failure
output:
  buildStatus: build.status
  deployStatus: deploy.status

In this workflow:

  • build retries twice with escalating backoff before failing.
  • tests is configured to skip by default but overridden to fail at flow level.
  • If deploy fails, rollback runs automatically; if rollback also fails, notify_failure is called.
  • The final if block sends the success or failure notification depending on deploy.status.