Running Workflows
How to execute workflow runs in CodeCourier, monitor progress in real time, handle CI checks, quality scores, scheduled runs, and manage run lifecycle.
Running a workflow in CodeCourier creates an execution instance (a "run") that processes the pipeline steps sequentially. Each run has its own prompt, configuration overrides, and execution state. This guide covers how to start runs, what happens during execution, and how to monitor and manage them.
Starting a Run
Select a Workflow
From the Workflows page, select the workflow blueprint you want to execute. The workflow detail page shows the pipeline configuration and any previous runs.
Write the Prompt
Every run requires a prompt -- the task description that tells the AI agents what to do. The prompt is sent to the first step in the pipeline and serves as the primary input for the entire run.
Write clear, specific prompts for best results. Include details about what files to modify, what behavior to implement, and what the expected outcome should be.
Configure the Run (Optional)
You can override the workflow's default configuration for a specific run:
- GitHub repo URL -- Override the project's default repository.
- Branch name -- Specify the feature branch for this run.
- Checker instructions -- Custom instructions for checker steps in this run.
- Reference images -- Upload images that the agent can reference during implementation (e.g., design mockups).
- Sandbox config -- Override template, timeout, memory, CPU, and model settings.
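A per-run override can be thought of as a partial configuration layered over the workflow's defaults. The sketch below illustrates that idea only. The RunConfig and SandboxConfig shapes, their field names (other than githubRepoUrl and branchName, which appear on the run record), and the resolveRunConfig helper are assumptions made for illustration, not CodeCourier's actual schema or API.

```typescript
// Hypothetical shapes: field names are illustrative, not CodeCourier's schema.
interface SandboxConfig {
  template: string;
  timeoutMs: number;
  memoryMb: number;
  cpuCount: number;
  model: string;
}

interface RunConfig {
  githubRepoUrl: string;
  branchName: string;
  checkerInstructions?: string;
  referenceImages: string[];
  sandbox: SandboxConfig;
}

// Every field is optional on an override; sandbox settings can be overridden one at a time.
type RunOverrides = Partial<Omit<RunConfig, "sandbox">> & {
  sandbox?: Partial<SandboxConfig>;
};

// Layers the overrides on top of the defaults without mutating either input.
function resolveRunConfig(defaults: RunConfig, overrides: RunOverrides): RunConfig {
  const { sandbox, ...rest } = overrides;
  return {
    ...defaults,
    ...rest,
    sandbox: { ...defaults.sandbox, ...sandbox },
  };
}
```

Merging the sandbox settings separately means that overriding only the timeout leaves the default template, memory, CPU, and model untouched.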
Launch
Starting the run creates a record in the runs table with status pending and dispatches the execution to Trigger.dev. The run transitions to running when the orchestrator begins processing the first step.
Run Execution Flow
The workflow orchestrator (a Trigger.dev background task) manages the entire run lifecycle:
1. Run Initialization
The orchestrator reads the workflow blueprint, resolves persona references, and builds the execution plan. It parses the pipeline steps into execution blocks (single steps and loops) and prepares the sandbox configuration.
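Parsing a flat list of steps into execution blocks can be sketched as a single pass that groups consecutive steps sharing a loopId. This is a minimal illustration, not the orchestrator's code: the PipelineStep and ExecutionBlock types and the toExecutionBlocks function are hypothetical, and only loopId and loopMaxIterations are names taken from this guide.

```typescript
// Hypothetical step shape; only loopId and loopMaxIterations come from the guide.
interface PipelineStep {
  role: string;               // e.g. "designer", "checker"
  loopId?: string;            // set when the step belongs to an Iteration Block
  loopMaxIterations?: number;
}

type ExecutionBlock =
  | { kind: "single"; step: PipelineStep }
  | { kind: "loop"; loopId: string; steps: PipelineStep[] };

// Groups consecutive steps that share a loopId; every other step becomes its own block.
function toExecutionBlocks(steps: PipelineStep[]): ExecutionBlock[] {
  const blocks: ExecutionBlock[] = [];
  for (const step of steps) {
    const last = blocks[blocks.length - 1];
    if (step.loopId === undefined) {
      blocks.push({ kind: "single", step });
    } else if (last !== undefined && last.kind === "loop" && last.loopId === step.loopId) {
      last.steps.push(step);
    } else {
      blocks.push({ kind: "loop", loopId: step.loopId, steps: [step] });
    }
  }
  return blocks;
}
```

For example, a pipeline of planner, designer (loop A), checker (loop A), optimizer would yield three blocks: a single step, one loop of two steps, and another single step.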
2. Step Execution
For each step in the pipeline, the orchestrator:
- Creates a run step record in the runSteps table with the step's role, iteration number, persona reference, and status.
- Creates an E2B sandbox from the configured template. The sandbox is set up with the project's Git repo, environment variables, skills, and learnings.
- Dispatches the appropriate step task (designer-step, checker-step, optimizer-step, etc.) to Trigger.dev.
- The step task runs the AI agent inside the sandbox with the step's prompt and instructions. Output streams back to the sandbox messages table.
- Records token usage and cost data for the step.
- Updates the run step status to completed or failed.
3. Iteration Block Handling
When the orchestrator reaches an Iteration Block (a group of consecutive steps sharing a loopId), it executes the block's steps in sequence, then checks the verdict:
- If the checker step produces a pass verdict, the block exits and execution continues to the next block.
- If the checker step produces a fail verdict, the block re-runs from its first step. The checker's feedback is incorporated into the designer's prompt on the next iteration.
- If the block's loopMaxIterations is reached without a passing verdict, the block terminates and the run is marked failed.
Steps outside an Iteration Block execute exactly once. Checkers outside a block still emit a verdict but do not cause an automatic retry.
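The retry behavior of an Iteration Block can be sketched as a bounded loop that carries the checker's feedback into the next iteration. This is a simplified, synchronous illustration under stated assumptions: the Verdict shape, the StepRunner callback (a stand-in for dispatching a step task), and the runIterationBlock function are hypothetical. The real orchestrator runs steps as background tasks.

```typescript
type Verdict = { pass: boolean; feedback: string };

// Hypothetical stand-in for dispatching one step; returns a verdict only for checker steps.
type StepRunner = (role: string, iteration: number, feedback: string | null) => Verdict | null;

interface BlockResult {
  status: "passed" | "failed";
  iterations: number;
}

function runIterationBlock(
  roles: string[],
  loopMaxIterations: number,
  runStep: StepRunner,
): BlockResult {
  let feedback: string | null = null;
  for (let iteration = 1; iteration <= loopMaxIterations; iteration++) {
    let verdict: Verdict | null = null;
    // Run every step of the block in order; keep the latest checker verdict.
    for (const role of roles) {
      const result = runStep(role, iteration, feedback);
      if (result !== null) verdict = result;
    }
    if (verdict !== null && verdict.pass) {
      return { status: "passed", iterations: iteration };
    }
    // A fail verdict: feed the checker's feedback into the next iteration.
    if (verdict !== null) feedback = verdict.feedback;
  }
  // loopMaxIterations reached without a passing verdict.
  return { status: "failed", iterations: loopMaxIterations };
}
```

A "failed" result here corresponds to the case where the run is marked failed. A "passed" result lets execution continue with the next block.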
4. Run Completion
When all execution blocks have been processed:
- The run status is updated to completed.
- The completedAt timestamp is set.
- A pull request is created if the run produced Git changes.
- Learning extraction is dispatched for the run's sandboxes.
- Project counters are updated.
- Notifications are sent (if configured).
Background Execution
Run States
Every run is in one of seven states:
- scheduled -- The run has been queued by the recurring task scheduler and is waiting for its scheduled fire time. This is a pre-execution state that precedes pending. Scheduled runs carry a scheduledFor timestamp, a timezone, a recurrencePattern, and a recurringTaskId that links back to the recurring task that created them.
- pending -- The run has been created but the orchestrator has not started processing yet.
- running -- The orchestrator is actively executing steps. The currentIteration field tracks progress.
- paused -- The run has been temporarily suspended. This can happen when user intervention is needed.
- completed -- All steps finished successfully.
- failed -- A step encountered an unrecoverable error. The error field contains the failure message.
- cancelled -- The run was manually cancelled by the user.
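The seven states can be modeled as a union type with a small transition table. The table below is an assumption pieced together from the transitions this guide mentions (scheduled to pending, pending to running, and running to paused, completed, failed, or cancelled). It is not an authoritative state machine, and the guide does not say how a paused run leaves that state, so that entry is left empty.

```typescript
type RunStatus =
  | "scheduled" | "pending" | "running" | "paused"
  | "completed" | "failed" | "cancelled";

// Assumed transitions, inferred from the guide; the real system may allow more.
const TRANSITIONS: Record<RunStatus, RunStatus[]> = {
  scheduled: ["pending"],
  pending: ["running"],
  running: ["paused", "completed", "failed", "cancelled"],
  paused: [], // not documented in the guide
  completed: [],
  failed: [],
  cancelled: [],
};

// States after which no further execution happens.
const TERMINAL: RunStatus[] = ["completed", "failed", "cancelled"];

function canTransition(from: RunStatus, to: RunStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

function isTerminal(status: RunStatus): boolean {
  return TERMINAL.indexOf(status) !== -1;
}
```

Keeping the terminal set separate from the table avoids treating paused as finished just because its outgoing transitions are unknown.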
Scheduled vs. Pending
scheduled and pending are distinct states. A scheduled run has been created by the recurring task system and is waiting for a future time. A pending run has been dispatched to the Trigger.dev queue and is waiting for the orchestrator to pick it up. A scheduled run transitions to pending when the scheduler dispatches it, which happens at or just after the scheduledFor timestamp.
Monitoring a Run
Run Detail View
The run detail page provides a comprehensive view of the execution:
- Status and progress -- Current state, iteration count, and elapsed time.
- Step timeline -- A visual timeline showing each run step, its role, status, and duration. You can click a step to see its sandbox output.
- Sandbox messages -- The streaming terminal output from each sandbox, showing the AI agent's work in real time.
- Verdicts -- For checker steps, the pass/fail verdict and feedback text are displayed.
- PR status -- If a pull request was created, its URL and status are shown.
Run List View
The Runs page shows a paginated list of all runs for the project. Each row displays the run name, status, source (such as workflow, sprint, or sandbox), creation time, and prompt preview. You can filter and sort runs and use bulk actions to delete multiple runs at once.
Run Sources
Runs are created from several sources, tracked by the source field:
- workflow -- Triggered manually from a workflow blueprint on the Workflows page.
- issue -- Created by a work chain as part of an issue session execution.
- sandbox -- Created from a standalone sandbox launch.
- merge_agent -- Created by the merge agent for PR management.
- sprint -- Created by a sprint chain orchestrator as part of a batch sprint execution.
- scheduled -- Created by the recurring task scheduler at a configured cadence (daily, weekly, etc.).
Error Handling
Step Failures
When an individual step fails, the error is recorded on the run step record. Depending on the failure type:
- Sandbox creation failure -- E2B could not provision the VM. This usually means the E2B API key is invalid or the template does not exist.
- Agent crash -- The AI CLI process exited unexpectedly. The error output is captured in the sandbox messages.
- Timeout -- The sandbox exceeded its configured timeout. The work done before the timeout is preserved if committed.
- API error -- The AI provider returned an error (rate limit, invalid key, server error).
Run Recovery
CodeCourier does not currently support resuming a failed run from the point of failure. If a run fails, you can start a new run with the same prompt. The new run starts fresh, but if the previous run pushed commits to the branch, the new run picks up where the code left off.
Branch Reuse
Stopping After the Current Turn
In addition to immediate cancellation, you can set the stopAfterCurrentTurn flag on a running workflow. This is a graceful stop that allows the currently executing agent turn to complete before halting the run. It is useful when you want to review partial progress without losing the work that is already in flight.
When stopAfterCurrentTurn is set:
- The orchestrator completes the current agent turn (the AI finishes its current response, tool use, and any file commits).
- Rather than proceeding to the next iteration or step, the run transitions to paused.
- Code changes committed during the final turn are preserved on the branch.
You can set this flag from the run detail page using the "Stop after turn" button, which is available while the run is in the running state. This is preferable to hard cancellation when you want a clean stopping point rather than an abrupt halt.
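The difference from hard cancellation is that the flag is only consulted at a turn boundary. The sketch below illustrates that ordering. RunState, its turnsCompleted counter, and executeTurns are hypothetical names, and each "turn" is a plain callback standing in for an agent turn.

```typescript
interface RunState {
  status: "running" | "paused" | "completed";
  stopAfterCurrentTurn: boolean; // the graceful stop flag described above
  turnsCompleted: number;        // hypothetical counter, for illustration only
}

// Hypothetical loop: each callback stands in for one complete agent turn.
function executeTurns(run: RunState, turns: Array<(run: RunState) => void>): RunState {
  for (const turn of turns) {
    turn(run); // the in-flight turn always runs to completion
    run.turnsCompleted += 1;
    if (run.stopAfterCurrentTurn) {
      run.status = "paused"; // checked only after the turn has finished
      return run;
    }
  }
  run.status = "completed";
  return run;
}
```

If the flag is set while the second of three turns is executing, that turn still finishes, the third never starts, and the run ends up paused.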
CI Checks Tracking
After a run creates a pull request, CodeCourier tracks the CI check status for that PR. The ciChecks object on the run record reflects the latest status from GitHub's checks API:
ciChecks: {
  status: "passing" | "failing" | "pending", // Aggregate CI status
  checks: Array<{
    name: string, // Check name (e.g., "Build", "Tests", "Lint")
    status: string, // Individual check status
    url: string, // Link to the check run on GitHub
  }>,
  checkedAt: number, // Unix timestamp of the last status poll
}
When the CI checks on a run's PR are failing, the run's PR status transitions to blocked_on_ci. This status is distinct from the other PR states and indicates that the PR exists and is open, but cannot be merged until CI passes.
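The relationship between the aggregate CI status and the PR status can be sketched as a small pure function. The applyCiStatus helper is hypothetical. In particular, moving a PR back from blocked_on_ci to created once CI stops failing is an assumption, since the guide only describes the transition into blocked_on_ci.

```typescript
type CiStatus = "passing" | "failing" | "pending";

interface CiChecks {
  status: CiStatus;
  checks: Array<{ name: string; status: string; url: string }>;
  checkedAt: number;
}

type PrStatus = "creating" | "created" | "blocked_on_ci" | "merged" | "failed" | "skipped";

// Hypothetical helper: derives the PR status after a CI poll.
function applyCiStatus(prStatus: PrStatus, ci: CiChecks): PrStatus {
  // Only open PRs are affected; merged, failed, skipped, and creating stay as they are.
  if (prStatus !== "created" && prStatus !== "blocked_on_ci") return prStatus;
  // Assumption: the PR returns to "created" once CI is no longer failing.
  return ci.status === "failing" ? "blocked_on_ci" : "created";
}
```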
PR Status Values
The prStatus field on a run tracks the full lifecycle of the associated pull request:
- creating -- The PR creation request has been dispatched but has not completed yet.
- created -- The PR is open and waiting for review. CI checks may be running.
- blocked_on_ci -- The PR is open but the CI checks are failing. The ciChecks object contains details about which checks failed.
- merged -- The PR has been merged into the target branch.
- failed -- The PR creation attempt failed (e.g., GitHub API error, authentication failure). The prError field contains the error message.
- skipped -- No code changes were produced by the run, so no PR was created.
CI Tracking Cadence
The checkedAt timestamp on the ciChecks object shows when the most recent poll occurred. The run detail view displays CI check status with links to each individual check run on GitHub.
Quality Scores
If the workflow includes an Evaluator step, the run record tracks an overall qualityScore (the composite score from all evaluator steps). Individual run step records carry the full qualityScores breakdown:
// On the run record:
qualityScore: number, // Composite score (0-100) from all evaluator steps

// On individual runStep records (type: "evaluator"):
qualityScores: {
  correctness: number, // 0-100
  typeSafety: number, // 0-100
  codeStyle: number, // 0-100
  testCoverage: number, // 0-100
  completeness: number, // 0-100
  composite: number, // Weighted average of all five dimensions
  thresholdResult: boolean, // Whether composite meets the configured threshold
}
Quality scores are visible in the run detail view's step timeline and in the Monitoring dashboard. Runs with low quality scores or a failing thresholdResult are flagged visually so they stand out in the run list.
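A weighted average over the five dimensions, compared against a threshold, can be sketched as follows. The guide does not state the actual weights or the threshold value, so the equal weights, the inclusive ">=" comparison, and the scoreStep function are assumptions made purely for illustration.

```typescript
interface DimensionScores {
  correctness: number;
  typeSafety: number;
  codeStyle: number;
  testCoverage: number;
  completeness: number;
}

type Weights = Record<keyof DimensionScores, number>;

// Assumption: equal weights. The real weighting is not documented in this guide.
const EQUAL_WEIGHTS: Weights = {
  correctness: 1,
  typeSafety: 1,
  codeStyle: 1,
  testCoverage: 1,
  completeness: 1,
};

// Hypothetical helper: computes the composite and the threshold result for one evaluator step.
function scoreStep(scores: DimensionScores, threshold: number, weights: Weights = EQUAL_WEIGHTS) {
  const keys = Object.keys(weights) as Array<keyof DimensionScores>;
  const totalWeight = keys.reduce((sum, k) => sum + weights[k], 0);
  const weightedSum = keys.reduce((sum, k) => sum + scores[k] * weights[k], 0);
  const composite = weightedSum / totalWeight;
  // Assumption: "meets the threshold" means greater than or equal to it.
  return { ...scores, composite, thresholdResult: composite >= threshold };
}
```

With equal weights, scores of 90, 80, 70, 60, and 100 average to a composite of 80, which would meet a threshold of 80 but not one of 85.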
Cancelling a Run
You can cancel a running workflow from the run detail page. Cancellation:
- Sets the run status to cancelled.
- Attempts to kill any active sandboxes associated with the run.
- Stops the orchestrator from processing further steps.
Cancellation is best-effort: if a step is in the middle of execution, the sandbox may continue until the kill command takes effect. For a cleaner stop, use the stopAfterCurrentTurn flag instead (see above).
Run Metadata
Each run record contains metadata that is useful for tracking and analysis:
{
  status: "running", // Current state (includes "scheduled")
  prompt: "...", // The task description
  source: "workflow", // How the run was created (workflow|sprint|sandbox|scheduled|...)
  currentIteration: 2, // Current iteration inside an Iteration Block (or 1 when no block is active)
  config: { ... }, // Sandbox configuration
  githubRepoUrl: "...", // Git repository
  branchName: "feat/...", // Working branch
  prUrl: "...", // Pull request URL (after completion)
  prStatus: "created", // PR lifecycle state (includes "blocked_on_ci")
  startedAt: 1712345678, // Execution start timestamp
  completedAt: null, // Null until finished
  cliVersion: "1.2.3", // CLI tool version used
  qualityScore: 87, // Composite quality score from evaluator steps
  stopAfterCurrentTurn: false, // Graceful stop flag
  ciChecks: { // CI check status for the run's PR
    status: "passing",
    checks: [...],
    checkedAt: 1712345900,
  },
  // Scheduled run fields (only when source = "scheduled"):
  scheduledFor: 1712340000, // Intended fire time
  timezone: "America/New_York", // Timezone from recurring task
  recurrencePattern: "daily", // Frequency (daily|weekly|...)
  recurringTaskId: "...", // Reference to the recurring task
}