Workflow Steps

Explore all workflow step types in CodeCourier: designer, checker, optimizer, prompter, investigator, deep-dive, evaluator, judge, and answerer.

Workflow steps are the individual units of work within a CodeCourier pipeline. Each step represents a specific role that an AI agent plays during workflow execution. CodeCourier defines nine step types, each designed for a distinct purpose in the software development lifecycle. Steps are configured through personas and assembled into pipelines in the workflow builder.

Step Type Reference

The type definition below lists every available step type identifier with a one-line description of its role.

All step type identifiers

type StepRole =
  | "designer"      // Primary implementation agent
  | "checker"       // Review and verdict agent
  | "optimizer"     // Code polish agent
  | "prompter"      // Prompt refinement agent
  | "investigator"  // Codebase research agent
  | "deep-dive"     // Deep analysis agent
  | "evaluator"     // Quality scoring agent
  | "judge"         // Multi-branch comparison agent
  | "answerer";     // Question-answering agent
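
A TypeScript union exists only at compile time, so a runtime check can be useful when step types arrive as plain strings (for example, from stored configuration). The following is a minimal sketch of one way to do that. The names `STEP_ROLES` and `isStepRole` are illustrative assumptions, not part of CodeCourier's API.

```typescript
// Runtime list of the nine step type identifiers from the union above.
const STEP_ROLES = [
  "designer",
  "checker",
  "optimizer",
  "prompter",
  "investigator",
  "deep-dive",
  "evaluator",
  "judge",
  "answerer",
] as const;

// Derive the union from the array so the two cannot drift apart.
type StepRole = (typeof STEP_ROLES)[number];

// Hypothetical helper: narrows an arbitrary string to StepRole.
function isStepRole(value: string): value is StepRole {
  return (STEP_ROLES as readonly string[]).indexOf(value) !== -1;
}
```

Deriving the type from the array keeps a single source of truth: adding a new identifier to the array updates both the runtime check and the compile-time union.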

Step Types

Designer

The designer is the primary coding agent. It receives the task prompt (or refined prompt from a previous step) and implements the solution. Designer steps produce code changes, create files, install packages, and perform any development work needed to fulfill the requirements.

Default Tool: Claude Code
Default Model: claude-opus-4-6
Accepts Input: Yes -- receives the prompt and context from prior steps
Produces Output: Yes -- code changes and implementation results
Can Loop: Yes -- commonly paired with a checker in an iteration loop

The designer is the workhorse of most workflows. In a simple workflow, a single designer step may be sufficient. In more complex pipelines, the designer works within a loop where a checker reviews its output and sends it back for revision if needed.

Designer steps include built-in self-validation behavior. The system prompt instructs the designer to run TypeScript compilation (npx tsc --noEmit), linting (npx next lint), and Convex schema validation before committing. This catches common errors before the checker even reviews the work.

Checker

The checker is a review agent that evaluates the previous step's output. It produces a verdict -- a structured response with a pass boolean and a feedback string. If the checker passes, the pipeline continues to the next step. If it fails, the loop restarts with the checker's feedback incorporated into the next designer prompt.

Default Tool: Claude Code
Default Model: claude-sonnet-4-6
Accepts Input: Yes -- reviews the designer's output
Produces Output: Yes -- verdict (pass/fail) and feedback
Can Loop: Yes -- always paired with a designer in a loop

The checker's system prompt emphasizes end-to-end testing. By default, checkers are instructed to:

Detect and deploy any backend changes (Convex, database, etc.).
Install dependencies and start the dev server.
Run end-to-end tests using a headless browser.
Verify each requirement from the original prompt.
Produce a structured verdict based on real test results.

Verdict Format

The checker verdict is stored as a structured object with two fields: { pass: boolean, feedback: string }. The orchestrator reads the pass field to decide whether to continue or loop. The feedback field is prepended to the designer's prompt on the next iteration.
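
The feedback hand-off described above could look roughly like the sketch below. Only the `pass` and `feedback` fields come from the verdict format. The helper name `nextDesignerPrompt` and the blank-line separator between feedback and prompt are assumptions made for illustration.

```typescript
// Verdict shape as described in this section.
interface Verdict {
  pass: boolean;
  feedback: string;
}

// Hypothetical helper: returns the prompt for the next designer iteration,
// or null when the checker passed and no further iteration is needed.
function nextDesignerPrompt(
  originalPrompt: string,
  verdict: Verdict
): string | null {
  if (verdict.pass) {
    return null;
  }
  // Feedback is prepended, so the designer reads it before the task itself.
  // The "\n\n" separator is an assumption, not a documented format.
  return verdict.feedback + "\n\n" + originalPrompt;
}
```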

Optimizer

The optimizer runs after a designer-checker loop passes. Its job is to clean up and polish the code without changing functionality. Optimizer steps typically handle:

Removing dead code and unused imports.
Improving variable and function naming.
Adding or improving documentation and comments.
Refactoring for readability and maintainability.
Ensuring consistent code style.

Default Tool: Claude Code
Default Model: claude-sonnet-4-6
Accepts Input: Yes -- works on the approved code
Produces Output: Yes -- cleaned-up code changes
Can Loop: Typically no -- runs once after approval

Prompter

The prompter is an agent that refines or expands a vague task description into a detailed, actionable prompt. It analyzes the codebase, understands the project structure, and produces a thorough specification that subsequent designer steps can implement effectively.

Default Tool: Claude Code
Default Model: claude-opus-4-6
Accepts Input: Yes -- receives the original prompt
Produces Output: Yes -- refined/expanded prompt
Can Loop: Typically no -- runs once at the start

The prompter is especially useful when your task descriptions are high-level. Given only "add authentication", the prompter might produce a multi-paragraph specification covering which auth provider to use, which pages need protection, where to add login buttons, and how to handle session state.

Investigator

The investigator is a research agent that explores the codebase to understand a problem before other agents act on it. Investigator steps are useful for debugging workflows -- the investigator reads code, runs tests, and produces a report that subsequent steps use as context.

Default Tool: Claude Code
Default Model: claude-opus-4-6
Accepts Input: Yes -- receives the problem description
Produces Output: Yes -- investigation report and findings
Can Loop: Typically no -- runs once before designer steps

Deep-Dive

The deep-dive step is an intensive analysis agent that performs thorough research into complex issues. It is similar to the investigator but is designed for harder problems that require reading many files, tracing execution paths, and understanding system architecture in depth.

Default Tool: Claude Code
Default Model: claude-opus-4-6
Accepts Input: Yes -- receives the issue or question
Produces Output: Yes -- detailed analysis report
Can Loop: No -- runs once

Evaluator

The evaluator step scores the current pipeline output against multiple quality dimensions. It produces a qualityScores object that quantifies how well the implementation meets defined quality criteria. The evaluator is configured separately through the evaluator-setup page, where you define the context it uses, the skills it applies, and any setup commands or scripts it runs before assessment.

Step Type Identifier: evaluator
Default Tool: Claude Code
Default Model: claude-opus-4-6
Accepts Input: Yes -- evaluates the current state of the codebase
Produces Output: Yes -- qualityScores object
Can Loop: Typically no -- runs once after design is approved

The evaluator generates quality scores across five dimensions:

Evaluator quality score structure

// Shape of the qualityScores object produced by an evaluator step
interface QualityScores {
  correctness: number;       // 0-100: Does the implementation meet requirements?
  typeSafety: number;        // 0-100: Are TypeScript types correct and complete?
  codeStyle: number;         // 0-100: Does the code follow project conventions?
  testCoverage: number;      // 0-100: Are changes covered by tests?
  completeness: number;      // 0-100: Is the implementation fully finished?
  composite: number;         // 0-100: Weighted average of the five dimensions above
  thresholdResult: boolean;  // true if composite meets the configured threshold
}

The thresholdResult boolean indicates whether the composite score meets the quality threshold configured on the evaluator persona. This field can be used by downstream steps to decide whether to proceed or trigger additional refinement. The overall run record also carries a top-level qualityScore field (the composite from all evaluator steps) for quick filtering and analytics.
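
As a rough illustration of how composite and thresholdResult might relate to the five dimension scores, the sketch below computes a weighted average and compares it with a threshold. The weights, the threshold value, and the function name `scoreComposite` are assumptions. The weighting CodeCourier actually applies is not specified on this page.

```typescript
interface DimensionScores {
  correctness: number;
  typeSafety: number;
  codeStyle: number;
  testCoverage: number;
  completeness: number;
}

type Weights = Record<keyof DimensionScores, number>;

interface ScoredResult extends DimensionScores {
  composite: number;
  thresholdResult: boolean;
}

// Hypothetical helper: weighted average of the five dimensions,
// then a pass/fail comparison against the configured threshold.
function scoreComposite(
  scores: DimensionScores,
  weights: Weights,
  threshold: number
): ScoredResult {
  const keys: (keyof DimensionScores)[] = [
    "correctness",
    "typeSafety",
    "codeStyle",
    "testCoverage",
    "completeness",
  ];
  let weightedSum = 0;
  let totalWeight = 0;
  for (const key of keys) {
    weightedSum += scores[key] * weights[key];
    totalWeight += weights[key];
  }
  // Guard against an all-zero weight configuration.
  const composite = totalWeight > 0 ? weightedSum / totalWeight : 0;
  return { ...scores, composite, thresholdResult: composite >= threshold };
}
```

With equal weights this reduces to a plain mean. Raising the weight on one dimension (for example, correctness) makes that dimension count for more in the gate.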

Evaluator Configuration

Unlike other step types, the evaluator has its own setup surface (the evaluator-setup page) where you configure the context it receives, the skills it uses, and any shell commands or scripts that prepare the evaluation environment. This separation keeps evaluator configuration independent from the workflow blueprint itself, allowing you to tune evaluation criteria without editing the pipeline.

Judge

The judge step compares outputs from parallel branches and determines which is better. It is used in multi-branch evaluation scenarios where two or more implementations have been produced (for example, by running the same workflow with different models or instructions) and a final decision is needed about which one to keep.

Step Type Identifier: judge
Default Tool: Claude Code
Default Model: claude-opus-4-6
Accepts Input: Yes -- receives outputs from multiple branches for comparison
Produces Output: Yes -- a verdict identifying the winning branch and rationale
Can Loop: No -- runs once after all branches complete

The judge is an advanced step type for teams that want to run A/B experiments with their workflow configurations. Rather than manually reviewing two implementations, the judge agent evaluates them against the original requirements and produces a structured comparison.

Answerer

The answerer step is used in answering sessions to respond to questions and assumptions discovered during issue investigation sessions. When an issue session surfaces ambiguities or questions that require human or automated clarification, the answerer provides responses that allow the pipeline to continue without manual intervention.

Step Type Identifier: answerer
Default Tool: Claude Code
Default Model: claude-sonnet-4-6
Accepts Input: Yes -- receives questions and assumptions from the issue session
Produces Output: Yes -- structured answers that resolve ambiguities
Can Loop: No -- runs once per answering session

The answerer integrates with the issue session workflow to close the feedback loop between investigation and implementation. It is most commonly used in pipelines that include an investigator or deep-dive step, where the investigation phase may surface open questions that must be resolved before implementation begins.

Step Configuration

Each step in a pipeline is configured through its persona. The persona defines:

Step configuration via persona

{
  // Role determines the step type and execution behavior
  type: "designer" | "checker" | "optimizer" |
        "prompter" | "investigator" | "deep-dive" |
        "evaluator" | "judge" | "answerer",

  // CLI tool override (falls back to workflow default)
  cliId: "claude",     // or "opencode", "codex", "pi"

  // Model override (falls back to workflow default)
  model: "claude-opus-4-6",

  // Thinking effort for this step
  thinkingEffort: "high",

  // Custom instructions appended to the step's system prompt
  instructions: "Focus on TypeScript type safety...",

  // Skills enabled for this persona
  selectedSkills: ["vitest-testing", "seo-integration"],

  // Whether to inject compiled learnings
  learningsEnabled: true,
}
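
The "falls back to workflow default" behavior noted in the comments above can be sketched as follows. The interface names and the `resolveStepRuntime` helper are hypothetical. Only the `cliId` and `model` field names come from the persona configuration.

```typescript
// Workflow-level defaults (assumed shape for illustration).
interface WorkflowDefaults {
  cliId: string;
  model: string;
}

// Persona-level overrides: both fields are optional.
interface PersonaOverrides {
  cliId?: string;
  model?: string;
}

// Hypothetical helper: a persona value wins when present,
// otherwise the workflow default is used.
function resolveStepRuntime(
  persona: PersonaOverrides,
  defaults: WorkflowDefaults
): WorkflowDefaults {
  return {
    cliId: persona.cliId ?? defaults.cliId,
    model: persona.model ?? defaults.model,
  };
}
```

For example, a persona that sets only `model: "claude-sonnet-4-6"` would run with that model while still inheriting the workflow's CLI tool.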

Step Execution Details

Run Step Records

Every step execution creates a record in the runSteps table. This record tracks:

runId - Parent run reference.
sandboxId - The E2B sandbox used for this step.
personaId - The persona that defined this step.
cliId - The CLI tool used (e.g., "claude").
modelId - The AI model used (e.g., "claude-opus-4-6").
iteration - Which overall iteration this step belongs to.
loopId - The loop group ID (if in a loop).
loopIteration - The iteration within the loop.
role - The step type (designer, checker, evaluator, etc.).
status - pending, running, completed, or failed.
verdict - Checker verdict (pass/fail + feedback).
qualityScores - Evaluator quality scores (correctness, typeSafety, codeStyle, testCoverage, completeness, composite, thresholdResult). Only populated for evaluator steps.
startedAt / completedAt - Timing information.
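
For readers who consume these records in their own tooling, the fields above might be modeled along the following lines. The field names come from the list. The field types (string IDs, epoch-millisecond timestamps), which fields are optional, and the `stepDurationMs` helper are assumptions rather than the actual table schema.

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed";

// Assumed shape of a runSteps record; qualityScores is omitted for brevity.
interface RunStepRecord {
  runId: string;
  sandboxId: string;
  personaId: string;
  cliId: string;
  modelId: string;
  iteration: number;
  loopId?: string;        // present only when the step is inside a loop
  loopIteration?: number;
  role: string;
  status: StepStatus;
  verdict?: { pass: boolean; feedback: string }; // checker steps only
  startedAt?: number;     // assumed: epoch milliseconds
  completedAt?: number;   // assumed: epoch milliseconds
}

// Hypothetical helper: duration of a finished step, or null if it
// has not both started and completed.
function stepDurationMs(
  step: Pick<RunStepRecord, "startedAt" | "completedAt">
): number | null {
  if (step.startedAt === undefined || step.completedAt === undefined) {
    return null;
  }
  return step.completedAt - step.startedAt;
}
```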

Step Timeline

The run detail view displays a visual step timeline that shows each step as a node with its role, status, and duration. Clicking a step opens its sandbox output. The timeline makes it easy to trace the execution flow, see which iterations passed or failed, and identify bottlenecks.

Conditional Logic

CodeCourier's workflow system uses a verdict-based conditional model rather than explicit if/else branching. The checker's pass/fail verdict is the primary decision point:

Pass -- Continue to the next block in the pipeline.
Fail -- Loop back and retry (if within iteration limits).
Max iterations reached -- Force-continue regardless of verdict.

This model keeps workflow configuration simple while providing the feedback loop needed for iterative quality improvement.
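
The three outcomes above amount to a small decision function. The sketch below is one possible reading. It assumes `loopIteration` is 1-based and that `maxIterations` is the loop's configured limit, and both the function name and the `LoopDecision` labels are illustrative.

```typescript
type LoopDecision = "continue" | "retry" | "force-continue";

// Hypothetical helper mirroring the verdict-based conditional model.
function decideAfterChecker(
  pass: boolean,
  loopIteration: number, // assumed 1-based
  maxIterations: number
): LoopDecision {
  if (pass) {
    return "continue"; // move on to the next block in the pipeline
  }
  if (loopIteration < maxIterations) {
    return "retry"; // loop back to the designer with feedback
  }
  return "force-continue"; // limit reached: proceed despite the failed verdict
}
```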

Parallel Execution

Within a single workflow run, steps execute sequentially -- step 2 waits for step 1 to complete. However, CodeCourier supports parallel execution at the run level. You can start multiple runs of the same workflow simultaneously, each working on a different task on a different branch. This is how sprint chains work -- multiple sprints can execute in sequence, but each sprint's internal steps are sequential.

Choosing Step Types

For most workflows, a designer-checker loop with 3 iterations is the best starting point. Add a prompter at the beginning if your task descriptions are vague, an optimizer at the end for code polish, and an evaluator before the PR is opened to gate on quality scores.
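
The recommended layout can be written down as data to make the ordering explicit. The `Block` shape below is a hypothetical representation chosen for this sketch. It is not CodeCourier's actual blueprint format, which is defined in the workflow builder.

```typescript
type StepRole =
  | "designer" | "checker" | "optimizer" | "prompter" | "investigator"
  | "deep-dive" | "evaluator" | "judge" | "answerer";

// Hypothetical pipeline blocks: a single step, or a loop over several steps.
type Block =
  | { kind: "step"; role: StepRole }
  | { kind: "loop"; maxIterations: number; steps: StepRole[] };

// Prompter, then a designer-checker loop (3 iterations), then optimizer
// and evaluator, following the recommendation above.
const recommended: Block[] = [
  { kind: "step", role: "prompter" },
  { kind: "loop", maxIterations: 3, steps: ["designer", "checker"] },
  { kind: "step", role: "optimizer" },
  { kind: "step", role: "evaluator" },
];

// Lists the roles in execution order for a single pass through each block.
function flattenRoles(blocks: Block[]): StepRole[] {
  const roles: StepRole[] = [];
  for (const block of blocks) {
    if (block.kind === "step") {
      roles.push(block.role);
    } else {
      for (const role of block.steps) {
        roles.push(role);
      }
    }
  }
  return roles;
}
```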