Introduction to CodeCourier

Learn what CodeCourier is, how it orchestrates AI coding agents in isolated cloud sandboxes, and how its architecture enables scalable automated development workflows.

CodeCourier is an AI workflow orchestration platform that lets development teams run AI coding agents - Claude Code, OpenCode, Codex, Pi, and others - inside isolated E2B cloud sandboxes. Instead of running AI assistants on your local machine, CodeCourier provisions full Linux environments on demand, executes multi-step development workflows, and captures the institutional knowledge your agents produce along the way.

Whether you need a single sandbox to prototype a feature, a multi-iteration design-and-review pipeline, a fully automated sprint that converts a project plan into pull requests, or a recurring task that runs on your team’s cadence, CodeCourier gives you a single control plane to manage it all.

Why CodeCourier

Running AI coding agents locally creates a set of practical problems: environments drift between machines, long-running tasks block your workstation, secrets leak into local shells, and there is no shared record of what the agent did or learned. CodeCourier addresses every one of these by moving agent execution to the cloud and wrapping it in a structured workflow layer.

  • Isolation - Each sandbox is a disposable Linux VM powered by E2B. Agents can install packages, modify files, and run arbitrary commands without touching your machine or other projects.
  • Reproducibility - Sandbox templates define the base image, installed tools, and pre-configured CLI clients. Every run starts from a known state, augmented by your project’s contexts and approved learnings.
  • Observability - Every message exchanged between the platform and the agent is stored in real time. You can review the full conversation, inspect step-by-step verdicts from checker agents, view quality scores for each run step, monitor CI check status, and trace cost and token usage down to the individual run step.
  • Knowledge capture - When an agent discovers a gotcha, a pattern, or a preference, CodeCourier extracts that into a structured learning record. Approved learnings are compiled into markdown and automatically injected into future sessions so your agents get smarter over time.
  • Team collaboration - Projects support owner, admin, and member roles. Team members share workflows, personas, contexts, assets, and learnings within a project while maintaining their own API key configurations.

Core features

Projects

A project is the top-level workspace in CodeCourier. Every other resource - sandboxes, workflows, personas, contexts, assets, plans, learnings, issues, and team members - is scoped to a single project. Projects are identified by a unique slug and can optionally link to a GitHub repository for branch management and pull request creation.

Sandboxes

Sandboxes are isolated Linux environments provisioned through E2B. Each sandbox has a configurable template (which determines the pre-installed CLI tool, such as Claude Code or OpenCode), memory allocation from 256 MB to 8 GB, CPU count from 1 to 8 cores, and a timeout from 1 minute to 4 hours. Sandboxes can be created manually for interactive exploration or spawned automatically as part of a workflow run.
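
The limits above can be expressed as a small validation helper. This is a minimal sketch, not CodeCourier's actual API: the SandboxConfig shape, its field names, and validateSandboxConfig are hypothetical, and 8 GB is assumed to mean 8192 MB. Only the numeric ranges come from the description above.

```typescript
// Hypothetical shape; field names are illustrative, not CodeCourier's schema.
interface SandboxConfig {
  template: string;       // illustrative identifier, e.g. "claude-code"
  memoryMb: number;       // 256 MB to 8 GB
  cpuCount: number;       // 1 to 8 cores
  timeoutSeconds: number; // 1 minute to 4 hours
}

// Documented ranges, converted to the units used above.
const SANDBOX_LIMITS = {
  memoryMb: { min: 256, max: 8 * 1024 },
  cpuCount: { min: 1, max: 8 },
  timeoutSeconds: { min: 60, max: 4 * 60 * 60 },
};

// Returns a list of human-readable problems; an empty list means the
// configuration is within the documented ranges.
function validateSandboxConfig(config: SandboxConfig): string[] {
  const errors: string[] = [];
  if (config.template.trim() === "") {
    errors.push("template must not be empty");
  }
  const numericFields = ["memoryMb", "cpuCount", "timeoutSeconds"] as const;
  for (const field of numericFields) {
    const value = config[field];
    const { min, max } = SANDBOX_LIMITS[field];
    if (!Number.isInteger(value) || value < min || value > max) {
      errors.push(`${field} must be an integer between ${min} and ${max}, got ${value}`);
    }
  }
  return errors;
}
```

Returning a list of problems rather than throwing on the first one lets a form or CLI report every out-of-range field at once.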

Workflows

Workflows define repeatable, multi-step AI processes. CodeCourier supports four workflow types:

  • Single Designer - A single agent executes the prompt in one pass.
  • Designer & Checker - A designer agent writes code, then a checker agent reviews it. If the checker rejects, the designer iterates. This loop continues up to a configurable maximum number of iterations.
  • Custom Pipeline - Define an arbitrary sequence of step types (designer, checker, optimizer, prompter, investigator, evaluator, judge) with optional loops and per-step model overrides.
  • Persona Pipeline - Chain together named personas, each with their own instructions, skill sets, and model configuration, into a sequential pipeline.
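
The Designer & Checker loop described above can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: CheckerVerdict, DesignerFn, CheckerFn, and runDesignerCheckerLoop are hypothetical names, and the callbacks are synchronous to keep the sketch short (real agent calls would be asynchronous).

```typescript
// Hypothetical verdict returned by a checker step.
interface CheckerVerdict {
  approved: boolean;
  feedback: string;
}

// Hypothetical callbacks standing in for the designer and checker agents.
type DesignerFn = (prompt: string, feedback: string | null) => string;
type CheckerFn = (output: string) => CheckerVerdict;

interface LoopResult {
  output: string;
  approved: boolean;
  iterations: number;
}

// Runs the designer, then the checker. On rejection, the checker's feedback
// is passed to the next designer pass. Stops on approval or after
// maxIterations designer passes, whichever comes first.
function runDesignerCheckerLoop(
  prompt: string,
  designer: DesignerFn,
  checker: CheckerFn,
  maxIterations: number,
): LoopResult {
  if (maxIterations < 1) {
    throw new Error("maxIterations must be at least 1");
  }
  let feedback: string | null = null;
  let output = "";
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    output = designer(prompt, feedback);
    const verdict = checker(output);
    if (verdict.approved) {
      return { output, approved: true, iterations: iteration };
    }
    feedback = verdict.feedback;
  }
  // The iteration cap was reached without approval; return the last attempt.
  return { output, approved: false, iterations: maxIterations };
}
```

The iteration cap is what keeps a disagreeing designer and checker from looping indefinitely; the result records whether the final output was actually approved.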

Personas

Personas are reusable AI agent configurations scoped to a project. Each persona has a type (designer, checker, optimizer, prompter, investigator, planner, deep-dive, reviewer, or custom), an optional model override, thinking effort level, custom instructions, and sets of activated skills, commands, and scripts. Personas let you standardize how agents behave across runs - a “security-reviewer” persona, for instance, might use high thinking effort with specific security-focused skills enabled and targeted commands for static analysis.

Contexts

Contexts are reusable, versioned system prompt and CLAUDE.md documents that can be bound to specific session types within a project. Rather than maintaining a single global system prompt, you can create distinct context documents for each session type - one for issue scanning sessions, another for learning sessions, another for merge operations - and update them independently as your project evolves. Each context document is versioned, so you can track how your agent instructions change over time and roll back if needed.
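
One way to picture versioning and rollback is an append-only history with an "active" pointer. The sketch below is a hypothetical in-memory model; ContextDocument, publishVersion, rollbackTo, and activeBody are illustrative names, not CodeCourier's data model.

```typescript
// Hypothetical in-memory model of a versioned context document.
interface ContextVersion {
  version: number;
  body: string;
}

interface ContextDocument {
  sessionType: string;        // e.g. "issues", "learning", "merging"
  versions: ContextVersion[]; // append-only history, oldest first
  activeVersion: number;
}

function createContext(sessionType: string, body: string): ContextDocument {
  return { sessionType, versions: [{ version: 1, body }], activeVersion: 1 };
}

// Publishing appends a new version and makes it active; earlier versions are kept.
function publishVersion(doc: ContextDocument, body: string): ContextDocument {
  const next = doc.versions.length + 1;
  return {
    ...doc,
    versions: [...doc.versions, { version: next, body }],
    activeVersion: next,
  };
}

// Rolling back only moves the active pointer; history is never rewritten.
function rollbackTo(doc: ContextDocument, version: number): ContextDocument {
  if (!doc.versions.some((v) => v.version === version)) {
    throw new Error(`unknown version ${version}`);
  }
  return { ...doc, activeVersion: version };
}

// Returns the text that would be injected into a session.
function activeBody(doc: ContextDocument): string {
  const active = doc.versions.find((v) => v.version === doc.activeVersion);
  if (active === undefined) {
    throw new Error("active version missing from history");
  }
  return active.body;
}
```

Because a rollback does not delete anything, the newer version remains available if the team later decides to restore it.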

Assets: Skills, Commands, and Scripts

Assets are independently versioned, publishable packages that extend agent behavior in the sandbox. There are three asset types:

  • Skills - Domain-specific knowledge packages composed of one or more files (e.g., a Convex patterns skill might include a reference file, code snippets, and architectural guidelines). Skills are written to the sandbox filesystem and referenced in the agent’s context.
  • Commands - Shell command aliases that agents can invoke inside the sandbox. Commands standardize common operations like running tests, linting, or invoking project-specific tooling.
  • Scripts - Executable scripts that can be run inside the sandbox at specific points in a workflow. Scripts are useful for pre-run setup, post-run teardown, or injecting dynamic context into agent sessions.

All asset types are selectable per persona and per session type, giving you fine-grained control over which capabilities each agent role has access to.

Issues

Issue sessions let you scan a codebase for bugs, tech debt, or improvement opportunities. CodeCourier analyzes the repository and generates structured issues with titles, descriptions, priorities, and suggested prompts. When an issue session produces questions or assumptions that require clarification, an Answering Session lets the AI agent resolve those questions before implementation begins. Issues can then be executed individually or grouped into work chains or sprint chains.

Sprint Chains

Sprint chains are batch orchestration pipelines that execute multiple workflow runs across branches in sequence. Unlike work chains (which process a list of issues against a single branch), sprint chains define a range of sprints, track a current sprint index, and maintain per-sprint pull request tracking. Sprint chains are ideal for executing a planned roadmap of features or fixes where each sprint produces its own PR.

Recurring Tasks

Recurring tasks let you run any workflow automatically on a schedule. You configure the frequency (daily, every other day, weekly, biweekly, or monthly), the timezone, and the hour and minute of execution. CodeCourier tracks the next scheduled run time and dispatches the run automatically. Recurring tasks are useful for nightly test runs, weekly dependency audits, or any repetitive workflow your team wants to automate.
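
As an illustration of how a next run time could be derived from a frequency, here is a simplified sketch. It is not CodeCourier's scheduler: TaskFrequency and nextRunAfter are hypothetical names, all arithmetic is done in UTC, and the configured timezone (including daylight saving shifts) is deliberately ignored. Clamping monthly runs to the last day of a shorter month is also an assumption.

```typescript
// Hypothetical identifiers for the five documented frequencies.
type TaskFrequency = "daily" | "every_other_day" | "weekly" | "biweekly" | "monthly";

// Fixed-length frequencies expressed in days.
const INTERVAL_DAYS: Record<Exclude<TaskFrequency, "monthly">, number> = {
  daily: 1,
  every_other_day: 2,
  weekly: 7,
  biweekly: 14,
};

// Given the previous scheduled run, returns the next run at the same
// hour and minute (UTC). The input date is not modified.
function nextRunAfter(previousRun: Date, frequency: TaskFrequency): Date {
  const next = new Date(previousRun.getTime());
  if (frequency === "monthly") {
    // Move to the first of next month, then restore the day of month,
    // clamped to that month's length (Jan 31 becomes Feb 28 or 29).
    const day = next.getUTCDate();
    next.setUTCDate(1);
    next.setUTCMonth(next.getUTCMonth() + 1);
    const lastDay = new Date(
      Date.UTC(next.getUTCFullYear(), next.getUTCMonth() + 1, 0),
    ).getUTCDate();
    next.setUTCDate(Math.min(day, lastDay));
  } else {
    next.setUTCDate(next.getUTCDate() + INTERVAL_DAYS[frequency]);
  }
  return next;
}
```

A production scheduler would compute the run time in the task's configured timezone, typically with a timezone-aware date library, so that "09:30 every Monday" stays at 09:30 local time across daylight saving changes.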

Learnings

Learnings capture institutional knowledge from agent sessions. Each learning record includes a description, trigger condition, correct behavior, severity (critical, important, or minor), and category (preference, pattern, gotcha, tool, or architecture). Learnings go through a review workflow - pending, approved, or rejected - and approved learnings are versioned and compiled into markdown that is automatically included in future sandbox prompts.
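
The record fields and review states above map naturally onto a typed structure. The sketch below is hypothetical: the LearningRecord field names, the compileLearnings helper, and the markdown layout it produces are assumptions made for illustration; only the enumerated values come from the description above.

```typescript
type LearningSeverity = "critical" | "important" | "minor";
type LearningCategory = "preference" | "pattern" | "gotcha" | "tool" | "architecture";
type LearningStatus = "pending" | "approved" | "rejected";

// Hypothetical field names for the documented record contents.
interface LearningRecord {
  description: string;
  trigger: string;
  correctBehavior: string;
  severity: LearningSeverity;
  category: LearningCategory;
  status: LearningStatus;
}

const SEVERITY_ORDER: Record<LearningSeverity, number> = {
  critical: 0,
  important: 1,
  minor: 2,
};

// Keeps only approved learnings, orders them most severe first, and renders
// each as a markdown bullet with its trigger and expected behavior.
function compileLearnings(records: LearningRecord[]): string {
  const approved = records
    .filter((r) => r.status === "approved")
    .sort((a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity]);
  const sections = approved.map((r) =>
    [
      `- [${r.severity}/${r.category}] ${r.description}`,
      `  - When: ${r.trigger}`,
      `  - Do: ${r.correctBehavior}`,
    ].join("\n"),
  );
  return ["# Learnings", ...sections].join("\n");
}
```

Filtering on status before rendering is the important part: pending and rejected learnings never reach an agent's prompt.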

Quality Scoring

Every run step includes a structured quality score that evaluates the agent’s output across five dimensions (correctness, type safety, code style, test coverage, and completeness) plus a composite score that aggregates them. Each run also tracks an overall quality score derived from its steps. Quality scores let teams monitor output quality over time and identify which workflow configurations produce the best results.
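
The aggregation method is not specified here, so the following sketch makes two loud assumptions: each dimension is scored from 0 to 100, and both the composite and the run-level score are unweighted means. QualityDimensions, compositeScore, and runQualityScore are hypothetical names.

```typescript
// Hypothetical per-step scores, each assumed to be in the range 0 to 100.
interface QualityDimensions {
  correctness: number;
  typeSafety: number;
  codeStyle: number;
  testCoverage: number;
  completeness: number;
}

// Assumption: the composite is the unweighted mean of the dimensions.
function compositeScore(d: QualityDimensions): number {
  const values = [d.correctness, d.typeSafety, d.codeStyle, d.testCoverage, d.completeness];
  const total = values.reduce((sum, v) => sum + v, 0);
  return total / values.length;
}

// Assumption: a run's overall score is the mean of its step composites.
// Returns null for a run with no scored steps.
function runQualityScore(steps: QualityDimensions[]): number | null {
  if (steps.length === 0) {
    return null;
  }
  const total = steps.reduce((sum, step) => sum + compositeScore(step), 0);
  return total / steps.length;
}
```

A weighted mean (for example, weighting correctness more heavily than code style) would be an equally plausible design; the structure of the calculation stays the same.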

CI Checks

Runs track CI check status through a ciChecks object that records overall status, the array of individual check results, and the time the checks were last polled. This gives you a live view of whether agent-generated code passes your CI pipeline without leaving the CodeCourier interface.
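
A possible shape for that object is sketched below. Only the name ciChecks and its three parts (overall status, individual results, last poll time) come from the description above; the field names, the status values, and the roll-up rule in summarizeChecks are assumptions.

```typescript
// Hypothetical status values for a single check and for the roll-up.
type CheckState = "pending" | "passed" | "failed";

interface CiCheckResult {
  name: string;
  state: CheckState;
}

// Hypothetical field names for the three documented parts of ciChecks.
interface CiChecks {
  status: CheckState;      // overall status
  checks: CiCheckResult[]; // individual check results
  lastPolledAt: number;    // epoch milliseconds of the last poll
}

// Assumed roll-up rule: any failure fails the whole set; otherwise any
// pending check (or no checks at all) leaves it pending; else it passed.
function summarizeChecks(checks: CiCheckResult[], polledAt: number): CiChecks {
  let status: CheckState = "passed";
  if (checks.length === 0 || checks.some((c) => c.state === "pending")) {
    status = "pending";
  }
  if (checks.some((c) => c.state === "failed")) {
    status = "failed";
  }
  return { status, checks, lastPolledAt: polledAt };
}
```

Letting a single failure override pending checks means a run can be flagged as failing without waiting for slower checks to finish.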

How it works

The typical flow through CodeCourier follows this path:

  1. Configure your project - Create a project, link your GitHub repository, add API keys for E2B, Anthropic (or OpenRouter / OpenAI), and GitHub, and invite your team.
  2. Set up contexts and assets - Define context documents for each session type (issues, learning, merging, answering, evaluating, judging) and create skill, command, and script assets that agents will use.
  3. Define personas and workflows - Set up the agent personalities you need (a fast designer, a thorough checker, a quality evaluator) and create workflow blueprints that chain them together.
  4. Run - Trigger a run from a workflow, an issue, or a recurring task schedule. CodeCourier dispatches a background job through Trigger.dev, which provisions an E2B sandbox, installs the configured CLI tool, and feeds it your prompt along with the active context document and any compiled learnings.
  5. Iterate - For multi-step workflows, the platform manages the designer/checker loop automatically, creating new sandbox sessions for each step, recording every message in real time, and scoring output quality at each step.
  6. Deliver - When the run completes, CodeCourier can automatically create a pull request on your linked GitHub repository, extract learnings from the session, and notify your team. CI check status is tracked post-merge.

Real-time by default

CodeCourier uses Convex as its database and backend runtime. All data - sandbox status, run progress, messages, quality scores, CI check status, learning reviews - updates in real time across every connected client. Clients subscribe to live queries instead of polling, so changes appear as soon as they are written.

Architecture overview

CodeCourier is built on four core services, each responsible for a distinct layer of the platform:

  • Next.js frontend - A Next.js 16 application with App Router, server components, and internationalization via next-intl. The UI is built with Tailwind CSS, Radix UI primitives, and shadcn/ui components, with Framer Motion for animations.
  • Convex backend - Convex provides the database, real-time subscriptions, server functions (queries, mutations, actions), and file storage. All business logic - authentication checks, authorization, data validation - runs in Convex functions. The schema defines tables for users, projects, sandboxes, workflows, runs, run steps, personas, contexts, assets, issues, issue sessions, learnings, recurring tasks, sprint chains, usage records, notifications, and more.
  • E2B sandboxes - E2B provides the isolated Linux virtual machines where AI agents execute. CodeCourier manages sandbox lifecycle (create, pause, resume, kill) and communicates with the agent CLI running inside via the E2B SDK.
  • Trigger.dev jobs - Trigger.dev handles background job orchestration. Long-running operations like sandbox provisioning, multi-step workflow execution, issue sessions, sprint chains, and recurring task dispatch are handled as Trigger.dev tasks with callback secrets for secure communication back to Convex.
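
The sandbox lifecycle mentioned above (create, pause, resume, kill) can be modeled as a small state machine. This sketch is illustrative only: the state names and the allowed transitions are assumptions, and it does not call the E2B SDK.

```typescript
// Assumed states: a created sandbox starts in "running".
type SandboxState = "running" | "paused" | "killed";
type LifecycleAction = "pause" | "resume" | "kill";

// Assumed transition table: which action is valid in which state.
const TRANSITIONS: Record<SandboxState, Partial<Record<LifecycleAction, SandboxState>>> = {
  running: { pause: "paused", kill: "killed" },
  paused: { resume: "running", kill: "killed" },
  killed: {}, // terminal: a killed sandbox cannot be revived
};

// Stands in for the "create" operation.
function createSandboxState(): SandboxState {
  return "running";
}

// Returns the next state, or throws if the action is not valid in this state.
function applyAction(state: SandboxState, action: LifecycleAction): SandboxState {
  const next = TRANSITIONS[state][action];
  if (next === undefined) {
    throw new Error(`cannot ${action} a ${state} sandbox`);
  }
  return next;
}
```

Keeping the valid transitions in one table makes it easy to reject out-of-order requests, such as resuming a sandbox that has already been killed, before any call reaches the sandbox provider.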

Authentication is handled by Clerk, which provides OAuth, email/password, and session management. Clerk JWTs are verified in Convex server functions to enforce per-user and per-project authorization.

Next steps

Ready to get started? The quickstart guide walks you through your first run in under ten minutes, or jump into core concepts for a deeper understanding of the platform’s building blocks.