Architecture Overview

cc-bridge

A self-hosted control plane for AI coding agents — drive long-running agents from a browser or Telegram, over a clean adapter layer that's agnostic to which agent runtime you use.

What it is

AI coding agents (Claude Code, DeepSeek/Pi, …) run as long-lived processes on a server. cc-bridge gives them a real-time interface: a web app and a Telegram bot that attach to the same running sessions, stream what the agent is doing, and let you steer it — from any device, without sitting at the terminal. The agent runtime is hidden behind an adapter, so the same UI and protocol work across different agents.

Python · FastAPI | WebSockets | Next.js (static) | SQLite · WAL | tmux control-mode | JSONL tail

System architecture

Four layers. Commands flow down (you → agent); events stream up (agent → you) in real time. The backend never embeds an agent — it talks to whatever the active adapter exposes.

CLIENTS: cc-bridge web app (browser · capability-driven UI · same-origin API+WS); ccbot (Telegram bridge · parallel access path); any device: phone · laptop · tablet
(REST + WebSocket)
cc-bridge BACKEND · FastAPI: REST API; events WebSocket (streaming + workspace); permission policy (auto-approve governance); voice / STT (transcribe → text)
(adapter interface: list / drive / tail / capabilities)
ADAPTERS · one per agent runtime: Claude adapter (tmux send-keys + capture-pane); Pi / DeepSeek adapter (control-mode over TCP, mirror+drive); your adapter (bring your own agent)
AGENT SUBSTRATE: tmux sessions running the agents (Claude Code / Pi cells) plus their JSONL transcripts; SQLite (WAL) for cursors · acks · policy · voice
Clients ↔ FastAPI backend ↔ adapter ↔ live agent sessions. State lives in SQLite (WAL); transcripts are tailed from JSONL.

Request lifecycle

A message round-trips through the agent's own transcript, so what the UI settles on is always ground truth, never just an optimistic guess.

You type (in the web app) → POST /…/text (backend → adapter) → adapter drives (send-keys / control-mode) → Agent runs (thinks + acts, writes transcript) → JSONL transcript (append-only log) → backend tails it (typed events) → WebSocket push (/…/events) → UI renders (streaming)

Event types: user_message · assistant_thinking · assistant_text · tool_use · tool_result. These shapes are the same across every adapter. Optimistic echo shows your bubble instantly; it reconciles to the real JSONL line (deduped by id) when it lands.
Drive down through the adapter; stream up from the transcript. One typed event model, regardless of agent.
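
The reconciliation step can be sketched as a timeline keyed by message id. This is a minimal illustration, not the project's actual client code: the Timeline class, its method names, the "confirmed" flag, and the assumption that the echo and the transcript line share an id chosen up front are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Timeline:
    """Ordered chat events keyed by id (hypothetical client-side model)."""

    events: dict[str, dict] = field(default_factory=dict)

    def echo(self, msg_id: str, text: str) -> None:
        # Optimistic bubble, shown before the transcript confirms the message.
        self.events.setdefault(
            msg_id,
            {"id": msg_id, "type": "user_message", "text": text, "confirmed": False},
        )

    def apply(self, event: dict) -> None:
        # A real transcript event overwrites any entry with the same id.
        # Dicts keep insertion order, so the bubble stays where it was, and
        # replaying the same event after a reload cannot add a duplicate.
        self.events[event["id"]] = {**event, "confirmed": True}

    def render(self) -> list[dict]:
        return list(self.events.values())
```

Calling echo and then apply with the same id leaves exactly one entry, now marked confirmed; applying the same transcript event twice is a no-op.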

The adapter model — bring your own agent

The core knows nothing about Claude or DeepSeek. Each adapter implements a small contract — enumerate sessions, drive input, tail output, declare capabilities — and the frontend renders itself from the capabilities the backend advertises (multi-session, lifecycle, MCP, live pane, …). Add an agent by adding an adapter; the UI follows automatically.

cc-bridge core (protocol · capabilities · UI) connects to: Claude adapter (Claude Code in tmux); Pi / DeepSeek adapter (sealed cell, control-mode); OpenAI / local / … (future adapter); your runtime (~one small module). Legend: solid = implemented & proven · dashed = the extension point.
Two adapters ship today (Claude Code, Pi/DeepSeek). The interface is the product — a new agent is a new adapter, not a fork.
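
To make the four-part contract concrete, here is a rough sketch of what such an interface could look like. The names AgentAdapter, Capabilities, list_sessions, drive, tail, and the toy EchoAdapter are illustrative assumptions based on the description above, not the project's real signatures.

```python
from dataclasses import dataclass
from typing import Iterator, Protocol


@dataclass(frozen=True)
class Capabilities:
    # Flag names mirror the features mentioned above; the real set may differ.
    multi_session: bool = False
    lifecycle: bool = False
    mcp: bool = False
    live_pane: bool = False


class AgentAdapter(Protocol):
    """Hypothetical adapter contract: enumerate, drive, tail, declare."""

    def list_sessions(self) -> list[str]: ...
    def drive(self, session: str, text: str) -> None: ...
    def tail(self, session: str) -> Iterator[dict]: ...
    def capabilities(self) -> Capabilities: ...


class EchoAdapter:
    """Toy in-memory adapter that 'answers' by upper-casing the input."""

    def __init__(self) -> None:
        self._log: dict[str, list[dict]] = {"demo": []}

    def list_sessions(self) -> list[str]:
        return sorted(self._log)

    def drive(self, session: str, text: str) -> None:
        log = self._log[session]
        log.append({"id": f"{session}-{len(log)}", "type": "user_message", "text": text})
        log.append({"id": f"{session}-{len(log)}", "type": "assistant_text", "text": text.upper()})

    def tail(self, session: str) -> Iterator[dict]:
        yield from self._log[session]

    def capabilities(self) -> Capabilities:
        return Capabilities()  # declares no optional features
```

Because the contract is structural, EchoAdapter satisfies AgentAdapter without inheriting from it. That is one way to keep a new adapter to a single small module.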

Engineering highlights

Capability-driven UI

One frontend bundle. The backend advertises per-adapter capabilities at /api/health; the UI shows or hides multi-session, agent lifecycle, MCP controls, and the live terminal pane accordingly — no per-agent forks.
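
As a rough illustration of the show-or-hide decision, the sketch below derives the visible feature set from a health payload. The payload shape (an "adapters" map of boolean flags) and both function names are assumptions; only the /api/health path comes from the text above.

```python
def health_payload(adapters: dict[str, dict[str, bool]]) -> dict:
    """Build a hypothetical /api/health body advertising per-adapter flags."""
    return {"status": "ok", "adapters": adapters}


def visible_features(payload: dict, adapter: str) -> set[str]:
    """Return the feature names the UI should show for one adapter."""
    # An unknown adapter or a missing flag means the control stays hidden.
    flags = payload.get("adapters", {}).get(adapter, {})
    return {name for name, enabled in flags.items() if enabled}
```

Defaulting to hidden means an adapter that declares nothing still gets a working, minimal UI.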

Ground-truth transcripts

Output is tailed from the agent's append-only JSONL, not synthesized. Streaming, collapsible reasoning, and reload-safe de-duplication (merge by transcript id) all fall out of this.
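
A minimal sketch of tailing an append-only JSONL file with a byte cursor follows. It assumes one JSON object per newline-terminated line; the function name and the polling style are illustrative and not taken from the project.

```python
import json
from pathlib import Path


def read_new_events(path: Path, cursor: int) -> tuple[list[dict], int]:
    """Return events appended after byte offset `cursor`, plus the new cursor."""
    events: list[dict] = []
    with path.open("rb") as f:
        f.seek(cursor)
        while True:
            line = f.readline()
            # Stop at end of file, or at a last line that is still being
            # written. The cursor stays in front of it, so the next poll
            # re-reads that line once it is complete.
            if not line.endswith(b"\n"):
                break
            cursor += len(line)
            if line.strip():
                events.append(json.loads(line))
    return events, cursor
```

Persisting the returned cursor between polls is what would make a reload resume where the previous read stopped.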

Truncation & resume guards

A pause-and-acknowledge contract handles transcript truncation cleanly; resume is guarded by a ground-truth process check, so a UI reload never double-counts or loses a turn.
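
One way to picture the truncation half of this contract is sketched below. It assumes truncation is detected when the file becomes shorter than the stored cursor, and that an acknowledgement restarts the tail from offset 0. Both are guesses for illustration, and the process check that guards resume is not modelled.

```python
from dataclasses import dataclass


@dataclass
class TailState:
    cursor: int = 0       # byte offset already delivered to clients
    paused: bool = False  # True while waiting for an acknowledgement


def check_truncation(state: TailState, file_size: int) -> bool:
    """Pause the tail if the transcript is now shorter than the cursor."""
    if file_size < state.cursor:
        state.paused = True
    return state.paused


def acknowledge(state: TailState) -> None:
    """Client accepted the reset: restart from the top of the new file."""
    if state.paused:
        state.cursor = 0
        state.paused = False
```

Once paused, the state stays paused even if the file grows again, so nothing is streamed from a stale offset before the client has acknowledged.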

Permission governance

A per-session policy, stored in SQLite, controls auto-approval. It is published over a small file contract so that the Telegram and web front-ends agree on who may approve what.

SQLite for all state

Cursors, acks, policy, and voice live in one WAL database with parametrized SQL and idempotent migrations — no scattered files to drift.
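
The pattern described here can be sketched with the standard sqlite3 module. The cursors table, its columns, and the function names are hypothetical; the sketch only shows WAL mode, an idempotent migration, and parametrized statements.

```python
import sqlite3


def open_state(path: str) -> sqlite3.Connection:
    """Open the state database in WAL mode and apply the schema."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # Idempotent migration: safe to run on every startup.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cursors ("
        "session TEXT PRIMARY KEY, byte_offset INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def save_cursor(conn: sqlite3.Connection, session: str, byte_offset: int) -> None:
    # Values are bound as parameters, never interpolated into the SQL text.
    conn.execute(
        "INSERT OR REPLACE INTO cursors (session, byte_offset) VALUES (?, ?)",
        (session, byte_offset),
    )
    conn.commit()


def load_cursor(conn: sqlite3.Connection, session: str) -> int:
    row = conn.execute(
        "SELECT byte_offset FROM cursors WHERE session = ?", (session,)
    ).fetchone()
    return row[0] if row else 0
```

Opening the same file a second time re-runs the migration without error and leaves the stored offsets untouched.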

Reproducible build

The image build refreshes the bundled frontend and asserts its build id changed before shipping, so a stale UI can never go out with a new backend.

Deployment shape

Backend binds to localhost; a reverse proxy terminates TLS and gates access (mTLS / VPN). The frontend is a static export served same-origin with the API, so there's no CORS surface. Per-workspace instances are single-tenant and isolated by deployment — each agent owner gets their own.

This page describes the architecture as built. It's an actively evolving personal/OSS project — see the README for current setup and the honest "what's validated" status.