cc-bridge — Architecture Overview

What it is

AI coding agents (Claude Code, Pi/DeepSeek, …) run as long-lived processes on a server. cc-bridge gives them a real-time interface: a web app and a Telegram bot that attach to the same running sessions, stream what the agent is doing, and let you steer it from any device, without sitting at the terminal. The agent runtime is hidden behind an adapter, so the same UI and protocol work across different agents.

Python · FastAPI · WebSockets · Next.js (static) · SQLite (WAL) · tmux control-mode · JSONL tail

System architecture

Four layers. Commands flow down (you → agent); events stream up (agent → you) in real time. The backend never embeds an agent — it talks to whatever the active adapter exposes.

Clients ↔ FastAPI backend ↔ adapter ↔ live agent sessions. State lives in SQLite (WAL); transcripts are tailed from JSONL.

Request lifecycle

A message round-trips through the agent's own transcript, so the UI always reflects ground truth — not an optimistic guess.

Drive down through the adapter; stream up from the transcript. One typed event model, regardless of agent.
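
The upward half of that round trip can be sketched as a byte-offset tail over the transcript file. This is a minimal illustration, not the project's code: the Event fields and the JSON keys "id", "type", and "text" are assumed names, and a real transcript schema will differ per agent.

```python
import json
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Event:
    id: str    # transcript record id (assumed key: "id")
    kind: str  # record kind (assumed key: "type")
    text: str  # display text (assumed key: "text")


def tail_events(path: str, offset: int = 0) -> Iterator[Tuple[Event, int]]:
    """Yield (event, next_offset) for every complete JSONL line after offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # EOF, or a partial line the agent is still writing
            offset += len(line)
            if not line.strip():
                continue  # skip blank lines but keep the offset moving
            rec = json.loads(line)
            event = Event(str(rec["id"]), rec.get("type", "unknown"), rec.get("text", ""))
            yield event, offset
```

Persisting the returned offset as a cursor lets a later call pick up exactly where the previous one stopped, and a half-written last line is simply left for the next pass.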

The adapter model — bring your own agent

The core knows nothing about Claude or DeepSeek. Each adapter implements a small contract — enumerate sessions, drive input, tail output, declare capabilities — and the frontend renders itself from the capabilities the backend advertises (multi-session, lifecycle, MCP, live pane, …). Add an agent by adding an adapter; the UI follows automatically.

Two adapters ship today (Claude Code, Pi/DeepSeek). The interface is the product — a new agent is a new adapter, not a fork.
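
A contract of that shape might look like the sketch below. All names here (AgentAdapter, Capabilities, the method names, EchoAdapter, REGISTRY) are hypothetical and chosen for illustration; the real interface lives in the repository.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Protocol


@dataclass(frozen=True)
class Capabilities:
    multi_session: bool = False
    lifecycle: bool = False
    mcp: bool = False
    live_pane: bool = False


class AgentAdapter(Protocol):
    """The four duties named above: enumerate, drive, tail, declare."""
    name: str

    def capabilities(self) -> Capabilities: ...
    def list_sessions(self) -> List[str]: ...
    def send_input(self, session_id: str, text: str) -> None: ...
    def tail_output(self, session_id: str, offset: int = 0) -> Iterator[str]: ...


class EchoAdapter:
    """Toy in-memory adapter, standing in for a real agent runtime."""
    name = "echo"

    def __init__(self) -> None:
        self._logs: Dict[str, List[str]] = {"demo": []}

    def capabilities(self) -> Capabilities:
        return Capabilities(multi_session=True)

    def list_sessions(self) -> List[str]:
        return sorted(self._logs)

    def send_input(self, session_id: str, text: str) -> None:
        self._logs[session_id].append(f"echo: {text}")

    def tail_output(self, session_id: str, offset: int = 0) -> Iterator[str]:
        yield from self._logs[session_id][offset:]


REGISTRY: Dict[str, AgentAdapter] = {}


def register(adapter: AgentAdapter) -> None:
    """Adding an agent means registering one more adapter."""
    REGISTRY[adapter.name] = adapter
```

Because the core only ever calls these four methods, swapping EchoAdapter for a tmux-backed adapter would not change any caller.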

Engineering highlights

Capability-driven UI

One frontend bundle. The backend advertises per-adapter capabilities at /api/health; the UI shows or hides multi-session, agent lifecycle, MCP controls, and the live terminal pane accordingly — no per-agent forks.
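
As a rough sketch of that handshake, the snippet below builds a health payload and derives the visible panels from it. The payload shape, the capability keys, and the panel labels are assumptions for illustration; only the /api/health path comes from the text above, and the real client-side logic lives in the Next.js frontend.

```python
# Hypothetical mapping from advertised capability to the UI element it unlocks.
PANELS = {
    "multi_session": "session switcher",
    "lifecycle": "agent lifecycle controls",
    "mcp": "MCP controls",
    "live_pane": "live terminal pane",
}


def health_payload(adapter: str, capabilities: dict) -> dict:
    """What a GET /api/health response body might carry (assumed shape)."""
    return {"status": "ok", "adapter": adapter, "capabilities": dict(capabilities)}


def visible_panels(payload: dict) -> list:
    """Show a panel only when its capability is explicitly advertised as true."""
    caps = payload.get("capabilities", {})
    return [label for key, label in PANELS.items() if caps.get(key) is True]
```

Defaulting to hidden means an adapter that omits a capability, or a backend older than the frontend, degrades to a smaller UI rather than a broken one.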

Ground-truth transcripts

Output is tailed from the agent's append-only JSONL, not synthesized. Streaming, collapsible reasoning, and reload-safe de-duplication (merge by transcript id) all fall out of this.
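
Merging by transcript id can be illustrated in a few lines. This sketch assumes each event is a dict with an "id" key; the function name is hypothetical.

```python
from typing import Dict, List


def merge_by_id(existing: List[dict], incoming: List[dict]) -> List[dict]:
    """Merge replayed events into what the client already holds.

    An id seen before keeps its original position but takes the newer
    content; an unseen id is appended in arrival order.
    """
    merged: Dict[str, dict] = {e["id"]: e for e in existing}  # dicts keep insertion order
    for event in incoming:
        merged[event["id"]] = event
    return list(merged.values())
```

Replaying the same batch twice yields the same list, which is what makes a page reload safe.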

Truncation & resume guards

A pause-and-acknowledge contract handles transcript truncation cleanly; resume is guarded by a ground-truth process check, so a UI reload never double-counts or loses a turn.
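
One plausible reading of those two guards is sketched below. It assumes truncation is detected as the file becoming shorter than the saved cursor, and that the process check is supplied as a callable; Cursor, check_truncation, acknowledge, and may_resume are hypothetical names, not the project's API.

```python
import os
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cursor:
    offset: int = 0       # bytes of transcript already delivered
    paused: bool = False  # set when truncation is detected


def check_truncation(cursor: Cursor, path: str) -> bool:
    """Pause if the transcript is now shorter than what was already consumed."""
    if os.path.getsize(path) < cursor.offset:
        cursor.paused = True
    return cursor.paused


def acknowledge(cursor: Cursor) -> None:
    """Client confirms it dropped stale state; restart from the top of the file."""
    cursor.offset = 0
    cursor.paused = False


def may_resume(cursor: Cursor, agent_alive: Callable[[], bool]) -> bool:
    """Resume only when not paused and the agent process is observed running."""
    return (not cursor.paused) and agent_alive()
```

Passing the liveness check in as a function keeps the sketch portable; how the real backend inspects the process is not specified here.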

Permission governance

A per-session policy (in SQLite) controls auto-approval, published over a small file contract so the Telegram and web front-ends agree on who may approve what.
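
A minimal sketch of that flow, assuming a session_policy table with a single auto_approve flag and a JSON file as the published form. The table name, column names, and file format are all illustrative guesses.

```python
import json
import os
import sqlite3
import tempfile


def init_policy(db: sqlite3.Connection) -> None:
    db.execute(
        "CREATE TABLE IF NOT EXISTS session_policy ("
        "session_id TEXT PRIMARY KEY, auto_approve INTEGER NOT NULL)"
    )


def set_policy(db: sqlite3.Connection, session_id: str, auto_approve: bool) -> None:
    # Placeholders keep the session id out of the SQL text.
    db.execute(
        "INSERT OR REPLACE INTO session_policy (session_id, auto_approve) VALUES (?, ?)",
        (session_id, int(auto_approve)),
    )
    db.commit()


def publish_policy(db: sqlite3.Connection, path: str) -> dict:
    """Write the current policy to path atomically so readers never see half a file."""
    rows = db.execute(
        "SELECT session_id, auto_approve FROM session_policy ORDER BY session_id"
    ).fetchall()
    doc = {sid: {"auto_approve": bool(flag)} for sid, flag in rows}
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(doc, f)
    os.replace(tmp, path)  # atomic rename within one filesystem
    return doc
```

Both front-ends then read the same file, so neither can approve under a policy the other has not seen.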

SQLite for all state

Cursors, acks, policy, and voice live in one WAL database with parametrized SQL and idempotent migrations — no scattered files to drift.
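
The pattern can be sketched with the standard sqlite3 module. The table layout and the use of PRAGMA user_version to track applied migrations are assumptions made for this example, not a description of the project's schema.

```python
import sqlite3

# Ordered migrations; index + 1 is the schema version each one produces.
MIGRATIONS = [
    "CREATE TABLE IF NOT EXISTS cursors ("
    "session_id TEXT PRIMARY KEY, byte_offset INTEGER NOT NULL)",
    "CREATE TABLE IF NOT EXISTS acks ("
    "session_id TEXT PRIMARY KEY, acked_at TEXT NOT NULL)",
]


def open_db(path: str) -> sqlite3.Connection:
    """Open the database in WAL mode and apply any migrations not yet recorded."""
    db = sqlite3.connect(path)
    db.execute("PRAGMA journal_mode=WAL")
    version = db.execute("PRAGMA user_version").fetchone()[0]
    for new_version, stmt in enumerate(MIGRATIONS[version:], start=version + 1):
        db.execute(stmt)
        # PRAGMA takes no placeholders; new_version is an int produced here.
        db.execute(f"PRAGMA user_version = {new_version}")
    db.commit()
    return db


def save_cursor(db: sqlite3.Connection, session_id: str, byte_offset: int) -> None:
    """Parameterized upsert: values travel as bindings, never as SQL text."""
    db.execute(
        "INSERT OR REPLACE INTO cursors (session_id, byte_offset) VALUES (?, ?)",
        (session_id, byte_offset),
    )
    db.commit()
```

Running open_db against an already migrated file is a no-op, which is what makes the migrations safe to run on every start.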

Reproducible build

The image build refreshes the bundled frontend and asserts its build id changed before shipping, so a stale UI can never go out with a new backend.
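
A build-time guard of that kind could be as small as the sketch below. It assumes the bundle directory contains a plain-text file named BUILD_ID; the file name, its location, and the function names are illustrative and may not match the actual build script.

```python
from pathlib import Path


def read_build_id(bundle_dir: str) -> str:
    """Read the bundle's build id (assumed to live in a file named BUILD_ID)."""
    return (Path(bundle_dir) / "BUILD_ID").read_text().strip()


def assert_fresh_bundle(previous_id: str, bundle_dir: str) -> str:
    """Fail the build unless the refreshed bundle carries a new, non-empty id."""
    current = read_build_id(bundle_dir)
    if not current:
        raise RuntimeError("bundle has an empty build id")
    if current == previous_id:
        raise RuntimeError(f"frontend bundle is stale: build id is still {current!r}")
    return current
```

Raising instead of warning makes the stale case stop the image build rather than pass silently.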

Deployment shape

Backend binds to localhost; a reverse proxy terminates TLS and gates access (mTLS / VPN). The frontend is a static export served same-origin with the API, so there's no CORS surface. Per-workspace instances are single-tenant and isolated by deployment — each agent owner gets their own.

This page describes the architecture as built. It's an actively evolving personal/OSS project — see the README for current setup and the honest "what's validated" status.