Introducing tapes: Transparent Telemetry for AI Agents

AI agents are everywhere. They write code, manage infrastructure, and make decisions on our behalf. But when a session ends, everything disappears. The reasoning, the missteps, the decisions that shaped the outcome—gone.

This is the observability gap we talk about at Paper Compute. You can’t debug what you can’t see. You can’t improve what you never recorded.

Today we’re open-sourcing tapes, a telemetry layer that creates durable, auditable records of every AI agent session. For the full technical deep dive, read my detailed walkthrough.

“If you can’t replay it, you can’t trust it.”

The Problem

Right now, most AI agent tooling operates on faith. You launch a session, the agent does its work, and you hope for the best. There’s no audit trail for what the model saw. No record of which tool calls succeeded or failed. No way to understand why an agent made a particular decision three hours ago.

This is a security problem. It’s an auditability problem. And for anyone running agents in production, it’s an operational problem that gets worse with every session you can’t inspect.

How tapes Works

The name is a nod to magnetic tape—resilient, sequential, and built to last. The system sits between your AI agents and their inference providers, capturing telemetry without altering the workflow.

01.Proxy service. Intercepts and records traffic between agents and model providers. Drop it in, no code changes required.
02.API server. Query sessions, messages, and metadata programmatically. Build dashboards, alerts, or custom analysis on top.
03.CLI client. Manage recordings, run searches, and analyze sessions from the terminal.
04.Terminal UI. A TUI for deeper investigation when you need to trace through a session turn by turn.

Semantic Search Across Sessions

Every recorded session is embedded for vector search. Instead of grepping through logs, you describe what you’re looking for in natural language and tapes returns the most relevant interactions with similarity scores.

This means you can ask questions like “when did the agent modify the database schema?” or “show me every session that touched authentication” and get meaningful results across your entire history. See the search guide to get started.

Content-Addressed Conversations

Every message in tapes gets a content-addressable hash—similar to how Git handles commits. This unlocks some powerful capabilities:

01.Conversation checkpointing. Save the state of a conversation at any point and return to it later.
02.Branching. Fork a conversation from a specific turn to explore alternative paths.
03.Point-in-time retry. When a workflow goes sideways, rewind to the last good state and try again.

This is the kind of infrastructure primitive we believe every agent operator needs. Not smarter models—better tooling around the models we already have.

What’s Next?

tapes is the first piece of the Paper Compute observability stack. We’re actively working on integration with agent-trace.dev, expanding provider support to AWS Bedrock and Google Vertex, and adding storage backends including Postgres and Qdrant.

For the full technical deep dive—architecture diagrams, setup instructions, and code examples—read my detailed walkthrough.

The repo is live at github.com/papercomputeco/tapes. Star it, try it, break it, and tell us what’s missing.