What is skill extraction?

Skill extraction is the process of reading a recorded agent session and producing a draft skill that captures the procedure, tool sequence, decisions, and known fixes that helped the session succeed. The draft is reviewed and edited before it is reused, so a one-time success can become a reusable artifact.

How do you extract a skill from a session?

In paper console, open a recorded session and choose Generate Skill. The server runs extraction over the session and returns a draft skill you can review and edit before saving. The draft records which sessions it came from and that it was AI-generated, so its origin stays traceable.

Does extraction need a successful session?

It works best on one. Extraction reads the path that resolved the task, so a session that genuinely succeeded tends to produce a cleaner draft. Because a session can appear to succeed for the wrong reason, an extracted skill is reviewed before it is allowed to shape future runs.

What does extraction leave out of a session?

Extraction aims to keep the reusable procedure and drop run-specific noise: timestamps, one-off literal values, redacted credentials, and conversational filler that did not change the path. The goal is a skill that generalizes to the class of task rather than replaying one exact session.

Can skill extraction be fully automated?

The drafting step is automated; acceptance is intentionally not. An extracted skill is reviewed like a change to a shared artifact, because skills that are adopted without review can propagate a pattern nobody validated.

Skill Extraction - Paper Compute Concepts

Skill extraction is the step that turns a recorded agent session into a reusable skill: the procedure, the tool sequence, the decisions, and the known fixes that helped that session succeed, lifted into a draft you can review and reuse. In the continuous agent improvement loop, extraction sits between capturing a session and applying what it taught. It is the point where a one-time result becomes something a later run can reuse instead of rediscovering.

Quick breakdown

Extraction is a transformation. Each part of a recorded session maps to a part of the resulting skill.

What extraction reads, and what it writes
Session element → Skill section	What the skill gets
Recurring symptom → Trigger	The error or task type that started the session becomes a signal for when the skill applies.
Successful tool calls → Procedure	The ordered calls that resolved the task become the skill's recommended steps.
Branch after an error → Decision point	A choice the agent made based on observed state becomes a documented decision.
Error and the fix that cleared it → Troubleshooting	Symptom, likely cause, and fix become a reusable lookup.
Source sessions → Provenance	The draft records which sessions it came from and that it was AI-generated, so its origin is auditable.
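
The mapping in the table can be sketched in a few lines of Python. This is a minimal illustration, not Paper Compute's implementation: the event kinds (`symptom`, `tool_call`, `branch`, `error_fix`), the `ok` flag, and the `draft_sections` function are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical event kinds mapped to skill sections, following the table above.
SECTION_FOR_KIND = {
    "symptom": "trigger",
    "tool_call": "procedure",
    "branch": "decisions",
    "error_fix": "troubleshooting",
}


def draft_sections(events, session_id):
    """Group session events into skill sections, preserving their order."""
    sections = defaultdict(list)
    for event in events:
        # Only successful tool calls belong in the recommended procedure.
        if event["kind"] == "tool_call" and not event.get("ok", False):
            continue
        section = SECTION_FOR_KIND.get(event["kind"])
        if section is not None:  # unknown kinds are treated as noise
            sections[section].append(event["summary"])
    draft = dict(sections)
    # Provenance: which session the draft came from, and that it was generated.
    draft["provenance"] = {"sessions": [session_id], "ai_generated": True}
    return draft
```

Events of an unrecognized kind are dropped, which mirrors the idea that only the parts of a session with a place in the skill are carried over.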

What a recorded session contains that extraction reads

Extraction is only as good as the session record under it. A useful record is the full trajectory of a run rather than a summary written afterward. It holds the prompts, the model responses, the tool calls and their arguments, the errors returned, the fixes attempted, and the outcome, captured as the run happened.
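
One way to picture such a record is as a small typed structure. The sketch below takes its field names from the list above (prompts, responses, tool calls, fixes, outcome). The class names and layout are assumptions made for illustration. The archive format that paper cli actually writes is not described here and may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ToolCall:
    name: str
    arguments: Dict[str, Any]
    error: Optional[str] = None  # error returned by the call, if any


@dataclass
class SessionRecord:
    session_id: str
    prompts: List[str] = field(default_factory=list)
    responses: List[str] = field(default_factory=list)
    tool_calls: List[ToolCall] = field(default_factory=list)
    fixes: List[str] = field(default_factory=list)
    outcome: str = "unknown"

    def errors(self) -> List[str]:
        """Errors in the order they occurred during the run."""
        return [c.error for c in self.tool_calls if c.error is not None]
```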

Extraction reads what already worked in a session, instead of guessing what should work next time.

This is the practical difference between a skill and a prompt. A prompt is written from intent. A skill is drafted from a decision trail that already produced a result. The session record is the evidence, and extraction is the step that turns that evidence into something reusable.

What skill extraction keeps and what it discards

A raw session is mostly noise. The job of extraction is to keep the part that generalizes and drop the part that was specific to one run.

Signal versus noise in a session
Kept (tends to generalize)	Discarded (run-specific)
The tool sequence that resolved the task	Timestamps and wall-clock timing
Error signatures and the fixes that cleared them	Credentials and secrets, redacted at capture
Decision branches based on observed state	One-off literal values unique to that run
Conditions that say when the skill applies	Conversational filler that did not change the path
Inputs that materially changed the outcome	Dead ends the session already superseded

The hard part is the boundary. Keep too much and the skill only fires on the exact session it came from. Keep too little and the skill is a vague suggestion. Good extraction leans toward the procedure and the decisions, and away from the literals.
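
Leaning away from literals can be illustrated with a small substitution pass. This is a sketch under stated assumptions: the two patterns (ISO-style timestamps and long hexadecimal identifiers) and the `generalize` function are hypothetical, and a real extractor would need a much broader notion of what counts as run-specific.

```python
import re

# Assumed patterns for run-specific literals; illustrative, not exhaustive.
ISO_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?")
HEX_ID = re.compile(r"\b[0-9a-f]{12,}\b")


def generalize(step: str) -> str:
    """Replace run-specific literals in a step with named placeholders."""
    step = ISO_TIMESTAMP.sub("<timestamp>", step)
    step = HEX_ID.sub("<id>", step)
    return step
```

For example, `generalize("restart job 3f9a1c0b7d2e4a11 at 2024-05-01T12:30:00Z")` returns `"restart job <id> at <timestamp>"`, while a step with no such literals passes through unchanged. Replacing too aggressively has the opposite cost described above: the step stops saying anything concrete.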

How to generate a skill from a recorded session

In paper console, skill generation runs from a recorded session. Open the session, choose Generate Skill, and the server runs extraction over the session and returns a draft. The draft opens in an editor so you can read it, adjust it, and decide whether it is worth keeping.

Session to draft skill

Recorded session (captured by paper cli)
prompts · tool calls · errors · fixes · outcome
          │
          ▼
 paper console:  Generate Skill
 (server runs extraction over the session)
          │
          ▼
 Draft skill  (Markdown + structured fields)
 ├── type: workflow | prompt-template | domain-knowledge
 ├── procedure, decisions, troubleshooting
 ├── originating sessions + AI-generated flag
 └── version + visibility (private or team)
          │
          ▼
 Review and edit  →  save

A generated skill carries structure beyond its text. It has a type (a workflow, a prompt template, or domain knowledge) so the library can tell procedures apart from reference material. It records the sessions it was generated from and that it was AI-generated, which keeps its origin traceable. It is versioned, and its visibility can be scoped to you or shared with the team.
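
The structured fields can be pictured as a typed record. The sketch below uses the type and visibility values shown in the diagram. The class and field names, and the default values (version 1, private visibility), are assumptions made for illustration rather than the console's actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class SkillType(Enum):
    WORKFLOW = "workflow"
    PROMPT_TEMPLATE = "prompt-template"
    DOMAIN_KNOWLEDGE = "domain-knowledge"


class Visibility(Enum):
    PRIVATE = "private"
    TEAM = "team"


@dataclass
class DraftSkill:
    title: str
    body_markdown: str
    skill_type: SkillType
    source_sessions: List[str]  # provenance: originating sessions
    ai_generated: bool = True   # drafts come from extraction
    version: int = 1            # assumed starting version
    visibility: Visibility = Visibility.PRIVATE  # assumed default scope
```

Keeping the type and provenance as fields, rather than as prose inside the Markdown body, is what lets a library filter and audit skills without parsing their text.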

Why a generated skill is a draft, not a finished skill

Extraction produces a draft. Adoption is a separate, human decision.

A generated skill is a starting point a person reviews, not a verdict the system commits on its own.

Two failure modes make the review step matter. First, a session can look successful without being so: the agent declared victory, but the fix was incidental, and extracting from it would bake in a pattern that does not hold. Second, an over-fit draft captures literals that should have been general, so it quietly breaks the first time the inputs differ. Both are caught by a person reading the draft before it is saved and shared. That is why the console opens the generated skill in an editor rather than publishing it directly.
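
The separation between drafting and adoption can be expressed as a gate in code. This is a hypothetical sketch: `save_skill`, `ReviewRequired`, and the `reviewed_by` and `status` fields are illustrative names, not part of paper console.

```python
from typing import Optional


class ReviewRequired(Exception):
    """Raised when a draft is saved without a named reviewer."""


def save_skill(draft: dict, reviewed_by: Optional[str]) -> dict:
    # Hypothetical gate: a generated draft is never saved on its own.
    if not reviewed_by:
        raise ReviewRequired("a person must review the draft before it is saved")
    # Return a new dict so the unreviewed draft object is left untouched.
    return {**draft, "reviewed_by": reviewed_by, "status": "saved"}
```

The gate only records that a review happened. Whether the session genuinely succeeded and whether the draft is over-fit remain judgments for the person reading it.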

How skill extraction differs from prompt engineering

Both produce instructions for an agent. They differ in where the instructions come from.

Drafted from a run versus authored from intent
Prompt engineering	Skill extraction
Starts from what you think should work	Starts from a session that already worked
Iterated by trial against live runs	Drafted from a recorded trajectory
Often omits the error paths you did not anticipate	Can include the errors the session hit, with fixes
Tends to live as text maintained by hand	Lives as a versioned artifact, re-derivable from new sessions

The two are complementary. You might prompt-engineer your way to a working session, then extract a skill so the next run does not have to repeat the search.

How Paper Compute extracts skills from sessions

paper cli records each session as a durable, queryable archive. In paper console, Generate Skill runs extraction over a recorded session and returns a draft skill you review and edit before saving. The result is a structured artifact with a type, a version, a record of the sessions it came from, and a visibility scope. From there it belongs to your skill library, where review and versioning govern it like any other shared asset. stereOS is where a skill runs safely when applying it involves executing code.

Extraction is the step that keeps a solved problem from being solved twice. Capture makes it possible; review makes it trustworthy.
