I wish Claude just knew how I work without me explaining - so I made something that quietly observes me, learns and teaches it. Open source | AI Paper Digest

TL;DR Highlight

A Mac app that automatically creates Skills by observing your actual work, so you no longer have to re-enter the same context in every Claude Code session.

Who Should Read

Developers who use Claude Code or other AI coding agents daily and are tired of explaining the same workflows in each session. It is especially relevant for freelancers and startup developers with many repetitive workflows, such as PR reviews and client onboarding.

Core Mechanics

Operates as a Mac menu bar app, observing the screen to automatically convert work patterns into 'Skill' files and deliver them to Claude or other agents via MCP (Model Context Protocol).
Offers two capture modes: 'Focus Record' (record once and answer a few questions to create a Skill) and 'Passive Discovery' (quietly observe in the background for a few days).
Passive Discovery distinguishes work from noise using an activity classifier, connects related tasks across sessions with cross-session linking even if a session is interrupted, and synthesizes patterns into Skills after observing them at least three times.
Processing runs through an 11-step local pipeline that includes screen capture → frame annotation with a local VLM (Qwen 2.5 via Ollama) → semantic embedding to group similar workflows, all executed on your device.
Skill files are structured playbooks containing strategy, steps, guardrails, and writing tone, not just simple prompts. Each Skill's confidence score is updated based on execution success or failure, so Skills improve over time.
Screenshots are deleted after processing, PII (Personally Identifiable Information) and API keys are automatically masked, data is stored with local encryption, and no telemetry is sent, so data does not leave the device.
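
The masking step in the last item can be illustrated with a small regex-based redactor. This is a minimal sketch under stated assumptions, not AgentHandover's actual implementation: the function name `mask_sensitive`, the two patterns, and the placeholder tokens are all hypothetical, and a real redactor would need to cover far more formats.

```python
import re

# Hypothetical patterns for illustration only; not taken from the project.
PATTERNS = [
    ("[EMAIL]", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("[API_KEY]", re.compile(r"\b(?:sk|pk|ghp)[-_][A-Za-z0-9_-]{16,}\b")),
]


def mask_sensitive(text: str) -> str:
    """Replace every match of each pattern with its placeholder token."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    print(mask_sensitive("Mail jane@example.com, key sk-abcdefghijklmnop1234"))
    # prints: Mail [EMAIL], key [API_KEY]
```

Running the redactor on annotation text before anything is written to disk is one way to ensure that only masked text is ever stored.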

Evidence

Operates through an 11-step local pipeline in which Qwen 2.5 (a local VLM served via Ollama) annotates each frame with predictions of the current app, current action, and next action to improve accuracy.
Real-world use turned up cases where Passive Discovery found workflows that the user was not aware of (e.g., Monday metrics routines, GitHub issue triage patterns).
Each Skill must pass a lifecycle gate after being observed at least three times before it can be used by an agent, and the confidence score is updated with each execution.
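
The lifecycle gate and confidence update described above could be modeled roughly as follows. Only the three-observation threshold comes from the text. The class name `SkillState`, the starting confidence of 0.5, and the moving-average update rule are assumptions chosen for illustration, not the project's actual scoring method.

```python
from dataclasses import dataclass

MIN_OBSERVATIONS = 3  # from the text: a pattern must be observed at least three times


@dataclass
class SkillState:
    name: str
    observations: int = 0
    confidence: float = 0.5  # assumed starting value

    def record_observation(self) -> None:
        """Count one more sighting of this workflow pattern."""
        self.observations += 1

    def is_agent_ready(self) -> bool:
        """Lifecycle gate: expose the Skill only once it has been seen often enough."""
        return self.observations >= MIN_OBSERVATIONS

    def record_execution(self, success: bool, rate: float = 0.2) -> None:
        """Move confidence toward 1.0 on success and toward 0.0 on failure."""
        target = 1.0 if success else 0.0
        self.confidence += rate * (target - self.confidence)
```

With the assumed rate of 0.2, a success moves a confidence of 0.5 to 0.6, and a subsequent failure moves it to 0.48.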

How to Apply

For tasks with clear procedures like a PR review process, recording once with Focus Record immediately creates a Skill, and Claude Code automatically references that Skill via MCP in subsequent sessions.
For routines where you don't know if there's a pattern (issue triage, weekly reports, etc.), leaving Passive Discovery running in the background for a few days will automatically find patterns and create Skills.
The generated Skill files can be immediately connected to other agent tools that support MCP, such as OpenClaw and Codex, allowing you to reuse the same context even when switching agents.

Code Example


# 1. Clone the repo and install
git clone https://github.com/sandroandric/AgentHandover
cd AgentHandover

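# 2. Prepare a local VLM with Ollama
# (confirm the exact model tag in the repo README; Ollama publishes the vision variant as qwen2.5vl)
ollama pull qwen2.5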

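# 3. Register the MCP server with your agent (example mcpServers entry).
#    Claude Desktop reads claude_desktop_config.json; Claude Code reads a project-level .mcp.json in the same format.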
{
  "mcpServers": {
    "agenthandover": {
      "command": "agenthandover-mcp",
      "args": []
    }
  }
}

# 4. Example flow for creating a Skill with Focus Record
# - Run the menu bar app → Click Record
# - Actually perform the "PR review" task
# - Answer a few questions after completion
# → skills/pr_review.skill.json is automatically generated

# 5. Claude Code session automatically references the Skill
# Claude loads the pr_review Skill via MCP and
# automatically applies the corresponding strategy/steps/guardrails

Terminology

MCP: Abbreviation for Model Context Protocol. A protocol that lets AI agents access external tools and data in a standardized way. Like USB, it plugs into any agent that follows the specification.

VLM: Abbreviation for Vision Language Model. An AI model that looks at images (screen captures) and describes them in text. It plays a role in understanding 'what you are doing' on the screen.

Skill: A unit file for storing work patterns in this app. It's not just a simple memo, but a structured playbook containing strategy, sequence, exception handling, and writing tone.
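
To make the "structured playbook" idea concrete, here is a minimal sketch of what such a file might hold. The field names and the example values are assumptions based on the components listed in this digest (strategy, steps, guardrails, tone); the real `.skill.json` schema may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Skill:
    # Field names are illustrative; they are not taken from the project's schema.
    name: str
    strategy: str
    steps: list[str] = field(default_factory=list)
    guardrails: list[str] = field(default_factory=list)
    tone: str = ""

    def to_json(self) -> str:
        """Serialize the playbook so it can be stored as a file."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    skill = Skill(
        name="pr_review",
        strategy="Read the description first, then review the diff file by file.",
        steps=["Open the PR", "Check CI status", "Leave inline comments"],
        guardrails=["Never approve while CI is failing"],
        tone="concise and friendly",
    )
    print(skill.to_json())
```

Keeping guardrails and tone as separate fields, rather than folding everything into one prompt string, is what would let an agent apply them independently of the step list.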

Semantic Embedding: A technique for converting the meaning of text or actions into numerical vectors. Used to group together 'similar types of work' even if the screen UI is different.
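
Grouping by embedding similarity can be sketched with cosine similarity and a greedy pass. This is an illustration of the general technique rather than the app's pipeline: the 3-dimensional vectors are toy values standing in for real embedding output, and the 0.9 threshold and function names are assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_by_similarity(
    items: dict[str, list[float]], threshold: float = 0.9
) -> list[list[str]]:
    """Greedy grouping: join the first group whose first member is similar enough."""
    groups: list[list[str]] = []
    for name, vec in items.items():
        for group in groups:
            if cosine(vec, items[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])  # no similar group found, start a new one
    return groups


if __name__ == "__main__":
    # Toy vectors; a real system would get these from an embedding model.
    sessions = {
        "review PR in browser": [0.9, 0.1, 0.0],
        "review PR in IDE": [0.8, 0.2, 0.1],
        "weekly metrics report": [0.0, 0.2, 0.9],
    }
    print(group_by_similarity(sessions))
    # prints: [['review PR in browser', 'review PR in IDE'], ['weekly metrics report']]
```

The two PR review sessions land in the same group even though they happened in different apps, which is the behavior the definition above describes.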

Activity Classifier: A model that classifies observed actions as either real work or noise like watching YouTube. Similar to a surveillance camera distinguishing between people and cats.

Confidence Score: A score indicating how reliable a given Skill is. It increases with successful executions, and the Skill is revised when an execution fails.

PII: Personally Identifiable Information. Sensitive information that can identify an individual, such as a name or email address. The app automatically masks it, along with secrets such as API keys.

Related Resources

AgentHandover GitHub repo: https://github.com/sandroandric/AgentHandover