No description

Find a file

Igor Ryzhkov b8ffa1f4fa Persist transcript incrementally — survive Stop / crash Transcripts were only written at end of executor.run, so stopping a mid-attempt run lost all the conversation data and made debugging impossible without reconstructing from screenshots. - executor.run accepts an optional incremental_save_path and writes the transcript to disk after every turn (cheap atomic write via the existing Transcript.save tempfile+rename) - Initial save after the user-intro append covers the "stopped at turn 0" case - Final save still happens at end of executor.run (idempotent — runner's own final save covers it too) - runner._run_one_step pre-computes the transcript path and passes it to executor.run so per-attempt files keep the same naming Operator can now `tail -f` or `cat` the per-step transcript while a long run is in flight, and partial transcripts survive Stop / process crash. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>		2026-05-24 21:32:12 -07:00
docs	TIME_LOG: fill in the iterative-debugging + retrospective phases	2026-05-23 10:48:18 -07:00
scripts	Two-model strategy + per-turn observability + Haiku shadow-replay tool	2026-05-24 15:27:31 -07:00
trailhead_agent	Persist transcript incrementally — survive Stop / crash	2026-05-24 21:32:12 -07:00
.env.example	Scaffold Trailhead agent project	2026-05-22 12:31:18 -07:00
.gitignore	Scaffold trailhead_agent package + RunState model	2026-05-22 13:30:30 -07:00
.python-version	Scaffold Trailhead agent project	2026-05-22 12:31:18 -07:00
pyproject.toml	Add FastAPI web UI — operator console at localhost:8080	2026-05-22 14:38:53 -07:00
README.md	Move ancillary docs under docs/	2026-05-23 10:47:05 -07:00
setup.sh	Scaffold Trailhead agent project	2026-05-22 12:31:18 -07:00
uv.lock	Add FastAPI web UI — operator console at localhost:8080	2026-05-22 14:38:53 -07:00

README.md

Trailhead Agent

An agent that automates Salesforce Trailhead trails end-to-end through the UI.

Stakeholder: Preconfigured. Take-home assignment — see docs/BRIEF.md. Architecture in docs/DESIGN.md. End-to-end retrospective in docs/LESSONS.md.

What's in the repo

Component	Where	Purpose
Trail ingestion	`scripts/fetch_trail.py`	Pulls trail structure + per-step instructions + inline icons into `out/<slug>.json`. Cached, ~27s for the Travel Approval trail.
Login wizard	`scripts/login.py`	One-time interactive Trailhead login → reusable `auth/trailhead.json` session.
Web operator console	`scripts/serve.py` → `localhost:8080`	Dashboard, run detail with live SSE updates, "I fixed it" intervention button, per-step transcript viewer.
Agent core	`trailhead_agent/`	Async Browser wrapper, PlaygroundManager, Verifier, Step Executor (Claude + 12 tools), Runner state machine, EventBus.
Per-run artifacts	`runs/<id>/`	`state.json`, `storage_state.json`, `transcripts/`, `screenshots/`. Resumable.

Setup

./setup.sh                                  # uv sync + Chromium install
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env # required
uv run python -m scripts.login              # one-time: interactive Trailhead login
uv run python -m scripts.fetch_trail \
  https://trailhead.salesforce.com/content/learn/trails/build-a-travel-approval-app

Running

uv run python -m scripts.serve              # http://localhost:8080

Then in the web UI: paste the trail URL, click Start run. A headed Chromium opens; the agent works through the trail step-by-step, clicking Verify after each. If it gets stuck, the WAITING block surfaces in the web UI — take over the Chromium window, finish the step manually, then click I've fixed it — re-verify.

For demo purposes, delete any existing playgrounds first at https://trailhead.salesforce.com/users/profiles/orgs — playground creation works reliably from the empty state.

Tracking

Wall-clock time invested per session: docs/TIME_LOG.md
Software costs: docs/COSTS.md