No description
Find a file
Igor Ryzhkov b8ffa1f4fa Persist transcript incrementally — survive Stop / crash
Transcripts were only written at end of executor.run, so stopping a
mid-attempt run lost all the conversation data and made debugging
impossible without reconstructing from screenshots.

  - executor.run accepts an optional incremental_save_path and writes
    the transcript to disk after every turn (cheap atomic write via
    the existing Transcript.save tempfile+rename)
  - Initial save after the user-intro append covers the "stopped at
    turn 0" case
  - Final save still happens at end of executor.run (idempotent —
    runner's own final save covers it too)
  - runner._run_one_step pre-computes the transcript path and passes
    it to executor.run so per-attempt files keep the same naming

Operator can now `tail -f` or `cat` the per-step transcript while
a long run is in flight, and partial transcripts survive Stop /
process crash.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 21:32:12 -07:00
docs TIME_LOG: fill in the iterative-debugging + retrospective phases 2026-05-23 10:48:18 -07:00
scripts Two-model strategy + per-turn observability + Haiku shadow-replay tool 2026-05-24 15:27:31 -07:00
trailhead_agent Persist transcript incrementally — survive Stop / crash 2026-05-24 21:32:12 -07:00
.env.example Scaffold Trailhead agent project 2026-05-22 12:31:18 -07:00
.gitignore Scaffold trailhead_agent package + RunState model 2026-05-22 13:30:30 -07:00
.python-version Scaffold Trailhead agent project 2026-05-22 12:31:18 -07:00
pyproject.toml Add FastAPI web UI — operator console at localhost:8080 2026-05-22 14:38:53 -07:00
README.md Move ancillary docs under docs/ 2026-05-23 10:47:05 -07:00
setup.sh Scaffold Trailhead agent project 2026-05-22 12:31:18 -07:00
uv.lock Add FastAPI web UI — operator console at localhost:8080 2026-05-22 14:38:53 -07:00

Trailhead Agent

An agent that automates Salesforce Trailhead trails end-to-end through the UI.

Stakeholder: Preconfigured. Take-home assignment — see docs/BRIEF.md. Architecture in docs/DESIGN.md. End-to-end retrospective in docs/LESSONS.md.

What's in the repo

Component Where Purpose
Trail ingestion scripts/fetch_trail.py Pulls trail structure + per-step instructions + inline icons into out/<slug>.json. Cached, ~27s for the Travel Approval trail.
Login wizard scripts/login.py One-time interactive Trailhead login → reusable auth/trailhead.json session.
Web operator console scripts/serve.pylocalhost:8080 Dashboard, run detail with live SSE updates, "I fixed it" intervention button, per-step transcript viewer.
Agent core trailhead_agent/ Async Browser wrapper, PlaygroundManager, Verifier, Step Executor (Claude + 12 tools), Runner state machine, EventBus.
Per-run artifacts runs/<id>/ state.json, storage_state.json, transcripts/, screenshots/. Resumable.

Setup

./setup.sh                                  # uv sync + Chromium install
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env # required
uv run python -m scripts.login              # one-time: interactive Trailhead login
uv run python -m scripts.fetch_trail \
  https://trailhead.salesforce.com/content/learn/trails/build-a-travel-approval-app

Running

uv run python -m scripts.serve              # http://localhost:8080

Then in the web UI: paste the trail URL, click Start run. A headed Chromium opens; the agent works through the trail step-by-step, clicking Verify after each. If it gets stuck, the WAITING block surfaces in the web UI — take over the Chromium window, finish the step manually, then click I've fixed it — re-verify.

For demo purposes, delete any existing playgrounds first at https://trailhead.salesforce.com/users/profiles/orgs — playground creation works reliably from the empty state.

Tracking