Transcripts were only written at end of executor.run, so stopping a
mid-attempt run lost all the conversation data and made debugging
impossible without reconstructing from screenshots.
- executor.run accepts an optional incremental_save_path and writes
the transcript to disk after every turn (cheap atomic write via
the existing Transcript.save tempfile+rename)
- Initial save after the user-intro append covers the "stopped at
turn 0" case
- Final save still happens at end of executor.run (idempotent —
runner's own final save covers it too)
- runner._run_one_step pre-computes the transcript path and passes
it to executor.run so per-attempt files keep the same naming
Operator can now `tail -f` or `cat` the per-step transcript while
a long run is in flight, and partial transcripts survive Stop /
process crash.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
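The save-after-every-turn behaviour described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Transcript` class here is a hypothetical stand-in, the JSON layout is assumed, and `run` is a toy loop in place of the real `executor.run`. Only the `incremental_save_path` parameter name and the tempfile+rename idea come from the commit message.

```python
import json
import os
import tempfile
from pathlib import Path


class Transcript:
    """Hypothetical stand-in for the project's Transcript class."""

    def __init__(self):
        self.turns = []

    def append(self, role, content):
        self.turns.append({"role": role, "content": content})

    def save(self, path):
        # Write to a temp file in the same directory, then rename it over
        # the target so a reader never sees a half-written file.
        path = Path(path)
        path.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump(self.turns, fh)
            os.replace(tmp_name, path)
        except BaseException:
            # Clean up the temp file if the write or rename failed.
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
            raise


def run(turn_replies, incremental_save_path=None):
    """Toy executor loop: save after the intro and after every turn."""
    transcript = Transcript()
    transcript.append("user", "intro")
    if incremental_save_path is not None:
        transcript.save(incremental_save_path)  # covers "stopped at turn 0"
    for reply in turn_replies:
        transcript.append("assistant", reply)
        if incremental_save_path is not None:
            transcript.save(incremental_save_path)
    return transcript
```

One caveat under this sketch's assumptions: because each save replaces the file via rename, a plain `tail -f` may keep following the old file, so `tail -F` (follow by name) or repeated `cat` is likely the safer way to watch it.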
| Name |
|---|
| docs |
| scripts |
| trailhead_agent |
| .env.example |
| .gitignore |
| .python-version |
| pyproject.toml |
| README.md |
| setup.sh |
| uv.lock |
Trailhead Agent
An agent that automates Salesforce Trailhead trails end-to-end through the UI.
Stakeholder: Preconfigured. Take-home assignment — see docs/BRIEF.md. Architecture in docs/DESIGN.md. End-to-end retrospective in docs/LESSONS.md.
What's in the repo
| Component | Where | Purpose |
|---|---|---|
| Trail ingestion | scripts/fetch_trail.py | Pulls trail structure + per-step instructions + inline icons into out/<slug>.json. Cached, ~27s for the Travel Approval trail. |
| Login wizard | scripts/login.py | One-time interactive Trailhead login → reusable auth/trailhead.json session. |
| Web operator console | scripts/serve.py → localhost:8080 | Dashboard, run detail with live SSE updates, "I fixed it" intervention button, per-step transcript viewer. |
| Agent core | trailhead_agent/ | Async Browser wrapper, PlaygroundManager, Verifier, Step Executor (Claude + 12 tools), Runner state machine, EventBus. |
| Per-run artifacts | runs/<id>/ | state.json, storage_state.json, transcripts/, screenshots/. Resumable. |
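To make the "Resumable" note on per-run artifacts concrete, here is one way a run directory could support resuming. It is a sketch under stated assumptions: the `completed_steps` key and all three helper functions are hypothetical, and the real `state.json` schema is not shown in this README.

```python
import json
from pathlib import Path


def load_completed(run_dir):
    """Return the set of step ids recorded as done (empty if no state yet)."""
    state_file = Path(run_dir) / "state.json"
    if not state_file.exists():
        return set()
    state = json.loads(state_file.read_text(encoding="utf-8"))
    # "completed_steps" is an assumed key, not the project's real schema.
    return set(state.get("completed_steps", []))


def mark_completed(run_dir, step_id):
    """Record a finished step so a later resume can skip it."""
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    done = load_completed(run_dir)
    done.add(step_id)
    payload = json.dumps({"completed_steps": sorted(done)})
    (run_dir / "state.json").write_text(payload, encoding="utf-8")


def remaining_steps(run_dir, all_steps):
    """Steps still to run, in their original order."""
    done = load_completed(run_dir)
    return [step for step in all_steps if step not in done]
```

On restart, a runner built this way would call `remaining_steps` with the trail's step list and continue from the first unfinished step.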
Setup
./setup.sh # uv sync + Chromium install
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env # required
uv run python -m scripts.login # one-time: interactive Trailhead login
uv run python -m scripts.fetch_trail \
https://trailhead.salesforce.com/content/learn/trails/build-a-travel-approval-app
Running
uv run python -m scripts.serve # http://localhost:8080
Then in the web UI: paste the trail URL, click Start run. A headed Chromium opens; the agent works through the trail step-by-step, clicking Verify after each. If it gets stuck, the WAITING block surfaces in the web UI — take over the Chromium window, finish the step manually, then click I've fixed it — re-verify.
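The attempt, verify, and human-intervention flow described above can be pictured as a small state machine. The sketch below is illustrative only: `StepState`, `run_step`, and `on_operator_fixed` are hypothetical names, and the `attempt` and `verify` callables stand in for the agent's work and the Verify click. Only the WAITING state name comes from this README.

```python
from enum import Enum


class StepState(Enum):
    RUNNING = "running"
    WAITING = "waiting"  # agent is stuck; a human should take over
    DONE = "done"


def run_step(attempt, verify):
    """Let the agent attempt one step, then verify it."""
    attempt()
    return StepState.DONE if verify() else StepState.WAITING


def on_operator_fixed(state, verify):
    """Handle the operator's 'fixed it' click: re-verify, do not re-run."""
    if state is not StepState.WAITING:
        return state  # nothing to do unless the step is waiting on a human
    return StepState.DONE if verify() else StepState.WAITING
```

For example, if verification fails after the agent's attempt, `run_step` returns `StepState.WAITING`; once the operator finishes the step by hand, `on_operator_fixed` runs verification again and moves the step to `StepState.DONE` only if it now passes.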
For demo purposes, delete any existing playgrounds first at https://trailhead.salesforce.com/users/profiles/orgs — playground creation works reliably from the empty state.
Tracking
- Wall-clock time invested per session: docs/TIME_LOG.md
- Software costs: docs/COSTS.md