Roadmap¶
Where lazy-harness is heading. This is a curated list of committed themes, not a wishlist. Items land here only when they are concrete enough to execute and the author intends to do them; exploratory ideas live in the internal backlog instead.
How to read this document¶
- Each theme groups related work that ships together or builds on a shared mechanism.
- Items are written as short, checkable deliverables. They get ticked off as PRs merge.
- The roadmap is revised whenever a theme closes or when priorities shift. There are no dates —
lazy-harnessis a single-maintainer project and ship-when-ready beats commit-to-a-date. - Finer-grained work (bugs, nice-to-haves, scratch notes) is not here. It lives in the internal backlog under
specs/backlog.md.
Theme 1 — Stability & quality gates¶
The pre-commit gate defined in CLAUDE.md requires pytest, ruff, and mkdocs build --strict to all pass cleanly. That invariant is the floor every other change builds on.
- [ ] Rewrite
tests/unit/test_version.pyto comparepyproject.tomlandsrc/lazy_harness/__init__.pyagainst each other (no hardcoded expected value). - [ ] Resolve 23 preexisting
ruff check src testsfindings — auto-fix the fixable, justify or exclude the rest. - [ ] Make the pre-commit gate green on
mainand keep it green. - [ ] Add a GitHub Actions workflow that runs the gate on every PR and blocks merge on failure.
Theme 2 — Knowledge pipeline maturity¶
The compound-loop worker is the framework's memory engine. Two known issues today cause lost learnings — they share an underlying mechanism (per-session delta tracking) and should land together.
- [ ] Implement
★ Insight ─block extraction as first-class compound-loop output. Spec:specs/designs/2026-04-13-compound-loop-insight-capture.md. - [ ] Fix "learnings lost on long sessions" by tracking
last_insight_message_indexper session and re-scanning only the delta on subsequent Stop hooks. - [ ] Add a contract test that pins the exact marker characters the
explanatoryoutput style emits, so a template change forces a visible failure.
Theme 3 — Open architecture decisions¶
Decisions the audit surfaced that are waiting on real evidence before being promoted or rejected. Each one has a concrete trigger for revisiting.
- [ ] Legacy ADR-010 Ollama backend. Decide: promote to active ADR as a configurable alternative, or reject with a "revisit if cost/rate-limit pressure appears" note. Trigger: if you hit Claude API cost or throttling limits driving compound-loop failures in practice.
- [ ] Legacy ADR-013 Proactivity levels. Decide: promote as per-profile configuration, or reject and keep proactivity encoded as prose in each profile's
CLAUDE.md. Trigger: when a third profile (beyondlazyandflex) is added and the difference in autonomy stops fitting in prose. - [ ] ADR-018 implementation. Build the
lh config <feature>command group and the "Features" section oflh doctor. Trigger: when the second extension point (beyondmetrics_sink) ships and the wizard UX has a concrete second example to design against.
Theme 4 — Framework extensibility¶
The plugin system (metrics sinks, ADR-004 agent adapters) is the framework's growth surface. The next extension point is not yet chosen; the one that ships will inform ADR-018 implementation in Theme 3.
- [ ] Identify and ship the second extension point (candidates: knowledge backend, hook providers, session-export targets). Selection criterion is concrete user need, not speculative design.
- [ ] Document the extension-point contract once two exist in code — three data points beats an abstract spec.
Closed themes¶
None yet. As themes complete, they move into this section with the date and the PR references that closed them.