feat: add /gloop goal-loop command by Avanderheyde · Pull Request #2065 · garrytan/gstack

Avanderheyde · 2026-06-21T00:16:04Z

What

Adds /gloop — a goal-loop orchestrator skill. You hand it a goal; it drives that goal to a PR-ready state through gstack's own plan → review → implement → review loop, instead of jumping straight from a vague sentence to code.

The problem it solves: a plain /goal loop (Codex, or vanilla Claude) takes a one-line goal, jumps to implementation, then loops on its own output with no independent check. It optimizes for "the tests I wrote pass" and ships scope nobody approved. /gloop refuses that shape — every goal passes through gstack's planning and review skills, and an independent /review runs after every implementation pass.

The loop

| Phase | What it does |
| --- | --- |
| 0 — Frame | Restate the goal, extract constraints, convert it into verifiable success criteria. If too vague, ask or offer `/office-hours` / `/spec`. |
| 1 — Plan | Create/update a plan artifact (or load an existing `/spec` issue or design doc). |
| 2 — Plan review (HARD GATE) | Run `/autoplan` (full) or `/plan-eng-review` (focused) before any code. Drives them via the `{{INVOKE_SKILL}}` resolver. |
| 2.5 — Autonomous reconciliation | Review/evaluation findings are folded into the plan automatically. It asks only for missing/contradictory objectives, credentials/access, destructive or externally irreversible actions, or proven impossibility. |
| 3 — Implement | One scoped milestone per pass. Entry-guarded on `plan_review: done`. |
| 4 — Tests | Read the test command from `CLAUDE.md` (platform-agnostic), run it, add tests for the new behavior. |
| 5 — Review | `/review` on the diff — an independent reviewer, not the context that wrote the code. Re-runs tests if review auto-fixes anything. |
| 6 — Loop decision | DONE / NEXT PASS / BLOCKED, against a durable pass counter + convergence guard, so it stops instead of spinning. |
| 7 — Stop | PR-ready summary and hand-off to `/ship`, or an exact blocker for `--resume`. |

/gloop ends at PR-ready; it does not merge or deploy.
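
To make the Phase 6 decision concrete, here is a minimal TypeScript sketch of how DONE / NEXT PASS / BLOCKED could be derived from a pass counter and a convergence guard. The type names, the `maxPasses` cap, and the "no new criteria satisfied since the previous pass" convergence rule are assumptions for illustration; the skill itself is prose in gloop/SKILL.md, not this code.

```typescript
// Hypothetical shapes: field names are illustrative, not taken from gloop/SKILL.md.
type Decision = "DONE" | "NEXT_PASS" | "BLOCKED";

interface PassResult {
  criteriaMet: number;   // success criteria satisfied after this pass
  criteriaTotal: number; // total success criteria framed in Phase 0
  testsPass: boolean;    // Phase 4 outcome
  reviewClean: boolean;  // Phase 5 left no open findings
}

interface LoopState {
  passCount: number;       // durable counter, including the pass just finished
  maxPasses: number;       // assumed upper bound on passes
  lastCriteriaMet: number; // criteria satisfied at the end of the previous pass
}

function decide(state: LoopState, result: PassResult): Decision {
  const allMet = result.criteriaMet >= result.criteriaTotal;
  if (allMet && result.testsPass && result.reviewClean) return "DONE";

  // Max-pass guard: stop once the budget is spent.
  if (state.passCount >= state.maxPasses) return "BLOCKED";

  // Convergence guard (assumed rule): a pass that satisfies no additional
  // criteria compared with the previous pass counts as spinning.
  if (state.passCount > 1 && result.criteriaMet <= state.lastCriteriaMet) {
    return "BLOCKED";
  }

  return "NEXT_PASS";
}
```

Checking DONE first means a goal that converges on the final allowed pass is still reported as complete rather than blocked.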

How it's wired in

gstack skill discovery is directory-driven, so the skill lives in gloop/SKILL.md.tmpl, from which bun run gen:skill-docs generates gloop/SKILL.md. The loop drives sibling skills with the canonical {{INVOKE_SKILL:autoplan|plan-eng-review|review}} resolver (reads and follows their SKILL.md, skipping preamble). The skill sets preamble-tier: 3, matching its sibling orchestrator /autoplan. It is wired into the root routing catalog, README.md, docs/skills.md, AGENTS.md, and the test coverage registry (SKILL_COVERAGE in test/skill-coverage-matrix.ts); gstack/llms.txt and scripts/proactive-suggestions.json were regenerated.
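
As an illustration of what "directory-driven" means here, the sketch below derives skill names from a list of repo-relative paths. It assumes a skill is any top-level directory containing SKILL.md.tmpl; the function name `discoverSkills` is hypothetical, and the real logic in gen:skill-docs may differ.

```typescript
// Assumption: a skill is a top-level directory that contains SKILL.md.tmpl.
// The root-level SKILL.md.tmpl (the routing catalog) is not itself a skill.
function discoverSkills(repoPaths: string[]): string[] {
  const skills = new Set<string>();
  for (const path of repoPaths) {
    const parts = path.split("/");
    if (parts.length === 2 && parts[1] === "SKILL.md.tmpl") {
      skills.add(parts[0]);
    }
  }
  return Array.from(skills).sort();
}

// Example input: only directories holding a template count as skills.
const found = discoverSkills([
  "SKILL.md.tmpl",
  "gloop/SKILL.md.tmpl",
  "gloop/SKILL.md",
  "autoplan/SKILL.md.tmpl",
  "docs/skills.md",
]);
console.log(found);
```

Under this assumption, adding the gloop/ directory with its template is enough for the skill to be picked up; no central list needs editing for discovery itself.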

Tests

New structural test "gloop skill structure" in test/skill-validation.test.ts (matches the investigate/office-hours pattern). It pins the full Phase 0–7 loop, the anti-pattern banner and hard gate, that the loop drives /autoplan, /plan-eng-review, and /review (and that the {{INVOKE_SKILL}} blocks actually resolved, leaving no unresolved {{placeholders}}), autonomous reconciliation except for true objective blockers, the durable plan_review/--resume state, the loop guards, and the /ship hand-off.
Dynamic coverage: gen-skill-docs.test.ts and skill-coverage-matrix.test.ts discover skills by directory scan, so /gloop is auto-covered by registry, freshness, description-length, structural-compliance, and placeholder-resolution checks. Latest focused suite: 756 pass / 0 fail.
bun run slop:diff — no new findings in any changed file.
Generated docs regenerated for all hosts; the Codex/Factory freshness gates regenerate gitignored host dirs fresh.
The full bun test run also surfaces 4 pre-existing, environmental failures (3 brain-cache-* + 1 user-slug-fallback) that touch real ~/.gstack cache/config state. They are unrelated to this branch: the complete branch diff is gloop + docs + two test-registration files, none of which the failing tests import or exercise, and they reproduce in isolation against the same (unchanged) source.
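
The placeholder-resolution check mentioned above can be pictured with a small sketch. This is not the assertion from test/skill-validation.test.ts; it assumes that any `{{...}}` token left in the generated SKILL.md counts as unresolved, and `findUnresolvedPlaceholders` is a hypothetical helper name.

```typescript
// Assumption: any "{{...}}" token remaining in generated output is an
// unresolved template placeholder.
function findUnresolvedPlaceholders(generated: string): string[] {
  const matches = generated.match(/\{\{[^{}]+\}\}/g);
  return matches ?? [];
}

// A resolved block reads as plain instructions, so nothing is reported.
const resolved = "Read autoplan/SKILL.md and follow it, skipping the preamble.";
// An unresolved block still carries the raw template token.
const unresolved = "Run {{INVOKE_SKILL:review}} after each pass.";

console.log(findUnresolvedPlaceholders(resolved));
console.log(findUnresolvedPlaceholders(unresolved));
```

A structural test could then assert that the returned list is empty for the generated gloop/SKILL.md.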

Review

Reviewed by an independent Claude eng-review subagent and by Codex (codex exec) as a final outside voice. Valid findings were applied before commit: explicit plan_review: pending|done state marker so --resume can't mistake a chosen review path for a completed one; durable pass counter read from the state file (guards survive resume); re-run tests after review auto-fixes before declaring DONE; autonomous Phase 2.5 reconciliation so /gloop does not stop for ordinary review/evaluation findings; flag precedence. One Codex finding (persisting the test command to CLAUDE.md) was consciously rejected — that is gstack's documented platform-agnostic convention, followed by /qa, /ship, and others.
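
The state-marker findings can be sketched as follows. The snippet assumes a plain-text state file with one `key: value` pair per line; `plan_review: pending|done` comes from the description above, while the `pass_count` key, the type names, and both function names are hypothetical.

```typescript
type PlanReview = "pending" | "done";

interface GloopState {
  planReview: PlanReview;
  passCount: number;
}

// Parse the durable state. Missing or malformed values fall back to the
// safe defaults (review not done, zero passes), so a resume never skips the gate.
function parseState(text: string): GloopState {
  const review = /^plan_review:\s*(pending|done)\s*$/m.exec(text);
  const pass = /^pass_count:\s*(\d+)\s*$/m.exec(text); // hypothetical key
  return {
    planReview: review ? (review[1] as PlanReview) : "pending",
    passCount: pass ? parseInt(pass[1], 10) : 0,
  };
}

// Phase 3 entry guard: implementation may start only after plan review is done.
function canImplement(state: GloopState): boolean {
  return state.planReview === "done";
}

const resumed = parseState("plan_review: pending\npass_count: 2");
console.log(resumed, canImplement(resumed));
```

Keeping the marker binary (pending or done) is what prevents --resume from treating a merely chosen review path as a completed one.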

Not in scope

VERSION / CHANGELOG are intentionally left to /ship per the repo's convention (CHANGELOG entries are written at ship time, and version-gate.yml is path-filtered to those files). No compiled binaries touched.

🤖 Generated with Claude Code

/gloop drives a goal to a PR-ready state through gstack's own plan -> review -> implement -> review loop, instead of jumping straight from a vague goal to code the way a plain /goal loop does.

The skill:
- frames the goal into verifiable success criteria (Phase 0)
- writes/updates a plan artifact (Phase 1)
- runs a plan review BEFORE any code — /autoplan or /plan-eng-review via {{INVOKE_SKILL}} — as a hard gate (Phase 2)
- surfaces major scope changes at a user-challenge gate instead of auto-deciding them (Phase 2.5)
- implements one scoped milestone per pass (Phase 3)
- runs the project's tests (Phase 4) and /review on the diff (Phase 5)
- loops with durable state (pass count + plan_review marker) and guards (max passes, convergence) so it stops instead of spinning (Phase 6)
- ends at PR-ready and hands off to /ship (Phase 7)

Discovery is directory-driven (gloop/SKILL.md.tmpl -> generated gloop/SKILL.md). Wires the skill into the routing catalog (SKILL.md.tmpl), the README and docs/skills.md skill tables, AGENTS.md, and the test coverage registry (SKILL_COVERAGE in test/skill-coverage-matrix.ts). Generated outputs (SKILL.md, llms.txt, proactive-suggestions.json) regenerated via gen:skill-docs. preamble-tier 3, matching the autoplan orchestrator sibling.

Structural tripwire for the new /gloop skill, matching the investigate/office-hours pattern. Asserts the full Phase 0-7 loop is present, the goal-straight-to-code anti-pattern banner and hard gate exist, the loop drives /autoplan, /plan-eng-review, and /review (and that those {{INVOKE_SKILL}} blocks actually resolved to skill-file reads, with no unresolved {{placeholders}}), the user-challenge gate, the durable plan_review/--resume state, the loop guards, and the PR-ready /ship hand-off.

trunk-io · 2026-06-21T00:16:08Z

Merging to main in this repository is managed by Trunk.

To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

not0xjarvis added 2 commits June 20, 2026 17:35

fix: make gloop autonomous by default

2553e74

Avanderheyde force-pushed the feat/gloop-goal-command branch from 5797c81 to 2553e74 Compare June 21, 2026 21:09

Avanderheyde wants to merge 3 commits into garrytan:main from Avanderheyde:feat/gloop-goal-command
