Add code-compaction skill for shrinking bloated diffs #19954
Distilled from 14 days of bloat-related sessions on dotnet/fsharp.
Stress-tested by 4 routing ducks (Opus 4.8 / 4.7-high / 4.7-xhigh /
GPT 5.5) on 10 real-user scenarios + 3 garbage-hunt ducks (Opus 4.6 /
GPT 5.4 / Gemini 3.1 Pro) applied to the skill itself. Cross-model
consensus drove both the routing fixes and the deletion pass.
STRUCTURE
.github/skills/code-compaction/
  SKILL.md                (122 lines: mode selector + handoff)
  templates/
    garbage-hunt.md       (73 lines: deletion-only reviewers)
    test-compaction.md    (54 lines: parametrization + helpers)
    logic-rounds.md       (239 lines: 3-round adversarial)
  488 lines total
KEY DESIGN DECISIONS
- Mode-based skill, not monolithic. 'This PR is slop' usually means
multiple surfaces; the selector picks Garbage Hunt / Test
Compaction / Logic Compaction or chains out to pr-description /
reviewing-compiler-prs.
- Multiple-mode runs are ordered Garbage -> Test -> Logic and SKIP
any mode whose surface is empty (a routing-duck finding: the prior
'always run all three' rule wasted Test rounds).
- Logic Compaction has a dedicated file-pressure / extract-to-new-file
angle (4/4 routing ducks flagged this as missing from the selector).
- Cross-MODEL diversity over angle diversity. Three same-model agents
are NOT a substitute. If the user requests a same-model setup, push
back once then comply.
- Defers substantive rules to NoBloat.instructions.md; chains GitHub
prose to pr-description; chains compiler review to
reviewing-compiler-prs.
- Description focuses on the user's actual trigger vocabulary (slop,
bullshit, WTF, crap, rubbish, embarrassing, LLM slop, not paid by
LOC, rethink from first principles, huge-file growth) and avoids
summarizing the workflow.
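A minimal sketch of the ordering-and-skip rule from the decisions above, assuming each mode's "surface" has already been reduced to a count. The function name `plan_modes`, its three integer arguments, and the mode labels are hypothetical, not taken from SKILL.md:

```bash
# Hypothetical helper: emit the modes to run, in the fixed order
# Garbage -> Test -> Logic, skipping any mode whose surface count is 0.
# $1 = garbage surface, $2 = test surface, $3 = logic surface (all integers).
plan_modes() {
  plan=""
  if [ "$1" -gt 0 ]; then plan="$plan garbage-hunt"; fi
  if [ "$2" -gt 0 ]; then plan="$plan test-compaction"; fi
  if [ "$3" -gt 0 ]; then plan="$plan logic-compaction"; fi
  # Strip the leading space left by the concatenation above.
  echo "${plan# }"
}

plan_modes 12 0 3   # prints: garbage-hunt logic-compaction
```

With an empty test surface the Test round is simply never scheduled, which is the behaviour the routing ducks asked for.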
VALIDATION
- SKILL.md 122 lines, frontmatter 741 chars, description 706 chars
(well under Anthropic's 1024-char description / 500-line SKILL.md limits).
- Templates 73 / 54 / 239 lines.
- Repo-agnostic: zero language-specific module/type/PR references in
SKILL.md.
- References one level deep.
- Third-person description, starts with 'Use when', no workflow
summary leaked into the description.
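The size checks above could be scripted roughly as follows. This is a sketch only: `check_skill` is a hypothetical name, it assumes the frontmatter carries `description:` on a single line, and it reads the 1024 / 500 figures quoted above as a description-character limit and a line limit respectively:

```bash
# Hypothetical check: print the SKILL.md line count and description length,
# and return non-zero if either exceeds the assumed limits.
check_skill() {
  file=$1
  lines=$(wc -l < "$file" | tr -d ' ')
  # Take the first line starting with "description:" and measure its value.
  desc_len=$(awk '/^description:/ { sub(/^description:[ \t]*/, ""); print length($0); exit }' "$file")
  desc_len=${desc_len:-0}
  echo "lines=$lines description=$desc_len"
  [ "$lines" -le 500 ] && [ "$desc_len" -le 1024 ]
}
```

Usage would look like `check_skill .github/skills/code-compaction/SKILL.md`, with the exit status usable as a CI gate.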
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
✅ No release notes required
The skill-validation check requires every agent to carry a name field; this one was missing it, failing the check on main.
T-Gro
left a comment
🤖 This review was generated by AI (@expert-reviewer agent). Findings may contain inaccuracies — please verify independently.
This PR adds documentation-only skill files (markdown + embedded shell snippets); there is no F# product code to assess for type-checking, IL, or API impact. The process design reads well. One substantive issue in an embedded measurement snippet is noted inline.
```bash
git diff $base -- <suspect-file> | awk '/^\+[^+]/{p++} /^-[^-]/{m++} END{print "+",p," -",m}'
git diff $base -- <suspect-file> | awk '/^\+\s*(\/\/|--|#)/{c++} /^\+[^+\s]/{l++} END{print "comments",c," code",l}'
```
The awk comment-ratio smell-meter never actually separates comments from code. Inside a bracket expression, [^+\s] is not "non-whitespace": the \s shorthand is not honored inside [...] (even in gawk), so it means "any char except +, \, s". Since every indented diff line is + followed by a space, that space matches [^+\s] and l++ fires for essentially every non-empty added line — including comment lines, which also increment c.
Verified with gawk 5.4: feeding 1 comment line + 4 code lines (all indented) prints comments 1 code 5. The comment is double-counted and code is really just "all non-empty added lines", so any comments/code ratio derived from this understates the comment bloat the meter is meant to surface.
Fix — strip the + and leading whitespace first, then classify by the first real character (also avoids the bracket/\s portability trap):
git diff $base -- <suspect-file> \
  | awk '/^\+[^+]/{ s=$0; sub(/^\+[[:space:]]*/,"",s); if (s ~ /^(\/\/|--|#)/) c++; else if (length(s)) l++ } END{print "comments",c," code",l}'
Fixed in ad9320e. Applied your suggested approach: strip the leading + and whitespace first, then classify by the first real character. Verified with gawk 5.4 — feeding 1 comment + 4 code lines (all indented) now prints comments 1 code 4 instead of the previous comments 1 code 5. Thanks for the catch on the \s-in-bracket-expression portability trap.
The bracket expression [^+\s] does not treat \s as a whitespace shorthand, so the leading space of every indented diff line matched and inflated the code count (also double-counting comment lines). Strip the + and leading whitespace first, then classify by the first real char.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
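The check described in this thread (1 comment line + 4 code lines, all indented) can be reproduced with a small harness. This is a sketch under assumptions: the function name `count_added` and the sample lines are made up for illustration, and the `END` block differs slightly from the snippet above (`c + 0` / `l + 0` so empty input prints zeros, and single-space separators):

```bash
# Strip-then-classify counter from the review suggestion, wrapped in a
# function so it can be fed a unified diff on stdin.
count_added() {
  awk '/^\+[^+]/ {
         s = $0
         sub(/^\+[[:space:]]*/, "", s)   # drop the diff marker and indentation
         if (s ~ /^(\/\/|--|#)/) c++     # first real characters open a comment
         else if (length(s)) l++         # any other non-empty line is code
       }
       END { print "comments", c + 0, "code", l + 0 }'
}

# Made-up sample: a file header, one indented comment, four indented code lines.
printf '%s\n' \
  '+++ b/sample.fs' \
  '+    // explain the helper' \
  '+    let a = 1' \
  '+    let b = 2' \
  '+    let c = a + b' \
  '+    printfn "%d" c' \
  | count_added
# prints: comments 1 code 4
```

The `+++ b/sample.fs` header is ignored because the pattern requires a non-`+` second character, so only real added lines are classified.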
A process skill for code/test diffs flagged as bloated, slop, or
overengineered. It measures the diff, routes it to garbage-hunt,
test-compaction, or logic-reuse modes, and runs cross-model adversarial
reviewers to produce a smaller, higher-reuse patch — favouring DRY logic,
parametrized tests, and reuse of existing helpers over fresh duplication.
The substantive style rules it enforces (bloated comments, test bloat, file
size, "Not paid by LOC") stay in NoBloat.instructions.md; the skill is the
process that applies them, not a second copy.