Fix the Claude-engine log parser in Avenger — it fails the agent job on every scheduled run even though the agent actually executes. Avenger has failed 3/3 scheduled runs in the last 6h at the Parse agent logs for step summary step with ERR_CONFIG: Claude execution failed: no structured log entries were produced, while the same config succeeded earlier the same day — this is an intermittent parser/engine defect, not a workflow-logic bug.
Parent: #39883 (6h failure investigation report).
Problem statement
The Avenger agent job is marked failure at the Parse agent logs for step summary step, which raises:
##[error]ERR_CONFIG: Claude execution failed: no structured log entries were produced. This usually indicates a startup or configuration error before tool execution.
But the agent did run: audit of run 27783264854 shows turns=2, a Read tool call, and a substantive noop output ("No fixable mechanical issues found... investigated CI failure 27782333344..."). So the engine produced agent output yet the parser reported zero structured entries and failed the job. The error fires after successful agent execution.
Affected workflow and run IDs
- Workflow:
Avenger (.github/workflows/avenger.lock.yml), engine claude / Claude Code 2.1.179, trigger schedule.
- Failed runs (all at
Parse agent logs for step summary):
- Nearest successful baseline (same config):
27756907485 — 11:40Z (cohort_match, success). → intermittent, not a permanent regression.
Probable root cause
The Claude engine intermittently writes an agent-stdio.log that the structured-log parser cannot decode into JSONL entries (non-structured / free-text or partially-written output on these runs). The parse step treats "zero structured entries" as a hard ERR_CONFIG and fails the whole agent job — even when the agent demonstrably executed and produced a safe output (noop). Likely a startup/output-format race in the Claude engine specific to this run shape, surfaced by an over-strict parser.
Proposed remediation
- Make the parser non-fatal when the agent actually ran. If structured entries are empty but
agent-stdio.log is non-empty and/or a safe-output (incl. noop) was produced, downgrade ERR_CONFIG to a warning instead of failing the agent job.
- Preserve evidence: always upload the raw
agent-stdio.log on this failure path so "engine produced nothing" can be distinguished from "engine produced non-structured output."
- Investigate the engine side: check Claude Code
2.1.179 startup on Avenger for an intermittent pre-tool init error / unflushed structured-log stream.
Success criteria / verification
- Avenger scheduled runs no longer fail at
Parse agent logs for step summary when the agent executed and emitted output.
ERR_CONFIG only fires when the agent produced no output and no safe outputs.
- Add a regression unit test for the parser covering the "non-empty but non-structured stdio" case.
Correlation note
Distinct from the tracked Copilot CLI failures (#39946, #40074) — those are a different engine/signature. This is the only untracked cluster in the 6h window; the Copilot cluster (5 runs: SPDD, Daily Issues Report, Blog Writer, PR Code Quality, Doc Unbloat) remains covered by #39946/#40074, and no existing tracking issue showed evidence of being fixed (so none were closed). Filed as a sub-issue per #40071 (avoid re-filing tracked problems).
References:
Generated by 🔍 [aw] Failure Investigator (6h) · ◷
Fix the Claude-engine log parser in Avenger — it fails the agent job on every scheduled run even though the agent actually executes. Avenger has failed 3/3 scheduled runs in the last 6h at the
Parse agent logs for step summarystep withERR_CONFIG: Claude execution failed: no structured log entries were produced, while the same config succeeded earlier the same day — this is an intermittent parser/engine defect, not a workflow-logic bug.Parent: #39883 (6h failure investigation report).
Problem statement
The Avenger agent job is marked
failureat theParse agent logs for step summarystep, which raises:But the agent did run:
auditof run27783264854showsturns=2, aReadtool call, and a substantive noop output ("No fixable mechanical issues found... investigated CI failure 27782333344..."). So the engine produced agent output yet the parser reported zero structured entries and failed the job. The error fires after successful agent execution.Affected workflow and run IDs
Avenger(.github/workflows/avenger.lock.yml), engineclaude/ Claude Code2.1.179, triggerschedule.Parse agent logs for step summary):ERR_CONFIG, agent emitted noop (2 turns)27756907485— 11:40Z (cohort_match, success). → intermittent, not a permanent regression.Probable root cause
The Claude engine intermittently writes an
agent-stdio.logthat the structured-log parser cannot decode into JSONL entries (non-structured / free-text or partially-written output on these runs). The parse step treats "zero structured entries" as a hardERR_CONFIGand fails the whole agent job — even when the agent demonstrably executed and produced a safe output (noop). Likely a startup/output-format race in the Claude engine specific to this run shape, surfaced by an over-strict parser.Proposed remediation
agent-stdio.logis non-empty and/or a safe-output (incl. noop) was produced, downgradeERR_CONFIGto a warning instead of failing the agent job.agent-stdio.logon this failure path so "engine produced nothing" can be distinguished from "engine produced non-structured output."2.1.179startup on Avenger for an intermittent pre-tool init error / unflushed structured-log stream.Success criteria / verification
Parse agent logs for step summarywhen the agent executed and emitted output.ERR_CONFIGonly fires when the agent produced no output and no safe outputs.Correlation note
Distinct from the tracked Copilot CLI failures (#39946, #40074) — those are a different engine/signature. This is the only untracked cluster in the 6h window; the Copilot cluster (5 runs: SPDD, Daily Issues Report, Blog Writer, PR Code Quality, Doc Unbloat) remains covered by #39946/#40074, and no existing tracking issue showed evidence of being fixed (so none were closed). Filed as a sub-issue per #40071 (avoid re-filing tracked problems).
References:
Related to [aw-failures] [aw] Failure Investigation Report — 6h window (2026-06-17 19:34 UTC) #39883