perf(json): pooled-buffer JSON deserialize#3023
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces two performance-oriented changes in Echo’s core hot paths: JSON deserialization and router static-child lookup. It also adjusts/extends benchmarks and updates tests to reflect the stricter JSON parsing semantics.
Changes:
- Reworked
DefaultJSONSerializer.Deserializeto read into a pooledbytes.Bufferand decode viajson.Unmarshal. - Optimized router static-child lookup by scanning a cache-friendly
scLabels []byteparallel slice and centralized leaf recomputation viarefreshLeaf(). - Hardened/added benchmarks and updated bind/tests for the new JSON error semantics and behaviors.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| router.go | Adds scLabels for faster static child lookup and introduces refreshLeaf(); updates insert/remove paths to keep state in sync. |
| json.go | Switches JSON deserialization to pooled-buffer + json.Unmarshal to reduce per-request allocations. |
| perf_bench_test.go | Avoids request rebuild inside benchmark loops; adds focused JSON serialize/deserialize benchmarks. |
| json_test.go | Adds tests for trailing-data rejection and pooled-buffer reuse/concurrency/cap behaviors. |
| bind_test.go | Updates expected JSON parse error message for truncated input (unexpected end of JSON input). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // refreshLeaf recomputes whether the node is a leaf (has no children of any | ||
| // kind). len() is used for staticChildren so it stays correct whether the slice | ||
| // is nil or an emptied-but-non-nil slice left behind after a removal. | ||
| func (n *node) refreshLeaf() { | ||
| n.isLeaf = len(n.staticChildren) == 0 && n.paramChild == nil && n.anyChild == nil |
There was a problem hiding this comment.
Good catch — fixed in 4dae923. newNode now computes isLeaf with len(sc) == 0 instead of sc == nil, so an empty-but-non-nil staticChildren slice (which a prior Remove can splice down to length 0 on a handler node) is treated as no children, consistent with refreshLeaf.
For the record the routing impact was benign — the split branch only ever runs on static nodes, and a static node's isLeaf is not branched on in the Route hot path (only param nodes consult it) — but the inconsistency was real and is now removed.
| buf := jsonBufPool.Get().(*bytes.Buffer) | ||
| buf.Reset() | ||
| defer func() { | ||
| // Do not return oversized buffers to the pool — they would pin memory. | ||
| if buf.Cap() <= maxPooledJSONBuf { | ||
| jsonBufPool.Put(buf) | ||
| } | ||
| }() | ||
| if _, err := buf.ReadFrom(c.Request().Body); err != nil { | ||
| return ErrBadRequest.Wrap(err) | ||
| } | ||
| if err := json.Unmarshal(buf.Bytes(), target); err != nil { |
There was a problem hiding this comment.
Documented in 4dae923 rather than adding a streaming fallback.
Rationale for not doing the size-aware hybrid: the previous json.NewDecoder path also fully materializes any valid large JSON, so it does not bound memory either — the only real delta is a malformed-early, very large body (ReadFrom reads it all before Unmarshal rejects it). A small-vs-large hybrid would reintroduce the per-request decoder allocation this PR removes (the -69% B/op win) and add branching, while large valid bodies would still allocate fully.
The correct guard for untrusted input is middleware.BodyLimit (or http.MaxBytesReader), which makes the buf.ReadFrom here fail fast once the limit is exceeded — unchanged by this PR. I added a doc note on Deserialize recommending exactly that.
- router: newNode computes isLeaf with len(sc)==0 instead of sc==nil so an empty-but-non-nil staticChildren slice (which a prior Remove can leave behind) is treated as no children, staying consistent with refreshLeaf. - json: document that Deserialize reads the full request body into memory and recommend middleware.BodyLimit / http.MaxBytesReader for untrusted input, which makes the body read fail fast once the limit is exceeded. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| // len() (not == nil) so an empty-but-non-nil sc — e.g. a slice a prior | ||
| // Remove spliced down to length 0 — is still treated as no children, | ||
| // matching refreshLeaf. |
There was a problem hiding this comment.
Fixed in 8f373dd — reworded to "a slice from a prior Remove that was spliced down to length 0".
|
@vishr , could you order LLM to do these improvements in separate PRs. |
aldas
left a comment
There was a problem hiding this comment.
please split this PR to 2 separate PRs. It is easier to understand what LLM has add for the feature that way.
DefaultJSONSerializer.Deserialize used json.NewDecoder(body).Decode(), which allocates a decoder and its internal read buffer on every JSON request. Read the body into a capped pooled buffer and decode with json.Unmarshal instead; Unmarshal does not retain the input slice, so the buffer is safe to reuse. The pool drops oversized buffers (>64 KiB) so a large body cannot pin memory. BenchmarkBind_JSON: 1008 B -> 312 B/op (-69%), 9 -> 7 allocs, ~12% faster. Behavioral note: json.Unmarshal is stricter than Decode — it rejects trailing data after the JSON value and reports "unexpected end of JSON input" for truncated bodies (both still 400 Bad Request). Two bind tests asserting the old "unexpected EOF" message are updated to match. Also add a general ServeHTTP/JSON benchmark suite (perf_bench_test.go): BenchmarkBind_JSON now builds the request once and resets a reusable body instead of calling httptest.NewRequest inside the loop (which dominated the old measurement), plus BenchmarkJSONSerialize/Deserialize and the routing benchmarks used to measure both this and the companion router PR. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add correctness coverage for the new pooled-buffer JSON deserializer, addressing gaps surfaced in review: - RejectsTrailingData: documents that json.Unmarshal rejects content after the first top-level value (a behavior change from streaming json.Decoder). - PooledBufferReuse: long body followed by a short one must not leak stale bytes through the reused buffer. - PooledBufferConcurrent: many goroutines decoding distinct bodies; under -race this catches any aliasing/missing-reset regression (data bleed). - LargeBodyThenNormal: exercises the >64 KiB buffer-cap path and that the oversized buffer does not corrupt the next request. - BodyReadError: a failing body read surfaces as 400, matching prior behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
8f373dd to
270f040
Compare
|
Done, @aldas — split into two PRs as you asked. This one (#3023) is now JSON-only (pooled-buffer |
Per @aldas's request, this PR is now JSON-only. The router change moved to #3024.
What
DefaultJSONSerializer.Deserializeusedjson.NewDecoder(body).Decode(), which allocates a decoder and its internal read buffer on every JSON request. It now reads the body into a capped pooled buffer and decodes withjson.Unmarshal;Unmarshaldoes not retain the input slice, so the buffer is safe to reuse. The pool drops oversized buffers (>64 KiB) so a single large body cannot pin memory.Numbers
BenchmarkBind_JSON: 1008 → 312 B/op (-69%), 9 → 7 allocs, ~12% faster.Behavioral note
json.Unmarshalis stricter than streamingDecode— it rejects trailing data after the first top-level value and reports"unexpected end of JSON input"for truncated bodies (both still 400 Bad Request). Two bind tests asserting the old"unexpected EOF"message are updated to match. Covered by new tests injson_test.go(trailing-data rejection, pooled-buffer reuse/concurrency under-race, the >64 KiB cap path, and body-read errors).DoS note (from review)
The full body is read into memory before decoding. The previous
json.Decoderpath also fully materialized any valid large body, so the only real delta is a malformed-early huge body. The correct guard for untrusted input ismiddleware.BodyLimit/http.MaxBytesReader, which makes the read here fail fast — documented onDeserialize.This PR also adds the general ServeHTTP/JSON benchmark harness (
perf_bench_test.go) used to measure both this and #3024.🤖 Generated with Claude Code