Fix flat reader subrange decode reuse #8596

Open
lukekim wants to merge 2 commits into vortex-data:develop from spiceai:lukim/8587-regression

@lukekim lukekim commented Jun 25, 2026

Contributor

Fixes #8587.

Summary

  • Memoize FlatReader's decoded array future so synthetic subrange scans share the decoded flat segment instead of issuing repeated decode work.
  • Add regression coverage for the query patterns that regressed after perf(scan): intra-file decode parallelism — sub-split large chunk spans #8400, including projection-only, filter-only, filtered projection, computed projection, string filtered projection, and string filtered computed projection cases.
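The memoization idea in the first bullet can be sketched in a simplified, synchronous form. This is an illustration only, not the code in this PR: the actual change memoizes a decoded array future, whereas the sketch below substitutes `std::sync::OnceLock` for the shared future. `MemoizedSegment`, `read_range`, and `decode_calls` are hypothetical names, and the byte-widening "decode" is a placeholder for real decode work.

```rust
use std::ops::Range;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a flat-layout reader; not the real `FlatReader`.
struct MemoizedSegment {
    /// Encoded bytes of the whole flat segment.
    encoded: Vec<u8>,
    /// Decoded form, filled in on first use and shared afterwards.
    decoded: OnceLock<Arc<Vec<u32>>>,
    /// Counts how often the decode closure actually ran.
    decode_calls: AtomicUsize,
}

impl MemoizedSegment {
    fn new(encoded: Vec<u8>) -> Self {
        Self {
            encoded,
            decoded: OnceLock::new(),
            decode_calls: AtomicUsize::new(0),
        }
    }

    /// Decode the whole segment at most once and hand out shared handles.
    fn decoded(&self) -> Arc<Vec<u32>> {
        Arc::clone(self.decoded.get_or_init(|| {
            self.decode_calls.fetch_add(1, Ordering::Relaxed);
            // Placeholder decode: widen each byte to a u32.
            Arc::new(self.encoded.iter().map(|&b| u32::from(b)).collect())
        }))
    }

    /// Serve a row subrange by slicing the shared decoded segment.
    fn read_range(&self, rows: Range<usize>) -> Vec<u32> {
        self.decoded()[rows].to_vec()
    }

    fn decode_calls(&self) -> usize {
        self.decode_calls.load(Ordering::Relaxed)
    }
}

fn main() {
    let segment = MemoizedSegment::new((0u8..8).collect());

    // Two subrange reads over the same segment, as a sub-split scan would issue.
    let first = segment.read_range(0..4);
    let second = segment.read_range(4..8);

    assert_eq!(first, vec![0, 1, 2, 3]);
    assert_eq!(second, vec![4, 5, 6, 7]);
    // The decode closure ran once even though two subranges were read.
    assert_eq!(segment.decode_calls(), 1);
}
```

Without the `OnceLock`, each `read_range` call would decode the full segment again, which is the kind of repeated work the first bullet describes for synthetic subrange scans.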

Validation

  • cargo nextest run -p vortex-layout -E 'test(layouts::flat::reader)'
  • cargo nextest run -p vortex-layout
  • cargo clippy -p vortex-layout --all-targets --all-features
  • git diff --check
  • cargo bench --workspace
  • Re-ran SQL benches with /opt/homebrew/bin/uv 0.11.24; core suites passed: Appian, TPCH, TPCDS, ClickBench, ClickBench sorted, FineWeb, and GH Archive via direct binary rerun. PolarSignals/StatPopGen exposed pre-existing benchmark-definition/runtime backend failures, and bare Public BI requires --opt dataset=<name>.

Signed-off-by: Luke Kim <80174+lukekim@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

perf: Subsplitting large chunks causes some regression for vortex-compact for some benchmarks
