fix: support 64K alignment#8599
Conversation
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
robert3005
left a comment
There was a problem hiding this comment.
This is one chonky alignment
Merging this PR will not alter performance
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | copy_nullable[65536] |
1 ms | 1.4 ms | -24.26% |
| ❌ | Simulation | copy_non_nullable[65536] |
908.6 µs | 1,089.3 µs | -16.58% |
| ❌ | Simulation | slice_empty_vortex |
310 ns | 368.6 ns | -15.9% |
| ❌ | Simulation | chunked_varbinview_canonical_into[(100, 100)] |
224.4 µs | 259.5 µs | -13.55% |
| ❌ | Simulation | map_each[BufferMut<i32>, 128] |
407.8 ns | 466.1 ns | -12.51% |
| ❌ | Simulation | chunked_varbinview_into_canonical[(100, 100)] |
271.5 µs | 306.4 µs | -11.38% |
| ⚡ | Simulation | chunked_bool_canonical_into[(1000, 10)] |
26.8 µs | 16.1 µs | +66.54% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] |
244.4 ns | 186.1 ns | +31.34% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[1024] |
304.7 ns | 246.4 ns | +23.68% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] |
205.8 µs | 168.9 µs | +21.85% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[2048] |
398.6 ns | 340.3 ns | +17.14% |
| ⚡ | Simulation | true_count_arrow_buffer[128] |
784.7 ns | 697.2 ns | +12.55% |
| ⚡ | Simulation | rebuild_naive |
109.5 µs | 98.8 µs | +10.79% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing aduffy/allow-64k (5bec64a) with develop (3c53111)
Footnotes
-
4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
|
take it up with Jensen |
Rationale for this change
Our
Alignmenttype currently has a restriction that it fits in au16. This is easy enough to satisfy most of the time.However, while doing some testing with GH200 machines, which are ARM and which NVIDIA recommends you setup with 64K page alignment, I have seen some production code fail because it uses a custom allocator that passes back page-aligned memory, and attempting to construct 64K alignment is one past the maximum value so it panics.
What changes are included in this PR?
Alignment > u16::MAX, now you can represent 64K alignment on buffersWhat APIs are changed? Are there any user-facing changes?
Alignmentnow has no restriction that it must fit into au16, so some code which may have previously panicked will no longer panic.The infallable
From<Alignment> for u16impl has been removed entirely, there were zero callers within Vortex. This is technically a breaking change and anyone who was relying on this behavior should switch to using the exponent instead which will fit inu8.