GH-50240: [C++] Make IPC message decoding stricter#50235

Merged
kou merged 1 commit into
apache:mainfrom
pitrou:exp-stricter-message-decoding
Jun 24, 2026
@pitrou pitrou commented Jun 22, 2026

Member

Rationale for this change

An IPC file has a footer listing the exact locations in the file of the various IPC messages, such as RecordBatch messages.

However, we currently don't notice if a message size advertised in the IPC footer is larger than the actual serialized message size, so we happily accept such invalid IPC files.
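To make the mismatch concrete, here is a minimal standalone sketch of the kind of consistency check described above. It does not use Arrow's actual classes: `FooterBlock`, `LoadU32` and `CheckBlock` are hypothetical names, and the sketch assumes that the advertised metadata length covers the 8-byte prefix (continuation indicator plus length) and the padded metadata that follows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a footer entry; not Arrow's actual type.
struct FooterBlock {
  int64_t offset;           // where the message starts in the file
  int32_t metadata_length;  // advertised size: prefix + padded metadata (assumption)
  int64_t body_length;      // advertised body size (unused in this sketch)
};

// Load a little-endian unsigned 32-bit integer from raw bytes.
inline uint32_t LoadU32(const uint8_t* p) {
  assert(p != nullptr);
  return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Returns an error description if the advertised block disagrees with the
// bytes actually present in the file, or std::nullopt if they are consistent.
inline std::optional<std::string> CheckBlock(const std::vector<uint8_t>& file,
                                             const FooterBlock& block) {
  constexpr int32_t kPrefixSize = 8;  // continuation indicator + int32 length
  if (block.offset < 0 || block.metadata_length < kPrefixSize) {
    return std::string("invalid block");
  }
  const int64_t file_size = static_cast<int64_t>(file.size());
  if (block.offset > file_size - block.metadata_length) {
    return std::string("block out of bounds");
  }
  const uint8_t* p = file.data() + block.offset;
  if (LoadU32(p) != 0xFFFFFFFFu) {
    return std::string("missing continuation indicator");
  }
  const uint32_t declared = LoadU32(p + 4);
  if (declared > static_cast<uint32_t>(INT32_MAX)) {
    return std::string("negative metadata size");
  }
  // The strict part: the size found in the message itself must agree with
  // the size advertised by the footer, not merely fit inside it.
  if (static_cast<int64_t>(declared) + kPrefixSize != block.metadata_length) {
    return std::string("metadata size mismatch");
  }
  return std::nullopt;
}
```

A lenient reader would only verify that the advertised range is large enough to hold the message; the sketch also rejects a footer that advertises more bytes than the message declares.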

Found by OSS-Fuzz in https://issues.oss-fuzz.com/issues/524437775

What changes are included in this PR?

  1. Error out when a message metadata size doesn't match the advertised value
  2. Also fix a bug where ReadFieldsSubset did not properly handle legacy IPC encapsulation (without a continuation indicator)
  3. Add a test suite for MessageDecoder and the various ReadMessage functions
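As an illustration of item 2, the following standalone sketch decodes a message prefix in both encapsulations: the current one, which starts with a 0xFFFFFFFF continuation indicator followed by a little-endian int32 metadata size, and the legacy one, which starts directly with the int32 size. The names `Prefix`, `LoadLE32` and `DecodePrefix` are hypothetical and are not the functions touched by this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical result type: the declared metadata size and how many bytes
// the prefix itself occupied (8 for the current format, 4 for legacy).
struct Prefix {
  int32_t metadata_size;
  int32_t prefix_size;
};

// Load a little-endian unsigned 32-bit integer from raw bytes.
inline uint32_t LoadLE32(const uint8_t* p) {
  assert(p != nullptr);
  return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Decode the prefix of an encapsulated message. Returns std::nullopt if the
// buffer is too short or the declared size does not fit a non-negative int32.
inline std::optional<Prefix> DecodePrefix(const uint8_t* data, size_t size) {
  if (size < 4) return std::nullopt;
  const uint32_t first = LoadLE32(data);
  if (first == 0xFFFFFFFFu) {
    // Current format: continuation indicator, then the metadata size.
    if (size < 8) return std::nullopt;
    const uint32_t n = LoadLE32(data + 4);
    if (n > static_cast<uint32_t>(INT32_MAX)) return std::nullopt;
    return Prefix{static_cast<int32_t>(n), 8};
  }
  // Legacy format: the first word is already the metadata size.
  if (first > static_cast<uint32_t>(INT32_MAX)) return std::nullopt;
  return Prefix{static_cast<int32_t>(first), 4};
}
```

A reader that always assumes an 8-byte prefix would misplace the metadata in a legacy file by 4 bytes, which is why the prefix size is returned alongside the metadata size.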

Are these changes tested?

Yes, by a new test suite and by an additional fuzz regression file.

Are there any user-facing changes?

Being stricter means that some IPC files that were accepted before might now be rejected. Hopefully such files don't exist, but some IPC writers might have emitted them anyway.

@github-actions


Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

@pitrou pitrou force-pushed the exp-stricter-message-decoding branch 6 times, most recently from d9884c5 to f817f7c Compare June 23, 2026 14:27
@pitrou pitrou changed the title EXP: [C++] Make IPC message decoding stricter GH-50240: [C++] Make IPC message decoding stricter Jun 23, 2026
@pitrou pitrou force-pushed the exp-stricter-message-decoding branch 3 times, most recently from 1bd50da to a40538f Compare June 23, 2026 15:26
pitrou commented Jun 23, 2026

Member Author

@github-actions crossbow submit -g cpp

@pitrou pitrou marked this pull request as ready for review June 23, 2026 15:39
@pitrou pitrou requested review from kou and lidavidm June 23, 2026 15:40
@github-actions


Revision: a40538f

Submitted crossbow builds: ursacomputing/crossbow @ actions-2ee7f5115a

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-debian-13-cpp-amd64 GitHub Actions
test-debian-13-cpp-i386 GitHub Actions
test-debian-experimental-cpp-gcc-15 GitHub Actions
test-fedora-42-cpp GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

Error out when the advertised message metadata length is larger than expected.
@pitrou pitrou force-pushed the exp-stricter-message-decoding branch from a40538f to 487a7f8 Compare June 23, 2026 15:44
@github-actions github-actions Bot added awaiting merge Awaiting merge and removed awaiting review Awaiting review labels Jun 23, 2026

@kou kou left a comment

Member

+1

@kou kou merged commit f3f65df into apache:main Jun 24, 2026
54 of 56 checks passed
@kou kou removed the awaiting merge Awaiting merge label Jun 24, 2026
@pitrou pitrou deleted the exp-stricter-message-decoding branch June 24, 2026 06:04
@conbench-apache-arrow


After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit f3f65df.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 9 possible false positives for unstable benchmarks that are known to sometimes produce them.
