Skip to content

feat: Support UUID values across row and file IO#790

Open
zhjwpku wants to merge 2 commits into
apache:mainfrom
zhjwpku:issue-314-uuid-support
Open

feat: Support UUID values across row and file IO#790
zhjwpku wants to merge 2 commits into
apache:mainfrom
zhjwpku:issue-314-uuid-support

Conversation

@zhjwpku

@zhjwpku zhjwpku commented Jun 28, 2026

Copy link
Copy Markdown
Collaborator

closes #314

@huan233usc huan233usc left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor, non-blocking nits — otherwise LGTM.

return std::nullopt;
}

constexpr std::array<uint8_t, Uuid::kLength> kUuidBytes1 = {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kUuidBytes1/2 + MakeUuidArray are duplicated with avro_test.cc — consider a shared test util.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kUuidBytes1/2 are just random UUID values, adding a shared test util file might be overkill.

field->nullable(), field->metadata());
}

// Arrow cannot construct builders for arrow.uuid extension arrays yet, so build

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This storage-type rewrite is a general workaround; if a second caller shows up, consider promoting it to a shared internal helper.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only AvroReader needs this workaround, so I think it's fine to keep it here for now?

SchemaField::MakeRequired(3, MapType::kValueName, iceberg::uuid())))});

auto map_offsets = MakeInt32Array({0, 2, 3});
auto map_keys = MakeStringArray({"first", "second", "only"});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Map key is a string here — the UUID-key path (StorageTypeForBuilder(key_type)) is untested; add a UUID-key case if Iceberg allows it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a case, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support UUID type

2 participants