[python] Support tag time_retained (TTL) on FileSystemCatalog #8319
Open
TheR1sing3un wants to merge 4 commits into
Add an optional encoder/decoder per dataclass field in the JSON serializer (json_field_with_codec), applied only when present so existing dataclasses are unaffected. Add time_utils codecs that mirror Jackson's on-disk shapes for java.time types: LocalDateTime as a [y, mo, d, h, mi, s, ns] array, Duration as decimal seconds, plus duration_to_iso8601 and local_datetime_to_millis helpers.
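A minimal sketch of how such per-field codecs could work, not the PR's actual code. The helper `codec_field` is a hypothetical stand-in for `json_field_with_codec` (whose real signature is not shown here), and `Example`, `to_json`, and `from_json` are illustrative names. The decoder assumes that a shorter array means trailing seconds or nanoseconds were omitted, and `duration_to_iso8601` handles only non-negative durations.

```python
import json
from dataclasses import dataclass, field, fields
from datetime import datetime, timedelta
from typing import Optional


def encode_local_datetime(dt: datetime) -> list:
    # Python datetimes stop at microseconds, so the nanosecond slot is a multiple of 1000.
    return [dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second, dt.microsecond * 1000]


def decode_local_datetime(parts: list) -> datetime:
    # Assumption: missing trailing elements (seconds, nanos) are treated as zero.
    y, mo, d, h, mi, s, ns = list(parts) + [0] * (7 - len(parts))
    return datetime(y, mo, d, h, mi, s, ns // 1000)  # sub-microsecond digits are dropped


def encode_duration(td: timedelta) -> float:
    return td.total_seconds()


def decode_duration(seconds) -> timedelta:
    return timedelta(seconds=float(seconds))


def duration_to_iso8601(td: timedelta) -> str:
    # Hours/minutes/seconds only (no day component), e.g. one day becomes "PT24H".
    total_us = td // timedelta(microseconds=1)
    if total_us < 0:
        raise ValueError("negative durations are not handled in this sketch")
    if total_us == 0:
        return "PT0S"
    hours, rem = divmod(total_us, 3_600_000_000)
    minutes, rem = divmod(rem, 60_000_000)
    seconds, micros = divmod(rem, 1_000_000)
    out = "PT"
    if hours:
        out += f"{hours}H"
    if minutes:
        out += f"{minutes}M"
    if micros:
        out += f"{seconds}." + f"{micros:06d}".rstrip("0") + "S"
    elif seconds:
        out += f"{seconds}S"
    return out


def codec_field(json_name, encoder, decoder):
    # Hypothetical helper: stores the codec in the dataclass field metadata.
    return field(default=None, metadata={"json_name": json_name, "encoder": encoder, "decoder": decoder})


@dataclass
class Example:
    id: int = field(default=0, metadata={"json_name": "id"})
    created: Optional[datetime] = codec_field("created", encode_local_datetime, decode_local_datetime)
    ttl: Optional[timedelta] = codec_field("ttl", encode_duration, decode_duration)


def to_json(obj) -> str:
    out = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if value is None:
            continue  # unset optional fields are left out of the JSON
        encoder = f.metadata.get("encoder")  # only applied when a codec is present
        out[f.metadata.get("json_name", f.name)] = encoder(value) if encoder else value
    return json.dumps(out)


def from_json(cls, text: str):
    raw = json.loads(text)
    kwargs = {}
    for f in fields(cls):
        name = f.metadata.get("json_name", f.name)
        if name in raw:
            decoder = f.metadata.get("decoder")
            kwargs[f.name] = decoder(raw[name]) if decoder else raw[name]
    return cls(**kwargs)
```

Fields without a codec pass through unchanged, which is what keeps existing dataclasses unaffected.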
Turn Tag into a dataclass extending Snapshot with optional tag_create_time and tag_time_retained, serialized in the same on-disk JSON shape as Java org.apache.paimon.tag.Tag so tag files round-trip across the Java and Python SDKs. Add the from_snapshot_and_tag_ttl factory mirroring Java's Tag.fromSnapshotAndTagTtl.
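The shape of that factory might look roughly like the sketch below. The `Snapshot` here is a reduced stand-in with three illustrative fields (the real class has many more), and the argument order of `from_snapshot_and_tag_ttl` is an assumption rather than something confirmed by the diff.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Snapshot:
    # Reduced stand-in: field names are illustrative only.
    id: int
    schema_id: int
    commit_user: str


@dataclass
class Tag(Snapshot):
    # Both tag-specific fields are optional so a plain snapshot file still loads as a Tag.
    tag_create_time: Optional[datetime] = None
    tag_time_retained: Optional[timedelta] = None

    @classmethod
    def from_snapshot_and_tag_ttl(
        cls,
        snapshot: Snapshot,
        tag_time_retained: Optional[timedelta],
        tag_create_time: Optional[datetime],
    ) -> "Tag":
        # Copy every Snapshot field, then attach the tag metadata.
        base = {f.name: getattr(snapshot, f.name) for f in fields(Snapshot)}
        return cls(**base, tag_create_time=tag_create_time, tag_time_retained=tag_time_retained)
```

Because the tag-specific fields default to `None`, a legacy tag file containing only snapshot fields can be read into the same class.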
Thread time_retained through TagManager / FileStoreTable / FileSystemCatalog so create_tag and replace_tag persist a create-time and TTL. Mirror Java TagManager.createOrReplaceTag: with no retention the plain Snapshot JSON is written (no tag-specific fields) to stay readable by older readers; with a retention the richer Tag JSON is written. Drop the NotImplementedError that previously rejected time_retained and surface the values via FileSystemCatalog.get_tag and the $tags system table instead of None.
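The write-path branching described above can be sketched as follows. `tag_file_json` is a hypothetical function, not the TagManager API; the snapshot is passed as an already-serialized dict to keep the example self-contained, and using the local wall clock for the create-time is an assumption.

```python
import json
from datetime import datetime, timedelta
from typing import Optional


def tag_file_json(
    snapshot_fields: dict,
    time_retained: Optional[timedelta],
    create_time: Optional[datetime] = None,
) -> str:
    """Return the JSON to write into the tag file (hypothetical helper)."""
    if time_retained is None:
        # No retention: write the plain Snapshot JSON so older readers can still parse it.
        return json.dumps(snapshot_fields)

    created = create_time or datetime.now()  # assumption: local wall-clock time
    payload = dict(snapshot_fields)
    # LocalDateTime as [y, mo, d, h, mi, s, ns]; Duration as decimal seconds.
    payload["tagCreateTime"] = [
        created.year, created.month, created.day,
        created.hour, created.minute, created.second,
        created.microsecond * 1000,
    ]
    payload["tagTimeRetained"] = time_retained.total_seconds()
    return json.dumps(payload)
```

Calling it with `time_retained=None` yields exactly the input snapshot fields, while passing a `timedelta` adds the two tag-specific keys.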
Add unit tests for the temporal codecs and Tag serde (Java golden-value on-disk shape, round-trip, reading Java-written tags, and legacy plain-snapshot backward compatibility). Update the FileSystemCatalog and $tags end-to-end tests to cover create/replace with time_retained, Java-compatible on-disk JSON, and the no-TTL plain-snapshot path.
Purpose
FileSystemCatalog.create_tag rejected time_retained with NotImplementedError, and the Python Tag only inherited Snapshot fields, so get_tag and the $tags system table could only return None for create-time / TTL.

This implements real tag time_retained support on the FileSystem path, persisting tagCreateTime / tagTimeRetained in the same on-disk JSON shape as Java (org.apache.paimon.tag.Tag) so tag files round-trip across the Java and Python SDKs.

Changes
Tag now carries tag_create_time (LocalDateTime as a [y, mo, d, h, mi, s, ns] array) and tag_time_retained (Duration as decimal seconds), via a new per-field JSON codec.
create_tag / replace_tag thread time_retained through TagManager / FileStoreTable / FileSystemCatalog. With no retention, the plain Snapshot JSON is written (backward compatible), mirroring Java TagManager.createOrReplaceTag.
get_tag and the $tags system table surface real create_time / time_retained (matching Java Timestamp.fromLocalDateTime / Duration.toString()).

Tests
Unit tests for the temporal codecs and Tag serde (Java golden-value shape, round-trip, reading Java-written tags, legacy plain-snapshot backward compatibility) plus FileSystemCatalog / $tags end-to-end coverage for create/replace with time_retained.

Does this PR introduce a user-facing change?
No.
Generative AI disclosure: drafted with AI assistance and reviewed by the author.