feat: support manifest compaction in table commits#400
Conversation
| let entries = self | ||
| .read_manifest_entries(file_io, manifest_dir, candidates) | ||
| .await?; | ||
| let active_entries = merge_active_entries(entries); |
There was a problem hiding this comment.
This merge helper drops DELETE entries whose matching ADD is not in the compacted candidate set. That is unsafe for manifest compaction: a DELETE may refer to an ADD kept in an older/base manifest that is not being rewritten. Dropping it can make the deleted data file visible again in later scans.
Java FileEntry.mergeEntries keeps unmatched DELETE entries for this reason. We should use equivalent compaction merge semantics here: ADD/DELETE in the same candidate set can cancel, but unmatched DELETE entries must be preserved.
There was a problem hiding this comment.
Thanks for catching this. I fixed the merge semantics to align with Java.
The minor compaction path now uses Java FileEntry.mergeEntries-style behavior, so ADD/DELETE pairs in the same candidate set cancel, while unmatched DELETE entries are preserved.
For full compaction, I aligned with Java ManifestFileMerger: it reads ADD entries and filters them by collected DELETE identifiers, so delete-only manifests are removed instead of writing DELETE entries back.
I also extracted the logic into a separate ManifestFileMerger module and moved the compaction coverage there.
7768824 to
eaf9177
Compare
| .read_for_full_compaction(&manifest_file, &delete_identifiers) | ||
| .await?; | ||
| if read_result.require_change { | ||
| new_files.extend(self.write_compacted_entries(&read_result.entries).await?); |
There was a problem hiding this comment.
Full compaction should feed all changed entries into one rolling writer instead of rewriting each input manifest independently. With the current loop, if a table has many small manifests and manifest.full-compaction-threshold-size is reached without deletes, every manifest is must_change, but each one is written back as its own new manifest and merge() returns before the minor compaction path can merge them. The result is N small manifests with new names, so full compaction does a lot of I/O while leaving the manifest count and small-file problem unchanged. Java keeps a single RollingFileWriter for the whole toBeMerged set and rolls only when manifest.target-file-size is reached; this path should do the same, or fall back to minor compaction when no entries were actually removed.
Purpose
Support manifest compaction during table commits and add Java-compatible manifest compaction options.
Tests
cargo fmt --allcargo test -p paimon manifest_compaction --features storage-memorycargo test -p paimon table_commit::tests --features storage-memorygit diff --check