[python] Fix upsert by key to update all rows matching an existing key#8318

Open: XiaoHongbo-Hope wants to merge 8 commits into apache:master from XiaoHongbo-Hope:upsert_by_key_fix

@XiaoHongbo-Hope (Contributor) commented Jun 22, 2026

Purpose

When updating an append-only table via upsert_by_key, every existing row with a matching key should be updated. However, if the table already has multiple rows sharing the same key (append-only tables allow duplicates), only one of them is updated; the rest silently keep their old values, leaving inconsistent rows for that key.

This PR updates all matching rows. update_cols behavior and the single-match case are unchanged.
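
To make the intended semantics concrete, here is a minimal sketch on plain Python dicts. It is a toy model, not the project's implementation: the function `upsert_by_key_model`, its parameters, and the sample rows are hypothetical, and appending unmatched rows is assumed from the usual meaning of "upsert".

```python
def upsert_by_key_model(rows, new_rows, key_cols, update_cols=None):
    """Toy model of the intended behavior (hypothetical helper, not the real API).

    rows and new_rows are lists of dicts. Every existing row whose key matches
    an input row is updated; input rows with no match are appended.
    """
    result = [dict(r) for r in rows]  # work on copies, leave inputs untouched
    for new in new_rows:
        key = tuple(new[c] for c in key_cols)
        # With update_cols=None, update every non-key column (an assumption).
        cols = update_cols if update_cols is not None else [
            c for c in new if c not in key_cols
        ]
        matched = False
        for row in result:
            if tuple(row[c] for c in key_cols) == key:
                for c in cols:
                    row[c] = new[c]
                matched = True  # keep scanning: duplicates must be updated too
        if not matched:
            result.append(dict(new))
    return result


existing = [
    {"id": 1, "name": "a", "age": 10},
    {"id": 1, "name": "b", "age": 20},  # duplicate key, allowed in append-only tables
    {"id": 2, "name": "c", "age": 30},
]
updated = upsert_by_key_model(
    existing, [{"id": 1, "name": "z", "age": 99}], ["id"], update_cols=["age"]
)
```

In this model both rows with `id == 1` end up with `age == 99` while keeping their own `name`, and the row with `id == 2` is untouched.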

Tests

table_upsert_by_key_test.py:

  • test_upsert_for_existing_table_duplicate_keys
  • test_existing_duplicate_keys_partial_update_cols
  • test_existing_duplicate_keys_partitioned
  • test_multiple_keys_each_with_duplicates
  • updated test_composite_key_upsert to expect all duplicate matches updated

When the table already holds multiple rows sharing an upsert key
(append-only tables allow duplicate keys), TableUpsertByKey updates only
the last-scanned matching row and leaves the others stale.

Add a test asserting the intended behavior (all matching rows updated);
it currently fails. Fix to follow.

When the table already holds multiple rows sharing an upsert key,
TableUpsertByKey updated only the last-scanned row and left the others
stale. Collect all row ids per key (key -> [row_id, ...]) and expand each
matched input row to every matching row id, so all of them are updated.

Turns the previously failing test_upsert_for_existing_table_duplicate_keys
green.
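
The key -> [row_id, ...] approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `build_key_index`, `expand_matches`, and `key_of` are hypothetical names, and row ids are assumed to be simple scan positions.

```python
from collections import defaultdict


def build_key_index(existing_keys):
    """Map each key to the list of ALL row ids holding it.

    A plain key -> row_id dict would keep only the last-scanned row id,
    which would reproduce the behavior the commit message describes.
    Row ids here are scan positions (an assumption for this sketch).
    """
    index = defaultdict(list)
    for row_id, key in enumerate(existing_keys):
        index[key].append(row_id)
    return index


def expand_matches(input_rows, key_of, index):
    """Pair each matched input row with every row id sharing its key."""
    matched, unmatched = [], []
    for row in input_rows:
        row_ids = index.get(key_of(row))  # .get() avoids creating empty entries
        if row_ids:
            matched.extend((row_id, row) for row_id in row_ids)
        else:
            unmatched.append(row)
    return matched, unmatched


index = build_key_index([("a",), ("b",), ("a",)])
matched, unmatched = expand_matches(
    [{"k": "a", "v": 1}, {"k": "c", "v": 2}],
    key_of=lambda r: (r["k"],),
    index=index,
)
```

Here the input row with key `"a"` is expanded to row ids 0 and 2, so both existing rows receive the update, while the row with key `"c"` has no match and would be handled as an insert.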

@XiaoHongbo-Hope marked this pull request as ready for review June 22, 2026 14:28