
fix: decimal overflow in enriched packages copy #4305

Open
epipav wants to merge 2 commits into main from fix/tb-package-versions-filter-bad-dates


epipav (Collaborator) commented Jul 3, 2026

Note

Low Risk
Narrow change to one analytics pipe node; behavior is equivalent for valid dates and only excludes corrupt or extreme timestamps.

Overview
Fixes the ossPackages_enriched_release node of the ossPackages_enriched Tinybird pipe so the scheduled COPY into ossPackages_enriched_ds no longer fails on decimal overflow.

secondPublished no longer uses arraySort with negated toUnixTimestamp(publishedAt); it now takes the second-most-recent date via arrayReverseSort(groupArray(publishedAt)), avoiding numeric overflow from timestamp negation.
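
The claim that both orderings pick the same value for valid dates can be checked with a small Python model. This is only a sketch under stated assumptions: the function names and sample dates are hypothetical, a Python list stands in for groupArray(publishedAt), and Python integers do not overflow, so it illustrates the ordering logic rather than the overflow itself.

```python
from datetime import datetime, timezone
from typing import Optional


def second_published_old(dates: list[datetime]) -> Optional[datetime]:
    """Model of the old expression: ascending sort keyed on the negated Unix timestamp."""
    ordered = sorted(dates, key=lambda d: -d.timestamp())
    # Position 2 in a 1-indexed array is index 1 here; None is this sketch's
    # choice for "fewer than two versions".
    return ordered[1] if len(ordered) >= 2 else None


def second_published_new(dates: list[datetime]) -> Optional[datetime]:
    """Model of the new expression: plain descending sort, no timestamp arithmetic."""
    ordered = sorted(dates, reverse=True)
    return ordered[1] if len(ordered) >= 2 else None


# Hypothetical release dates, deliberately out of order.
sample = [
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 3, 1, tzinfo=timezone.utc),
    datetime(2024, 2, 1, tzinfo=timezone.utc),
]
assert second_published_old(sample) == second_published_new(sample)
```

The new form compares the dates directly, so no intermediate numeric value is ever negated.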

The versions filter now drops out-of-range publishedAt values (before the Unix epoch or more than one day in the future) in addition to nulls, so bad dates do not propagate into releaseCadence scoring downstream.

Reviewed by Cursor Bugbot for commit 726fd36.

Signed-off-by: anilb <epipav@gmail.com>
Copilot AI review requested due to automatic review settings July 3, 2026 13:51
@CLAassistant


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

github-actions bot (Contributor) commented Jul 3, 2026

⚠️ Jira Issue Key Missing

Your PR title doesn't contain a Jira issue key. Consider adding it for better traceability.

Example:

  • feat: add user authentication (CM-123)
  • feat: add user authentication (IN-123)

Projects:

  • CM: Community Data Platform
  • IN: Insights

Please add a Jira issue key to your PR title.

Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes a decimal-overflow failure in the ossPackages_enriched Tinybird COPY pipe. The pipe computes per-package release-cadence signals from the versions datasource, where publishedAt is a Nullable(DateTime64(3)) replicated from Postgres. Out-of-range date values (far past/future) in that column caused DateTime64 (Decimal-backed) overflow during the daily COPY job, breaking the enriched dataset build.

The fix targets the ossPackages_enriched_release node in two ways: (1) it replaces the arraySort(x -> -toUnixTimestamp(x), ...) descending-sort trick with the simpler arrayReverseSort(...), which removes the toUnixTimestamp conversion, and (2) it bounds publishedAt to the sane range [1970-01-01, now() + 1 day) so pathological dates no longer flow into the sort and downstream dateDiff calls.

Changes:

  • Replace arrayElement(arraySort(x -> -toUnixTimestamp(x), groupArray(publishedAt)), 2) with arrayElement(arrayReverseSort(groupArray(publishedAt)), 2) to compute secondPublished (semantically equivalent: newest-first ordering, second element = second-newest).
  • Add WHERE guards filtering publishedAt to >= 1970-01-01 and < now() + INTERVAL 1 DAY, excluding out-of-range dates that trigger the overflow.
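
The range guard described in the last bullet can be modeled in a few lines. The sketch below is a Python analogue, not the pipe's SQL: `is_sane_published_at` and its `now` parameter are hypothetical names, and None stands in for a NULL publishedAt.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Lower bound of the accepted range: the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def is_sane_published_at(published_at: Optional[datetime], now: datetime) -> bool:
    """Keep only non-null dates in the half-open range [epoch, now + 1 day)."""
    if published_at is None:
        return False
    return EPOCH <= published_at < now + timedelta(days=1)


# Hypothetical reference time and sample values.
now = datetime(2026, 7, 3, 12, 0, tzinfo=timezone.utc)
rows = [
    None,                                          # null: dropped
    datetime(1969, 12, 31, tzinfo=timezone.utc),   # before epoch: dropped
    datetime(2025, 5, 1, tzinfo=timezone.utc),     # valid: kept
    datetime(9999, 1, 1, tzinfo=timezone.utc),     # far future: dropped
]
kept = [r for r in rows if is_sane_published_at(r, now)]
assert kept == [datetime(2025, 5, 1, tzinfo=timezone.utc)]
```

Passing `now` explicitly keeps the check deterministic in a test; the pipe itself evaluates now() at query time.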


3 participants