Generate partitions/subsets of knowledge split by ioos_category #180

Draft
abkfenris wants to merge 1 commit into phase-2b-wasm-js-ingestion-api from phase-2c-partition-subset-generation
@abkfenris abkfenris commented Jun 24, 2026

Collaborator

Add utils/generate_partitions.py (a uv script) that reads core/standards/*.yaml
and emits both compact JSON and human-readable YAML for:

  • data/all-standards.{json,yaml}: full CF vocabulary
  • data/all-knowledge.{json,yaml}: CF standards + all knowledge entries
  • data/partitions/{category}.{json,yaml}: 14 per-IOOS-category self-contained subsets

Each partition: { cf_standards: {standard_names, aliases}, knowledge: [...] }
Consumers call loadStandards(data.cf_standards) + loadKnowledgeObjects(data.knowledge).
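
As a rough illustration of the partition shape described above, here is a minimal Python sketch. It is not the code in utils/generate_partitions.py: the function names, the `ioos_category` and `standard_name` keys on knowledge entries, and the sample data are all assumptions made for the example.

```python
import json
from collections import defaultdict


def build_partitions(standard_names, aliases, knowledge):
    """Group knowledge entries by category and attach only the standards they use.

    Assumed (hypothetical) input shapes:
      standard_names: {name: {...metadata...}}
      aliases:        {alias: canonical_name}
      knowledge:      [{"ioos_category": str, "standard_name": str, ...}, ...]
    """
    by_category = defaultdict(list)
    for entry in knowledge:
        by_category[entry["ioos_category"]].append(entry)

    partitions = {}
    for category, entries in by_category.items():
        used = {e["standard_name"] for e in entries}
        partitions[category] = {
            "cf_standards": {
                # Keep only the standards referenced inside this category.
                "standard_names": {
                    n: standard_names[n] for n in sorted(used) if n in standard_names
                },
                # Keep only aliases that point at a kept standard.
                "aliases": {a: t for a, t in sorted(aliases.items()) if t in used},
            },
            "knowledge": entries,
        }
    return partitions


def to_compact_json(partition):
    """Serialize without the whitespace json.dumps adds after ',' and ':'."""
    return json.dumps(partition, separators=(",", ":"), ensure_ascii=False)


# Illustrative sample data; the alias name is made up for the example.
standard_names = {
    "air_temperature": {"unit": "K"},
    "sea_water_temperature": {"unit": "K"},
}
aliases = {"air_temp_alias": "air_temperature"}
knowledge = [
    {"ioos_category": "meteorology", "standard_name": "air_temperature"},
    {"ioos_category": "temperature", "standard_name": "sea_water_temperature"},
]
partitions = build_partitions(standard_names, aliases, knowledge)
```

Under these assumptions each partition carries its own `cf_standards` block, so a consumer can load one file without also fetching the full vocabulary.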

YAML scalars that look like numbers (e.g. 'unit: 1', 'unit: 0.001') are coerced
to strings in both generate_partitions.py and gen-data.mjs. This matches the
coercion serde_yaml applies automatically and keeps those values correct when
the data goes through the JSON path.
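
The coercion could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's implementation: the helper name and the choice to coerce only values stored under a `unit` key are hypothetical.

```python
def stringify_numeric_scalars(node, keys=("unit",)):
    """Return a copy of `node` with int/float values under `keys` turned into strings.

    Hypothetical helper: which keys the real scripts coerce is an assumption.
    """
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if key in keys and is_number:
                out[key] = str(value)  # 1 -> "1", 0.001 -> "0.001"
            else:
                out[key] = stringify_numeric_scalars(value, keys)
        return out
    if isinstance(node, list):
        return [stringify_numeric_scalars(item, keys) for item in node]
    return node


# What a YAML loader would typically hand back for 'unit: 1' and 'unit: 0.001'.
parsed = {"a": {"unit": 1}, "b": [{"unit": 0.001}, {"unit": "m s-1"}], "count": 3}
coerced = stringify_numeric_scalars(parsed)
```

One caveat of this approach: `str()` works on the already-parsed number, so a value written as `1.50` in the YAML would come back as `'1.5'`. Reading the scalar as a string at load time would avoid that, if the original spelling matters.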

Extend gen-data.mjs to mirror the same partition logic into public/data/ for
Vite dev/test serving (identical format, JS-only dependency chain).

Add 3 Vitest tests verifying partition loading:

  • meteorology subset is self-contained (out-of-category standards absent)
  • all-knowledge covers every category
  • all-standards loads the full vocabulary without knowledge enrichment
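
The first property in the list above (a self-contained subset) can also be checked directly on the generated data. The PR's tests are written in Vitest. The Python sketch below only illustrates the same idea, and the function name and the entry keys `ioos_category` and `standard_name` are assumptions.

```python
def check_self_contained(partition, category):
    """Return (missing, extra) standard names for one partition.

    missing: referenced by an in-category knowledge entry but absent from cf_standards
    extra:   present in cf_standards but not referenced by any in-category entry
    """
    entries = partition["knowledge"]
    referenced = {e["standard_name"] for e in entries if e["ioos_category"] == category}
    present = set(partition["cf_standards"]["standard_names"])
    return sorted(referenced - present), sorted(present - referenced)


# Illustrative partitions: `good` is self-contained, `leaky` carries an extra standard.
good = {
    "cf_standards": {"standard_names": {"air_temperature": {"unit": "K"}}, "aliases": {}},
    "knowledge": [{"ioos_category": "meteorology", "standard_name": "air_temperature"}],
}
leaky = {
    "cf_standards": {
        "standard_names": {
            "air_temperature": {"unit": "K"},
            "sea_water_temperature": {"unit": "K"},
        },
        "aliases": {},
    },
    "knowledge": [{"ioos_category": "meteorology", "standard_name": "air_temperature"}],
}
```

A partition passes when both returned lists are empty. A non-empty `extra` list corresponds to the "out-of-category standards absent" condition failing.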

Add generate_partitions nox session (off by default, opt-in for CI release path).
Full nox suite passes: 19+11 Rust, 15 Vitest, 5 Playwright, 1 pack smoke.


This is part 4 of 6 in a stack made with GitButler:

github-actions Bot commented Jun 24, 2026


Javascript size report

| | Size |
|---|---|
| Base (phase-2b-wasm-js-ingestion-api) | 462.3 KB |
| PR | 462.3 KB |
| Delta | +0.0 KB (+0.0%) |

✅ No size change.

@abkfenris abkfenris force-pushed the phase-2b-wasm-js-ingestion-api branch from f4285cd to 5caceff Compare June 25, 2026 20:36
@abkfenris abkfenris force-pushed the phase-2c-partition-subset-generation branch from 407164e to 16f471a Compare June 25, 2026 20:36