test(datafusion): add Lumina index build procedure #392

Merged
JingsongLi merged 2 commits into apache:main from liujiwen-up:test/lumina-build-query-e2e-review-main
Jun 23, 2026
@liujiwen-up liujiwen-up commented Jun 16, 2026

Contributor

Purpose

Linked PR: #347

Expose the core Lumina index build capability through DataFusion SQL so users can build a Lumina global vector index with a CALL sys.create_lumina_index(...) statement, then query it through the existing vector_search table function.

Brief change log

  • Add CALL sys.create_lumina_index(table => ..., index_column => ...) to DataFusion procedures.
  • Wire optional Lumina builder settings through index_type and comma-separated options => 'key=value,...' arguments.
  • Add procedure error coverage for missing index_column, invalid index type, and malformed options.
  • Add an ignored native Lumina E2E test that writes vector data, builds the Lumina index via SQL, verifies $table_indexes, and runs vector_search.
  • Add a CI step to run the native Lumina DataFusion E2E test with the vortex feature after installing the Lumina native library.
  • Document the new SQL procedure and options syntax.
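The comma-separated `options => 'key=value,...'` argument and the "malformed options" error case from the change log can be illustrated with a minimal sketch. This is not the code from this PR: `parse_options` is a hypothetical helper, and the exact validation rules (whitespace trimming, rejection of empty keys and trailing commas) are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Parse a comma-separated `key=value,...` string into an ordered map.
/// Hypothetical helper for illustration; not the actual paimon-datafusion code.
fn parse_options(raw: &str) -> Result<BTreeMap<String, String>, String> {
    let mut options = BTreeMap::new();
    // Assumption: an empty or whitespace-only string means "no options".
    if raw.trim().is_empty() {
        return Ok(options);
    }
    for entry in raw.split(',') {
        let entry = entry.trim();
        // Every entry must contain `=`; anything else is reported as malformed.
        let (key, value) = entry
            .split_once('=')
            .ok_or_else(|| format!("malformed option '{entry}': expected key=value"))?;
        let key = key.trim();
        // Assumption: an empty key is rejected rather than silently ignored.
        if key.is_empty() {
            return Err(format!("malformed option '{entry}': empty key"));
        }
        options.insert(key.to_string(), value.trim().to_string());
    }
    Ok(options)
}
```

Under these assumptions, `parse_options("a=1, b=two")` yields two entries, while `parse_options("novalue")` and `parse_options("=1")` return an error instead of a partially filled map.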

Tests

  • cargo fmt: passed
  • cargo check -p paimon-datafusion: passed
  • cargo test -p paimon-datafusion --test procedures create_lumina_index: passed, 3 tests
  • cargo test -p paimon-datafusion --features vortex vector_search_tests::test_lumina_build_then_vector_search_query -- --ignored --exact --list: passed, verifies the ignored native E2E test compiles and is discoverable

API and Format

Adds a DataFusion SQL procedure surface for an existing core Lumina index build API. No storage format, manifest format, or public Rust API changes are introduced by this PR.

Documentation

Updated docs/src/sql.md with the create_lumina_index procedure and optional options example.

@liujiwen-up liujiwen-up changed the title from "feat(datafusion): add Lumina index build procedure" to "test(datafusion): add Lumina index build procedure" on Jun 17, 2026
Comment thread docs/src/sql.md
@QuakeWang

Member

Please rebase main.

@liujiwen-up liujiwen-up force-pushed the test/lumina-build-query-e2e-review-main branch from 2b525e1 to 117760d on June 22, 2026 02:42
@liujiwen-up liujiwen-up force-pushed the test/lumina-build-query-e2e-review-main branch from 117760d to b4a3364 on June 22, 2026 02:44
Comment thread docs/src/sql.md
CALL sys.create_lumina_index(table => 'paimon.my_db.my_table', index_column => 'embedding');
```

The optional `index_type` argument selects the Lumina index identifier. It defaults to

Member

Non-blocking: this wording suggests index_type controls the index identifier written to the manifest. The builder accepts lumina-vector-ann for compatibility, but newly built indexes are canonicalized and stored as lumina. Could we clarify that here to avoid exposing a misleading contract?
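The behavior described in this comment (a compatibility alias accepted on input, with new indexes stored under one canonical identifier) could be sketched roughly as follows. `canonical_index_type` is a hypothetical function, not the builder's actual API; that `lumina` itself is also accepted as input and that unknown values are rejected with this particular message are assumptions.

```rust
/// Map a user-supplied `index_type` to the identifier stored for newly built indexes.
/// Hypothetical sketch; the real builder's function names and error types may differ.
fn canonical_index_type(requested: &str) -> Result<&'static str, String> {
    match requested {
        // `lumina-vector-ann` is accepted for compatibility but stored as `lumina`.
        "lumina" | "lumina-vector-ann" => Ok("lumina"),
        // Assumption: any other value is an invalid index type.
        other => Err(format!("unsupported index type '{other}'")),
    }
}
```

In this sketch the argument only selects among accepted spellings; the stored identifier is always `lumina`, which is the distinction the documentation wording needs to make.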

@JingsongLi JingsongLi left a comment

Contributor

+1

@JingsongLi JingsongLi merged commit 6aa16c2 into apache:main Jun 23, 2026
8 checks passed
3 participants