feat!: align runtime API and add runtime dispatch#11

Merged
voltjia merged 16 commits into master from fix/runtime-api-alignment on Jul 3, 2026
@voltjia voltjia commented Jul 1, 2026

Summary

  • Move CUDA Runtime API-shaped public wrappers under infini::rt::runtime, keeping infini::rt free for InfiniRT-specific APIs.
  • Add top-level infini::rt::set_runtime_device_type and infini::rt::runtime_device_type for runtime dispatch between enabled backends.
  • Normalize runtime memcpy-kind constants to kMemcpy... names and remove the non-k aliases from the public API surface.
  • Keep the generated runtime API declaration list in scripts/generate_public_headers.py for now, instead of parsing CUDA docs or headers.
  • Constrain TensorView's tensor-like constructor and add focused test_tensor_view coverage.
  • Update README examples for the new infini::rt::runtime namespace and runtime-device selector APIs.
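The constrained tensor-like constructor mentioned above can be pictured with a minimal sketch. This is not the InfiniRT `TensorView`: the class below is a hypothetical stand-in, and both the `IsTensorLike` trait and the assumption that "tensor-like" means "exposes `data()` and `size()`" are invented here for illustration. The real constraint may check different members.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical trait: a type is "tensor-like" if it has data() and size().
template <typename T, typename = void>
struct IsTensorLike : std::false_type {};

template <typename T>
struct IsTensorLike<T,
                    std::void_t<decltype(std::declval<const T&>().data()),
                                decltype(std::declval<const T&>().size())>>
    : std::true_type {};

// Hypothetical stand-in for TensorView (assumption, not the real class).
class TensorView {
 public:
  TensorView(const void* data, std::size_t size) : data_(data), size_(size) {}

  // Constrained converting constructor: it only participates in overload
  // resolution for tensor-like types other than TensorView itself, so
  // unrelated arguments such as `int` are rejected at compile time.
  template <typename T,
            typename = std::enable_if_t<IsTensorLike<T>::value &&
                                        !std::is_same_v<T, TensorView>>>
  explicit TensorView(const T& tensor)
      : data_(tensor.data()), size_(tensor.size()) {}

  const void* data() const { return data_; }
  std::size_t size() const { return size_; }

 private:
  const void* data_;
  std::size_t size_;
};

// Usage: a std::vector qualifies because it exposes data() and size().
inline std::size_t ViewedElementCount(const std::vector<float>& values) {
  return TensorView(values).size();
}

}  // namespace sketch
```

A guard test in this style would assert that `std::is_constructible_v<TensorView, int>` is false, which is the sort of check a constructor guard test might contain.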

Motivation

The latest interface guide requires the lowest-level C++ runtime API to match CUDA Runtime API signatures after only Google C++ Style naming conversion. Keeping those CUDA-shaped APIs under infini::rt::runtime makes that contract explicit, while the outer infini::rt namespace can host InfiniRT-specific dispatch APIs without worrying about current or future CUDA symbol conflicts.
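As a rough illustration of the naming conversion, the sketch below strips the `cuda` prefix and, for constants, adds the `k` prefix that Google C++ Style uses. The `kMemcpy...` result matches the constant names described in this PR. The function-name result (for example `Malloc`), the helper `ConvertCudaName`, and the `SymbolKind` enum are assumptions made for this example, not code from the repository or its generator script.

```cpp
#include <cassert>
#include <string>
#include <string_view>

namespace sketch {

// Hypothetical classification of a CUDA Runtime API symbol.
enum class SymbolKind { kFunction, kConstant };

// Converts a CUDA Runtime API name to a Google C++ Style name by dropping
// the "cuda" prefix; constants additionally gain a leading 'k'.
inline std::string ConvertCudaName(std::string_view cuda_name,
                                   SymbolKind kind) {
  constexpr std::string_view kPrefix = "cuda";
  // Names without the prefix are passed through unchanged.
  if (cuda_name.substr(0, kPrefix.size()) != kPrefix) {
    return std::string(cuda_name);
  }
  std::string stem(cuda_name.substr(kPrefix.size()));
  if (kind == SymbolKind::kConstant) {
    return "k" + stem;
  }
  return stem;
}

}  // namespace sketch
```

Under these assumptions, `cudaMemcpyHostToDevice` maps to `kMemcpyHostToDevice`, while the parameter list and return shape of each function would stay as in the CUDA Runtime API.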

Companion InfiniOps PR: InfiniTensor/InfiniOps#787

Type of Change

  • feat - new feature / new operator / new platform
  • fix - bug fix
  • perf - performance improvement (no behavioral change)
  • refactor - code restructuring without behavior change
  • test - adding or fixing tests only
  • docs - documentation only
  • build / ci - build system or CI configuration
  • chore - tooling, formatting, or other non-code changes
  • Breaking change (requires a ! in the Conventional Commits prefix or a BREAKING CHANGE: footer)

Platforms Affected

  • CPU (WITH_CPU)
  • NVIDIA (WITH_NVIDIA)
  • Iluvatar (WITH_ILUVATAR)
  • MetaX (WITH_METAX)
  • Cambricon (WITH_CAMBRICON)
  • Moore (WITH_MOORE)
  • Ascend (WITH_ASCEND)
  • PyTorch C++ bindings (WITH_TORCH)
  • Build system / CMake / CI
  • Python bindings / user-facing API

Smoke Test Result

# Host validation on ssh nvidia, outside Docker
export PATH=/usr/local/cuda/bin:$HOME/.local/bin:$PATH
cmake -S /tmp/infinirt-host-bab8/src -B /tmp/infinirt-host-bab8/build \
  -DWITH_CPU=ON -DWITH_NVIDIA=ON -DINFINI_RT_BUILD_TESTING=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=/tmp/infinirt-host-bab8/prefix
cmake --build /tmp/infinirt-host-bab8/build -j8
ctest --test-dir /tmp/infinirt-host-bab8/build --output-on-failure
cmake --install /tmp/infinirt-host-bab8/build

100% tests passed, 0 tests failed out of 8
# accelerator-dev/nvidia:latest, with companion InfiniOps branch
InfiniOps no-Torch InfiniLM operator subset wheel: built and installed successfully.
InfiniOps representative PyTorch wrapper smoke (`abs,clamp,exp`): built and installed successfully.

Test Results on Supported Platforms

| Platform | Affected | Build / Smoke Result | Full Result / Notes |
| --- | --- | --- | --- |
| NVIDIA | yes | smoke passed | Host CPU+NVIDIA CMake build, CTest 8/8, install, and install-consumer passed. Companion InfiniOps no-Torch and representative PyTorch smoke builds passed in accelerator-dev/nvidia:latest. |
| Iluvatar | yes | N/A | Not tested: no Iluvatar hardware/toolkit in the validation environment. |
| MetaX | yes | N/A | Not tested: no MetaX hardware/toolkit in the validation environment. |
| Cambricon | yes | N/A | Not tested: no Cambricon hardware/toolkit in the validation environment. |
| Moore | yes | N/A | Not tested: no Moore hardware/toolkit in the validation environment. |
| Ascend | yes | N/A | Not tested: no Ascend hardware/toolkit in the validation environment. |

Full `pytest` output (optional)
N/A

Benchmark / Performance Impact

N/A. This PR changes runtime API shape and dispatch plumbing; no performance-sensitive kernels are changed.

Notes for Reviewers

  • The lowest-level CUDA-shaped APIs are now infini::rt::runtime::*; the outer infini::rt namespace currently only adds runtime-device selector APIs.
  • The generated public header declarations are still driven by a short explicit list in the generator script. Parsing CUDA docs or cuda_runtime.h is intentionally left out of this PR to keep scope contained.
  • Full InfiniCore/InfiniLM inference validation was attempted, but the remote environment repeatedly failed while downloading xmake's Boost dependency from GitHub. The failure occurred before InfiniCore compilation and appears network-related. The relevant InfiniRT and InfiniOps compatibility checks listed above passed.
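
The namespace split in the first note can be pictured with a small mock. Only the names `set_runtime_device_type`, `runtime_device_type`, and the nested `runtime` namespace come from this PR. Everything else is an assumption for illustration: the `sketch::rt` namespace stands in for `infini::rt`, and the `DeviceType` enum, the `GetDeviceCount` wrapper (shaped after `cudaGetDeviceCount`), the integer status codes, the backend namespaces, and the mock device counts are all hypothetical. The real signatures may differ.

```cpp
#include <cassert>

namespace sketch::rt {

// Hypothetical device-type enum (assumption).
enum class DeviceType { kCpu, kNvidia };

namespace detail {
// Process-wide selection; CPU is assumed as the default here.
inline DeviceType& CurrentDeviceType() {
  static DeviceType current = DeviceType::kCpu;
  return current;
}
}  // namespace detail

// InfiniRT-specific selector APIs live in the outer namespace.
inline void set_runtime_device_type(DeviceType type) {
  detail::CurrentDeviceType() = type;
}
inline DeviceType runtime_device_type() { return detail::CurrentDeviceType(); }

namespace runtime {

// Mock backends; the counts are placeholders, not real device queries.
namespace cpu_backend {
inline int GetDeviceCount(int* count) { *count = 1; return 0; }
}  // namespace cpu_backend
namespace nvidia_backend {
inline int GetDeviceCount(int* count) { *count = 2; return 0; }
}  // namespace nvidia_backend

// CUDA-shaped wrapper: forwards to the backend chosen by the selector.
inline int GetDeviceCount(int* count) {
  switch (runtime_device_type()) {
    case DeviceType::kCpu:
      return cpu_backend::GetDeviceCount(count);
    case DeviceType::kNvidia:
      return nvidia_backend::GetDeviceCount(count);
  }
  return 1;  // Unknown device type: report a generic error code.
}

}  // namespace runtime
}  // namespace sketch::rt
```

Keeping the selector outside `runtime` means a later CUDA symbol can be added under `runtime` without clashing with InfiniRT-specific names, which is the conflict the Motivation section describes.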

BREAKING CHANGE: CUDA-shaped runtime APIs move to infini::rt::runtime, and non-k runtime memcpy-kind aliases are removed from the normalized runtime API surface.

@voltjia voltjia changed the title from "feat!: align runtime API and add default dispatch" to "feat!: align runtime API and add runtime dispatch" Jul 2, 2026
@voltjia voltjia marked this pull request as ready for review July 2, 2026 06:04
@voltjia voltjia merged commit 2c9a4fa into master Jul 3, 2026
4 checks passed
@voltjia voltjia deleted the fix/runtime-api-alignment branch July 3, 2026 01:18
voltjia added a commit that referenced this pull request Jul 3, 2026
* docs: add `README.md` (#7)

* docs: add README

* docs: update `README.md`

* feat: add tensor-like `TensorView` constructor (#10)

* feat!: align runtime API and add runtime dispatch (#11)

* Align runtime API with generated wrappers

* Add default runtime dispatch specialization

* Refactor runtime dispatch namespace

* Use Abseil status for runtime device API

* Revert "Use Abseil status for runtime device API"

This reverts commit a26ddff.

* Address runtime dispatch review feedback

* Keep runtime API list in generator

* Add TensorView constructor guard test

* Align runtime memcpy kind constants with CUDA API

* Use CUDA-style runtime memcpy constants

* Use CUDA-style runtime memcpy constants

* Move TensorView tests back into core test

* Remove standalone TensorView test target

* Remove standalone TensorView test file

* Use fully qualified runtime API names in README

* style: format runtime dispatch test
voltjia added a commit that referenced this pull request Jul 3, 2026
* feat!: align runtime API and add runtime dispatch (#11)

* Align runtime API with generated wrappers

* Add default runtime dispatch specialization

* Refactor runtime dispatch namespace

* Use Abseil status for runtime device API

* Revert "Use Abseil status for runtime device API"

This reverts commit a26ddff.

* Address runtime dispatch review feedback

* Keep runtime API list in generator

* Add TensorView constructor guard test

* Align runtime memcpy kind constants with CUDA API

* Use CUDA-style runtime memcpy constants

* Use CUDA-style runtime memcpy constants

* Move TensorView tests back into core test

* Remove standalone TensorView test target

* Remove standalone TensorView test file

* Use fully qualified runtime API names in README

* style: format runtime dispatch test

* feat: refactor InfiniCore CPU runtime to InfiniRT (#8)

Co-authored-by: Jiacheng Huang <huangjiacheng0709@outlook.com>

* feat: add platform-adaptive runtime tests (#15)

* feat: add runtime backend API foundation (#14)

---------

Co-authored-by: spike-zhu <74974704+spike-zhu@users.noreply.github.com>