Introduce NumpyArray façade over py::array #513

Merged
evertlammerts merged 1 commit into duckdb:main from evertlammerts:prototype/numpy-ndarray-facade on Jun 26, 2026

Conversation

@evertlammerts

Member

Add a thin wrapper class NumpyArray (src/duckdb_py/include/duckdb_python/numpy/numpy_array.hpp) whose single data member is a py::array. This is now the only spot in the codebase that names py::array as the underlying numpy-array representation, so a future migration to nanobind's nb::ndarray is localized to this one header.

The façade exposes Data()/MutableData() (data buffer pointers), an Allocate() factory (dtype + count), a FromObject() factory, an explicit NumpyArray(py::array) constructor (a py::object argument implicitly converts via np.asarray semantics, matching prior behaviour), and GetArray() accessors for .attr(...) calls, iteration, resize, and handing the array back to Python. It is default-constructible, copyable, and movable.
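As a rough illustration of the façade shape described above, here is a minimal sketch that compiles with only the standard library. `BackendArray` is a hypothetical stand-in for `py::array`, `Allocate()` takes an item size where the real factory takes a dtype, and `FillAndSum` is an invented call site; none of this is the code from numpy_array.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// ASSUMPTION: stand-in for py::array so the sketch needs no pybind11.
struct BackendArray {
    std::vector<unsigned char> bytes;
    std::size_t itemsize = 0;
    const void *data() const { return bytes.data(); }
    void *mutable_data() { return bytes.data(); }
    std::size_t size() const { return itemsize ? bytes.size() / itemsize : 0; }
};

// Facade: the only type that names BackendArray directly.
class NumpyArray {
public:
    NumpyArray() = default;
    explicit NumpyArray(BackendArray arr) : array(std::move(arr)) {}

    // Factory: zero-filled buffer holding `count` items of `itemsize` bytes.
    // (The real factory is described as taking a dtype; itemsize is a stand-in.)
    static NumpyArray Allocate(std::size_t itemsize, std::size_t count) {
        assert(itemsize > 0);
        BackendArray arr;
        arr.itemsize = itemsize;
        arr.bytes.assign(itemsize * count, 0);
        return NumpyArray(std::move(arr));
    }

    // Data buffer pointers.
    const void *Data() const { return array.data(); }
    void *MutableData() { return array.mutable_data(); }

    // Escape hatch for backend-specific operations.
    BackendArray &GetArray() { return array; }
    const BackendArray &GetArray() const { return array; }

private:
    BackendArray array; // single data member
};

// Hypothetical call site: writes via MutableData(), reads back via Data().
inline std::int32_t FillAndSum(NumpyArray &arr, std::size_t count) {
    auto *out = static_cast<unsigned char *>(arr.MutableData());
    for (std::size_t i = 0; i < count; i++) {
        std::int32_t value = static_cast<std::int32_t>(i + 1);
        std::memcpy(out + i * sizeof(value), &value, sizeof(value));
    }
    const auto *in = static_cast<const unsigned char *>(arr.Data());
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < count; i++) {
        std::int32_t value = 0;
        std::memcpy(&value, in + i * sizeof(value), sizeof(value));
        sum += value;
    }
    return sum;
}
```

Because callers only touch `Data()`, `MutableData()`, `Allocate()` and `GetArray()`, swapping `BackendArray` for another representation would leave `FillAndSum` unchanged in this sketch.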

Route every direct py::array use through the façade:

  • numpy/raw_array_wrapper.{hpp,cpp}: member + Allocate/MutableData, resize via GetArray()
  • pandas/pandas_bind.hpp (RegisteredArray) and pandas/column/pandas_numpy_column.hpp: members + constructors take NumpyArray
  • numpy/numpy_scan.cpp: scan helpers take NumpyArray&, .data() -> .Data()
  • numpy/numpy_bind.cpp, pandas/bind.cpp: construct NumpyArray instead of py::array; dtype attrs via GetArray()
  • numpy/array_wrapper.cpp (ToArray): move out / bool-check via GetArray()
  • pyconnection.cpp, python_replacement_scan.cpp: py::cast<py::array>(...) -> wrap the object in NumpyArray and use GetArray()
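The call-site changes listed above might look roughly like the following sketch. It assumes a reduced façade and a hypothetical `BackendArray` in place of `py::array`; `ScanSum` and `Shrink` are illustrative names, not functions from numpy_scan.cpp or raw_array_wrapper.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// ASSUMPTION: stand-in for py::array so the sketch needs no pybind11.
struct BackendArray {
    std::vector<double> values;
    const double *data() const { return values.data(); }
    std::size_t size() const { return values.size(); }
    void resize(std::size_t n) { values.resize(n); }
};

// Minimal facade, reduced to what the call sites below need.
class NumpyArray {
public:
    explicit NumpyArray(BackendArray arr) : array(std::move(arr)) {}
    const double *Data() const { return array.data(); }
    BackendArray &GetArray() { return array; }
    const BackendArray &GetArray() const { return array; }

private:
    BackendArray array;
};

// Scan helper after the change: it takes the facade and calls Data(),
// where it would previously have taken the backend type and called data().
inline double ScanSum(const NumpyArray &arr) {
    const double *src = arr.Data();
    double total = 0.0;
    for (std::size_t i = 0; i < arr.GetArray().size(); i++) {
        total += src[i];
    }
    return total;
}

// Backend-specific operations such as resize go through GetArray().
inline void Shrink(NumpyArray &arr, std::size_t n) {
    arr.GetArray().resize(n);
}
```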

@evertlammerts evertlammerts merged commit 5a7cb18 into duckdb:main Jun 26, 2026
15 checks passed
@evertlammerts evertlammerts deleted the prototype/numpy-ndarray-facade branch June 26, 2026 16:53