polars.DataFrame.to_numpy

DataFrame.to_numpy(structured: bool = False, *, order: IndexOrder = 'fortran') → np.ndarray[Any, Any]

Convert DataFrame to a 2D NumPy array.

This operation clones data.

Parameters:

structured: Optionally return a structured array, with field names and dtypes that correspond to the DataFrame schema.
order: The index order of the returned NumPy array, either C-like or Fortran-like. In general, using the Fortran-like index order is faster. However, the C-like order might be more appropriate to use for downstream applications to prevent cloning data, e.g. when reshaping into a one-dimensional array. Note that this option only takes effect if structured is set to False and the DataFrame dtypes allow for a global dtype for all columns.

Notes

Converting Utf8 (string) columns to an array requires pyarrow to be installed.

Examples

>>> df = pl.DataFrame(
...     {
...         "foo": [1, 2, 3],
...         "bar": [6.5, 7.0, 8.5],
...         "ham": ["a", "b", "c"],
...     },
...     schema_overrides={"foo": pl.UInt8, "bar": pl.Float32},
... )

Export to a standard 2D NumPy array.

>>> df.to_numpy()
array([[1, 6.5, 'a'],
       [2, 7.0, 'b'],
       [3, 8.5, 'c']], dtype=object)

Export to a structured array, which can better preserve individual column data, such as name and dtype…

>>> df.to_numpy(structured=True)
array([(1, 6.5, 'a'), (2, 7. , 'b'), (3, 8.5, 'c')],
      dtype=[('foo', 'u1'), ('bar', '<f4'), ('ham', '<U1')])

…optionally zero-copying as a record array view:

>>> import numpy as np
>>> df.to_numpy(structured=True).view(np.recarray)
rec.array([(1, 6.5, 'a'), (2, 7. , 'b'), (3, 8.5, 'c')],
          dtype=[('foo', 'u1'), ('bar', '<f4'), ('ham', '<U1')])