NumPy
Polars expressions support NumPy ufuncs. See the NumPy documentation for a list of all supported NumPy functions.
This means that if Polars does not provide a function, we can use NumPy instead and still get fast columnar operations through the NumPy API.
Example
API reference: DataFrame · log (available on feature numpy)
import polars as pl
import numpy as np
df = pl.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
out = df.select(np.log(pl.all()).name.suffix("_log"))
print(out)
shape: (3, 2)
┌──────────┬──────────┐
│ a_log ┆ b_log │
│ --- ┆ --- │
│ f64 ┆ f64 │
╞══════════╪══════════╡
│ 0.0 ┆ 1.386294 │
│ 0.693147 ┆ 1.609438 │
│ 1.098612 ┆ 1.791759 │
└──────────┴──────────┘
Interoperability
Polars Series support NumPy universal functions (ufuncs) and generalized ufuncs. Element-wise functions such as np.exp(), np.cos(), np.divide(), etc. all work with almost zero overhead.
One Polars-specific caveat: missing values are stored in a separate bitmask and are not visible to NumPy. This can cause a window function or np.convolve() to give flawed or incomplete results, so an error is raised if you pass a Series with missing data to a generalized ufunc.
Convert a Polars Series to a NumPy array with the .to_numpy() method. Missing values will be replaced by np.nan during the conversion.