polars.map_batches

polars.map_batches(
    exprs: Sequence[str | Expr],
    function: Callable[[Sequence[Series]], Series | Any],
    return_dtype: PolarsDataType | DataTypeExpr | None = None,
    *,
    is_elementwise: bool = False,
    returns_scalar: bool = False,
    _is_ufunc: bool = False,
) -> Expr

Map a custom function over multiple columns/expressions.

Produces a single Series result.

Parameters:
exprs

Expression(s) or column name(s) representing the input Series passed to the function.

function

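Function to apply over the input. It receives the input Series as a sequence, one Series per entry in exprs, and returns a Series or a value that can be converted to one.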

return_dtype

Data type of the output Series.

is_elementwise

Set to True if the operation is elementwise, for better performance and optimization.

An elementwise operation has unit or equal length for all inputs and can be run sequentially on slices without affecting the results.

returns_scalar

If the function returns a scalar, it is wrapped in a list in the output by default, because the function is assumed to always return something Series-like. Set this argument to True to keep the result as a scalar.

Returns:
Expr

Expression with the data type given by return_dtype.

Examples

>>> import polars as pl
>>> def test_func(a, b, c):
...     return a + b + c
>>> df = pl.DataFrame(
...     {
...         "a": [1, 2, 3, 4],
...         "b": [4, 5, 6, 7],
...     }
... )
>>>
>>> df.with_columns(
...     (
...         pl.struct(["a", "b"]).map_batches(
...             lambda x: test_func(x.struct.field("a"), x.struct.field("b"), 1)
...         )
...     ).alias("a+b+c")
... )
shape: (4, 3)
┌─────┬─────┬───────┐
│ a   ┆ b   ┆ a+b+c │
│ --- ┆ --- ┆ ---   │
│ i64 ┆ i64 ┆ i64   │
╞═════╪═════╪═══════╡
│ 1   ┆ 4   ┆ 6     │
│ 2   ┆ 5   ┆ 8     │
│ 3   ┆ 6   ┆ 10    │
│ 4   ┆ 7   ┆ 12    │
└─────┴─────┴───────┘