polars.Series.apply

Series.apply(
function: Callable[[Any], Any],
return_dtype: PolarsDataType | None = None,
*,
skip_nulls: bool = True,
) -> Self

Apply a custom/user-defined function (UDF) over elements in this Series.

Warning

This method is much slower than the native expressions API. Only use it if you cannot implement your logic otherwise.

If the function returns a datatype different from that of this Series, set the return_dtype argument; otherwise the method will fail.

Implementing logic using a Python function is almost always _significantly_ slower and more memory intensive than implementing the same logic using the native expression API because:

  • The native expression engine runs in Rust; UDFs run in Python.

  • Use of Python UDFs forces the DataFrame to be materialized in memory.

  • Polars-native expressions can be parallelised (UDFs typically cannot).

  • Polars-native expressions can be logically optimised (UDFs cannot).

Wherever possible you should strongly prefer the native expression API to achieve the best performance.

Parameters:
function

Custom function or lambda.

return_dtype

Output datatype. If none is given, the same datatype as this Series will be used.

skip_nulls

Nulls will be skipped and not passed to the Python function. This is faster because the Python call is avoided for null values and because more specialized functions can be used.

Returns:
Series

Warning

If return_dtype is not provided, this may lead to unexpected results. We allow this, but it is considered a bug in the user’s query.

Notes

If your function is expensive and you don’t want it to be called more than once for a given input, consider applying an @lru_cache decorator to it. With suitable data you may achieve order-of-magnitude speedups (or more).
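
The caching idea can be sketched in plain Python, with a list comprehension standing in for the per-element calls that apply makes. The function `expensive` and its body are placeholders for a genuinely costly computation; a function decorated this way could be passed to apply like any other callable.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def expensive(x):
    # Stand-in for a costly computation; the body runs once per distinct input.
    return x * x


values = [2, 3, 2, 2, 3]
results = [expensive(v) for v in values]

# cache_info() reports how many calls were computed versus served from cache.
info = expensive.cache_info()
```

Here five calls trigger only two real computations, one per distinct input. Note that lru_cache requires hashable arguments, so this only helps when the element values are hashable and repeat often.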

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.apply(lambda x: x + 10)  
shape: (3,)
Series: 'a' [i64]
[
        11
        12
        13
]