polars.Series.map_elements
- Series.map_elements(function: Callable[[Any], Any], return_dtype: PolarsDataType | None = None, *, skip_nulls: bool = True)
Map a custom/user-defined function (UDF) over elements in this Series.
Warning
This method is much slower than the native expressions API. Only use it if you cannot implement your logic otherwise.
If the function returns a different datatype than the input Series, set the return_dtype argument; otherwise the method will fail.
Implementing logic using a Python function is almost always significantly slower and more memory intensive than implementing the same logic using the native expression API because:
- The native expression engine runs in Rust; UDFs run in Python.
- Use of Python UDFs forces the DataFrame to be materialized in memory.
- Polars-native expressions can be parallelised (UDFs typically cannot).
- Polars-native expressions can be logically optimised (UDFs cannot).
Wherever possible you should strongly prefer the native expression API to achieve the best performance.
- Parameters:
- function
Custom function or lambda.
- return_dtype
Output datatype. If none is given, the same datatype as this Series will be used.
- skip_nulls
Nulls will be skipped and not passed to the Python function. This is faster because no Python call is made for null values and because more specialized functions can be used.
- Returns:
- Series
Warning
If return_dtype is not provided, this may lead to unexpected results. We allow this, but it is considered a bug in the user’s query.
Notes
If your function is expensive and you don’t want it to be called more than once for a given input, consider applying an @lru_cache decorator to it. If your data is suitable you may achieve significant speedups.
Examples
>>> s = pl.Series("a", [1, 2, 3])
>>> s.map_elements(lambda x: x + 10)
shape: (3,)
Series: 'a' [i64]
[
        11
        12
        13
]