polars.Expr.replace
Expr.replace(
    mapping: dict[Any, Any],
    *,
    default: Any = _NoDefault.no_default,
    return_dtype: PolarsDataType | None = None,
)
Replace values according to the given mapping.
Needs a global string cache for lazily evaluated queries on columns of type Categorical.
- Parameters:
- mapping
Mapping of values to their replacement.
- default
Value to use when the mapping does not contain the lookup value. Defaults to keeping the original value. Accepts expression input. Non-expression inputs are parsed as literals.
- return_dtype
Set return dtype to override automatic return dtype determination.
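The lookup rule described by the parameters above can be sketched in plain Python. This is only a model of the documented behavior on a list, not how Polars implements it; `replace_values` and `_NO_DEFAULT` are hypothetical names introduced for illustration, with `_NO_DEFAULT` standing in for the `_NoDefault.no_default` marker in the signature.

```python
from typing import Any

# Hypothetical sentinel: lets us tell "no default given" apart from default=None.
_NO_DEFAULT = object()


def replace_values(
    values: list[Any],
    mapping: dict[Any, Any],
    default: Any = _NO_DEFAULT,
) -> list[Any]:
    """Model the per-value lookup rule on a plain Python list."""
    out = []
    for value in values:
        if value in mapping:
            # The mapping contains the lookup value: use its replacement.
            out.append(mapping[value])
        elif default is _NO_DEFAULT:
            # No default given: keep the original value.
            out.append(value)
        else:
            # A default was given (it may itself be None): use it.
            out.append(default)
    return out


# Without a default, unmapped values are kept.
kept = replace_values([1, 2, 2, 3], {2: 100})

# With default=None, unmapped values become None.
country_map = {"CA": "Canada", "DE": "Germany", "FR": "France", None: "unspecified"}
defaulted = replace_values(["FR", "ES", "DE", None], country_map, default=None)
```

A sentinel is used in this sketch because `None` is itself a meaningful default, so it cannot double as the "no default" marker.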
Examples
Replace a single value by another value. Values not in the mapping remain unchanged.
>>> df = pl.DataFrame({"a": [1, 2, 2, 3]})
>>> df.with_columns(pl.col("a").replace({2: 100}).alias("replaced"))
shape: (4, 2)
┌─────┬──────────┐
│ a   ┆ replaced │
│ --- ┆ ---      │
│ i64 ┆ i64      │
╞═════╪══════════╡
│ 1   ┆ 1        │
│ 2   ┆ 100      │
│ 2   ┆ 100      │
│ 3   ┆ 3        │
└─────┴──────────┘
Replace multiple values. Specify a default to set values not in the given map to the default value.
>>> df = pl.DataFrame({"country_code": ["FR", "ES", "DE", None]})
>>> country_code_map = {
...     "CA": "Canada",
...     "DE": "Germany",
...     "FR": "France",
...     None: "unspecified",
... }
>>> df.with_columns(
...     pl.col("country_code")
...     .replace(country_code_map, default=None)
...     .alias("replaced")
... )
shape: (4, 2)
┌──────────────┬─────────────┐
│ country_code ┆ replaced    │
│ ---          ┆ ---         │
│ str          ┆ str         │
╞══════════════╪═════════════╡
│ FR           ┆ France      │
│ ES           ┆ null        │
│ DE           ┆ Germany     │
│ null         ┆ unspecified │
└──────────────┴─────────────┘
The return type can be overridden with the return_dtype argument.
>>> df = df.with_row_count()
>>> df.select(
...     "row_nr",
...     pl.col("row_nr")
...     .replace({1: 10, 2: 20}, default=0, return_dtype=pl.UInt8)
...     .alias("replaced"),
... )
shape: (4, 2)
┌────────┬──────────┐
│ row_nr ┆ replaced │
│ ---    ┆ ---      │
│ u32    ┆ u8       │
╞════════╪══════════╡
│ 0      ┆ 0        │
│ 1      ┆ 10       │
│ 2      ┆ 20       │
│ 3      ┆ 0        │
└────────┴──────────┘
To reference other columns as a default value, a struct column must be constructed first. The first field must be the column in which values are replaced. The other columns can be used in the default expression.
>>> df.with_columns(
...     pl.struct("country_code", "row_nr")
...     .replace(
...         mapping=country_code_map,
...         default=pl.col("row_nr").cast(pl.Utf8),
...     )
...     .alias("replaced")
... )
shape: (4, 3)
┌────────┬──────────────┬─────────────┐
│ row_nr ┆ country_code ┆ replaced    │
│ ---    ┆ ---          ┆ ---         │
│ u32    ┆ str          ┆ str         │
╞════════╪══════════════╪═════════════╡
│ 0      ┆ FR           ┆ France      │
│ 1      ┆ ES           ┆ 1           │
│ 2      ┆ DE           ┆ Germany     │
│ 3      ┆ null         ┆ unspecified │
└────────┴──────────────┴─────────────┘