polars.Expr.name.replace
- Expr.name.replace(pattern: str, value: str, *, literal: bool = False) → Expr
Replace matching regex/literal substring in the name with a new value.
- Parameters:
- pattern
A valid regular expression pattern, compatible with the regex crate.
- value
String that will replace the matched substring.
- literal
Treat pattern as a literal string, not a regex.
Notes
To modify regular expression behaviour (such as case-sensitivity) with flags, use the inline (?iLmsuxU) syntax. See the regex crate’s section on grouping and flags for additional information about the use of inline expression modifiers.

The dollar sign ($) is a special character related to capture groups; if you want to replace some target pattern with characters that include a literal $, you should escape it by doubling it up as $$, or set literal=True if you do not need a full regular expression pattern match. Otherwise, you will be referencing a (potentially non-existent) capture group.
Examples
>>> df = pl.DataFrame(
...     {
...         "n_foo": [1, 2, 3],
...         "n_bar": ["x", "y", "z"],
...     }
... )
>>> df.select(pl.all().name.replace(r"^n_", "col_"))
shape: (3, 2)
┌─────────┬─────────┐
│ col_foo ┆ col_bar │
│ ---     ┆ ---     │
│ i64     ┆ str     │
╞═════════╪═════════╡
│ 1       ┆ x       │
│ 2       ┆ y       │
│ 3       ┆ z       │
└─────────┴─────────┘
>>> df.select(pl.all().name.replace(r"(a|e|i|o|u)", "@")).schema
Schema({'n_f@@': Int64, 'n_b@r': String})
Apply case-insensitive string replacement using the (?i) flag.

>>> pl.DataFrame({"Foo": [1], "faz": [2]}).select(
...     pl.all().name.replace(r"(?i)^f", "b")
... )
shape: (1, 2)
┌─────┬─────┐
│ boo ┆ baz │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 2   │
└─────┴─────┘
Capture groups are supported. Use $1 or ${1} in the value string to refer to the first capture group in the pattern, $2 or ${2} to refer to the second capture group, and so on. You can also use named capture groups.

>>> df = pl.DataFrame({"x_1": [1], "x_2": [2], "group_id": ["xyz"]})
>>> df.select(pl.all().name.replace(r"_(\d+)$", ":$1"))
shape: (1, 3)
┌─────┬─────┬──────────┐
│ x:1 ┆ x:2 ┆ group_id │
│ --- ┆ --- ┆ ---      │
│ i64 ┆ i64 ┆ str      │
╞═════╪═════╪══════════╡
│ 1   ┆ 2   ┆ xyz      │
└─────┴─────┴──────────┘
The ${1} form is used to disambiguate the group reference from surrounding text. Without the braces, $1m is read as a reference to a capture group named 1m, which does not exist, so it is replaced with an empty string and both columns end up named s.

>>> df = pl.DataFrame({"hat": [1], "hut": [2]}).with_row_index()
>>> df.with_columns(pl.all().name.replace(r"^h(.)t", "s$1m"))
# ComputeError: the name 's' passed to `LazyFrame.with_columns` is duplicate

>>> df.with_columns(pl.all().name.replace(r"^h(.)t", "s${1}m"))
shape: (1, 5)
┌───────┬─────┬─────┬─────┬─────┐
│ index ┆ hat ┆ hut ┆ sam ┆ sum │
│ ---   ┆ --- ┆ --- ┆ --- ┆ --- │
│ u32   ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═══════╪═════╪═════╪═════╪═════╡
│ 0     ┆ 1   ┆ 2   ┆ 1   ┆ 2   │
└───────┴─────┴─────┴─────┴─────┘