polars.Series.str.replace_all

Series.str.replace_all(pattern: str, value: str, *, literal: bool = False) → Series

Replace all matching regex/literal substrings with a new string value.

Parameters:
pattern

A valid regular expression pattern, compatible with the regex crate.

value

String that will replace the matched substring.

literal

Treat pattern as a literal string rather than as a regular expression.

See also

replace

Notes

  • To modify regular expression behaviour (such as case-sensitivity) with flags, use the inline (?iLmsuxU) syntax. See the regex crate’s section on grouping and flags for additional information about the use of inline expression modifiers.

  • The dollar sign ($) is a special character related to capture groups; if you want to replace some target pattern with characters that include a literal $ you should escape it by doubling it up as $$, or set literal=True if you do not need a full regular expression pattern match. Otherwise, you will be referencing a (potentially non-existent) capture group.

    In the example below we need to double up $ (to represent a literal dollar sign) and then refer to the capture group using $n or ${n}, hence the three consecutive $ characters in the replacement value:

    >>> s = pl.Series("cost", ["#12.34", "#56.78"])
    >>> s.str.replace_all(r"#(\d+)", "$$${1}").alias("cost_usd")
    shape: (2,)
    Series: 'cost_usd' [str]
    [
        "$12.34"
        "$56.78"
    ]
    

Examples

>>> s = pl.Series(["123abc", "abc456"])
>>> s.str.replace_all(r"abc\b", "ABC")
shape: (2,)
Series: '' [str]
[
    "123ABC"
    "abc456"
]

Capture groups are supported. Use $1 or ${1} in the value string to refer to the first capture group in the pattern, $2 or ${2} to refer to the second capture group, and so on. You can also use named capture groups.

>>> s = pl.Series(["hat", "hut"])
>>> s.str.replace_all("h(.)t", "b${1}d")
shape: (2,)
Series: '' [str]
[
    "bad"
    "bud"
]
>>> s.str.replace_all("h(?<vowel>.)t", "b${vowel}d")
shape: (2,)
Series: '' [str]
[
    "bad"
    "bud"
]

Apply case-insensitive string replacement using the (?i) flag.

>>> s = pl.Series("weather", ["Foggy", "Rainy", "Sunny"])
>>> s.str.replace_all(r"(?i)foggy|rainy", "Sunny")
shape: (3,)
Series: 'weather' [str]
[
    "Sunny"
    "Sunny"
    "Sunny"
]