polars.Series.str.replace

Series.str.replace(
pattern: str,
value: str,
*,
literal: bool = False,
n: int = 1,
) -> Series

Replace first matching regex/literal substring with a new string value.

Parameters:
pattern

A valid regular expression pattern, compatible with the regex crate.

value

String that will replace the matched substring.

literal

Treat pattern as a literal string rather than a regular expression.

n

Number of matches to replace.

See also

replace_all

Notes

  • To modify regular expression behaviour (such as case-sensitivity) with flags, use the inline (?iLmsuxU) syntax. See the regex crate’s section on grouping and flags for additional information about the use of inline expression modifiers.

  • The dollar sign ($) is a special character related to capture groups. To replace a target pattern with a value that includes a literal $, escape it by doubling it as $$, or set literal=True if you do not need a full regular expression pattern match. Otherwise, the $ is read as a reference to a (potentially non-existent) capture group.

    If not escaped, the $0 in the replacement value (below) refers to capture group 0, which is the entire match:

    >>> s = pl.Series("cents", ["000.25", "00.50", "0.75"])
    >>> s.str.replace(r"^(0+)\.", "$0.")
    shape: (3,)
    Series: 'cents' [str]
    [
      "000..25"
      "00..50"
      "0..75"
    ]
    

    To make $ a literal character, double it as $$ (or, for simpler find/replace operations, set literal=True if you do not require a full regular expression match):

    >>> s.str.replace(r"^(0+)\.", "$$0.")
    shape: (3,)
    Series: 'cents' [str]
    [
      "$0.25"
      "$0.50"
      "$0.75"
    ]
    

Examples

>>> s = pl.Series(["123abc", "abc456"])
>>> s.str.replace(r"abc\b", "ABC")
shape: (2,)
Series: '' [str]
[
    "123ABC"
    "abc456"
]

Capture groups are supported. Use $1 or ${1} in the value string to refer to the first capture group in the pattern, $2 or ${2} to refer to the second capture group, and so on. You can also use named capture groups.

>>> s = pl.Series(["hat", "hut"])
>>> s.str.replace("h(.)t", "b${1}d")
shape: (2,)
Series: '' [str]
[
    "bad"
    "bud"
]
>>> s.str.replace("h(?<vowel>.)t", "b${vowel}d")
shape: (2,)
Series: '' [str]
[
    "bad"
    "bud"
]

Apply case-insensitive string replacement using the (?i) flag.

>>> s = pl.Series("weather", ["Foggy", "Rainy", "Sunny"])
>>> s.str.replace(r"(?i)foggy|rainy", "Sunny")
shape: (3,)
Series: 'weather' [str]
[
    "Sunny"
    "Sunny"
    "Sunny"
]