polars.Series.str.replace_all
- Series.str.replace_all(pattern: str, value: str, *, literal: bool = False) → Series
Replace all matching regex/literal substrings with a new string value.
- Parameters:
- pattern
A valid regular expression pattern, compatible with the regex crate.
- value
String that will replace the matched substring.
- literal
Treat pattern as a literal string.
Notes
To modify regular expression behaviour (such as case-sensitivity) with flags, use the inline (?iLmsuxU) syntax. See the regex crate's section on grouping and flags for additional information about the use of inline expression modifiers.

The dollar sign ($) is a special character related to capture groups. If you want to replace a target pattern with a value that includes a literal $, escape it by doubling it up as $$, or set literal=True if you do not need a full regular expression pattern match. Otherwise, you will be referencing a (potentially non-existent) capture group.

In the example below we need to double up $ to represent a literal dollar sign, and then refer to the capture group using $n or ${n}, hence the three consecutive $ characters in the replacement value:

>>> s = pl.Series("cost", ["#12.34", "#56.78"])
>>> s.str.replace_all(r"#(\d+)", "$$${1}").alias("cost_usd")
shape: (2,)
Series: 'cost_usd' [str]
[
    "$12.34"
    "$56.78"
]
Examples
>>> s = pl.Series(["123abc", "abc456"])
>>> s.str.replace_all(r"abc\b", "ABC")
shape: (2,)
Series: '' [str]
[
    "123ABC"
    "abc456"
]
Capture groups are supported. Use $1 or ${1} in the value string to refer to the first capture group in the pattern, $2 or ${2} to refer to the second capture group, and so on. You can also use named capture groups.

>>> s = pl.Series(["hat", "hut"])
>>> s.str.replace_all("h(.)t", "b${1}d")
shape: (2,)
Series: '' [str]
[
    "bad"
    "bud"
]
>>> s.str.replace_all("h(?<vowel>.)t", "b${vowel}d")
shape: (2,)
Series: '' [str]
[
    "bad"
    "bud"
]
Apply case-insensitive string replacement using the (?i) flag.

>>> s = pl.Series("weather", ["Foggy", "Rainy", "Sunny"])
>>> s.str.replace_all(r"(?i)foggy|rainy", "Sunny")
shape: (3,)
Series: 'weather' [str]
[
    "Sunny"
    "Sunny"
    "Sunny"
]