polars.Expr.str.split

Expr.str.split(by: IntoExpr, *, inclusive: bool = False) → Expr

Split the string by a substring.

Parameters:
by

Substring to split by. May be a literal string or an expression, such as a column that holds a separator for each row.

inclusive

If True, keep the separator in the results, attached to the end of the piece that precedes it.

Returns:
Expr

Expression of data type List(Utf8) (displayed as list[str]).

Examples

>>> import polars as pl
>>> df = pl.DataFrame({"s": ["foo bar", "foo_bar", "foo_bar_baz"]})
>>> df.with_columns(
...     pl.col("s").str.split(by="_").alias("split"),
...     pl.col("s").str.split(by="_", inclusive=True).alias("split_inclusive"),
... )
shape: (3, 3)
┌─────────────┬───────────────────────┬─────────────────────────┐
│ s           ┆ split                 ┆ split_inclusive         │
│ ---         ┆ ---                   ┆ ---                     │
│ str         ┆ list[str]             ┆ list[str]               │
╞═════════════╪═══════════════════════╪═════════════════════════╡
│ foo bar     ┆ ["foo bar"]           ┆ ["foo bar"]             │
│ foo_bar     ┆ ["foo", "bar"]        ┆ ["foo_", "bar"]         │
│ foo_bar_baz ┆ ["foo", "bar", "baz"] ┆ ["foo_", "bar_", "baz"] │
└─────────────┴───────────────────────┴─────────────────────────┘
>>> df = pl.DataFrame(
...     {"s": ["foo^bar", "foo_bar", "foo*bar*baz"], "by": ["_", "_", "*"]}
... )
>>> df.with_columns(
...     pl.col("s").str.split(by=pl.col("by")).alias("split"),
...     pl.col("s")
...     .str.split(by=pl.col("by"), inclusive=True)
...     .alias("split_inclusive"),
... )
shape: (3, 4)
┌─────────────┬─────┬───────────────────────┬─────────────────────────┐
│ s           ┆ by  ┆ split                 ┆ split_inclusive         │
│ ---         ┆ --- ┆ ---                   ┆ ---                     │
│ str         ┆ str ┆ list[str]             ┆ list[str]               │
╞═════════════╪═════╪═══════════════════════╪═════════════════════════╡
│ foo^bar     ┆ _   ┆ ["foo^bar"]           ┆ ["foo^bar"]             │
│ foo_bar     ┆ _   ┆ ["foo", "bar"]        ┆ ["foo_", "bar"]         │
│ foo*bar*baz ┆ *   ┆ ["foo", "bar", "baz"] ┆ ["foo*", "bar*", "baz"] │
└─────────────┴─────┴───────────────────────┴─────────────────────────┘
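The semantics shown in the tables above can be approximated in plain Python. This is only a sketch for reasoning about the `inclusive` flag and per-row separators, not how Polars implements the expression. The helper name `split_sketch` is hypothetical, and the sketch is only checked against the rows shown above; edge cases such as an empty separator, a trailing separator, or null values may behave differently in Polars.

```python
from __future__ import annotations


def split_sketch(s: str, by: str, inclusive: bool = False) -> list[str]:
    """Plain-Python approximation of the split behaviour shown in the examples.

    Hypothetical helper for illustration only; not part of the Polars API.
    """
    parts = s.split(by)
    if not inclusive:
        return parts
    # Re-attach the separator to every piece except the last one.
    return [part + by for part in parts[:-1]] + [parts[-1]]


# Rows from the second example: each string is paired with its own separator.
rows = [("foo^bar", "_"), ("foo_bar", "_"), ("foo*bar*baz", "*")]

split = [split_sketch(s, by) for s, by in rows]
split_inclusive = [split_sketch(s, by, inclusive=True) for s, by in rows]

print(split)
print(split_inclusive)
```

A string that does not contain its separator comes back as a single-element list in both modes, which matches the `foo^bar` row.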