polars.Expr.str.len_bytes
- Expr.str.len_bytes() → Expr
Return the length of each string as the number of bytes.
- Returns:
- Expr
Expression of data type UInt32.
See also
len_chars
Notes
When working with non-ASCII text, the length in bytes is not the same as the length in characters. You may want to use len_chars() instead. Note that len_bytes() is much more performant (_O(1)_) than len_chars() (_O(n)_).
Examples
>>> import polars as pl
>>> df = pl.DataFrame({"a": ["Café", "345", "東京", None]})
>>> df.with_columns(
...     pl.col("a").str.len_bytes().alias("n_bytes"),
...     pl.col("a").str.len_chars().alias("n_chars"),
... )
shape: (4, 3)
┌──────┬─────────┬─────────┐
│ a    ┆ n_bytes ┆ n_chars │
│ ---  ┆ ---     ┆ ---     │
│ str  ┆ u32     ┆ u32     │
╞══════╪═════════╪═════════╡
│ Café ┆ 5       ┆ 4       │
│ 345  ┆ 3       ┆ 3       │
│ 東京 ┆ 6       ┆ 2       │
│ null ┆ null    ┆ null    │
└──────┴─────────┴─────────┘
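The byte counts in the example follow from UTF-8 encoding, where a non-ASCII character takes more than one byte. The sketch below reproduces the same numbers in plain Python, without Polars. The helper names `utf8_len_bytes` and `code_point_len` are hypothetical and only illustrate the two counts; they are not part of the Polars API, and they say nothing about how Polars computes the values internally.

```python
def utf8_len_bytes(s):
    """Hypothetical helper: number of UTF-8 bytes in s, or None if s is missing."""
    return None if s is None else len(s.encode("utf-8"))


def code_point_len(s):
    """Hypothetical helper: number of Unicode code points in s, or None if s is missing."""
    return None if s is None else len(s)


# "Caf\u00e9" is "Café" written with the precomposed character U+00E9.
values = ["Caf\u00e9", "345", "東京", None]

n_bytes = [utf8_len_bytes(v) for v in values]
n_chars = [code_point_len(v) for v in values]

print(n_bytes)  # [5, 3, 6, None]
print(n_chars)  # [4, 3, 2, None]
```

Here "é" occupies 2 bytes and each of the two kanji occupies 3 bytes, which is why the byte and character counts differ for "Café" and "東京" but agree for the ASCII-only "345".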