polars.Expr.value_counts

Expr.value_counts(
*,
sort: bool = False,
parallel: bool = False,
name: str | None = None,
normalize: bool = False,
) → Expr

Count the occurrences of unique values.

Parameters:
sort

Sort the output by count in descending order. If set to False (default), the order of the output is unspecified and may differ between runs.

parallel

Execute the computation in parallel.

Note

This option should likely not be enabled in a group by context, as the computation is already parallelized per group.

name

Name to give the resulting count field. Defaults to “proportion” if normalize is True, otherwise to “count”.

normalize

If True, return the relative frequency of each unique value instead of its absolute count.

Returns:
Expr

Expression of data type Struct that maps each unique value to its count (or to its relative frequency if normalize is True).

Examples

>>> df = pl.DataFrame(
...     {"color": ["red", "blue", "red", "green", "blue", "blue"]}
... )
>>> df.select(pl.col("color").value_counts())  
shape: (3, 1)
┌─────────────┐
│ color       │
│ ---         │
│ struct[2]   │
╞═════════════╡
│ {"red",2}   │
│ {"green",1} │
│ {"blue",3}  │
└─────────────┘

Sort the output by (descending) count and customize the count field name.

>>> df = df.select(pl.col("color").value_counts(sort=True, name="n"))
>>> df
shape: (3, 1)
┌─────────────┐
│ color       │
│ ---         │
│ struct[2]   │
╞═════════════╡
│ {"blue",3}  │
│ {"red",2}   │
│ {"green",1} │
└─────────────┘
>>> df.unnest("color")
shape: (3, 2)
┌───────┬─────┐
│ color ┆ n   │
│ ---   ┆ --- │
│ str   ┆ u32 │
╞═══════╪═════╡
│ blue  ┆ 3   │
│ red   ┆ 2   │
│ green ┆ 1   │
└───────┴─────┘