polars.Series.value_counts

Series.value_counts(*, sort: bool = False, parallel: bool = False, name: str | None = None, normalize: bool = False) → DataFrame

Count the occurrences of unique values.

Parameters:

sort: Sort the output by count, in descending order. If False (the default), the order of the output is non-deterministic.
parallel: Execute the computation in parallel. Note: this option should likely not be enabled in a group_by context, because the computation is already parallelized per group.
name: Name of the resulting count column. Defaults to “proportion” if normalize is True, otherwise to “count”.
normalize: If True, return the relative frequency of each unique value instead of its count, normalized so that the frequencies sum to 1.0.

Returns:

DataFrame: One row per unique value. The first column holds the unique values and the second column holds their count (or their proportion, if normalize is True).

Examples

>>> s = pl.Series("color", ["red", "blue", "red", "green", "blue", "blue"])
>>> s.value_counts()
shape: (3, 2)
┌───────┬───────┐
│ color ┆ count │
│ ---   ┆ ---   │
│ str   ┆ u32   │
╞═══════╪═══════╡
│ red   ┆ 2     │
│ green ┆ 1     │
│ blue  ┆ 3     │
└───────┴───────┘

Sort the output by count and customize the count column name:

>>> s.value_counts(sort=True, name="n")
shape: (3, 2)
┌───────┬─────┐
│ color ┆ n   │
│ ---   ┆ --- │
│ str   ┆ u32 │
╞═══════╪═════╡
│ blue  ┆ 3   │
│ red   ┆ 2   │
│ green ┆ 1   │
└───────┴─────┘

Return the count as a relative frequency, normalized to 1.0:

>>> s.value_counts(sort=True, normalize=True, name="fraction")
shape: (3, 2)
┌───────┬──────────┐
│ color ┆ fraction │
│ ---   ┆ ---      │
│ str   ┆ f64      │
╞═══════╪══════════╡
│ blue  ┆ 0.5      │
│ red   ┆ 0.333333 │
│ green ┆ 0.166667 │
└───────┴──────────┘