polars.approx_n_unique
polars.approx_n_unique(*columns: str) → Expr
Approximate count of unique values.
This function is syntactic sugar for pl.col(columns).approx_n_unique(), and uses the HyperLogLog++ algorithm for cardinality estimation.
Parameters:
- columns
One or more column names.
Examples
>>> df = pl.DataFrame(
...     {
...         "a": [1, 8, 1],
...         "b": [4, 5, 2],
...         "c": ["foo", "bar", "foo"],
...     }
... )
>>> df.select(pl.approx_n_unique("a"))
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ u32 │
╞═════╡
│ 2   │
└─────┘
>>> df.select(pl.approx_n_unique("b", "c"))
shape: (1, 2)
┌─────┬─────┐
│ b   ┆ c   │
│ --- ┆ --- │
│ u32 ┆ u32 │
╞═════╪═════╡
│ 3   ┆ 2   │
└─────┴─────┘