polars.DataFrame.describe

DataFrame.describe(
percentiles: Sequence[float] | float | None = (0.25, 0.5, 0.75),
) -> Self

Summary statistics for a DataFrame.

Parameters:
percentiles

One or more percentiles to include in the summary statistics, expressed as fractions (for example, 0.25 for the 25th percentile). All values must be in the range [0, 1].

See also

glimpse

Notes

The median is included by default as the 50th percentile (the "50%" row).

Examples

>>> from datetime import date
>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "a": [1.0, 2.8, 3.0],
...         "b": [4, 5, None],
...         "c": [True, False, True],
...         "d": [None, "b", "c"],
...         "e": ["usd", "eur", None],
...         "f": [date(2020, 1, 1), date(2021, 1, 1), date(2022, 1, 1)],
...     }
... )
>>> df.describe()
shape: (9, 7)
┌────────────┬──────────┬──────────┬──────────┬──────┬──────┬────────────┐
│ describe   ┆ a        ┆ b        ┆ c        ┆ d    ┆ e    ┆ f          │
│ ---        ┆ ---      ┆ ---      ┆ ---      ┆ ---  ┆ ---  ┆ ---        │
│ str        ┆ f64      ┆ f64      ┆ f64      ┆ str  ┆ str  ┆ str        │
╞════════════╪══════════╪══════════╪══════════╪══════╪══════╪════════════╡
│ count      ┆ 3.0      ┆ 3.0      ┆ 3.0      ┆ 3    ┆ 3    ┆ 3          │
│ null_count ┆ 0.0      ┆ 1.0      ┆ 0.0      ┆ 1    ┆ 1    ┆ 0          │
│ mean       ┆ 2.266667 ┆ 4.5      ┆ 0.666667 ┆ null ┆ null ┆ null       │
│ std        ┆ 1.101514 ┆ 0.707107 ┆ 0.57735  ┆ null ┆ null ┆ null       │
│ min        ┆ 1.0      ┆ 4.0      ┆ 0.0      ┆ b    ┆ eur  ┆ 2020-01-01 │
│ 25%        ┆ 1.0      ┆ 4.0      ┆ null     ┆ null ┆ null ┆ null       │
│ 50%        ┆ 2.8      ┆ 5.0      ┆ null     ┆ null ┆ null ┆ null       │
│ 75%        ┆ 3.0      ┆ 5.0      ┆ null     ┆ null ┆ null ┆ null       │
│ max        ┆ 3.0      ┆ 5.0      ┆ 1.0      ┆ c    ┆ usd  ┆ 2022-01-01 │
└────────────┴──────────┴──────────┴──────────┴──────┴──────┴────────────┘