polars.DataFrame.pivot#
- DataFrame.pivot(
- values: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None,
- index: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None,
- columns: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None,
- aggregate_function: PivotAgg | Expr | None | NoDefault = _NoDefault.no_default,
- *,
- maintain_order: bool = True,
- sort_columns: bool = False,
- separator: str = '_',
Create a spreadsheet-style pivot table as a DataFrame.
- Parameters:
- values
Column values to aggregate. Can be multiple columns if the columns arguments contains multiple columns as well.
- index
One or multiple keys to group by.
- columns
Name of the column(s) whose values will be used as the header of the output DataFrame.
- aggregate_function
Choose from:
None: no aggregation takes place, will raise error if multiple values are in group.
A predefined aggregate function string, one of {‘first’, ‘sum’, ‘max’, ‘min’, ‘mean’, ‘median’, ‘last’, ‘count’}
An expression to do the aggregation.
- maintain_order
Sort the grouped keys so that the output order is predictable.
- sort_columns
Sort the transposed columns by name. Default is by order of discovery.
- separator
Used as separator/delimiter in generated column names.
- Returns:
- DataFrame
Examples
>>> df = pl.DataFrame( ... { ... "foo": ["one", "one", "two", "two", "one", "two"], ... "bar": ["y", "y", "y", "x", "x", "x"], ... "baz": [1, 2, 3, 4, 5, 6], ... } ... ) >>> df.pivot(values="baz", index="foo", columns="bar", aggregate_function="sum") shape: (2, 3) ┌─────┬─────┬─────┐ │ foo ┆ y ┆ x │ │ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╡ │ one ┆ 3 ┆ 5 │ │ two ┆ 3 ┆ 10 │ └─────┴─────┴─────┘
Pivot using selectors to determine the index/values/columns:
>>> import polars.selectors as cs >>> df.pivot( ... values=cs.numeric(), ... index=cs.string(), ... columns=cs.string(), ... aggregate_function="sum", ... sort_columns=True, ... ).sort( ... by=cs.string(), ... ) shape: (4, 6) ┌─────┬─────┬──────┬──────┬──────┬──────┐ │ foo ┆ bar ┆ one ┆ two ┆ x ┆ y │ │ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │ │ str ┆ str ┆ i64 ┆ i64 ┆ i64 ┆ i64 │ ╞═════╪═════╪══════╪══════╪══════╪══════╡ │ one ┆ x ┆ 5 ┆ null ┆ 5 ┆ null │ │ one ┆ y ┆ 3 ┆ null ┆ null ┆ 3 │ │ two ┆ x ┆ null ┆ 10 ┆ 10 ┆ null │ │ two ┆ y ┆ null ┆ 3 ┆ null ┆ 3 │ └─────┴─────┴──────┴──────┴──────┴──────┘
Run an expression as aggregation function
>>> df = pl.DataFrame( ... { ... "col1": ["a", "a", "a", "b", "b", "b"], ... "col2": ["x", "x", "x", "x", "y", "y"], ... "col3": [6, 7, 3, 2, 5, 7], ... } ... ) >>> df.pivot( ... index="col1", ... columns="col2", ... values="col3", ... aggregate_function=pl.element().tanh().mean(), ... ) shape: (2, 3) ┌──────┬──────────┬──────────┐ │ col1 ┆ x ┆ y │ │ --- ┆ --- ┆ --- │ │ str ┆ f64 ┆ f64 │ ╞══════╪══════════╪══════════╡ │ a ┆ 0.998347 ┆ null │ │ b ┆ 0.964028 ┆ 0.999954 │ └──────┴──────────┴──────────┘