polars.DataFrame.unpivot#

DataFrame.unpivot(
on: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None = None,
*,
index: ColumnNameOrSelector | Sequence[ColumnNameOrSelector] | None = None,
variable_name: str | None = None,
value_name: str | None = None,
) DataFrame[source]#

Unpivot a DataFrame from wide to long format.

Optionally leaves identifiers set.

This function is useful to massage a DataFrame into a format where one or more columns are identifier variables (index) while all other columns, considered measured variables (on), are “unpivoted” to the row axis leaving just two non-identifier columns, ‘variable’ and ‘value’.

Parameters:
on

Column(s) or selector(s) to use as values variables; if on is empty no columns will be used. If set to None (default) all columns that are not in index will be used.

index

Column(s) or selector(s) to use as identifier variables.

variable_name

Name to give to the variable column. Defaults to “variable”

value_name

Name to give to the value column. Defaults to “value”

Notes

If you’re coming from pandas, this is similar to pandas.DataFrame.melt, but with index replacing id_vars and on replacing value_vars. In other frameworks, you might know this operation as pivot_longer.

The resulting row order is unspecified.

Examples

>>> df = pl.DataFrame(
...     {
...         "a": ["x", "y", "z"],
...         "b": [1, 3, 5],
...         "c": [2, 4, 6],
...     }
... )
>>> import polars.selectors as cs
>>> df.unpivot(cs.numeric(), index="a")
shape: (6, 3)
┌─────┬──────────┬───────┐
│ a   ┆ variable ┆ value │
│ --- ┆ ---      ┆ ---   │
│ str ┆ str      ┆ i64   │
╞═════╪══════════╪═══════╡
│ x   ┆ b        ┆ 1     │
│ y   ┆ b        ┆ 3     │
│ z   ┆ b        ┆ 5     │
│ x   ┆ c        ┆ 2     │
│ y   ┆ c        ┆ 4     │
│ z   ┆ c        ┆ 6     │
└─────┴──────────┴───────┘