polars.DataFrame.top_k#
- DataFrame.top_k( ) DataFrame [source]#
Return the
k
largest rows.Non-null elements are always preferred over null elements, regardless of the value of
reverse
. The output is not guaranteed to be in any particular order, callsort()
after this function if you wish the output to be sorted.- Parameters:
- k
Number of rows to return.
- by
Column(s) used to determine the top rows. Accepts expression input. Strings are parsed as column names.
- reverse
Consider the
k
smallest elements of theby
column(s) (instead of thek
largest). This can be specified per column by passing a sequence of booleans.
See also
Examples
>>> df = pl.DataFrame( ... { ... "a": ["a", "b", "a", "b", "b", "c"], ... "b": [2, 1, 1, 3, 2, 1], ... } ... )
Get the rows which contain the 4 largest values in column b.
>>> df.top_k(4, by="b") shape: (4, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ str ┆ i64 │ ╞═════╪═════╡ │ b ┆ 3 │ │ a ┆ 2 │ │ b ┆ 2 │ │ b ┆ 1 │ └─────┴─────┘
Get the rows which contain the 4 largest values when sorting on column b and a.
>>> df.top_k(4, by=["b", "a"]) shape: (4, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ str ┆ i64 │ ╞═════╪═════╡ │ b ┆ 3 │ │ b ┆ 2 │ │ a ┆ 2 │ │ c ┆ 1 │ └─────┴─────┘