polars.LazyFrame.collect#
- LazyFrame.collect(
- *,
- type_coercion: bool = True,
- predicate_pushdown: bool = True,
- projection_pushdown: bool = True,
- simplify_expression: bool = True,
- no_optimization: bool = False,
- slice_pushdown: bool = True,
- comm_subplan_elim: bool = True,
- comm_subexpr_elim: bool = True,
- streaming: bool = False,
Collect into a DataFrame.
Note: use
fetch()
if you want to run your query on the first n rows only. This can be a huge time saver in debugging queries.- Parameters:
- type_coercion
Do type coercion optimization.
- predicate_pushdown
Do predicate pushdown optimization.
- projection_pushdown
Do projection pushdown optimization.
- simplify_expression
Run simplify expressions optimization.
- no_optimization
Turn off (certain) optimizations.
- slice_pushdown
Slice pushdown optimization.
- comm_subplan_elim
Will try to cache branching subplans that occur on self-joins or unions.
- comm_subexpr_elim
Common subexpressions will be cached and reused.
- streaming
Run parts of the query in a streaming fashion (this is in an alpha state)
- Returns:
- DataFrame
Examples
>>> lf = pl.LazyFrame( ... { ... "a": ["a", "b", "a", "b", "b", "c"], ... "b": [1, 2, 3, 4, 5, 6], ... "c": [6, 5, 4, 3, 2, 1], ... } ... ) >>> lf.groupby("a", maintain_order=True).agg(pl.all().sum()).collect() shape: (3, 3) ┌─────┬─────┬─────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╡ │ a ┆ 4 ┆ 10 │ │ b ┆ 11 ┆ 10 │ │ c ┆ 6 ┆ 1 │ └─────┴─────┴─────┘