polars.LazyFrame.collect
LazyFrame.collect(
    *,
    type_coercion: bool = True,
    predicate_pushdown: bool = True,
    projection_pushdown: bool = True,
    simplify_expression: bool = True,
    slice_pushdown: bool = True,
    comm_subplan_elim: bool = True,
    comm_subexpr_elim: bool = True,
    no_optimization: bool = False,
    streaming: bool = False,
    _eager: bool = False,
) → DataFrame
Materialize this LazyFrame into a DataFrame.
By default, all query optimizations are enabled. Individual optimizations may be disabled by setting the corresponding parameter to False.
- Parameters:
  - type_coercion
    Do type coercion optimization.
  - predicate_pushdown
    Do predicate pushdown optimization.
  - projection_pushdown
    Do projection pushdown optimization.
  - simplify_expression
    Run the simplify expressions optimization.
  - slice_pushdown
    Do slice pushdown optimization.
  - comm_subplan_elim
    Try to cache branching subplans that occur in self-joins or unions.
  - comm_subexpr_elim
    Cache and reuse common subexpressions.
  - no_optimization
    Turn off (certain) optimizations.
  - streaming
    Process the query in batches to handle larger-than-memory data. If set to False (default), the entire query is processed in a single batch.
    Warning
    This functionality is currently in an alpha state.
    Note
    Use explain() to see if Polars can process the query in streaming mode.
- Returns:
 - DataFrame
 
See also
- fetch: Run the query on the first n rows only for debugging purposes.
- explain: Print the query plan that is evaluated with collect.
- profile: Collect the LazyFrame and time each node in the computation graph.
- polars.collect_all: Collect multiple LazyFrames at the same time.
- polars.Config.set_streaming_chunk_size: Set the size of streaming batches.
Examples
>>> lf = pl.LazyFrame(
...     {
...         "a": ["a", "b", "a", "b", "b", "c"],
...         "b": [1, 2, 3, 4, 5, 6],
...         "c": [6, 5, 4, 3, 2, 1],
...     }
... )
>>> lf.group_by("a").agg(pl.all().sum()).collect()
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ a   ┆ 4   ┆ 10  │
│ b   ┆ 11  ┆ 10  │
│ c   ┆ 6   ┆ 1   │
└─────┴─────┴─────┘
Collect in streaming mode
>>> lf.group_by("a").agg(pl.all().sum()).collect(
...     streaming=True
... )
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ a   ┆ 4   ┆ 10  │
│ b   ┆ 11  ┆ 10  │
│ c   ┆ 6   ┆ 1   │
└─────┴─────┴─────┘