polars.LazyFrame.collect
LazyFrame.collect(
    *,
    type_coercion: bool = True,
    predicate_pushdown: bool = True,
    projection_pushdown: bool = True,
    simplify_expression: bool = True,
    slice_pushdown: bool = True,
    comm_subplan_elim: bool = True,
    comm_subexpr_elim: bool = True,
    cluster_with_columns: bool = True,
    no_optimization: bool = False,
    streaming: bool = False,
    engine: EngineType = 'cpu',
    background: bool = False,
    _eager: bool = False,
    **_kwargs: Any,
)
Materialize this LazyFrame into a DataFrame.
By default, all query optimizations are enabled. Individual optimizations may be disabled by setting the corresponding parameter to False.
- Parameters:
- type_coercion
Do type coercion optimization.
- predicate_pushdown
Do predicate pushdown optimization.
- projection_pushdown
Do projection pushdown optimization.
- simplify_expression
Run simplify expressions optimization.
- slice_pushdown
Do slice pushdown optimization.
- comm_subplan_elim
Try to cache branching subplans that occur on self-joins or unions.
- comm_subexpr_elim
Common subexpressions will be cached and reused.
- cluster_with_columns
Combine sequential independent calls to with_columns.
- no_optimization
Turn off (certain) optimizations.
- streaming
Process the query in batches to handle larger-than-memory data. If set to False (default), the entire query is processed in a single batch.
Warning
Streaming mode is considered unstable. It may be changed at any point without it being considered a breaking change.
Note
Use explain() to see if Polars can process the query in streaming mode.
- engine
Select the engine used to process the query, optional. If set to "cpu" (default), the query is run using the Polars CPU engine. If set to "gpu", the GPU engine is used. Fine-grained control over the GPU engine, for example which device to use on a system with multiple devices, is possible by providing a GPUEngine object with configuration options.
Note
GPU mode is considered unstable. Not all queries will run successfully on the GPU; however, they should fall back transparently to the default engine if execution is not supported.
Running with POLARS_VERBOSE=1 will provide information if a query falls back (and why).
Note
The GPU engine does not support streaming or running in the background. If either is enabled, GPU execution is switched off.
- background
Run the query in the background and get a handle to the query. This handle can be used to fetch the result or cancel the query.
Warning
Background mode is considered unstable. It may be changed at any point without it being considered a breaking change.
- Returns:
- DataFrame
See also
fetch
Run the query on the first n rows only for debugging purposes.
explain
Print the query plan that is evaluated with collect.
profile
Collect the LazyFrame and time each node in the computation graph.
polars.collect_all
Collect multiple LazyFrames at the same time.
polars.Config.set_streaming_chunk_size
Set the size of streaming batches.
Examples
>>> import polars as pl
>>> lf = pl.LazyFrame(
...     {
...         "a": ["a", "b", "a", "b", "b", "c"],
...         "b": [1, 2, 3, 4, 5, 6],
...         "c": [6, 5, 4, 3, 2, 1],
...     }
... )
>>> lf.group_by("a").agg(pl.all().sum()).collect()
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ a   ┆ 4   ┆ 10  │
│ b   ┆ 11  ┆ 10  │
│ c   ┆ 6   ┆ 1   │
└─────┴─────┴─────┘
Collect in streaming mode
>>> lf.group_by("a").agg(pl.all().sum()).collect(
...     streaming=True
... )
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ a   ┆ 4   ┆ 10  │
│ b   ┆ 11  ┆ 10  │
│ c   ┆ 6   ┆ 1   │
└─────┴─────┴─────┘
Collect in GPU mode
>>> lf.group_by("a").agg(pl.all().sum()).collect(engine="gpu")
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ b   ┆ 11  ┆ 10  │
│ a   ┆ 4   ┆ 10  │
│ c   ┆ 6   ┆ 1   │
└─────┴─────┴─────┘
With control over the device used
>>> lf.group_by("a").agg(pl.all().sum()).collect(
...     engine=pl.GPUEngine(device=1)
... )
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ b   ┆ 11  ┆ 10  │
│ a   ┆ 4   ┆ 10  │
│ c   ┆ 6   ┆ 1   │
└─────┴─────┴─────┘