polars.LazyFrame.collect_async
- LazyFrame.collect_async(
- *,
- gevent: bool = False,
- engine: EngineType = 'auto',
- optimizations: QueryOptFlags = (),
Collect DataFrame asynchronously in thread pool.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
Collects into a DataFrame (like collect()), but instead of returning the DataFrame directly, the collection is scheduled inside a thread pool and this method returns almost instantly.
This can be useful if you use gevent or asyncio and want to release control to other greenlets/tasks while LazyFrames are being collected.
- Parameters:
- gevent
Return a wrapper around gevent.event.AsyncResult instead of an Awaitable.
- engine
Select the engine used to process the query (default "auto"):
  - "auto": use the engine set by Config.set_engine_affinity or the POLARS_ENGINE_AFFINITY environment variable, falling back to "in-memory" if unset (this default may change in a future release).
  - "in-memory": use the in-memory engine; this is the default engine.
  - "streaming": use the streaming engine, which processes queries in batches, reducing memory pressure and often outperforming the in-memory engine. This will soon become the default engine of Polars.
  - "gpu": use the CUDA GPU engine (requires an NVIDIA GPU and cudf-polars). Pass a GPUEngine object for fine-grained control (e.g. device selection on multi-GPU systems).
If the selected engine cannot run the query, Polars falls back to the in-memory engine.
Note
The GPU engine does not support async execution or running in the background. If either is enabled, GPU execution is switched off.
- optimizations
The optimization passes done during query optimization.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
- Returns:
- If gevent=False (default) then returns an awaitable.
- If gevent=True then returns a wrapper that has a .get(block=True, timeout=None) method.
See also
polars.collect_all: Collect multiple LazyFrames at the same time.
polars.collect_all_async: Collect multiple LazyFrames at the same time lazily.
Notes
In case of error set_exception is used on asyncio.Future/gevent.event.AsyncResult and will be reraised by them.
Examples
>>> import asyncio
>>> import polars as pl
>>> lf = pl.LazyFrame(
...     {
...         "a": ["a", "b", "a", "b", "b", "c"],
...         "b": [1, 2, 3, 4, 5, 6],
...         "c": [6, 5, 4, 3, 2, 1],
...     }
... )
>>> async def main():
...     return await (
...         lf.group_by("a", maintain_order=True)
...         .agg(pl.all().sum())
...         .collect_async()
...     )
>>> asyncio.run(main())
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ a   ┆ 4   ┆ 10  │
│ b   ┆ 11  ┆ 10  │
│ c   ┆ 6   ┆ 1   │
└─────┴─────┴─────┘