polars.LazyFrame.collect_async#
- LazyFrame.collect_async(
- *,
- gevent: bool = False,
- engine: EngineType = 'auto',
- optimizations: QueryOptFlags = (),
Collect DataFrame asynchronously in thread pool.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
Collects into a DataFrame (like
collect()
) but, instead of returning a DataFrame directly, it is scheduled to be collected inside a thread pool, while this method returns almost instantly.This can be useful if you use
gevent
orasyncio
and want to release control to other greenlets/tasks while LazyFrames are being collected.- Parameters:
- gevent
Return wrapper to
gevent.event.AsyncResult
instead of Awaitable- engine
Select the engine used to process the query, optional. At the moment, if set to
"auto"
(default), the query is run using the polars in-memory engine. Polars will also attempt to use the engine set by thePOLARS_ENGINE_AFFINITY
environment variable. If it cannot run the query using the selected engine, the query is run using the polars in-memory engine.Note
The GPU engine does not support async, or running in the background. If either are enabled, then GPU execution is switched off.
- optimizations
The optimization passes done during query optimization.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
- Returns:
- If
gevent=False
(default) then returns an awaitable. - If
gevent=True
then returns wrapper that has a .get(block=True, timeout=None)
method.
- If
See also
polars.collect_all
Collect multiple LazyFrames at the same time.
polars.collect_all_async
Collect multiple LazyFrames at the same time lazily.
Notes
In case of error
set_exception
is used onasyncio.Future
/gevent.event.AsyncResult
and will be reraised by them.Examples
>>> import asyncio >>> lf = pl.LazyFrame( ... { ... "a": ["a", "b", "a", "b", "b", "c"], ... "b": [1, 2, 3, 4, 5, 6], ... "c": [6, 5, 4, 3, 2, 1], ... } ... ) >>> async def main(): ... return await ( ... lf.group_by("a", maintain_order=True) ... .agg(pl.all().sum()) ... .collect_async() ... ) >>> asyncio.run(main()) shape: (3, 3) ┌─────┬─────┬─────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╡ │ a ┆ 4 ┆ 10 │ │ b ┆ 11 ┆ 10 │ │ c ┆ 6 ┆ 1 │ └─────┴─────┴─────┘