polars.defer

polars.defer(
function: Callable[[], DataFrame],
*,
schema: SchemaDict | Callable[[], SchemaDict],
validate_schema: bool = True,
) -> LazyFrame

Deferred execution.

Takes a function that produces a DataFrame but defers execution until the LazyFrame is collected.

Parameters:
function

Function that takes no arguments and produces a DataFrame.

schema

Schema of the DataFrame the deferred function will return. The caller must ensure this schema is correct.

validate_schema

Whether the engine should validate that the generated batches match the given schema. A mismatch is an implementation error and can lead to bugs that are hard to diagnose.

Examples

Delay building the DataFrame until the query is collected.

>>> import polars as pl
>>> import numpy as np
>>> np.random.seed(0)
>>> lf = pl.defer(
...     lambda: pl.DataFrame({"a": np.random.randn(3)}), schema={"a": pl.Float64}
... )
>>> lf.collect()
shape: (3, 1)
┌──────────┐
│ a        │
│ ---      │
│ f64      │
╞══════════╡
│ 1.764052 │
│ 0.400157 │
│ 0.978738 │
└──────────┘

Run an eager source in Polars Cloud.

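>>> # `conn` is assumed to be an existing database connection object.
>>> (
...     pl.defer(
...         lambda: pl.read_database("select * from tbl", connection=conn),
...         schema={"a": pl.Float64, "b": pl.Boolean},
...     )
...     .filter("b")
...     .select(pl.col("a").sum())
...     .remote()
...     .collect()
... )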