polars.LazyFrame.sql#

LazyFrame.sql(query: str, *, table_name: str = 'self') Self[source]#

Execute a SQL query against the LazyFrame.

New in version 0.20.23.

Warning

This functionality is considered unstable, although it is close to being considered stable. It may be changed at any point without it being considered a breaking change.

Parameters:
query

SQL query to execute.

table_name

Optionally provide an explicit name for the table that represents the calling frame (defaults to “self”).

See also

SQLContext

Notes

  • The calling frame is automatically registered as a table in the SQL context under the name “self”. If you want access to the DataFrames and LazyFrames found in the current globals, use the top-level pl.sql.

  • More control over registration and execution behaviour is available by using the SQLContext object.

Examples

>>> lf1 = pl.LazyFrame({"a": [1, 2, 3], "b": [6, 7, 8], "c": ["z", "y", "x"]})
>>> lf2 = pl.LazyFrame({"a": [3, 2, 1], "d": [125, -654, 888]})

Query the LazyFrame using SQL:

>>> lf1.sql("SELECT c, b FROM self WHERE a > 1").collect()
shape: (2, 2)
┌─────┬─────┐
│ c   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ y   ┆ 7   │
│ x   ┆ 8   │
└─────┴─────┘

Apply SQL transforms (aliasing “self” to “frame”) then filter natively (you can freely mix SQL and native operations):

>>> lf1.sql(
...     query='''
...         SELECT
...             a,
...             (a % 2 == 0) AS a_is_even,
...             (b::float4 / 2) AS "b/2",
...             CONCAT_WS(':', c, c, c) AS c_c_c
...         FROM frame
...         ORDER BY a
...     ''',
...     table_name="frame",
... ).filter(~pl.col("c_c_c").str.starts_with("x")).collect()
shape: (2, 4)
┌─────┬───────────┬─────┬───────┐
│ a   ┆ a_is_even ┆ b/2 ┆ c_c_c │
│ --- ┆ ---       ┆ --- ┆ ---   │
│ i64 ┆ bool      ┆ f32 ┆ str   │
╞═════╪═══════════╪═════╪═══════╡
│ 1   ┆ false     ┆ 3.0 ┆ z:z:z │
│ 2   ┆ true      ┆ 3.5 ┆ y:y:y │
└─────┴───────────┴─────┴───────┘