# Module polars_lazy::dsl

Domain specific language for the Lazy API.

This DSL revolves around the `Expr` type, which represents an abstract
operation on a DataFrame, such as mapping over a column, filtering, group_by, or aggregation.
In general, functions on `LazyFrame`s consume the `LazyFrame` and produce a new `LazyFrame` representing
the result of applying the function and passed expressions to the consumed LazyFrame.
At runtime, when `LazyFrame::collect` is called, the expressions that comprise
the `LazyFrame`’s logical plan are materialized on the actual underlying Series.
For instance, `let expr = col("x").pow(lit(2)).alias("x2");` would produce an expression representing the abstract
operation of squaring the column `"x"` and naming the resulting column `"x2"`, and to apply this operation to a
`LazyFrame`, you’d use `let lazy_df = lazy_df.with_column(expr);`.
(Of course, a column named `"x"` must either exist in the original DataFrame or be produced by one of the preceding
operations on the `LazyFrame`.)
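
To make the "build now, evaluate later" idea concrete, here is a minimal sketch of a deferred expression tree. It is a toy model that uses only the standard library and is not how Polars implements `Expr`: the `Frame` alias, the `Expr` variants, and the `eval` function are hypothetical names chosen for illustration, and `lit` here accepts only an `f64`. The sketch only mirrors the roles of `col`, `lit`, `pow`, and `alias` described above.

```rust
use std::collections::HashMap;

/// Toy stand-in for a DataFrame: column name -> values.
/// Hypothetical type for illustration; not a Polars type.
type Frame = HashMap<String, Vec<f64>>;

/// Toy expression tree. Building one performs no computation.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(f64),
    Pow(Box<Expr>, Box<Expr>),
    Alias(Box<Expr>, String),
}

/// Reference a column by name (plays the role of `col`).
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

/// Wrap a constant (plays the role of `lit`).
fn lit(value: f64) -> Expr {
    Expr::Literal(value)
}

impl Expr {
    /// Consume `self` and return a new expression raising it to `exponent`.
    fn pow(self, exponent: Expr) -> Expr {
        Expr::Pow(Box::new(self), Box::new(exponent))
    }

    /// Consume `self` and return a new expression with an output name.
    fn alias(self, name: &str) -> Expr {
        Expr::Alias(Box::new(self), name.to_string())
    }
}

/// Evaluate `expr` against `frame`, returning the output name and values.
/// Column data is only touched here; a missing column surfaces as an `Err`.
fn eval(expr: &Expr, frame: &Frame) -> Result<(String, Vec<f64>), String> {
    // Number of rows, taken from any column (the toy assumes equal lengths).
    let height = frame.values().next().map_or(0, |values| values.len());
    match expr {
        Expr::Column(name) => match frame.get(name) {
            Some(values) => Ok((name.clone(), values.clone())),
            None => Err(format!("column not found: {}", name)),
        },
        // A literal is broadcast to the height of the frame.
        Expr::Literal(value) => Ok(("literal".to_string(), vec![*value; height])),
        Expr::Pow(base, exponent) => {
            let (name, base_values) = eval(base, frame)?;
            let (_, exponent_values) = eval(exponent, frame)?;
            let values: Vec<f64> = base_values
                .iter()
                .zip(&exponent_values)
                .map(|(b, e)| b.powf(*e))
                .collect();
            Ok((name, values))
        }
        Expr::Alias(inner, name) => {
            let (_, values) = eval(inner, frame)?;
            Ok((name.clone(), values))
        }
    }
}
```

In this sketch, `col("x").pow(lit(2.0)).alias("x2")` only builds a tree. Calling `eval` on it against a frame that contains `x` produces a column named `x2` holding the squares of `x`, while evaluating `col("missing")` returns an error because no such column exists.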

This module exports many free functions that produce an `Expr` from scratch; `col` and
`lit` are two examples.
Expressions also have several methods, such as `pow` and `alias`, that consume them
and produce a new expression.

Several expressions are only available when the necessary feature is enabled.
Examples of features that unlock specialized expressions include `strings`, `temporal`, and `dtype-categorical`.
These specialized expressions provide implementations of functions that you’d otherwise have to implement by hand.

Because of how abstract and flexible the `Expr` type is, care must be taken to ensure you only attempt to perform
sensible operations with it.
For instance, as mentioned above, you have to make sure any columns you reference already exist in the LazyFrame.
Furthermore, there is nothing stopping you from calling, for example, `any` with an expression
that will yield an `f64` column (instead of `bool`), or `col("string") - col("f64")`, which would attempt
to subtract an `f64` Series from a `string` Series.
These kinds of invalid operations will only yield an error at runtime, when
`collect` is called on the `LazyFrame`.
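
The following sketch illustrates why such mistakes surface only at evaluation time. It is a toy model built on the standard library, not Polars code: `Values`, `Frame`, `Expr`, and `evaluate` are hypothetical names, and `evaluate` merely stands in for the role `collect` plays above. Building the ill-typed subtraction always succeeds because the tree carries no type information; the mismatch is detected only once real column data is inspected.

```rust
use std::collections::HashMap;
use std::ops::Sub;

/// Toy column data with two possible dtypes (hypothetical, for illustration).
#[derive(Debug, Clone, PartialEq)]
enum Values {
    F64(Vec<f64>),
    Str(Vec<String>),
}

/// Toy stand-in for a DataFrame: column name -> typed values.
type Frame = HashMap<String, Values>;

/// Toy expression tree; it records operations but knows nothing about dtypes.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Sub(Box<Expr>, Box<Expr>),
}

/// Reference a column by name (plays the role of `col`).
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

/// `a - b` on expressions just builds a bigger tree; nothing is checked here.
impl Sub for Expr {
    type Output = Expr;
    fn sub(self, rhs: Expr) -> Expr {
        Expr::Sub(Box::new(self), Box::new(rhs))
    }
}

/// Human-readable dtype name used in error messages.
fn dtype_name(values: &Values) -> &'static str {
    match values {
        Values::F64(_) => "f64",
        Values::Str(_) => "string",
    }
}

/// Evaluate the expression against real data. Dtype errors appear only here.
/// The toy assumes both operands have the same number of rows.
fn evaluate(expr: &Expr, frame: &Frame) -> Result<Values, String> {
    match expr {
        Expr::Column(name) => match frame.get(name) {
            Some(values) => Ok(values.clone()),
            None => Err(format!("column not found: {}", name)),
        },
        Expr::Sub(lhs, rhs) => match (evaluate(lhs, frame)?, evaluate(rhs, frame)?) {
            (Values::F64(a), Values::F64(b)) => {
                Ok(Values::F64(a.iter().zip(&b).map(|(x, y)| x - y).collect()))
            }
            (a, b) => Err(format!(
                "cannot subtract {} from {}",
                dtype_name(&b),
                dtype_name(&a)
            )),
        },
    }
}
```

With a frame holding a string column named `"string"` and a float column named `"f64"`, the expression `col("string") - col("f64")` can be constructed without complaint, and `evaluate` returns an `Err` for it. The well-typed `col("f64") - col("f64")` evaluates successfully.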

## Re-exports

`pub use functions::*;`

## Modules

- cat (crate feature `dtype-categorical`)
- dt (crate feature `temporal`)
- Functions
- python_udf (crate feature `python`)
- string (crate feature `strings`)

## Structs

- Specialized expressions for `Series` of `DataType::Array`.
- Specialized expressions for Categorical dtypes.
- Utility struct for the `when-then-otherwise` expression.
- Utility struct for the `when-then-otherwise` expression.
- Arguments used by `datetime` in order to produce an `Expr` of Datetime.
- Specialized expressions for modifying the name of existing expressions.
- Specialized expressions for `Series` of `DataType::List`.
- Specialized expressions for Categorical dtypes.
- Wrapper type that has special equality properties depending on the inner type specialization.
- Specialized expressions for Struct dtypes.
- Utility struct for the `when-then-otherwise` expression.
- Represents a user-defined function.
- Utility struct for the `when-then-otherwise` expression.

## Enums

- Expressions that can be used in various contexts. Queries consist of multiple expressions. When using the polars lazy API, don’t construct an `Expr` directly; instead, create one using the functions in the `polars_lazy::dsl` module. See that module’s docs for more info.

## Traits

- ExprEvalExtension (crate feature `cumulative_eval` or `list_eval`)
- IntoListNameSpace (crate feature `list_eval`)
- ListNameSpaceExtension (crate feature `list_eval`)
- A wrapper trait for any binary closure `Fn(Series, Series) -> PolarsResult<Series>`.
- A wrapper trait for any closure `Fn(Vec<Series>) -> PolarsResult<Series>`.

## Functions

- Selects all columns. Shorthand for `col("*")`.
- Create a new column with the bitwise-and of the elements in each row.
- Create a new column with the bitwise-or of the elements in each row.
- Like `map_binary`, but used in a group_by-aggregation context.
- Apply a function/closure over the groups of multiple columns. This should only be used in a group_by aggregation.
- Generate a range of integers.
- arg_sort_by (crate feature `range`): Find the indexes that would sort these series in order of appearance. That means that the first `Series` will be used to determine the ordering until duplicates are found. Once duplicates are found, the next `Series` will be used and so on.
- arg_where (crate feature `arg_where`): Get the indices where `condition` evaluates `true`.
- Take several expressions and collect them into a `StructChunked`.
- Find the mean of all the values in the column named `name`. Alias for `mean`.
- Compute `op(l, r)` (or equivalently `l op r`). `l` and `r` must have types compatible with the Operator.
- business_day_count (crate feature `dtype-date`)
- Casts the column given by `Expr` to a different type.
- Folds the expressions from left to right keeping the first non-null values.
- Create a Column Expression based on a column name.
- Select multiple columns by name.
- Concat lists entries.
- concat_str (crate features `concat_str` and `strings`): Horizontally concat string columns in linear time.
- Compute the covariance between two columns.
- cum_fold_exprs (crate feature `dtype-struct`): Accumulate over multiple columns horizontally / row wise.
- cum_reduce_exprs (crate feature `dtype-struct`): Accumulate over multiple columns horizontally / row wise.
- date_ranges (crate feature `temporal`): Create a column of date ranges from a `start` and `stop` expression.
- datetime (crate feature `temporal`): Construct a column of `Datetime` from the provided `DatetimeArgs`.
- datetime_range (crate feature `dtype-datetime`): Create a datetime range from a `start` and `stop` expression.
- datetime_ranges (crate feature `dtype-datetime`): Create a column of datetime ranges from a `start` and `stop` expression.
- Select multiple columns by dtype.
- Select multiple columns by dtype.
- duration (crate feature `temporal`): Construct a column of `Duration` from the provided `DurationArgs`.
- First column in DataFrame.
- Accumulate over multiple columns horizontally / row wise.
- format_str (crate features `concat_str` and `strings`): Format the results of an array of expressions using a format string.
- Generate a range of integers.
- Generate a range of integers for each row of the input columns.
- A column which is `false` wherever `expr` is null, `true` elsewhere.
- A column which is `true` wherever `expr` is null, `false` elsewhere.
- Last column in DataFrame.
- Return the number of rows in the context.
- Create a Literal Expression from `L`. A literal expression behaves like a column that contains a single distinct value.
- Apply a function/closure over multiple columns once the logical plan gets executed.
- Apply a function/closure over multiple columns once the logical plan gets executed.
- Find the maximum of all the values in the column named `name`. Shorthand for `col(name).max()`.
- Create a new column with the maximum value per row.
- Find the mean of all the values in the column named `name`. Shorthand for `col(name).mean()`.
- Compute the mean of all values horizontally across columns.
- Find the median of all the values in the column named `name`. Shorthand for `col(name).median()`.
- Find the minimum of all the values in the column named `name`. Shorthand for `col(name).min()`.
- Create a new column with the minimum value per row.
- Negates a boolean column.
- Compute the Pearson correlation between two columns.
- Find a specific quantile of all the values in the column named `name`.
- Analogous to `Iterator::reduce`.
- Create a column of length `n` containing `n` copies of the literal `value`. Generally you won’t need this function, as `lit(value)` already represents a column containing only `value` whose length is automatically set to the correct number of rows.
- rolling_corr (crate feature `rolling_window`)
- rolling_cov (crate feature `rolling_window`)
- spearman_rank_corr (crate features `rank` and `propagate_nans`): Compute the Spearman rank correlation between two columns. Missing data will be excluded from the computation.
- Sum all the values in the column named `name`. Shorthand for `col(name).sum()`.
- Sum all values horizontally across columns.
- time_ranges (crate feature `dtype-time`): Create a column of time ranges from a `start` and `stop` expression.
- Start a `when-then-otherwise` expression.

## Type Aliases

- FieldsNameMapper (crate feature `dtype-struct`)