Module polars_lazy::dsl 
Domain specific language for the Lazy API.
This DSL revolves around the Expr type, which represents an abstract
operation on a DataFrame, such as mapping over a column, filtering, group_by, or aggregation.
In general, functions on LazyFrames consume the LazyFrame and produce a new LazyFrame representing
the result of applying the function and passed expressions to the consumed LazyFrame.
At runtime, when LazyFrame::collect is called, the expressions that comprise
the LazyFrame’s logical plan are materialized on the actual underlying Series.
For instance, let expr = col("x").pow(lit(2)).alias("x2"); would produce an expression representing the abstract
operation of squaring the column "x" and naming the resulting column "x2", and to apply this operation to a
LazyFrame, you’d use let lazy_df = lazy_df.with_column(expr);.
(Of course, a column named "x" must either exist in the original DataFrame or be produced by one of the preceding
operations on the LazyFrame.)
There are many, many free functions that this module exports that produce an Expr from scratch; col and
lit are two examples.
Expressions also have several methods, such as pow and alias, that consume them
and produce a new expression.
Several expressions are only available when the necessary feature is enabled.
Examples of features that unlock specialized expressions include strings, temporal, and dtype-categorical.
These specialized expressions provide implementations of functions that you’d otherwise have to implement by hand.
Because of how abstract and flexible the Expr type is, care must be taken to ensure you only attempt to perform
sensible operations with it.
For instance, as mentioned above, you have to make sure any columns you reference already exist in the LazyFrame.
Furthermore, there is nothing stopping you from calling, for example, any with an expression
that will yield an f64 column (instead of bool), or col("string") - col("f64"), which would attempt
to subtract an f64 Series from a string Series.
These kinds of invalid operations will only yield an error at runtime, when
collect is called on the LazyFrame.
Re-exports§
- pub use functions::*;
Modules§
- cat (feature: dtype-categorical)
- dt (feature: temporal)
- Functions
- python_udf (feature: python)
- string (feature: strings)
Structs§
- Specialized expressions for Series of DataType::Array.
- Specialized expressions for Categorical dtypes.
- Utility struct for the when-then-otherwise expression.
- Utility struct for the when-then-otherwise expression.
- Arguments used by datetime in order to produce an Expr of Datetime.
- Specialized expressions for modifying the name of existing expressions.
- Specialized expressions for Series of DataType::List.
- Specialized expressions for Categorical dtypes.
- Wrapper type that has special equality properties depending on the inner type specialization
- Specialized expressions for Struct dtypes.
- Utility struct for the when-then-otherwise expression.
- Represents a user-defined function.
- Utility struct for the when-then-otherwise expression.
Enums§
- Expressions that can be used in various contexts. Queries consist of multiple expressions. When using the polars lazy API, don’t construct an Expr directly; instead, create one using the functions in the polars_lazy::dsl module. See that module’s docs for more info.
Traits§
- ExprEvalExtension (feature: cumulative_eval or list_eval)
- IntoListNameSpace (feature: list_eval)
- ListNameSpaceExtension (feature: list_eval)
- A wrapper trait for any binary closure Fn(Series, Series) -> PolarsResult<Series>
- A wrapper trait for any closure Fn(Vec<Series>) -> PolarsResult<Series>
Functions§
- Selects all columns. Shorthand for col("*").
- Create a new column with the bitwise-and of the elements in each row.
- Create a new column with the bitwise-or of the elements in each row.
- Like map_binary, but used in a group_by-aggregation context.
- Apply a function/closure over the groups of multiple columns. This should only be used in a group_by aggregation.
- Generate a range of integers.
- arg_sort_by (feature: range): Find the indexes that would sort these series in order of appearance. That means that the first Series will be used to determine the ordering until duplicates are found. Once duplicates are found, the next Series will be used and so on.
- arg_where (feature: arg_where): Get the indices where condition evaluates true.
- Take several expressions and collect them into a StructChunked.
- Find the mean of all the values in the column named name. Alias for mean.
- Compute op(l, r) (or equivalently l op r). l and r must have types compatible with the Operator.
- business_day_count (feature: dtype-date)
- Casts the column given by Expr to a different type.
- Folds the expressions from left to right keeping the first non-null values.
- Create a Column Expression based on a column name.
- Select multiple columns by name.
- Concatenate list entries.
- concat_str (features: concat_str and strings): Horizontally concatenate string columns in linear time.
- Compute the covariance between two columns.
- cum_fold_exprs (feature: dtype-struct): Accumulate over multiple columns horizontally / row wise.
- cum_reduce_exprs (feature: dtype-struct): Accumulate over multiple columns horizontally / row wise.
- date_ranges (feature: temporal): Create a column of date ranges from a start and stop expression.
- datetime (feature: temporal): Construct a column of Datetime from the provided DatetimeArgs.
- datetime_range (feature: dtype-datetime): Create a datetime range from a start and stop expression.
- datetime_ranges (feature: dtype-datetime): Create a column of datetime ranges from a start and stop expression.
- Select multiple columns by dtype.
- Select multiple columns by dtype.
- duration (feature: temporal): Construct a column of Duration from the provided DurationArgs.
- First column in a DataFrame.
- Accumulate over multiple columns horizontally / row wise.
- format_str (features: concat_str and strings): Format the results of an array of expressions using a format string.
- Select multiple columns by index.
- Generate a range of integers.
- Generate a range of integers for each row of the input columns.
- A column which is false wherever expr is null, true elsewhere.
- A column which is true wherever expr is null, false elsewhere.
- Last column in a DataFrame.
- Return the number of rows in the context.
- Create a Literal Expression from L. A literal expression behaves like a column that contains a single distinct value.
- Apply a function/closure over multiple columns once the logical plan gets executed.
- Apply a function/closure over multiple columns once the logical plan gets executed.
- Find the maximum of all the values in the column named name. Shorthand for col(name).max().
- Create a new column with the maximum value per row.
- Find the mean of all the values in the column named name. Shorthand for col(name).mean().
- Compute the mean of all values horizontally across columns.
- Find the median of all the values in the column named name. Shorthand for col(name).median().
- Find the minimum of all the values in the column named name. Shorthand for col(name).min().
- Create a new column with the minimum value per row.
- Negates a boolean column.
- Nth column in a DataFrame.
- Compute the Pearson correlation between two columns.
- Find a specific quantile of all the values in the column named name.
- Analogous to Iterator::reduce.
- Create a column of length n containing n copies of the literal value. Generally you won’t need this function, as lit(value) already represents a column containing only value whose length is automatically set to the correct number of rows.
- rolling_corr (feature: rolling_window)
- rolling_cov (feature: rolling_window)
- spearman_rank_corr (features: rank and propagate_nans): Compute the Spearman rank correlation between two columns. Missing data will be excluded from the computation.
- Sum all the values in the column named name. Shorthand for col(name).sum().
- Sum all values horizontally across columns.
- time_ranges (feature: dtype-time): Create a column of time ranges from a start and stop expression.
- Start a when-then-otherwise expression.
Type Aliases§
- FieldsNameMapper (feature: dtype-struct)