Domain specific language for the Lazy API.
This DSL revolves around the Expr type, which represents an abstract
operation on a DataFrame, such as mapping over a column, filtering, group_by, or aggregation.
In general, functions on LazyFrames consume the LazyFrame and produce a new LazyFrame representing
the result of applying the function and passed expressions to the consumed LazyFrame.
At runtime, when LazyFrame::collect is called, the expressions that comprise
the LazyFrame’s logical plan are materialized on the actual underlying Series.
For instance, let expr = col("x").pow(lit(2)).alias("x2"); would produce an expression representing the abstract
operation of squaring the column "x" and naming the resulting column "x2", and to apply this operation to a
LazyFrame, you’d use let lazy_df = lazy_df.with_column(expr);.
(Of course, a column named "x" must either exist in the original DataFrame or be produced by one of the preceding
operations on the LazyFrame.)
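The build-then-collect pattern described above can be sketched without the library. The following is a toy model only: its Expr and LazyFrame types, the f64-only columns, and the eval_row and output_name helpers are all hypothetical stand-ins for illustration, not the Polars implementation. It shows that building an expression performs no work, that with_column consumes the frame and returns a new one, and that evaluation happens only in collect.

```rust
// Toy model only: every type here is hypothetical and is NOT the Polars API.
use std::collections::BTreeMap;

/// A description of a computation; building one performs no work.
#[derive(Debug, Clone)]
enum Expr {
    Col(String),
    Lit(f64),
    Pow(Box<Expr>, Box<Expr>),
    Alias(Box<Expr>, String),
}

/// Free functions that produce an expression from scratch.
fn col(name: &str) -> Expr {
    Expr::Col(name.to_string())
}

fn lit(value: f64) -> Expr {
    Expr::Lit(value)
}

impl Expr {
    /// Methods consume `self` and return a new, larger expression.
    fn pow(self, exponent: Expr) -> Expr {
        Expr::Pow(Box::new(self), Box::new(exponent))
    }

    fn alias(self, name: &str) -> Expr {
        Expr::Alias(Box::new(self), name.to_string())
    }

    /// Name of the column this expression writes to.
    fn output_name(&self) -> String {
        match self {
            Expr::Col(name) | Expr::Alias(_, name) => name.clone(),
            Expr::Lit(_) => "literal".to_string(),
            Expr::Pow(base, _) => base.output_name(),
        }
    }

    /// Evaluate the expression for a single row of the data.
    fn eval_row(&self, data: &BTreeMap<String, Vec<f64>>, row: usize) -> Result<f64, String> {
        match self {
            Expr::Col(name) => data
                .get(name)
                .map(|values| values[row])
                .ok_or_else(|| format!("column not found: {}", name)),
            Expr::Lit(value) => Ok(*value),
            Expr::Pow(base, exponent) => {
                Ok(base.eval_row(data, row)?.powf(exponent.eval_row(data, row)?))
            }
            Expr::Alias(inner, _) => inner.eval_row(data, row),
        }
    }
}

/// Holds the data plus the list of expressions still waiting to run.
struct LazyFrame {
    data: BTreeMap<String, Vec<f64>>,
    plan: Vec<Expr>,
}

impl LazyFrame {
    /// Assumes all columns in `data` have the same length.
    fn new(data: BTreeMap<String, Vec<f64>>) -> Self {
        LazyFrame { data, plan: Vec::new() }
    }

    /// Consumes the frame and returns a new one with `expr` appended to the plan.
    fn with_column(mut self, expr: Expr) -> Self {
        self.plan.push(expr);
        self
    }

    /// Only here are the recorded expressions applied to the actual data.
    fn collect(self) -> Result<BTreeMap<String, Vec<f64>>, String> {
        let LazyFrame { mut data, plan } = self;
        let rows = data.values().next().map_or(0, |values| values.len());
        for expr in &plan {
            let values = (0..rows)
                .map(|row| expr.eval_row(&data, row))
                .collect::<Result<Vec<f64>, String>>()?;
            data.insert(expr.output_name(), values);
        }
        Ok(data)
    }
}
```

In this sketch, LazyFrame::new(data).with_column(col("x").pow(lit(2.0)).alias("x2")).collect() returns the original columns plus an "x2" column, and referencing a column that does not exist produces an Err from collect rather than from the expression builder.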
This module exports many free functions that produce an Expr from scratch; col and
lit are two examples.
Expressions also have several methods, such as pow and alias, that consume them
and produce a new expression.
Several expressions are only available when the necessary feature is enabled.
Examples of features that unlock specialized expressions include strings, temporal, and dtype-categorical.
These specialized expressions provide implementations of functions that you’d otherwise have to implement by hand.
Because the Expr type is so abstract and flexible, care must be taken to ensure you only attempt to perform
sensible operations with it.
For instance, as mentioned above, you have to make sure any columns you reference already exist in the LazyFrame.
Furthermore, there is nothing stopping you from calling, for example, any with an expression
that will yield an f64 column (instead of bool), or col("string") - col("f64"), which would attempt
to subtract an f64 Series from a string Series.
These kinds of invalid operations will only yield an error at runtime, when
collect is called on the LazyFrame.
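Why such mistakes surface only at runtime can be illustrated with a small stand-alone sketch. Everything in it is hypothetical (the Column and Expr enums, the evaluate and kind functions, and the error strings) and it is not how Polars is implemented. It only demonstrates the general idea: constructing col("string") - col("f64") always succeeds because it merely records the operation, and the type mismatch is detected when the expression is evaluated against real data.

```rust
// Toy model only: hypothetical types, NOT the Polars API.
use std::collections::HashMap;
use std::ops::Sub;

/// A column of data with one of two possible types.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    F64(Vec<f64>),
    Str(Vec<String>),
}

/// An abstract operation; it carries no type information about its inputs.
#[derive(Debug, Clone)]
enum Expr {
    Col(String),
    Sub(Box<Expr>, Box<Expr>),
}

fn col(name: &str) -> Expr {
    Expr::Col(name.to_string())
}

impl Sub for Expr {
    type Output = Expr;

    /// Building the subtraction cannot fail: it only records the operands.
    fn sub(self, rhs: Expr) -> Expr {
        Expr::Sub(Box::new(self), Box::new(rhs))
    }
}

fn kind(column: &Column) -> &'static str {
    match column {
        Column::F64(_) => "f64",
        Column::Str(_) => "string",
    }
}

/// Plays the role of collect: the first point at which column types are known.
fn evaluate(expr: &Expr, frame: &HashMap<String, Column>) -> Result<Column, String> {
    match expr {
        Expr::Col(name) => frame
            .get(name)
            .cloned()
            .ok_or_else(|| format!("column not found: {}", name)),
        Expr::Sub(lhs, rhs) => match (evaluate(lhs, frame)?, evaluate(rhs, frame)?) {
            // Assumes both columns have the same length.
            (Column::F64(a), Column::F64(b)) => {
                Ok(Column::F64(a.iter().zip(&b).map(|(x, y)| x - y).collect()))
            }
            (a, b) => Err(format!("cannot subtract {} from {}", kind(&b), kind(&a))),
        },
    }
}
```

With a frame holding a string column and an f64 column, the expression col("string") - col("f64") is built without complaint, and evaluate returns an Err only once it inspects the two columns.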
Re-exports§
pub use functions::*;
Modules§
- anonymous
- binary
- cat (feature: dtype-categorical)
- default_values
- deletion
- dt (feature: temporal)
- file_provider
- function_expr
- functions: Functions
- python_dataset (feature: python)
- python_dsl (feature: python)
- sink
- string (feature: strings)
- udf
Structs§
- AnonymousScanOptions
- ArrayNameSpace: Specialized expressions for Series of DataType::Array.
- BaseColumnUdf
- CallbackSinkType
- CastColumnsPolicy: Used by scans.
- CategoricalNameSpace: Specialized expressions for Categorical dtypes.
- ChainedThen: Utility struct for the when-then-otherwise expression.
- ChainedWhen: Utility struct for the when-then-otherwise expression.
- DistinctOptionsDSL
- DslBuilder
- ExprNameNameSpace: Specialized expressions for modifying the name of existing expressions.
- ExtensionNameSpace: Specialized expressions for Categorical dtypes.
- FileSinkOptions
- GroupbyOptions
- HConcatOptions
- JoinOptions
- JoinOptionsIR
- ListNameSpace: Specialized expressions for Series of DataType::List.
- LogicalPlanUdfOptions
- MatchToSchemaPerColumn
- MetaNameSpace: Specialized expressions for Categorical dtypes.
- NDJsonReadOptions (feature: json)
- PartitionedSinkOptions
- PartitionedSinkOptionsIR
- PlanSerializationContext
- PredicateFileSkip
- PythonDatasetProviderVTable
- RollingCovOptions
- ScanFlags
- ScanSourceIter: An iterator for ScanSources
- SpecialEq: Wrapper type that has special equality properties depending on the inner type specialization
- StrptimeOptions
- StructNameSpace: Specialized expressions for Struct dtypes.
- TableStatistics
- Then: Utility struct for the when-then-otherwise expression.
- TimeUnitSet
- UnifiedScanArgs: Scan arguments shared across different scan types.
- UnifiedSinkArgs
- UnionArgs
- UnionOptions
- UnpivotArgsDSL
- UserDefinedFunction: Represents a user-defined function
- When: Utility struct for the when-then-otherwise expression.
Enums§
- AggExpr
- ArrayDataTypeFunction
- ArrayFunction
- BinaryFunction
- BitwiseFunction
- BooleanFunction
- BusinessFunction
- CategoricalFunction
- ColumnMapping
- CorrelationMethod
- DataTypeExpr
- DataTypeFunction
- DataTypeSelector
- DateRangeArgs (feature: dtype-date or dtype-datetime)
- DslPlan
- Engine
- EvalVariant
- Excluded
- Expr: Expressions that can be used in various contexts.
- ExtensionFunction
- ExtraColumnsPolicy
- FileScanDsl: Note: This is cheaply cloneable.
- FileScanIR: Note: This is cheaply cloneable.
- FileWriteFormat
- FunctionExpr
- JoinTypeOptionsIR
- LazySerde
- ListFunction
- MissingColumnsPolicy
- MissingColumnsPolicyOrExpr
- Operator
- PartitionStrategy
- PartitionStrategyIR
- PowFunction
- RandomMethod
- RangeFunction
- RenameAliasFn
- ReshapeDimension: A dimension in a reshape.
- RollingFunction
- RollingFunctionBy
- ScanSource: A single source to scan from
- ScanSourceRef: A reference to a single item in ScanSources
- ScanSources: Set of sources to scan from
- Selector
- SinkDestination
- SinkTarget
- SinkType
- SinkTypeIR
- StringFunction
- StructDataTypeFunction
- StructFunction
- TemporalFunction
- TimeZoneSet
- TrigonometricFunction
- UpcastOrForbid
- WindowMapping
Constants§
Statics§
- DATASET_PROVIDER_VTABLE: This is for polars-python to inject so that the implementation can be done there:
Traits§
- AnonymousColumnsUdf
- AnonymousStreamingAgg
- ColumnsUdf: A wrapper trait for any closure Fn(Vec<Series>) -> PolarsResult<Series>
- UdfSchema
Functions§
- apply_multiple: Apply a function/closure over the groups of multiple columns. This should only be used in a group_by aggregation.
- binary_expr: Compute op(l, r) (or equivalently l op r). l and r must have types compatible with the Operator.
- map_multiple: Apply a function/closure over multiple columns once the logical plan gets executed.
- new_column_udf
- ternary_expr
- when: Start a when-then-otherwise expression.
Type Aliases§
- DslNameGenerator (feature: array_to_struct or list_to_struct)
- FieldsNameMapper (feature: dtype-struct)
- OpaqueColumnUdf
- OpaqueStreamingAgg
- RenameAliasRustFn