Expressions#
This page gives an overview of all public polars expressions.
- class polars.Expr[source]
- Expressions that can be used in various contexts.
- Methods: a summary table listing every Expr method with its one-line description, from abs() through xor(); detailed entries for each method follow below.
- abs() Self[source]
- Compute absolute values. - Same as - abs(expr).- Examples - >>> df = pl.DataFrame( ... { ... "A": [-1.0, 0.0, 1.0, 2.0], ... } ... ) >>> df.select(pl.col("A").abs()) shape: (4, 1) ┌─────┐ │ A │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ │ 0.0 │ │ 1.0 │ │ 2.0 │ └─────┘ 
 - add(other: Any) Self[source]
- Method equivalent of addition operator - expr + other.- Parameters:
- other
- numeric or string value; accepts expression input. 
 
 - Examples - >>> df = pl.DataFrame({"x": [1, 2, 3, 4, 5]}) >>> df.with_columns( ... pl.col("x").add(2).alias("x+int"), ... pl.col("x").add(pl.col("x").cum_prod()).alias("x+expr"), ... ) shape: (5, 3) ┌─────┬───────┬────────┐ │ x ┆ x+int ┆ x+expr │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞═════╪═══════╪════════╡ │ 1 ┆ 3 ┆ 2 │ │ 2 ┆ 4 ┆ 4 │ │ 3 ┆ 5 ┆ 9 │ │ 4 ┆ 6 ┆ 28 │ │ 5 ┆ 7 ┆ 125 │ └─────┴───────┴────────┘ - >>> df = pl.DataFrame( ... {"x": ["a", "d", "g"], "y": ["b", "e", "h"], "z": ["c", "f", "i"]} ... ) >>> df.with_columns(pl.col("x").add(pl.col("y")).add(pl.col("z")).alias("xyz")) shape: (3, 4) ┌─────┬─────┬─────┬─────┐ │ x ┆ y ┆ z ┆ xyz │ │ --- ┆ --- ┆ --- ┆ --- │ │ str ┆ str ┆ str ┆ str │ ╞═════╪═════╪═════╪═════╡ │ a ┆ b ┆ c ┆ abc │ │ d ┆ e ┆ f ┆ def │ │ g ┆ h ┆ i ┆ ghi │ └─────┴─────┴─────┴─────┘ 
 - agg_groups() Self[source]
- Get the group indexes of the group by operation. - Should be used in aggregation context only. - Examples - >>> df = pl.DataFrame( ... { ... "group": [ ... "one", ... "one", ... "one", ... "two", ... "two", ... "two", ... ], ... "value": [94, 95, 96, 97, 97, 99], ... } ... ) >>> df.group_by("group", maintain_order=True).agg(pl.col("value").agg_groups()) shape: (2, 2) ┌───────┬───────────┐ │ group ┆ value │ │ --- ┆ --- │ │ str ┆ list[u32] │ ╞═══════╪═══════════╡ │ one ┆ [0, 1, 2] │ │ two ┆ [3, 4, 5] │ └───────┴───────────┘ 
 - alias(name: str) Self[source]
- Rename the expression. - Parameters:
- name
- The new name. 
 
 - Examples - Rename an expression to avoid overwriting an existing column. - >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3], ... "b": ["x", "y", "z"], ... } ... ) >>> df.with_columns( ... pl.col("a") + 10, ... pl.col("b").str.to_uppercase().alias("c"), ... ) shape: (3, 3) ┌─────┬─────┬─────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ str │ ╞═════╪═════╪═════╡ │ 11 ┆ x ┆ X │ │ 12 ┆ y ┆ Y │ │ 13 ┆ z ┆ Z │ └─────┴─────┴─────┘ - Overwrite the default name of literal columns to prevent errors due to duplicate column names. - >>> df.with_columns( ... pl.lit(True).alias("c"), ... pl.lit(4.0).alias("d"), ... ) shape: (3, 4) ┌─────┬─────┬──────┬─────┐ │ a ┆ b ┆ c ┆ d │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ bool ┆ f64 │ ╞═════╪═════╪══════╪═════╡ │ 1 ┆ x ┆ true ┆ 4.0 │ │ 2 ┆ y ┆ true ┆ 4.0 │ │ 3 ┆ z ┆ true ┆ 4.0 │ └─────┴─────┴──────┴─────┘ 
 - all(*, ignore_nulls: bool = True) Self[source]
- Return whether all values in the column are - True.- Only works on columns of data type - Boolean.- Note - This method is not to be confused with the function - polars.all(), which can be used to select all columns.- Parameters:
- ignore_nulls
- Ignore null values (default). If set to False, Kleene logic is used to deal with nulls: if the column contains any null values and no False values, the output is null.
 
- Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [True, True], ... "b": [False, True], ... "c": [None, True], ... } ... ) >>> df.select(pl.col("*").all()) shape: (1, 3) ┌──────┬───────┬──────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ bool ┆ bool ┆ bool │ ╞══════╪═══════╪══════╡ │ true ┆ false ┆ true │ └──────┴───────┴──────┘ - Enable Kleene logic by setting - ignore_nulls=False.- >>> df.select(pl.col("*").all(ignore_nulls=False)) shape: (1, 3) ┌──────┬───────┬──────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ bool ┆ bool ┆ bool │ ╞══════╪═══════╪══════╡ │ true ┆ false ┆ null │ └──────┴───────┴──────┘ 
 - and_(*others: Any) Self[source]
- Method equivalent of bitwise “and” operator - expr & other & ....- Parameters:
- *others
- One or more integer or boolean expressions to evaluate/combine. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [5, 6, 7, 4, 8], ... "y": [1.5, 2.5, 1.0, 4.0, -5.75], ... "z": [-9, 2, -1, 4, 8], ... } ... ) >>> df.select( ... (pl.col("x") >= pl.col("z")) ... .and_( ... pl.col("y") >= pl.col("z"), ... pl.col("y") == pl.col("y"), ... pl.col("z") <= pl.col("x"), ... pl.col("y") != pl.col("x"), ... ) ... .alias("all") ... ) shape: (5, 1) ┌───────┐ │ all │ │ --- │ │ bool │ ╞═══════╡ │ true │ │ true │ │ true │ │ false │ │ false │ └───────┘ 
 - any(*, ignore_nulls: bool = True) Self[source]
- Return whether any of the values in the column are - True.- Only works on columns of data type - Boolean.- Parameters:
- ignore_nulls
- Ignore null values (default). If set to False, Kleene logic is used to deal with nulls: if the column contains any null values and no True values, the output is null.
 
- Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [True, False], ... "b": [False, False], ... "c": [None, False], ... } ... ) >>> df.select(pl.col("*").any()) shape: (1, 3) ┌──────┬───────┬───────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ bool ┆ bool ┆ bool │ ╞══════╪═══════╪═══════╡ │ true ┆ false ┆ false │ └──────┴───────┴───────┘ - Enable Kleene logic by setting - ignore_nulls=False.- >>> df.select(pl.col("*").any(ignore_nulls=False)) shape: (1, 3) ┌──────┬───────┬──────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ bool ┆ bool ┆ bool │ ╞══════╪═══════╪══════╡ │ true ┆ false ┆ null │ └──────┴───────┴──────┘ 
 - append(other: IntoExpr, *, upcast: bool = True) Self[source]
- Append expressions. - This is done by adding the chunks of - otherto this- Series.- Parameters:
- other
- Expression to append. 
- upcast
- Cast both - Seriesto the same supertype.
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [8, 9, 10], ... "b": [None, 4, 4], ... } ... ) >>> df.select(pl.all().head(1).append(pl.all().tail(1))) shape: (2, 2) ┌─────┬──────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪══════╡ │ 8 ┆ null │ │ 10 ┆ 4 │ └─────┴──────┘ 
 - apply(
- function: Callable[[Series], Series] | Callable[[Any], Any],
- return_dtype: PolarsDataType | None = None,
- *,
- skip_nulls: bool = True,
- pass_name: bool = False,
- strategy: MapElementsStrategy = 'thread_local',
- Apply a custom/user-defined function (UDF) in a GroupBy or Projection context. - Deprecated since version 0.19.0: This method has been renamed to - Expr.map_elements().- Parameters:
- function
- Lambda/function to apply.
- return_dtype
- Dtype of the output Series. If not set, the dtype will be - polars.Unknown.
- skip_nulls
- Don’t apply the function over values that contain nulls. This is faster. 
- pass_name
- Pass the Series name to the custom function. This is more expensive.
- strategy{‘thread_local’, ‘threading’}
- This functionality is in alpha stage. It may be removed or changed without being considered a breaking change.
- ‘thread_local’: run the python function on a single thread.
- ‘threading’: run the python function on separate threads. Use with care, as this can slow performance. This might only speed up your code if the amount of work per element is significant and the python function releases the GIL (e.g. via calling a C function).
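Since apply is deprecated, a minimal migration sketch (added here for illustration; the column name, values, and lambda are mine) uses the renamed Expr.map_elements:
>>> import polars as pl
>>> df = pl.DataFrame({"a": [1, 2, 3]})  # illustrative data
>>> # deprecated form: df.select(pl.col("a").apply(lambda x: x * 2))
>>> df.select(
...     pl.col("a").map_elements(lambda x: x * 2, return_dtype=pl.Int64)
... )  # produces a column "a" with values [2, 4, 6]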
 
 
 
 
 - approx_n_unique() Self[source]
- Approximate count of unique values. - This is done using the HyperLogLog++ algorithm for cardinality estimation. - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2]}) >>> df.select(pl.col("a").approx_n_unique()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ u32 │ ╞═════╡ │ 2 │ └─────┘ 
 - arccos() Self[source]
- Compute the element-wise value for the inverse cosine. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [0.0]}) >>> df.select(pl.col("a").arccos()) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 1.570796 │ └──────────┘ 
 - arccosh() Self[source]
- Compute the element-wise value for the inverse hyperbolic cosine. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").arccosh()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 0.0 │ └─────┘ 
 - arcsin() Self[source]
- Compute the element-wise value for the inverse sine. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").arcsin()) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 1.570796 │ └──────────┘ 
 - arcsinh() Self[source]
- Compute the element-wise value for the inverse hyperbolic sine. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").arcsinh()) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 0.881374 │ └──────────┘ 
 - arctan() Self[source]
- Compute the element-wise value for the inverse tangent. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").arctan()) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 0.785398 │ └──────────┘ 
 - arctanh() Self[source]
- Compute the element-wise value for the inverse hyperbolic tangent. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").arctanh()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ inf │ └─────┘ 
 - arg_max() Self[source]
- Get the index of the maximal value. - Examples - >>> df = pl.DataFrame( ... { ... "a": [20, 10, 30], ... } ... ) >>> df.select(pl.col("a").arg_max()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ u32 │ ╞═════╡ │ 2 │ └─────┘ 
 - arg_min() Self[source]
- Get the index of the minimal value. - Examples - >>> df = pl.DataFrame( ... { ... "a": [20, 10, 30], ... } ... ) >>> df.select(pl.col("a").arg_min()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ u32 │ ╞═════╡ │ 1 │ └─────┘ 
 - arg_sort(*, descending: bool = False, nulls_last: bool = False) Self[source]
- Get the index values that would sort this column. - Parameters:
- descending
- Sort in descending order.
- nulls_last
- Place null values last instead of first. 
 
- Returns:
- Expr
- Expression of data type - UInt32.
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [20, 10, 30], ... } ... ) >>> df.select(pl.col("a").arg_sort()) shape: (3, 1) ┌─────┐ │ a │ │ --- │ │ u32 │ ╞═════╡ │ 1 │ │ 0 │ │ 2 │ └─────┘ 
 - arg_true() Self[source]
- Return indices where the expression evaluates True. - Warning - This modifies the number of rows returned, so it will fail in combination with other expressions. Use it as the only expression in select/with_columns. - See also - Series.arg_true
- Return indices where Series is True 
- polars.arg_where
 - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2, 1]}) >>> df.select((pl.col("a") == 1).arg_true()) shape: (3, 1) ┌─────┐ │ a │ │ --- │ │ u32 │ ╞═════╡ │ 0 │ │ 1 │ │ 3 │ └─────┘ 
 - arg_unique() Self[source]
- Get index of first unique value. - Examples - >>> df = pl.DataFrame( ... { ... "a": [8, 9, 10], ... "b": [None, 4, 4], ... } ... ) >>> df.select(pl.col("a").arg_unique()) shape: (3, 1) ┌─────┐ │ a │ │ --- │ │ u32 │ ╞═════╡ │ 0 │ │ 1 │ │ 2 │ └─────┘ >>> df.select(pl.col("b").arg_unique()) shape: (2, 1) ┌─────┐ │ b │ │ --- │ │ u32 │ ╞═════╡ │ 0 │ │ 1 │ └─────┘ 
 - backward_fill(limit: int | None = None) Self[source]
- Fill missing values with the next to be seen values. - Parameters:
- limit
- The number of consecutive null values to backward fill. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, None], ... "b": [4, None, 6], ... "c": [None, None, 2], ... } ... ) >>> df.select(pl.all().backward_fill()) shape: (3, 3) ┌──────┬─────┬─────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞══════╪═════╪═════╡ │ 1 ┆ 4 ┆ 2 │ │ 2 ┆ 6 ┆ 2 │ │ null ┆ 6 ┆ 2 │ └──────┴─────┴─────┘ >>> df.select(pl.all().backward_fill(limit=1)) shape: (3, 3) ┌──────┬─────┬──────┐ │ a ┆ b ┆ c │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞══════╪═════╪══════╡ │ 1 ┆ 4 ┆ null │ │ 2 ┆ 6 ┆ 2 │ │ null ┆ 6 ┆ 2 │ └──────┴─────┴──────┘ 
 - bottom_k(k: int | IntoExprColumn = 5) Self[source]
- Return the k smallest elements. - This has time complexity: \(O(n + k \log{n} - \frac{k}{2})\) - Parameters:
- k
- Number of elements to return. 
 
 - See also - Examples - >>> df = pl.DataFrame( ... { ... "value": [1, 98, 2, 3, 99, 4], ... } ... ) >>> df.select( ... [ ... pl.col("value").top_k().alias("top_k"), ... pl.col("value").bottom_k().alias("bottom_k"), ... ] ... ) shape: (5, 2) ┌───────┬──────────┐ │ top_k ┆ bottom_k │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═══════╪══════════╡ │ 99 ┆ 1 │ │ 98 ┆ 2 │ │ 4 ┆ 3 │ │ 3 ┆ 4 │ │ 2 ┆ 98 │ └───────┴──────────┘ 
 - cache() Self[source]
- Cache this expression so that it only is executed once per context. - Deprecated since version 0.18.9: This method now does nothing. It has been superseded by the comm_subexpr_elim setting on LazyFrame.collect, which automatically caches expressions that are equal.
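A rough sketch of the superseding mechanism (not from the original page; the frame and expressions are illustrative): with comm_subexpr_elim enabled on LazyFrame.collect, a repeated sub-expression is evaluated only once.
>>> import polars as pl
>>> lf = pl.LazyFrame({"a": [1, 2, 3]})  # illustrative data
>>> shared = pl.col("a").sum()  # the same sub-expression is used twice below
>>> lf.select(
...     (shared + 1).alias("sum_plus_one"),
...     (shared * 2).alias("sum_doubled"),
... ).collect(comm_subexpr_elim=True)  # default setting; equal sub-expressions are cached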
 - cast(dtype: PolarsDataType | type[Any], *, strict: bool = True) Self[source]
- Cast between data types. - Parameters:
- dtype
- DataType to cast to. 
- strict
- Throw an error if a cast could not be done (for instance, due to an overflow). 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3], ... "b": ["4", "5", "6"], ... } ... ) >>> df.with_columns( ... [ ... pl.col("a").cast(pl.Float64), ... pl.col("b").cast(pl.Int32), ... ] ... ) shape: (3, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ f64 ┆ i32 │ ╞═════╪═════╡ │ 1.0 ┆ 4 │ │ 2.0 ┆ 5 │ │ 3.0 ┆ 6 │ └─────┴─────┘ 
 - cbrt() Self[source]
- Compute the cube root of the elements. - Examples - >>> df = pl.DataFrame({"values": [1.0, 2.0, 4.0]}) >>> df.select(pl.col("values").cbrt()) shape: (3, 1) ┌──────────┐ │ values │ │ --- │ │ f64 │ ╞══════════╡ │ 1.0 │ │ 1.259921 │ │ 1.587401 │ └──────────┘ 
 - ceil() Self[source]
- Rounds up to the nearest integer value. - Only works on floating point Series. - Examples - >>> df = pl.DataFrame({"a": [0.3, 0.5, 1.0, 1.1]}) >>> df.select(pl.col("a").ceil()) shape: (4, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ │ 1.0 │ │ 1.0 │ │ 2.0 │ └─────┘ 
 - clip(
- lower_bound: NumericLiteral | TemporalLiteral | IntoExprColumn | None = None,
- upper_bound: NumericLiteral | TemporalLiteral | IntoExprColumn | None = None,
- Set values outside the given boundaries to the boundary value. - Parameters:
- lower_bound
- Lower bound. Accepts expression input. Non-expression inputs are parsed as literals. 
- upper_bound
- Upper bound. Accepts expression input. Non-expression inputs are parsed as literals. 
 
 - See also - Notes - This method only works for numeric and temporal columns. To clip other data types, consider writing a - when-then-otherwiseexpression. See- when().- Examples - Specifying both a lower and upper bound: - >>> df = pl.DataFrame({"a": [-50, 5, 50, None]}) >>> df.with_columns(clip=pl.col("a").clip(1, 10)) shape: (4, 2) ┌──────┬──────┐ │ a ┆ clip │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞══════╪══════╡ │ -50 ┆ 1 │ │ 5 ┆ 5 │ │ 50 ┆ 10 │ │ null ┆ null │ └──────┴──────┘ - Specifying only a single bound: - >>> df.with_columns(clip=pl.col("a").clip(upper_bound=10)) shape: (4, 2) ┌──────┬──────┐ │ a ┆ clip │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞══════╪══════╡ │ -50 ┆ -50 │ │ 5 ┆ 5 │ │ 50 ┆ 10 │ │ null ┆ null │ └──────┴──────┘ 
 - clip_max(upper_bound: NumericLiteral | TemporalLiteral | IntoExprColumn) Self[source]
- Clip (limit) the values in an array to a - maxboundary.- Deprecated since version 0.19.12: Use - clip()instead.- Parameters:
- upper_bound
- Upper bound. 
 
 
 - clip_min(lower_bound: NumericLiteral | TemporalLiteral | IntoExprColumn) Self[source]
- Clip (limit) the values in an array to a - minboundary.- Deprecated since version 0.19.12: Use - clip()instead.- Parameters:
- lower_bound
- Lower bound. 
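For both deprecated methods, clip() with a single bound is the replacement; a small sketch (added for illustration, the values are mine):
>>> import polars as pl
>>> df = pl.DataFrame({"a": [-50, 5, 50]})  # illustrative data
>>> df.with_columns(
...     pl.col("a").clip(lower_bound=1).alias("clip_min_equivalent"),   # was clip_min(1)
...     pl.col("a").clip(upper_bound=10).alias("clip_max_equivalent"),  # was clip_max(10)
... )  # clip_min_equivalent: [1, 5, 50]; clip_max_equivalent: [-50, 5, 10]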
 
 
 - cos() Self[source]
- Compute the element-wise value for the cosine. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [0.0]}) >>> df.select(pl.col("a").cos()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ └─────┘ 
 - cosh() Self[source]
- Compute the element-wise value for the hyperbolic cosine. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").cosh()) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 1.543081 │ └──────────┘ 
 - cot() Self[source]
- Compute the element-wise value for the cotangent. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").cot().round(2)) shape: (1, 1) ┌──────┐ │ a │ │ --- │ │ f64 │ ╞══════╡ │ 0.64 │ └──────┘ 
 - count() Self[source]
- Return the number of elements in the column. - Warning - Null values are treated like regular elements in this context. - Examples - >>> df = pl.DataFrame({"a": [8, 9, 10], "b": [None, 4, 4]}) >>> df.select(pl.all().count()) shape: (1, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ u32 ┆ u32 │ ╞═════╪═════╡ │ 3 ┆ 3 │ └─────┴─────┘ 
 - cum_count(*, reverse: bool = False) Self[source]
- Get an array with the cumulative count computed at every element. - Counting from 0 to len - Parameters:
- reverse
- Reverse the operation. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 4]}) >>> df.with_columns( ... pl.col("a").cum_count().alias("cum_count"), ... pl.col("a").cum_count(reverse=True).alias("cum_count_reverse"), ... ) shape: (4, 3) ┌─────┬───────────┬───────────────────┐ │ a ┆ cum_count ┆ cum_count_reverse │ │ --- ┆ --- ┆ --- │ │ i64 ┆ u32 ┆ u32 │ ╞═════╪═══════════╪═══════════════════╡ │ 1 ┆ 0 ┆ 3 │ │ 2 ┆ 1 ┆ 2 │ │ 3 ┆ 2 ┆ 1 │ │ 4 ┆ 3 ┆ 0 │ └─────┴───────────┴───────────────────┘ 
 - cum_max(*, reverse: bool = False) Self[source]
- Get an array with the cumulative max computed at every element. - Parameters:
- reverse
- Reverse the operation. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 4]}) >>> df.with_columns( ... pl.col("a").cum_max().alias("cum_max"), ... pl.col("a").cum_max(reverse=True).alias("cum_max_reverse"), ... ) shape: (4, 3) ┌─────┬─────────┬─────────────────┐ │ a ┆ cum_max ┆ cum_max_reverse │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞═════╪═════════╪═════════════════╡ │ 1 ┆ 1 ┆ 4 │ │ 2 ┆ 2 ┆ 4 │ │ 3 ┆ 3 ┆ 4 │ │ 4 ┆ 4 ┆ 4 │ └─────┴─────────┴─────────────────┘ - Null values are excluded, but can also be filled by calling - forward_fill.- >>> df = pl.DataFrame({"values": [None, 10, None, 8, 9, None, 16, None]}) >>> df.with_columns( ... pl.col("values").cum_max().alias("cum_max"), ... pl.col("values").cum_max().forward_fill().alias("cum_max_all_filled"), ... ) shape: (8, 3) ┌────────┬─────────┬────────────────────┐ │ values ┆ cum_max ┆ cum_max_all_filled │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞════════╪═════════╪════════════════════╡ │ null ┆ null ┆ null │ │ 10 ┆ 10 ┆ 10 │ │ null ┆ null ┆ 10 │ │ 8 ┆ 10 ┆ 10 │ │ 9 ┆ 10 ┆ 10 │ │ null ┆ null ┆ 10 │ │ 16 ┆ 16 ┆ 16 │ │ null ┆ null ┆ 16 │ └────────┴─────────┴────────────────────┘ 
 - cum_min(*, reverse: bool = False) Self[source]
- Get an array with the cumulative min computed at every element. - Parameters:
- reverse
- Reverse the operation. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 4]}) >>> df.with_columns( ... pl.col("a").cum_min().alias("cum_min"), ... pl.col("a").cum_min(reverse=True).alias("cum_min_reverse"), ... ) shape: (4, 3) ┌─────┬─────────┬─────────────────┐ │ a ┆ cum_min ┆ cum_min_reverse │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞═════╪═════════╪═════════════════╡ │ 1 ┆ 1 ┆ 1 │ │ 2 ┆ 1 ┆ 2 │ │ 3 ┆ 1 ┆ 3 │ │ 4 ┆ 1 ┆ 4 │ └─────┴─────────┴─────────────────┘ 
 - cum_prod(*, reverse: bool = False) Self[source]
- Get an array with the cumulative product computed at every element. - Parameters:
- reverse
- Reverse the operation. 
 
 - Notes - Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues. - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 4]}) >>> df.with_columns( ... pl.col("a").cum_prod().alias("cum_prod"), ... pl.col("a").cum_prod(reverse=True).alias("cum_prod_reverse"), ... ) shape: (4, 3) ┌─────┬──────────┬──────────────────┐ │ a ┆ cum_prod ┆ cum_prod_reverse │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞═════╪══════════╪══════════════════╡ │ 1 ┆ 1 ┆ 24 │ │ 2 ┆ 2 ┆ 24 │ │ 3 ┆ 6 ┆ 12 │ │ 4 ┆ 24 ┆ 4 │ └─────┴──────────┴──────────────────┘ 
 - cum_sum(*, reverse: bool = False) Self[source]
- Get an array with the cumulative sum computed at every element. - Parameters:
- reverse
- Reverse the operation. 
 
 - Notes - Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues. - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 4]}) >>> df.with_columns( ... pl.col("a").cum_sum().alias("cum_sum"), ... pl.col("a").cum_sum(reverse=True).alias("cum_sum_reverse"), ... ) shape: (4, 3) ┌─────┬─────────┬─────────────────┐ │ a ┆ cum_sum ┆ cum_sum_reverse │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞═════╪═════════╪═════════════════╡ │ 1 ┆ 1 ┆ 10 │ │ 2 ┆ 3 ┆ 9 │ │ 3 ┆ 6 ┆ 7 │ │ 4 ┆ 10 ┆ 4 │ └─────┴─────────┴─────────────────┘ - Null values are excluded, but can also be filled by calling - forward_fill.- >>> df = pl.DataFrame({"values": [None, 10, None, 8, 9, None, 16, None]}) >>> df.with_columns( ... pl.col("values").cum_sum().alias("value_cum_sum"), ... pl.col("values") ... .cum_sum() ... .forward_fill() ... .alias("value_cum_sum_all_filled"), ... ) shape: (8, 3) ┌────────┬───────────────┬──────────────────────────┐ │ values ┆ value_cum_sum ┆ value_cum_sum_all_filled │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞════════╪═══════════════╪══════════════════════════╡ │ null ┆ null ┆ null │ │ 10 ┆ 10 ┆ 10 │ │ null ┆ null ┆ 10 │ │ 8 ┆ 18 ┆ 18 │ │ 9 ┆ 27 ┆ 27 │ │ null ┆ null ┆ 27 │ │ 16 ┆ 43 ┆ 43 │ │ null ┆ null ┆ 43 │ └────────┴───────────────┴──────────────────────────┘ 
 - cumcount(*, reverse: bool = False) Self[source]
- Get an array with the cumulative count computed at every element. - Deprecated since version 0.19.14: This method has been renamed to - cum_count().- Parameters:
- reverse
- Reverse the operation. 
 
 
 - cummax(*, reverse: bool = False) Self[source]
- Get an array with the cumulative max computed at every element. - Deprecated since version 0.19.14: This method has been renamed to - cum_max().- Parameters:
- reverse
- Reverse the operation. 
 
 
 - cummin(*, reverse: bool = False) Self[source]
- Get an array with the cumulative min computed at every element. - Deprecated since version 0.19.14: This method has been renamed to - cum_min().- Parameters:
- reverse
- Reverse the operation. 
 
 
 - cumprod(*, reverse: bool = False) Self[source]
- Get an array with the cumulative product computed at every element. - Deprecated since version 0.19.14: This method has been renamed to - cum_prod().- Parameters:
- reverse
- Reverse the operation. 
 
 
 - cumsum(*, reverse: bool = False) Self[source]
- Get an array with the cumulative sum computed at every element. - Deprecated since version 0.19.14: This method has been renamed to - cum_sum().- Parameters:
- reverse
- Reverse the operation. 
 
 
- cumulative_eval(expr: Expr, min_periods: int = 1, parallel: bool = False) Self[source]
- Run an expression over a sliding window that increases - 1slot every iteration.- Parameters:
- expr
- Expression to evaluate 
- min_periods
- Number of valid values there should be in the window before the expression is evaluated. valid values = - length - null_count
- parallel
- Run in parallel. Don’t do this in a group by or another operation that already has much parallelization. 
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - This can be really slow as it can have - O(n^2)complexity. Don’t use this for operations that visit all elements.- Examples - >>> df = pl.DataFrame({"values": [1, 2, 3, 4, 5]}) >>> df.select( ... [ ... pl.col("values").cumulative_eval( ... pl.element().first() - pl.element().last() ** 2 ... ) ... ] ... ) shape: (5, 1) ┌────────┐ │ values │ │ --- │ │ f64 │ ╞════════╡ │ 0.0 │ │ -3.0 │ │ -8.0 │ │ -15.0 │ │ -24.0 │ └────────┘ 
 - cut(
- breaks: Sequence[float],
- *,
- labels: Sequence[str] | None = None,
- left_closed: bool = False,
- include_breaks: bool = False,
- Bin continuous values into discrete categories. - Parameters:
- breaks
- List of unique cut points. 
- labels
- Names of the categories. The number of labels must be equal to the number of cut points plus one. 
- left_closed
- Set the intervals to be left-closed instead of right-closed. 
- include_breaks
- Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a - Categoricalto a- Struct.
 
- Returns:
- Expr
- Expression of data type - Categoricalif- include_breaksis set to- False(default), otherwise an expression of data type- Struct.
 
 - See also - Examples - Divide a column into three categories. - >>> df = pl.DataFrame({"foo": [-2, -1, 0, 1, 2]}) >>> df.with_columns( ... pl.col("foo").cut([-1, 1], labels=["a", "b", "c"]).alias("cut") ... ) shape: (5, 2) ┌─────┬─────┐ │ foo ┆ cut │ │ --- ┆ --- │ │ i64 ┆ cat │ ╞═════╪═════╡ │ -2 ┆ a │ │ -1 ┆ a │ │ 0 ┆ b │ │ 1 ┆ b │ │ 2 ┆ c │ └─────┴─────┘ - Add both the category and the breakpoint. - >>> df.with_columns( ... pl.col("foo").cut([-1, 1], include_breaks=True).alias("cut") ... ).unnest("cut") shape: (5, 3) ┌─────┬──────┬────────────┐ │ foo ┆ brk ┆ foo_bin │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ cat │ ╞═════╪══════╪════════════╡ │ -2 ┆ -1.0 ┆ (-inf, -1] │ │ -1 ┆ -1.0 ┆ (-inf, -1] │ │ 0 ┆ 1.0 ┆ (-1, 1] │ │ 1 ┆ 1.0 ┆ (-1, 1] │ │ 2 ┆ inf ┆ (1, inf] │ └─────┴──────┴────────────┘ 
 - degrees() Self[source]
- Convert from radians to degrees. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> import math >>> df = pl.DataFrame({"a": [x * math.pi for x in range(-4, 5)]}) >>> df.select(pl.col("a").degrees()) shape: (9, 1) ┌────────┐ │ a │ │ --- │ │ f64 │ ╞════════╡ │ -720.0 │ │ -540.0 │ │ -360.0 │ │ -180.0 │ │ 0.0 │ │ 180.0 │ │ 360.0 │ │ 540.0 │ │ 720.0 │ └────────┘ 
 - diff(n: int = 1, null_behavior: NullBehavior = 'ignore') Self[source]
- Calculate the first discrete difference between shifted items. - Parameters:
- n
- Number of slots to shift. 
- null_behavior{‘ignore’, ‘drop’}
- How to handle null values. 
 
 - Examples - >>> df = pl.DataFrame({"int": [20, 10, 30, 25, 35]}) >>> df.with_columns(change=pl.col("int").diff()) shape: (5, 2) ┌─────┬────────┐ │ int ┆ change │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪════════╡ │ 20 ┆ null │ │ 10 ┆ -10 │ │ 30 ┆ 20 │ │ 25 ┆ -5 │ │ 35 ┆ 10 │ └─────┴────────┘ - >>> df.with_columns(change=pl.col("int").diff(n=2)) shape: (5, 2) ┌─────┬────────┐ │ int ┆ change │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪════════╡ │ 20 ┆ null │ │ 10 ┆ null │ │ 30 ┆ 10 │ │ 25 ┆ 15 │ │ 35 ┆ 5 │ └─────┴────────┘ - >>> df.select(pl.col("int").diff(n=2, null_behavior="drop").alias("diff")) shape: (3, 1) ┌──────┐ │ diff │ │ --- │ │ i64 │ ╞══════╡ │ 10 │ │ 15 │ │ 5 │ └──────┘ 
 - dot(other: Expr | str) Self[source]
- Compute the dot/inner product between two Expressions. - Parameters:
- other
- Expression to compute dot product with. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 3, 5], ... "b": [2, 4, 6], ... } ... ) >>> df.select(pl.col("a").dot(pl.col("b"))) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 44 │ └─────┘ 
 - drop_nans() Self[source]
- Drop all floating point NaN values. - The original order of the remaining elements is preserved. - See also - Notes - A NaN value is not the same as a null value. To drop null values, use - drop_nulls().- Examples - >>> df = pl.DataFrame({"a": [1.0, None, 3.0, float("nan")]}) >>> df.select(pl.col("a").drop_nans()) shape: (3, 1) ┌──────┐ │ a │ │ --- │ │ f64 │ ╞══════╡ │ 1.0 │ │ null │ │ 3.0 │ └──────┘ 
 - drop_nulls() Self[source]
- Drop all null values. - The original order of the remaining elements is preserved. - See also - Notes - A null value is not the same as a NaN value. To drop NaN values, use - drop_nans().- Examples - >>> df = pl.DataFrame({"a": [1.0, None, 3.0, float("nan")]}) >>> df.select(pl.col("a").drop_nulls()) shape: (3, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ │ 3.0 │ │ NaN │ └─────┘ 
 - entropy(base: float = 2.718281828459045, *, normalize: bool = True) Self[source]
- Computes the entropy. - Uses the formula -sum(pk * log(pk)), where pk are the discrete probabilities. - Parameters:
- base
- Given base, defaults to e.
- normalize
- Normalize pk if it doesn’t sum to 1. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").entropy(base=2)) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 1.459148 │ └──────────┘ >>> df.select(pl.col("a").entropy(base=2, normalize=False)) shape: (1, 1) ┌───────────┐ │ a │ │ --- │ │ f64 │ ╞═══════════╡ │ -6.754888 │ └───────────┘ 
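To make the formula concrete, the two results above can be reproduced by hand (a verification sketch added here; plain Python, not Polars):
>>> import math
>>> values = [1, 2, 3]
>>> pk = [v / sum(values) for v in values]  # normalize so the probabilities sum to 1
>>> round(-sum(p * math.log2(p) for p in pk), 6)  # matches entropy(base=2)
1.459148
>>> round(-sum(v * math.log2(v) for v in values), 6)  # matches normalize=False
-6.754888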
 - eq(other: Any) Self[source]
- Method equivalent of equality operator - expr == other.- Parameters:
- other
- A literal or expression value to compare with. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [1.0, 2.0, float("nan"), 4.0], ... "y": [2.0, 2.0, float("nan"), 4.0], ... } ... ) >>> df.with_columns( ... pl.col("x").eq(pl.col("y")).alias("x == y"), ... ) shape: (4, 3) ┌─────┬─────┬────────┐ │ x ┆ y ┆ x == y │ │ --- ┆ --- ┆ --- │ │ f64 ┆ f64 ┆ bool │ ╞═════╪═════╪════════╡ │ 1.0 ┆ 2.0 ┆ false │ │ 2.0 ┆ 2.0 ┆ true │ │ NaN ┆ NaN ┆ false │ │ 4.0 ┆ 4.0 ┆ true │ └─────┴─────┴────────┘ 
 - eq_missing(other: Any) Self[source]
- Method equivalent of equality operator - expr == otherwhere- None == None.- This differs from default - eqwhere null values are propagated.- Parameters:
- other
- A literal or expression value to compare with. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [1.0, 2.0, float("nan"), 4.0, None, None], ... "y": [2.0, 2.0, float("nan"), 4.0, 5.0, None], ... } ... ) >>> df.with_columns( ... pl.col("x").eq(pl.col("y")).alias("x eq y"), ... pl.col("x").eq_missing(pl.col("y")).alias("x eq_missing y"), ... ) shape: (6, 4) ┌──────┬──────┬────────┬────────────────┐ │ x ┆ y ┆ x eq y ┆ x eq_missing y │ │ --- ┆ --- ┆ --- ┆ --- │ │ f64 ┆ f64 ┆ bool ┆ bool │ ╞══════╪══════╪════════╪════════════════╡ │ 1.0 ┆ 2.0 ┆ false ┆ false │ │ 2.0 ┆ 2.0 ┆ true ┆ true │ │ NaN ┆ NaN ┆ false ┆ false │ │ 4.0 ┆ 4.0 ┆ true ┆ true │ │ null ┆ 5.0 ┆ null ┆ false │ │ null ┆ null ┆ null ┆ true │ └──────┴──────┴────────┴────────────────┘ 
 - ewm_mean(
- *,
- com: float | None = None,
- span: float | None = None,
- half_life: float | None = None,
- alpha: float | None = None,
- adjust: bool = True,
- min_periods: int = 1,
- ignore_nulls: bool = True,
- Exponentially-weighted moving average. - Parameters:
- com
- Specify decay in terms of center of mass, \(\gamma\), with \[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]
- span
- Specify decay in terms of span, \(\theta\), with \[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]
- half_life
- Specify decay in terms of half-life, \(\lambda\), with \[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0\]
- alpha
- Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\). 
- adjust
- Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings - When - adjust=Truethe EW function is calculated using weights \(w_i = (1 - \alpha)^i\)
- When - adjust=Falsethe EW function is calculated recursively by\[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]
 
- min_periods
- Minimum number of observations in window required to have a value (otherwise result is null). 
- ignore_nulls
- Ignore missing values when calculating weights.
- When ignore_nulls=False, weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.
- When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.
 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").ewm_mean(com=1)) shape: (3, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 1.0 │ │ 1.666667 │ │ 2.428571 │ └──────────┘ 
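As a cross-check of the adjusted formula (an illustrative sketch in plain Python, added here): com=1 gives \(\alpha = 0.5\), and the resulting weighted averages reproduce the column above.
>>> x = [1, 2, 3]
>>> alpha = 1 / (1 + 1)  # com=1  ->  alpha = 1 / (1 + com) = 0.5
>>> for t in range(len(x)):
...     weights = [(1 - alpha) ** i for i in range(t + 1)]  # adjust=True weights w_i
...     numerator = sum(w * x[t - i] for i, w in enumerate(weights))
...     print(round(numerator / sum(weights), 6))
...
1.0
1.666667
2.428571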
 - ewm_std(
- *,
- com: float | None = None,
- span: float | None = None,
- half_life: float | None = None,
- alpha: float | None = None,
- adjust: bool = True,
- bias: bool = False,
- min_periods: int = 1,
- ignore_nulls: bool = True,
- Exponentially-weighted moving standard deviation. - Parameters:
- com
- Specify decay in terms of center of mass, \(\gamma\), with \[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]
- span
- Specify decay in terms of span, \(\theta\), with \[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]
- half_life
- Specify decay in terms of half-life, \(\lambda\), with \[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0\]
- alpha
- Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\). 
- adjust
- Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings - When - adjust=Truethe EW function is calculated using weights \(w_i = (1 - \alpha)^i\)
- When - adjust=Falsethe EW function is calculated recursively by\[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]
 
- bias
- When - bias=False, apply a correction to make the estimate statistically unbiased.
- min_periods
- Minimum number of observations in window required to have a value (otherwise result is null). 
- ignore_nulls
- Ignore missing values when calculating weights.
- When ignore_nulls=False, weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.
- When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.
 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").ewm_std(com=1)) shape: (3, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 0.0 │ │ 0.707107 │ │ 0.963624 │ └──────────┘ 
 - ewm_var(
- *,
- com: float | None = None,
- span: float | None = None,
- half_life: float | None = None,
- alpha: float | None = None,
- adjust: bool = True,
- bias: bool = False,
- min_periods: int = 1,
- ignore_nulls: bool = True,
- Exponentially-weighted moving variance. - Parameters:
- com
- Specify decay in terms of center of mass, \(\gamma\), with \[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]
- span
- Specify decay in terms of span, \(\theta\), with \[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]
- half_life
- Specify decay in terms of half-life, \(\lambda\), with \[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0\]
- alpha
- Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\). 
- adjust
- Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings - When - adjust=Truethe EW function is calculated using weights \(w_i = (1 - \alpha)^i\)
- When - adjust=Falsethe EW function is calculated recursively by\[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]
 
- bias
- When - bias=False, apply a correction to make the estimate statistically unbiased.
- min_periods
- Minimum number of observations in window required to have a value (otherwise result is null). 
- ignore_nulls
- Ignore missing values when calculating weights.
- When ignore_nulls=False, weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.
- When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.
 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").ewm_var(com=1)) shape: (3, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 0.0 │ │ 0.5 │ │ 0.928571 │ └──────────┘ 
 - exclude(
- columns: str | PolarsDataType | Collection[str] | Collection[PolarsDataType],
- *more_columns: str | PolarsDataType,
- Exclude columns from a multi-column expression. - Only works after a wildcard or regex column selection, and you cannot provide both string column names and dtypes (you may prefer to use selectors instead). - Parameters:
- columns
- The name or datatype of the column(s) to exclude. Accepts regular expression input. Regular expressions should start with - ^and end with- $.
- *more_columns
- Additional names or datatypes of columns to exclude, specified as positional arguments. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "aa": [1, 2, 3], ... "ba": ["a", "b", None], ... "cc": [None, 2.5, 1.5], ... } ... ) >>> df shape: (3, 3) ┌─────┬──────┬──────┐ │ aa ┆ ba ┆ cc │ │ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ f64 │ ╞═════╪══════╪══════╡ │ 1 ┆ a ┆ null │ │ 2 ┆ b ┆ 2.5 │ │ 3 ┆ null ┆ 1.5 │ └─────┴──────┴──────┘ - Exclude by column name(s): - >>> df.select(pl.all().exclude("ba")) shape: (3, 2) ┌─────┬──────┐ │ aa ┆ cc │ │ --- ┆ --- │ │ i64 ┆ f64 │ ╞═════╪══════╡ │ 1 ┆ null │ │ 2 ┆ 2.5 │ │ 3 ┆ 1.5 │ └─────┴──────┘ - Exclude by regex, e.g. removing all columns whose names end with the letter “a”: - >>> df.select(pl.all().exclude("^.*a$")) shape: (3, 1) ┌──────┐ │ cc │ │ --- │ │ f64 │ ╞══════╡ │ null │ │ 2.5 │ │ 1.5 │ └──────┘ - Exclude by dtype(s), e.g. removing all columns of type Int64 or Float64: - >>> df.select(pl.all().exclude([pl.Int64, pl.Float64])) shape: (3, 1) ┌──────┐ │ ba │ │ --- │ │ str │ ╞══════╡ │ a │ │ b │ │ null │ └──────┘ 
 - exp() Self[source]
- Compute the exponential, element-wise. - Examples - >>> df = pl.DataFrame({"values": [1.0, 2.0, 4.0]}) >>> df.select(pl.col("values").exp()) shape: (3, 1) ┌──────────┐ │ values │ │ --- │ │ f64 │ ╞══════════╡ │ 2.718282 │ │ 7.389056 │ │ 54.59815 │ └──────────┘ 
 - explode() Self[source]
- Explode a list expression. - This means that every item is expanded to a new row. - Returns:
- Expr
- Expression with the data type of the list elements. 
 
 - See also - Expr.list.explode
- Explode a list column. 
- Expr.str.explode
- Explode a string column. 
 - Examples - >>> df = pl.DataFrame( ... { ... "group": ["a", "b"], ... "values": [ ... [1, 2], ... [3, 4], ... ], ... } ... ) >>> df.select(pl.col("values").explode()) shape: (4, 1) ┌────────┐ │ values │ │ --- │ │ i64 │ ╞════════╡ │ 1 │ │ 2 │ │ 3 │ │ 4 │ └────────┘ 
 - extend_constant(value: PythonLiteral | None, n: int) Self[source]
- Extremely fast method for extending the Series with ‘n’ copies of a value. - Parameters:
- value
- A constant literal value (not an expression) with which to extend the expression result Series; can pass None to extend with nulls. 
- n
- The number of additional values that will be added. 
 
 - Examples - >>> df = pl.DataFrame({"values": [1, 2, 3]}) >>> df.select((pl.col("values") - 1).extend_constant(99, n=2)) shape: (5, 1) ┌────────┐ │ values │ │ --- │ │ i64 │ ╞════════╡ │ 0 │ │ 1 │ │ 2 │ │ 99 │ │ 99 │ └────────┘ 
 - fill_nan(value: int | float | Expr | None) Self[source]
- Fill floating point NaN value with a fill value. - Examples - >>> df = pl.DataFrame( ... { ... "a": [1.0, None, float("nan")], ... "b": [4.0, float("nan"), 6], ... } ... ) >>> df.with_columns(pl.col("b").fill_nan(0)) shape: (3, 2) ┌──────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞══════╪═════╡ │ 1.0 ┆ 4.0 │ │ null ┆ 0.0 │ │ NaN ┆ 6.0 │ └──────┴─────┘ 
- fill_null(value: Any | None = None, strategy: FillNullStrategy | None = None, limit: int | None = None) Self[source]
- Fill null values using the specified value or strategy. - To interpolate over null values, see interpolate(). See the examples below to fill nulls with an expression. - Parameters:
- value
- Value used to fill null values. 
- strategy{None, ‘forward’, ‘backward’, ‘min’, ‘max’, ‘mean’, ‘zero’, ‘one’}
- Strategy used to fill null values. 
- limit
- Number of consecutive null values to fill when using the ‘forward’ or ‘backward’ strategy. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, None], ... "b": [4, None, 6], ... } ... ) >>> df.with_columns(pl.col("b").fill_null(strategy="zero")) shape: (3, 2) ┌──────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞══════╪═════╡ │ 1 ┆ 4 │ │ 2 ┆ 0 │ │ null ┆ 6 │ └──────┴─────┘ >>> df.with_columns(pl.col("b").fill_null(99)) shape: (3, 2) ┌──────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞══════╪═════╡ │ 1 ┆ 4 │ │ 2 ┆ 99 │ │ null ┆ 6 │ └──────┴─────┘ >>> df.with_columns(pl.col("b").fill_null(strategy="forward")) shape: (3, 2) ┌──────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞══════╪═════╡ │ 1 ┆ 4 │ │ 2 ┆ 4 │ │ null ┆ 6 │ └──────┴─────┘ >>> df.with_columns(pl.col("b").fill_null(pl.col("b").median())) shape: (3, 2) ┌──────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ f64 │ ╞══════╪═════╡ │ 1 ┆ 4.0 │ │ 2 ┆ 5.0 │ │ null ┆ 6.0 │ └──────┴─────┘ >>> df.with_columns(pl.all().fill_null(pl.all().median())) shape: (3, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════╡ │ 1.0 ┆ 4.0 │ │ 2.0 ┆ 5.0 │ │ 1.5 ┆ 6.0 │ └─────┴─────┘ 
 - filter(predicate: Expr) Self[source]
- Filter a single column. - The original order of the remaining elements is preserved. - Mostly useful in an aggregation context. If you want to filter on a DataFrame level, use - LazyFrame.filter.- Parameters:
- predicate
- Boolean expression. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "group_col": ["g1", "g1", "g2"], ... "b": [1, 2, 3], ... } ... ) >>> df.group_by("group_col").agg( ... lt=pl.col("b").filter(pl.col("b") < 2).sum(), ... gte=pl.col("b").filter(pl.col("b") >= 2).sum(), ... ).sort("group_col") shape: (2, 3) ┌───────────┬─────┬─────┐ │ group_col ┆ lt ┆ gte │ │ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 │ ╞═══════════╪═════╪═════╡ │ g1 ┆ 1 ┆ 2 │ │ g2 ┆ 0 ┆ 3 │ └───────────┴─────┴─────┘ 
 - first() Self[source]
- Get the first value. - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2]}) >>> df.select(pl.col("a").first()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 1 │ └─────┘ 
 - flatten() Self[source]
- Flatten a list or string column. - Alias for - polars.expr.list.ExprListNameSpace.explode().- Examples - >>> df = pl.DataFrame( ... { ... "group": ["a", "b", "b"], ... "values": [[1, 2], [2, 3], [4]], ... } ... ) >>> df.group_by("group").agg(pl.col("values").flatten()) shape: (2, 2) ┌───────┬───────────┐ │ group ┆ values │ │ --- ┆ --- │ │ str ┆ list[i64] │ ╞═══════╪═══════════╡ │ a ┆ [1, 2] │ │ b ┆ [2, 3, 4] │ └───────┴───────────┘ 
 - floor() Self[source]
- Rounds down to the nearest integer value. - Only works on floating point Series. - Examples - >>> df = pl.DataFrame({"a": [0.3, 0.5, 1.0, 1.1]}) >>> df.select(pl.col("a").floor()) shape: (4, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 0.0 │ │ 0.0 │ │ 1.0 │ │ 1.0 │ └─────┘ 
 - floordiv(other: Any) Self[source]
- Method equivalent of integer division operator - expr // other.- Parameters:
- other
- Numeric literal or expression value. 
 
 - See also - Examples - >>> df = pl.DataFrame({"x": [1, 2, 3, 4, 5]}) >>> df.with_columns( ... pl.col("x").truediv(2).alias("x/2"), ... pl.col("x").floordiv(2).alias("x//2"), ... ) shape: (5, 3) ┌─────┬─────┬──────┐ │ x ┆ x/2 ┆ x//2 │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ i64 │ ╞═════╪═════╪══════╡ │ 1 ┆ 0.5 ┆ 0 │ │ 2 ┆ 1.0 ┆ 1 │ │ 3 ┆ 1.5 ┆ 1 │ │ 4 ┆ 2.0 ┆ 2 │ │ 5 ┆ 2.5 ┆ 2 │ └─────┴─────┴──────┘ 
 - forward_fill(limit: int | None = None) Self[source]
- Fill missing values with the latest seen values. - Parameters:
- limit
- The number of consecutive null values to forward fill. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, None], ... "b": [4, None, 6], ... } ... ) >>> df.select(pl.all().forward_fill()) shape: (3, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═════╡ │ 1 ┆ 4 │ │ 2 ┆ 4 │ │ 2 ┆ 6 │ └─────┴─────┘ 
 - classmethod from_json(value: str) Self[source]
- Read an expression from a JSON encoded string to construct an Expression. - Parameters:
- value
- JSON encoded string value 
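A round-trip sketch (added for illustration; it assumes the companion serializer Expr.meta.write_json is available in this version):
>>> import polars as pl
>>> expr = (pl.col("foo") * 2).alias("bar")
>>> json_string = expr.meta.write_json()       # assumed serializer: expression tree -> JSON string
>>> restored = pl.Expr.from_json(json_string)  # reconstruct the expression from the JSON string
>>> pl.DataFrame({"foo": [1, 2]}).select(restored)  # behaves like the original expression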
 
 
 - gather(indices: int | list[int] | Expr | Series | np.ndarray[Any, Any]) Self[source]
- Take values by index. - Parameters:
- indices
- An expression that leads to a UInt32 dtyped Series. 
 
- Returns:
- Expr
- Expression of the same data type. 
 
 - See also - Expr.get
- Take a single value 
 - Examples - >>> df = pl.DataFrame( ... { ... "group": [ ... "one", ... "one", ... "one", ... "two", ... "two", ... "two", ... ], ... "value": [1, 98, 2, 3, 99, 4], ... } ... ) >>> df.group_by("group", maintain_order=True).agg( ... pl.col("value").gather([2, 1]) ... ) shape: (2, 2) ┌───────┬───────────┐ │ group ┆ value │ │ --- ┆ --- │ │ str ┆ list[i64] │ ╞═══════╪═══════════╡ │ one ┆ [2, 98] │ │ two ┆ [4, 99] │ └───────┴───────────┘ 
 - gather_every(n: int) Self[source]
- Take every nth value in the Series and return as a new Series. - Parameters:
- n
- Gather every n-th row. 
 
 - Examples - >>> df = pl.DataFrame({"foo": [1, 2, 3, 4, 5, 6, 7, 8, 9]}) >>> df.select(pl.col("foo").gather_every(3)) shape: (3, 1) ┌─────┐ │ foo │ │ --- │ │ i64 │ ╞═════╡ │ 1 │ │ 4 │ │ 7 │ └─────┘ 
 - ge(other: Any) Self[source]
- Method equivalent of “greater than or equal” operator - expr >= other.- Parameters:
- other
- A literal or expression value to compare with. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [5.0, 4.0, float("nan"), 2.0], ... "y": [5.0, 3.0, float("nan"), 1.0], ... } ... ) >>> df.with_columns( ... pl.col("x").ge(pl.col("y")).alias("x >= y"), ... ) shape: (4, 3) ┌─────┬─────┬────────┐ │ x ┆ y ┆ x >= y │ │ --- ┆ --- ┆ --- │ │ f64 ┆ f64 ┆ bool │ ╞═════╪═════╪════════╡ │ 5.0 ┆ 5.0 ┆ true │ │ 4.0 ┆ 3.0 ┆ true │ │ NaN ┆ NaN ┆ false │ │ 2.0 ┆ 1.0 ┆ true │ └─────┴─────┴────────┘ 
 - get(index: int | Expr) Self[source]
- Return a single value by index. - Parameters:
- index
- An expression that leads to a UInt32 index. 
 
- Returns:
- Expr
- Expression of the same data type. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "group": [ ... "one", ... "one", ... "one", ... "two", ... "two", ... "two", ... ], ... "value": [1, 98, 2, 3, 99, 4], ... } ... ) >>> df.group_by("group", maintain_order=True).agg(pl.col("value").get(1)) shape: (2, 2) ┌───────┬───────┐ │ group ┆ value │ │ --- ┆ --- │ │ str ┆ i64 │ ╞═══════╪═══════╡ │ one ┆ 98 │ │ two ┆ 99 │ └───────┴───────┘ 
 - gt(other: Any) Self[source]
- Method equivalent of “greater than” operator - expr > other.- Parameters:
- other
- A literal or expression value to compare with. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [5.0, 4.0, float("nan"), 2.0], ... "y": [5.0, 3.0, float("nan"), 1.0], ... } ... ) >>> df.with_columns( ... pl.col("x").gt(pl.col("y")).alias("x > y"), ... ) shape: (4, 3) ┌─────┬─────┬───────┐ │ x ┆ y ┆ x > y │ │ --- ┆ --- ┆ --- │ │ f64 ┆ f64 ┆ bool │ ╞═════╪═════╪═══════╡ │ 5.0 ┆ 5.0 ┆ false │ │ 4.0 ┆ 3.0 ┆ true │ │ NaN ┆ NaN ┆ false │ │ 2.0 ┆ 1.0 ┆ true │ └─────┴─────┴───────┘ 
- hash(seed: int = 0, seed_1: int | None = None, seed_2: int | None = None, seed_3: int | None = None) Self[source]
- Hash the elements in the selection. - The hash value is of type - UInt64.- Parameters:
- seed
- Random seed parameter. Defaults to 0. 
- seed_1
- Random seed parameter. Defaults to - seedif not set.
- seed_2
- Random seed parameter. Defaults to - seedif not set.
- seed_3
- Random seed parameter. Defaults to - seedif not set.
 
- Notes - This implementation of - hash()does not guarantee stable results across different Polars versions. Its stability is only guaranteed within a single version.- Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, None], ... "b": ["x", None, "z"], ... } ... ) >>> df.with_columns(pl.all().hash(10, 20, 30, 40)) shape: (3, 2) ┌──────────────────────┬──────────────────────┐ │ a ┆ b │ │ --- ┆ --- │ │ u64 ┆ u64 │ ╞══════════════════════╪══════════════════════╡ │ 9774092659964970114 ┆ 13614470193936745724 │ │ 1101441246220388612 ┆ 11638928888656214026 │ │ 11638928888656214026 ┆ 13382926553367784577 │ └──────────────────────┴──────────────────────┘ 
 - head(n: int | Expr = 10) Self[source]
- Get the first - nrows.- Parameters:
- n
- Number of rows to return. 
 
 - Examples - >>> df = pl.DataFrame({"foo": [1, 2, 3, 4, 5, 6, 7]}) >>> df.head(3) shape: (3, 1) ┌─────┐ │ foo │ │ --- │ │ i64 │ ╞═════╡ │ 1 │ │ 2 │ │ 3 │ └─────┘ 
 - implode() Self[source]
- Aggregate values into a list. - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3], ... "b": [4, 5, 6], ... } ... ) >>> df.select(pl.all().implode()) shape: (1, 2) ┌───────────┬───────────┐ │ a ┆ b │ │ --- ┆ --- │ │ list[i64] ┆ list[i64] │ ╞═══════════╪═══════════╡ │ [1, 2, 3] ┆ [4, 5, 6] │ └───────────┴───────────┘ 
 - inspect(fmt: str = '{}') Self[source]
- Print the value that this expression evaluates to and pass on the value. - Examples - >>> df = pl.DataFrame({"foo": [1, 1, 2]}) >>> df.select(pl.col("foo").cum_sum().inspect("value is: {}").alias("bar")) value is: shape: (3,) Series: 'foo' [i64] [ 1 2 4 ] shape: (3, 1) ┌─────┐ │ bar │ │ --- │ │ i64 │ ╞═════╡ │ 1 │ │ 2 │ │ 4 │ └─────┘ 
 - interpolate(method: InterpolationMethod = 'linear') Self[source]
- Fill null values using interpolation. - Parameters:
- method{‘linear’, ‘nearest’}
- Interpolation method. 
 
 - Examples - Fill null values using linear interpolation. - >>> df = pl.DataFrame( ... { ... "a": [1, None, 3], ... "b": [1.0, float("nan"), 3.0], ... } ... ) >>> df.select(pl.all().interpolate()) shape: (3, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════╡ │ 1.0 ┆ 1.0 │ │ 2.0 ┆ NaN │ │ 3.0 ┆ 3.0 │ └─────┴─────┘ - Fill null values using nearest interpolation. - >>> df.select(pl.all().interpolate("nearest")) shape: (3, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ f64 │ ╞═════╪═════╡ │ 1 ┆ 1.0 │ │ 3 ┆ NaN │ │ 3 ┆ 3.0 │ └─────┴─────┘ - Regrid data to a new grid. - >>> df_original_grid = pl.DataFrame( ... { ... "grid_points": [1, 3, 10], ... "values": [2.0, 6.0, 20.0], ... } ... ) # Interpolate from this to the new grid >>> df_new_grid = pl.DataFrame({"grid_points": range(1, 11)}) >>> df_new_grid.join( ... df_original_grid, on="grid_points", how="left" ... ).with_columns(pl.col("values").interpolate()) shape: (10, 2) ┌─────────────┬────────┐ │ grid_points ┆ values │ │ --- ┆ --- │ │ i64 ┆ f64 │ ╞═════════════╪════════╡ │ 1 ┆ 2.0 │ │ 2 ┆ 4.0 │ │ 3 ┆ 6.0 │ │ 4 ┆ 8.0 │ │ … ┆ … │ │ 7 ┆ 14.0 │ │ 8 ┆ 16.0 │ │ 9 ┆ 18.0 │ │ 10 ┆ 20.0 │ └─────────────┴────────┘ 
 - is_between(
- lower_bound: IntoExpr,
- upper_bound: IntoExpr,
- closed: ClosedInterval = 'both',
- ) Self[source]
- Check if this expression is between the given start and end values. - Parameters:
- lower_bound
- Lower bound value. Accepts expression input. Strings are parsed as column names, other non-expression inputs are parsed as literals. 
- upper_bound
- Upper bound value. Accepts expression input. Strings are parsed as column names, other non-expression inputs are parsed as literals. 
- closed{‘both’, ‘left’, ‘right’, ‘none’}
- Define which sides of the interval are closed (inclusive). 
 
- Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame({"num": [1, 2, 3, 4, 5]}) >>> df.with_columns(pl.col("num").is_between(2, 4).alias("is_between")) shape: (5, 2) ┌─────┬────────────┐ │ num ┆ is_between │ │ --- ┆ --- │ │ i64 ┆ bool │ ╞═════╪════════════╡ │ 1 ┆ false │ │ 2 ┆ true │ │ 3 ┆ true │ │ 4 ┆ true │ │ 5 ┆ false │ └─────┴────────────┘ - Use the - closedargument to include or exclude the values at the bounds:- >>> df.with_columns( ... pl.col("num").is_between(2, 4, closed="left").alias("is_between") ... ) shape: (5, 2) ┌─────┬────────────┐ │ num ┆ is_between │ │ --- ┆ --- │ │ i64 ┆ bool │ ╞═════╪════════════╡ │ 1 ┆ false │ │ 2 ┆ true │ │ 3 ┆ true │ │ 4 ┆ false │ │ 5 ┆ false │ └─────┴────────────┘ - You can also use strings as well as numeric/temporal values (note: ensure that string literals are wrapped with - litso as not to conflate them with column names):- >>> df = pl.DataFrame({"a": ["a", "b", "c", "d", "e"]}) >>> df.with_columns( ... pl.col("a") ... .is_between(pl.lit("a"), pl.lit("c"), closed="both") ... .alias("is_between") ... ) shape: (5, 2) ┌─────┬────────────┐ │ a ┆ is_between │ │ --- ┆ --- │ │ str ┆ bool │ ╞═════╪════════════╡ │ a ┆ true │ │ b ┆ true │ │ c ┆ true │ │ d ┆ false │ │ e ┆ false │ └─────┴────────────┘ 
 - is_duplicated() Self[source]
- Return a boolean mask indicating duplicated values. - Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2]}) >>> df.select(pl.col("a").is_duplicated()) shape: (3, 1) ┌───────┐ │ a │ │ --- │ │ bool │ ╞═══════╡ │ true │ │ true │ │ false │ └───────┘ 
 - is_finite() Self[source]
- Returns a boolean Series indicating which values are finite. - Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame( ... { ... "A": [1.0, 2], ... "B": [3.0, float("inf")], ... } ... ) >>> df.select(pl.all().is_finite()) shape: (2, 2) ┌──────┬───────┐ │ A ┆ B │ │ --- ┆ --- │ │ bool ┆ bool │ ╞══════╪═══════╡ │ true ┆ true │ │ true ┆ false │ └──────┴───────┘ 
 - is_first() Self[source]
- Return a boolean mask indicating the first occurrence of each distinct value. - Deprecated since version 0.19.3: This method has been renamed to - Expr.is_first_distinct().- Returns:
- Expr
- Expression of data type - Boolean.
 
 
 - is_first_distinct() Self[source]
- Return a boolean mask indicating the first occurrence of each distinct value. - Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2, 3, 2]}) >>> df.with_columns(pl.col("a").is_first_distinct().alias("first")) shape: (5, 2) ┌─────┬───────┐ │ a ┆ first │ │ --- ┆ --- │ │ i64 ┆ bool │ ╞═════╪═══════╡ │ 1 ┆ true │ │ 1 ┆ false │ │ 2 ┆ true │ │ 3 ┆ true │ │ 2 ┆ false │ └─────┴───────┘ 
 - is_in(other: Expr | Collection[Any] | Series) Self[source]
- Check if elements of this expression are present in the other Series. - Parameters:
- other
- Series or sequence of primitive type. 
 
- Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame( ... {"sets": [[1, 2, 3], [1, 2], [9, 10]], "optional_members": [1, 2, 3]} ... ) >>> df.with_columns(contains=pl.col("optional_members").is_in("sets")) shape: (3, 3) ┌───────────┬──────────────────┬──────────┐ │ sets ┆ optional_members ┆ contains │ │ --- ┆ --- ┆ --- │ │ list[i64] ┆ i64 ┆ bool │ ╞═══════════╪══════════════════╪══════════╡ │ [1, 2, 3] ┆ 1 ┆ true │ │ [1, 2] ┆ 2 ┆ true │ │ [9, 10] ┆ 3 ┆ false │ └───────────┴──────────────────┴──────────┘ 
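- A simpler sketch, checking membership against a plain Python list (illustrative values, same frame as above): >>> df.with_columns(in_list=pl.col("optional_members").is_in([1, 2])) shape: (3, 3) ┌───────────┬──────────────────┬─────────┐ │ sets ┆ optional_members ┆ in_list │ │ --- ┆ --- ┆ --- │ │ list[i64] ┆ i64 ┆ bool │ ╞═══════════╪══════════════════╪═════════╡ │ [1, 2, 3] ┆ 1 ┆ true │ │ [1, 2] ┆ 2 ┆ true │ │ [9, 10] ┆ 3 ┆ false │ └───────────┴──────────────────┴─────────┘ 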
 - is_infinite() Self[source]
- Returns a boolean Series indicating which values are infinite. - Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame( ... { ... "A": [1.0, 2], ... "B": [3.0, float("inf")], ... } ... ) >>> df.select(pl.all().is_infinite()) shape: (2, 2) ┌───────┬───────┐ │ A ┆ B │ │ --- ┆ --- │ │ bool ┆ bool │ ╞═══════╪═══════╡ │ false ┆ false │ │ false ┆ true │ └───────┴───────┘ 
 - is_last() Self[source]
- Return a boolean mask indicating the last occurrence of each distinct value. - Deprecated since version 0.19.3: This method has been renamed to - Expr.is_last_distinct().- Returns:
- Expr
- Expression of data type - Boolean.
 
 
 - is_last_distinct() Self[source]
- Return a boolean mask indicating the last occurrence of each distinct value. - Returns:
- Expr
- Expression of data type - Boolean.
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2, 3, 2]}) >>> df.with_columns(pl.col("a").is_last_distinct().alias("last")) shape: (5, 2) ┌─────┬───────┐ │ a ┆ last │ │ --- ┆ --- │ │ i64 ┆ bool │ ╞═════╪═══════╡ │ 1 ┆ false │ │ 1 ┆ true │ │ 2 ┆ false │ │ 3 ┆ true │ │ 2 ┆ true │ └─────┴───────┘ 
 - is_nan() Self[source]
- Returns a boolean Series indicating which values are NaN. - Notes - Floating point - NaN(Not A Number) should not be confused with missing data represented as- Null/None.- Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, None, 1, 5], ... "b": [1.0, 2.0, float("nan"), 1.0, 5.0], ... } ... ) >>> df.with_columns(pl.col(pl.Float64).is_nan().name.suffix("_isnan")) shape: (5, 3) ┌──────┬─────┬─────────┐ │ a ┆ b ┆ b_isnan │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ bool │ ╞══════╪═════╪═════════╡ │ 1 ┆ 1.0 ┆ false │ │ 2 ┆ 2.0 ┆ false │ │ null ┆ NaN ┆ true │ │ 1 ┆ 1.0 ┆ false │ │ 5 ┆ 5.0 ┆ false │ └──────┴─────┴─────────┘ 
 - is_not() Self[source]
- Negate a boolean expression. - Deprecated since version 0.19.2: This method has been renamed to - Expr.not_().
 - is_not_nan() Self[source]
- Returns a boolean Series indicating which values are not NaN. - Notes - Floating point - NaN(Not A Number) should not be confused with missing data represented as- Null/None.- Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, None, 1, 5], ... "b": [1.0, 2.0, float("nan"), 1.0, 5.0], ... } ... ) >>> df.with_columns(pl.col(pl.Float64).is_not_nan().name.suffix("_is_not_nan")) shape: (5, 3) ┌──────┬─────┬──────────────┐ │ a ┆ b ┆ b_is_not_nan │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ bool │ ╞══════╪═════╪══════════════╡ │ 1 ┆ 1.0 ┆ true │ │ 2 ┆ 2.0 ┆ true │ │ null ┆ NaN ┆ false │ │ 1 ┆ 1.0 ┆ true │ │ 5 ┆ 5.0 ┆ true │ └──────┴─────┴──────────────┘ 
 - is_not_null() Self[source]
- Returns a boolean Series indicating which values are not null. - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, None, 1, 5], ... "b": [1.0, 2.0, float("nan"), 1.0, 5.0], ... } ... ) >>> df.with_columns( ... pl.all().is_not_null().name.suffix("_not_null") # nan != null ... ) shape: (5, 4) ┌──────┬─────┬────────────┬────────────┐ │ a ┆ b ┆ a_not_null ┆ b_not_null │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ bool ┆ bool │ ╞══════╪═════╪════════════╪════════════╡ │ 1 ┆ 1.0 ┆ true ┆ true │ │ 2 ┆ 2.0 ┆ true ┆ true │ │ null ┆ NaN ┆ false ┆ true │ │ 1 ┆ 1.0 ┆ true ┆ true │ │ 5 ┆ 5.0 ┆ true ┆ true │ └──────┴─────┴────────────┴────────────┘ 
 - is_null() Self[source]
- Returns a boolean Series indicating which values are null. - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, None, 1, 5], ... "b": [1.0, 2.0, float("nan"), 1.0, 5.0], ... } ... ) >>> df.with_columns(pl.all().is_null().name.suffix("_isnull")) # nan != null shape: (5, 4) ┌──────┬─────┬──────────┬──────────┐ │ a ┆ b ┆ a_isnull ┆ b_isnull │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ bool ┆ bool │ ╞══════╪═════╪══════════╪══════════╡ │ 1 ┆ 1.0 ┆ false ┆ false │ │ 2 ┆ 2.0 ┆ false ┆ false │ │ null ┆ NaN ┆ true ┆ false │ │ 1 ┆ 1.0 ┆ false ┆ false │ │ 5 ┆ 5.0 ┆ false ┆ false │ └──────┴─────┴──────────┴──────────┘ 
 - is_unique() Self[source]
- Get mask of unique values. - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2]}) >>> df.select(pl.col("a").is_unique()) shape: (3, 1) ┌───────┐ │ a │ │ --- │ │ bool │ ╞═══════╡ │ false │ │ false │ │ true │ └───────┘ 
 - keep_name() Self[source]
- Keep the original root name of the expression. - Deprecated since version 0.19.12: This method has been renamed to - name.keep().- See also - Notes - Due to implementation constraints, this method can only be called as the last expression in a chain. - Examples - Undo an alias operation. - >>> df = pl.DataFrame( ... { ... "a": [1, 2], ... "b": [3, 4], ... } ... ) >>> df.with_columns((pl.col("a") * 9).alias("c").name.keep()) shape: (2, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═════╡ │ 9 ┆ 3 │ │ 18 ┆ 4 │ └─────┴─────┘ - Prevent errors due to duplicate column names. - >>> df.select((pl.lit(10) / pl.all()).name.keep()) shape: (2, 2) ┌──────┬──────────┐ │ a ┆ b │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞══════╪══════════╡ │ 10.0 ┆ 3.333333 │ │ 5.0 ┆ 2.5 │ └──────┴──────────┘ 
 - kurtosis(*, fisher: bool = True, bias: bool = True) Self[source]
- Compute the kurtosis (Fisher or Pearson) of a dataset. - Kurtosis is the fourth central moment divided by the square of the variance. If Fisher’s definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution. If bias is False then the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators. - See scipy.stats for more information - Parameters:
- fisherbool, optional
- If True, Fisher’s definition is used (normal ==> 0.0). If False, Pearson’s definition is used (normal ==> 3.0). 
- biasbool, optional
- If False, the calculations are corrected for statistical bias. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 2, 1]}) >>> df.select(pl.col("a").kurtosis()) shape: (1, 1) ┌───────────┐ │ a │ │ --- │ │ f64 │ ╞═══════════╡ │ -1.153061 │ └───────────┘ 
 - last() Self[source]
- Get the last value. - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2]}) >>> df.select(pl.col("a").last()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 2 │ └─────┘ 
 - le(other: Any) Self[source]
- Method equivalent of “less than or equal” operator - expr <= other.- Parameters:
- other
- A literal or expression value to compare with. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [5.0, 4.0, float("nan"), 0.5], ... "y": [5.0, 3.5, float("nan"), 2.0], ... } ... ) >>> df.with_columns( ... pl.col("x").le(pl.col("y")).alias("x <= y"), ... ) shape: (4, 3) ┌─────┬─────┬────────┐ │ x ┆ y ┆ x <= y │ │ --- ┆ --- ┆ --- │ │ f64 ┆ f64 ┆ bool │ ╞═════╪═════╪════════╡ │ 5.0 ┆ 5.0 ┆ true │ │ 4.0 ┆ 3.5 ┆ false │ │ NaN ┆ NaN ┆ false │ │ 0.5 ┆ 2.0 ┆ true │ └─────┴─────┴────────┘ 
 - len() Self[source]
- Return the number of elements in the column. - Null values are treated like regular elements in this context. - Alias for - count().- Examples - >>> df = pl.DataFrame({"a": [8, 9, 10], "b": [None, 4, 4]}) >>> df.select(pl.all().len()) shape: (1, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ u32 ┆ u32 │ ╞═════╪═════╡ │ 3 ┆ 3 │ └─────┴─────┘ 
 - limit(n: int | Expr = 10) Self[source]
- Get the first - nrows (alias for- Expr.head()).- Parameters:
- n
- Number of rows to return. 
 
 - Examples - >>> df = pl.DataFrame({"foo": [1, 2, 3, 4, 5, 6, 7]}) >>> df.limit(3) shape: (3, 1) ┌─────┐ │ foo │ │ --- │ │ i64 │ ╞═════╡ │ 1 │ │ 2 │ │ 3 │ └─────┘ 
 - log(base: float = 2.718281828459045) Self[source]
- Compute the logarithm to a given base. - Parameters:
- base
- Given base, defaults to - e
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").log(base=2)) shape: (3, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 0.0 │ │ 1.0 │ │ 1.584963 │ └──────────┘ 
 - log10() Self[source]
- Compute the base 10 logarithm of the input array, element-wise. - Examples - >>> df = pl.DataFrame({"values": [1.0, 2.0, 4.0]}) >>> df.select(pl.col("values").log10()) shape: (3, 1) ┌─────────┐ │ values │ │ --- │ │ f64 │ ╞═════════╡ │ 0.0 │ │ 0.30103 │ │ 0.60206 │ └─────────┘ 
 - log1p() Self[source]
- Compute the natural logarithm of each element plus one. - This computes - log(1 + x)but is more numerically stable for- xclose to zero.- Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").log1p()) shape: (3, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 0.693147 │ │ 1.098612 │ │ 1.386294 │ └──────────┘ 
 - lower_bound() Self[source]
- Calculate the lower bound. - Returns a unit Series with the lowest value possible for the dtype of this expression. - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 2, 1]}) >>> df.select(pl.col("a").lower_bound()) shape: (1, 1) ┌──────────────────────┐ │ a │ │ --- │ │ i64 │ ╞══════════════════════╡ │ -9223372036854775808 │ └──────────────────────┘ 
 - lt(other: Any) Self[source]
- Method equivalent of “less than” operator - expr < other.- Parameters:
- other
- A literal or expression value to compare with. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [1.0, 2.0, float("nan"), 3.0], ... "y": [2.0, 2.0, float("nan"), 4.0], ... } ... ) >>> df.with_columns( ... pl.col("x").lt(pl.col("y")).alias("x < y"), ... ) shape: (4, 3) ┌─────┬─────┬───────┐ │ x ┆ y ┆ x < y │ │ --- ┆ --- ┆ --- │ │ f64 ┆ f64 ┆ bool │ ╞═════╪═════╪═══════╡ │ 1.0 ┆ 2.0 ┆ true │ │ 2.0 ┆ 2.0 ┆ false │ │ NaN ┆ NaN ┆ false │ │ 3.0 ┆ 4.0 ┆ true │ └─────┴─────┴───────┘ 
 - map(
- function: Callable[[Series], Series | Any],
- return_dtype: PolarsDataType | None = None,
- *,
- agg_list: bool = False,
- ) Self[source]
- Apply a custom python function to a Series or sequence of Series. - Deprecated since version 0.19.0: This method has been renamed to - Expr.map_batches().- Parameters:
- function
- Lambda/ function to apply. 
- return_dtype
- Dtype of the output Series. 
- agg_list
- Aggregate list 
 
 
 - map_alias(function: Callable[[str], str]) Self[source]
- Rename the output of an expression by mapping a function over the root name. - Deprecated since version 0.19.12: This method has been renamed to - name.map().- Parameters:
- function
- Function that maps a root name to a new name. 
 
 - Examples - Remove a common suffix and convert to lower case. - >>> df = pl.DataFrame( ... { ... "A_reverse": [3, 2, 1], ... "B_reverse": ["z", "y", "x"], ... } ... ) >>> df.with_columns( ... pl.all().reverse().name.map(lambda c: c.rstrip("_reverse").lower()) ... ) shape: (3, 4) ┌───────────┬───────────┬─────┬─────┐ │ A_reverse ┆ B_reverse ┆ a ┆ b │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ i64 ┆ str │ ╞═══════════╪═══════════╪═════╪═════╡ │ 3 ┆ z ┆ 1 ┆ x │ │ 2 ┆ y ┆ 2 ┆ y │ │ 1 ┆ x ┆ 3 ┆ z │ └───────────┴───────────┴─────┴─────┘ 
 - map_batches(
- function: Callable[[Series], Series | Any],
- return_dtype: PolarsDataType | None = None,
- *,
- agg_list: bool = False,
- ) Self[source]
- Apply a custom python function to a whole Series or sequence of Series. - The output of this custom function must be a Series. If you want to apply a custom function elementwise over single values, see - map_elements(). A reasonable use case for- mapfunctions is transforming the values represented by an expression using a third-party library.- Read more in the book. - Parameters:
- function
- Lambda/function to apply. 
- return_dtype
- Dtype of the output Series. 
- agg_list
- Aggregate list. 
 
- Warning - If - return_dtypeis not provided, this may lead to unexpected results. We allow this, but it is considered a bug in the user’s query.- See also - Notes - If you are looking to map a function over a window function or group_by context, refer to - map_elementsinstead.- Examples - >>> df = pl.DataFrame( ... { ... "sine": [0.0, 1.0, 0.0, -1.0], ... "cosine": [1.0, 0.0, -1.0, 0.0], ... } ... ) >>> df.select(pl.all().map_batches(lambda x: x.to_numpy().argmax())) shape: (1, 2) ┌──────┬────────┐ │ sine ┆ cosine │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞══════╪════════╡ │ 1 ┆ 0 │ └──────┴────────┘ 
- map_dict(mapping: dict[Any, Any], *, default: Any = _NoDefault.no_default, return_dtype: PolarsDataType | None = None) Self[source]
- Replace values in column according to remapping dictionary. - Deprecated since version 0.19.16: This method has been renamed to - replace(). The default behavior has changed to keep any values not present in the mapping unchanged. Pass- default=Noneto keep existing behavior.- Parameters:
- mapping
- Dictionary containing the before/after values to map. 
- default
- Value to use when the remapping dict does not contain the lookup value. Accepts expression input. Non-expression inputs are parsed as literals. Use - pl.first() to keep the original value.
- return_dtype
- Set return dtype to override automatic return dtype determination. 
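 
- Examples - A migration sketch with illustrative values, reproducing the old default behavior (unmapped values become null) via the renamed method: >>> df = pl.DataFrame({"a": [1, 2, 2, 3]}) >>> df.with_columns(pl.col("a").replace({2: 100}, default=None).alias("remapped")) shape: (4, 2) ┌─────┬──────────┐ │ a ┆ remapped │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪══════════╡ │ 1 ┆ null │ │ 2 ┆ 100 │ │ 2 ┆ 100 │ │ 3 ┆ null │ └─────┴──────────┘ 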
 
 
 - map_elements(
- function: Callable[[Series], Series] | Callable[[Any], Any],
- return_dtype: PolarsDataType | None = None,
- *,
- skip_nulls: bool = True,
- pass_name: bool = False,
- strategy: MapElementsStrategy = 'thread_local',
- ) Self[source]
- Map a custom/user-defined function (UDF) to each element of a column. - Warning - This method is much slower than the native expressions API. Only use it if you cannot implement your logic otherwise. - The UDF is applied to each element of a column. Note that, in a GroupBy context, the column will have been pre-aggregated and so each element will itself be a Series. Therefore, depending on the context, requirements for - functiondiffer:- Selection
- Expects - functionto be of type- Callable[[Any], Any]. Applies a Python function to each individual value in the column.
 
- GroupBy
- Expects - functionto be of type- Callable[[Series], Any]. For each group, applies a Python function to the slice of the column corresponding to that group.
 
 - Parameters:
- function
- Lambda/function to map. 
- return_dtype
- Dtype of the output Series. If not set, the dtype will be - pl.Unknown.
- skip_nulls
- Don’t map the function over values that contain nulls (this is faster). 
- pass_name
- Pass the Series name to the custom function (this is more expensive). 
- strategy{‘thread_local’, ‘threading’}
- This functionality is considered experimental and may be removed/changed. - ‘thread_local’: run the python function on a single thread. 
- ‘threading’: run the python function on separate threads. Use with care as this can slow performance. This might only speed up your code if the amount of work per element is significant and the python function releases the GIL (e.g. via calling a C function). 
 
 
 - Warning - If - return_dtypeis not provided, this may lead to unexpected results. We allow this, but it is considered a bug in the user’s query.- Notes - Using - map_elementsis strongly discouraged as you will be effectively running python “for” loops, which will be very slow. Wherever possible you should prefer the native expression API to achieve the best performance.
- If your function is expensive and you don’t want it to be called more than once for a given input, consider applying an - @lru_cachedecorator to it. If your data is suitable you may achieve significant speedups.
- Window function application using - overis considered a GroupBy context here, so- map_elementscan be used to map functions over window groups.
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3, 1], ... "b": ["a", "b", "c", "c"], ... } ... ) - The function is applied to each element of column - 'a':- >>> df.with_columns( ... pl.col("a").map_elements(lambda x: x * 2).alias("a_times_2"), ... ) shape: (4, 3) ┌─────┬─────┬───────────┐ │ a ┆ b ┆ a_times_2 │ │ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ i64 │ ╞═════╪═════╪═══════════╡ │ 1 ┆ a ┆ 2 │ │ 2 ┆ b ┆ 4 │ │ 3 ┆ c ┆ 6 │ │ 1 ┆ c ┆ 2 │ └─────┴─────┴───────────┘ - Tip: it is better to implement this with an expression: - >>> df.with_columns( ... (pl.col("a") * 2).alias("a_times_2"), ... ) - In a GroupBy context, each element of the column is itself a Series: - >>> ( ... df.lazy().group_by("b").agg(pl.col("a")).collect() ... ) shape: (3, 2) ┌─────┬───────────┐ │ b ┆ a │ │ --- ┆ --- │ │ str ┆ list[i64] │ ╞═════╪═══════════╡ │ a ┆ [1] │ │ b ┆ [2] │ │ c ┆ [3, 1] │ └─────┴───────────┘ - Therefore, from the user’s point-of-view, the function is applied per-group: - >>> ( ... df.lazy() ... .group_by("b") ... .agg(pl.col("a").map_elements(lambda x: x.sum())) ... .collect() ... ) shape: (3, 2) ┌─────┬─────┐ │ b ┆ a │ │ --- ┆ --- │ │ str ┆ i64 │ ╞═════╪═════╡ │ a ┆ 1 │ │ b ┆ 2 │ │ c ┆ 4 │ └─────┴─────┘ - Tip: again, it is better to implement this with an expression: - >>> ( ... df.lazy() ... .group_by("b", maintain_order=True) ... .agg(pl.col("a").sum()) ... .collect() ... ) - Window function application using - overwill behave as a GroupBy context, with your function receiving individual window groups:- >>> df = pl.DataFrame( ... { ... "key": ["x", "x", "y", "x", "y", "z"], ... "val": [1, 1, 1, 1, 1, 1], ... } ... ) >>> df.with_columns( ... scaled=pl.col("val").map_elements(lambda s: s * len(s)).over("key"), ... ).sort("key") shape: (6, 3) ┌─────┬─────┬────────┐ │ key ┆ val ┆ scaled │ │ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 │ ╞═════╪═════╪════════╡ │ x ┆ 1 ┆ 3 │ │ x ┆ 1 ┆ 3 │ │ x ┆ 1 ┆ 3 │ │ y ┆ 1 ┆ 2 │ │ y ┆ 1 ┆ 2 │ │ z ┆ 1 ┆ 1 │ └─────┴─────┴────────┘ - Note that this function would also be better-implemented natively: - >>> df.with_columns( ... scaled=(pl.col("val") * pl.col("val").count()).over("key"), ... ).sort( ... "key" ... ) 
 - max() Self[source]
- Get maximum value. - Examples - >>> df = pl.DataFrame({"a": [-1, float("nan"), 1]}) >>> df.select(pl.col("a").max()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ └─────┘ 
 - mean() Self[source]
- Get mean value. - Examples - >>> df = pl.DataFrame({"a": [-1, 0, 1]}) >>> df.select(pl.col("a").mean()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 0.0 │ └─────┘ 
 - median() Self[source]
- Get median value using linear interpolation. - Examples - >>> df = pl.DataFrame({"a": [-1, 0, 1]}) >>> df.select(pl.col("a").median()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 0.0 │ └─────┘ 
 - min() Self[source]
- Get minimum value. - Examples - >>> df = pl.DataFrame({"a": [-1, float("nan"), 1]}) >>> df.select(pl.col("a").min()) shape: (1, 1) ┌──────┐ │ a │ │ --- │ │ f64 │ ╞══════╡ │ -1.0 │ └──────┘ 
 - mod(other: Any) Self[source]
- Method equivalent of modulus operator - expr % other.- Parameters:
- other
- Numeric literal or expression value. 
 
 - Examples - >>> df = pl.DataFrame({"x": [0, 1, 2, 3, 4]}) >>> df.with_columns(pl.col("x").mod(2).alias("x%2")) shape: (5, 2) ┌─────┬─────┐ │ x ┆ x%2 │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═════╡ │ 0 ┆ 0 │ │ 1 ┆ 1 │ │ 2 ┆ 0 │ │ 3 ┆ 1 │ │ 4 ┆ 0 │ └─────┴─────┘ 
 - mode() Self[source]
- Compute the most occurring value(s). - Can return multiple Values. - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 1, 2, 3], ... "b": [1, 1, 2, 2], ... } ... ) >>> df.select(pl.all().mode()) shape: (2, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═════╡ │ 1 ┆ 1 │ │ 1 ┆ 2 │ └─────┴─────┘ 
 - mul(other: Any) Self[source]
- Method equivalent of multiplication operator - expr * other.- Parameters:
- other
- Numeric literal or expression value. 
 
 - Examples - >>> df = pl.DataFrame({"x": [1, 2, 4, 8, 16]}) >>> df.with_columns( ... pl.col("x").mul(2).alias("x*2"), ... pl.col("x").mul(pl.col("x").log(2)).alias("x * xlog2"), ... ) shape: (5, 3) ┌─────┬─────┬───────────┐ │ x ┆ x*2 ┆ x * xlog2 │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ f64 │ ╞═════╪═════╪═══════════╡ │ 1 ┆ 2 ┆ 0.0 │ │ 2 ┆ 4 ┆ 2.0 │ │ 4 ┆ 8 ┆ 8.0 │ │ 8 ┆ 16 ┆ 24.0 │ │ 16 ┆ 32 ┆ 64.0 │ └─────┴─────┴───────────┘ 
 - n_unique() Self[source]
- Count unique values. - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2]}) >>> df.select(pl.col("a").n_unique()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ u32 │ ╞═════╡ │ 2 │ └─────┘ 
 - nan_max() Self[source]
- Get maximum value, but propagate/poison encountered NaN values. - Note that the naming is the reverse of numpy’s: numpy’s default max propagates NaN and nanmax ignores it, whereas polars’ default max() ignores NaN and nan_max propagates it.- Examples - >>> df = pl.DataFrame({"a": [0, float("nan")]}) >>> df.select(pl.col("a").nan_max()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ NaN │ └─────┘ 
 - nan_min() Self[source]
- Get minimum value, but propagate/poison encountered NaN values. - Note that the naming is the reverse of numpy’s: numpy’s default min propagates NaN and nanmin ignores it, whereas polars’ default min() ignores NaN and nan_min propagates it.- Examples - >>> df = pl.DataFrame({"a": [0, float("nan")]}) >>> df.select(pl.col("a").nan_min()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ NaN │ └─────┘ 
 - ne(other: Any) Self[source]
- Method equivalent of inequality operator - expr != other.- Parameters:
- other
- A literal or expression value to compare with. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [1.0, 2.0, float("nan"), 4.0], ... "y": [2.0, 2.0, float("nan"), 4.0], ... } ... ) >>> df.with_columns( ... pl.col("x").ne(pl.col("y")).alias("x != y"), ... ) shape: (4, 3) ┌─────┬─────┬────────┐ │ x ┆ y ┆ x != y │ │ --- ┆ --- ┆ --- │ │ f64 ┆ f64 ┆ bool │ ╞═════╪═════╪════════╡ │ 1.0 ┆ 2.0 ┆ true │ │ 2.0 ┆ 2.0 ┆ false │ │ NaN ┆ NaN ┆ true │ │ 4.0 ┆ 4.0 ┆ false │ └─────┴─────┴────────┘ 
 - ne_missing(other: Any) Self[source]
- Method equivalent of inequality operator - expr != otherwhere- None == None.- This differs from default - newhere null values are propagated.- Parameters:
- other
- A literal or expression value to compare with. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [1.0, 2.0, float("nan"), 4.0, None, None], ... "y": [2.0, 2.0, float("nan"), 4.0, 5.0, None], ... } ... ) >>> df.with_columns( ... pl.col("x").ne(pl.col("y")).alias("x ne y"), ... pl.col("x").ne_missing(pl.col("y")).alias("x ne_missing y"), ... ) shape: (6, 4) ┌──────┬──────┬────────┬────────────────┐ │ x ┆ y ┆ x ne y ┆ x ne_missing y │ │ --- ┆ --- ┆ --- ┆ --- │ │ f64 ┆ f64 ┆ bool ┆ bool │ ╞══════╪══════╪════════╪════════════════╡ │ 1.0 ┆ 2.0 ┆ true ┆ true │ │ 2.0 ┆ 2.0 ┆ false ┆ false │ │ NaN ┆ NaN ┆ true ┆ true │ │ 4.0 ┆ 4.0 ┆ false ┆ false │ │ null ┆ 5.0 ┆ null ┆ true │ │ null ┆ null ┆ null ┆ false │ └──────┴──────┴────────┴────────────────┘ 
 - not_() Self[source]
- Negate a boolean expression. - Examples - >>> df = pl.DataFrame( ... { ... "a": [True, False, False], ... "b": ["a", "b", None], ... } ... ) >>> df shape: (3, 2) ┌───────┬──────┐ │ a ┆ b │ │ --- ┆ --- │ │ bool ┆ str │ ╞═══════╪══════╡ │ true ┆ a │ │ false ┆ b │ │ false ┆ null │ └───────┴──────┘ >>> df.select(pl.col("a").not_()) shape: (3, 1) ┌───────┐ │ a │ │ --- │ │ bool │ ╞═══════╡ │ false │ │ true │ │ true │ └───────┘ 
 - null_count() Self[source]
- Count null values. - Examples - >>> df = pl.DataFrame( ... { ... "a": [None, 1, None], ... "b": [1, 2, 3], ... } ... ) >>> df.select(pl.all().null_count()) shape: (1, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ u32 ┆ u32 │ ╞═════╪═════╡ │ 2 ┆ 0 │ └─────┴─────┘ 
 - or_(*others: Any) Self[source]
- Method equivalent of bitwise “or” operator - expr | other | ....- Parameters:
- *others
- One or more integer or boolean expressions to evaluate/combine. 
 
 - Examples - >>> df = pl.DataFrame( ... data={ ... "x": [5, 6, 7, 4, 8], ... "y": [1.5, 2.5, 1.0, 4.0, -5.75], ... "z": [-9, 2, -1, 4, 8], ... } ... ) >>> df.select( ... (pl.col("x") == pl.col("y")) ... .or_( ... pl.col("x") == pl.col("y"), ... pl.col("y") == pl.col("z"), ... pl.col("y").cast(int) == pl.col("z"), ... ) ... .alias("any") ... ) shape: (5, 1) ┌───────┐ │ any │ │ --- │ │ bool │ ╞═══════╡ │ false │ │ true │ │ false │ │ true │ │ false │ └───────┘ 
 - over(
- expr: IntoExpr | Iterable[IntoExpr],
- *more_exprs: IntoExpr,
- mapping_strategy: WindowMappingStrategy = 'group_to_rows',
- ) Self[source]
- Compute expressions over the given groups. - This expression is similar to performing a group by aggregation and joining the result back into the original DataFrame. - The outcome is similar to how window functions work in PostgreSQL. - Parameters:
- expr
- Column(s) to group by. Accepts expression input. Strings are parsed as column names. 
- *more_exprs
- Additional columns to group by, specified as positional arguments. 
- mapping_strategy: {‘group_to_rows’, ‘join’, ‘explode’}
- group_to_rows
- If the aggregation results in multiple values, assign them back to their position in the DataFrame. This can only be done if the group yields the same elements before aggregation as after. 
 
- join
- Join the groups as ‘List<group_dtype>’ to the row positions. Warning: this can be memory intensive (see the sketch after the examples below). 
 
- explode
- Don’t do any mapping, but simply flatten the group. This only makes sense if the input data is sorted. 
 
 
 
 - Examples - Pass the name of a column to compute the expression over that column. - >>> df = pl.DataFrame( ... { ... "a": ["a", "a", "b", "b", "b"], ... "b": [1, 2, 3, 5, 3], ... "c": [5, 4, 3, 2, 1], ... } ... ) >>> df.with_columns( ... pl.col("c").max().over("a").name.suffix("_max"), ... ) shape: (5, 4) ┌─────┬─────┬─────┬───────┐ │ a ┆ b ┆ c ┆ c_max │ │ --- ┆ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╪═══════╡ │ a ┆ 1 ┆ 5 ┆ 5 │ │ a ┆ 2 ┆ 4 ┆ 5 │ │ b ┆ 3 ┆ 3 ┆ 3 │ │ b ┆ 5 ┆ 2 ┆ 3 │ │ b ┆ 3 ┆ 1 ┆ 3 │ └─────┴─────┴─────┴───────┘ - Expression input is supported. - >>> df.with_columns( ... pl.col("c").max().over(pl.col("b") // 2).name.suffix("_max"), ... ) shape: (5, 4) ┌─────┬─────┬─────┬───────┐ │ a ┆ b ┆ c ┆ c_max │ │ --- ┆ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╪═══════╡ │ a ┆ 1 ┆ 5 ┆ 5 │ │ a ┆ 2 ┆ 4 ┆ 4 │ │ b ┆ 3 ┆ 3 ┆ 4 │ │ b ┆ 5 ┆ 2 ┆ 2 │ │ b ┆ 3 ┆ 1 ┆ 4 │ └─────┴─────┴─────┴───────┘ - Group by multiple columns by passing a list of column names or expressions. - >>> df.with_columns( ... pl.col("c").min().over(["a", "b"]).name.suffix("_min"), ... ) shape: (5, 4) ┌─────┬─────┬─────┬───────┐ │ a ┆ b ┆ c ┆ c_min │ │ --- ┆ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╪═══════╡ │ a ┆ 1 ┆ 5 ┆ 5 │ │ a ┆ 2 ┆ 4 ┆ 4 │ │ b ┆ 3 ┆ 3 ┆ 1 │ │ b ┆ 5 ┆ 2 ┆ 2 │ │ b ┆ 3 ┆ 1 ┆ 1 │ └─────┴─────┴─────┴───────┘ - Or use positional arguments to group by multiple columns in the same way. - >>> df.with_columns( ... pl.col("c").min().over("a", pl.col("b") % 2).name.suffix("_min"), ... ) shape: (5, 4) ┌─────┬─────┬─────┬───────┐ │ a ┆ b ┆ c ┆ c_min │ │ --- ┆ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 ┆ i64 │ ╞═════╪═════╪═════╪═══════╡ │ a ┆ 1 ┆ 5 ┆ 5 │ │ a ┆ 2 ┆ 4 ┆ 4 │ │ b ┆ 3 ┆ 3 ┆ 1 │ │ b ┆ 5 ┆ 2 ┆ 1 │ │ b ┆ 3 ┆ 1 ┆ 1 │ └─────┴─────┴─────┴───────┘ 
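- A sketch of the ‘join’ mapping strategy on the same frame (output shown for illustration; note the memory warning above): >>> df.select( ... "a", ... pl.col("c").over("a", mapping_strategy="join").alias("c_joined"), ... ) shape: (5, 2) ┌─────┬───────────┐ │ a ┆ c_joined │ │ --- ┆ --- │ │ str ┆ list[i64] │ ╞═════╪═══════════╡ │ a ┆ [5, 4] │ │ a ┆ [5, 4] │ │ b ┆ [3, 2, 1] │ │ b ┆ [3, 2, 1] │ │ b ┆ [3, 2, 1] │ └─────┴───────────┘ 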
 - pct_change(n: int | IntoExprColumn = 1) Self[source]
- Computes percentage change between values. - Percentage change (as fraction) between current element and most-recent non-null element at least - nperiod(s) before the current element.- Computes the change from the previous row by default. - Parameters:
- n
- periods to shift for forming percent change. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [10, 11, 12, None, 12], ... } ... ) >>> df.with_columns(pl.col("a").pct_change().alias("pct_change")) shape: (5, 2) ┌──────┬────────────┐ │ a ┆ pct_change │ │ --- ┆ --- │ │ i64 ┆ f64 │ ╞══════╪════════════╡ │ 10 ┆ null │ │ 11 ┆ 0.1 │ │ 12 ┆ 0.090909 │ │ null ┆ 0.0 │ │ 12 ┆ 0.0 │ └──────┴────────────┘ 
 - peak_max() Self[source]
- Get a boolean mask of the local maximum peaks. - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 4, 5]}) >>> df.select(pl.col("a").peak_max()) shape: (5, 1) ┌───────┐ │ a │ │ --- │ │ bool │ ╞═══════╡ │ false │ │ false │ │ false │ │ false │ │ true │ └───────┘ 
 - peak_min() Self[source]
- Get a boolean mask of the local minimum peaks. - Examples - >>> df = pl.DataFrame({"a": [4, 1, 3, 2, 5]}) >>> df.select(pl.col("a").peak_min()) shape: (5, 1) ┌───────┐ │ a │ │ --- │ │ bool │ ╞═══════╡ │ false │ │ true │ │ false │ │ true │ │ false │ └───────┘ 
 - pipe(
- function: Callable[Concatenate[Expr, P], T],
- *args: P.args,
- **kwargs: P.kwargs,
- ) T[source]
- Offers a structured way to apply a sequence of user-defined functions (UDFs). - Parameters:
- function
- Callable; will receive the expression as the first parameter, followed by any given args/kwargs. 
- *args
- Arguments to pass to the UDF. 
- **kwargs
- Keyword arguments to pass to the UDF. 
 
 - Examples - >>> def extract_number(expr: pl.Expr) -> pl.Expr: ... """Extract the digits from a string.""" ... return expr.str.extract(r"\d+", 0).cast(pl.Int64) >>> >>> def scale_negative_even(expr: pl.Expr, *, n: int = 1) -> pl.Expr: ... """Set even numbers negative, and scale by a user-supplied value.""" ... expr = pl.when(expr % 2 == 0).then(-expr).otherwise(expr) ... return expr * n >>> >>> df = pl.DataFrame({"val": ["a: 1", "b: 2", "c: 3", "d: 4"]}) >>> df.with_columns( ... udfs=( ... pl.col("val").pipe(extract_number).pipe(scale_negative_even, n=5) ... ), ... ) shape: (4, 2) ┌──────┬──────┐ │ val ┆ udfs │ │ --- ┆ --- │ │ str ┆ i64 │ ╞══════╪══════╡ │ a: 1 ┆ 5 │ │ b: 2 ┆ -10 │ │ c: 3 ┆ 15 │ │ d: 4 ┆ -20 │ └──────┴──────┘ 
 - pow(exponent: int | float | None | Series | Expr) Self[source]
- Method equivalent of exponentiation operator - expr ** exponent.- Parameters:
- exponent
- Numeric literal or expression exponent value. 
 
 - Examples - >>> df = pl.DataFrame({"x": [1, 2, 4, 8]}) >>> df.with_columns( ... pl.col("x").pow(3).alias("cube"), ... pl.col("x").pow(pl.col("x").log(2)).alias("x ** xlog2"), ... ) shape: (4, 3) ┌─────┬───────┬────────────┐ │ x ┆ cube ┆ x ** xlog2 │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ f64 │ ╞═════╪═══════╪════════════╡ │ 1 ┆ 1.0 ┆ 1.0 │ │ 2 ┆ 8.0 ┆ 2.0 │ │ 4 ┆ 64.0 ┆ 16.0 │ │ 8 ┆ 512.0 ┆ 512.0 │ └─────┴───────┴────────────┘ 
 - prefix(prefix: str) Self[source]
- Add a prefix to the root column name of the expression. - Deprecated since version 0.19.12: This method has been renamed to - name.prefix().- Parameters:
- prefix
- Prefix to add to the root column name. 
 
 - See also - Notes - This will undo any previous renaming operations on the expression. - Due to implementation constraints, this method can only be called as the last expression in a chain. - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3], ... "b": ["x", "y", "z"], ... } ... ) >>> df.with_columns(pl.all().reverse().name.prefix("reverse_")) shape: (3, 4) ┌─────┬─────┬───────────┬───────────┐ │ a ┆ b ┆ reverse_a ┆ reverse_b │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ i64 ┆ str │ ╞═════╪═════╪═══════════╪═══════════╡ │ 1 ┆ x ┆ 3 ┆ z │ │ 2 ┆ y ┆ 2 ┆ y │ │ 3 ┆ z ┆ 1 ┆ x │ └─────┴─────┴───────────┴───────────┘ 
 - product() Self[source]
- Compute the product of an expression. - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").product()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 6 │ └─────┘ 
 - qcut(
- quantiles: Sequence[float] | int,
- *,
- labels: Sequence[str] | None = None,
- left_closed: bool = False,
- allow_duplicates: bool = False,
- include_breaks: bool = False,
- ) Self[source]
- Bin continuous values into discrete categories based on their quantiles. - Parameters:
- quantiles
- Either a list of quantile probabilities between 0 and 1 or a positive integer determining the number of bins with uniform probability. 
- labels
- Names of the categories. The number of labels must be equal to the number of categories. 
- left_closed
- Set the intervals to be left-closed instead of right-closed. 
- allow_duplicates
- If set to - True, duplicates in the resulting quantiles are dropped, rather than raising a- DuplicateError. This can happen even with unique probabilities, depending on the data.
- include_breaks
- Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a - Categoricalto a- Struct.
 
- Returns:
- Expr
- Expression of data type - Categoricalif- include_breaksis set to- False(default), otherwise an expression of data type- Struct.
 
 - See also - Examples - Divide a column into three categories according to pre-defined quantile probabilities. - >>> df = pl.DataFrame({"foo": [-2, -1, 0, 1, 2]}) >>> df.with_columns( ... pl.col("foo").qcut([0.25, 0.75], labels=["a", "b", "c"]).alias("qcut") ... ) shape: (5, 2) ┌─────┬──────┐ │ foo ┆ qcut │ │ --- ┆ --- │ │ i64 ┆ cat │ ╞═════╪══════╡ │ -2 ┆ a │ │ -1 ┆ a │ │ 0 ┆ b │ │ 1 ┆ b │ │ 2 ┆ c │ └─────┴──────┘ - Divide a column into two categories using uniform quantile probabilities. - >>> df.with_columns( ... pl.col("foo") ... .qcut(2, labels=["low", "high"], left_closed=True) ... .alias("qcut") ... ) shape: (5, 2) ┌─────┬──────┐ │ foo ┆ qcut │ │ --- ┆ --- │ │ i64 ┆ cat │ ╞═════╪══════╡ │ -2 ┆ low │ │ -1 ┆ low │ │ 0 ┆ high │ │ 1 ┆ high │ │ 2 ┆ high │ └─────┴──────┘ - Add both the category and the breakpoint. - >>> df.with_columns( ... pl.col("foo").qcut([0.25, 0.75], include_breaks=True).alias("qcut") ... ).unnest("qcut") shape: (5, 3) ┌─────┬──────┬────────────┐ │ foo ┆ brk ┆ foo_bin │ │ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ cat │ ╞═════╪══════╪════════════╡ │ -2 ┆ -1.0 ┆ (-inf, -1] │ │ -1 ┆ -1.0 ┆ (-inf, -1] │ │ 0 ┆ 1.0 ┆ (-1, 1] │ │ 1 ┆ 1.0 ┆ (-1, 1] │ │ 2 ┆ inf ┆ (1, inf] │ └─────┴──────┴────────────┘ 
 - quantile(
- quantile: float | Expr,
- interpolation: RollingInterpolationMethod = 'nearest',
- ) Self[source]
- Get quantile value. - Parameters:
- quantile
- Quantile between 0.0 and 1.0. 
- interpolation{‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’}
- Interpolation method. 
 
 - Examples - >>> df = pl.DataFrame({"a": [0, 1, 2, 3, 4, 5]}) >>> df.select(pl.col("a").quantile(0.3)) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ └─────┘ >>> df.select(pl.col("a").quantile(0.3, interpolation="higher")) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 2.0 │ └─────┘ >>> df.select(pl.col("a").quantile(0.3, interpolation="lower")) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ └─────┘ >>> df.select(pl.col("a").quantile(0.3, interpolation="midpoint")) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.5 │ └─────┘ >>> df.select(pl.col("a").quantile(0.3, interpolation="linear")) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.5 │ └─────┘ 
 - radians() Self[source]
- Convert from degrees to radians. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [-720, -540, -360, -180, 0, 180, 360, 540, 720]}) >>> df.select(pl.col("a").radians()) shape: (9, 1) ┌────────────┐ │ a │ │ --- │ │ f64 │ ╞════════════╡ │ -12.566371 │ │ -9.424778 │ │ -6.283185 │ │ -3.141593 │ │ 0.0 │ │ 3.141593 │ │ 6.283185 │ │ 9.424778 │ │ 12.566371 │ └────────────┘ 
- rank(method: RankMethod = 'average', *, descending: bool = False, seed: int | None = None) Self[source]
- Assign ranks to data, dealing with ties appropriately. - Parameters:
- method{‘average’, ‘min’, ‘max’, ‘dense’, ‘ordinal’, ‘random’}
- The method used to assign ranks to tied elements. The following methods are available (default is ‘average’): - ‘average’ : The average of the ranks that would have been assigned to all the tied values is assigned to each value. 
- ‘min’ : The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.) 
- ‘max’ : The maximum of the ranks that would have been assigned to all the tied values is assigned to each value. 
- ‘dense’ : Like ‘min’, but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements. 
- ‘ordinal’ : All values are given a distinct rank, corresponding to the order that the values occur in the Series. 
- ‘random’ : Like ‘ordinal’, but the rank for ties is not dependent on the order that the values occur in the Series. 
 
- descending
- Rank in descending order. 
- seed
- If - method="random", use this as seed.
 
 - Examples - The ‘average’ method: - >>> df = pl.DataFrame({"a": [3, 6, 1, 1, 6]}) >>> df.select(pl.col("a").rank()) shape: (5, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 3.0 │ │ 4.5 │ │ 1.5 │ │ 1.5 │ │ 4.5 │ └─────┘ - The ‘ordinal’ method: - >>> df = pl.DataFrame({"a": [3, 6, 1, 1, 6]}) >>> df.select(pl.col("a").rank("ordinal")) shape: (5, 1) ┌─────┐ │ a │ │ --- │ │ u32 │ ╞═════╡ │ 3 │ │ 4 │ │ 1 │ │ 2 │ │ 5 │ └─────┘ - Use ‘rank’ with ‘over’ to rank within groups: - >>> df = pl.DataFrame({"a": [1, 1, 2, 2, 2], "b": [6, 7, 5, 14, 11]}) >>> df.with_columns(pl.col("b").rank().over("a").alias("rank")) shape: (5, 3) ┌─────┬─────┬──────┐ │ a ┆ b ┆ rank │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ f64 │ ╞═════╪═════╪══════╡ │ 1 ┆ 6 ┆ 1.0 │ │ 1 ┆ 7 ┆ 2.0 │ │ 2 ┆ 5 ┆ 1.0 │ │ 2 ┆ 14 ┆ 3.0 │ │ 2 ┆ 11 ┆ 2.0 │ └─────┴─────┴──────┘ 
 - rechunk() Self[source]
- Create a single chunk of memory for this Series. - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2]}) - Create a Series with 3 nulls, append column a then rechunk - >>> df.select(pl.repeat(None, 3).append(pl.col("a")).rechunk()) shape: (6, 1) ┌────────┐ │ repeat │ │ --- │ │ i64 │ ╞════════╡ │ null │ │ null │ │ null │ │ 1 │ │ 1 │ │ 2 │ └────────┘ 
 - register_plugin(
- *,
- lib: str,
- symbol: str,
- args: list[IntoExpr] | None = None,
- kwargs: dict[Any, Any] | None = None,
- is_elementwise: bool = False,
- input_wildcard_expansion: bool = False,
- returns_scalar: bool = False,
- cast_to_supertypes: bool = False,
- pass_name_to_apply: bool = False,
- changes_length: bool = False,
- ) Self[source]
- Register a shared library as a plugin. - Warning - This is highly unsafe as this will call the C function loaded by - lib::symbol.- The parameters you give dictate how polars will deal with the function. Make sure they are correct! - Note - This functionality is unstable and may change without it being considered breaking. - Parameters:
- lib
- Library to load. 
- symbol
- Function to load. 
- args
- Arguments (other than self) passed to this function. These arguments have to be of type Expression. 
- kwargs
- Non-expression arguments. They must be JSON serializable. 
- is_elementwise
- If the function only operates on scalars this will trigger fast paths. 
- input_wildcard_expansion
- Expand expressions as input of this function. 
- returns_scalar
- Automatically explode on unit length if it ran as final aggregation. This is the case for aggregations like - sum,- min,- covariance, etc.
- cast_to_supertypes
- Cast the input datatypes to their supertype. 
- pass_name_to_apply
- If set, then the - Seriespassed to the function in the group_by operation will ensure the name is set. This is an extra heap allocation per group.
- changes_length
- Set this if the function changes the length of the output; for example a - uniqueor a - slice.
 
 
 - reinterpret(*, signed: bool = True) Self[source]
- Reinterpret the underlying bits as a signed/unsigned integer. - This operation is only allowed for 64-bit integers. For smaller integer types, you can safely use the cast operation instead. - Parameters:
- signed
- If True, reinterpret as - pl.Int64. Otherwise, reinterpret as- pl.UInt64.
 
 - Examples - >>> s = pl.Series("a", [1, 1, 2], dtype=pl.UInt64) >>> df = pl.DataFrame([s]) >>> df.select( ... [ ... pl.col("a").reinterpret(signed=True).alias("reinterpreted"), ... pl.col("a").alias("original"), ... ] ... ) shape: (3, 2) ┌───────────────┬──────────┐ │ reinterpreted ┆ original │ │ --- ┆ --- │ │ i64 ┆ u64 │ ╞═══════════════╪══════════╡ │ 1 ┆ 1 │ │ 1 ┆ 1 │ │ 2 ┆ 2 │ └───────────────┴──────────┘ 
 - repeat_by(by: Series | Expr | str | int) Self[source]
- Repeat the elements in this Series as specified in the given expression. - The repeated elements are expanded into a - List.- Parameters:
- by
- Numeric column that determines how often the values will be repeated. The column will be coerced to UInt32. Give this dtype to make the coercion a no-op. 
 
- Returns:
- Expr
- Expression of data type - List, where the inner data type is equal to the original data type.
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": ["x", "y", "z"], ... "n": [1, 2, 3], ... } ... ) >>> df.select(pl.col("a").repeat_by("n")) shape: (3, 1) ┌─────────────────┐ │ a │ │ --- │ │ list[str] │ ╞═════════════════╡ │ ["x"] │ │ ["y", "y"] │ │ ["z", "z", "z"] │ └─────────────────┘ 
 - replace(
- mapping: dict[Any, Any],
- *,
- default: Any = _NoDefault.no_default,
- return_dtype: PolarsDataType | None = None,
- ) Self[source]
- Replace values according to the given mapping. - Needs a global string cache for lazily evaluated queries on columns of type - Categorical.- Parameters:
- mapping
- Mapping of values to their replacement. 
- default
- Value to use when the mapping does not contain the lookup value. Defaults to keeping the original value. Accepts expression input. Non-expression inputs are parsed as literals. 
- return_dtype
- Set return dtype to override automatic return dtype determination. 
 
 - See also - Examples - Replace a single value by another value. Values not in the mapping remain unchanged. - >>> df = pl.DataFrame({"a": [1, 2, 2, 3]}) >>> df.with_columns(pl.col("a").replace({2: 100}).alias("replaced")) shape: (4, 2) ┌─────┬──────────┐ │ a ┆ replaced │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪══════════╡ │ 1 ┆ 1 │ │ 2 ┆ 100 │ │ 2 ┆ 100 │ │ 3 ┆ 3 │ └─────┴──────────┘ - Replace multiple values. Specify a default to set values not in the given map to the default value. - >>> df = pl.DataFrame({"country_code": ["FR", "ES", "DE", None]}) >>> country_code_map = { ... "CA": "Canada", ... "DE": "Germany", ... "FR": "France", ... None: "unspecified", ... } >>> df.with_columns( ... pl.col("country_code") ... .replace(country_code_map, default=None) ... .alias("replaced") ... ) shape: (4, 2) ┌──────────────┬─────────────┐ │ country_code ┆ replaced │ │ --- ┆ --- │ │ str ┆ str │ ╞══════════════╪═════════════╡ │ FR ┆ France │ │ ES ┆ null │ │ DE ┆ Germany │ │ null ┆ unspecified │ └──────────────┴─────────────┘ - The return type can be overridden with the - return_dtypeargument.- >>> df = df.with_row_count() >>> df.select( ... "row_nr", ... pl.col("row_nr") ... .replace({1: 10, 2: 20}, default=0, return_dtype=pl.UInt8) ... .alias("replaced"), ... ) shape: (4, 2) ┌────────┬──────────┐ │ row_nr ┆ replaced │ │ --- ┆ --- │ │ u32 ┆ u8 │ ╞════════╪══════════╡ │ 0 ┆ 0 │ │ 1 ┆ 10 │ │ 2 ┆ 20 │ │ 3 ┆ 0 │ └────────┴──────────┘ - To reference other columns as a - defaultvalue, a struct column must be constructed first. The first field must be the column in which values are replaced. The other columns can be used in the default expression.- >>> df.with_columns( ... pl.struct("country_code", "row_nr") ... .replace( ... mapping=country_code_map, ... default=pl.col("row_nr").cast(pl.Utf8), ... ) ... .alias("replaced") ... ) shape: (4, 3) ┌────────┬──────────────┬─────────────┐ │ row_nr ┆ country_code ┆ replaced │ │ --- ┆ --- ┆ --- │ │ u32 ┆ str ┆ str │ ╞════════╪══════════════╪═════════════╡ │ 0 ┆ FR ┆ France │ │ 1 ┆ ES ┆ 1 │ │ 2 ┆ DE ┆ Germany │ │ 3 ┆ null ┆ unspecified │ └────────┴──────────────┴─────────────┘ 
 - reshape(dimensions: tuple[int, ...]) Self[source]
- Reshape this Expr to a flat Series or a Series of Lists. - Parameters:
- dimensions
- Tuple of the dimension sizes. If a -1 is used in any of the dimensions, that dimension is inferred. 
 
- Returns:
- Expr
- If a single dimension is given, results in an expression of the original data type. If multiple dimensions are given, results in an expression of data type - Listwith shape (rows, cols).
 
 - See also - Expr.list.explode
- Explode a list column. 
 - Examples - >>> df = pl.DataFrame({"foo": [1, 2, 3, 4, 5, 6, 7, 8, 9]}) >>> df.select(pl.col("foo").reshape((3, 3))) shape: (3, 1) ┌───────────┐ │ foo │ │ --- │ │ list[i64] │ ╞═══════════╡ │ [1, 2, 3] │ │ [4, 5, 6] │ │ [7, 8, 9] │ └───────────┘ 
 - reverse() Self[source]
- Reverse the selection. - Examples - >>> df = pl.DataFrame( ... { ... "A": [1, 2, 3, 4, 5], ... "fruits": ["banana", "banana", "apple", "apple", "banana"], ... "B": [5, 4, 3, 2, 1], ... "cars": ["beetle", "audi", "beetle", "beetle", "beetle"], ... } ... ) >>> df.select( ... [ ... pl.all(), ... pl.all().reverse().name.suffix("_reverse"), ... ] ... ) shape: (5, 8) ┌─────┬────────┬─────┬────────┬───────────┬────────────────┬───────────┬──────────────┐ │ A ┆ fruits ┆ B ┆ cars ┆ A_reverse ┆ fruits_reverse ┆ B_reverse ┆ cars_reverse │ │ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ i64 ┆ str ┆ i64 ┆ str ┆ i64 ┆ str │ ╞═════╪════════╪═════╪════════╪═══════════╪════════════════╪═══════════╪══════════════╡ │ 1 ┆ banana ┆ 5 ┆ beetle ┆ 5 ┆ banana ┆ 1 ┆ beetle │ │ 2 ┆ banana ┆ 4 ┆ audi ┆ 4 ┆ apple ┆ 2 ┆ beetle │ │ 3 ┆ apple ┆ 3 ┆ beetle ┆ 3 ┆ apple ┆ 3 ┆ beetle │ │ 4 ┆ apple ┆ 2 ┆ beetle ┆ 2 ┆ banana ┆ 4 ┆ audi │ │ 5 ┆ banana ┆ 1 ┆ beetle ┆ 1 ┆ banana ┆ 5 ┆ beetle │ └─────┴────────┴─────┴────────┴───────────┴────────────────┴───────────┴──────────────┘ 
 - rle() Self[source]
- Get the lengths of runs of identical values. - Returns:
- Expr
- Expression of data type - Structwith Fields “lengths” and “values”.
 
 - Examples - >>> df = pl.DataFrame(pl.Series("s", [1, 1, 2, 1, None, 1, 3, 3])) >>> df.select(pl.col("s").rle()).unnest("s") shape: (6, 2) ┌─────────┬────────┐ │ lengths ┆ values │ │ --- ┆ --- │ │ i32 ┆ i64 │ ╞═════════╪════════╡ │ 2 ┆ 1 │ │ 1 ┆ 2 │ │ 1 ┆ 1 │ │ 1 ┆ null │ │ 1 ┆ 1 │ │ 2 ┆ 3 │ └─────────┴────────┘ 
 - rle_id() Self[source]
- Map values to run IDs. - Similar to RLE, but it maps each value to an ID corresponding to the run into which it falls. This is especially useful when you want to define groups by runs of identical values rather than the values themselves. - Examples - >>> df = pl.DataFrame(dict(a=[1, 2, 1, 1, 1], b=["x", "x", None, "y", "y"])) >>> # It works on structs of multiple values too! >>> df.with_columns(a_r=pl.col("a").rle_id(), ab_r=pl.struct("a", "b").rle_id()) shape: (5, 4) ┌─────┬──────┬─────┬──────┐ │ a ┆ b ┆ a_r ┆ ab_r │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ u32 ┆ u32 │ ╞═════╪══════╪═════╪══════╡ │ 1 ┆ x ┆ 0 ┆ 0 │ │ 2 ┆ x ┆ 1 ┆ 1 │ │ 1 ┆ null ┆ 2 ┆ 2 │ │ 1 ┆ y ┆ 2 ┆ 3 │ │ 1 ┆ y ┆ 2 ┆ 3 │ └─────┴──────┴─────┴──────┘ 
 - rolling(
- index_column: str,
- *,
- period: str | timedelta,
- offset: str | timedelta | None = None,
- closed: ClosedInterval = 'right',
- check_sorted: bool = True,
- ) Self[source]
- Create rolling groups based on a time, Int32, or Int64 column. - If you have a time series - <t_0, t_1, ..., t_n>, then by default the windows created will be- (t_0 - period, t_0] 
- (t_1 - period, t_1] 
- … 
- (t_n - period, t_n] 
 - whereas if you pass a non-default - offset, then the windows will be- (t_0 + offset, t_0 + offset + period] 
- (t_1 + offset, t_1 + offset + period] 
- … 
- (t_n + offset, t_n + offset + period] 
 - The - periodand- offsetarguments are created either from a timedelta, or by using the following string language:- 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - Or combine them: “3d12h4m25s” # 3 days, 12 hours, 4 minutes, and 25 seconds - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - In case of a rolling operation on an integer column, the windows are defined by: - “1i” # length 1 
- “10i” # length 10 
 - Parameters:
- index_column
- Column used to group based on the time window. Often of type Date/Datetime. This column must be sorted in ascending order. In case of a rolling group by on indices, dtype needs to be one of {Int32, Int64}. Note that Int32 gets temporarily cast to Int64, so if performance matters use an Int64 column. 
- period
- length of the window - must be non-negative 
- offset
- offset of the window. Default is -period 
- closed{‘right’, ‘left’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive). 
- check_sorted
- When the - byargument is given, polars can not check sortedness by the metadata and has to do a full scan on the index column to verify data is sorted. This is expensive. If you are sure the data within the by groups is sorted, you can set this to- False. Doing so incorrectly will lead to incorrect output
 
 - Examples - >>> dates = [ ... "2020-01-01 13:45:48", ... "2020-01-01 16:42:13", ... "2020-01-01 16:45:09", ... "2020-01-02 18:12:48", ... "2020-01-03 19:45:32", ... "2020-01-08 23:16:43", ... ] >>> df = pl.DataFrame({"dt": dates, "a": [3, 7, 5, 9, 2, 1]}).with_columns( ... pl.col("dt").str.strptime(pl.Datetime).set_sorted() ... ) >>> df.with_columns( ... sum_a=pl.sum("a").rolling(index_column="dt", period="2d"), ... min_a=pl.min("a").rolling(index_column="dt", period="2d"), ... max_a=pl.max("a").rolling(index_column="dt", period="2d"), ... ) shape: (6, 5) ┌─────────────────────┬─────┬───────┬───────┬───────┐ │ dt ┆ a ┆ sum_a ┆ min_a ┆ max_a │ │ --- ┆ --- ┆ --- ┆ --- ┆ --- │ │ datetime[μs] ┆ i64 ┆ i64 ┆ i64 ┆ i64 │ ╞═════════════════════╪═════╪═══════╪═══════╪═══════╡ │ 2020-01-01 13:45:48 ┆ 3 ┆ 3 ┆ 3 ┆ 3 │ │ 2020-01-01 16:42:13 ┆ 7 ┆ 10 ┆ 3 ┆ 7 │ │ 2020-01-01 16:45:09 ┆ 5 ┆ 15 ┆ 3 ┆ 7 │ │ 2020-01-02 18:12:48 ┆ 9 ┆ 24 ┆ 3 ┆ 9 │ │ 2020-01-03 19:45:32 ┆ 2 ┆ 11 ┆ 2 ┆ 9 │ │ 2020-01-08 23:16:43 ┆ 1 ┆ 1 ┆ 1 ┆ 1 │ └─────────────────────┴─────┴───────┴───────┴───────┘ 
 - rolling_apply(
- function: Callable[[Series], Any],
- window_size: int,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- Apply a custom rolling window function. - Deprecated since version 0.19.0: This method has been renamed to - Expr.rolling_map().- Parameters:
- function
- Aggregation function 
- window_size
- The length of the window. 
- weights
- An optional slice with the same length as the window that will be multiplied elementwise with the values in the window. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
 
 
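Because the parameter list is unchanged, migrating from the deprecated name is a simple rename. A minimal sketch, reusing the data from the rolling_map example below:

>>> from numpy import nansum
>>> df = pl.DataFrame({"a": [11.0, 2.0, 9.0, float("nan"), 8.0]})
>>> # deprecated spelling:
>>> # df.select(pl.col("a").rolling_apply(nansum, window_size=3))
>>> # preferred replacement:
>>> result = df.select(pl.col("a").rolling_map(nansum, window_size=3))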
 - rolling_map(
- function: Callable[[Series], Any],
- window_size: int,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- Compute a custom rolling window function. - Warning - Computing custom functions is extremely slow. Use specialized rolling functions such as - Expr.rolling_sum()if at all possible.- Parameters:
- function
- Custom aggregation function. 
- window_size
- Size of the window. The window at a given row will include the row itself and the - window_size - 1elements before it.
- weights
- A list of weights with the same length as the window that will be multiplied elementwise with the values in the window. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window. 
 
 - Examples - >>> from numpy import nansum >>> df = pl.DataFrame({"a": [11.0, 2.0, 9.0, float("nan"), 8.0]}) >>> df.select(pl.col("a").rolling_map(nansum, window_size=3)) shape: (5, 1) ┌──────┐ │ a │ │ --- │ │ f64 │ ╞══════╡ │ null │ │ null │ │ 22.0 │ │ 11.0 │ │ 17.0 │ └──────┘ 
 - rolling_max(
- window_size: int | timedelta | str,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- by: str | None = None,
- closed: ClosedInterval = 'left',
- warn_if_unsorted: bool = True,
- Apply a rolling max (moving max) over the values in this array. - A window of length - window_sizewill traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the- weightvector. The resulting values will be aggregated to their max.- If - byhas not been specified (the default), the window at a given row will include the row itself, and the- window_size - 1elements before it.- If you pass a - bycolumn- <t_0, t_1, ..., t_n>, then- closed="left"means the windows will be:- [t_0 - window_size, t_0) 
- [t_1 - window_size, t_1) 
- … 
- [t_n - window_size, t_n) 
 - With - closed="right", the left endpoint is not included and the right endpoint is included.- Parameters:
- window_size
- The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language: - 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - If a timedelta or the dynamic string language is used, the - byand- closedarguments must also be set.
- weights
- An optional slice with the same length as the window that will be multiplied elementwise with the values in the window. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
- by
- If the - window_sizeis temporal, for instance- "5h"or- "3s", you must set the column that will be used to determine the windows. This column must be of dtype Datetime or Date.
- closed{‘left’, ‘right’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive); only applicable if - byhas been set.
- warn_if_unsorted
- Warn if data is not known to be sorted by - bycolumn (if passed). Experimental.
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - Notes - If you want to compute multiple aggregation statistics over the same dynamic window, consider using - rolling- this method can cache the window size computation.- Examples - >>> df = pl.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}) >>> df.with_columns( ... rolling_max=pl.col("A").rolling_max(window_size=2), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_max │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 2.0 │ │ 3.0 ┆ 3.0 │ │ 4.0 ┆ 4.0 │ │ 5.0 ┆ 5.0 │ │ 6.0 ┆ 6.0 │ └─────┴─────────────┘ - Specify weights to multiply the values in the window with: - >>> df.with_columns( ... rolling_max=pl.col("A").rolling_max( ... window_size=2, weights=[0.25, 0.75] ... ), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_max │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.5 │ │ 3.0 ┆ 2.25 │ │ 4.0 ┆ 3.0 │ │ 5.0 ┆ 3.75 │ │ 6.0 ┆ 4.5 │ └─────┴─────────────┘ - Center the values in the window - >>> df.with_columns( ... rolling_max=pl.col("A").rolling_max(window_size=3, center=True), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_max │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 3.0 │ │ 3.0 ┆ 4.0 │ │ 4.0 ┆ 5.0 │ │ 5.0 ┆ 6.0 │ │ 6.0 ┆ null │ └─────┴─────────────┘ - Create a DataFrame with a datetime column and a row number column - >>> from datetime import timedelta, datetime >>> start = datetime(2001, 1, 1) >>> stop = datetime(2001, 1, 2) >>> df_temporal = pl.DataFrame( ... {"date": pl.datetime_range(start, stop, "1h", eager=True)} ... ).with_row_count() >>> df_temporal shape: (25, 2) ┌────────┬─────────────────────┐ │ row_nr ┆ date │ │ --- ┆ --- │ │ u32 ┆ datetime[μs] │ ╞════════╪═════════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 │ │ 1 ┆ 2001-01-01 01:00:00 │ │ 2 ┆ 2001-01-01 02:00:00 │ │ 3 ┆ 2001-01-01 03:00:00 │ │ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 │ │ 22 ┆ 2001-01-01 22:00:00 │ │ 23 ┆ 2001-01-01 23:00:00 │ │ 24 ┆ 2001-01-02 00:00:00 │ └────────┴─────────────────────┘ - Compute the rolling max with the default left closure of temporal windows - >>> df_temporal.with_columns( ... rolling_row_max=pl.col("row_nr").rolling_max( ... window_size="2h", by="date", closed="left" ... ) ... ) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_max │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ u32 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ null │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 1 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 2 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 20 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 21 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 22 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 23 │ └────────┴─────────────────────┴─────────────────┘ - Compute the rolling max with the closure of windows on both sides - >>> df_temporal.with_columns( ... rolling_row_max=pl.col("row_nr").rolling_max( ... window_size="2h", by="date", closed="both" ... ) ... 
) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_max │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ u32 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ 0 │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 1 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 2 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 3 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 21 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 22 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 23 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 24 │ └────────┴─────────────────────┴─────────────────┘ 
 - rolling_mean(
- window_size: int | timedelta | str,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- by: str | None = None,
- closed: ClosedInterval = 'left',
- warn_if_unsorted: bool = True,
- Apply a rolling mean (moving mean) over the values in this array. - A window of length - window_sizewill traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the- weightvector. The resulting values will be aggregated to their mean.- If - byhas not been specified (the default), the window at a given row will include the row itself, and the- window_size - 1elements before it.- If you pass a - bycolumn- <t_0, t_1, ..., t_n>, then- closed="left"means the windows will be:- [t_0 - window_size, t_0) 
- [t_1 - window_size, t_1) 
- … 
- [t_n - window_size, t_n) 
 - With - closed="right", the left endpoint is not included and the right endpoint is included.- Parameters:
- window_size
- The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language: - 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - If a timedelta or the dynamic string language is used, the - byand- closedarguments must also be set.
- weights
- An optional slice with the same length as the window that will be multiplied elementwise with the values in the window. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
- by
- If the - window_sizeis temporal for instance- "5h"or- "3s", you must set the column that will be used to determine the windows. This column must be of dtype Datetime or Date.- Warning - If passed, the column must be sorted in ascending order. Otherwise, results will not be correct. 
- closed{‘left’, ‘right’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive); only applicable if - byhas been set.
- warn_if_unsorted
- Warn if data is not known to be sorted by - bycolumn (if passed). Experimental.
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - Notes - If you want to compute multiple aggregation statistics over the same dynamic window, consider using - rolling- this method can cache the window size computation.- Examples - >>> df = pl.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}) >>> df.with_columns( ... rolling_mean=pl.col("A").rolling_mean(window_size=2), ... ) shape: (6, 2) ┌─────┬──────────────┐ │ A ┆ rolling_mean │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪══════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.5 │ │ 3.0 ┆ 2.5 │ │ 4.0 ┆ 3.5 │ │ 5.0 ┆ 4.5 │ │ 6.0 ┆ 5.5 │ └─────┴──────────────┘ - Specify weights to multiply the values in the window with: - >>> df.with_columns( ... rolling_mean=pl.col("A").rolling_mean( ... window_size=2, weights=[0.25, 0.75] ... ), ... ) shape: (6, 2) ┌─────┬──────────────┐ │ A ┆ rolling_mean │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪══════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.75 │ │ 3.0 ┆ 2.75 │ │ 4.0 ┆ 3.75 │ │ 5.0 ┆ 4.75 │ │ 6.0 ┆ 5.75 │ └─────┴──────────────┘ - Center the values in the window - >>> df.with_columns( ... rolling_mean=pl.col("A").rolling_mean(window_size=3, center=True), ... ) shape: (6, 2) ┌─────┬──────────────┐ │ A ┆ rolling_mean │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪══════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 2.0 │ │ 3.0 ┆ 3.0 │ │ 4.0 ┆ 4.0 │ │ 5.0 ┆ 5.0 │ │ 6.0 ┆ null │ └─────┴──────────────┘ - Create a DataFrame with a datetime column and a row number column - >>> from datetime import timedelta, datetime >>> start = datetime(2001, 1, 1) >>> stop = datetime(2001, 1, 2) >>> df_temporal = pl.DataFrame( ... {"date": pl.datetime_range(start, stop, "1h", eager=True)} ... ).with_row_count() >>> df_temporal shape: (25, 2) ┌────────┬─────────────────────┐ │ row_nr ┆ date │ │ --- ┆ --- │ │ u32 ┆ datetime[μs] │ ╞════════╪═════════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 │ │ 1 ┆ 2001-01-01 01:00:00 │ │ 2 ┆ 2001-01-01 02:00:00 │ │ 3 ┆ 2001-01-01 03:00:00 │ │ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 │ │ 22 ┆ 2001-01-01 22:00:00 │ │ 23 ┆ 2001-01-01 23:00:00 │ │ 24 ┆ 2001-01-02 00:00:00 │ └────────┴─────────────────────┘ - Compute the rolling mean with the default left closure of temporal windows - >>> df_temporal.with_columns( ... rolling_row_mean=pl.col("row_nr").rolling_mean( ... window_size="2h", by="date", closed="left" ... ) ... ) shape: (25, 3) ┌────────┬─────────────────────┬──────────────────┐ │ row_nr ┆ date ┆ rolling_row_mean │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ f64 │ ╞════════╪═════════════════════╪══════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ null │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0.0 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 0.5 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 1.5 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 19.5 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 20.5 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 21.5 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 22.5 │ └────────┴─────────────────────┴──────────────────┘ - Compute the rolling mean with the closure of windows on both sides - >>> df_temporal.with_columns( ... rolling_row_mean=pl.col("row_nr").rolling_mean( ... window_size="2h", by="date", closed="both" ... ) ... 
) shape: (25, 3) ┌────────┬─────────────────────┬──────────────────┐ │ row_nr ┆ date ┆ rolling_row_mean │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ f64 │ ╞════════╪═════════════════════╪══════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ 0.0 │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0.5 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 1.0 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 2.0 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 20.0 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 21.0 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 22.0 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 23.0 │ └────────┴─────────────────────┴──────────────────┘ 
 - rolling_median(
- window_size: int | timedelta | str,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- by: str | None = None,
- closed: ClosedInterval = 'left',
- warn_if_unsorted: bool = True,
- Compute a rolling median. - If - byhas not been specified (the default), the window at a given row will include the row itself, and the- window_size - 1elements before it.- If you pass a - bycolumn- <t_0, t_1, ..., t_n>, then- closed="left"means the windows will be:- [t_0 - window_size, t_0) 
- [t_1 - window_size, t_1) 
- … 
- [t_n - window_size, t_n) 
 - With - closed="right", the left endpoint is not included and the right endpoint is included.- Parameters:
- window_size
- The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language: - 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - If a timedelta or the dynamic string language is used, the - byand- closedarguments must also be set.
- weights
- An optional slice with the same length as the window that determines the relative contribution of each value in a window to the output. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
- by
- If the - window_sizeis temporal for instance- "5h"or- "3s", you must set the column that will be used to determine the windows. This column must be of dtype Datetime or Date.- Warning - If passed, the column must be sorted in ascending order. Otherwise, results will not be correct. 
- closed{‘left’, ‘right’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive); only applicable if - byhas been set.
- warn_if_unsorted
- Warn if data is not known to be sorted by - bycolumn (if passed). Experimental.
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - Notes - If you want to compute multiple aggregation statistics over the same dynamic window, consider using - rolling- this method can cache the window size computation.- Examples - >>> df = pl.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}) >>> df.with_columns( ... rolling_median=pl.col("A").rolling_median(window_size=2), ... ) shape: (6, 2) ┌─────┬────────────────┐ │ A ┆ rolling_median │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪════════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.5 │ │ 3.0 ┆ 2.5 │ │ 4.0 ┆ 3.5 │ │ 5.0 ┆ 4.5 │ │ 6.0 ┆ 5.5 │ └─────┴────────────────┘ - Specify weights for the values in each window: - >>> df.with_columns( ... rolling_median=pl.col("A").rolling_median( ... window_size=2, weights=[0.25, 0.75] ... ), ... ) shape: (6, 2) ┌─────┬────────────────┐ │ A ┆ rolling_median │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪════════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.5 │ │ 3.0 ┆ 2.5 │ │ 4.0 ┆ 3.5 │ │ 5.0 ┆ 4.5 │ │ 6.0 ┆ 5.5 │ └─────┴────────────────┘ - Center the values in the window - >>> df.with_columns( ... rolling_median=pl.col("A").rolling_median(window_size=3, center=True), ... ) shape: (6, 2) ┌─────┬────────────────┐ │ A ┆ rolling_median │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪════════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 2.0 │ │ 3.0 ┆ 3.0 │ │ 4.0 ┆ 4.0 │ │ 5.0 ┆ 5.0 │ │ 6.0 ┆ null │ └─────┴────────────────┘ 
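The temporal form behaves like the other rolling aggregations. A minimal sketch (output omitted), assuming a sorted Datetime column as in the rolling_mean examples:

>>> from datetime import datetime
>>> df_temporal = pl.DataFrame(
...     {
...         "date": pl.datetime_range(
...             datetime(2001, 1, 1), datetime(2001, 1, 2), "1h", eager=True
...         )
...     }
... ).with_row_count()
>>> out = df_temporal.with_columns(
...     rolling_row_median=pl.col("row_nr").rolling_median(
...         window_size="2h", by="date", closed="left"
...     )
... )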
 - rolling_min(
- window_size: int | timedelta | str,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- by: str | None = None,
- closed: ClosedInterval = 'left',
- warn_if_unsorted: bool = True,
- Apply a rolling min (moving min) over the values in this array. - A window of length - window_sizewill traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the- weightvector. The resulting values will be aggregated to their min.- If - byhas not been specified (the default), the window at a given row will include the row itself, and the- window_size - 1elements before it.- If you pass a - bycolumn- <t_0, t_1, ..., t_n>, then- closed="left"means the windows will be:- [t_0 - window_size, t_0) 
- [t_1 - window_size, t_1) 
- … 
- [t_n - window_size, t_n) 
 - With - closed="right", the left endpoint is not included and the right endpoint is included.- Parameters:
- window_size
- The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language: - 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - If a timedelta or the dynamic string language is used, the - byand- closedarguments must also be set.
- weights
- An optional slice with the same length as the window that will be multiplied elementwise with the values in the window. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
- by
- If the - window_sizeis temporal for instance- "5h"or- "3s", you must set the column that will be used to determine the windows. This column must be of dtype Datetime or Date.- Warning - If passed, the column must be sorted in ascending order. Otherwise, results will not be correct. 
- closed{‘left’, ‘right’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive); only applicable if - byhas been set.
- warn_if_unsorted
- Warn if data is not known to be sorted by - bycolumn (if passed). Experimental.
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - Notes - If you want to compute multiple aggregation statistics over the same dynamic window, consider using - rolling- this method can cache the window size computation.- Examples - >>> df = pl.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}) >>> df.with_columns( ... rolling_min=pl.col("A").rolling_min(window_size=2), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_min │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.0 │ │ 3.0 ┆ 2.0 │ │ 4.0 ┆ 3.0 │ │ 5.0 ┆ 4.0 │ │ 6.0 ┆ 5.0 │ └─────┴─────────────┘ - Specify weights to multiply the values in the window with: - >>> df.with_columns( ... rolling_min=pl.col("A").rolling_min( ... window_size=2, weights=[0.25, 0.75] ... ), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_min │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 0.25 │ │ 3.0 ┆ 0.5 │ │ 4.0 ┆ 0.75 │ │ 5.0 ┆ 1.0 │ │ 6.0 ┆ 1.25 │ └─────┴─────────────┘ - Center the values in the window - >>> df.with_columns( ... rolling_min=pl.col("A").rolling_min(window_size=3, center=True), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_min │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.0 │ │ 3.0 ┆ 2.0 │ │ 4.0 ┆ 3.0 │ │ 5.0 ┆ 4.0 │ │ 6.0 ┆ null │ └─────┴─────────────┘ - Create a DataFrame with a datetime column and a row number column - >>> from datetime import timedelta, datetime >>> start = datetime(2001, 1, 1) >>> stop = datetime(2001, 1, 2) >>> df_temporal = pl.DataFrame( ... {"date": pl.datetime_range(start, stop, "1h", eager=True)} ... ).with_row_count() >>> df_temporal shape: (25, 2) ┌────────┬─────────────────────┐ │ row_nr ┆ date │ │ --- ┆ --- │ │ u32 ┆ datetime[μs] │ ╞════════╪═════════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 │ │ 1 ┆ 2001-01-01 01:00:00 │ │ 2 ┆ 2001-01-01 02:00:00 │ │ 3 ┆ 2001-01-01 03:00:00 │ │ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 │ │ 22 ┆ 2001-01-01 22:00:00 │ │ 23 ┆ 2001-01-01 23:00:00 │ │ 24 ┆ 2001-01-02 00:00:00 │ └────────┴─────────────────────┘ >>> df_temporal.with_columns( ... rolling_row_min=pl.col("row_nr").rolling_min( ... window_size="2h", by="date", closed="left" ... ) ... ) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_min │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ u32 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ null │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 0 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 1 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 19 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 20 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 21 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 22 │ └────────┴─────────────────────┴─────────────────┘ 
 - rolling_quantile(
- quantile: float,
- interpolation: RollingInterpolationMethod = 'nearest',
- window_size: int | timedelta | str = 2,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- by: str | None = None,
- closed: ClosedInterval = 'left',
- warn_if_unsorted: bool = True,
- Compute a rolling quantile. - If - byhas not been specified (the default), the window at a given row will include the row itself, and the- window_size - 1elements before it.- If you pass a - bycolumn- <t_0, t_1, ..., t_n>, then- closed="left"means the windows will be:- [t_0 - window_size, t_0) 
- [t_1 - window_size, t_1) 
- … 
- [t_n - window_size, t_n) 
 - With - closed="right", the left endpoint is not included and the right endpoint is included.- Parameters:
- quantile
- Quantile between 0.0 and 1.0. 
- interpolation{‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’}
- Interpolation method. 
- window_size
- The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language: - 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - If a timedelta or the dynamic string language is used, the - byand- closedarguments must also be set.
- weights
- An optional slice with the same length as the window that determines the relative contribution of each value in a window to the output. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
- by
- If the - window_sizeis temporal for instance- "5h"or- "3s", you must set the column that will be used to determine the windows. This column must be of dtype Datetime or Date.- Warning - If passed, the column must be sorted in ascending order. Otherwise, results will not be correct. 
- closed{‘left’, ‘right’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive); only applicable if - byhas been set.
- warn_if_unsorted
- Warn if data is not known to be sorted by - bycolumn (if passed). Experimental.
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - Notes - If you want to compute multiple aggregation statistics over the same dynamic window, consider using - rolling- this method can cache the window size computation.- Examples - >>> df = pl.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}) >>> df.with_columns( ... rolling_quantile=pl.col("A").rolling_quantile( ... quantile=0.25, window_size=4 ... ), ... ) shape: (6, 2) ┌─────┬──────────────────┐ │ A ┆ rolling_quantile │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪══════════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ null │ │ 3.0 ┆ null │ │ 4.0 ┆ 2.0 │ │ 5.0 ┆ 3.0 │ │ 6.0 ┆ 4.0 │ └─────┴──────────────────┘ - Specify weights for the values in each window: - >>> df.with_columns( ... rolling_quantile=pl.col("A").rolling_quantile( ... quantile=0.25, window_size=4, weights=[0.2, 0.4, 0.4, 0.2] ... ), ... ) shape: (6, 2) ┌─────┬──────────────────┐ │ A ┆ rolling_quantile │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪══════════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ null │ │ 3.0 ┆ null │ │ 4.0 ┆ 2.0 │ │ 5.0 ┆ 3.0 │ │ 6.0 ┆ 4.0 │ └─────┴──────────────────┘ - Specify weights and interpolation method - >>> df.with_columns( ... rolling_quantile=pl.col("A").rolling_quantile( ... quantile=0.25, ... window_size=4, ... weights=[0.2, 0.4, 0.4, 0.2], ... interpolation="linear", ... ), ... ) shape: (6, 2) ┌─────┬──────────────────┐ │ A ┆ rolling_quantile │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪══════════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ null │ │ 3.0 ┆ null │ │ 4.0 ┆ 1.625 │ │ 5.0 ┆ 2.625 │ │ 6.0 ┆ 3.625 │ └─────┴──────────────────┘ - Center the values in the window - >>> df.with_columns( ... rolling_quantile=pl.col("A").rolling_quantile( ... quantile=0.2, window_size=5, center=True ... ), ... ) shape: (6, 2) ┌─────┬──────────────────┐ │ A ┆ rolling_quantile │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪══════════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ null │ │ 3.0 ┆ 2.0 │ │ 4.0 ┆ 3.0 │ │ 5.0 ┆ null │ │ 6.0 ┆ null │ └─────┴──────────────────┘ 
 - rolling_skew(window_size: int, *, bias: bool = True) Self[source]
- Compute a rolling skew. - The window at a given row includes the row itself and the - window_size - 1elements before it.- Parameters:
- window_size
- Integer size of the rolling window. 
- bias
- If False, the calculations are corrected for statistical bias. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 4, 2, 9]}) >>> df.select(pl.col("a").rolling_skew(3)) shape: (4, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ null │ │ null │ │ 0.381802 │ │ 0.47033 │ └──────────┘ - Note how the values match the following: - >>> pl.Series([1, 4, 2]).skew(), pl.Series([4, 2, 9]).skew() (0.38180177416060584, 0.47033046033698594) 
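To use the bias-corrected estimator instead, set bias=False; a minimal sketch on the same frame (output omitted):

>>> corrected = df.select(pl.col("a").rolling_skew(3, bias=False))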
 - rolling_std(
- window_size: int | timedelta | str,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- by: str | None = None,
- closed: ClosedInterval = 'left',
- ddof: int = 1,
- warn_if_unsorted: bool = True,
- Compute a rolling standard deviation. - If - byhas not been specified (the default), the window at a given row will include the row itself, and the- window_size - 1elements before it.- If you pass a - bycolumn- <t_0, t_1, ..., t_n>, then- closed="left"means the windows will be:- [t_0 - window_size, t_0) 
- [t_1 - window_size, t_1) 
- … 
- [t_n - window_size, t_n) 
 - With - closed="right", the left endpoint is not included and the right endpoint is included.- Parameters:
- window_size
- The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language: - 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - If a timedelta or the dynamic string language is used, the - byand- closedarguments must also be set.
- weights
- An optional slice with the same length as the window that determines the relative contribution of each value in a window to the output. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
- by
- If the - window_sizeis temporal for instance- "5h"or- "3s", you must set the column that will be used to determine the windows. This column must be of dtype Datetime or Date.- Warning - If passed, the column must be sorted in ascending order. Otherwise, results will not be correct. 
- closed{‘left’, ‘right’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive); only applicable if - byhas been set.
- ddof
- “Delta Degrees of Freedom”: The divisor for a length N window is N - ddof 
- warn_if_unsorted
- Warn if data is not known to be sorted by - bycolumn (if passed). Experimental.
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - Notes - If you want to compute multiple aggregation statistics over the same dynamic window, consider using - rolling- this method can cache the window size computation.- Examples - >>> df = pl.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}) >>> df.with_columns( ... rolling_std=pl.col("A").rolling_std(window_size=2), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_std │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 0.707107 │ │ 3.0 ┆ 0.707107 │ │ 4.0 ┆ 0.707107 │ │ 5.0 ┆ 0.707107 │ │ 6.0 ┆ 0.707107 │ └─────┴─────────────┘ - Specify weights to multiply the values in the window with: - >>> df.with_columns( ... rolling_std=pl.col("A").rolling_std( ... window_size=2, weights=[0.25, 0.75] ... ), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_std │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 0.433013 │ │ 3.0 ┆ 0.433013 │ │ 4.0 ┆ 0.433013 │ │ 5.0 ┆ 0.433013 │ │ 6.0 ┆ 0.433013 │ └─────┴─────────────┘ - Center the values in the window - >>> df.with_columns( ... rolling_std=pl.col("A").rolling_std(window_size=3, center=True), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_std │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.0 │ │ 3.0 ┆ 1.0 │ │ 4.0 ┆ 1.0 │ │ 5.0 ┆ 1.0 │ │ 6.0 ┆ null │ └─────┴─────────────┘ - Create a DataFrame with a datetime column and a row number column - >>> from datetime import timedelta, datetime >>> start = datetime(2001, 1, 1) >>> stop = datetime(2001, 1, 2) >>> df_temporal = pl.DataFrame( ... {"date": pl.datetime_range(start, stop, "1h", eager=True)} ... ).with_row_count() >>> df_temporal shape: (25, 2) ┌────────┬─────────────────────┐ │ row_nr ┆ date │ │ --- ┆ --- │ │ u32 ┆ datetime[μs] │ ╞════════╪═════════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 │ │ 1 ┆ 2001-01-01 01:00:00 │ │ 2 ┆ 2001-01-01 02:00:00 │ │ 3 ┆ 2001-01-01 03:00:00 │ │ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 │ │ 22 ┆ 2001-01-01 22:00:00 │ │ 23 ┆ 2001-01-01 23:00:00 │ │ 24 ┆ 2001-01-02 00:00:00 │ └────────┴─────────────────────┘ - Compute the rolling std with the default left closure of temporal windows - >>> df_temporal.with_columns( ... rolling_row_std=pl.col("row_nr").rolling_std( ... window_size="2h", by="date", closed="left" ... ) ... ) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_std │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ f64 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ null │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0.0 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 0.707107 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 0.707107 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 0.707107 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 0.707107 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 0.707107 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 0.707107 │ └────────┴─────────────────────┴─────────────────┘ - Compute the rolling std with the closure of windows on both sides - >>> df_temporal.with_columns( ... rolling_row_std=pl.col("row_nr").rolling_std( ... window_size="2h", by="date", closed="both" ... ) ... 
) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_std │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ f64 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ 0.0 │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0.707107 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 1.0 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 1.0 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 1.0 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 1.0 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 1.0 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 1.0 │ └────────┴─────────────────────┴─────────────────┘ 
 - rolling_sum(
- window_size: int | timedelta | str,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- by: str | None = None,
- closed: ClosedInterval = 'left',
- warn_if_unsorted: bool = True,
- Apply a rolling sum (moving sum) over the values in this array. - A window of length - window_sizewill traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the- weightvector. The resulting values will be aggregated to their sum.- If - byhas not been specified (the default), the window at a given row will include the row itself, and the- window_size - 1elements before it.- If you pass a - bycolumn- <t_0, t_1, ..., t_n>, then- closed="left"means the windows will be:- [t_0 - window_size, t_0) 
- [t_1 - window_size, t_1) 
- … 
- [t_n - window_size, t_n) 
 - With - closed="right", the left endpoint is not included and the right endpoint is included.- Parameters:
- window_size
- The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language: - 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - If a timedelta or the dynamic string language is used, the - byand- closedarguments must also be set.
- weights
- An optional slice with the same length as the window that will be multiplied elementwise with the values in the window. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
- by
- If the window_size is temporal, for instance "5h" or "3s", you must set the column that will be used to determine the windows. This column must be of dtype Datetime or Date.
- closed{‘left’, ‘right’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive); only applicable if - byhas been set.
- warn_if_unsorted
- Warn if data is not known to be sorted by - bycolumn (if passed). Experimental.
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - Notes - If you want to compute multiple aggregation statistics over the same dynamic window, consider using - rolling- this method can cache the window size computation.- Examples - >>> df = pl.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}) >>> df.with_columns( ... rolling_sum=pl.col("A").rolling_sum(window_size=2), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_sum │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 3.0 │ │ 3.0 ┆ 5.0 │ │ 4.0 ┆ 7.0 │ │ 5.0 ┆ 9.0 │ │ 6.0 ┆ 11.0 │ └─────┴─────────────┘ - Specify weights to multiply the values in the window with: - >>> df.with_columns( ... rolling_sum=pl.col("A").rolling_sum( ... window_size=2, weights=[0.25, 0.75] ... ), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_sum │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.75 │ │ 3.0 ┆ 2.75 │ │ 4.0 ┆ 3.75 │ │ 5.0 ┆ 4.75 │ │ 6.0 ┆ 5.75 │ └─────┴─────────────┘ - Center the values in the window - >>> df.with_columns( ... rolling_sum=pl.col("A").rolling_sum(window_size=3, center=True), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_sum │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 6.0 │ │ 3.0 ┆ 9.0 │ │ 4.0 ┆ 12.0 │ │ 5.0 ┆ 15.0 │ │ 6.0 ┆ null │ └─────┴─────────────┘ - Create a DataFrame with a datetime column and a row number column - >>> from datetime import timedelta, datetime >>> start = datetime(2001, 1, 1) >>> stop = datetime(2001, 1, 2) >>> df_temporal = pl.DataFrame( ... {"date": pl.datetime_range(start, stop, "1h", eager=True)} ... ).with_row_count() >>> df_temporal shape: (25, 2) ┌────────┬─────────────────────┐ │ row_nr ┆ date │ │ --- ┆ --- │ │ u32 ┆ datetime[μs] │ ╞════════╪═════════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 │ │ 1 ┆ 2001-01-01 01:00:00 │ │ 2 ┆ 2001-01-01 02:00:00 │ │ 3 ┆ 2001-01-01 03:00:00 │ │ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 │ │ 22 ┆ 2001-01-01 22:00:00 │ │ 23 ┆ 2001-01-01 23:00:00 │ │ 24 ┆ 2001-01-02 00:00:00 │ └────────┴─────────────────────┘ - Compute the rolling sum with the default left closure of temporal windows - >>> df_temporal.with_columns( ... rolling_row_sum=pl.col("row_nr").rolling_sum( ... window_size="2h", by="date", closed="left" ... ) ... ) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_sum │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ u32 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ null │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 1 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 3 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 39 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 41 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 43 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 45 │ └────────┴─────────────────────┴─────────────────┘ - Compute the rolling sum with the closure of windows on both sides - >>> df_temporal.with_columns( ... rolling_row_sum=pl.col("row_nr").rolling_sum( ... window_size="2h", by="date", closed="both" ... ) ... 
) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_sum │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ u32 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ 0 │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 1 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 3 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 6 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 60 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 63 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 66 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 69 │ └────────┴─────────────────────┴─────────────────┘ 
 - rolling_var(
- window_size: int | timedelta | str,
- weights: list[float] | None = None,
- min_periods: int | None = None,
- *,
- center: bool = False,
- by: str | None = None,
- closed: ClosedInterval = 'left',
- ddof: int = 1,
- warn_if_unsorted: bool = True,
- Compute a rolling variance. - If - byhas not been specified (the default), the window at a given row will include the row itself, and the- window_size - 1elements before it.- If you pass a - bycolumn- <t_0, t_1, ..., t_n>, then- closed="left"means the windows will be:- [t_0 - window_size, t_0) 
- [t_1 - window_size, t_1) 
- … 
- [t_n - window_size, t_n) 
 - With - closed="right", the left endpoint is not included and the right endpoint is included.- Parameters:
- window_size
- The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language: - 1ns (1 nanosecond) 
- 1us (1 microsecond) 
- 1ms (1 millisecond) 
- 1s (1 second) 
- 1m (1 minute) 
- 1h (1 hour) 
- 1d (1 calendar day) 
- 1w (1 calendar week) 
- 1mo (1 calendar month) 
- 1q (1 calendar quarter) 
- 1y (1 calendar year) 
- 1i (1 index count) 
 - By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”. - If a timedelta or the dynamic string language is used, the - byand- closedarguments must also be set.
- weights
- An optional slice with the same length as the window that determines the relative contribution of each value in a window to the output. 
- min_periods
- The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size. 
- center
- Set the labels at the center of the window 
- by
- If the - window_sizeis temporal for instance- "5h"or- "3s", you must set the column that will be used to determine the windows. This column must be of dtype Datetime or Date.- Warning - If passed, the column must be sorted in ascending order. Otherwise, results will not be correct. 
- closed{‘left’, ‘right’, ‘both’, ‘none’}
- Define which sides of the temporal interval are closed (inclusive); only applicable if - byhas been set.
- ddof
- “Delta Degrees of Freedom”: The divisor for a length N window is N - ddof 
- warn_if_unsorted
- Warn if data is not known to be sorted by - bycolumn (if passed). Experimental.
 
 - Warning - This functionality is experimental and may change without it being considered a breaking change. - Notes - If you want to compute multiple aggregation statistics over the same dynamic window, consider using - rolling- this method can cache the window size computation.- Examples - >>> df = pl.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}) >>> df.with_columns( ... rolling_var=pl.col("A").rolling_var(window_size=2), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_var │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 0.5 │ │ 3.0 ┆ 0.5 │ │ 4.0 ┆ 0.5 │ │ 5.0 ┆ 0.5 │ │ 6.0 ┆ 0.5 │ └─────┴─────────────┘ - Specify weights to multiply the values in the window with: - >>> df.with_columns( ... rolling_var=pl.col("A").rolling_var( ... window_size=2, weights=[0.25, 0.75] ... ), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_var │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 0.1875 │ │ 3.0 ┆ 0.1875 │ │ 4.0 ┆ 0.1875 │ │ 5.0 ┆ 0.1875 │ │ 6.0 ┆ 0.1875 │ └─────┴─────────────┘ - Center the values in the window - >>> df.with_columns( ... rolling_var=pl.col("A").rolling_var(window_size=3, center=True), ... ) shape: (6, 2) ┌─────┬─────────────┐ │ A ┆ rolling_var │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════╪═════════════╡ │ 1.0 ┆ null │ │ 2.0 ┆ 1.0 │ │ 3.0 ┆ 1.0 │ │ 4.0 ┆ 1.0 │ │ 5.0 ┆ 1.0 │ │ 6.0 ┆ null │ └─────┴─────────────┘ - Create a DataFrame with a datetime column and a row number column - >>> from datetime import timedelta, datetime >>> start = datetime(2001, 1, 1) >>> stop = datetime(2001, 1, 2) >>> df_temporal = pl.DataFrame( ... {"date": pl.datetime_range(start, stop, "1h", eager=True)} ... ).with_row_count() >>> df_temporal shape: (25, 2) ┌────────┬─────────────────────┐ │ row_nr ┆ date │ │ --- ┆ --- │ │ u32 ┆ datetime[μs] │ ╞════════╪═════════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 │ │ 1 ┆ 2001-01-01 01:00:00 │ │ 2 ┆ 2001-01-01 02:00:00 │ │ 3 ┆ 2001-01-01 03:00:00 │ │ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 │ │ 22 ┆ 2001-01-01 22:00:00 │ │ 23 ┆ 2001-01-01 23:00:00 │ │ 24 ┆ 2001-01-02 00:00:00 │ └────────┴─────────────────────┘ - Compute the rolling var with the default left closure of temporal windows - >>> df_temporal.with_columns( ... rolling_row_var=pl.col("row_nr").rolling_var( ... window_size="2h", by="date", closed="left" ... ) ... ) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_var │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ f64 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ null │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0.0 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 0.5 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 0.5 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 0.5 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 0.5 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 0.5 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 0.5 │ └────────┴─────────────────────┴─────────────────┘ - Compute the rolling var with the closure of windows on both sides - >>> df_temporal.with_columns( ... rolling_row_var=pl.col("row_nr").rolling_var( ... window_size="2h", by="date", closed="both" ... ) ... 
) shape: (25, 3) ┌────────┬─────────────────────┬─────────────────┐ │ row_nr ┆ date ┆ rolling_row_var │ │ --- ┆ --- ┆ --- │ │ u32 ┆ datetime[μs] ┆ f64 │ ╞════════╪═════════════════════╪═════════════════╡ │ 0 ┆ 2001-01-01 00:00:00 ┆ 0.0 │ │ 1 ┆ 2001-01-01 01:00:00 ┆ 0.5 │ │ 2 ┆ 2001-01-01 02:00:00 ┆ 1.0 │ │ 3 ┆ 2001-01-01 03:00:00 ┆ 1.0 │ │ … ┆ … ┆ … │ │ 21 ┆ 2001-01-01 21:00:00 ┆ 1.0 │ │ 22 ┆ 2001-01-01 22:00:00 ┆ 1.0 │ │ 23 ┆ 2001-01-01 23:00:00 ┆ 1.0 │ │ 24 ┆ 2001-01-02 00:00:00 ┆ 1.0 │ └────────┴─────────────────────┴─────────────────┘ 
 - round(decimals: int = 0) Self[source]
- Round underlying floating point data by - decimalsdigits.- Parameters:
- decimals
- Number of decimals to round by. 
 
 - Examples - >>> df = pl.DataFrame({"a": [0.33, 0.52, 1.02, 1.17]}) >>> df.select(pl.col("a").round(1)) shape: (4, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 0.3 │ │ 0.5 │ │ 1.0 │ │ 1.2 │ └─────┘ 
 - round_sig_figs(digits: int) Self[source]
- Round to a number of significant figures. - Parameters:
- digits
- Number of significant figures to round to. 
 
 - Examples - >>> df = pl.DataFrame({"a": [0.01234, 3.333, 1234.0]}) >>> df.with_columns(pl.col("a").round_sig_figs(2).alias("round_sig_figs")) shape: (3, 2) ┌─────────┬────────────────┐ │ a ┆ round_sig_figs │ │ --- ┆ --- │ │ f64 ┆ f64 │ ╞═════════╪════════════════╡ │ 0.01234 ┆ 0.012 │ │ 3.333 ┆ 3.3 │ │ 1234.0 ┆ 1200.0 │ └─────────┴────────────────┘ 
 - sample(
- n: int | IntoExprColumn | None = None,
- *,
- fraction: float | IntoExprColumn | None = None,
- with_replacement: bool = False,
- shuffle: bool = False,
- seed: int | None = None,
- Sample from this expression. - Parameters:
- n
- Number of items to return. Cannot be used with - fraction. Defaults to 1 if- fractionis None.
- fraction
- Fraction of items to return. Cannot be used with - n.
- with_replacement
- Allow values to be sampled more than once. 
- shuffle
- Shuffle the order of sampled data points. 
- seed
- Seed for the random number generator. If set to None (default), a random seed is generated for each sample operation. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").sample(fraction=1.0, with_replacement=True, seed=1)) shape: (3, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 3 │ │ 1 │ │ 1 │ └─────┘ 
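To draw a fixed number of rows rather than a fraction, pass n instead (the two are mutually exclusive). A minimal sketch using the same frame; the rows returned depend on the seed:

>>> sampled = df.select(pl.col("a").sample(n=2, seed=0))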
 - search_sorted(element: IntoExpr, side: SearchSortedSide = 'any') Self[source]
- Find indices where elements should be inserted to maintain order. \[a[i-1] < v <= a[i]\]- Parameters:
- element
- Expression or scalar value. 
- side{‘any’, ‘left’, ‘right’}
- If ‘any’, the index of the first suitable location found is given. If ‘left’, the index of the leftmost suitable location found is given. If ‘right’, the index of the rightmost suitable location found is given. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "values": [1, 2, 3, 5], ... } ... ) >>> df.select( ... [ ... pl.col("values").search_sorted(0).alias("zero"), ... pl.col("values").search_sorted(3).alias("three"), ... pl.col("values").search_sorted(6).alias("six"), ... ] ... ) shape: (1, 3) ┌──────┬───────┬─────┐ │ zero ┆ three ┆ six │ │ --- ┆ --- ┆ --- │ │ u32 ┆ u32 ┆ u32 │ ╞══════╪═══════╪═════╡ │ 0 ┆ 2 ┆ 4 │ └──────┴───────┴─────┘ 
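When the searched value occurs more than once, side selects which end of the run of duplicates is reported: ‘left’ yields the position of the first occurrence and ‘right’ the position just past the last one. A minimal sketch (output omitted):

>>> df = pl.DataFrame({"values": [1, 2, 2, 3]})
>>> insertion_points = df.select(
...     pl.col("values").search_sorted(2, side="left").alias("left"),
...     pl.col("values").search_sorted(2, side="right").alias("right"),
... )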
 - set_sorted(*, descending: bool = False) Self[source]
- Flags the expression as ‘sorted’. - Enables downstream code to use fast paths for sorted arrays. - Parameters:
- descending
- Whether the - Seriesorder is descending.
 
 - Warning - This can lead to incorrect results if this - Seriesis not sorted!! Use with care!- Examples - >>> df = pl.DataFrame({"values": [1, 2, 3]}) >>> df.select(pl.col("values").set_sorted().max()) shape: (1, 1) ┌────────┐ │ values │ │ --- │ │ i64 │ ╞════════╡ │ 3 │ └────────┘ 
 - shift(n: int | IntoExprColumn = 1, *, fill_value: IntoExpr | None = None) Self[source]
- Shift values by the given number of indices. - Parameters:
- n
- Number of indices to shift forward. If a negative value is passed, values are shifted in the opposite direction instead. 
- fill_value
- Fill the resulting null values with this value. 
 
 - Notes - This method is similar to the - LAGoperation in SQL when the value for- nis positive. With a negative value for- n, it is similar to- LEAD.- Examples - By default, values are shifted forward by one index. - >>> df = pl.DataFrame({"a": [1, 2, 3, 4]}) >>> df.with_columns(shift=pl.col("a").shift()) shape: (4, 2) ┌─────┬───────┐ │ a ┆ shift │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═══════╡ │ 1 ┆ null │ │ 2 ┆ 1 │ │ 3 ┆ 2 │ │ 4 ┆ 3 │ └─────┴───────┘ - Pass a negative value to shift in the opposite direction instead. - >>> df.with_columns(shift=pl.col("a").shift(-2)) shape: (4, 2) ┌─────┬───────┐ │ a ┆ shift │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═══════╡ │ 1 ┆ 3 │ │ 2 ┆ 4 │ │ 3 ┆ null │ │ 4 ┆ null │ └─────┴───────┘ - Specify - fill_valueto fill the resulting null values.- >>> df.with_columns(shift=pl.col("a").shift(-2, fill_value=100)) shape: (4, 2) ┌─────┬───────┐ │ a ┆ shift │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═══════╡ │ 1 ┆ 3 │ │ 2 ┆ 4 │ │ 3 ┆ 100 │ │ 4 ┆ 100 │ └─────┴───────┘ 
 - shift_and_fill(fill_value: IntoExpr, *, n: int = 1) Self[source]
- Shift values by the given number of places and fill the resulting null values. - Deprecated since version 0.19.12: Use - shift()instead.- Parameters:
- fill_value
- Fill None values with the result of this expression. 
- n
- Number of places to shift (may be negative). 
 
 
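The same result is obtained from shift() by passing the fill as the fill_value keyword; a minimal migration sketch:

>>> df = pl.DataFrame({"a": [1, 2, 3, 4]})
>>> # deprecated spelling:
>>> # df.with_columns(shift=pl.col("a").shift_and_fill(100, n=-2))
>>> # preferred replacement:
>>> out = df.with_columns(shift=pl.col("a").shift(-2, fill_value=100))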
 - shrink_dtype() Self[source]
- Shrink numeric columns to the minimal required datatype. - Shrink to the dtype needed to fit the extrema of this [ - Series]. This can be used to reduce memory pressure.- Examples - >>> pl.DataFrame( ... { ... "a": [1, 2, 3], ... "b": [1, 2, 2 << 32], ... "c": [-1, 2, 1 << 30], ... "d": [-112, 2, 112], ... "e": [-112, 2, 129], ... "f": ["a", "b", "c"], ... "g": [0.1, 1.32, 0.12], ... "h": [True, None, False], ... } ... ).select(pl.all().shrink_dtype()) shape: (3, 8) ┌─────┬────────────┬────────────┬──────┬──────┬─────┬──────┬───────┐ │ a ┆ b ┆ c ┆ d ┆ e ┆ f ┆ g ┆ h │ │ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │ │ i8 ┆ i64 ┆ i32 ┆ i8 ┆ i16 ┆ str ┆ f32 ┆ bool │ ╞═════╪════════════╪════════════╪══════╪══════╪═════╪══════╪═══════╡ │ 1 ┆ 1 ┆ -1 ┆ -112 ┆ -112 ┆ a ┆ 0.1 ┆ true │ │ 2 ┆ 2 ┆ 2 ┆ 2 ┆ 2 ┆ b ┆ 1.32 ┆ null │ │ 3 ┆ 8589934592 ┆ 1073741824 ┆ 112 ┆ 129 ┆ c ┆ 0.12 ┆ false │ └─────┴────────────┴────────────┴──────┴──────┴─────┴──────┴───────┘ 
 - shuffle(seed: int | None = None) Self[source]
- Shuffle the contents of this expression. - Parameters:
- seed
- Seed for the random number generator. If set to None (default), a random seed is generated each time the shuffle is called. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3]}) >>> df.select(pl.col("a").shuffle(seed=1)) shape: (3, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 2 │ │ 1 │ │ 3 │ └─────┘ 
 - sign() Self[source]
- Compute the element-wise indication of the sign. - The returned values can be -1, 0, or 1: - -1 if x < 0. 
- 0 if x == 0. 
- 1 if x > 0. 
 - (null values are preserved as-is). - Examples - >>> df = pl.DataFrame({"a": [-9.0, -0.0, 0.0, 4.0, None]}) >>> df.select(pl.col("a").sign()) shape: (5, 1) ┌──────┐ │ a │ │ --- │ │ i64 │ ╞══════╡ │ -1 │ │ 0 │ │ 0 │ │ 1 │ │ null │ └──────┘ 
 - sin() Self[source]
- Compute the element-wise value for the sine. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [0.0]}) >>> df.select(pl.col("a").sin()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 0.0 │ └─────┘ 
 - sinh() Self[source]
- Compute the element-wise value for the hyperbolic sine. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").sinh()) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 1.175201 │ └──────────┘ 
 - skew(*, bias: bool = True) Self[source]
- Compute the sample skewness of a data set. - For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function - skewtestcan be used to determine if the skewness value is close enough to zero, statistically speaking.- See scipy.stats for more information. - Parameters:
- biasbool, optional
- If False, the calculations are corrected for statistical bias. 
 
 - Notes - The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e. \[g_1=\frac{m_3}{m_2^{3/2}}\]- where \[m_i=\frac{1}{N}\sum_{n=1}^N(x[n]-\bar{x})^i\]- is the biased sample \(i\texttt{th}\) central moment, and \(\bar{x}\) is the sample mean. If - biasis False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.\[G_1 = \frac{k_3}{k_2^{3/2}} = \frac{\sqrt{N(N-1)}}{N-2}\frac{m_3}{m_2^{3/2}}\]- Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 2, 1]}) >>> df.select(pl.col("a").skew()) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 0.343622 │ └──────────┘ 
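The adjusted Fisher-Pearson coefficient G_1 from the notes above is obtained by setting bias=False; a minimal sketch on the same data (output omitted):

>>> adjusted = df.select(pl.col("a").skew(bias=False))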
 - slice(offset: int | Expr, length: int | Expr | None = None) Self[source]
- Get a slice of this expression. - Parameters:
- offset
- Start index. Negative indexing is supported. 
- length
- Length of the slice. If set to - None, all rows starting at the offset will be selected.
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [8, 9, 10, 11], ... "b": [None, 4, 4, 4], ... } ... ) >>> df.select(pl.all().slice(1, 2)) shape: (2, 2) ┌─────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═════╪═════╡ │ 9 ┆ 4 │ │ 10 ┆ 4 │ └─────┴─────┘ 
 - sort(*, descending: bool = False, nulls_last: bool = False) Self[source]
- Sort this column. - When used in a projection/selection context, the whole column is sorted. When used in a group by context, the groups are sorted. - Parameters:
- descending
- Sort in descending order. 
- nulls_last
- Place null values last. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, None, 3, 2], ... } ... ) >>> df.select(pl.col("a").sort()) shape: (4, 1) ┌──────┐ │ a │ │ --- │ │ i64 │ ╞══════╡ │ null │ │ 1 │ │ 2 │ │ 3 │ └──────┘ >>> df.select(pl.col("a").sort(descending=True)) shape: (4, 1) ┌──────┐ │ a │ │ --- │ │ i64 │ ╞══════╡ │ null │ │ 3 │ │ 2 │ │ 1 │ └──────┘ >>> df.select(pl.col("a").sort(nulls_last=True)) shape: (4, 1) ┌──────┐ │ a │ │ --- │ │ i64 │ ╞══════╡ │ 1 │ │ 2 │ │ 3 │ │ null │ └──────┘ - When sorting in a group by context, the groups are sorted. - >>> df = pl.DataFrame( ... { ... "group": ["one", "one", "one", "two", "two", "two"], ... "value": [1, 98, 2, 3, 99, 4], ... } ... ) >>> df.group_by("group").agg(pl.col("value").sort()) shape: (2, 2) ┌───────┬────────────┐ │ group ┆ value │ │ --- ┆ --- │ │ str ┆ list[i64] │ ╞═══════╪════════════╡ │ two ┆ [3, 4, 99] │ │ one ┆ [1, 2, 98] │ └───────┴────────────┘ 
 - sort_by(by: IntoExpr | Iterable[IntoExpr], *more_by: IntoExpr, descending: bool | Sequence[bool] = False) Self[source]
- Sort this column by the ordering of other columns. - When used in a projection/selection context, the whole column is sorted. When used in a group by context, the groups are sorted. - Parameters:
- by
- Column(s) to sort by. Accepts expression input. Strings are parsed as column names. 
- *more_by
- Additional columns to sort by, specified as positional arguments. 
- descending
- Sort in descending order. When sorting by multiple columns, can be specified per column by passing a sequence of booleans. 
 
 - Examples - Pass a single column name to sort by that column. - >>> df = pl.DataFrame( ... { ... "group": ["a", "a", "b", "b"], ... "value1": [1, 3, 4, 2], ... "value2": [8, 7, 6, 5], ... } ... ) >>> df.select(pl.col("group").sort_by("value1")) shape: (4, 1) ┌───────┐ │ group │ │ --- │ │ str │ ╞═══════╡ │ a │ │ b │ │ a │ │ b │ └───────┘ - Sorting by expressions is also supported. - >>> df.select(pl.col("group").sort_by(pl.col("value1") + pl.col("value2"))) shape: (4, 1) ┌───────┐ │ group │ │ --- │ │ str │ ╞═══════╡ │ b │ │ a │ │ a │ │ b │ └───────┘ - Sort by multiple columns by passing a list of columns. - >>> df.select(pl.col("group").sort_by(["value1", "value2"], descending=True)) shape: (4, 1) ┌───────┐ │ group │ │ --- │ │ str │ ╞═══════╡ │ b │ │ a │ │ b │ │ a │ └───────┘ - Or use positional arguments to sort by multiple columns in the same way. - >>> df.select(pl.col("group").sort_by("value1", "value2")) shape: (4, 1) ┌───────┐ │ group │ │ --- │ │ str │ ╞═══════╡ │ a │ │ b │ │ a │ │ b │ └───────┘ - When sorting in a group by context, the groups are sorted. - >>> df.group_by("group").agg( ... pl.col("value1").sort_by("value2") ... ) shape: (2, 2) ┌───────┬───────────┐ │ group ┆ value1 │ │ --- ┆ --- │ │ str ┆ list[i64] │ ╞═══════╪═══════════╡ │ a ┆ [3, 1] │ │ b ┆ [2, 4] │ └───────┴───────────┘ - Take a single row from each group where a column attains its minimal value within that group. - >>> df.group_by("group").agg( ... pl.all().sort_by("value2").first() ... ) shape: (2, 3) ┌───────┬────────┬────────┐ │ group ┆ value1 ┆ value2 │ │ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 │ ╞═══════╪════════╪════════╡ │ a ┆ 3 ┆ 7 │ │ b ┆ 2 ┆ 5 │ └───────┴────────┴────────┘ 
 - sqrt() Self[source]
- Compute the square root of the elements. - Examples - >>> df = pl.DataFrame({"values": [1.0, 2.0, 4.0]}) >>> df.select(pl.col("values").sqrt()) shape: (3, 1) ┌──────────┐ │ values │ │ --- │ │ f64 │ ╞══════════╡ │ 1.0 │ │ 1.414214 │ │ 2.0 │ └──────────┘ 
 - std(ddof: int = 1) Self[source]
- Get standard deviation. - Parameters:
- ddof
- “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1. 
 
 - Examples - >>> df = pl.DataFrame({"a": [-1, 0, 1]}) >>> df.select(pl.col("a").std()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ └─────┘ 
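 - To illustrate the effect of ddof, the sketch below compares the default sample standard deviation (divisor N - 1) with the population standard deviation (ddof=0, divisor N) on the same column; the values in the comments follow directly from the formula:
>>> df = pl.DataFrame({"a": [-1, 0, 1]})
>>> df.select(
...     pl.col("a").std().alias("sample_std"),  # sqrt(2 / 2) = 1.0
...     pl.col("a").std(ddof=0).alias("population_std"),  # sqrt(2 / 3) ≈ 0.816497
... )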
 - sub(other: Any) Self[source]
- Method equivalent of subtraction operator - expr - other.- Parameters:
- other
- Numeric literal or expression value. 
 
 - Examples - >>> df = pl.DataFrame({"x": [0, 1, 2, 3, 4]}) >>> df.with_columns( ... pl.col("x").sub(2).alias("x-2"), ... pl.col("x").sub(pl.col("x").cum_sum()).alias("x-expr"), ... ) shape: (5, 3) ┌─────┬─────┬────────┐ │ x ┆ x-2 ┆ x-expr │ │ --- ┆ --- ┆ --- │ │ i64 ┆ i64 ┆ i64 │ ╞═════╪═════╪════════╡ │ 0 ┆ -2 ┆ 0 │ │ 1 ┆ -1 ┆ 0 │ │ 2 ┆ 0 ┆ -1 │ │ 3 ┆ 1 ┆ -3 │ │ 4 ┆ 2 ┆ -6 │ └─────┴─────┴────────┘ 
 - suffix(suffix: str) Self[source]
- Add a suffix to the root column name of the expression. - Deprecated since version 0.19.12: This method has been renamed to - name.suffix().- Parameters:
- suffix
- Suffix to add to the root column name. 
 
 - See also - Notes - This will undo any previous renaming operations on the expression. - Due to implementation constraints, this method can only be called as the last expression in a chain. - Examples - >>> df = pl.DataFrame( ... { ... "a": [1, 2, 3], ... "b": ["x", "y", "z"], ... } ... ) >>> df.with_columns(pl.all().reverse().name.suffix("_reverse")) shape: (3, 4) ┌─────┬─────┬───────────┬───────────┐ │ a ┆ b ┆ a_reverse ┆ b_reverse │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ i64 ┆ str │ ╞═════╪═════╪═══════════╪═══════════╡ │ 1 ┆ x ┆ 3 ┆ z │ │ 2 ┆ y ┆ 2 ┆ y │ │ 3 ┆ z ┆ 1 ┆ x │ └─────┴─────┴───────────┴───────────┘ 
 - sum() Self[source]
- Get sum value. - Notes - Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues. - Examples - >>> df = pl.DataFrame({"a": [-1, 0, 1]}) >>> df.select(pl.col("a").sum()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 0 │ └─────┘ 
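 - Per the note above, small integer dtypes are promoted before summing, so the result can exceed the input dtype's range without wrapping; a minimal sketch of that behaviour:
>>> df = pl.DataFrame({"a": [200, 200]}, schema={"a": pl.UInt8})
>>> df.select(pl.col("a").sum())  # 400 as Int64, even though 400 does not fit in UInt8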
 - tail(n: int | Expr = 10) Self[source]
- Get the last - n rows.- Parameters:
- n
- Number of rows to return. 
 
 - Examples - >>> df = pl.DataFrame({"foo": [1, 2, 3, 4, 5, 6, 7]}) >>> df.tail(3) shape: (3, 1) ┌─────┐ │ foo │ │ --- │ │ i64 │ ╞═════╡ │ 5 │ │ 6 │ │ 7 │ └─────┘ 
 - take(indices: int | list[int] | Expr | Series | np.ndarray[Any, Any]) Self[source]
- Take values by index. - Deprecated since version 0.19.14: This method has been renamed to - gather().- Parameters:
- indices
- An expression that leads to a UInt32 dtyped Series. 
 
 
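 - New code should call the renamed gather() directly; a minimal sketch of the replacement:
>>> df = pl.DataFrame({"a": [10, 20, 30, 40]})
>>> df.select(pl.col("a").gather([0, 3]))  # values at indices 0 and 3: 10 and 40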
 - take_every(n: int) Self[source]
- Take every nth value in the Series and return as a new Series. - Deprecated since version 0.19.14: This method has been renamed to - gather_every().- Parameters:
- n
- Gather every n-th row. 
 
 
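 - Likewise, new code should use the renamed gather_every(); a minimal sketch of the replacement:
>>> df = pl.DataFrame({"a": [1, 2, 3, 4, 5, 6]})
>>> df.select(pl.col("a").gather_every(2))  # every 2nd value starting at index 0: 1, 3, 5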
 - tan() Self[source]
- Compute the element-wise value for the tangent. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").tan().round(2)) shape: (1, 1) ┌──────┐ │ a │ │ --- │ │ f64 │ ╞══════╡ │ 1.56 │ └──────┘ 
 - tanh() Self[source]
- Compute the element-wise value for the hyperbolic tangent. - Returns:
- Expr
- Expression of data type - Float64.
 
 - Examples - >>> df = pl.DataFrame({"a": [1.0]}) >>> df.select(pl.col("a").tanh()) shape: (1, 1) ┌──────────┐ │ a │ │ --- │ │ f64 │ ╞══════════╡ │ 0.761594 │ └──────────┘ 
 - to_physical() Self[source]
- Cast to physical representation of the logical dtype. - polars.datatypes.Date() -> polars.datatypes.Int32()
- polars.datatypes.Datetime() -> polars.datatypes.Int64()
- polars.datatypes.Time() -> polars.datatypes.Int64()
- polars.datatypes.Duration() -> polars.datatypes.Int64()
- polars.datatypes.Categorical() -> polars.datatypes.UInt32()
- List(inner) -> List(physical of inner)
 - Other data types will be left unchanged. - Examples - Replicating the pandas pd.factorize function. - >>> pl.DataFrame({"vals": ["a", "x", None, "a"]}).with_columns( ... [ ... pl.col("vals").cast(pl.Categorical), ... pl.col("vals") ... .cast(pl.Categorical) ... .to_physical() ... .alias("vals_physical"), ... ] ... ) shape: (4, 2) ┌──────┬───────────────┐ │ vals ┆ vals_physical │ │ --- ┆ --- │ │ cat ┆ u32 │ ╞══════╪═══════════════╡ │ a ┆ 0 │ │ x ┆ 1 │ │ null ┆ null │ │ a ┆ 0 │ └──────┴───────────────┘ 
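 - The Date mapping from the list above can also be seen directly: a Date is stored physically as Int32 days since the UNIX epoch. A minimal sketch:
>>> from datetime import date
>>> pl.DataFrame({"d": [date(1970, 1, 2)]}).select(
...     pl.col("d").to_physical()
... )  # one day after the epoch -> 1 (i32)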
 - top_k(k: int | IntoExprColumn = 5) Self[source]
- Return the - k largest elements.- This has time complexity: \[O(n + k \log{n} - \frac{k}{2})\]- Parameters:
- k
- Number of elements to return. 
 
 - See also - Examples - >>> df = pl.DataFrame( ... { ... "value": [1, 98, 2, 3, 99, 4], ... } ... ) >>> df.select( ... [ ... pl.col("value").top_k().alias("top_k"), ... pl.col("value").bottom_k().alias("bottom_k"), ... ] ... ) shape: (5, 2) ┌───────┬──────────┐ │ top_k ┆ bottom_k │ │ --- ┆ --- │ │ i64 ┆ i64 │ ╞═══════╪══════════╡ │ 99 ┆ 1 │ │ 98 ┆ 2 │ │ 4 ┆ 3 │ │ 3 ┆ 4 │ │ 2 ┆ 98 │ └───────┴──────────┘ 
 - truediv(other: Any) Self[source]
- Method equivalent of float division operator - expr / other.- Parameters:
- other
- Numeric literal or expression value. 
 
 - See also - Notes - Zero-division behaviour follows IEEE-754: - 0/0: Invalid operation - mathematically undefined, returns NaN. - n/0: On finite operands gives an exact infinite result, e.g. ±infinity. - Examples - >>> df = pl.DataFrame( ... data={"x": [-2, -1, 0, 1, 2], "y": [0.5, 0.0, 0.0, -4.0, -0.5]} ... ) >>> df.with_columns( ... pl.col("x").truediv(2).alias("x/2"), ... pl.col("x").truediv(pl.col("y")).alias("x/y"), ... ) shape: (5, 4) ┌─────┬──────┬──────┬───────┐ │ x ┆ y ┆ x/2 ┆ x/y │ │ --- ┆ --- ┆ --- ┆ --- │ │ i64 ┆ f64 ┆ f64 ┆ f64 │ ╞═════╪══════╪══════╪═══════╡ │ -2 ┆ 0.5 ┆ -1.0 ┆ -4.0 │ │ -1 ┆ 0.0 ┆ -0.5 ┆ -inf │ │ 0 ┆ 0.0 ┆ 0.0 ┆ NaN │ │ 1 ┆ -4.0 ┆ 0.5 ┆ -0.25 │ │ 2 ┆ -0.5 ┆ 1.0 ┆ -4.0 │ └─────┴──────┴──────┴───────┘ 
 - unique(*, maintain_order: bool = False) Self[source]
- Get unique values of this expression. - Parameters:
- maintain_order
- Maintain order of data. This requires more work. 
 
 - Examples - >>> df = pl.DataFrame({"a": [1, 1, 2]}) >>> df.select(pl.col("a").unique()) shape: (2, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 2 │ │ 1 │ └─────┘ >>> df.select(pl.col("a").unique(maintain_order=True)) shape: (2, 1) ┌─────┐ │ a │ │ --- │ │ i64 │ ╞═════╡ │ 1 │ │ 2 │ └─────┘ 
 - unique_counts() Self[source]
- Return a count of the unique values in the order of appearance. - This method differs from - value_counts in that it does not return the values, only the counts, and it might be faster.- Examples - >>> df = pl.DataFrame( ... { ... "id": ["a", "b", "b", "c", "c", "c"], ... } ... ) >>> df.select( ... [ ... pl.col("id").unique_counts(), ... ] ... ) shape: (3, 1) ┌─────┐ │ id │ │ --- │ │ u32 │ ╞═════╡ │ 1 │ │ 2 │ │ 3 │ └─────┘ 
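 - Because the counts are returned in order of appearance, they can be lined up with unique(maintain_order=True) to recover which value each count belongs to; this is a sketch of that pattern, reusing the df from the example above:
>>> df.select(
...     pl.col("id").unique(maintain_order=True),
...     pl.col("id").unique_counts().alias("counts"),
... )  # "a" -> 1, "b" -> 2, "c" -> 3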
 - upper_bound() Self[source]
- Calculate the upper bound. - Returns a unit Series with the highest value possible for the dtype of this expression. - Examples - >>> df = pl.DataFrame({"a": [1, 2, 3, 2, 1]}) >>> df.select(pl.col("a").upper_bound()) shape: (1, 1) ┌─────────────────────┐ │ a │ │ --- │ │ i64 │ ╞═════════════════════╡ │ 9223372036854775807 │ └─────────────────────┘ 
 - value_counts(*, sort: bool = False, parallel: bool = False) Self[source]
- Count the occurrences of unique values. - Parameters:
- sort
- Sort the output by count in descending order. If set to - False (default), the order of the output is random.
- parallel
- Execute the computation in parallel. - Note - This option should likely not be enabled in a group by context, as the computation is already parallelized per group. 
 
- Returns:
- Expr
- Expression of data type - Struct with mapping of unique values to their count.
 
 - Examples - >>> df = pl.DataFrame( ... {"color": ["red", "blue", "red", "green", "blue", "blue"]} ... ) >>> df.select(pl.col("color").value_counts()) shape: (3, 1) ┌─────────────┐ │ color │ │ --- │ │ struct[2] │ ╞═════════════╡ │ {"red",2} │ │ {"green",1} │ │ {"blue",3} │ └─────────────┘ - Sort the output by count. - >>> df.select(pl.col("color").value_counts(sort=True)) shape: (3, 1) ┌─────────────┐ │ color │ │ --- │ │ struct[2] │ ╞═════════════╡ │ {"blue",3} │ │ {"red",2} │ │ {"green",1} │ └─────────────┘ 
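 - Because the result is a Struct column, it can be expanded into separate value and count columns with DataFrame.unnest; the exact field names can vary between polars versions, so treat this as a sketch:
>>> out = df.select(pl.col("color").value_counts(sort=True))
>>> out.unnest("color")  # one column with the colors, one with their counts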
 - var(ddof: int = 1) Self[source]
- Get variance. - Parameters:
- ddof
- “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1. 
 
 - Examples - >>> df = pl.DataFrame({"a": [-1, 0, 1]}) >>> df.select(pl.col("a").var()) shape: (1, 1) ┌─────┐ │ a │ │ --- │ │ f64 │ ╞═════╡ │ 1.0 │ └─────┘ 
 - where(predicate: Expr) Self[source]
- Filter a single column. - Alias for - filter().- Parameters:
- predicate
- Boolean expression. 
 
 - Examples - >>> df = pl.DataFrame( ... { ... "group_col": ["g1", "g1", "g2"], ... "b": [1, 2, 3], ... } ... ) >>> df.group_by("group_col").agg( ... [ ... pl.col("b").where(pl.col("b") < 2).sum().alias("lt"), ... pl.col("b").where(pl.col("b") >= 2).sum().alias("gte"), ... ] ... ).sort("group_col") shape: (2, 3) ┌───────────┬─────┬─────┐ │ group_col ┆ lt ┆ gte │ │ --- ┆ --- ┆ --- │ │ str ┆ i64 ┆ i64 │ ╞═══════════╪═════╪═════╡ │ g1 ┆ 1 ┆ 2 │ │ g2 ┆ 0 ┆ 3 │ └───────────┴─────┴─────┘ 
 - xor(other: Any) Self[source]
- Method equivalent of bitwise exclusive-or operator - expr ^ other.- Parameters:
- other
- Integer or boolean value; accepts expression input. 
 
 - Examples - >>> df = pl.DataFrame( ... {"x": [True, False, True, False], "y": [True, True, False, False]} ... ) >>> df.with_columns(pl.col("x").xor(pl.col("y")).alias("x ^ y")) shape: (4, 3) ┌───────┬───────┬───────┐ │ x ┆ y ┆ x ^ y │ │ --- ┆ --- ┆ --- │ │ bool ┆ bool ┆ bool │ ╞═══════╪═══════╪═══════╡ │ true ┆ true ┆ false │ │ false ┆ true ┆ true │ │ true ┆ false ┆ true │ │ false ┆ false ┆ false │ └───────┴───────┴───────┘ - >>> def binary_string(n: int) -> str: ... return bin(n)[2:].zfill(8) >>> >>> df = pl.DataFrame( ... data={"x": [10, 8, 250, 66], "y": [1, 2, 3, 4]}, ... schema={"x": pl.UInt8, "y": pl.UInt8}, ... ) >>> df.with_columns( ... pl.col("x").map_elements(binary_string).alias("bin_x"), ... pl.col("y").map_elements(binary_string).alias("bin_y"), ... pl.col("x").xor(pl.col("y")).alias("xor_xy"), ... pl.col("x") ... .xor(pl.col("y")) ... .map_elements(binary_string) ... .alias("bin_xor_xy"), ... ) shape: (4, 6) ┌─────┬─────┬──────────┬──────────┬────────┬────────────┐ │ x ┆ y ┆ bin_x ┆ bin_y ┆ xor_xy ┆ bin_xor_xy │ │ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │ │ u8 ┆ u8 ┆ str ┆ str ┆ u8 ┆ str │ ╞═════╪═════╪══════════╪══════════╪════════╪════════════╡ │ 10 ┆ 1 ┆ 00001010 ┆ 00000001 ┆ 11 ┆ 00001011 │ │ 8 ┆ 2 ┆ 00001000 ┆ 00000010 ┆ 10 ┆ 00001010 │ │ 250 ┆ 3 ┆ 11111010 ┆ 00000011 ┆ 249 ┆ 11111001 │ │ 66 ┆ 4 ┆ 01000010 ┆ 00000100 ┆ 70 ┆ 01000110 │ └─────┴─────┴──────────┴──────────┴────────┴────────────┘