Series

This page gives an overview of all public Series methods.

class polars.Series(
name: str | ArrayLike | None = None,
values: ArrayLike | None = None,
dtype: PolarsDataType | None = None,
*,
strict: bool = True,
nan_to_null: bool = False,
dtype_if_empty: PolarsDataType | None = None,
)

A Series represents a single column in a polars DataFrame.

Parameters:
name : str, default None

Name of the Series. Will be used as a column name when used in a DataFrame. When not specified, name is set to an empty string.

values : ArrayLike, default None

One-dimensional data in various forms. Supported are: Sequence, Series, pyarrow Array, and numpy ndarray.

dtype : DataType, default None

Polars dtype of the Series data. If not specified, the dtype is inferred.

strict

Throw error on numeric overflow.

nan_to_null

When a numpy array is used to create this Series, set this to True to convert np.nan values to null. (This parameter is a no-op on non-numpy data.)

dtype_if_empty : DataType, default None

If no dtype is specified and values contains None, an empty list, or a list with only None values, set the Polars dtype of the Series data. If not specified, Float32 is used in those cases.

Examples

Constructing a Series by specifying name and values positionally:

>>> s = pl.Series("a", [1, 2, 3])
>>> s
shape: (3,)
Series: 'a' [i64]
[
        1
        2
        3
]

Notice that the dtype is automatically inferred as a polars Int64:

>>> s.dtype
Int64

Constructing a Series with a specific dtype:

>>> s2 = pl.Series("a", [1, 2, 3], dtype=pl.Float32)
>>> s2
shape: (3,)
Series: 'a' [f32]
[
    1.0
    2.0
    3.0
]

It is possible to construct a Series with values as the first positional argument. This syntax is considered an anti-pattern, but it can be useful in certain scenarios. You must specify any other arguments through keywords.

>>> s3 = pl.Series([1, 2, 3])
>>> s3
shape: (3,)
Series: '' [i64]
[
        1
        2
        3
]

Methods:

abs

Compute absolute values.

alias

Rename the series.

all

Return whether all values in the column are True.

any

Return whether any of the values in the column are True.

append

Append a Series to this one.

apply

Apply a custom/user-defined function (UDF) over elements in this Series.

arccos

Compute the element-wise value for the inverse cosine.

arccosh

Compute the element-wise value for the inverse hyperbolic cosine.

arcsin

Compute the element-wise value for the inverse sine.

arcsinh

Compute the element-wise value for the inverse hyperbolic sine.

arctan

Compute the element-wise value for the inverse tangent.

arctanh

Compute the element-wise value for the inverse hyperbolic tangent.

arg_max

Get the index of the maximal value.

arg_min

Get the index of the minimal value.

arg_sort

Get the index values that would sort this Series.

arg_true

Get index values where Boolean Series evaluate True.

arg_unique

Get unique index as Series.

bottom_k

Return the k smallest elements.

cast

Cast between data types.

cbrt

Compute the cube root of the elements.

ceil

Rounds up to the nearest integer value.

chunk_lengths

Get the length of each individual chunk.

clear

Create an empty copy of the current Series, with zero to 'n' elements.

clip

Set values outside the given boundaries to the boundary value.

clip_max

Clip (limit) the values in an array to a max boundary.

clip_min

Clip (limit) the values in an array to a min boundary.

clone

Create a copy of this Series.

cos

Compute the element-wise value for the cosine.

cosh

Compute the element-wise value for the hyperbolic cosine.

cot

Compute the element-wise value for the cotangent.

cum_max

Get an array with the cumulative max computed at every element.

cum_min

Get an array with the cumulative min computed at every element.

cum_prod

Get an array with the cumulative product computed at every element.

cum_sum

Get an array with the cumulative sum computed at every element.

cummax

Get an array with the cumulative max computed at every element.

cummin

Get an array with the cumulative min computed at every element.

cumprod

Get an array with the cumulative product computed at every element.

cumsum

Get an array with the cumulative sum computed at every element.

cumulative_eval

Run an expression over a sliding window that increases 1 slot every iteration.

cut

Bin continuous values into discrete categories.

describe

Quick summary statistics of a Series.

diff

Calculate the first discrete difference between shifted items.

dot

Compute the dot/inner product between two Series.

drop_nans

Drop all floating point NaN values.

drop_nulls

Drop all null values.

entropy

Computes the entropy.

eq

Method equivalent of operator expression series == other.

eq_missing

Method equivalent of equality operator series == other where None == None.

equals

Check whether the Series is equal to another Series.

estimated_size

Return an estimation of the total (heap) allocated size of the Series.

ewm_mean

Exponentially-weighted moving average.

ewm_std

Exponentially-weighted moving standard deviation.

ewm_var

Exponentially-weighted moving variance.

exp

Compute the exponential, element-wise.

explode

Explode a list Series.

extend

Extend the memory backed by this Series with the values from another.

extend_constant

Extremely fast method for extending the Series with 'n' copies of a value.

fill_nan

Fill floating point NaN value with a fill value.

fill_null

Fill null values using the specified value or strategy.

filter

Filter elements by a boolean mask.

floor

Rounds down to the nearest integer value.

gather

Take values by index.

gather_every

Take every nth value in the Series and return as new Series.

ge

Method equivalent of operator expression series >= other.

get_chunks

Get the chunks of this Series as a list of Series.

gt

Method equivalent of operator expression series > other.

has_validity

Return True if the Series has a validity bitmask.

hash

Hash the Series.

head

Get the first n elements.

hist

Bin values into buckets and count their occurrences.

implode

Aggregate values into a list.

interpolate

Fill null values using interpolation.

is_between

Get a boolean mask of the values that fall between the given start/end values.

is_boolean

Check if this Series is a Boolean.

is_duplicated

Get mask of all duplicated values.

is_empty

Check if the Series is empty.

is_finite

Returns a boolean Series indicating which values are finite.

is_first

Return a boolean mask indicating the first occurrence of each distinct value.

is_first_distinct

Return a boolean mask indicating the first occurrence of each distinct value.

is_float

Check if this Series has floating point numbers.

is_in

Check if elements of this Series are in the other Series.

is_infinite

Returns a boolean Series indicating which values are infinite.

is_integer

Check if this Series datatype is an integer (signed or unsigned).

is_last

Return a boolean mask indicating the last occurrence of each distinct value.

is_last_distinct

Return a boolean mask indicating the last occurrence of each distinct value.

is_nan

Returns a boolean Series indicating which values are NaN.

is_not_nan

Returns a boolean Series indicating which values are not NaN.

is_not_null

Returns a boolean Series indicating which values are not null.

is_null

Returns a boolean Series indicating which values are null.

is_numeric

Check if this Series datatype is numeric.

is_sorted

Check if the Series is sorted.

is_temporal

Check if this Series datatype is temporal.

is_unique

Get mask of all unique values.

is_utf8

Check if this Series datatype is a Utf8.

item

Return the Series as a scalar, or return the element at the given index.

kurtosis

Compute the kurtosis (Fisher or Pearson) of a dataset.

le

Method equivalent of operator expression series <= other.

len

Return the number of elements in this Series.

limit

Get the first n elements.

log

Compute the logarithm to a given base.

log10

Compute the base 10 logarithm of the input array, element-wise.

log1p

Compute the natural logarithm of the input array plus one, element-wise.

lower_bound

Return the lower bound of this Series' dtype as a unit Series.

lt

Method equivalent of operator expression series < other.

map_dict

Replace values in the Series using a remapping dictionary.

map_elements

Map a custom/user-defined function (UDF) over elements in this Series.

max

Get the maximum value in this Series.

mean

Reduce this Series to the mean value.

median

Get the median of this Series.

min

Get the minimal value in this Series.

mode

Compute the most occurring value(s).

n_chunks

Get the number of chunks that this Series contains.

n_unique

Count the number of unique values in this Series.

nan_max

Get maximum value, but propagate/poison encountered NaN values.

nan_min

Get minimum value, but propagate/poison encountered NaN values.

ne

Method equivalent of operator expression series != other.

ne_missing

Method equivalent of equality operator series != other where None == None.

new_from_index

Create a new Series filled with values from the given index.

not_

Negate a boolean Series.

null_count

Count the null values in this Series.

pct_change

Computes percentage change between values.

peak_max

Get a boolean mask of the local maximum peaks.

peak_min

Get a boolean mask of the local minimum peaks.

pow

Raise to the power of the given exponent.

product

Reduce this Series to the product value.

qcut

Bin continuous values into discrete categories based on their quantiles.

quantile

Get the quantile value of this Series.

rank

Assign ranks to data, dealing with ties appropriately.

rechunk

Create a single chunk of memory for this Series.

reinterpret

Reinterpret the underlying bits as a signed/unsigned integer.

rename

Rename this Series.

replace

Replace values according to the given mapping.

reshape

Reshape this Series to a flat Series or a Series of Lists.

reverse

Return Series in reverse order.

rle

Get the lengths of runs of identical values.

rle_id

Map values to run IDs.

rolling_apply

Apply a custom rolling window function.

rolling_map

Compute a custom rolling window function.

rolling_max

Apply a rolling max (moving max) over the values in this array.

rolling_mean

Apply a rolling mean (moving mean) over the values in this array.

rolling_median

Compute a rolling median.

rolling_min

Apply a rolling min (moving min) over the values in this array.

rolling_quantile

Compute a rolling quantile.

rolling_skew

Compute a rolling skew.

rolling_std

Compute a rolling std dev.

rolling_sum

Apply a rolling sum (moving sum) over the values in this array.

rolling_var

Compute a rolling variance.

round

Round underlying floating point data by decimals digits.

round_sig_figs

Round to a number of significant figures.

sample

Sample from this Series.

scatter

Set values at the index locations.

search_sorted

Find indices where elements should be inserted to maintain order.

series_equal

Check whether the Series is equal to another Series.

set

Set masked values.

set_at_idx

Set values at the index locations.

set_sorted

Flags the Series as 'sorted'.

shift

Shift values by the given number of indices.

shift_and_fill

Shift values by the given number of places and fill the resulting null values.

shrink_dtype

Shrink numeric columns to the minimal required datatype.

shrink_to_fit

Shrink Series memory usage.

shuffle

Shuffle the contents of this Series.

sign

Compute the element-wise indication of the sign.

sin

Compute the element-wise value for the sine.

sinh

Compute the element-wise value for the hyperbolic sine.

skew

Compute the sample skewness of a data set.

slice

Get a slice of this Series.

sort

Sort this Series.

sqrt

Compute the square root of the elements.

std

Get the standard deviation of this Series.

sum

Reduce this Series to the sum value.

tail

Get the last n elements.

take

Take values by index.

take_every

Take every nth value in the Series and return as new Series.

tan

Compute the element-wise value for the tangent.

tanh

Compute the element-wise value for the hyperbolic tangent.

to_arrow

Get the underlying Arrow Array.

to_dummies

Get dummy/indicator variables.

to_frame

Cast this Series to a DataFrame.

to_init_repr

Convert Series to instantiatable string representation.

to_list

Convert this Series to a Python List.

to_numpy

Convert this Series to numpy.

to_pandas

Convert this Series to a pandas Series.

to_physical

Cast to physical representation of the logical dtype.

top_k

Return the k largest elements.

unique

Get unique elements in series.

unique_counts

Return a count of the unique values in the order of appearance.

upper_bound

Return the upper bound of this Series' dtype as a unit Series.

value_counts

Count the occurrences of unique values.

var

Get variance of this Series.

view

Get a view into this Series data with a numpy array.

zip_with

Take values from self or other based on the given mask.

abs() -> Series

Compute absolute values.

Same as abs(series).

alias(name: str) -> Series

Rename the series.

Parameters:
name

The new name.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.alias("b")
shape: (3,)
Series: 'b' [i64]
[
        1
        2
        3
]
all(*, ignore_nulls: Literal[True] = True) -> bool
all(*, ignore_nulls: bool) -> bool | None

Return whether all values in the column are True.

Only works on columns of data type Boolean.

Parameters:
ignore_nulls

Ignore null values (default).

If set to False, Kleene logic is used to deal with nulls: if the column contains any null values and no False values, the output is None.

Returns:
bool or None

Examples

>>> pl.Series([True, True]).all()
True
>>> pl.Series([False, True]).all()
False
>>> pl.Series([None, True]).all()
True

Enable Kleene logic by setting ignore_nulls=False.

>>> pl.Series([None, True]).all(ignore_nulls=False)  # Returns None
any(*, ignore_nulls: Literal[True] = True) -> bool
any(*, ignore_nulls: bool) -> bool | None

Return whether any of the values in the column are True.

Only works on columns of data type Boolean.

Parameters:
ignore_nulls

Ignore null values (default).

If set to False, Kleene logic is used to deal with nulls: if the column contains any null values and no True values, the output is None.

Returns:
bool or None

Examples

>>> pl.Series([True, False]).any()
True
>>> pl.Series([False, False]).any()
False
>>> pl.Series([None, False]).any()
False

Enable Kleene logic by setting ignore_nulls=False.

>>> pl.Series([None, False]).any(ignore_nulls=False)  # Returns None
append(other: Series, *, append_chunks: bool | None = None) -> Self

Append a Series to this one.

Parameters:
other

Series to append.

append_chunks

Deprecated since version 0.18.8: This argument will be removed and append will change to always behave like append_chunks=True (the previous default). For the behavior of append_chunks=False, use Series.extend.

If set to True the append operation will add the chunks from other to self. This is super cheap.

If set to False the append operation will do the same as Series.extend, which extends the memory backed by this Series with the values from other.

Different from append chunks, extend appends the data from other to the underlying memory locations and thus may cause a reallocation (which is expensive).

If this does not cause a reallocation, the resulting data structure will not have any extra chunks and thus will yield faster queries.

Prefer extend over append_chunks when you want to do a query after a single append. For instance during online operations where you add n rows and rerun a query.

Prefer append_chunks over extend when you want to append many times before doing a query. For instance when you read in multiple files and want to store them in a single Series. In the latter case, finish the sequence of append_chunks operations with a rechunk.

Warning

This method modifies the series in-place. The series is returned for convenience only.

See also

extend

Examples

>>> a = pl.Series("a", [1, 2, 3])
>>> b = pl.Series("b", [4, 5])
>>> a.append(b)
shape: (5,)
Series: 'a' [i64]
[
    1
    2
    3
    4
    5
]

The resulting series will consist of multiple chunks.

>>> a.n_chunks()
2
apply(
function: Callable[[Any], Any],
return_dtype: PolarsDataType | None = None,
*,
skip_nulls: bool = True,
) -> Self

Apply a custom/user-defined function (UDF) over elements in this Series.

Deprecated since version 0.19.0: This method has been renamed to Series.map_elements().

Parameters:
function

Custom function or lambda.

return_dtype

Output datatype. If none is given, the same datatype as this Series will be used.

skip_nulls

Nulls will be skipped and not passed to the python function. This is faster because python can be skipped and because we call more specialized functions.

arccos() -> Series

Compute the element-wise value for the inverse cosine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arccos()
shape: (3,)
Series: 'a' [f64]
[
    0.0
    1.570796
    3.141593
]
arccosh() -> Series

Compute the element-wise value for the inverse hyperbolic cosine.

Examples

>>> s = pl.Series("a", [5.0, 1.0, 0.0, -1.0])
>>> s.arccosh()
shape: (4,)
Series: 'a' [f64]
[
    2.292432
    0.0
    NaN
    NaN
]
arcsin() -> Series

Compute the element-wise value for the inverse sine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arcsin()
shape: (3,)
Series: 'a' [f64]
[
    1.570796
    0.0
    -1.570796
]
arcsinh() -> Series

Compute the element-wise value for the inverse hyperbolic sine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arcsinh()
shape: (3,)
Series: 'a' [f64]
[
    0.881374
    0.0
    -0.881374
]
arctan() -> Series

Compute the element-wise value for the inverse tangent.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arctan()
shape: (3,)
Series: 'a' [f64]
[
    0.785398
    0.0
    -0.785398
]
arctanh() -> Series

Compute the element-wise value for the inverse hyperbolic tangent.

Examples

>>> s = pl.Series("a", [2.0, 1.0, 0.5, 0.0, -0.5, -1.0, -1.1])
>>> s.arctanh()
shape: (7,)
Series: 'a' [f64]
[
    NaN
    inf
    0.549306
    0.0
    -0.549306
    -inf
    NaN
]
arg_max() -> int | None

Get the index of the maximal value.

Returns:
int

Examples

>>> s = pl.Series("a", [3, 2, 1])
>>> s.arg_max()
0
arg_min() -> int | None

Get the index of the minimal value.

Returns:
int

Examples

>>> s = pl.Series("a", [3, 2, 1])
>>> s.arg_min()
2
arg_sort(
*,
descending: bool = False,
nulls_last: bool = False,
) -> Series

Get the index values that would sort this Series.

Parameters:
descending

Sort in descending order.

nulls_last

Place null values last instead of first.

Examples

>>> s = pl.Series("a", [5, 3, 4, 1, 2])
>>> s.arg_sort()
shape: (5,)
Series: 'a' [u32]
[
    3
    4
    1
    2
    0
]
arg_true() -> Series

Get index values where Boolean Series evaluate True.

Returns:
Series

Series of data type UInt32.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> (s == 2).arg_true()
shape: (1,)
Series: 'a' [u32]
[
        1
]
arg_unique() -> Series

Get unique index as Series.

Returns:
Series

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.arg_unique()
shape: (3,)
Series: 'a' [u32]
[
        0
        1
        3
]
bottom_k(k: int | IntoExprColumn = 5) -> Series

Return the k smallest elements.

This has time complexity:

\[O(n + k \log n - \frac{k}{2})\]
Parameters:
k

Number of elements to return.

See also

top_k

Examples

>>> s = pl.Series("a", [2, 5, 1, 4, 3])
>>> s.bottom_k(3)
shape: (3,)
Series: 'a' [i64]
[
    1
    2
    3
]
cast(
dtype: PolarsDataType | type[int] | type[float] | type[str] | type[bool],
*,
strict: bool = True,
) -> Self

Cast between data types.

Parameters:
dtype

DataType to cast to.

strict

Throw an error if a cast could not be done (for instance, due to an overflow).

Examples

>>> s = pl.Series("a", [True, False, True])
>>> s
shape: (3,)
Series: 'a' [bool]
[
    true
    false
    true
]
>>> s.cast(pl.UInt32)
shape: (3,)
Series: 'a' [u32]
[
    1
    0
    1
]
cbrt() -> Series

Compute the cube root of the elements.

Optimization for

>>> pl.Series([1, 2]) ** (1.0 / 3)
shape: (2,)
Series: '' [f64]
[
    1.0
    1.259921
]
ceil() -> Series

Rounds up to the nearest integer value.

Only works on floating point Series.

Examples

>>> s = pl.Series("a", [1.12345, 2.56789, 3.901234])
>>> s.ceil()
shape: (3,)
Series: 'a' [f64]
[
        2.0
        3.0
        4.0
]
chunk_lengths() -> list[int]

Get the length of each individual chunk.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("a", [4, 5, 6])

Concatenate Series with rechunk = True

>>> pl.concat([s, s2]).chunk_lengths()
[6]

Concatenate Series with rechunk = False

>>> pl.concat([s, s2], rechunk=False).chunk_lengths()
[3, 3]
clear(n: int = 0) -> Series

Create an empty copy of the current Series, with zero to ‘n’ elements.

The copy has an identical name/dtype, but no data.

Parameters:
n

Number of (empty) elements to return in the cleared Series.

See also

clone

Cheap deepcopy/clone.

Examples

>>> s = pl.Series("a", [None, True, False])
>>> s.clear()
shape: (0,)
Series: 'a' [bool]
[
]
>>> s.clear(n=2)
shape: (2,)
Series: 'a' [bool]
[
    null
    null
]
clip(
lower_bound: NumericLiteral | TemporalLiteral | IntoExprColumn | None = None,
upper_bound: NumericLiteral | TemporalLiteral | IntoExprColumn | None = None,
) -> Series

Set values outside the given boundaries to the boundary value.

Parameters:
lower_bound

Lower bound. Accepts expression input. Non-expression inputs are parsed as literals. If set to None (default), no lower bound is applied.

upper_bound

Upper bound. Accepts expression input. Non-expression inputs are parsed as literals. If set to None (default), no upper bound is applied.

See also

when

Notes

This method only works for numeric and temporal columns. To clip other data types, consider writing a when-then-otherwise expression. See when().

Examples

Specifying both a lower and upper bound:

>>> s = pl.Series([-50, 5, 50, None])
>>> s.clip(1, 10)
shape: (4,)
Series: '' [i64]
[
        1
        5
        10
        null
]

Specifying only a single bound:

>>> s.clip(upper_bound=10)
shape: (4,)
Series: '' [i64]
[
        -50
        5
        10
        null
]
clip_max(
upper_bound: NumericLiteral | TemporalLiteral | IntoExprColumn,
) -> Series

Clip (limit) the values in an array to a max boundary.

Deprecated since version 0.19.12: Use clip() instead.

Parameters:
upper_bound

Upper bound.

clip_min(
lower_bound: NumericLiteral | TemporalLiteral | IntoExprColumn,
) -> Series

Clip (limit) the values in an array to a min boundary.

Deprecated since version 0.19.12: Use clip() instead.

Parameters:
lower_bound

Lower bound.

clone() -> Self

Create a copy of this Series.

This is a cheap operation that does not copy data.

See also

clear

Create an empty copy of the current Series, with identical schema but no data.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.clone()
shape: (3,)
Series: 'a' [i64]
[
        1
        2
        3
]
cos() -> Series

Compute the element-wise value for the cosine.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.cos()
shape: (3,)
Series: 'a' [f64]
[
    1.0
    6.1232e-17
    -1.0
]
cosh() -> Series

Compute the element-wise value for the hyperbolic cosine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.cosh()
shape: (3,)
Series: 'a' [f64]
[
    1.543081
    1.0
    1.543081
]
cot() -> Series

Compute the element-wise value for the cotangent.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.cot()
shape: (3,)
Series: 'a' [f64]
[
    inf
    6.1232e-17
    -8.1656e15
]
cum_max(*, reverse: bool = False) -> Series

Get an array with the cumulative max computed at every element.

Parameters:
reverse

reverse the operation.

Examples

>>> s = pl.Series("s", [3, 5, 1])
>>> s.cum_max()
shape: (3,)
Series: 's' [i64]
[
    3
    5
    5
]
cum_min(*, reverse: bool = False) -> Series

Get an array with the cumulative min computed at every element.

Parameters:
reverse

reverse the operation.

Examples

>>> s = pl.Series("s", [1, 2, 3])
>>> s.cum_min()
shape: (3,)
Series: 's' [i64]
[
    1
    1
    1
]
cum_prod(*, reverse: bool = False) -> Series

Get an array with the cumulative product computed at every element.

Parameters:
reverse

reverse the operation.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before multiplying to prevent overflow issues.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.cum_prod()
shape: (3,)
Series: 'a' [i64]
[
    1
    2
    6
]
cum_sum(*, reverse: bool = False) -> Series

Get an array with the cumulative sum computed at every element.

Parameters:
reverse

reverse the operation.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.cum_sum()
shape: (3,)
Series: 'a' [i64]
[
    1
    3
    6
]
cummax(*, reverse: bool = False) -> Series

Get an array with the cumulative max computed at every element.

Deprecated since version 0.19.14: This method has been renamed to cum_max().

Parameters:
reverse

reverse the operation.

cummin(*, reverse: bool = False) -> Series

Get an array with the cumulative min computed at every element.

Deprecated since version 0.19.14: This method has been renamed to cum_min().

Parameters:
reverse

reverse the operation.

cumprod(*, reverse: bool = False) -> Series

Get an array with the cumulative product computed at every element.

Deprecated since version 0.19.14: This method has been renamed to cum_prod().

Parameters:
reverse

reverse the operation.

cumsum(*, reverse: bool = False) -> Series

Get an array with the cumulative sum computed at every element.

Deprecated since version 0.19.14: This method has been renamed to cum_sum().

Parameters:
reverse

reverse the operation.

cumulative_eval(
expr: Expr,
min_periods: int = 1,
*,
parallel: bool = False,
) -> Series

Run an expression over a sliding window that increases 1 slot every iteration.

Parameters:
expr

Expression to evaluate

min_periods

Number of valid values there should be in the window before the expression is evaluated. valid values = length - null_count

parallel

Run in parallel. Don’t do this in a group by or another operation that already has much parallelization.

Warning

This functionality is experimental and may change without it being considered a breaking change.

This can be really slow as it can have O(n^2) complexity. Don’t use this for operations that visit all elements.

Examples

>>> s = pl.Series("values", [1, 2, 3, 4, 5])
>>> s.cumulative_eval(pl.element().first() - pl.element().last() ** 2)
shape: (5,)
Series: 'values' [f64]
[
    0.0
    -3.0
    -8.0
    -15.0
    -24.0
]
cut(
breaks: Sequence[float],
labels: Sequence[str] | None = None,
break_point_label: str = 'break_point',
category_label: str = 'category',
*,
left_closed: bool = False,
include_breaks: bool = False,
as_series: Literal[True] = True,
) -> Series
cut(
breaks: Sequence[float],
labels: Sequence[str] | None = None,
break_point_label: str = 'break_point',
category_label: str = 'category',
*,
left_closed: bool = False,
include_breaks: bool = False,
as_series: Literal[False],
) -> DataFrame
cut(
breaks: Sequence[float],
labels: Sequence[str] | None = None,
break_point_label: str = 'break_point',
category_label: str = 'category',
*,
left_closed: bool = False,
include_breaks: bool = False,
as_series: bool,
) -> Series | DataFrame

Bin continuous values into discrete categories.

Parameters:
breaks

List of unique cut points.

labels

Names of the categories. The number of labels must be equal to the number of cut points plus one.

break_point_label

Name of the breakpoint column. Only used if include_breaks is set to True.

Deprecated since version 0.19.0: This parameter will be removed. Use Series.struct.rename_fields to rename the field instead.

category_label

Name of the category column. Only used if include_breaks is set to True.

Deprecated since version 0.19.0: This parameter will be removed. Use Series.struct.rename_fields to rename the field instead.

left_closed

Set the intervals to be left-closed instead of right-closed.

include_breaks

Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct.

as_series

If set to False, return a DataFrame containing the original values, the breakpoints, and the categories.

Deprecated since version 0.19.0: This parameter will be removed. The same behavior can be achieved by setting include_breaks=True, unnesting the resulting struct Series, and adding the result to the original Series.

Returns:
Series

Series of data type Categorical if include_breaks is set to False (default), otherwise a Series of data type Struct.

See also

qcut

Examples

Divide the column into three categories.

>>> s = pl.Series("foo", [-2, -1, 0, 1, 2])
>>> s.cut([-1, 1], labels=["a", "b", "c"])
shape: (5,)
Series: 'foo' [cat]
[
        "a"
        "a"
        "b"
        "b"
        "c"
]

Create a DataFrame with the breakpoint and category for each value.

>>> cut = s.cut([-1, 1], include_breaks=True).alias("cut")
>>> s.to_frame().with_columns(cut).unnest("cut")
shape: (5, 3)
┌─────┬─────────────┬────────────┐
│ foo ┆ break_point ┆ category   │
│ --- ┆ ---         ┆ ---        │
│ i64 ┆ f64         ┆ cat        │
╞═════╪═════════════╪════════════╡
│ -2  ┆ -1.0        ┆ (-inf, -1] │
│ -1  ┆ -1.0        ┆ (-inf, -1] │
│ 0   ┆ 1.0         ┆ (-1, 1]    │
│ 1   ┆ 1.0         ┆ (-1, 1]    │
│ 2   ┆ inf         ┆ (1, inf]   │
└─────┴─────────────┴────────────┘
describe(
percentiles: Sequence[float] | float | None = (0.25, 0.5, 0.75),
) -> DataFrame

Quick summary statistics of a Series.

Series with mixed datatypes will return summary statistics for the datatype of the first value.

Parameters:
percentiles

One or more percentiles to include in the summary statistics (if the Series has a numeric dtype). All values must be in the range [0, 1].

Returns:
DataFrame

DataFrame with summary statistics of a Series.

Notes

The median is included by default as the 50% percentile.

Examples

>>> series_num = pl.Series([1, 2, 3, 4, 5])
>>> series_num.describe()
shape: (9, 2)
┌────────────┬──────────┐
│ statistic  ┆ value    │
│ ---        ┆ ---      │
│ str        ┆ f64      │
╞════════════╪══════════╡
│ count      ┆ 5.0      │
│ null_count ┆ 0.0      │
│ mean       ┆ 3.0      │
│ std        ┆ 1.581139 │
│ min        ┆ 1.0      │
│ 25%        ┆ 2.0      │
│ 50%        ┆ 3.0      │
│ 75%        ┆ 4.0      │
│ max        ┆ 5.0      │
└────────────┴──────────┘
>>> series_str = pl.Series(["a", "a", None, "b", "c"])
>>> series_str.describe()
shape: (3, 2)
┌────────────┬───────┐
│ statistic  ┆ value │
│ ---        ┆ ---   │
│ str        ┆ i64   │
╞════════════╪═══════╡
│ count      ┆ 5     │
│ null_count ┆ 1     │
│ unique     ┆ 4     │
└────────────┴───────┘
diff(n: int = 1, null_behavior: NullBehavior = 'ignore') -> Series

Calculate the first discrete difference between shifted items.

Parameters:
n

Number of slots to shift.

null_behavior : {‘ignore’, ‘drop’}

How to handle null values.

Examples

>>> s = pl.Series("s", values=[20, 10, 30, 25, 35], dtype=pl.Int8)
>>> s.diff()
shape: (5,)
Series: 's' [i8]
[
    null
    -10
    20
    -5
    10
]
>>> s.diff(n=2)
shape: (5,)
Series: 's' [i8]
[
    null
    null
    10
    15
    5
]
>>> s.diff(n=2, null_behavior="drop")
shape: (3,)
Series: 's' [i8]
[
    10
    15
    5
]
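
The three calls above can be mimicked on a plain list. This is a minimal sketch, assuming n >= 1 and null-free input; the helper name diff is only illustrative and this is not the polars implementation.

```python
def diff(values, n=1, null_behavior="ignore"):
    """Difference between each item and the item n slots before it."""
    out = [None] * min(n, len(values))  # first n slots have no predecessor
    out += [values[i] - values[i - n] for i in range(n, len(values))]
    if null_behavior == "drop":
        out = out[n:]  # discard the leading nulls
    return out

s = [20, 10, 30, 25, 35]
print(diff(s))                             # [None, -10, 20, -5, 10]
print(diff(s, n=2))                        # [None, None, 10, 15, 5]
print(diff(s, n=2, null_behavior="drop"))  # [10, 15, 5]
```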
dot(
other: Series | Sequence[Any] | Array | ChunkedArray | ndarray | Series | DatetimeIndex,
) float | None[source]

Compute the dot/inner product between two Series.

Parameters:
other

Series (or array) to compute dot product with.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [4.0, 5.0, 6.0])
>>> s.dot(s2)
32.0
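
For reference, the same inner product written out on plain lists (a sketch of the arithmetic only):

```python
a = [1, 2, 3]
b = [4.0, 5.0, 6.0]

# Multiply element-wise, then add up the products.
dot = sum(x * y for x, y in zip(a, b))
print(dot)  # 32.0
```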
drop_nans() Series[source]

Drop all floating point NaN values.

The original order of the remaining elements is preserved.

See also

drop_nulls

Notes

A NaN value is not the same as a null value. To drop null values, use drop_nulls().

Examples

>>> s = pl.Series([1.0, None, 3.0, float("nan")])
>>> s.drop_nans()
shape: (3,)
Series: '' [f64]
[
        1.0
        null
        3.0
]
drop_nulls() Series[source]

Drop all null values.

The original order of the remaining elements is preserved.

See also

drop_nans

Notes

A null value is not the same as a NaN value. To drop NaN values, use drop_nans().

Examples

>>> s = pl.Series([1.0, None, 3.0, float("nan")])
>>> s.drop_nulls()
shape: (3,)
Series: '' [f64]
[
        1.0
        3.0
        NaN
]
entropy(
base: float = 2.718281828459045,
*,
normalize: bool = False,
) float | None[source]

Computes the entropy.

Uses the formula -sum(pk * log(pk)), where pk are discrete probabilities.

Parameters:
base

Logarithm base; defaults to e.

normalize

Normalize pk if it doesn’t sum to 1.

Examples

>>> a = pl.Series([0.99, 0.005, 0.005])
>>> a.entropy(normalize=True)
0.06293300616044681
>>> b = pl.Series([0.65, 0.10, 0.25])
>>> b.entropy(normalize=True)
0.8568409950394724
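
The formula can be checked by hand. The following is a minimal plain-Python sketch, assuming strictly positive probabilities; the helper name entropy is illustrative and this is not the polars implementation.

```python
import math

def entropy(pk, base=math.e, normalize=False):
    """-sum(pk * log(pk)) in the given base, for strictly positive pk."""
    if normalize:
        total = sum(pk)
        pk = [p / total for p in pk]  # rescale so the values sum to 1
    return -sum(p * math.log(p, base) for p in pk)

print(entropy([0.65, 0.10, 0.25], normalize=True))  # approximately 0.856841
print(entropy([0.5, 0.5], base=2))                  # approximately 1.0 (one bit)
```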
eq(other: Any) Self | Expr[source]

Method equivalent of operator expression series == other.

eq_missing(other: Any) Self[source]
eq_missing(other: Expr) Expr

Method equivalent of equality operator series == other where None == None.

This differs from the standard eq where null values are propagated.

Parameters:
other

A literal or expression value to compare with.

See also

ne_missing
eq

Examples

>>> s1 = pl.Series("a", [333, 200, None])
>>> s2 = pl.Series("a", [100, 200, None])
>>> s1.eq(s2)
shape: (3,)
Series: 'a' [bool]
[
    false
    true
    null
]
>>> s1.eq_missing(s2)
shape: (3,)
Series: 'a' [bool]
[
    false
    true
    true
]
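
The difference between the two comparisons can be sketched on plain lists, using None for null. The helper names are illustrative; this mirrors the outputs above rather than the polars implementation.

```python
def eq(a, b):
    """Element-wise ==, propagating nulls (None) as in the s1.eq(s2) output."""
    return [None if x is None or y is None else x == y for x, y in zip(a, b)]

def eq_missing(a, b):
    """Element-wise ==, treating two nulls as equal."""
    return [x == y for x, y in zip(a, b)]

s1 = [333, 200, None]
s2 = [100, 200, None]
print(eq(s1, s2))          # [False, True, None]
print(eq_missing(s1, s2))  # [False, True, True]
```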
equals(
other: Series,
*,
null_equal: bool = True,
strict: bool = False,
) bool[source]

Check whether the Series is equal to another Series.

Parameters:
other

Series to compare with.

null_equal

Consider null values as equal.

strict

Don’t allow different numerical dtypes, e.g. comparing pl.UInt32 with a pl.Int64 will return False.

See also

assert_series_equal

Examples

>>> s1 = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [4, 5, 6])
>>> s1.equals(s1)
True
>>> s1.equals(s2)
False
estimated_size(unit: SizeUnit = 'b') int | float[source]

Return an estimation of the total (heap) allocated size of the Series.

Estimated size is given in the specified unit (bytes by default).

This estimation is the sum of the sizes of its buffers and validity bitmaps, including nested arrays. Multiple arrays may share buffers and bitmaps, so the size of two arrays is not the sum of the sizes computed from this function. In particular, the size of a StructArray is an upper bound.

When an array is sliced, its allocated size remains constant because the buffer is unchanged. However, this function will yield a smaller number, because it returns the visible size of the buffer, not its total capacity.

FFI buffers are included in this estimation.

Parameters:
unit{‘b’, ‘kb’, ‘mb’, ‘gb’, ‘tb’}

Scale the returned size to the given unit.

Examples

>>> s = pl.Series("values", list(range(1_000_000)), dtype=pl.UInt32)
>>> s.estimated_size()
4000000
>>> s.estimated_size("mb")
3.814697265625
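
The two figures above are consistent with units scaled by powers of 1024. That scaling is inferred from the example output rather than stated in the parameter description, so treat this arithmetic as a sketch:

```python
n_values = 1_000_000
bytes_per_value = 4  # a UInt32 value occupies 4 bytes

size_b = n_values * bytes_per_value
size_mb = size_b / 1024**2  # "mb" taken as 1024 * 1024 bytes (assumption)

print(size_b)   # 4000000
print(size_mb)  # 3.814697265625
```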
ewm_mean(
com: float | None = None,
span: float | None = None,
half_life: float | None = None,
alpha: float | None = None,
*,
adjust: bool = True,
min_periods: int = 1,
ignore_nulls: bool = True,
) Series[source]

Exponentially-weighted moving average.

Parameters:
com

Specify decay in terms of center of mass, \(\gamma\), with

\[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]
span

Specify decay in terms of span, \(\theta\), with

\[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]
half_life

Specify decay in terms of half-life, \(\lambda\), with

\[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0\]
alpha

Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\).

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings

  • When adjust=True the EW function is calculated using weights \(w_i = (1 - \alpha)^i\)

  • When adjust=False the EW function is calculated recursively by

    \[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]
min_periods

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

  • When ignore_nulls=False, weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.

  • When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.

Examples

>>> s = pl.Series([1, 2, 3])
>>> s.ewm_mean(com=1)
shape: (3,)
Series: '' [f64]
[
        1.0
        1.666667
        2.428571
]
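
Both adjust modes can be worked through on a plain list. This is a minimal sketch assuming null-free input and min_periods=1; the helper name ewm_mean is illustrative and this is not the polars implementation.

```python
def ewm_mean(xs, alpha, adjust=True):
    """Exponentially weighted mean of a null-free list."""
    out = []
    for t in range(len(xs)):
        if adjust:
            # Weights w_i = (1 - alpha)**i, where i counts back from x_t.
            weights = [(1 - alpha) ** i for i in range(t + 1)]
            num = sum(w * xs[t - i] for i, w in enumerate(weights))
            out.append(num / sum(weights))
        elif t == 0:
            out.append(float(xs[0]))  # y_0 = x_0
        else:
            # y_t = (1 - alpha) * y_{t-1} + alpha * x_t
            out.append((1 - alpha) * out[-1] + alpha * xs[t])
    return out

alpha = 1 / (1 + 1)  # com=1  ->  alpha = 1 / (1 + com)
print([round(y, 6) for y in ewm_mean([1, 2, 3], alpha)])
# [1.0, 1.666667, 2.428571]
print(ewm_mean([1, 2, 3], alpha, adjust=False))
# [1.0, 1.5, 2.25]
```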
ewm_std(
com: float | None = None,
span: float | None = None,
half_life: float | None = None,
alpha: float | None = None,
*,
adjust: bool = True,
bias: bool = False,
min_periods: int = 1,
ignore_nulls: bool = True,
) Series[source]

Exponentially-weighted moving standard deviation.

Parameters:
com

Specify decay in terms of center of mass, \(\gamma\), with

\[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]
span

Specify decay in terms of span, \(\theta\), with

\[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]
half_life

Specify decay in terms of half-life, \(\lambda\), with

\[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0\]
alpha

Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\).

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings

  • When adjust=True the EW function is calculated using weights \(w_i = (1 - \alpha)^i\)

  • When adjust=False the EW function is calculated recursively by

    \[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]
bias

When bias=False, apply a correction to make the estimate statistically unbiased.

min_periods

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

  • When ignore_nulls=False, weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.

  • When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.ewm_std(com=1)
shape: (3,)
Series: 'a' [f64]
[
    0.0
    0.707107
    0.963624
]
ewm_var(
com: float | None = None,
span: float | None = None,
half_life: float | None = None,
alpha: float | None = None,
*,
adjust: bool = True,
bias: bool = False,
min_periods: int = 1,
ignore_nulls: bool = True,
) Series[source]

Exponentially-weighted moving variance.

Parameters:
com

Specify decay in terms of center of mass, \(\gamma\), with

\[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]
span

Specify decay in terms of span, \(\theta\), with

\[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]
half_life

Specify decay in terms of half-life, \(\lambda\), with

\[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0\]
alpha

Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\).

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings

  • When adjust=True the EW function is calculated using weights \(w_i = (1 - \alpha)^i\)

  • When adjust=False the EW function is calculated recursively by

    \[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]
bias

When bias=False, apply a correction to make the estimate statistically unbiased.

min_periods

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

  • When ignore_nulls=False, weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.

  • When ignore_nulls=True (default), weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.ewm_var(com=1)
shape: (3,)
Series: 'a' [f64]
[
    0.0
    0.5
    0.928571
]
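
The bias correction is not spelled out above. The sketch below assumes the usual reliability-weights factor (Σw)² / ((Σw)² − Σw²) applied to the weighted variance, with adjust=True and null-free input. Under that assumption it reproduces the ewm_var output above, and its square root reproduces the ewm_std output. The helper name is illustrative and this is not the polars implementation.

```python
import math

def ewm_var(xs, alpha, bias=False):
    """Exponentially weighted variance (adjust=True, null-free input)."""
    out = []
    for t in range(len(xs)):
        weights = [(1 - alpha) ** i for i in range(t + 1)]  # newest first
        window = [xs[t - i] for i in range(t + 1)]
        sum_w = sum(weights)
        mean = sum(w * x for w, x in zip(weights, window)) / sum_w
        var = sum(w * (x - mean) ** 2 for w, x in zip(weights, window)) / sum_w
        if not bias:
            sum_w2 = sum(w * w for w in weights)
            denom = sum_w**2 - sum_w2
            # A single observation has no spread; the example above shows 0.0.
            var = var * sum_w**2 / denom if denom > 0 else 0.0
        out.append(var)
    return out

variances = ewm_var([1, 2, 3], alpha=0.5)  # com=1 -> alpha=0.5
print([round(v, 6) for v in variances])             # [0.0, 0.5, 0.928571]
print([round(math.sqrt(v), 6) for v in variances])  # [0.0, 0.707107, 0.963624]
```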
exp() Series[source]

Compute the exponential, element-wise.

explode() Series[source]

Explode a list Series.

This means that every item is expanded to a new row.

Returns:
Series

Series with the data type of the list elements.

See also

Series.list.explode

Explode a list column.

Series.str.explode

Explode a string column.

extend(other: Series) Self[source]

Extend the memory backed by this Series with the values from another.

Different from append, which adds the chunks from other to the chunks of this series, extend appends the data from other to the underlying memory locations and thus may cause a reallocation (which is expensive).

If this does not cause a reallocation, the resulting data structure will not have any extra chunks and thus will yield faster queries.

Prefer extend over append when you want to do a query after a single append. For instance, during online operations where you add n rows and rerun a query.

Prefer append over extend when you want to append many times before doing a query. For instance, when you read in multiple files and want to store them in a single Series. In the latter case, finish the sequence of append operations with a rechunk.

Parameters:
other

Series to extend the series with.

Warning

This method modifies the series in-place. The series is returned for convenience only.

See also

append

Examples

>>> a = pl.Series("a", [1, 2, 3])
>>> b = pl.Series("b", [4, 5])
>>> a.extend(b)
shape: (5,)
Series: 'a' [i64]
[
    1
    2
    3
    4
    5
]

The resulting series will consist of a single chunk.

>>> a.n_chunks()
1
extend_constant(value: PythonLiteral | None, n: int) Series[source]

Extremely fast method for extending the Series with ‘n’ copies of a value.

Parameters:
value

A constant literal value (not an expression) with which to extend the Series; can pass None to extend with nulls.

n

The number of additional values that will be added.

Examples

>>> s = pl.Series([1, 2, 3])
>>> s.extend_constant(99, n=2)
shape: (5,)
Series: '' [i64]
[
        1
        2
        3
        99
        99
]
fill_nan(value: int | float | Expr | None) Series[source]

Fill floating point NaN value with a fill value.

Parameters:
value

Value used to fill NaN values.

Examples

>>> s = pl.Series("a", [1, 2, 3, float("nan")])
>>> s.fill_nan(0)
shape: (4,)
Series: 'a' [f64]
[
        1.0
        2.0
        3.0
        0.0
]
fill_null(
value: Any | None = None,
strategy: FillNullStrategy | None = None,
limit: int | None = None,
) Series[source]

Fill null values using the specified value or strategy.

Parameters:
value

Value used to fill null values.

strategy{None, ‘forward’, ‘backward’, ‘min’, ‘max’, ‘mean’, ‘zero’, ‘one’}

Strategy used to fill null values.

limit

Number of consecutive null values to fill when using the ‘forward’ or ‘backward’ strategy.

Examples

>>> s = pl.Series("a", [1, 2, 3, None])
>>> s.fill_null(strategy="forward")
shape: (4,)
Series: 'a' [i64]
[
    1
    2
    3
    3
]
>>> s.fill_null(strategy="min")
shape: (4,)
Series: 'a' [i64]
[
    1
    2
    3
    1
]
>>> s = pl.Series("b", ["x", None, "z"])
>>> s.fill_null(pl.lit(""))
shape: (3,)
Series: 'b' [str]
[
    "x"
    ""
    "z"
]
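
How limit interacts with the 'forward' strategy can be sketched on a plain list, with None standing in for null. This is an illustration under that assumption, with a hypothetical helper name, not the polars implementation.

```python
def forward_fill(values, limit=None):
    """Replace None with the last non-null value, at most `limit` times in a row."""
    out, last, run = [], None, 0
    for v in values:
        if v is None:
            run += 1  # length of the current run of nulls
            if last is not None and (limit is None or run <= limit):
                out.append(last)
            else:
                out.append(None)
        else:
            last, run = v, 0
            out.append(v)
    return out

print(forward_fill([1, None, None, None, 5]))           # [1, 1, 1, 1, 5]
print(forward_fill([1, None, None, None, 5], limit=2))  # [1, 1, 1, None, 5]
print(forward_fill([None, 2, None]))                    # [None, 2, 2]
```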
filter(predicate: Series | list[bool]) Self[source]

Filter elements by a boolean mask.

The original order of the remaining elements is preserved.

Parameters:
predicate

Boolean mask.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> mask = pl.Series("", [True, False, True])
>>> s.filter(mask)
shape: (2,)
Series: 'a' [i64]
[
        1
        3
]
floor() Series[source]

Rounds down to the nearest integer value.

Only works on floating point Series.

Examples

>>> s = pl.Series("a", [1.12345, 2.56789, 3.901234])
>>> s.floor()
shape: (3,)
Series: 'a' [f64]
[
        1.0
        2.0
        3.0
]
gather(indices: int | list[int] | Expr | Series | np.ndarray[Any, Any]) Series[source]

Take values by index.

Parameters:
indices

Index location used for selection.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4])
>>> s.gather([1, 3])
shape: (2,)
Series: 'a' [i64]
[
        2
        4
]
gather_every(n: int) Series[source]

Take every nth value in the Series and return as new Series.

Parameters:
n

Gather every n-th row.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4])
>>> s.gather_every(2)
shape: (2,)
Series: 'a' [i64]
[
    1
    3
]
ge(other: Any) Self | Expr[source]

Method equivalent of operator expression series >= other.

get_chunks() list[Series][source]

Get the chunks of this Series as a list of Series.

gt(other: Any) Self | Expr[source]

Method equivalent of operator expression series > other.

has_validity() bool[source]

Return True if the Series has a validity bitmask.

If there is no mask, it means that there are no null values.

Notes

While the absence of a validity bitmask guarantees that a Series does not have null values, the converse is not true, e.g. the presence of a bitmask does not mean that there are null values, as every entry in the bitmask could mark its value as valid.

To confirm that a column has null values use null_count().

hash(
seed: int = 0,
seed_1: int | None = None,
seed_2: int | None = None,
seed_3: int | None = None,
) Series[source]

Hash the Series.

The hash value is of type UInt64.

Parameters:
seed

Random seed parameter. Defaults to 0.

seed_1

Random seed parameter. Defaults to seed if not set.

seed_2

Random seed parameter. Defaults to seed if not set.

seed_3

Random seed parameter. Defaults to seed if not set.

Notes

This implementation of hash() does not guarantee stable results across different Polars versions. Its stability is only guaranteed within a single version.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.hash(seed=42)  
shape: (3,)
Series: 'a' [u64]
[
    10734580197236529959
    3022416320763508302
    13756996518000038261
]
head(n: int = 10) Series[source]

Get the first n elements.

Parameters:
n

Number of elements to return. If a negative value is passed, return all elements except the last abs(n).

See also

tail, slice

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.head(3)
shape: (3,)
Series: 'a' [i64]
[
        1
        2
        3
]

Pass a negative value to get all rows except the last abs(n).

>>> s.head(-3)
shape: (2,)
Series: 'a' [i64]
[
        1
        2
]
hist(
bins: list[float] | None = None,
*,
bin_count: int | None = None,
) DataFrame[source]

Bin values into buckets and count their occurrences.

Parameters:
bins

Discretizations to make. If None given, we determine the boundaries based on the data.

bin_count

If no bins are provided, this is used to determine the distance between the bins.

Returns:
DataFrame

Warning

This functionality is experimental and may change without it being considered a breaking change.

Examples

>>> a = pl.Series("a", [1, 3, 8, 8, 2, 1, 3])
>>> a.hist(bin_count=4)
shape: (5, 3)
┌─────────────┬─────────────┬─────────┐
│ break_point ┆ category    ┆ a_count │
│ ---         ┆ ---         ┆ ---     │
│ f64         ┆ cat         ┆ u32     │
╞═════════════╪═════════════╪═════════╡
│ 0.0         ┆ (-inf, 0.0] ┆ 0       │
│ 2.25        ┆ (0.0, 2.25] ┆ 3       │
│ 4.5         ┆ (2.25, 4.5] ┆ 2       │
│ 6.75        ┆ (4.5, 6.75] ┆ 0       │
│ inf         ┆ (6.75, inf] ┆ 2       │
└─────────────┴─────────────┴─────────┘
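
The counts in the table follow from right-closed intervals (lower, upper]. The sketch below takes the break points from the output above as given (it does not derive them) and recounts the values with the standard library bisect module:

```python
from bisect import bisect_left

values = [1, 3, 8, 8, 2, 1, 3]
break_points = [0.0, 2.25, 4.5, 6.75]  # copied from the table; inf closes the last bin

counts = [0] * (len(break_points) + 1)
for v in values:
    # bisect_left returns the first bin whose upper edge is >= v,
    # which makes every interval right-closed: (lower, upper].
    counts[bisect_left(break_points, v)] += 1

print(counts)  # [0, 3, 2, 0, 2]
```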
implode() Self[source]

Aggregate values into a list.

interpolate(method: InterpolationMethod = 'linear') Series[source]

Fill null values using interpolation.

Parameters:
method{‘linear’, ‘nearest’}

Interpolation method.

Examples

>>> s = pl.Series("a", [1, 2, None, None, 5])
>>> s.interpolate()
shape: (5,)
Series: 'a' [f64]
[
    1.0
    2.0
    3.0
    4.0
    5.0
]
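
The 'linear' method can be sketched on a plain list, with None standing in for null. This illustration only fills gaps that have a known value on both sides and leaves leading or trailing nulls alone, which is an assumption of the sketch; the helper name is hypothetical.

```python
def interpolate_linear(values):
    """Fill interior None gaps on a straight line between their neighbours."""
    out = [None if v is None else float(v) for v in values]
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out

print(interpolate_linear([1, 2, None, None, 5]))  # [1.0, 2.0, 3.0, 4.0, 5.0]
```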
is_between(
lower_bound: IntoExpr,
upper_bound: IntoExpr,
closed: ClosedInterval = 'both',
) Series[source]

Get a boolean mask of the values that fall between the given start/end values.

Parameters:
lower_bound

Lower bound value. Accepts expression input. Non-expression inputs (including strings) are parsed as literals.

upper_bound

Upper bound value. Accepts expression input. Non-expression inputs (including strings) are parsed as literals.

closed{‘both’, ‘left’, ‘right’, ‘none’}

Define which sides of the interval are closed (inclusive).

Examples

>>> s = pl.Series("num", [1, 2, 3, 4, 5])
>>> s.is_between(2, 4)
shape: (5,)
Series: 'num' [bool]
[
    false
    true
    true
    true
    false
]

Use the closed argument to include or exclude the values at the bounds:

>>> s.is_between(2, 4, closed="left")
shape: (5,)
Series: 'num' [bool]
[
    false
    true
    true
    false
    false
]

You can also use strings as well as numeric/temporal values:

>>> s = pl.Series("s", ["a", "b", "c", "d", "e"])
>>> s.is_between("b", "d", closed="both")
shape: (5,)
Series: 's' [bool]
[
    false
    true
    true
    true
    false
]
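
All four closed options can be compared side by side with a plain-Python predicate. This is a sketch with an illustrative helper name, not the polars implementation.

```python
def is_between(x, lower, upper, closed="both"):
    """Interval membership test for one value."""
    left_ok = x >= lower if closed in ("both", "left") else x > lower
    right_ok = x <= upper if closed in ("both", "right") else x < upper
    return left_ok and right_ok

nums = [1, 2, 3, 4, 5]
for closed in ("both", "left", "right", "none"):
    print(closed, [is_between(x, 2, 4, closed) for x in nums])
# both [False, True, True, True, False]
# left [False, True, True, False, False]
# right [False, False, True, True, False]
# none [False, False, True, False, False]
```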
is_boolean() bool[source]

Check if this Series is a Boolean.

Deprecated since version 0.19.14: Use Series.dtype == pl.Boolean instead.

Examples

>>> s = pl.Series("a", [True, False, True])
>>> s.is_boolean()  
True
is_duplicated() Series[source]

Get mask of all duplicated values.

Returns:
Series

Series of data type Boolean.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.is_duplicated()
shape: (4,)
Series: 'a' [bool]
[
        false
        true
        true
        false
]
is_empty() bool[source]

Check if the Series is empty.

Examples

>>> s = pl.Series("a", [], dtype=pl.Float32)
>>> s.is_empty()
True
is_finite() Series[source]

Returns a boolean Series indicating which values are finite.

Returns:
Series

Series of data type Boolean.

Examples

>>> import numpy as np
>>> s = pl.Series("a", [1.0, 2.0, np.inf])
>>> s.is_finite()
shape: (3,)
Series: 'a' [bool]
[
        true
        true
        false
]
is_first() Series[source]

Return a boolean mask indicating the first occurrence of each distinct value.

Deprecated since version 0.19.3: This method has been renamed to Series.is_first_distinct().

Returns:
Series

Series of data type Boolean.

is_first_distinct() Series[source]

Return a boolean mask indicating the first occurrence of each distinct value.

Returns:
Series

Series of data type Boolean.

Examples

>>> s = pl.Series([1, 1, 2, 3, 2])
>>> s.is_first_distinct()
shape: (5,)
Series: '' [bool]
[
        true
        false
        true
        true
        false
]
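
To see how this mask differs from the one returned by is_duplicated, both can be sketched on a plain list. The helper names mirror the methods for readability only; this is not the polars implementation.

```python
from collections import Counter

def is_first_distinct(values):
    seen = set()
    mask = []
    for v in values:
        mask.append(v not in seen)  # True only the first time v appears
        seen.add(v)
    return mask

def is_duplicated(values):
    counts = Counter(values)
    return [counts[v] > 1 for v in values]  # True for every copy of a repeated value

data = [1, 1, 2, 3, 2]
print(is_first_distinct(data))  # [True, False, True, True, False]
print(is_duplicated(data))      # [True, True, True, False, True]
```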
is_float() bool[source]

Check if this Series has floating point numbers.

Deprecated since version 0.19.13: Use Series.dtype.is_float() instead.

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0])
>>> s.is_float()  
True
is_in(
other: Series | Collection[Any],
) Series[source]

Check if elements of this Series are in the other Series.

Returns:
Series

Series of data type Boolean.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [2, 4])
>>> s2.is_in(s)
shape: (2,)
Series: 'b' [bool]
[
        true
        false
]
>>> # check if some values are a member of sublists
>>> sets = pl.Series("sets", [[1, 2, 3], [1, 2], [9, 10]])
>>> optional_members = pl.Series("optional_members", [1, 2, 3])
>>> print(sets)
shape: (3,)
Series: 'sets' [list[i64]]
[
    [1, 2, 3]
    [1, 2]
    [9, 10]
]
>>> print(optional_members)
shape: (3,)
Series: 'optional_members' [i64]
[
    1
    2
    3
]
>>> optional_members.is_in(sets)
shape: (3,)
Series: 'optional_members' [bool]
[
    true
    true
    false
]
is_infinite() Series[source]

Returns a boolean Series indicating which values are infinite.

Returns:
Series

Series of data type Boolean.

Examples

>>> import numpy as np
>>> s = pl.Series("a", [1.0, 2.0, np.inf])
>>> s.is_infinite()
shape: (3,)
Series: 'a' [bool]
[
        false
        false
        true
]
is_integer(signed: bool | None = None) bool[source]

Check if this Series datatype is an integer (signed or unsigned).

Deprecated since version 0.19.13: Use Series.dtype.is_integer() instead. For signed/unsigned variants, use Series.dtype.is_signed_integer() or Series.dtype.is_unsigned_integer().

Parameters:
signed
  • if None, both signed and unsigned integer dtypes will match.

  • if True, only signed integer dtypes will be considered a match.

  • if False, only unsigned integer dtypes will be considered a match.

Examples

>>> s = pl.Series("a", [1, 2, 3], dtype=pl.UInt32)
>>> s.is_integer()  
True
>>> s.is_integer(signed=False)  
True
>>> s.is_integer(signed=True)  
False
is_last() Series[source]

Return a boolean mask indicating the last occurrence of each distinct value.

Deprecated since version 0.19.3: This method has been renamed to Series.is_last_distinct().

Returns:
Series

Series of data type Boolean.

is_last_distinct() Series[source]

Return a boolean mask indicating the last occurrence of each distinct value.

Returns:
Series

Series of data type Boolean.

Examples

>>> s = pl.Series([1, 1, 2, 3, 2])
>>> s.is_last_distinct()
shape: (5,)
Series: '' [bool]
[
        false
        true
        false
        true
        true
]
is_nan() Series[source]

Returns a boolean Series indicating which values are NaN.

Returns:
Series

Series of data type Boolean.

Examples

>>> import numpy as np
>>> s = pl.Series("a", [1.0, 2.0, 3.0, np.nan])
>>> s.is_nan()
shape: (4,)
Series: 'a' [bool]
[
        false
        false
        false
        true
]
is_not_nan() Series[source]

Returns a boolean Series indicating which values are not NaN.

Returns:
Series

Series of data type Boolean.

Examples

>>> import numpy as np
>>> s = pl.Series("a", [1.0, 2.0, 3.0, np.nan])
>>> s.is_not_nan()
shape: (4,)
Series: 'a' [bool]
[
        true
        true
        true
        false
]
is_not_null() Series[source]

Returns a boolean Series indicating which values are not null.

Returns:
Series

Series of data type Boolean.

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, None])
>>> s.is_not_null()
shape: (4,)
Series: 'a' [bool]
[
    true
    true
    true
    false
]
is_null() Series[source]

Returns a boolean Series indicating which values are null.

Returns:
Series

Series of data type Boolean.

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, None])
>>> s.is_null()
shape: (4,)
Series: 'a' [bool]
[
    false
    false
    false
    true
]
is_numeric() bool[source]

Check if this Series datatype is numeric.

Deprecated since version 0.19.13: Use Series.dtype.is_numeric() instead.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.is_numeric()  
True
is_sorted(*, descending: bool = False) bool[source]

Check if the Series is sorted.

Parameters:
descending

Check if the Series is sorted in descending order.

is_temporal(excluding: OneOrMoreDataTypes | None = None) bool[source]

Check if this Series datatype is temporal.

Deprecated since version 0.19.13: Use Series.dtype.is_temporal() instead.

Parameters:
excluding

Optionally exclude one or more temporal dtypes from matching.

Examples

>>> from datetime import date
>>> s = pl.Series([date(2021, 1, 1), date(2021, 1, 2), date(2021, 1, 3)])
>>> s.is_temporal()  
True
>>> s.is_temporal(excluding=[pl.Date])  
False
is_unique() Series[source]

Get mask of all unique values.

Returns:
Series

Series of data type Boolean.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.is_unique()
shape: (4,)
Series: 'a' [bool]
[
        true
        false
        false
        true
]
is_utf8() bool[source]

Check if this Series datatype is a Utf8.

Deprecated since version 0.19.14: Use Series.dtype == pl.Utf8 instead.

Examples

>>> s = pl.Series("x", ["a", "b", "c"])
>>> s.is_utf8()  
True
item(index: int | None = None) Any[source]

Return the Series as a scalar, or return the element at the given index.

If no index is provided, this is equivalent to s[0], with a check that the shape is (1,). With an index, this is equivalent to s[index].

Examples

>>> s1 = pl.Series("a", [1])
>>> s1.item()
1
>>> s2 = pl.Series("a", [9, 8, 7])
>>> s2.cum_sum().item(-1)
24
kurtosis(*, fisher: bool = True, bias: bool = True) float | None[source]

Compute the kurtosis (Fisher or Pearson) of a dataset.

Kurtosis is the fourth central moment divided by the square of the variance. If Fisher’s definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution. If bias is False, then the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators.

See scipy.stats for more information.

Parameters:
fisherbool, optional

If True, Fisher’s definition is used (normal ==> 0.0). If False, Pearson’s definition is used (normal ==> 3.0).

biasbool, optional

If False, the calculations are corrected for statistical bias.
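
The biased (bias=True) definition can be worked through by hand. This is a minimal plain-Python sketch with an illustrative helper name; it does not cover the bias=False correction.

```python
def kurtosis(xs, fisher=True):
    """Biased kurtosis: m4 / m2**2, minus 3 under Fisher's definition."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance (second central moment)
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    pearson = m4 / m2**2
    return pearson - 3.0 if fisher else pearson

data = [1, 2, 3, 4, 5]
print(kurtosis(data, fisher=False))  # approximately 1.7
print(kurtosis(data))                # approximately -1.3
```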

le(other: Any) Self | Expr[source]

Method equivalent of operator expression series <= other.

len() int[source]

Return the number of elements in this Series.

Null values are treated like regular elements in this context.

Examples

>>> s = pl.Series("a", [1, 2, None])
>>> s.len()
3
limit(n: int = 10) Series[source]

Get the first n elements.

Alias for Series.head().

Parameters:
n

Number of elements to return. If a negative value is passed, return all elements except the last abs(n).

See also

head
log(base: float = 2.718281828459045) Series[source]

Compute the logarithm to a given base.

log10() Series[source]

Compute the base 10 logarithm of the input array, element-wise.

log1p() Series[source]

Compute the natural logarithm of the input array plus one, element-wise.

lower_bound() Self[source]

Return the lower bound of this Series’ dtype as a unit Series.

See also

upper_bound

Return the upper bound of the given Series’ dtype.

Examples

>>> s = pl.Series("s", [-1, 0, 1], dtype=pl.Int32)
>>> s.lower_bound()
shape: (1,)
Series: 's' [i32]
[
    -2147483648
]
>>> s = pl.Series("s", [1.0, 2.5, 3.0], dtype=pl.Float32)
>>> s.lower_bound()
shape: (1,)
Series: 's' [f32]
[
    -inf
]
lt(other: Any) Self | Expr[source]

Method equivalent of operator expression series < other.

map_dict(
mapping: dict[Any, Any],
*,
default: Any = None,
return_dtype: PolarsDataType | None = None,
) Self[source]

Replace values in the Series using a remapping dictionary.

Deprecated since version 0.19.16: This method has been renamed to replace(). The default behavior has changed to keep any values not present in the mapping unchanged. Pass default=None to keep existing behavior.

Parameters:
mapping

Dictionary containing the before/after values to map.

default

Value to use when the remapping dict does not contain the lookup value. Use pl.first() to keep the original value.

return_dtype

Set return dtype to override automatic return dtype determination.

map_elements(
function: Callable[[Any], Any],
return_dtype: PolarsDataType | None = None,
*,
skip_nulls: bool = True,
) Self[source]

Map a custom/user-defined function (UDF) over elements in this Series.

Warning

This method is much slower than the native expressions API. Only use it if you cannot implement your logic otherwise.

If the function returns a different datatype, the return_dtype arg should be set, otherwise the method will fail.

Implementing logic using a Python function is almost always significantly slower and more memory intensive than implementing the same logic using the native expression API because:

  • The native expression engine runs in Rust; UDFs run in Python.

  • Use of Python UDFs forces the DataFrame to be materialized in memory.

  • Polars-native expressions can be parallelised (UDFs typically cannot).

  • Polars-native expressions can be logically optimised (UDFs cannot).

Wherever possible you should strongly prefer the native expression API to achieve the best performance.

Parameters:
function

Custom function or lambda.

return_dtype

Output datatype. If none is given, the same datatype as this Series will be used.

skip_nulls

Nulls will be skipped and not passed to the python function. This is faster because python can be skipped and because we call more specialized functions.

Returns:
Series

Warning

If return_dtype is not provided, this may lead to unexpected results. We allow this, but it is considered a bug in the user’s query.

Notes

If your function is expensive and you don’t want it to be called more than once for a given input, consider applying an @lru_cache decorator to it. If your data is suitable you may achieve significant speedups.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.map_elements(lambda x: x + 10)  
shape: (3,)
Series: 'a' [i64]
[
        11
        12
        13
]
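
The @lru_cache suggestion from the Notes can be tried without polars. In this sketch a list comprehension stands in for the element-wise mapping, and the function expensive is a hypothetical placeholder for a costly pure function; caching of this kind needs hashable inputs.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive(x):
    """Stand-in for a costly pure function."""
    return x * x

data = [3, 3, 3, 7, 7]
# The same cached function could be passed to s.map_elements(expensive).
result = [expensive(x) for x in data]
info = expensive.cache_info()

print(result)                  # [9, 9, 9, 49, 49]
print(info.misses, info.hits)  # 2 3
```

The function body runs once per distinct input (two misses); the remaining three calls are served from the cache.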
max() PythonLiteral | None[source]

Get the maximum value in this Series.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.max()
3
mean() int | float | None[source]

Reduce this Series to the mean value.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.mean()
2.0
median() float | None[source]

Get the median of this Series.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.median()
2.0
min() PythonLiteral | None[source]

Get the minimal value in this Series.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.min()
1
mode() Series[source]

Compute the most occurring value(s).

Can return multiple values.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.mode()
shape: (1,)
Series: 'a' [i64]
[
        2
]
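
A tie produces more than one result. The sketch below shows this on plain lists; the helper name is illustrative, and the first-seen ordering is a choice of this sketch rather than something the method guarantees.

```python
from collections import Counter

def mode(values):
    """All values that share the highest count, in first-seen order."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]

print(mode([1, 2, 2, 3]))     # [2]
print(mode([1, 1, 2, 2, 3]))  # [1, 2]
```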
n_chunks() int[source]

Get the number of chunks that this Series contains.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.n_chunks()
1
>>> s2 = pl.Series("a", [4, 5, 6])

Concatenate Series with rechunk = True

>>> pl.concat([s, s2]).n_chunks()
1

Concatenate Series with rechunk = False

>>> pl.concat([s, s2], rechunk=False).n_chunks()
2
n_unique() int[source]

Count the number of unique values in this Series.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.n_unique()
3
nan_max() int | float | date | datetime | timedelta | str[source]

Get maximum value, but propagate/poison encountered NaN values.

This differs from numpy’s nanmax, which ignores NaN values: numpy’s max propagates NaN values by default, whereas polars’ max ignores them by default.
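
The two behaviours can be contrasted on a plain list of floats. This is a sketch with hypothetical helper names that follows the description above; it is not the polars implementation.

```python
import math

def max_ignoring_nan(xs):
    """Maximum with NaN values skipped."""
    return max(x for x in xs if not math.isnan(x))

def max_propagating_nan(xs):
    """Maximum where any NaN poisons the result."""
    return math.nan if any(math.isnan(x) for x in xs) else max(xs)

data = [1.0, float("nan"), 3.0]
print(max_ignoring_nan(data))     # 3.0
print(max_propagating_nan(data))  # nan
```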

nan_min() int | float | date | datetime | timedelta | str[source]

Get minimum value, but propagate/poison encountered NaN values.

This differs from numpy’s nanmin, which ignores NaN values: numpy’s min propagates NaN values by default, whereas polars’ min ignores them by default.

ne(other: Any) Self | Expr[source]

Method equivalent of operator expression series != other.

ne_missing(other: Expr) Expr[source]
ne_missing(other: Any) Self

Method equivalent of inequality operator series != other where None == None.

This differs from the standard ne where null values are propagated.

Parameters:
other

A literal or expression value to compare with.

See also

eq_missing
ne

Examples

>>> s1 = pl.Series("a", [333, 200, None])
>>> s2 = pl.Series("a", [100, 200, None])
>>> s1.ne(s2)
shape: (3,)
Series: 'a' [bool]
[
    true
    false
    null
]
>>> s1.ne_missing(s2)
shape: (3,)
Series: 'a' [bool]
[
    true
    false
    false
]
new_from_index(index: int, length: int) Self[source]

Create a new Series filled with values from the given index.

not_() Series[source]

Negate a boolean Series.

Returns:
Series

Series of data type Boolean.

Examples

>>> s = pl.Series("a", [True, False, False])
>>> s.not_()
shape: (3,)
Series: 'a' [bool]
[
    false
    true
    true
]
null_count() int[source]

Count the null values in this Series.

pct_change(n: int | IntoExprColumn = 1) Series[source]

Computes percentage change between values.

Percentage change (as fraction) between current element and most-recent non-null element at least n period(s) before the current element.

Computes the change from the previous row by default.

Parameters:
n

Number of periods to shift when computing the percent change.

Examples

>>> pl.Series(range(10)).pct_change()
shape: (10,)
Series: '' [f64]
[
    null
    inf
    1.0
    0.5
    0.333333
    0.25
    0.2
    0.166667
    0.142857
    0.125
]
>>> pl.Series([1, 2, 4, 8, 16, 32, 64, 128, 256, 512]).pct_change(2)
shape: (10,)
Series: '' [f64]
[
    null
    null
    3.0
    3.0
    3.0
    3.0
    3.0
    3.0
    3.0
    3.0
]
peak_max() Self[source]

Get a boolean mask of the local maximum peaks.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.peak_max()
shape: (5,)
Series: 'a' [bool]
[
        false
        false
        false
        false
        true
]
peak_min() Self[source]

Get a boolean mask of the local minimum peaks.

Examples

>>> s = pl.Series("a", [4, 1, 3, 2, 5])
>>> s.peak_min()
shape: (5,)
Series: 'a' [bool]
[
    false
    true
    false
    true
    false
]
pow(
exponent: int | float | None | Series,
) Series[source]

Raise to the power of the given exponent.

Parameters:
exponent

The exponent. Accepts Series input.

Examples

>>> s = pl.Series("foo", [1, 2, 3, 4])
>>> s.pow(3)
shape: (4,)
Series: 'foo' [f64]
[
        1.0
        8.0
        27.0
        64.0
]
product() int | float[source]

Reduce this Series to the product value.

qcut(
quantiles: Sequence[float] | int,
*,
labels: Sequence[str] | None = None,
left_closed: bool = False,
allow_duplicates: bool = False,
include_breaks: bool = False,
break_point_label: str = 'break_point',
category_label: str = 'category',
as_series: Literal[True] = True,
) Series[source]
qcut(
quantiles: Sequence[float] | int,
*,
labels: Sequence[str] | None = None,
left_closed: bool = False,
allow_duplicates: bool = False,
include_breaks: bool = False,
break_point_label: str = 'break_point',
category_label: str = 'category',
as_series: Literal[False],
) DataFrame
qcut(
quantiles: Sequence[float] | int,
*,
labels: Sequence[str] | None = None,
left_closed: bool = False,
allow_duplicates: bool = False,
include_breaks: bool = False,
break_point_label: str = 'break_point',
category_label: str = 'category',
as_series: bool,
) Series | DataFrame

Bin continuous values into discrete categories based on their quantiles.

Parameters:
quantiles

Either a list of quantile probabilities between 0 and 1 or a positive integer determining the number of bins with uniform probability.

labels

Names of the categories. The number of labels must be equal to the number of cut points plus one.

left_closed

Set the intervals to be left-closed instead of right-closed.

allow_duplicates

If set to True, duplicates in the resulting quantiles are dropped, rather than raising a DuplicateError. This can happen even with unique probabilities, depending on the data.

include_breaks

Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct.

break_point_label

Name of the breakpoint column. Only used if include_breaks is set to True.

Deprecated since version 0.19.0: This parameter will be removed. Use Series.struct.rename_fields to rename the field instead.

category_label

Name of the category column. Only used if include_breaks is set to True.

Deprecated since version 0.19.0: This parameter will be removed. Use Series.struct.rename_fields to rename the field instead.

as_series

If set to False, return a DataFrame containing the original values, the breakpoints, and the categories.

Deprecated since version 0.19.0: This parameter will be removed. The same behavior can be achieved by setting include_breaks=True, unnesting the resulting struct Series, and adding the result to the original Series.

Returns:
Series

Series of data type Categorical if include_breaks is set to False (default), otherwise a Series of data type Struct.

Warning

This functionality is experimental and may change without it being considered a breaking change.

See also

cut

Examples

Divide a column into three categories according to pre-defined quantile probabilities.

>>> s = pl.Series("foo", [-2, -1, 0, 1, 2])
>>> s.qcut([0.25, 0.75], labels=["a", "b", "c"])
shape: (5,)
Series: 'foo' [cat]
[
        "a"
        "a"
        "b"
        "b"
        "c"
]

Divide a column into two categories using uniform quantile probabilities.

>>> s.qcut(2, labels=["low", "high"], left_closed=True)
shape: (5,)
Series: 'foo' [cat]
[
        "low"
        "low"
        "high"
        "high"
        "high"
]

Create a DataFrame with the breakpoint and category for each value.

>>> cut = s.qcut([0.25, 0.75], include_breaks=True).alias("cut")
>>> s.to_frame().with_columns(cut).unnest("cut")
shape: (5, 3)
┌─────┬─────────────┬────────────┐
│ foo ┆ break_point ┆ category   │
│ --- ┆ ---         ┆ ---        │
│ i64 ┆ f64         ┆ cat        │
╞═════╪═════════════╪════════════╡
│ -2  ┆ -1.0        ┆ (-inf, -1] │
│ -1  ┆ -1.0        ┆ (-inf, -1] │
│ 0   ┆ 1.0         ┆ (-1, 1]    │
│ 1   ┆ 1.0         ┆ (-1, 1]    │
│ 2   ┆ inf         ┆ (1, inf]   │
└─────┴─────────────┴────────────┘
quantile(
quantile: float,
interpolation: RollingInterpolationMethod = 'nearest',
) float | None[source]

Get the quantile value of this Series.

Parameters:
quantile

Quantile between 0.0 and 1.0.

interpolation{‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’}

Interpolation method.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.quantile(0.5)
2.0
rank(
method: RankMethod = 'average',
*,
descending: bool = False,
seed: int | None = None,
) Series[source]

Assign ranks to data, dealing with ties appropriately.

Parameters:
method{‘average’, ‘min’, ‘max’, ‘dense’, ‘ordinal’, ‘random’}

The method used to assign ranks to tied elements. The following methods are available (default is ‘average’):

  • ‘average’ : The average of the ranks that would have been assigned to all the tied values is assigned to each value.

  • ‘min’ : The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.)

  • ‘max’ : The maximum of the ranks that would have been assigned to all the tied values is assigned to each value.

  • ‘dense’ : Like ‘min’, but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements.

  • ‘ordinal’ : All values are given a distinct rank, corresponding to the order that the values occur in the Series.

  • ‘random’ : Like ‘ordinal’, but the rank for ties is not dependent on the order that the values occur in the Series.

descending

Rank in descending order.

seed

If method="random", use this as seed.

Examples

The ‘average’ method:

>>> s = pl.Series("a", [3, 6, 1, 1, 6])
>>> s.rank()
shape: (5,)
Series: 'a' [f64]
[
    3.0
    4.5
    1.5
    1.5
    4.5
]

The ‘ordinal’ method:

>>> s = pl.Series("a", [3, 6, 1, 1, 6])
>>> s.rank("ordinal")
shape: (5,)
Series: 'a' [u32]
[
    3
    4
    1
    2
    5
]
rechunk(*, in_place: bool = False) Self[source]

Create a single chunk of memory for this Series.

Parameters:
in_place

In place or not.

reinterpret(*, signed: bool = True) Series[source]

Reinterpret the underlying bits as a signed/unsigned integer.

This operation is only allowed for 64-bit integers. For integers with fewer bits, you can safely use the cast operation.

Parameters:
signed

If True, reinterpret as pl.Int64. Otherwise, reinterpret as pl.UInt64.

rename(name: str) Series[source]

Rename this Series.

Alias for Series.alias().

Parameters:
name

New name.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.rename("b")
shape: (3,)
Series: 'b' [i64]
[
        1
        2
        3
]
replace(
mapping: dict[Any, Any],
*,
default: Any = _NoDefault.no_default,
return_dtype: PolarsDataType | None = None,
) Self[source]

Replace values according to the given mapping.

Needs a global string cache for lazily evaluated queries on columns of type Categorical.

Parameters:
mapping

Mapping of values to their replacement.

default

Value to use when the mapping does not contain the lookup value. Defaults to keeping the original value.

return_dtype

Set return dtype to override automatic return dtype determination.

See also

str.replace

Examples

Replace a single value by another value. Values not in the mapping remain unchanged.

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.replace({2: 100})
shape: (4,)
Series: 'a' [i64]
[
        1
        100
        100
        3
]

Replace multiple values. Specify a default to set values not in the given map to the default value.

>>> s = pl.Series("country_code", ["FR", "ES", "DE", None])
>>> country_code_map = {
...     "CA": "Canada",
...     "DE": "Germany",
...     "FR": "France",
...     None: "unspecified",
... }
>>> s.replace(country_code_map, default=None)
shape: (4,)
Series: 'country_code' [str]
[
        "France"
        null
        "Germany"
        "unspecified"
]

The return type can be overridden with the return_dtype argument.

>>> s = pl.Series("a", [0, 1, 2, 3])
>>> s.replace({1: 10, 2: 20}, default=0, return_dtype=pl.UInt8)
shape: (4,)
Series: 'a' [u8]
[
        0
        10
        20
        0
]
reshape(dimensions: tuple[int, ...]) Series[source]

Reshape this Series to a flat Series or a Series of Lists.

Parameters:
dimensions

Tuple of the dimension sizes. If a -1 is used in any of the dimensions, that dimension is inferred.

Returns:
Series

If a single dimension is given, results in a Series of the original data type. If multiple dimensions are given, results in a Series of data type List with shape (rows, cols).

See also

Series.list.explode

Explode a list column.

Examples

>>> s = pl.Series("foo", [1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> s.reshape((3, 3))
shape: (3,)
Series: 'foo' [list[i64]]
[
        [1, 2, 3]
        [4, 5, 6]
        [7, 8, 9]
]
reverse() Series[source]

Return Series in reverse order.

Examples

>>> s = pl.Series("a", [1, 2, 3], dtype=pl.Int8)
>>> s.reverse()
shape: (3,)
Series: 'a' [i8]
[
    3
    2
    1
]
rle() Series[source]

Get the lengths of runs of identical values.

Returns:
Series

Series of data type Struct with Fields “lengths” and “values”.

Examples

>>> s = pl.Series("s", [1, 1, 2, 1, None, 1, 3, 3])
>>> s.rle().struct.unnest()
shape: (6, 2)
┌─────────┬────────┐
│ lengths ┆ values │
│ ---     ┆ ---    │
│ i32     ┆ i64    │
╞═════════╪════════╡
│ 2       ┆ 1      │
│ 1       ┆ 2      │
│ 1       ┆ 1      │
│ 1       ┆ null   │
│ 1       ┆ 1      │
│ 2       ┆ 3      │
└─────────┴────────┘
rle_id() Series[source]

Map values to run IDs.

Similar to RLE, but it maps each value to an ID corresponding to the run into which it falls. This is especially useful when you want to define groups by runs of identical values rather than the values themselves.

Returns:
Series

See also

rle

Examples

>>> s = pl.Series("s", [1, 1, 2, 1, None, 1, 3, 3])
>>> s.rle_id()
shape: (8,)
Series: 's' [u32]
[
    0
    0
    1
    2
    3
    4
    5
    5
]
rolling_apply(
function: Callable[[Series], Any],
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
) Series[source]

Apply a custom rolling window function.

Deprecated since version 0.19.0: This method has been renamed to Series.rolling_map().

Parameters:
function

Aggregation function

window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

rolling_map(
function: Callable[[Series], Any],
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
) Series[source]

Compute a custom rolling window function.

Warning

Computing custom functions is extremely slow. Use specialized rolling functions such as Series.rolling_sum() if at all possible.

Parameters:
function

Custom aggregation function.

window_size

Size of the window. The window at a given row will include the row itself and the window_size - 1 elements before it.

weights

A list of weights with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window.

Examples

>>> from numpy import nansum
>>> s = pl.Series([11.0, 2.0, 9.0, float("nan"), 8.0])
>>> s.rolling_map(nansum, window_size=3)
shape: (5,)
Series: '' [f64]
[
        null
        null
        22.0
        11.0
        17.0
]
rolling_max(
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
) Series[source]

Apply a rolling max (moving max) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their max.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [100, 200, 300, 400, 500])
>>> s.rolling_max(window_size=2)
shape: (5,)
Series: 'a' [i64]
[
    null
    200
    300
    400
    500
]
rolling_mean(
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
) Series[source]

Apply a rolling mean (moving mean) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their mean.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [100, 200, 300, 400, 500])
>>> s.rolling_mean(window_size=2)
shape: (5,)
Series: 'a' [f64]
[
    null
    150.0
    250.0
    350.0
    450.0
]
rolling_median(
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
) Series[source]

Compute a rolling median.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

The window at a given row will include the row itself and the window_size - 1 elements before it.

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_median(window_size=3)
shape: (6,)
Series: 'a' [f64]
[
        null
        null
        2.0
        3.0
        4.0
        6.0
]
rolling_min(
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
) Series[source]

Apply a rolling min (moving min) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their min.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [100, 200, 300, 400, 500])
>>> s.rolling_min(window_size=3)
shape: (5,)
Series: 'a' [i64]
[
    null
    null
    100
    200
    300
]
rolling_quantile(
quantile: float,
interpolation: RollingInterpolationMethod = 'nearest',
window_size: int = 2,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
) Series[source]

Compute a rolling quantile.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Parameters:
quantile

Quantile between 0.0 and 1.0.

interpolation{‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’}

Interpolation method.

window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_quantile(quantile=0.33, window_size=3)
shape: (6,)
Series: 'a' [f64]
[
        null
        null
        1.0
        2.0
        3.0
        4.0
]
>>> s.rolling_quantile(quantile=0.33, interpolation="linear", window_size=3)
shape: (6,)
Series: 'a' [f64]
[
        null
        null
        1.66
        2.66
        3.66
        5.32
]
rolling_skew(
window_size: int,
*,
bias: bool = True,
) Series[source]

Compute a rolling skew.

The window at a given row includes the row itself and the window_size - 1 elements before it.

Parameters:
window_size

Integer size of the rolling window.

bias

If False, the calculations are corrected for statistical bias.

Examples

>>> pl.Series([1, 4, 2, 9]).rolling_skew(3)
shape: (4,)
Series: '' [f64]
[
    null
    null
    0.381802
    0.47033
]

Note how the values match the following:

>>> pl.Series([1, 4, 2]).skew(), pl.Series([4, 2, 9]).skew()
(0.38180177416060584, 0.47033046033698594)
rolling_std(
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
ddof: int = 1,
) Series[source]

Compute a rolling std dev.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their std dev.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

ddof

“Delta Degrees of Freedom”: The divisor for a length N window is N - ddof

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_std(window_size=3)
shape: (6,)
Series: 'a' [f64]
[
        null
        null
        1.0
        1.0
        1.527525
        2.0
]
rolling_sum(
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
) Series[source]

Apply a rolling sum (moving sum) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their sum.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.rolling_sum(window_size=2)
shape: (5,)
Series: 'a' [i64]
[
        null
        3
        5
        7
        9
]
rolling_var(
window_size: int,
weights: list[float] | None = None,
min_periods: int | None = None,
*,
center: bool = False,
ddof: int = 1,
) Series[source]

Compute a rolling variance.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their variance.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Parameters:
window_size

The length of the window.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_periods

The number of values in the window that should be non-null before computing a result. If None, it will be set equal to window size.

center

Set the labels at the center of the window

ddof

“Delta Degrees of Freedom”: The divisor for a length N window is N - ddof

Examples

>>> s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
>>> s.rolling_var(window_size=3)
shape: (6,)
Series: 'a' [f64]
[
        null
        null
        1.0
        1.0
        2.333333
        4.0
]
round(decimals: int = 0) Series[source]

Round underlying floating point data by decimals digits.

Parameters:
decimals

Number of decimals to round to.

Examples

>>> s = pl.Series("a", [1.12345, 2.56789, 3.901234])
>>> s.round(2)
shape: (3,)
Series: 'a' [f64]
[
        1.12
        2.57
        3.9
]
round_sig_figs(digits: int) Series[source]

Round to a number of significant figures.

Parameters:
digits

Number of significant figures to round to.

Examples

>>> s = pl.Series([0.01234, 3.333, 1234.0])
>>> s.round_sig_figs(2)
shape: (3,)
Series: '' [f64]
[
        0.012
        3.3
        1200.0
]
sample(
n: int | None = None,
*,
fraction: float | None = None,
with_replacement: bool = False,
shuffle: bool = False,
seed: int | None = None,
) Series[source]

Sample from this Series.

Parameters:
n

Number of items to return. Cannot be used with fraction. Defaults to 1 if fraction is None.

fraction

Fraction of items to return. Cannot be used with n.

with_replacement

Allow values to be sampled more than once.

shuffle

Shuffle the order of sampled data points.

seed

Seed for the random number generator. If set to None (default), a random seed is generated for each sample operation.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.sample(2, seed=0)  
shape: (2,)
Series: 'a' [i64]
[
    1
    5
]
scatter(
indices: Series | ndarray[Any, Any] | Sequence[int] | int,
values: int | float | str | bool | date | datetime | Sequence[int] | Sequence[float] | Sequence[bool] | Sequence[str] | Sequence[date] | Sequence[datetime] | Series | None,
) Series[source]

Set values at the index locations.

Parameters:
indices

Integers representing the index locations.

values

Replacement values.

Notes

Use of this function is frequently an anti-pattern, as it can block optimization (predicate pushdown, etc). Consider using pl.when(predicate).then(value).otherwise(self) instead.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.scatter(1, 10)
shape: (3,)
Series: 'a' [i64]
[
        1
        10
        3
]

It is better to implement this as follows:

>>> s.to_frame().with_row_count("row_nr").select(
...     pl.when(pl.col("row_nr") == 1).then(10).otherwise(pl.col("a"))
... )
shape: (3, 1)
┌─────────┐
│ literal │
│ ---     │
│ i64     │
╞═════════╡
│ 1       │
│ 10      │
│ 3       │
└─────────┘
search_sorted(element: int | float, side: SearchSortedSide = 'any') int[source]
search_sorted(
element: Series | ndarray[Any, Any] | list[int] | list[float],
side: SearchSortedSide = 'any',
) Series

Find indices where elements should be inserted to maintain order.

\[a[i-1] < v <= a[i]\]
Parameters:
element

Expression or scalar value.

side{‘any’, ‘left’, ‘right’}

If ‘any’, the index of the first suitable location found is given. If ‘left’, the index of the leftmost suitable location found is given. If ‘right’, the index of the rightmost suitable location found is given.

series_equal(
other: Series,
*,
null_equal: bool = True,
strict: bool = False,
) bool[source]

Check whether the Series is equal to another Series.

Deprecated since version 0.19.16: This method has been renamed to equals().

Parameters:
other

Series to compare with.

null_equal

Consider null values as equal.

strict

Don’t allow different numerical dtypes, e.g. comparing pl.UInt32 with a pl.Int64 will return False.

set(
filter: Series,
value: int | float | str | bool | None,
) Series[source]

Set masked values.

Parameters:
filter

Boolean mask.

value

Value with which to replace the masked values.

Notes

Use of this function is frequently an anti-pattern, as it can block optimization (predicate pushdown, etc). Consider using pl.when(predicate).then(value).otherwise(self) instead.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.set(s == 2, 10)
shape: (3,)
Series: 'a' [i64]
[
        1
        10
        3
]

It is better to implement this as follows:

>>> s.to_frame().select(
...     pl.when(pl.col("a") == 2).then(10).otherwise(pl.col("a"))
... )
shape: (3, 1)
┌─────────┐
│ literal │
│ ---     │
│ i64     │
╞═════════╡
│ 1       │
│ 10      │
│ 3       │
└─────────┘
set_at_idx(
indices: Series | ndarray[Any, Any] | Sequence[int] | int,
values: int | float | str | bool | date | datetime | Sequence[int] | Sequence[float] | Sequence[bool] | Sequence[str] | Sequence[date] | Sequence[datetime] | Series | None,
) Series[source]

Set values at the index locations.

Deprecated since version 0.19.14: This method has been renamed to scatter().

Parameters:
indices

Integers representing the index locations.

values

Replacement values.

set_sorted(*, descending: bool = False) Self[source]

Flags the Series as ‘sorted’.

Enables downstream code to use fast paths for sorted arrays.

Parameters:
descending

If the Series order is descending.

Warning

This can lead to incorrect results if this Series is not sorted. Use with care!

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.set_sorted().max()
3
shift(n: int = 1, *, fill_value: IntoExpr | None = None) Series[source]

Shift values by the given number of indices.

Parameters:
n

Number of indices to shift forward. If a negative value is passed, values are shifted in the opposite direction instead.

fill_value

Fill the resulting null values with this value. Accepts expression input. Non-expression inputs are parsed as literals.

Notes

This method is similar to the LAG operation in SQL when the value for n is positive. With a negative value for n, it is similar to LEAD.

Examples

By default, values are shifted forward by one index.

>>> s = pl.Series([1, 2, 3, 4])
>>> s.shift()
shape: (4,)
Series: '' [i64]
[
        null
        1
        2
        3
]

Pass a negative value to shift in the opposite direction instead.

>>> s.shift(-2)
shape: (4,)
Series: '' [i64]
[
        3
        4
        null
        null
]

Specify fill_value to fill the resulting null values.

>>> s.shift(-2, fill_value=100)
shape: (4,)
Series: '' [i64]
[
        3
        4
        100
        100
]
shift_and_fill(fill_value: int | Expr, *, n: int = 1) Series[source]

Shift values by the given number of places and fill the resulting null values.

Deprecated since version 0.19.12: Use shift() instead.

Parameters:
fill_value

Fill None values with the result of this expression.

n

Number of places to shift (may be negative).

shrink_dtype() Series[source]

Shrink numeric columns to the minimal required datatype.

Shrink to the dtype needed to fit the extrema of this Series. This can be used to reduce memory pressure.

shrink_to_fit(*, in_place: bool = False) Series[source]

Shrink Series memory usage.

Shrinks the underlying array capacity to exactly fit the actual data. (Note that this function does not change the Series data type).

shuffle(seed: int | None = None) Series[source]

Shuffle the contents of this Series.

Parameters:
seed

Seed for the random number generator. If set to None (default), a random seed is generated each time the shuffle is called.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.shuffle(seed=1)
shape: (3,)
Series: 'a' [i64]
[
        2
        1
        3
]
sign() Series[source]

Compute the element-wise indication of the sign.

The returned values can be -1, 0, or 1:

  • -1 if x < 0.

  • 0 if x == 0.

  • 1 if x > 0.

(null values are preserved as-is).

Examples

>>> s = pl.Series("a", [-9.0, -0.0, 0.0, 4.0, None])
>>> s.sign()
shape: (5,)
Series: 'a' [i64]
[
        -1
        0
        0
        1
        null
]
sin() Series[source]

Compute the element-wise value for the sine.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.sin()
shape: (3,)
Series: 'a' [f64]
[
    0.0
    1.0
    1.2246e-16
]
sinh() Series[source]

Compute the element-wise value for the hyperbolic sine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.sinh()
shape: (3,)
Series: 'a' [f64]
[
    1.175201
    0.0
    -1.175201
]
skew(*, bias: bool = True) float | None[source]

Compute the sample skewness of a data set.

For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to zero, statistically speaking.

See scipy.stats for more information.

Parameters:
biasbool, optional

If False, the calculations are corrected for statistical bias.

Notes

The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e.

\[g_1=\frac{m_3}{m_2^{3/2}}\]

where

\[m_i=\frac{1}{N}\sum_{n=1}^N(x[n]-\bar{x})^i\]

is the biased sample \(i\texttt{th}\) central moment, and \(\bar{x}\) is the sample mean. If bias is False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.

\[G_1 = \frac{k_3}{k_2^{3/2}} = \frac{\sqrt{N(N-1)}}{N-2}\frac{m_3}{m_2^{3/2}}\]
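
The two formulas can be checked by hand with a short pure-Python sketch. The helper name sample_skewness and the data are hypothetical and only mirror the definitions above; this is not the library's implementation.

```python
import math


def sample_skewness(values, bias=True):
    """Fisher-Pearson skewness following the formulas above (hypothetical helper).

    Assumes at least 3 values and non-zero variance.
    """
    n = len(values)
    mean = sum(values) / n
    # Biased central moments m_2 and m_3.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    g1 = m3 / m2 ** 1.5
    if bias:
        return g1
    # Adjusted coefficient G_1 = sqrt(N(N-1)) / (N-2) * g_1.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


data = [1.0, 2.0, 3.0, 10.0]  # one large value gives a heavier right tail
g1 = sample_skewness(data)
G1 = sample_skewness(data, bias=False)

print(g1 > 0)   # positive skewness for a right-tailed sample
print(G1 > g1)  # for positive g_1, the correction enlarges it at this N
```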
slice(offset: int, length: int | None = None) Series[source]

Get a slice of this Series.

Parameters:
offset

Start index. Negative indexing is supported.

length

Length of the slice. If set to None, all rows starting at the offset will be selected.

Examples

>>> s = pl.Series("a", [1, 2, 3, 4])
>>> s.slice(1, 2)
shape: (2,)
Series: 'a' [i64]
[
        2
        3
]
sort(*, descending: bool = False, in_place: bool = False) Self[source]

Sort this Series.

Parameters:
descending

Sort in descending order.

in_place

Sort in-place.

Examples

>>> s = pl.Series("a", [1, 3, 4, 2])
>>> s.sort()
shape: (4,)
Series: 'a' [i64]
[
        1
        2
        3
        4
]
>>> s.sort(descending=True)
shape: (4,)
Series: 'a' [i64]
[
        4
        3
        2
        1
]
sqrt() Series[source]

Compute the square root of the elements.

Syntactic sugar for

>>> pl.Series([1, 2]) ** 0.5
shape: (2,)
Series: '' [f64]
[
    1.0
    1.414214
]
std(ddof: int = 1) float | None[source]

Get the standard deviation of this Series.

Parameters:
ddof

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.std()
1.0
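
The role of ddof can be seen in a small pure-Python sketch. The helper std_with_ddof is hypothetical and only restates the N - ddof divisor described above.

```python
import math


def std_with_ddof(values, ddof=1):
    """Standard deviation with divisor N - ddof (hypothetical helper)."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_dev / (n - ddof))


data = [1, 2, 3]
sample_std = std_with_ddof(data)              # divisor N - 1 = 2
population_std = std_with_ddof(data, ddof=0)  # divisor N = 3

print(sample_std)  # 1.0, matching s.std() above
print(population_std)
```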
sum() int | float[source]

Reduce this Series to the sum value.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.sum()
6
tail(n: int = 10) Series[source]

Get the last n elements.

Parameters:
n

Number of elements to return. If a negative value is passed, return all elements except the first abs(n).

See also

head, slice

Examples

>>> s = pl.Series("a", [1, 2, 3, 4, 5])
>>> s.tail(3)
shape: (3,)
Series: 'a' [i64]
[
        3
        4
        5
]

Pass a negative value to get all rows except the first abs(n).

>>> s.tail(-3)
shape: (2,)
Series: 'a' [i64]
[
        4
        5
]
take(indices: int | list[int] | Expr | Series | np.ndarray[Any, Any]) Series[source]

Take values by index.

Deprecated since version 0.19.14: This method has been renamed to gather().

Parameters:
indices

Index location used for selection.

take_every(n: int) Series[source]

Take every nth value in the Series and return as new Series.

Deprecated since version 0.19.14: This method has been renamed to gather_every().

Parameters:
n

Gather every n-th row.

tan() Series[source]

Compute the element-wise value for the tangent.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.tan()
shape: (3,)
Series: 'a' [f64]
[
    0.0
    1.6331e16
    -1.2246e-16
]
tanh() Series[source]

Compute the element-wise value for the hyperbolic tangent.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.tanh()
shape: (3,)
Series: 'a' [f64]
[
    0.761594
    0.0
    -0.761594
]
to_arrow() Array[source]

Get the underlying Arrow Array.

If the Series contains only a single chunk, this operation is zero-copy.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s = s.to_arrow()
>>> s  
<pyarrow.lib.Int64Array object at ...>
[
  1,
  2,
  3
]
to_dummies(separator: str = '_') DataFrame[source]

Get dummy/indicator variables.

Parameters:
separator

Separator/delimiter used when generating column names.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.to_dummies()
shape: (3, 3)
┌─────┬─────┬─────┐
│ a_1 ┆ a_2 ┆ a_3 │
│ --- ┆ --- ┆ --- │
│ u8  ┆ u8  ┆ u8  │
╞═════╪═════╪═════╡
│ 1   ┆ 0   ┆ 0   │
│ 0   ┆ 1   ┆ 0   │
│ 0   ┆ 0   ┆ 1   │
└─────┴─────┴─────┘
to_frame(name: str | None = None) DataFrame[source]

Cast this Series to a DataFrame.

Parameters:
name

Optionally name/rename the Series column in the new DataFrame.

Examples

>>> s = pl.Series("a", [123, 456])
>>> df = s.to_frame()
>>> df
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 123 │
│ 456 │
└─────┘
>>> df = s.to_frame("xyz")
>>> df
shape: (2, 1)
┌─────┐
│ xyz │
│ --- │
│ i64 │
╞═════╡
│ 123 │
│ 456 │
└─────┘
to_init_repr(n: int = 1000) str[source]

Convert Series to instantiatable string representation.

Parameters:
n

Only use first n elements.

Examples

>>> s = pl.Series("a", [1, 2, None, 4], dtype=pl.Int16)
>>> print(s.to_init_repr())
pl.Series("a", [1, 2, None, 4], dtype=pl.Int16)
>>> s_from_str_repr = eval(s.to_init_repr())
>>> s_from_str_repr
shape: (4,)
Series: 'a' [i16]
[
    1
    2
    null
    4
]
to_list(*, use_pyarrow: bool | None = None) list[Any][source]

Convert this Series to a Python list. This operation clones data.

Parameters:
use_pyarrow

Use pyarrow for the conversion.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.to_list()
[1, 2, 3]
>>> type(s.to_list())
<class 'list'>
to_numpy(
*args: Any,
zero_copy_only: bool = False,
writable: bool = False,
use_pyarrow: bool = True,
) ndarray[Any, Any][source]

Convert this Series to numpy.

This operation may clone data but is completely safe. Note that:

  • data which is purely numeric AND without null values is not cloned;

  • floating point nan values can be zero-copied;

  • booleans can’t be zero-copied.

To ensure that no data is cloned, set zero_copy_only=True.

Parameters:
*args

args will be sent to pyarrow.Array.to_numpy.

zero_copy_only

If True, an exception will be raised if the conversion to a numpy array would require copying the underlying data (e.g. in presence of nulls, or for non-primitive types).

writable

For numpy arrays created with zero copy (view on the Arrow data), the resulting array is not writable (Arrow data is immutable). By setting this to True, a copy of the array is made to ensure it is writable.

use_pyarrow

Use pyarrow.Array.to_numpy for the conversion to numpy.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> arr = s.to_numpy()
>>> arr  
array([1, 2, 3], dtype=int64)
>>> type(arr)
<class 'numpy.ndarray'>
to_pandas(
*args: Any,
use_pyarrow_extension_array: bool = False,
**kwargs: Any,
) pd.Series[Any][source]

Convert this Series to a pandas Series.

This requires that pandas and pyarrow are installed. This operation clones data, unless use_pyarrow_extension_array=True.

Parameters:
use_pyarrow_extension_array

Use a PyArrow-backed extension array instead of a NumPy array for the pandas Series. This allows zero-copy operations and preservation of null values. Further operations on this pandas Series might trigger conversion to NumPy arrays if the operation is not supported by pyarrow compute functions.

kwargs

Arguments will be sent to pyarrow.Table.to_pandas().

Examples

>>> s1 = pl.Series("a", [1, 2, 3])
>>> s1.to_pandas()
0    1
1    2
2    3
Name: a, dtype: int64
>>> s1.to_pandas(use_pyarrow_extension_array=True)  
0    1
1    2
2    3
Name: a, dtype: int64[pyarrow]
>>> s2 = pl.Series("b", [1, 2, None, 4])
>>> s2.to_pandas()
0    1.0
1    2.0
2    NaN
3    4.0
Name: b, dtype: float64
>>> s2.to_pandas(use_pyarrow_extension_array=True)  
0       1
1       2
2    <NA>
3       4
Name: b, dtype: int64[pyarrow]
to_physical() Series[source]

Cast to physical representation of the logical dtype.

  • polars.datatypes.Date() -> polars.datatypes.Int32()

  • polars.datatypes.Datetime() -> polars.datatypes.Int64()

  • polars.datatypes.Time() -> polars.datatypes.Int64()

  • polars.datatypes.Duration() -> polars.datatypes.Int64()

  • polars.datatypes.Categorical() -> polars.datatypes.UInt32()

  • List(inner) -> List(physical of inner)

  • Other data types will be left unchanged.

Examples

Replicating the pandas pd.Series.factorize method.

>>> s = pl.Series("values", ["a", None, "x", "a"])
>>> s.cast(pl.Categorical).to_physical()
shape: (4,)
Series: 'values' [u32]
[
    0
    null
    1
    0
]
top_k(k: int | IntoExprColumn = 5) Series[source]

Return the k largest elements.

This has time complexity:

\[O(n + k \log n - \frac{k}{2})\]

Parameters:
k

Number of elements to return.

See also

bottom_k

Examples

>>> s = pl.Series("a", [2, 5, 1, 4, 3])
>>> s.top_k(3)
shape: (3,)
Series: 'a' [i64]
[
    5
    4
    3
]
unique(*, maintain_order: bool = False) Series[source]

Get unique elements in series.

Parameters:
maintain_order

Maintain order of data. This requires more work.

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.unique().sort()
shape: (3,)
Series: 'a' [i64]
[
    1
    2
    3
]
unique_counts() Series[source]

Return a count of the unique values in the order of appearance.

Examples

>>> s = pl.Series("id", ["a", "b", "b", "c", "c", "c"])
>>> s.unique_counts()
shape: (3,)
Series: 'id' [u32]
[
    1
    2
    3
]
upper_bound() Self[source]

Return the upper bound of this Series’ dtype as a unit Series.

See also

lower_bound

return the lower bound of the given Series’ dtype.

Examples

>>> s = pl.Series("s", [-1, 0, 1], dtype=pl.Int8)
>>> s.upper_bound()
shape: (1,)
Series: 's' [i8]
[
    127
]
>>> s = pl.Series("s", [1.0, 2.5, 3.0], dtype=pl.Float64)
>>> s.upper_bound()
shape: (1,)
Series: 's' [f64]
[
    inf
]
value_counts(*, sort: bool = False, parallel: bool = False) DataFrame[source]

Count the occurrences of unique values.

Parameters:
sort

Sort the output by count in descending order. If set to False (default), the order of the output is random.

parallel

Execute the computation in parallel.

Note

This option should likely not be enabled in a group by context, as the computation is already parallelized per group.

Returns:
DataFrame

Mapping of unique values to their count.

Examples

>>> s = pl.Series("color", ["red", "blue", "red", "green", "blue", "blue"])
>>> s.value_counts()  
shape: (3, 2)
┌───────┬────────┐
│ color ┆ counts │
│ ---   ┆ ---    │
│ str   ┆ u32    │
╞═══════╪════════╡
│ red   ┆ 2      │
│ green ┆ 1      │
│ blue  ┆ 3      │
└───────┴────────┘

Sort the output by count.

>>> s.value_counts(sort=True)
shape: (3, 2)
┌───────┬────────┐
│ color ┆ counts │
│ ---   ┆ ---    │
│ str   ┆ u32    │
╞═══════╪════════╡
│ blue  ┆ 3      │
│ red   ┆ 2      │
│ green ┆ 1      │
└───────┴────────┘

var(ddof: int = 1) float | None[source]

Get variance of this Series.

Parameters:
ddof

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.var()
1.0
view(*, ignore_nulls: bool = False) SeriesView[source]

Get a view into this Series data with a numpy array.

Deprecated since version 0.19.14: This method will be removed in a future version.

This operation doesn’t clone data, but does not include missing values. Don’t use this unless you know what you are doing.

Parameters:
ignore_nulls

If True then nulls are converted to 0. If False then an Exception is raised if nulls are present.

zip_with(mask: Series, other: Series) Self[source]

Take values from self or other based on the given mask.

Where mask evaluates true, take values from self. Where mask evaluates false, take values from other.

Parameters:
mask

Boolean Series.

other

Series of same type.

Returns:
Series

Examples

>>> s1 = pl.Series([1, 2, 3, 4, 5])
>>> s2 = pl.Series([5, 4, 3, 2, 1])
>>> s1.zip_with(s1 < s2, s2)
shape: (5,)
Series: '' [i64]
[
        1
        2
        3
        2
        1
]
>>> mask = pl.Series([True, False, True, False, True])
>>> s1.zip_with(mask, s2)
shape: (5,)
Series: '' [i64]
[
        1
        4
        3
        2
        5
]