Testing#

The testing module provides a number of functions and helpers for use with unit tests.

Note

The testing module is not imported by default in order to optimise import speed of the primary polars module. Either import polars.testing and then use that namespace, or import the specific functions you need from the full module path, e.g.:

from polars.testing import assert_frame_equal, assert_series_equal

Asserts#

Polars provides some standard asserts for use with unit tests:

testing.assert_frame_equal(left, right, *[, ...])

Assert that the left and right frame are equal.

testing.assert_frame_not_equal(left, right, *)

Assert that the left and right frame are not equal.

testing.assert_series_equal(left, right, *)

Assert that the left and right Series are equal.

testing.assert_series_not_equal(left, right, *)

Assert that the left and right Series are not equal.

Parametric testing#

See the Hypothesis library for more details about property-based testing, strategies, and library integrations:

Polars strategies#

Polars provides the following hypothesis testing strategies:

testing.parametric.dataframes([cols, lazy, ...])

Hypothesis strategy for producing Polars DataFrames or LazyFrames.

testing.parametric.dtypes(*[, ...])

Create a strategy for generating Polars DataType objects.

testing.parametric.lists(inner_dtype, *[, ...])

Create a strategy for generating lists of the given data type.

testing.parametric.series(*[, name, dtype, ...])

Hypothesis strategy for producing Polars Series.

Strategy helpers#

testing.parametric.column([name, dtype, ...])

Define a column for use with the dataframes strategy.

testing.parametric.columns([cols, dtype, ...])

Define multiple columns for use with the @dataframes strategy.

testing.parametric.create_list_strategy([...])

Create a strategy for generating Polars List data.

Profiles#

Several standard/named hypothesis profiles are provided:

  • fast: runs 100 iterations.

  • balanced: runs 1,000 iterations.

  • expensive: runs 10,000 iterations.

The load/set helper functions allow you to access these profiles directly, set your preferred profile (default is fast), or set a custom number of iterations.

testing.parametric.load_profile([profile, ...])

Load a named (or custom) hypothesis profile for use with the parametric tests.

testing.parametric.set_profile(profile)

Set the env var POLARS_HYPOTHESIS_PROFILE to the given profile name/value.

Approximate profile timings:

Running polars’ own parametric unit tests on 0.17.6 against release and debug builds, on a machine with 12 cores, using xdist -n auto results in the following timings (these values are indicative only, and may vary significantly depending on your own hardware setup):

Profile

Iterations

Release

Debug

fast

100

~6 secs

~8 secs

balanced

1,000

~22 secs

~30 secs

expensive

10,000

~3 mins 5 secs

~4 mins 45 secs

Examples#

Basic: Create a parametric unit test that will receive a series of generated DataFrames, each having 5 numeric columns with a 10% chance of any generated value being null (this is distinct from NaN).

import polars as pl
from polars.testing.parametric import dataframes
from polars import NUMERIC_DTYPES

from hypothesis import given

@given(
    dataframes(
        cols=5,
        allow_null=True,
        allowed_dtypes=NUMERIC_DTYPES,
    )
)
def test_numeric(df: pl.DataFrame):
    assert all(df[col].dtype.is_numeric() for col in df.columns)

    # Example frame:
    # ┌──────┬────────┬───────┬────────────┬────────────┐
    # │ col0 ┆ col1   ┆ col2  ┆ col3       ┆ col4       │
    # │ ---  ┆ ---    ┆ ---   ┆ ---        ┆ ---        │
    # │ u8   ┆ i16    ┆ u16   ┆ i32        ┆ f64        │
    # ╞══════╪════════╪═══════╪════════════╪════════════╡
    # │ 54   ┆ -29096 ┆ 485   ┆ 2147483647 ┆ -2.8257e14 │
    # │ null ┆ 7508   ┆ 37338 ┆ 7264       ┆ 1.5        │
    # │ 0    ┆ 321    ┆ null  ┆ 16996      ┆ NaN        │
    # │ 121  ┆ -361   ┆ 63204 ┆ 1          ┆ 1.1443e235 │
    # └──────┴────────┴───────┴────────────┴────────────┘

Intermediate: Integrate hypothesis-native strategies into specifically-named columns, generating a series of LazyFrames, with a minimum size of five rows and values that conform to the given strategies:

import polars as pl
from polars.testing.parametric import column, dataframes

import hypothesis.strategies as st
from hypothesis import given
from string import ascii_letters, digits

id_chars = ascii_letters + digits

@given(
    dataframes(
        cols=[
            column("id", strategy=st.text(min_size=4, max_size=4, alphabet=id_chars)),
            column("ccy", strategy=st.sampled_from(["GBP", "EUR", "JPY", "USD"])),
            column("price", strategy=st.floats(min_value=0.0, max_value=1000.0)),
        ],
        min_size=5,
        lazy=True,
    )
)
def test_price_calculations(lf: pl.LazyFrame):
    ...
    print(lf.collect())

    # Example frame:
    # ┌──────┬─────┬─────────┐
    # │ id   ┆ ccy ┆ price   │
    # │ ---  ┆ --- ┆ ---     │
    # │ str  ┆ str ┆ f64     │
    # ╞══════╪═════╪═════════╡
    # │ A101 ┆ GBP ┆ 1.1     │
    # │ 8nIn ┆ JPY ┆ 1.5     │
    # │ QHoO ┆ EUR ┆ 714.544 │
    # │ i0e0 ┆ GBP ┆ 0.0     │
    # │ 0000 ┆ USD ┆ 999.0   │
    # └──────┴─────┴─────────┘

Advanced: Create and use a List[UInt8] dtype strategy as a hypothesis composite that generates pairs of pairs of small integer values in which the first value in each nested pair is always less than or equal to the second value:

import polars as pl
from polars.testing.parametric import column, dataframes, lists

import hypothesis.strategies as st
from hypothesis import given

@st.composite
def uint8_pairs(draw: st.DrawFn):
    uints = lists(pl.UInt8, size=2)
    pairs = list(zip(draw(uints), draw(uints)))
    return [sorted(ints) for ints in pairs]

@given(
    dataframes(
        cols=[
            column("colx", strategy=uint8_pairs()),
            column("coly", strategy=uint8_pairs()),
            column("colz", strategy=uint8_pairs()),
        ],
        min_size=3,
        max_size=3,
    )
)
def test_miscellaneous(df: pl.DataFrame): ...

    # Example frame:
    # ┌─────────────────────────┬─────────────────────────┬──────────────────────────┐
    # │ colx                    ┆ coly                    ┆ colz                     │
    # │ ---                     ┆ ---                     ┆ ---                      │
    # │ list[list[i64]]         ┆ list[list[i64]]         ┆ list[list[i64]]          │
    # ╞═════════════════════════╪═════════════════════════╪══════════════════════════╡
    # │ [[143, 235], [75, 101]] ┆ [[143, 235], [75, 101]] ┆ [[31, 41], [57, 250]]    │
    # │ [[87, 186], [174, 179]] ┆ [[87, 186], [174, 179]] ┆ [[112, 213], [149, 221]] │
    # │ [[23, 85], [7, 86]]     ┆ [[23, 85], [7, 86]]     ┆ [[22, 255], [27, 28]]    │
    # └─────────────────────────┴─────────────────────────┴──────────────────────────┘