polars.corr

polars.corr(a: IntoExpr, b: IntoExpr, *, method: CorrelationMethod = 'pearson', ddof: int | None = None, propagate_nans: bool = False, eager: bool = False) → Expr | Series

Compute the Pearson or Spearman rank correlation between two columns.

Parameters:

a: Column name or Expression.
b: Column name or Expression.
ddof: Has no effect; do not use. Deprecated since version 1.17.0.
method: {‘pearson’, ‘spearman’}, default ‘pearson’. Correlation method.
propagate_nans: If True, any NaN encountered leads to NaN in the output. If False (default), NaN values are treated as larger than any finite number and therefore receive the highest rank.
eager: Evaluate immediately and return a Series; this requires that at least one of the given arguments is a Series. If set to False (default), return an expression instead.

Examples

Pearson’s correlation:

>>> df = pl.DataFrame(
...     {
...         "a": [1, 8, 3],
...         "b": [4, 5, 2],
...         "c": ["foo", "bar", "foo"],
...     }
... )
>>> df.select(pl.corr("a", "b"))
shape: (1, 1)
┌──────────┐
│ a        │
│ ---      │
│ f64      │
╞══════════╡
│ 0.544705 │
└──────────┘
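
As a cross-check, the Pearson value above can be reproduced with the textbook formula in plain Python. This is a minimal sketch and not Polars' implementation; the helper name `pearson` is hypothetical and introduced only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Hypothetical helper for illustration; not part of Polars.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Same data as columns "a" and "b" in the example above.
r = pearson([1, 8, 3], [4, 5, 2])
print(round(r, 6))  # 0.544705
```

In this formula any normalization constant (such as `n - ddof`) would appear in both the numerator and the denominator and cancel, which is consistent with `ddof` having no effect on the result.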

Spearman rank correlation:

>>> df.select(pl.corr("a", "b", method="spearman"))
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ f64 │
╞═════╡
│ 0.5 │
└─────┘
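
Spearman rank correlation is commonly defined as the Pearson correlation of the ranked values. The sketch below follows that definition in plain Python to reproduce the 0.5 above. It is not Polars' implementation, the helpers `ranks` and `pearson` are hypothetical, and it assumes all values are distinct (no tie handling, no NaN handling).

```python
import math


def ranks(values):
    """1-based ranks of the values.

    Hypothetical helper; assumes all values are distinct (ties are not averaged).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Same data as columns "a" and "b" in the example above.
rank_a = ranks([1, 8, 3])  # [1, 3, 2]
rank_b = ranks([4, 5, 2])  # [2, 3, 1]
rho = pearson(rank_a, rank_b)
print(rho)  # 0.5
```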

Eager evaluation:

>>> s1 = pl.Series("a", [1, 8, 3])
>>> s2 = pl.Series("b", [4, 5, 2])
>>> pl.corr(s1, s2, eager=True)
shape: (1,)
Series: 'a' [f64]
[
    0.544705
]
>>> pl.corr(s1, s2, method="spearman", eager=True)
shape: (1,)
Series: 'a' [f64]
[
    0.5
]