
DataFrame.join(
other: DataFrame,
on: str | Expr | Sequence[str | Expr] | None = None,
how: JoinStrategy = 'inner',
left_on: str | Expr | Sequence[str | Expr] | None = None,
right_on: str | Expr | Sequence[str | Expr] | None = None,
suffix: str = '_right',
validate: JoinValidation = 'm:m',
join_nulls: bool = False,
coalesce: bool | None = None,
) -> DataFrame

Join in SQL-like fashion.


Parameters:

other

DataFrame to join with.


on

Name(s) of the join columns in both DataFrames.

how: {‘inner’, ‘left’, ‘full’, ‘semi’, ‘anti’, ‘cross’}

Join strategy.

  • inner

    Returns rows that have matching values in both tables.

  • left

    Returns all rows from the left table, and the matched rows from the right table.

  • full

    Returns all rows from both tables, matching them where the join keys agree and filling the other side with nulls where they do not.

  • cross

    Returns the Cartesian product of rows from both tables.

  • semi

    Returns rows from the left table that have a match in the right table; no columns from the right table are added.

  • anti

    Returns rows from the left table that have no match in the right table.


A left join preserves the row order of the left DataFrame.


left_on

Name(s) of the left join column(s).


right_on

Name(s) of the right join column(s).


suffix

Suffix to append to columns with a duplicate name.

validate: {‘m:m’, ‘m:1’, ‘1:m’, ‘1:1’}

Checks whether the join is of the specified type.

  • many_to_many

    “m:m”: default, does not result in checks

  • one_to_one

    “1:1”: check if join keys are unique in both left and right datasets

  • one_to_many

    “1:m”: check if join keys are unique in left dataset

  • many_to_one

    “m:1”: check if join keys are unique in right dataset


Join validation is currently not supported by the streaming engine.


join_nulls

Join on null values. By default, null values never produce matches.


coalesce

Coalescing behavior (merging of join columns).

  • None: join-specific; the default depends on the join strategy.

  • True: always coalesce join columns.

  • False: never coalesce join columns.


See also



For joining on columns with categorical data, see polars.StringCache.


>>> df = pl.DataFrame(
...     {
...         "foo": [1, 2, 3],
...         "bar": [6.0, 7.0, 8.0],
...         "ham": ["a", "b", "c"],
...     }
... )
>>> other_df = pl.DataFrame(
...     {
...         "apple": ["x", "y", "z"],
...         "ham": ["a", "b", "d"],
...     }
... )
>>> df.join(other_df, on="ham")
shape: (2, 4)
┌─────┬─────┬─────┬───────┐
│ foo ┆ bar ┆ ham ┆ apple │
│ --- ┆ --- ┆ --- ┆ ---   │
│ i64 ┆ f64 ┆ str ┆ str   │
╞═════╪═════╪═════╪═══════╡
│ 1   ┆ 6.0 ┆ a   ┆ x     │
│ 2   ┆ 7.0 ┆ b   ┆ y     │
└─────┴─────┴─────┴───────┘
>>> df.join(other_df, on="ham", how="full")
shape: (4, 5)
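┌──────┬──────┬──────┬───────┬───────────┐
│ foo  ┆ bar  ┆ ham  ┆ apple ┆ ham_right │
│ ---  ┆ ---  ┆ ---  ┆ ---   ┆ ---       │
│ i64  ┆ f64  ┆ str  ┆ str   ┆ str       │
╞══════╪══════╪══════╪═══════╪═══════════╡
│ 1    ┆ 6.0  ┆ a    ┆ x     ┆ a         │
│ 2    ┆ 7.0  ┆ b    ┆ y     ┆ b         │
│ null ┆ null ┆ null ┆ z     ┆ d         │
│ 3    ┆ 8.0  ┆ c    ┆ null  ┆ null      │
└──────┴──────┴──────┴───────┴───────────┘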
>>> df.join(other_df, on="ham", how="left", coalesce=True)
shape: (3, 4)
┌─────┬─────┬─────┬───────┐
│ foo ┆ bar ┆ ham ┆ apple │
│ --- ┆ --- ┆ --- ┆ ---   │
│ i64 ┆ f64 ┆ str ┆ str   │
╞═════╪═════╪═════╪═══════╡
│ 1   ┆ 6.0 ┆ a   ┆ x     │
│ 2   ┆ 7.0 ┆ b   ┆ y     │
│ 3   ┆ 8.0 ┆ c   ┆ null  │
└─────┴─────┴─────┴───────┘
>>> df.join(other_df, on="ham", how="semi")
shape: (2, 3)
┌─────┬─────┬─────┐
│ foo ┆ bar ┆ ham │
│ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ str │
╞═════╪═════╪═════╡
│ 1   ┆ 6.0 ┆ a   │
│ 2   ┆ 7.0 ┆ b   │
└─────┴─────┴─────┘
>>> df.join(other_df, on="ham", how="anti")
shape: (1, 3)
┌─────┬─────┬─────┐
│ foo ┆ bar ┆ ham │
│ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ str │
╞═════╪═════╪═════╡
│ 3   ┆ 8.0 ┆ c   │
└─────┴─────┴─────┘