polars.LazyFrame.join
- LazyFrame.join(
- other: LazyFrame,
- on: str | Expr | Sequence[str | Expr] | None = None,
- how: JoinStrategy = 'inner',
- *,
- left_on: str | Expr | Sequence[str | Expr] | None = None,
- right_on: str | Expr | Sequence[str | Expr] | None = None,
- suffix: str = '_right',
- validate: JoinValidation = 'm:m',
- allow_parallel: bool = True,
- force_parallel: bool = False,
- )
Add a join operation to the Logical Plan.
- Parameters:
- other
Lazy DataFrame to join with.
- on
Join column of both DataFrames. If set, left_on and right_on should be None.
- how: {‘inner’, ‘left’, ‘outer’, ‘semi’, ‘anti’, ‘cross’}
Join strategy.
Note
A left join preserves the row order of the left DataFrame.
- left_on
Join column of the left DataFrame.
- right_on
Join column of the right DataFrame.
- suffix
Suffix to append to columns with a duplicate name.
- validate: {‘m:m’, ‘m:1’, ‘1:m’, ‘1:1’}
Checks if join is of specified type.
- many_to_many
“m:m”: default, does not result in checks
- one_to_one
“1:1”: check if join keys are unique in both left and right datasets
- one_to_many
“1:m”: check if join keys are unique in left dataset
- many_to_one
“m:1”: check if join keys are unique in right dataset
Note
This is currently not supported by the streaming engine.
This is only supported when joining on single columns.
- allow_parallel
Allow the physical plan to optionally evaluate the computation of both DataFrames up to the join in parallel.
- force_parallel
Force the physical plan to evaluate the computation of both DataFrames up to the join in parallel.
Examples
>>> lf = pl.LazyFrame(
...     {
...         "foo": [1, 2, 3],
...         "bar": [6.0, 7.0, 8.0],
...         "ham": ["a", "b", "c"],
...     }
... )
>>> other_lf = pl.LazyFrame(
...     {
...         "apple": ["x", "y", "z"],
...         "ham": ["a", "b", "d"],
...     }
... )
>>> lf.join(other_lf, on="ham").collect()
shape: (2, 4)
┌─────┬─────┬─────┬───────┐
│ foo ┆ bar ┆ ham ┆ apple │
│ --- ┆ --- ┆ --- ┆ ---   │
│ i64 ┆ f64 ┆ str ┆ str   │
╞═════╪═════╪═════╪═══════╡
│ 1   ┆ 6.0 ┆ a   ┆ x     │
│ 2   ┆ 7.0 ┆ b   ┆ y     │
└─────┴─────┴─────┴───────┘
>>> lf.join(other_lf, on="ham", how="outer").collect()
shape: (4, 4)
┌──────┬──────┬─────┬───────┐
│ foo  ┆ bar  ┆ ham ┆ apple │
│ ---  ┆ ---  ┆ --- ┆ ---   │
│ i64  ┆ f64  ┆ str ┆ str   │
╞══════╪══════╪═════╪═══════╡
│ 1    ┆ 6.0  ┆ a   ┆ x     │
│ 2    ┆ 7.0  ┆ b   ┆ y     │
│ null ┆ null ┆ d   ┆ z     │
│ 3    ┆ 8.0  ┆ c   ┆ null  │
└──────┴──────┴─────┴───────┘
>>> lf.join(other_lf, on="ham", how="left").collect()
shape: (3, 4)
┌─────┬─────┬─────┬───────┐
│ foo ┆ bar ┆ ham ┆ apple │
│ --- ┆ --- ┆ --- ┆ ---   │
│ i64 ┆ f64 ┆ str ┆ str   │
╞═════╪═════╪═════╪═══════╡
│ 1   ┆ 6.0 ┆ a   ┆ x     │
│ 2   ┆ 7.0 ┆ b   ┆ y     │
│ 3   ┆ 8.0 ┆ c   ┆ null  │
└─────┴─────┴─────┴───────┘
>>> lf.join(other_lf, on="ham", how="semi").collect()
shape: (2, 3)
┌─────┬─────┬─────┐
│ foo ┆ bar ┆ ham │
│ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ str │
╞═════╪═════╪═════╡
│ 1   ┆ 6.0 ┆ a   │
│ 2   ┆ 7.0 ┆ b   │
└─────┴─────┴─────┘
>>> lf.join(other_lf, on="ham", how="anti").collect()
shape: (1, 3)
┌─────┬─────┬─────┐
│ foo ┆ bar ┆ ham │
│ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ str │
╞═════╪═════╪═════╡
│ 3   ┆ 8.0 ┆ c   │
└─────┴─────┴─────┘