polars.LazyFrame.update

LazyFrame.update(
other: LazyFrame,
on: str | Sequence[str] | None = None,
left_on: str | Sequence[str] | None = None,
right_on: str | Sequence[str] | None = None,
how: Literal['left', 'inner', 'outer'] = 'left',
include_nulls: bool | None = False,
) → Self

Update the values in this LazyFrame with the non-null values in other.

Parameters:
other

LazyFrame whose values are used to update this one.

on

Column name(s) to join on. If None, the implicit row index is used as the join key instead.

left_on

Join column(s) of the left LazyFrame.

right_on

Join column(s) of the right LazyFrame.

how : {‘left’, ‘inner’, ‘outer’}
  • ‘left’ will keep all rows from the left table; rows may be duplicated if multiple rows in the right frame match the left row’s key.

  • ‘inner’ keeps only those rows where the key exists in both frames.

  • ‘outer’ will update existing rows where the key matches while also adding any new rows contained in the given frame.

include_nulls

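If True, null values from the right LazyFrame overwrite the corresponding values in the left LazyFrame. If False (the default), null values in the right LazyFrame are ignored and the left values are kept.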
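
The interaction of how and include_nulls can be modelled in plain Python. The sketch below is only an illustration of the semantics described above, not how Polars implements update: update_model is a hypothetical helper, each frame is reduced to a dict mapping a unique join key to a single value, and the row order of the result (left rows first, then new rows) is an assumption of the model.

```python
def update_model(left, right, how="left", include_nulls=False):
    """Toy model of update(): `left` and `right` map a unique key to one value."""
    if how not in ("left", "inner", "outer"):
        raise ValueError(f"unsupported how: {how!r}")
    result = {}
    for key, value in left.items():
        if key in right:
            new = right[key]
            # A null (None) on the right only overwrites when include_nulls=True.
            result[key] = new if (new is not None or include_nulls) else value
        elif how != "inner":
            # 'left' and 'outer' keep left rows that have no match on the right.
            result[key] = value
    if how == "outer":
        # 'outer' also appends rows whose key exists only on the right.
        for key, new in right.items():
            if key not in left:
                result[key] = new
    return result


# Same data as the Examples section, keyed by A (left) and C (right).
left = {1: 400, 2: 500, 3: 600, 4: 700}
right = {5: -66, 3: None, 1: -99}

left_result = update_model(left, right)
inner_result = update_model(left, right, how="inner")
outer_result = update_model(left, right, how="outer")
outer_nulls_result = update_model(left, right, how="outer", include_nulls=True)
```

With this data, key 3 keeps its value 600 unless include_nulls=True, and key 5 appears only with how="outer".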

Notes

This is syntactic sugar for a left/inner join, with an optional coalesce when include_nulls = False.

Examples

>>> lf = pl.LazyFrame(
...     {
...         "A": [1, 2, 3, 4],
...         "B": [400, 500, 600, 700],
...     }
... )
>>> lf.collect()
shape: (4, 2)
┌─────┬─────┐
│ A   ┆ B   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 400 │
│ 2   ┆ 500 │
│ 3   ┆ 600 │
│ 4   ┆ 700 │
└─────┴─────┘
>>> new_lf = pl.LazyFrame(
...     {
...         "B": [-66, None, -99],
...         "C": [5, 3, 1],
...     }
... )

Update lf values with the non-null values in new_lf, by row index:

>>> lf.update(new_lf).collect()
shape: (4, 2)
┌─────┬─────┐
│ A   ┆ B   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ -66 │
│ 2   ┆ 500 │
│ 3   ┆ -99 │
│ 4   ┆ 700 │
└─────┴─────┘

Update lf values with the non-null values in new_lf, by row index, keeping only the rows that are common to both frames:

>>> lf.update(new_lf, how="inner").collect()
shape: (3, 2)
┌─────┬─────┐
│ A   ┆ B   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ -66 │
│ 2   ┆ 500 │
│ 3   ┆ -99 │
└─────┴─────┘

Update lf values with the non-null values in new_lf, using an outer join strategy with explicit join columns for each frame:

>>> lf.update(new_lf, left_on=["A"], right_on=["C"], how="outer").collect()
shape: (5, 2)
┌─────┬─────┐
│ A   ┆ B   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ -99 │
│ 2   ┆ 500 │
│ 3   ┆ 600 │
│ 4   ┆ 700 │
│ 5   ┆ -66 │
└─────┴─────┘

Update lf values, including null values from new_lf, using an outer join strategy with explicit join columns for each frame:

>>> lf.update(
...     new_lf, left_on="A", right_on="C", how="outer", include_nulls=True
... ).collect()
shape: (5, 2)
┌─────┬──────┐
│ A   ┆ B    │
│ --- ┆ ---  │
│ i64 ┆ i64  │
╞═════╪══════╡
│ 1   ┆ -99  │
│ 2   ┆ 500  │
│ 3   ┆ null │
│ 4   ┆ 700  │
│ 5   ┆ -66  │
└─────┴──────┘