polars.union

polars.union(
items: Iterable[PolarsType],
*,
how: ConcatMethod = 'vertical',
strict: bool = False,
) → PolarsType

Combine multiple DataFrames, LazyFrames, or Series into a single object.

Warning

This function does not guarantee any specific ordering of rows in the result. If you need predictable row ordering, use pl.concat() instead.

Parameters:
items

DataFrames, LazyFrames, or Series to concatenate.

how {‘vertical’, ‘vertical_relaxed’, ‘diagonal’, ‘diagonal_relaxed’, ‘horizontal’, ‘align’, ‘align_full’, ‘align_inner’, ‘align_left’, ‘align_right’}

Note that Series only support the vertical strategy.

  • vertical: Applies multiple vstack operations.

  • vertical_relaxed: Same as vertical, but additionally coerces mismatched columns to their common supertype (e.g. Int32 → Int64).

  • diagonal: Finds a union between the column schemas and fills missing column values with null.

  • diagonal_relaxed: Same as diagonal, but additionally coerces mismatched columns to their common supertype (e.g. Int32 → Int64).

  • horizontal: Stacks Series from DataFrames horizontally and fills with null if the lengths don’t match.

  • align, align_full, align_inner, align_left, align_right: Combines frames horizontally, automatically determining the common key columns and aligning rows using the same logic as align_frames (“align” is an alias for “align_full”). The chosen “align” strategy determines the type of join used to align the frames, equivalent to the “how” parameter of align_frames. The common join columns are automatically coalesced, but any other column collision raises an error; if you need more control over this, use a suitable join method directly.

strict

When how='horizontal', require all DataFrames to have the same height and raise an error if they do not.

Examples

>>> df1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df2 = pl.DataFrame({"a": [2], "b": [4]})
>>> pl.union([df1, df2])  # default is 'vertical' strategy
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 3   │
│ 2   ┆ 4   │
└─────┴─────┘
>>> df1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df2 = pl.DataFrame({"a": [2.5], "b": [4]})
>>> pl.union([df1, df2], how="vertical_relaxed")  # 'a' coerced into f64
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ f64 ┆ i64 │
╞═════╪═════╡
│ 1.0 ┆ 3   │
│ 2.5 ┆ 4   │
└─────┴─────┘
>>> df_h1 = pl.DataFrame({"l1": [1, 2], "l2": [3, 4]})
>>> df_h2 = pl.DataFrame({"r1": [5, 6], "r2": [7, 8], "r3": [9, 10]})
>>> pl.union([df_h1, df_h2], how="horizontal")
shape: (2, 5)
┌─────┬─────┬─────┬─────┬─────┐
│ l1  ┆ l2  ┆ r1  ┆ r2  ┆ r3  │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╪═════╡
│ 1   ┆ 3   ┆ 5   ┆ 7   ┆ 9   │
│ 2   ┆ 4   ┆ 6   ┆ 8   ┆ 10  │
└─────┴─────┴─────┴─────┴─────┘

The “diagonal” strategy allows some frames to have missing columns; the missing values are filled with null:

>>> df_d1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df_d2 = pl.DataFrame({"a": [2], "c": [4]})
>>> pl.union([df_d1, df_d2], how="diagonal")
shape: (2, 3)
┌─────┬──────┬──────┐
│ a   ┆ b    ┆ c    │
│ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  │
╞═════╪══════╪══════╡
│ 1   ┆ 3    ┆ null │
│ 2   ┆ null ┆ 4    │
└─────┴──────┴──────┘

The “align” strategies require at least one common column to align on:

>>> df_a1 = pl.DataFrame({"id": [1, 2], "x": [3, 4]})
>>> df_a2 = pl.DataFrame({"id": [2, 3], "y": [5, 6]})
>>> df_a3 = pl.DataFrame({"id": [1, 3], "z": [7, 8]})
>>> pl.union([df_a1, df_a2, df_a3], how="align")  # equivalent to "align_full"
shape: (3, 4)
┌─────┬──────┬──────┬──────┐
│ id  ┆ x    ┆ y    ┆ z    │
│ --- ┆ ---  ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  ┆ i64  │
╞═════╪══════╪══════╪══════╡
│ 1   ┆ 3    ┆ null ┆ 7    │
│ 2   ┆ 4    ┆ 5    ┆ null │
│ 3   ┆ null ┆ 6    ┆ 8    │
└─────┴──────┴──────┴──────┘
>>> pl.union([df_a1, df_a2, df_a3], how="align_left")
shape: (2, 4)
┌─────┬─────┬──────┬──────┐
│ id  ┆ x   ┆ y    ┆ z    │
│ --- ┆ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64 ┆ i64  ┆ i64  │
╞═════╪═════╪══════╪══════╡
│ 1   ┆ 3   ┆ null ┆ 7    │
│ 2   ┆ 4   ┆ 5    ┆ null │
└─────┴─────┴──────┴──────┘
>>> pl.union([df_a1, df_a2, df_a3], how="align_right")
shape: (2, 4)
┌─────┬──────┬──────┬─────┐
│ id  ┆ x    ┆ y    ┆ z   │
│ --- ┆ ---  ┆ ---  ┆ --- │
│ i64 ┆ i64  ┆ i64  ┆ i64 │
╞═════╪══════╪══════╪═════╡
│ 1   ┆ null ┆ null ┆ 7   │
│ 3   ┆ null ┆ 6    ┆ 8   │
└─────┴──────┴──────┴─────┘
>>> pl.union([df_a1, df_a2, df_a3], how="align_inner")
shape: (0, 4)
┌─────┬─────┬─────┬─────┐
│ id  ┆ x   ┆ y   ┆ z   │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╡
└─────┴─────┴─────┴─────┘