polars.concat

polars.concat(
items: Iterable[PolarsType],
*,
how: ConcatMethod = 'vertical',
rechunk: bool = False,
parallel: bool = True,
) → PolarsType

Combine multiple DataFrames, LazyFrames, or Series into a single object.

Parameters:
items

DataFrames, LazyFrames, or Series to concatenate.

how {‘vertical’, ‘vertical_relaxed’, ‘diagonal’, ‘diagonal_relaxed’, ‘horizontal’, ‘align’, ‘align_full’, ‘align_inner’, ‘align_left’, ‘align_right’}

Note that Series only support the vertical strategy.

  • vertical: Applies multiple vstack operations.

  • vertical_relaxed: Same as vertical, but additionally coerces columns to their common supertype if they are mismatched (e.g. Int32 → Int64).

  • diagonal: Finds a union between the column schemas and fills missing column values with null.

  • diagonal_relaxed: Same as diagonal, but additionally coerces columns to their common supertype if they are mismatched (e.g. Int32 → Int64).

  • horizontal: Stacks the columns of the DataFrames side by side and fills with null if the frame heights don’t match.

  • align, align_full, align_inner, align_left, align_right: Combines frames horizontally, auto-determining the common key columns and aligning rows using the same logic as align_frames (note that “align” is an alias for “align_full”). The suffix of the strategy name determines the type of join used to align the frames, equivalent to the “how” parameter on align_frames. The common join columns are automatically coalesced, but any other column collision will raise an error (if you need more control over this, use a suitable join method directly).

rechunk

Make sure that the result data is in contiguous memory.

parallel

Only relevant for LazyFrames. This determines if the concatenated lazy computations may be executed in parallel.

Examples

>>> df1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df2 = pl.DataFrame({"a": [2], "b": [4]})
>>> pl.concat([df1, df2])  # default is 'vertical' strategy
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 3   │
│ 2   ┆ 4   │
└─────┴─────┘
>>> df1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df2 = pl.DataFrame({"a": [2.5], "b": [4]})
>>> pl.concat([df1, df2], how="vertical_relaxed")  # 'a' coerced into f64
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ f64 ┆ i64 │
╞═════╪═════╡
│ 1.0 ┆ 3   │
│ 2.5 ┆ 4   │
└─────┴─────┘
>>> df_h1 = pl.DataFrame({"l1": [1, 2], "l2": [3, 4]})
>>> df_h2 = pl.DataFrame({"r1": [5, 6], "r2": [7, 8], "r3": [9, 10]})
>>> pl.concat([df_h1, df_h2], how="horizontal")
shape: (2, 5)
┌─────┬─────┬─────┬─────┬─────┐
│ l1  ┆ l2  ┆ r1  ┆ r2  ┆ r3  │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╪═════╡
│ 1   ┆ 3   ┆ 5   ┆ 7   ┆ 9   │
│ 2   ┆ 4   ┆ 6   ┆ 8   ┆ 10  │
└─────┴─────┴─────┴─────┴─────┘

The “diagonal” strategy allows for some frames to have missing columns, the values for which are filled with null:

>>> df_d1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df_d2 = pl.DataFrame({"a": [2], "c": [4]})
>>> pl.concat([df_d1, df_d2], how="diagonal")
shape: (2, 3)
┌─────┬──────┬──────┐
│ a   ┆ b    ┆ c    │
│ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  │
╞═════╪══════╪══════╡
│ 1   ┆ 3    ┆ null │
│ 2   ┆ null ┆ 4    │
└─────┴──────┴──────┘

The “align” strategies require at least one common column to align on:

>>> df_a1 = pl.DataFrame({"id": [1, 2], "x": [3, 4]})
>>> df_a2 = pl.DataFrame({"id": [2, 3], "y": [5, 6]})
>>> df_a3 = pl.DataFrame({"id": [1, 3], "z": [7, 8]})
>>> pl.concat([df_a1, df_a2, df_a3], how="align")  # equivalent to "align_full"
shape: (3, 4)
┌─────┬──────┬──────┬──────┐
│ id  ┆ x    ┆ y    ┆ z    │
│ --- ┆ ---  ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  ┆ i64  │
╞═════╪══════╪══════╪══════╡
│ 1   ┆ 3    ┆ null ┆ 7    │
│ 2   ┆ 4    ┆ 5    ┆ null │
│ 3   ┆ null ┆ 6    ┆ 8    │
└─────┴──────┴──────┴──────┘
>>> pl.concat([df_a1, df_a2, df_a3], how="align_left")
shape: (2, 4)
┌─────┬─────┬──────┬──────┐
│ id  ┆ x   ┆ y    ┆ z    │
│ --- ┆ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64 ┆ i64  ┆ i64  │
╞═════╪═════╪══════╪══════╡
│ 1   ┆ 3   ┆ null ┆ 7    │
│ 2   ┆ 4   ┆ 5    ┆ null │
└─────┴─────┴──────┴──────┘
>>> pl.concat([df_a1, df_a2, df_a3], how="align_right")
shape: (2, 4)
┌─────┬──────┬──────┬─────┐
│ id  ┆ x    ┆ y    ┆ z   │
│ --- ┆ ---  ┆ ---  ┆ --- │
│ i64 ┆ i64  ┆ i64  ┆ i64 │
╞═════╪══════╪══════╪═════╡
│ 1   ┆ null ┆ null ┆ 7   │
│ 3   ┆ null ┆ 6    ┆ 8   │
└─────┴──────┴──────┴─────┘
>>> pl.concat([df_a1, df_a2, df_a3], how="align_inner")
shape: (0, 4)
┌─────┬─────┬─────┬─────┐
│ id  ┆ x   ┆ y   ┆ z   │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╡
└─────┴─────┴─────┴─────┘