polars.concat
polars.concat(
    items: Iterable[PolarsType],
    *,
    how: ConcatMethod = 'vertical',
    rechunk: bool = False,
    parallel: bool = True,
)
Combine multiple DataFrames, LazyFrames, or Series into a single object.
- Parameters:
- items
DataFrames, LazyFrames, or Series to concatenate.
- how{‘vertical’, ‘vertical_relaxed’, ‘diagonal’, ‘diagonal_relaxed’, ‘horizontal’, ‘align’, ‘align_full’, ‘align_inner’, ‘align_left’, ‘align_right’}
Note that Series only support the vertical strategy.
  - vertical: Applies multiple vstack operations.
  - vertical_relaxed: Same as vertical, but additionally coerces columns to their common supertype if they are mismatched (e.g. Int32 → Int64).
  - diagonal: Finds a union between the column schemas and fills missing column values with null.
  - diagonal_relaxed: Same as diagonal, but additionally coerces columns to their common supertype if they are mismatched (e.g. Int32 → Int64).
  - horizontal: Stacks Series from DataFrames horizontally and fills with null if the lengths don’t match.
  - align, align_full, align_inner, align_left, align_right: Combines frames horizontally, auto-determining the common key columns and aligning rows using the same logic as align_frames (note that “align” is an alias for “align_full”). The chosen “align” strategy determines the type of join used to align the frames, equivalent to the “how” parameter on align_frames. The common join columns are automatically coalesced, but other column collisions will raise an error (if you need more control over this, use a suitable join method directly).
- rechunk
Make sure that the result data is in contiguous memory.
- parallel
Only relevant for LazyFrames. This determines if the concatenated lazy computations may be executed in parallel.
Examples
>>> df1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df2 = pl.DataFrame({"a": [2], "b": [4]})
>>> pl.concat([df1, df2])  # default is 'vertical' strategy
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 3   │
│ 2   ┆ 4   │
└─────┴─────┘
>>> df1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df2 = pl.DataFrame({"a": [2.5], "b": [4]})
>>> pl.concat([df1, df2], how="vertical_relaxed")  # 'a' coerced into f64
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ f64 ┆ i64 │
╞═════╪═════╡
│ 1.0 ┆ 3   │
│ 2.5 ┆ 4   │
└─────┴─────┘
>>> df_h1 = pl.DataFrame({"l1": [1, 2], "l2": [3, 4]})
>>> df_h2 = pl.DataFrame({"r1": [5, 6], "r2": [7, 8], "r3": [9, 10]})
>>> pl.concat([df_h1, df_h2], how="horizontal")
shape: (2, 5)
┌─────┬─────┬─────┬─────┬─────┐
│ l1  ┆ l2  ┆ r1  ┆ r2  ┆ r3  │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╪═════╡
│ 1   ┆ 3   ┆ 5   ┆ 7   ┆ 9   │
│ 2   ┆ 4   ┆ 6   ┆ 8   ┆ 10  │
└─────┴─────┴─────┴─────┴─────┘
The “diagonal” strategy allows for some frames to have missing columns, the values for which are filled with null:
>>> df_d1 = pl.DataFrame({"a": [1], "b": [3]})
>>> df_d2 = pl.DataFrame({"a": [2], "c": [4]})
>>> pl.concat([df_d1, df_d2], how="diagonal")
shape: (2, 3)
┌─────┬──────┬──────┐
│ a   ┆ b    ┆ c    │
│ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  │
╞═════╪══════╪══════╡
│ 1   ┆ 3    ┆ null │
│ 2   ┆ null ┆ 4    │
└─────┴──────┴──────┘
The “align” strategies require at least one common column to align on:
>>> df_a1 = pl.DataFrame({"id": [1, 2], "x": [3, 4]})
>>> df_a2 = pl.DataFrame({"id": [2, 3], "y": [5, 6]})
>>> df_a3 = pl.DataFrame({"id": [1, 3], "z": [7, 8]})
>>> pl.concat([df_a1, df_a2, df_a3], how="align")  # equivalent to "align_full"
shape: (3, 4)
┌─────┬──────┬──────┬──────┐
│ id  ┆ x    ┆ y    ┆ z    │
│ --- ┆ ---  ┆ ---  ┆ ---  │
│ i64 ┆ i64  ┆ i64  ┆ i64  │
╞═════╪══════╪══════╪══════╡
│ 1   ┆ 3    ┆ null ┆ 7    │
│ 2   ┆ 4    ┆ 5    ┆ null │
│ 3   ┆ null ┆ 6    ┆ 8    │
└─────┴──────┴──────┴──────┘
>>> pl.concat([df_a1, df_a2, df_a3], how="align_left")
shape: (2, 4)
┌─────┬─────┬──────┬──────┐
│ id  ┆ x   ┆ y    ┆ z    │
│ --- ┆ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64 ┆ i64  ┆ i64  │
╞═════╪═════╪══════╪══════╡
│ 1   ┆ 3   ┆ null ┆ 7    │
│ 2   ┆ 4   ┆ 5    ┆ null │
└─────┴─────┴──────┴──────┘
>>> pl.concat([df_a1, df_a2, df_a3], how="align_right")
shape: (2, 4)
┌─────┬──────┬──────┬─────┐
│ id  ┆ x    ┆ y    ┆ z   │
│ --- ┆ ---  ┆ ---  ┆ --- │
│ i64 ┆ i64  ┆ i64  ┆ i64 │
╞═════╪══════╪══════╪═════╡
│ 1   ┆ null ┆ null ┆ 7   │
│ 3   ┆ null ┆ 6    ┆ 8   │
└─────┴──────┴──────┴─────┘
>>> pl.concat([df_a1, df_a2, df_a3], how="align_inner")
shape: (0, 4)
┌─────┬─────┬─────┬─────┐
│ id  ┆ x   ┆ y   ┆ z   │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╡
└─────┴─────┴─────┴─────┘