polars.read_ipc_stream

polars.read_ipc_stream(
source: str | Path | IO[bytes] | bytes,
*,
columns: list[int] | list[str] | None = None,
n_rows: int | None = None,
use_pyarrow: bool = False,
storage_options: dict[str, Any] | None = None,
row_index_name: str | None = None,
row_index_offset: int = 0,
rechunk: bool = True,
) -> DataFrame

Read an Arrow IPC record batch stream into a DataFrame.

See “Streaming format” on https://arrow.apache.org/docs/python/ipc.html.

Parameters:
source

Path to a file or a file-like object. A “file-like object” is any object that has a read() method, such as a file handle returned by the builtin open function or a BytesIO instance. If fsspec is installed, it is used to open remote files. For file-like objects, the stream position may not be updated after reading.

columns

Columns to select. Accepts a list of column indices (starting at zero) or a list of column names.

n_rows

Stop reading from the IPC stream after n_rows rows have been read. Only valid when use_pyarrow=False.

use_pyarrow

If True, read with pyarrow; if False (default), use the native Rust reader.

storage_options

Extra options that make sense for fsspec.open() or a particular storage connection, e.g. host, port, username, password, etc.

row_index_name

Insert a row index column with the given name into the DataFrame as the first column. If set to None (default), no row index column is created.

row_index_offset

Start the row index at this offset. Cannot be negative. Only used if row_index_name is set.

rechunk

Make sure that all data is contiguous in memory.

Returns:
DataFrame