polars.read_ipc

Read into a DataFrame from an Arrow IPC (Feather v2) file.

See “File or Random Access format” on https://arrow.apache.org/docs/python/ipc.html. Arrow IPC files are also known as Feather (v2) files.

Changed in version 0.20.4:
* The row_count_name parameter was renamed row_index_name.
* The row_count_offset parameter was renamed row_index_offset.

Parameters:

source: Path to a file or a file-like object. By “file-like object” we mean an object that has a read() method, such as a file handle returned by the builtin open function or a BytesIO instance. If fsspec is installed, it may be used to open remote files. For file-like objects, the stream position may not be updated after reading.
columns: Columns to select. Accepts a list of column indices (starting at zero) or a list of column names.
n_rows: Stop reading from IPC file after reading n_rows. Only valid when use_pyarrow=False.
use_pyarrow: Use pyarrow instead of the native Rust reader.
memory_map: Try to memory map the file. This can greatly improve performance on repeated queries as the OS may cache pages. Only uncompressed IPC files can be memory mapped.
storage_options: Extra options that make sense for fsspec.open() or a particular storage connection, e.g. host, port, username, password, etc.
row_index_name: Insert a row index column with the given name into the DataFrame as the first column. If set to None (default), no row index column is created.
row_index_offset: Start the row index at this offset. Cannot be negative. Only used if row_index_name is set.
rechunk: Make sure that all data is contiguous.

Returns:

DataFrame

Warning

Calling read_ipc().lazy() is an antipattern, as it forces Polars to materialize the full IPC file, so no optimizations can be pushed into the reader. Always prefer scan_ipc if you want to work with LazyFrames.

If memory_map is set, the bytes on disk are mapped 1:1 to memory.

That means that:

Arrow data in the file is not validated, and invalid Arrow data results in undefined behavior (UB). Ensure the file is correct or set memory_map=False.
You cannot write to the same filename. For example, pl.read_ipc("my_file.arrow").write_ipc("my_file.arrow") will fail.