polars.read_excel#

polars.read_excel(
source: str | BytesIO | Path | BinaryIO | bytes,
*,
sheet_id: None = None,
sheet_name: str,
xlsx2csv_options: dict[str, Any] | None = None,
read_csv_options: dict[str, Any] | None = None,
raise_if_empty: bool = True,
) DataFrame[source]#
polars.read_excel(
source: str | BytesIO | Path | BinaryIO | bytes,
*,
sheet_id: None = None,
sheet_name: None = None,
xlsx2csv_options: dict[str, Any] | None = None,
read_csv_options: dict[str, Any] | None = None,
raise_if_empty: bool = True,
) DataFrame
polars.read_excel(
source: str | BytesIO | Path | BinaryIO | bytes,
*,
sheet_id: int,
sheet_name: str,
xlsx2csv_options: dict[str, Any] | None = None,
read_csv_options: dict[str, Any] | None = None,
raise_if_empty: bool = True,
) NoReturn
polars.read_excel(
source: str | BytesIO | Path | BinaryIO | bytes,
*,
sheet_id: Literal[0],
sheet_name: None = None,
xlsx2csv_options: dict[str, Any] | None = None,
read_csv_options: dict[str, Any] | None = None,
raise_if_empty: bool = True,
) dict[str, polars.dataframe.frame.DataFrame]
polars.read_excel(
source: str | BytesIO | Path | BinaryIO | bytes,
*,
sheet_id: int,
sheet_name: None = None,
xlsx2csv_options: dict[str, Any] | None = None,
read_csv_options: dict[str, Any] | None = None,
raise_if_empty: bool = True,
) DataFrame

Read Excel (XLSX) sheet into a DataFrame.

Converts an Excel sheet with xlsx2csv.Xlsx2csv().convert() to CSV and parses the CSV output with read_csv().

Parameters:
source

Path to a file or a file-like object (by file-like object, we refer to objects that have a read() method, such as a file handler (e.g. via builtin open function) or BytesIO).

sheet_id

Sheet number to convert (set 0 to load all sheets as DataFrames) and return a {sheetname:frame,} dict. (Defaults to 1 if neither this nor sheet_name are specified).

sheet_name

Sheet name to convert; cannot be used in conjunction with sheet_id.

xlsx2csv_options

Extra options passed to xlsx2csv.Xlsx2csv(), e.g. {"skip_empty_lines": True}

read_csv_options

Extra options passed to read_csv() for parsing the CSV file returned by xlsx2csv.Xlsx2csv().convert() e.g.: {"has_header": False, "new_columns": ["a", "b", "c"], "infer_schema_length": None}

raise_if_empty

When there is no data in the sheet,``NoDataError`` is raised. If this parameter is set to False, an empty DataFrame (with no columns) is returned instead.

Returns:
DataFrame, or a sheetname to DataFrame dict when sheet_id == 0.

Examples

Read the “data” worksheet from an Excel file into a DataFrame.

>>> pl.read_excel(
...     source="test.xlsx",
...     sheet_name="data",
... )  

Read sheet 3 from Excel sheet file to a DataFrame while skipping empty lines in the sheet. As sheet 3 does not have header row, pass the needed settings to read_csv().

>>> pl.read_excel(
...     source="test.xlsx",
...     sheet_id=3,
...     xlsx2csv_options={"skip_empty_lines": True},
...     read_csv_options={"has_header": False, "new_columns": ["a", "b", "c"]},
... )  

If the correct datatypes can’t be determined by polars, look at the read_csv() documentation to see which options you can pass to fix this issue. For example "infer_schema_length": None can be used to read the data twice, once to infer the correct output types and once to convert the input to the correct types. When “infer_schema_length”: 1000`, only the first 1000 lines are read twice.

>>> pl.read_excel(
...     source="test.xlsx",
...     read_csv_options={"infer_schema_length": None},
... )  

If read_excel() does not work or you need to read other types of spreadsheet files, you can try pandas pd.read_excel() (supports xls, xlsx, xlsm, xlsb, odf, ods and odt).

>>> pl.from_pandas(pd.read_excel("test.xlsx"))