polars.read_excel#
- polars.read_excel(
- source: str | BytesIO | Path | BinaryIO | bytes,
- *,
- sheet_id: None = None,
- sheet_name: str,
- xlsx2csv_options: dict[str, Any] | None = None,
- read_csv_options: dict[str, Any] | None = None,
- raise_if_empty: bool = True,
- polars.read_excel(
- source: str | BytesIO | Path | BinaryIO | bytes,
- *,
- sheet_id: None = None,
- sheet_name: None = None,
- xlsx2csv_options: dict[str, Any] | None = None,
- read_csv_options: dict[str, Any] | None = None,
- raise_if_empty: bool = True,
- polars.read_excel(
- source: str | BytesIO | Path | BinaryIO | bytes,
- *,
- sheet_id: int,
- sheet_name: str,
- xlsx2csv_options: dict[str, Any] | None = None,
- read_csv_options: dict[str, Any] | None = None,
- raise_if_empty: bool = True,
- polars.read_excel(
- source: str | BytesIO | Path | BinaryIO | bytes,
- *,
- sheet_id: Literal[0],
- sheet_name: None = None,
- xlsx2csv_options: dict[str, Any] | None = None,
- read_csv_options: dict[str, Any] | None = None,
- raise_if_empty: bool = True,
- polars.read_excel(
- source: str | BytesIO | Path | BinaryIO | bytes,
- *,
- sheet_id: int,
- sheet_name: None = None,
- xlsx2csv_options: dict[str, Any] | None = None,
- read_csv_options: dict[str, Any] | None = None,
- raise_if_empty: bool = True,
Read Excel (XLSX) sheet into a DataFrame.
Converts an Excel sheet with
xlsx2csv.Xlsx2csv().convert()
to CSV and parses the CSV output withread_csv()
.- Parameters:
- source
Path to a file or a file-like object (by file-like object, we refer to objects that have a
read()
method, such as a file handler (e.g. via builtinopen
function) orBytesIO
).- sheet_id
Sheet number to convert (set
0
to load all sheets as DataFrames) and return a{sheetname:frame,}
dict. (Defaults to 1 if neither this nor sheet_name are specified).- sheet_name
Sheet name to convert; cannot be used in conjunction with sheet_id.
- xlsx2csv_options
Extra options passed to
xlsx2csv.Xlsx2csv()
, e.g.{"skip_empty_lines": True}
- read_csv_options
Extra options passed to
read_csv()
for parsing the CSV file returned byxlsx2csv.Xlsx2csv().convert()
e.g.:{"has_header": False, "new_columns": ["a", "b", "c"], "infer_schema_length": None}
- raise_if_empty
When there is no data in the sheet,``NoDataError`` is raised. If this parameter is set to False, an empty DataFrame (with no columns) is returned instead.
- Returns:
- DataFrame, or a sheetname to DataFrame dict when
sheet_id == 0
.
- DataFrame, or a sheetname to DataFrame dict when
Examples
Read the “data” worksheet from an Excel file into a DataFrame.
>>> pl.read_excel( ... source="test.xlsx", ... sheet_name="data", ... )
Read sheet 3 from Excel sheet file to a DataFrame while skipping empty lines in the sheet. As sheet 3 does not have header row, pass the needed settings to
read_csv()
.>>> pl.read_excel( ... source="test.xlsx", ... sheet_id=3, ... xlsx2csv_options={"skip_empty_lines": True}, ... read_csv_options={"has_header": False, "new_columns": ["a", "b", "c"]}, ... )
If the correct datatypes can’t be determined by polars, look at the
read_csv()
documentation to see which options you can pass to fix this issue. For example"infer_schema_length": None
can be used to read the data twice, once to infer the correct output types and once to convert the input to the correct types. When “infer_schema_length”: 1000`, only the first 1000 lines are read twice.>>> pl.read_excel( ... source="test.xlsx", ... read_csv_options={"infer_schema_length": None}, ... )
If
read_excel()
does not work or you need to read other types of spreadsheet files, you can try pandaspd.read_excel()
(supports xls, xlsx, xlsm, xlsb, odf, ods and odt).>>> pl.from_pandas(pd.read_excel("test.xlsx"))