Skip to content

Excel

Polars can read and write to Excel files from Python. From a performance perspective, we recommend using other formats if possible, such as Parquet or CSV files.

Read

Polars does not have a native Excel reader. Instead, it uses an external library called an "engine" to parse Excel files into a form that Polars can parse. The available engines are:

  • fastexcel: This engine is based on the Rust calamine crate and is (by far) the fastest reader.
  • xlsx2csv: This reader parses the .xlsx file to an in-memory CSV that Polars then reads with its own CSV reader.
  • openpyxl: Typically slower than xls2csv, but can provide more flexibility for files that are difficult to parse.

We recommend working with the default fastexcel engine. The xlsx2csv and openpyxl engines are slower but may have more features for parsing tricky data. These engines may be helpful if the fastexcel reader does not work for a specific Excel file.

To use one of these engines, the appropriate Python package must be installed as an additional dependency.

$ pip install fastexcel xlsx2csv openpyxl 

The default engine for reading .xslx files is fastexcel. This engine uses the Rust calamine crate to read .xslx files into an Apache Arrow in-memory representation that Polars can read without needing to copy the data.

read_excel

df = pl.read_excel("docs/assets/data/path.xlsx")

We can specify the sheet name to read with the sheet_name argument. If we do not specify a sheet name, the first sheet will be read.

read_excel

df = pl.read_excel("docs/assets/data/path.xlsx", sheet_name="Sales")

Write

We need the xlswriter library installed as an additional dependency to write to Excel files.

$ pip install xlsxwriter

Writing to Excel files is not currently available in Rust Polars, though it is possible to use this crate to write to Excel files from Rust.

Writing a DataFrame to an Excel file is done with the write_excel method:

write_excel

df = pl.DataFrame({"foo": [1, 2, 3], "bar": [None, "bak", "baz"]})
df.write_excel("docs/assets/data/path.xlsx")

The name of the worksheet can be specified with the worksheet argument.

write_excel

df = pl.DataFrame({"foo": [1, 2, 3], "bar": [None, "bak", "baz"]})
df.write_excel("docs/assets/data/path.xlsx", worksheet="Sales")

Polars can create rich Excel files with multiple sheets and formatting. For more details, see the API docs for write_excel.