polars.DataFrame.cast

DataFrame.cast(
dtypes: Mapping[ColumnNameOrSelector | PolarsDataType, PolarsDataType | PythonDataType] | PolarsDataType,
*,
strict: bool = True,
) -> DataFrame

Cast DataFrame column(s) to the specified dtype(s).

Parameters:
dtypes

Mapping of column names (or selectors, or source dtypes) to target dtypes, or a single dtype to which all columns will be cast.

strict

If True (the default), raise an error if a cast is invalid on rows after predicates are pushed down. If False, invalid casts will produce null values.

Examples

>>> from datetime import date
>>> df = pl.DataFrame(
...     {
...         "foo": [1, 2, 3],
...         "bar": [6.0, 7.0, 8.0],
...         "ham": [date(2020, 1, 2), date(2021, 3, 4), date(2022, 5, 6)],
...     }
... )

Cast specific frame columns to the specified dtypes:

>>> df.cast({"foo": pl.Float32, "bar": pl.UInt8})
shape: (3, 3)
┌─────┬─────┬────────────┐
│ foo ┆ bar ┆ ham        │
│ --- ┆ --- ┆ ---        │
│ f32 ┆ u8  ┆ date       │
╞═════╪═════╪════════════╡
│ 1.0 ┆ 6   ┆ 2020-01-02 │
│ 2.0 ┆ 7   ┆ 2021-03-04 │
│ 3.0 ┆ 8   ┆ 2022-05-06 │
└─────┴─────┴────────────┘

Cast all frame columns matching one dtype (or dtype group) to another dtype:

>>> df.cast({pl.Date: pl.Datetime})
shape: (3, 3)
┌─────┬─────┬─────────────────────┐
│ foo ┆ bar ┆ ham                 │
│ --- ┆ --- ┆ ---                 │
│ i64 ┆ f64 ┆ datetime[μs]        │
╞═════╪═════╪═════════════════════╡
│ 1   ┆ 6.0 ┆ 2020-01-02 00:00:00 │
│ 2   ┆ 7.0 ┆ 2021-03-04 00:00:00 │
│ 3   ┆ 8.0 ┆ 2022-05-06 00:00:00 │
└─────┴─────┴─────────────────────┘

Use selectors to define the columns being cast:

>>> import polars.selectors as cs
>>> df.cast({cs.numeric(): pl.UInt32, cs.temporal(): pl.String})
shape: (3, 3)
┌─────┬─────┬────────────┐
│ foo ┆ bar ┆ ham        │
│ --- ┆ --- ┆ ---        │
│ u32 ┆ u32 ┆ str        │
╞═════╪═════╪════════════╡
│ 1   ┆ 6   ┆ 2020-01-02 │
│ 2   ┆ 7   ┆ 2021-03-04 │
│ 3   ┆ 8   ┆ 2022-05-06 │
└─────┴─────┴────────────┘

Cast all frame columns to the specified dtype:

>>> df.cast(pl.String).to_dict(as_series=False)
{'foo': ['1', '2', '3'],
 'bar': ['6.0', '7.0', '8.0'],
 'ham': ['2020-01-02', '2021-03-04', '2022-05-06']}