polars.Expr.str.strptime
Expr.str.strptime(
    dtype: PolarsTemporalType,
    format: str | None = None,
    *,
    strict: bool = True,
    exact: bool = True,
    cache: bool = True,
    utc: bool | None = None,
    use_earliest: bool | None = None,
)
Convert a Utf8 column into a Date/Datetime/Time column.
- Parameters:
- dtype
The data type to convert into. Can be either Date, Datetime, or Time.
- format
Format to use for conversion. Refer to the chrono crate documentation for the full specification. Example: "%Y-%m-%d %H:%M:%S". If set to None (default), the format is inferred from the data.
- strict
Raise an error if any conversion fails.
- exact
Require an exact format match. If False, allow the format to match anywhere in the target string. Conversion to the Time type is always exact.
Note
Using exact=False introduces a performance penalty; cleaning your data beforehand will almost certainly be more performant.
- cache
Use a cache of unique, converted dates to apply the datetime conversion.
- utc
Parse time zone aware datetimes as UTC. This may be useful if you have data with mixed offsets.
Deprecated since version 0.18.0: This is now a no-op; you can safely remove it. Offset-naive strings are parsed as pl.Datetime(time_unit), and offset-aware strings are converted to pl.Datetime(time_unit, "UTC").
- use_earliest
Determine how to deal with ambiguous datetimes:
- None (default): raise
- True: use the earliest datetime
- False: use the latest datetime
Notes
When converting to a Datetime type, the time unit is inferred from the format string if given, e.g. "%F %T%.3f" => Datetime("ms"). If no fractional second component is found, the default is "us".
Examples
Dealing with a consistent format:
>>> s = pl.Series(["2020-01-01 01:00Z", "2020-01-01 02:00Z"])
>>> s.str.strptime(pl.Datetime, "%Y-%m-%d %H:%M%#z")
shape: (2,)
Series: '' [datetime[μs, UTC]]
[
        2020-01-01 01:00:00 UTC
        2020-01-01 02:00:00 UTC
]
Dealing with different formats:
>>> s = pl.Series(
...     "date",
...     [
...         "2021-04-22",
...         "2022-01-04 00:00:00",
...         "01/31/22",
...         "Sun Jul 8 00:34:60 2001",
...     ],
... )
>>> s.to_frame().select(
...     pl.coalesce(
...         pl.col("date").str.strptime(pl.Date, "%F", strict=False),
...         pl.col("date").str.strptime(pl.Date, "%F %T", strict=False),
...         pl.col("date").str.strptime(pl.Date, "%D", strict=False),
...         pl.col("date").str.strptime(pl.Date, "%c", strict=False),
...     )
... ).to_series()
shape: (4,)
Series: 'date' [date]
[
        2021-04-22
        2022-01-04
        2022-01-31
        2001-07-08
]