polars.Series.dt.truncate
- Series.dt.truncate(
- every: str | dt.timedelta,
- offset: str | dt.timedelta | None = None,
- *,
- use_earliest: bool | None = None,
Divide the date/datetime range into buckets.
Each date/datetime is mapped to the start of its bucket using the corresponding local datetime. Note that weekly buckets start on Monday.
- Parameters:
- every
Interval start and period length.
- offset
Offset the window
- use_earliest
Determine how to deal with ambiguous datetimes:
None (default): raise
True: use the earliest datetime
False: use the latest datetime
- Returns:
Notes
The every and offset arguments are created with the following string language:
1ns # 1 nanosecond
1us # 1 microsecond
1ms # 1 millisecond
1s # 1 second
1m # 1 minute
1h # 1 hour
1d # 1 calendar day
1w # 1 calendar week
1mo # 1 calendar month
1q # 1 calendar quarter
1y # 1 calendar year
These strings can be combined:
3d12h4m25s # 3 days, 12 hours, 4 minutes, and 25 seconds
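To make the string language concrete, here is a minimal sketch of a tokenizer for such strings, written with the Python standard library only. It is not the parser polars uses. The names parse_duration and to_timedelta are hypothetical, the "_saturating" suffix and negative values are not handled, and the conversion to timedelta assumes a day is exactly 24 hours.

```python
import re
from datetime import timedelta

# Longer unit names come first so "1mo" and "1ms" are not read as "1m".
_TOKEN = re.compile(r"(\d+)(ns|us|ms|mo|s|m|h|d|w|q|y)")

# Assumption: units that map to a fixed length when a day is exactly 24 hours.
_FIXED = {
    "us": timedelta(microseconds=1),
    "ms": timedelta(milliseconds=1),
    "s": timedelta(seconds=1),
    "m": timedelta(minutes=1),
    "h": timedelta(hours=1),
    "d": timedelta(days=1),
    "w": timedelta(weeks=1),
}


def parse_duration(text: str) -> list[tuple[int, str]]:
    """Split a string such as "3d12h4m25s" into (count, unit) pairs."""
    parts = []
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse {text!r} at position {pos}")
        parts.append((int(match.group(1)), match.group(2)))
        pos = match.end()
    if not parts:
        raise ValueError("empty duration string")
    return parts


def to_timedelta(text: str) -> timedelta:
    """Sum the fixed-length components; reject ns and calendar units."""
    total = timedelta(0)
    for count, unit in parse_duration(text):
        if unit not in _FIXED:
            raise ValueError(f"unit {unit!r} has no fixed timedelta here")
        total += count * _FIXED[unit]
    return total


print(parse_duration("3d12h4m25s"))  # [(3, 'd'), (12, 'h'), (4, 'm'), (25, 's')]
print(to_timedelta("3d12h4m25s"))    # 3 days, 12:04:25
```

Listing "mo" and "ms" before "m" in the pattern is what keeps "1mo" from being read as one minute followed by a stray "o".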
Suffix with “_saturating” to indicate that dates too large for their month should saturate at the largest date (e.g. 2022-02-29 -> 2022-02-28) instead of erroring.
By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight saving time). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”.
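As an illustration of calendar buckets, the following plain-Python sketch floors a naive datetime to the start of its calendar week (Monday, as stated above) and calendar month. The helpers truncate_week and truncate_month are hypothetical names used only for this sketch. It shows the idea, not how polars implements it, and it ignores time zones.

```python
from datetime import datetime, timedelta


def truncate_week(ts: datetime) -> datetime:
    """Floor to 00:00 on the Monday of the week containing ts."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    # datetime.weekday() returns 0 for Monday, so this steps back to Monday.
    return midnight - timedelta(days=midnight.weekday())


def truncate_month(ts: datetime) -> datetime:
    """Floor to 00:00 on the first day of the month containing ts."""
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


ts = datetime(2001, 1, 18, 13, 45)  # a Thursday
print(truncate_week(ts))   # 2001-01-15 00:00:00
print(truncate_month(ts))  # 2001-01-01 00:00:00
```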
Examples
>>> import polars as pl
>>> from datetime import timedelta, datetime
>>> start = datetime(2001, 1, 1)
>>> stop = datetime(2001, 1, 2)
>>> s = pl.date_range(start, stop, timedelta(minutes=165), eager=True)
>>> s
shape: (9,)
Series: 'date' [datetime[μs]]
[
    2001-01-01 00:00:00
    2001-01-01 02:45:00
    2001-01-01 05:30:00
    2001-01-01 08:15:00
    2001-01-01 11:00:00
    2001-01-01 13:45:00
    2001-01-01 16:30:00
    2001-01-01 19:15:00
    2001-01-01 22:00:00
]
>>> s.dt.truncate("1h")
shape: (9,)
Series: 'date' [datetime[μs]]
[
    2001-01-01 00:00:00
    2001-01-01 02:00:00
    2001-01-01 05:00:00
    2001-01-01 08:00:00
    2001-01-01 11:00:00
    2001-01-01 13:00:00
    2001-01-01 16:00:00
    2001-01-01 19:00:00
    2001-01-01 22:00:00
]
>>> s.dt.truncate("1h").series_equal(s.dt.truncate(timedelta(hours=1)))
True
>>> start = datetime(2001, 1, 1)
>>> stop = datetime(2001, 1, 1, 1)
>>> s = pl.date_range(start, stop, "10m", eager=True)
>>> s
shape: (7,)
Series: 'date' [datetime[μs]]
[
    2001-01-01 00:00:00
    2001-01-01 00:10:00
    2001-01-01 00:20:00
    2001-01-01 00:30:00
    2001-01-01 00:40:00
    2001-01-01 00:50:00
    2001-01-01 01:00:00
]
>>> s.dt.truncate("30m")
shape: (7,)
Series: 'date' [datetime[μs]]
[
    2001-01-01 00:00:00
    2001-01-01 00:00:00
    2001-01-01 00:00:00
    2001-01-01 00:30:00
    2001-01-01 00:30:00
    2001-01-01 00:30:00
    2001-01-01 01:00:00
]
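The bucketing in the 30-minute example can be reproduced with a small standard-library sketch. It assumes that fixed-length buckets are aligned to the Unix epoch and that the datetimes are naive. The function name truncate_fixed is hypothetical, and this is an illustration of the arithmetic rather than the polars implementation.

```python
from datetime import datetime, timedelta

# Assumption: buckets are aligned to the Unix epoch.
EPOCH = datetime(1970, 1, 1)


def truncate_fixed(ts: datetime, every: timedelta) -> datetime:
    """Map ts to the start of its fixed-length bucket."""
    elapsed = ts - EPOCH
    # Subtract how far ts sits past the most recent bucket boundary.
    return ts - (elapsed % every)


start = datetime(2001, 1, 1)
values = [start + i * timedelta(minutes=10) for i in range(7)]
for ts in values:
    print(ts, "->", truncate_fixed(ts, timedelta(minutes=30)))
```

Each of 00:00, 00:10, and 00:20 lands in the bucket starting at 00:00, the next three land in the bucket starting at 00:30, and 01:00 starts a new bucket, matching the output shown above.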
If crossing daylight saving time boundaries, you may want to use use_earliest and combine with dst_offset() and when():
>>> ser = pl.date_range(
...     datetime(2020, 10, 25, 0),
...     datetime(2020, 10, 25, 2),
...     "30m",
...     eager=True,
...     time_zone="Europe/London",
... ).dt.offset_by("15m")
>>> ser
shape: (7,)
Series: 'date' [datetime[μs, Europe/London]]
[
    2020-10-25 00:15:00 BST
    2020-10-25 00:45:00 BST
    2020-10-25 01:15:00 BST
    2020-10-25 01:45:00 BST
    2020-10-25 01:15:00 GMT
    2020-10-25 01:45:00 GMT
    2020-10-25 02:15:00 GMT
]
>>> pl.select(
...     pl.when(ser.dt.dst_offset() == pl.duration(hours=1))
...     .then(ser.dt.truncate("30m", use_earliest=True))
...     .otherwise(ser.dt.truncate("30m", use_earliest=False))
... )["date"]
shape: (7,)
Series: 'date' [datetime[μs, Europe/London]]
[
    2020-10-25 00:00:00 BST
    2020-10-25 00:30:00 BST
    2020-10-25 01:00:00 BST
    2020-10-25 01:30:00 BST
    2020-10-25 01:00:00 GMT
    2020-10-25 01:30:00 GMT
    2020-10-25 02:00:00 GMT
]