polars.DataFrame.upsample
- DataFrame.upsample(
- time_column: str,
- *,
- every: str | timedelta,
- offset: str | timedelta | None = None,
- by: str | Sequence[str] | None = None,
- maintain_order: bool = False,
- )
Upsample a DataFrame at a regular frequency.
The every and offset arguments are created with the following string language:
1ns (1 nanosecond)
1us (1 microsecond)
1ms (1 millisecond)
1s (1 second)
1m (1 minute)
1h (1 hour)
1d (1 calendar day)
1w (1 calendar week)
1mo (1 calendar month)
1q (1 calendar quarter)
1y (1 calendar year)
1i (1 index count)
Or combine them:
“3d12h4m25s” # 3 days, 12 hours, 4 minutes, and 25 seconds
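To make the string language concrete, here is a minimal sketch of how such a string could be split into (count, unit) pairs. The helper name parse_duration is hypothetical and this is not Polars' own parser; it only handles the unsigned forms listed above.

```python
import re

# One token is a count followed by a unit suffix from the list above.
# Two-letter units come first so "mo" and "ms" are tried before "m".
_TOKEN = re.compile(r"(\d+)(ns|us|ms|mo|s|m|h|d|w|q|y|i)")


def parse_duration(text: str) -> list[tuple[int, str]]:
    """Split a duration string into (count, unit) pairs (illustrative helper)."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration string: {text!r}")
        pairs.append((int(match.group(1)), match.group(2)))
        pos = match.end()
    if not pairs:
        raise ValueError("empty duration string")
    return pairs


print(parse_duration("3d12h4m25s"))  # [(3, 'd'), (12, 'h'), (4, 'm'), (25, 's')]
print(parse_duration("1mo"))         # [(1, 'mo')]
```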
By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours later, due to daylight saving time). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”.
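The difference between a calendar day and 24 elapsed hours can be seen with the standard library alone. The sketch below does not use Polars; the time zone and date are chosen for illustration (clocks in Europe/Amsterdam moved forward on 2021-03-28), and it assumes time zone data is available on the system.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("Europe/Amsterdam")  # example zone, chosen for illustration

start = datetime(2021, 3, 27, 12, 0, tzinfo=tz)
# Adding one day to an aware datetime keeps the wall-clock time,
# which mirrors the "corresponding time on the next day" idea.
next_day = start + timedelta(days=1)

# Convert both to UTC before subtracting to measure real elapsed time.
elapsed = next_day.astimezone(timezone.utc) - start.astimezone(timezone.utc)

print(next_day.isoformat())  # 2021-03-28T12:00:00+02:00
print(elapsed)               # 23:00:00
```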
- Parameters:
- time_column
Time column used to determine a date range. This column must be sorted for the output to make sense.
- every
Interval between consecutive timestamps in the upsampled result; a new interval starts every ‘every’ duration.
- offset
Change the start of the date range by this offset.
- by
First group by these columns, then upsample within each group.
- maintain_order
Keep the ordering predictable. This is slower.
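As a rough mental model of how time_column, every, and offset interact, the following pure-Python sketch builds a regular grid between the first and last timestamps and fills gaps with None. It is an illustration under assumptions, not Polars' implementation: upsample_rows is a hypothetical helper, only fixed-length intervals are supported (no calendar units), and offset is read as a shift of the first grid point.

```python
from datetime import datetime, timedelta


def upsample_rows(rows, time_key, every, offset=timedelta(0)):
    """Illustrative only: place rows on a regular time grid.

    rows must already be sorted by time_key. Grid points without a
    matching input row get None for every other key.
    """
    if not rows:
        return []
    by_time = {row[time_key]: row for row in rows}
    other_keys = [key for key in rows[0] if key != time_key]
    current = rows[0][time_key] + offset  # assumed meaning of offset
    last = rows[-1][time_key]
    out = []
    while current <= last:
        if current in by_time:
            out.append(dict(by_time[current]))
        else:
            out.append({time_key: current, **{key: None for key in other_keys}})
        current += every
    return out


rows = [
    {"time": datetime(2021, 1, 1, 0), "value": 0},
    {"time": datetime(2021, 1, 1, 3), "value": 3},
]
result = upsample_rows(rows, "time", timedelta(hours=1))
print([row["value"] for row in result])  # [0, None, None, 3]
```

The None entries correspond to the nulls that a forward fill would replace afterwards.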
- Returns:
- DataFrame
Result will be sorted by time_column (but note that if by columns are passed, it will only be sorted within each by group).
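The ordering guarantee can be checked on the (time, groups) pairs from the output in the Examples section of this page. The sketch below uses plain Python rather than Polars, and is_sorted is a hypothetical helper.

```python
from datetime import datetime

# (time, group) pairs copied from the example output on this page.
rows = [
    (datetime(2021, 2, 1), "A"),
    (datetime(2021, 3, 1), "A"),
    (datetime(2021, 4, 1), "A"),
    (datetime(2021, 5, 1), "A"),
    (datetime(2021, 4, 1), "B"),
    (datetime(2021, 5, 1), "B"),
    (datetime(2021, 6, 1), "B"),
]


def is_sorted(values):
    """Return True if values are in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


times = [time for time, _ in rows]
print(is_sorted(times))  # False: 2021-05-01 (A) is followed by 2021-04-01 (B)

per_group = {}
for time, group in rows:
    per_group.setdefault(group, []).append(time)
print({group: is_sorted(ts) for group, ts in per_group.items()})  # {'A': True, 'B': True}
```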
Examples
Upsample a DataFrame by a certain interval.
>>> from datetime import datetime
>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "time": [
...             datetime(2021, 2, 1),
...             datetime(2021, 4, 1),
...             datetime(2021, 5, 1),
...             datetime(2021, 6, 1),
...         ],
...         "groups": ["A", "B", "A", "B"],
...         "values": [0, 1, 2, 3],
...     }
... ).set_sorted("time")
>>> df.upsample(
...     time_column="time", every="1mo", by="groups", maintain_order=True
... ).select(pl.all().forward_fill())
shape: (7, 3)
┌─────────────────────┬────────┬────────┐
│ time                ┆ groups ┆ values │
│ ---                 ┆ ---    ┆ ---    │
│ datetime[μs]        ┆ str    ┆ i64    │
╞═════════════════════╪════════╪════════╡
│ 2021-02-01 00:00:00 ┆ A      ┆ 0      │
│ 2021-03-01 00:00:00 ┆ A      ┆ 0      │
│ 2021-04-01 00:00:00 ┆ A      ┆ 0      │
│ 2021-05-01 00:00:00 ┆ A      ┆ 2      │
│ 2021-04-01 00:00:00 ┆ B      ┆ 1      │
│ 2021-05-01 00:00:00 ┆ B      ┆ 1      │
│ 2021-06-01 00:00:00 ┆ B      ┆ 3      │
└─────────────────────┴────────┴────────┘