Resource limits
Polars always attempts to use all available CPU resources and consumes memory as needed. If possible, assign dedicated physical cores to Polars to avoid contention with other processes. Polars on-premises workers consist of a main process and an executor process; the executor performs most of the computation. If the executor dies, all progress on the stage it was working on is lost. Progress from previous stages (i.e. shuffle data) is managed by the main process.
In other words, if the system is low on memory, the first process that should be killed is the
executor process. Polars on-premises already configures oom_score_adj on its executor process
automatically.
If other system-critical processes run on the same machine, we recommend either delegating a cgroup to Polars on-premises or manually configuring cgroup limits for the entire Polars on-premises service.
Delegating a cgroup to Polars on-premises
Cgroups can contain subgroups, each with independent limits. Polars on-premises can create these
subgroups and choose appropriate memory limits for each of its components. To use this feature,
ensure you delegate cgroups to the Polars on-premises process and configure memory_limit in the
configuration file.
cluster_id = "polars-cluster"
instance_id = "node-0"
license = "/etc/polars/license.json"
memory_limit = 10737418240 # 10 GiB
# ...
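If Polars on-premises runs as a systemd service, cgroup delegation can be enabled with a drop-in on the service unit. The following is a minimal sketch; the unit name polars-onprem.service and the drop-in path are assumptions, so substitute the name of your own service:

# /etc/systemd/system/polars-onprem.service.d/delegate.conf
# Hand the service its own cgroup subtree so Polars on-premises can create
# subgroups and set per-component memory limits. (Assumed unit name.)
[Service]
Delegate=yes

After adding the drop-in, reload systemd and restart the service for the delegation to take effect.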
Manually configuring cgroup limits
You can also manually configure a memory limit on a cgroup containing all of the processes, for
example using systemd's resource-control settings. The disadvantage of this approach is that the
individual components contend for the same memory capacity, which may prevent Polars on-premises
from gracefully handling out-of-memory (OOM) errors on the executor.