Setting up a Compute context
The compute context is the abstraction of the hardware on which a query executes. It can be a single node or, for distributed execution, multiple nodes. This section covers how to set up your compute context.
import polars_cloud as pc

ctx = pc.ComputeContext(
    workspace="your-workspace",
    instance_type="t2.micro",
    cluster_size=2,
    labels=["docs"],
)
Setting the context
There are three ways to define the compute context:
- Use your workspace default
- Define CPUs and RAM
- Set instance type
Workspace default
In the Polars Cloud dashboard you can set default requirements from your cloud service provider to be used for all queries. You can also manually define the storage and the default cluster size to run your queries on.
Polars Cloud will use these defaults if no other parameters are passed to the ComputeContext.
ctx = pc.ComputeContext(workspace="your-workspace")
Find out more about how to set workspace defaults in the workspace settings section.
Define CPUs and RAM
You can directly specify the cpus and memory in your ComputeContext. When set, Polars Cloud will match your requirements and pick the most suitable and efficient instance_type from your cloud service provider. The requirements are lower bounds, meaning the machine will have at least that number of CPUs and that amount of memory.
ctx = pc.ComputeContext(
    workspace="your-workspace",
    memory=8,
    cpus=2,
)
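To make the lower-bound behaviour concrete, here is a minimal sketch of how such matching could work. It is not Polars Cloud's actual selection logic: the InstanceType class, the CATALOG entries, the relative cost figures, and the pick_instance helper are all hypothetical, and memory is assumed to be expressed in gigabytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpus: int
    memory_gb: int
    relative_cost: float  # made-up figure, used only to rank candidates


# Hypothetical catalog: names and numbers are illustrative, not real offerings.
CATALOG = [
    InstanceType("small", cpus=1, memory_gb=2, relative_cost=1.0),
    InstanceType("medium", cpus=2, memory_gb=8, relative_cost=4.0),
    InstanceType("large", cpus=4, memory_gb=16, relative_cost=8.0),
    InstanceType("xlarge", cpus=8, memory_gb=32, relative_cost=16.0),
]


def pick_instance(cpus: int, memory: int, catalog=CATALOG) -> InstanceType:
    """Return the cheapest instance with at least `cpus` CPUs and `memory` GB of RAM."""
    # Both requirements are lower bounds, so larger machines also qualify.
    candidates = [i for i in catalog if i.cpus >= cpus and i.memory_gb >= memory]
    if not candidates:
        raise ValueError(f"no instance offers at least {cpus} CPUs and {memory} GB")
    return min(candidates, key=lambda i: i.relative_cost)


print(pick_instance(cpus=2, memory=8).name)  # requirements met exactly
print(pick_instance(cpus=3, memory=4).name)  # no exact fit, so a larger machine is chosen
```

In this toy catalog a request for 3 CPUs and 4 GB lands on a 4-CPU, 16 GB machine, which is why the resources you actually receive can exceed what you asked for.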
Set instance type
Another option is to define the specific instance type for Polars Cloud to use. This can be helpful if you want to use a specific instance type in your production environment.
ctx = pc.ComputeContext(
    workspace="your-workspace",
    instance_type="t2.micro",
    cluster_size=2,
)
Applying the compute context
Once the compute context is defined, you'll need to provide it to the query. This can be done in two ways:
- remote(ctx): By directly passing the context to the remote query.
- pc.set_compute_context(ctx): By globally setting the compute context. This way you set it once and don't need to provide it to every remote call.
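The difference between the two options can be illustrated with a small, self-contained toy model. It does not call Polars Cloud: ToyContext, set_global_context, and resolve_context are hypothetical stand-ins, and the rule that an explicitly passed context takes precedence over the global one is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyContext:
    """Stand-in for a compute context; not the real pc.ComputeContext."""

    workspace: str
    cluster_size: int = 1


# Module-level slot that plays the role of the globally set context.
_global_ctx: Optional[ToyContext] = None


def set_global_context(ctx: ToyContext) -> None:
    """Store a context once so later calls do not need to pass one."""
    global _global_ctx
    _global_ctx = ctx


def resolve_context(ctx: Optional[ToyContext] = None) -> ToyContext:
    """Use the explicitly passed context if given, else fall back to the global one."""
    if ctx is not None:
        return ctx
    if _global_ctx is None:
        raise RuntimeError("no compute context passed and none set globally")
    return _global_ctx


default_ctx = ToyContext(workspace="your-workspace")
big_ctx = ToyContext(workspace="your-workspace", cluster_size=2)

set_global_context(default_ctx)
print(resolve_context().cluster_size)         # falls back to the global context
print(resolve_context(big_ctx).cluster_size)  # uses the context passed in this call
```

Setting the context globally keeps scripts that run many queries on the same hardware short, while passing it per call makes it obvious which hardware a particular query uses.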