
Amazon Elastic Kubernetes Service (EKS)

Initial configuration

This page assumes you have already set up a Polars cluster, either through the Polars Cloud onboarding or by following the getting started guide.

Data access using Pod Identity

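With EKS Pod Identity, your pods can securely access private S3 buckets without you having to manage access keys or other long-lived credentials. Pod Identity associates an IAM role with a Kubernetes service account, so point the scheduler and the workers at the service account you associated. See the guide in the official EKS documentation.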

helm upgrade --install polars polars-inc/polars \
  --set scheduler.serviceAccount.name=<YOUR_SERVICE_ACCOUNT_NAME> \
  --set worker.serviceAccount.name=<YOUR_SERVICE_ACCOUNT_NAME> \
# ...

Assuming you already have an S3 bucket set up (see quick-start here), you can then scan from or sink to the bucket directly.

import polars as pl

path = "s3://YOUR_S3_BUCKET_NAME/PATH/TO/DATA/"

q = (
    pl.scan_parquet(path)
    # ..
)
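
As a rough illustration of scanning from and sinking to the same bucket, the sketch below reads Parquet data lazily, applies a filter, and streams the result back to S3. The helper names `s3_uri` and `copy_positive_rows`, the `amount` column, and the output path are hypothetical and only serve the example. It runs the query with plain Polars; dispatching it to the cluster is not shown.

```python
def s3_uri(bucket: str, key: str) -> str:
    """Build an s3:// URI from a bucket name and an object key or prefix."""
    # Strip leading slashes from the key so the URI never contains "//" after the bucket.
    return f"s3://{bucket}/{key.lstrip('/')}"


def copy_positive_rows(source: str, sink: str) -> None:
    """Lazily scan Parquet data at `source`, filter it, and write the result to `sink`."""
    # Imported here so that s3_uri stays usable without Polars installed.
    import polars as pl

    (
        pl.scan_parquet(source)
        .filter(pl.col("amount") > 0)  # "amount" is a hypothetical column
        .sink_parquet(sink)  # executes the query and writes the output file
    )
```

For example, `copy_positive_rows(s3_uri("YOUR_S3_BUCKET_NAME", "PATH/TO/DATA/"), s3_uri("YOUR_S3_BUCKET_NAME", "PATH/TO/OUTPUT/result.parquet"))` would read from one prefix and write to another, assuming the role bound through Pod Identity has read and write permission on both.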

You can also use S3 as the location for anonymous results by setting the following values:

anonymousResults:
  s3:
    enabled: true
    endpoint: "s3://YOUR_S3_BUCKET_NAME/PATH/TO/DATA/"

To use S3 as the shuffle location, set the following values:

shuffleData:
  s3:
    enabled: true
    endpoint: "s3://YOUR_S3_BUCKET_NAME/PATH/TO/DATA/"
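
If you want to generate both overrides from a script, the following sketch builds them as a single dictionary and prints it as JSON. It assumes both keys can live in the same values file, and the function name `s3_values` and the `results/` and `shuffle/` prefixes are hypothetical. Because JSON is valid YAML, the output should be usable as a Helm values file passed with `-f`.

```python
import json


def s3_values(results_uri: str, shuffle_uri: str) -> dict:
    """Build Helm value overrides that enable S3 for anonymous results and shuffle data."""
    for uri in (results_uri, shuffle_uri):
        # Catch a missing scheme early rather than at deploy time.
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return {
        "anonymousResults": {"s3": {"enabled": True, "endpoint": results_uri}},
        "shuffleData": {"s3": {"enabled": True, "endpoint": shuffle_uri}},
    }


# Hypothetical prefixes; replace them with your own bucket layout.
values = s3_values(
    "s3://YOUR_S3_BUCKET_NAME/results/",
    "s3://YOUR_S3_BUCKET_NAME/shuffle/",
)
print(json.dumps(values, indent=2))
```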