Supported configurations

Overview

Poolside supports model inference deployments in GPU-backed Kubernetes environments. Use this page to review supported environments and size the infrastructure that serves Poolside models.

Poolside support covers the environments and minimum requirements documented on this page. For other Kubernetes environments, GPU configurations, storage backends, or security requirements, contact your Poolside account team to discuss support options.

Supported environments

Deployment	Description
Amazon EKS 1.29+	Amazon Elastic Kubernetes Service, with IAM Roles for Service Accounts (IRSA) and an Application Load Balancer
OpenShift 4.16+	Red Hat OpenShift environments
Upstream Kubernetes 1.29+	Self-managed Kubernetes environments such as RKE2 or Charmed Kubernetes
On-premises	Single-node Kubernetes (RKE2) on customer-provided or Poolside-provided hardware
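
As a minimal sketch, the version floors in the table above can be checked against a reported cluster version. The platform keys (`eks`, `openshift`, `kubernetes`) and the function names are hypothetical labels chosen for this example, not Poolside identifiers, and the check compares only major and minor versions.

```python
import re

# Minimum versions copied from the table above.
# The dictionary keys are illustrative labels, not Poolside identifiers.
MINIMUM_VERSIONS = {
    "eks": (1, 29),
    "openshift": (4, 16),
    "kubernetes": (1, 29),
}


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.29.3' or '4.16.0'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(platform: str, version: str) -> bool:
    """Return True if the version is at or above the documented floor."""
    return parse_major_minor(version) >= MINIMUM_VERSIONS[platform]
```

For example, `meets_minimum("openshift", "4.15.2")` is false because 4.15 is below the 4.16 floor. Passing this check does not by itself confirm support; the other requirements on this page still apply.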

Inference requirements

Poolside models have different minimum requirements. Use this table to size your inference nodes. For concurrent-agent capacity and developer-seat estimates, contact your Poolside account team.

Laguna is the recommended primary model family for new deployments.

Model	Quantization	Minimum GPU memory	Minimum CPU	Minimum host memory
Laguna M.1	FP8	384 GB	128 cores	1 TB
Laguna XS.2	FP8	96 GB	44 cores	512 GB
Malibu 2.2	FP8	192 GB	128 cores	1 TB
Malibu 2.2	INT4	96 GB	44 cores	512 GB
Point	FP8	96 GB	44 cores	512 GB
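
The table above can be encoded as data to check a candidate node against a model's minimums. The following is a sketch under stated assumptions: the `Minimum` class and `shortfalls` function are hypothetical names, and 1 TB is treated as 1000 GB, which this page does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Minimum:
    gpu_memory_gb: int
    cpu_cores: int
    host_memory_gb: int


# Values copied from the table above.
# Assumption: 1 TB of host memory is treated as 1000 GB.
MINIMUMS = {
    ("Laguna M.1", "FP8"): Minimum(384, 128, 1000),
    ("Laguna XS.2", "FP8"): Minimum(96, 44, 512),
    ("Malibu 2.2", "FP8"): Minimum(192, 128, 1000),
    ("Malibu 2.2", "INT4"): Minimum(96, 44, 512),
    ("Point", "FP8"): Minimum(96, 44, 512),
}


def shortfalls(model, quantization, gpu_memory_gb, cpu_cores, host_memory_gb):
    """List every resource on the node that falls below the model's minimum."""
    minimum = MINIMUMS[(model, quantization)]
    checks = [
        ("GPU memory (GB)", gpu_memory_gb, minimum.gpu_memory_gb),
        ("CPU cores", cpu_cores, minimum.cpu_cores),
        ("host memory (GB)", host_memory_gb, minimum.host_memory_gb),
    ]
    return [
        f"{name}: have {have}, need at least {need}"
        for name, have, need in checks
        if have < need
    ]
```

An empty list means the node meets the documented minimums for that model and quantization. It does not establish capacity for a particular number of concurrent agents or developer seats.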

For cloud deployments, storage requirements depend on whether S3-compatible storage is colocated on the inference node.

For on-premises deployments, see Storage requirements for guidance on sizing and configuring storage.

GPU memory reference

Poolside model inference targets the following NVIDIA GPU families: RTX 6000 Blackwell, H100, H200, B200, and B300. The table below lists memory per GPU so that you can compare it against the per-model minimum GPU memory in the Inference requirements table:

GPU	Memory per GPU
H100	80 GB
RTX 6000 Blackwell	96 GB
H200	141 GB
B200	192 GB
B300	288 GB

GPU suitability varies by model. The per-model minimum GPU memory is a floor, not a GPU selector: context length, batch size, and the number of GPUs all affect whether a configuration can actually serve a model. Confirm the right combination of model, GPU type, and GPU count for your workload with your Poolside account team.
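
To illustrate how the two tables relate, the sketch below computes a memory-only lower bound on GPU count: the model's minimum GPU memory divided by the memory per GPU, rounded up. The function name is hypothetical, and the result ignores context length, batch size, and every other factor named above, so treat it as a floor for discussion rather than a sizing recommendation.

```python
import math

# Memory per GPU in GB, copied from the GPU memory reference table.
GPU_MEMORY_GB = {
    "H100": 80,
    "RTX 6000 Blackwell": 96,
    "H200": 141,
    "B200": 192,
    "B300": 288,
}


def min_gpus_by_memory(required_gpu_memory_gb: int, gpu: str) -> int:
    """Smallest GPU count whose combined memory reaches the required minimum."""
    return math.ceil(required_gpu_memory_gb / GPU_MEMORY_GB[gpu])
```

For Laguna M.1 at FP8 (384 GB minimum), this gives 5 for H100, 4 for RTX 6000 Blackwell, 3 for H200, 2 for B200, and 2 for B300. Whether any of those counts is a workable configuration is a question for your Poolside account team.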

