Introduction
Use this page when you need to inspect or troubleshoot an on-premises Poolside model inference deployment. The commands on this page assume you have shell access to the deployment host and kubectl access to the RKE2 cluster.
Helpful aliases
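A minimal sketch of such aliases, assuming the default RKE2 locations for the kubectl binary (/var/lib/rancher/rke2/bin) and the admin kubeconfig (/etc/rancher/rke2/rke2.yaml). The alias names k, kpm, and kps are hypothetical and are not taken from the Poolside installer.

```shell
# Assumed RKE2 defaults: kubectl binary directory and admin kubeconfig path.
export PATH="$PATH:/var/lib/rancher/rke2/bin"
export KUBECONFIG="${KUBECONFIG:-/etc/rancher/rke2/rke2.yaml}"

# Hypothetical alias names; rename them to suit your environment.
alias k='kubectl'
alias kpm='kubectl -n poolside-models'
alias kps='kubectl -n poolside-services'
```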
Set these aliases for the current shell session, or add them to your shell profile.
Check namespaces
On-premises model inference deployments commonly use these namespaces:
- poolside-models for model inference workloads
- poolside-services for supporting infrastructure services, such as S3 object storage and the model checkpoint uploads to S3
- poolside-registry for the embedded OCI registry that serves containers
- kube-system for RKE2 system components
- poolside-cert-manager for certificate management
Healthy pods report a status of Running or Completed. Pods in Pending, ContainerCreating, Init, CrashLoopBackOff, or ImagePullBackOff require additional investigation.
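The namespace check described above can be sketched as two small shell helpers. The function names check_namespaces and unhealthy_pods are hypothetical. The filter assumes the default kubectl column layout, in which STATUS is the fourth column when all namespaces are listed.

```shell
# Hypothetical helper: list pods in each namespace named above.
check_namespaces() {
  for ns in poolside-models poolside-services poolside-registry \
            kube-system poolside-cert-manager; do
    echo "== ${ns} =="
    kubectl get pods -n "${ns}" -o wide
  done
}

# Hypothetical helper: print only pods whose STATUS column (field 4 with -A)
# is neither Running nor Completed.
unhealthy_pods() {
  kubectl get pods -A --no-headers \
    | awk '$4 != "Running" && $4 != "Completed"'
}
```

Run check_namespaces for a full listing, or unhealthy_pods to narrow the output to pods that need investigation.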
Check model workloads
The poolside-models namespace contains deployed model inference workloads and model upload jobs.
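One way to inspect this namespace is sketched below. The helper names are hypothetical, and the pod name passed to model_pod_logs is whatever the listing shows in your deployment.

```shell
NS_MODELS="poolside-models"

# List inference pods and upload jobs together.
list_model_workloads() {
  kubectl get pods,jobs -n "${NS_MODELS}" -o wide
}

# Show the last 100 log lines from every container in one pod.
# $1 is the pod name taken from the listing above.
model_pod_logs() {
  kubectl logs "$1" -n "${NS_MODELS}" --all-containers --tail=100
}
```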
Check supporting services
The poolside-services namespace contains infrastructure services required by model inference, such as S3-compatible object storage.
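A minimal sketch for reviewing this namespace follows. It assumes the object storage service may be backed by persistent volume claims, which this page does not confirm. The helper names are hypothetical.

```shell
# Hypothetical helper: summarize pods, services, and volume claims.
list_supporting_services() {
  kubectl get pods,services,persistentvolumeclaims -n poolside-services
}

# Hypothetical helper: namespace events, ordered by last timestamp.
supporting_service_events() {
  kubectl get events -n poolside-services --sort-by=.lastTimestamp
}
```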
Check GPU availability
Confirm that the host detects the expected NVIDIA GPU devices by running nvidia-smi on the host. The nvidia-smi command is available only when NVIDIA drivers are installed on the host. If nvidia-smi is not available on the host, use a GPU-enabled Kubernetes pod or a GPU Operator validation container to confirm runtime access to the GPUs.
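The host check, plus a cluster-side view of GPU capacity, can be sketched as follows. The helper names are hypothetical. The node query assumes GPUs are advertised under the standard nvidia.com/gpu resource name used by the NVIDIA device plugin.

```shell
# Host-level check: list detected GPUs, or report that the driver tools are absent.
check_host_gpus() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L
  else
    echo "nvidia-smi not found on this host" >&2
    return 1
  fi
}

# Cluster-level check: GPU count each node advertises to the scheduler.
check_node_gpus() {
  kubectl get nodes \
    -o 'custom-columns=NODE:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}
```

A node that shows no value in the GPUS column is not advertising GPUs to the scheduler, which would leave GPU workloads in Pending.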
If model pods are stuck in Pending or ContainerCreating, inspect the pod details and GPU Operator events.
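A sketch of that inspection is shown below. The GPU Operator namespace is an assumption: gpu-operator is a common default, but this page does not name it, so set GPU_OPERATOR_NS if your deployment differs. The helper names are hypothetical.

```shell
# Assumption: the GPU Operator runs in "gpu-operator" unless overridden.
GPU_OPERATOR_NS="${GPU_OPERATOR_NS:-gpu-operator}"

# Show scheduling details and recent events for one model pod ($1 = pod name).
inspect_stuck_pod() {
  kubectl describe pod "$1" -n poolside-models
}

# Show GPU Operator events, ordered by last timestamp.
gpu_operator_events() {
  kubectl get events -n "${GPU_OPERATOR_NS}" --sort-by=.lastTimestamp
}
```

The Events section at the end of the describe output usually states why a pod cannot be scheduled or started, for example insufficient GPU resources or an image pull failure.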
Check certificates
Poolside on-premises deployments use cert-manager to issue and renew certificates for internal and ingress endpoints.
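Certificate state can be listed through the cert-manager custom resources, as in this sketch. The helper names are hypothetical, and the certificate names and namespaces come from your own cluster listing.

```shell
# List every cert-manager Certificate with its READY status and backing secret.
list_certificates() {
  kubectl get certificates.cert-manager.io -A
}

# Show conditions and renewal details for one certificate.
# $1 = certificate name, $2 = namespace.
describe_certificate() {
  kubectl describe certificates.cert-manager.io "$1" -n "$2"
}
```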
Resolve SSL and x509 errors
The cluster is the source of truth for certificates. Use Kubernetes certificate and secret resources when you need to inspect the current certificate state. Exported certificate files under poolside-install/certs are local copies generated by the deployment process.
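To read a certificate directly from the cluster, a sketch such as the following decodes it from its secret. It assumes the secret is of the standard TLS type, with the certificate stored under the tls.crt key. The helper name is hypothetical.

```shell
# Print subject, issuer, and validity dates of the certificate held in a secret.
# $1 = secret name, $2 = namespace.
show_secret_cert() {
  kubectl get secret "$1" -n "$2" -o jsonpath='{.data.tls\.crt}' \
    | base64 -d \
    | openssl x509 -noout -subject -issuer -dates
}
```

Comparing this output with an exported file shows whether the local copy is stale.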
The deployment installs CA certificates into the deployment host’s trust chain. When the deployment uses self-signed certificates, clients that connect to Poolside services also need to trust the self-signed CA certificate. If a client returns an x509 or certificate authority error, import the self-signed CA certificate into that client’s trusted root store.
After you import the certificate, restart the application, browser, shell session, or client process so it reloads the trust store.
Import on Windows
Import on macOS
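A minimal sketch using the macOS security tool, assuming the CA certificate is available as a local PEM file. The helper name is hypothetical, and the command requires administrator rights.

```shell
# Add the CA certificate ($1 = path to the file) to the System keychain
# as a trusted root.
import_ca_macos() {
  sudo security add-trusted-cert -d -r trustRoot \
    -k /Library/Keychains/System.keychain "$1"
}
```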
Import on Ubuntu and Debian
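A minimal sketch, assuming the CA certificate is a PEM file. The helper name and the target file name poolside-ca.crt are hypothetical.

```shell
# Copy the CA certificate ($1 = path to the file) into the local CA directory
# and rebuild the system bundle. The target name must end in .crt to be picked up.
import_ca_debian() {
  sudo cp "$1" /usr/local/share/ca-certificates/poolside-ca.crt
  sudo update-ca-certificates
}
```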
Import on Red Hat Enterprise Linux and Fedora
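A minimal sketch, assuming the CA certificate is a PEM file. The helper name and the target file name poolside-ca.crt are hypothetical.

```shell
# Copy the CA certificate ($1 = path to the file) into the trust anchors
# directory and regenerate the consolidated trust store.
import_ca_rhel() {
  sudo cp "$1" /etc/pki/ca-trust/source/anchors/poolside-ca.crt
  sudo update-ca-trust extract
}
```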
Distribute the self-signed CA certificate
Use your organization’s normal endpoint management process to distribute the self-signed CA certificate to clients that access Poolside services. Common options include:
- Group Policy for domain-joined Windows hosts
- Configuration management tools, such as Ansible or Puppet
- Mobile device management tools, such as Microsoft Intune
- Browser enterprise policies
- Internal file shares or package repositories