
Case Studies

AI Clusters

Multi-node GPU clusters with high-speed interconnects enabling distributed AI training at scale.

Government Research Laboratory

National Research Lab GPU Cluster

512-GPU NVIDIA DGX cluster for climate modeling and scientific research.

Challenge

The research lab needed a world-class supercomputing facility to run complex climate simulations and AI models.

Solution

Designed and deployed a 64-node DGX H100 cluster with HDR InfiniBand fabric and Lustre parallel file system.

DGX H100 · InfiniBand HDR · Lustre · Slurm
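
The Slurm scheduler and the NCCL-over-InfiniBand fabric are what allow a single training job to span all 64 nodes. The sketch below is a minimal, hypothetical example of how a researcher might initialize a multi-node PyTorch job from Slurm-provided environment variables; the model, master address, and port are illustrative placeholders, not details of the deployed system.

# Minimal sketch: initializing a multi-node PyTorch job from Slurm
# environment variables. Assumes one task per GPU launched via srun and
# that MASTER_ADDR is exported by the batch script; the model is a stand-in.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def init_from_slurm():
    rank = int(os.environ["SLURM_PROCID"])         # global rank of this task
    world_size = int(os.environ["SLURM_NTASKS"])   # total tasks across all nodes
    local_rank = int(os.environ["SLURM_LOCALID"])  # task index on this node -> GPU index

    # NCCL carries the inter-node collectives over the InfiniBand fabric.
    dist.init_process_group(
        backend="nccl",
        init_method=f"tcp://{os.environ['MASTER_ADDR']}:{os.environ.get('MASTER_PORT', '29500')}",
        rank=rank,
        world_size=world_size,
    )
    torch.cuda.set_device(local_rank)
    return rank, local_rank

if __name__ == "__main__":
    rank, local_rank = init_from_slurm()
    model = torch.nn.Linear(1024, 1024).cuda(local_rank)  # stand-in for a real model
    model = DDP(model, device_ids=[local_rank])
    if rank == 0:
        print(f"initialized {dist.get_world_size()} ranks")

In practice, the batch script would export MASTER_ADDR (for example from the first hostname in the allocation) before srun launches one task per GPU.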

Results

512 NVIDIA H100 GPUs
2.3 exaflops peak
99.99% uptime achieved
Top 100 global ranking

Major Research University

AI Research University Cluster

Multi-tenant GPU cluster supporting thousands of researchers across multiple departments.

Challenge

The university needed shared GPU infrastructure that could support diverse workloads from multiple research groups.

Solution

Built a 128-GPU shared cluster with fair-share scheduling, resource quotas, and JupyterHub integration.

NVIDIA A100 · Kubernetes · JupyterHub · NVIDIA NGC
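
On a Kubernetes-scheduled cluster, multi-tenancy largely comes down to per-namespace quotas. The snippet below is a hypothetical sketch, using the official kubernetes Python client, of capping how many A100s a single research group's namespace can request; the namespace name and the specific limits are assumptions for illustration, not the deployed policy.

# Minimal sketch: capping GPU requests for one research group's namespace.
# Assumes the kubernetes Python client and a configured kubeconfig;
# namespace name and limits are illustrative only.
from kubernetes import client, config

def apply_gpu_quota(namespace: str, max_gpus: int) -> None:
    config.load_kube_config()  # or config.load_incluster_config() inside the cluster
    quota = client.V1ResourceQuota(
        metadata=client.V1ObjectMeta(name="gpu-quota", namespace=namespace),
        spec=client.V1ResourceQuotaSpec(
            hard={
                "requests.nvidia.com/gpu": str(max_gpus),  # GPUs the namespace may request
                "requests.cpu": "256",                     # keep CPU/memory in proportion
                "requests.memory": "2Ti",
            }
        ),
    )
    client.CoreV1Api().create_namespaced_resource_quota(namespace, quota)

if __name__ == "__main__":
    apply_gpu_quota("vision-lab", max_gpus=16)  # hypothetical research-group namespace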

Results

128 NVIDIA A100 GPUs
500+ active researchers
95% GPU utilization
3,000+ jobs per week

Fortune 100 Technology Company

Enterprise LLM Training Cluster

Large-scale GPU cluster for training proprietary large language models.

Challenge

The client required massive compute capacity and high reliability for training multi-billion-parameter models.

Solution

Deployed a 1,024-GPU H100 cluster with liquid cooling and custom networking for optimal training performance.

DGX H100 · InfiniBand NDR · Liquid Cooling · Slurm
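
The scale is driven largely by memory: the training state of a 100B-parameter model does not fit on a single device, so it must be sharded across many GPUs. The back-of-the-envelope calculator below illustrates why; the 16-bytes-per-parameter figure (fp16 weights and gradients plus fp32 Adam state) is a common rough estimate, not a measurement from this deployment, and activation memory is ignored.

# Back-of-the-envelope sketch: per-GPU memory for training state when
# parameters, gradients, and Adam optimizer state are sharded evenly across
# all GPUs (ZeRO-3 / FSDP style). Rough estimate only; ignores activations,
# buffers, and fragmentation.
BYTES_PER_PARAM = 16   # ~2 (fp16 weights) + 2 (fp16 grads) + 12 (fp32 Adam state)
H100_MEMORY_GIB = 80   # HBM per H100 GPU

def training_state_per_gpu_gib(params_billion: float, num_gpus: int) -> float:
    total_bytes = params_billion * 1e9 * BYTES_PER_PARAM
    return total_bytes / num_gpus / 2**30

for gpus in (8, 256, 1024):
    need = training_state_per_gpu_gib(100, gpus)
    fits = "fits" if need < H100_MEMORY_GIB else "does not fit"
    print(f"100B params on {gpus:>5} GPUs: ~{need:7.1f} GiB/GPU of training state ({fits})")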

Results

1,024 NVIDIA H100 GPUs
400 Gbps InfiniBand per node
20 MW power capacity
100B+ parameter models

Build Your AI Cluster

Let's design a cluster architecture optimized for your AI workloads.