At just $1.30/hr (billed monthly), you get an entire dedicated AI server — not a shared slice of one. Your model, your resources, your firewall. Always on.
Private dedicated AI infrastructure
RackNation now offers Private AI Infrastructure powered by NVIDIA DGX Blackwell servers, giving businesses of all sizes access to enterprise-grade artificial intelligence capabilities hosted entirely within our secure Costa Rica data centers.
Unlike renting GPU time from large cloud providers such as AWS, Google Cloud, or Azure — where costs scale unpredictably with usage and your data crosses international borders — RackNation's dedicated AI infrastructure gives your organization a private, always-on AI environment at a predictable monthly cost that is significantly more economical for sustained workloads. Your data never leaves your dedicated environment, your models run exclusively for your organization, and you benefit from the same cutting-edge NVIDIA Blackwell GPU technology used by the world's leading AI companies — without the cloud markup. Whether you are deploying large language models for internal knowledge management, customer service automation, document intelligence, or custom AI workflows, RackNation provides the infrastructure, connectivity, and local expertise to run serious AI at a fraction of the cost of hyperscale cloud alternatives. AI that works for you — private, powerful, and priced for the real world.
Specifications of the NVIDIA DGX Spark (GB10) dedicated servers
1 PFLOP of AI performance (FP4) — datacenter-class compute
128 GB unified LPDDR5x memory — runs models up to 200B parameters
2x QSFP networking — two units cluster together natively to serve 405B-parameter models
| Category | Specification |
|---|---|
| Chip | NVIDIA GB10 Grace Blackwell Superchip |
| GPU Architecture | NVIDIA Blackwell — 5th Gen Tensor Cores, 4th Gen RT Cores |
| CUDA Cores | 6,144 |
| AI Performance | 1 PFLOP (FP4 with sparsity) / 1,000 TOPS |
| CPU | 20-core ARM (10x Cortex-X925 + 10x Cortex-A725) |
| Memory | 128 GB LPDDR5x Unified (CPU + GPU shared) |
| Memory Bandwidth | 273 GB/s |
| Memory Interface | 256-bit, 16 channels |
| Storage | 4 TB NVMe M.2 (self-encrypting) |
| Networking | 1x 10 GbE RJ-45 + 2x QSFP (ConnectX-7 Smart NIC) |
| Cluster Networking | 2x QSFP for dual-Spark interconnect |
| Wireless | Wi-Fi 7 + Bluetooth 5.4 |
| Video Output | HDMI 2.1a |
| Media Engine | 1x NVENC, 1x NVDEC |
| USB | 4x USB Type-C |
| Max Model Size | 200B parameters (single) / 405B parameters (dual cluster) |
| Fine-tuning | Up to 70B parameters |
| Power Consumption | 240W (GB10 TDP: 140W) |
| Dimensions | 150 x 150 x 50.5 mm |
| Weight | 1.2 kg (2.6 lbs) |
| Operating Temperature | 5°C – 30°C |
| OS | DGX OS (Ubuntu-based) |
| AI Frameworks | PyTorch, TensorRT-LLM, CUDA, Isaac, Metropolis, Holoscan |
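For single-stream text generation, decode speed is largely bound by the 273 GB/s memory bandwidth above: each generated token streams the full set of model weights from memory once. A back-of-the-envelope ceiling can be estimated as follows (a rough sketch that ignores KV-cache traffic and kernel overhead; the function name is ours, not NVIDIA's):

```python
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough memory-bandwidth-bound decode ceiling: each token streams
    the full weights once, so tok/s <= bandwidth / weight size."""
    return bandwidth_gb_s / model_size_gb

BANDWIDTH = 273.0  # DGX Spark unified memory bandwidth, GB/s

# A 70B model at 4-bit occupies roughly 40 GB (see the model table below)
ceiling_70b = max_tokens_per_sec(BANDWIDTH, 40.0)  # ~6.8 tok/s ceiling
# An 8B model at 4-bit occupies roughly 5 GB
ceiling_8b = max_tokens_per_sec(BANDWIDTH, 5.0)    # ~54.6 tok/s ceiling

print(f"70B 4-bit ceiling: {ceiling_70b:.1f} tok/s")
print(f"8B 4-bit ceiling:  {ceiling_8b:.1f} tok/s")
```

The measured figures in the model table below (4.4 tok/s for Llama 3.1 70B, 38 tok/s for 8B) sit under these theoretical ceilings, as expected for real workloads.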
Model performance reference (single DGX Spark unless noted):

| Model | Parameters | Quantization | Memory Used | Speed (tok/sec) | Best For |
|---|---|---|---|---|---|
| Llama 3.1 | 8B | 4-bit | ~5 GB | 38 tok/s | Fast chat, Q&A, lightweight tasks |
| Llama 3.1 | 8B | 8-bit | ~9 GB | 25 tok/s | Higher quality chat |
| Llama 3.1 | 70B | 4-bit | ~40 GB | 4.4 tok/s | Complex reasoning, large context |
| DeepSeek R1 | 14B | 4-bit | ~9 GB | 20 tok/s | Advanced reasoning, math, code |
| DeepSeek R1 | 14B | 8-bit | ~15 GB | 13 tok/s | Higher quality reasoning |
| DeepSeek R1 | 70B | 4-bit | ~40 GB | ~15 tok/s | Enterprise reasoning & analysis |
| DeepSeek R1 | 70B | 8-bit | ~75 GB | ~8 tok/s | Maximum quality reasoning |
| Qwen3 | 32B | 4-bit | ~20 GB | 9.4 tok/s | Multilingual, coding, analysis |
| Qwen3 | 32B | 8-bit | ~35 GB | 6.2 tok/s | High quality multilingual tasks |
| Gemma 3 | 12B | 4-bit | ~7 GB | 24 tok/s | Google model, fast general use |
| Gemma 3 | 27B | 4-bit | ~16 GB | 10.8 tok/s | Higher quality Google model |
| GPT-OSS | 20B | MXFP4 | ~12 GB | 58 tok/s | OpenAI open model, fastest option |
| GPT-OSS | 120B | MXFP4 | ~70 GB | 41 tok/s | Flagship open model, GPT-4 class |
| Qwen3 235B (dual Spark) | 235B | 4-bit MoE | ~256 GB | ~23 tok/s | Flagship MoE — requires 2x DGX Spark |
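The "Memory Used" column follows directly from parameter count and quantization width, plus headroom for activations and KV cache. A rough sizing helper illustrates the arithmetic (a sketch; the ~15% overhead factor is our assumption, not a vendor figure):

```python
def model_memory_gb(params_billions: float, bits_per_weight: int,
                    overhead: float = 0.15) -> float:
    """Approximate resident memory for an LLM.

    params_billions : model size in billions of parameters
    bits_per_weight : quantization width (4-bit, 8-bit, ...)
    overhead        : headroom for KV cache / activations (assumption)
    """
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9

# Llama 3.1 8B at 4-bit: ~4.6 GB, in line with the ~5 GB row above
print(f"{model_memory_gb(8, 4):.1f} GB")
# Llama 3.1 70B at 4-bit: ~40 GB, matching the table
print(f"{model_memory_gb(70, 4):.1f} GB")
```

The same arithmetic shows why 70B is the practical ceiling for 8-bit inference on a single 128 GB unit, and why the 235B MoE model needs a dual-Spark cluster.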
Run State-of-the-Art AI Models on Dedicated Private Infrastructure — No Limits, No Queues, Full Performance
Network architecture of your dedicated AI infrastructure
Dedicated AI Server · Network Architecture
DGX Spark · OPNsense · HyperFlex · PostgreSQL + pgvector
The NVIDIA DGX Spark was built for one thing: running large AI models at full speed. Paired with RackNation's HyperFlex 2.0 cloud infrastructure, you get the best of both worlds — dedicated GPU horsepower for your models, and flexible cloud compute for your PostgreSQL or ChromaDB databases, web interfaces, and application logic. One cohesive private stack, fully under your control, hosted in Costa Rica.
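In a stack like this, the PostgreSQL + pgvector side handles retrieval: documents are stored as embedding vectors and ranked by cosine distance (pgvector's `<=>` operator). What that ranking computes, shown as a minimal pure-Python sketch with made-up three-dimensional embeddings (real embeddings from a model on the DGX Spark would have hundreds of dimensions):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as pgvector's <=> operator defines it:
    1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy document store: (text, embedding) pairs -- vectors are invented
docs = [
    ("invoice policy",  [0.9, 0.1, 0.0]),
    ("vacation policy", [0.1, 0.9, 0.1]),
    ("server uptime",   [0.0, 0.2, 0.9]),
]

# Pretend embedding of the query "how are invoices handled?"
query = [0.85, 0.15, 0.05]

# Rank by ascending cosine distance, conceptually like:
#   SELECT text FROM docs ORDER BY embedding <=> :query LIMIT 1;
best = min(docs, key=lambda d: cosine_distance(query, d[1]))
print(best[0])  # -> invoice policy
```

In production, pgvector performs this search inside PostgreSQL with index support, so the application tier on HyperFlex only issues a query and passes the retrieved text to the model on the DGX Spark.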
The AI train is leaving. Don't let Big Cloud sell you a ticket you can't afford.