At just $1.30/hr (billed monthly), you get an entire dedicated AI server — not a shared slice of one. Your model, your resources, your firewall. Always on.
Private dedicated AI infrastructure
RackNation now offers Private AI Infrastructure powered by NVIDIA DGX Blackwell servers, giving businesses of all sizes access to enterprise-grade artificial intelligence capabilities hosted entirely within our secure Costa Rica data centers.
Unlike renting GPU time from large cloud providers such as AWS, Google Cloud, or Azure — where costs scale unpredictably with usage and your data crosses international borders — RackNation's dedicated AI infrastructure gives your organization a private, always-on AI environment at a predictable monthly cost that is significantly more economical for sustained workloads. Your data never leaves your dedicated environment, your models run exclusively for your organization, and you benefit from the same cutting-edge NVIDIA Blackwell GPU technology used by the world's leading AI companies — without the cloud markup. Whether you are deploying large language models for internal knowledge management, customer service automation, document intelligence, or custom AI workflows, RackNation provides the infrastructure, connectivity, and local expertise to run serious AI at a fraction of the cost of hyperscale cloud alternatives. AI that works for you — private, powerful, and priced for the real world.
Specifications of the NVIDIA DGX Spark (GB10) dedicated servers
1 PFLOP of AI performance (FP4) — datacenter-class compute
128 GB unified LPDDR5x memory — runs models up to 200B parameters
2x QSFP networking — two units cluster together natively to serve 405B-parameter models
| Category | Specification |
|---|---|
| Chip | NVIDIA GB10 Grace Blackwell Superchip |
| GPU Architecture | NVIDIA Blackwell — 5th Gen Tensor Cores, 4th Gen RT Cores |
| CUDA Cores | 6,144 |
| AI Performance | 1 PFLOP (FP4 with sparsity) / 1,000 TOPS |
| CPU | 20-core ARM (10x Cortex-X925 + 10x Cortex-A725) |
| Memory | 128 GB LPDDR5x Unified (CPU + GPU shared) |
| Memory Bandwidth | 273 GB/s |
| Memory Interface | 256-bit, 16 channels |
| Storage | 4 TB NVMe M.2 (self-encrypting) |
| Networking | 1x 10 GbE RJ-45 + 2x QSFP (ConnectX-7 Smart NIC) |
| Cluster Networking | 2x QSFP for dual-Spark interconnect |
| Wireless | Wi-Fi 7 + Bluetooth 5.4 |
| Video Output | HDMI 2.1a |
| Media Engine | 1x NVENC, 1x NVDEC |
| USB | 4x USB Type-C |
| Max Model Size | 200B parameters (single) / 405B parameters (dual cluster) |
| Fine-tuning | Up to 70B parameters |
| Power Consumption | 240W (GB10 TDP: 140W) |
| Dimensions | 150 x 150 x 50.5 mm |
| Weight | 1.2 kg (2.6 lbs) |
| Operating Temperature | 5°C – 30°C |
| OS | DGX OS (Ubuntu-based) |
| AI Frameworks | PyTorch, TensorRT-LLM, CUDA, Isaac, Metropolis, Holoscan |
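For single-stream text generation, decode speed is largely bound by the 273 GB/s memory bandwidth above: each generated token streams the full set of model weights from memory once. A back-of-the-envelope ceiling can be estimated as follows (a rough sketch that ignores KV-cache traffic and kernel overhead; the function name is ours, not NVIDIA's):

```python
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough memory-bandwidth-bound decode ceiling: each token streams
    the full weights once, so tok/s <= bandwidth / weight size."""
    return bandwidth_gb_s / model_size_gb

BANDWIDTH = 273.0  # DGX Spark unified memory bandwidth, GB/s

# A 70B model at 4-bit occupies roughly 40 GB (see the model table below)
ceiling_70b = max_tokens_per_sec(BANDWIDTH, 40.0)  # ~6.8 tok/s ceiling
# An 8B model at 4-bit occupies roughly 5 GB
ceiling_8b = max_tokens_per_sec(BANDWIDTH, 5.0)    # ~54.6 tok/s ceiling

print(f"70B 4-bit ceiling: {ceiling_70b:.1f} tok/s")
print(f"8B 4-bit ceiling:  {ceiling_8b:.1f} tok/s")
```

The measured figures in the model table below (4.4 tok/s for Llama 3.1 70B, 38 tok/s for 8B) sit under these theoretical ceilings, as expected for real workloads.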
Model performance reference (single DGX Spark unless noted):

| Model | Parameters | Quantization | Memory Used | Speed (tok/sec) | Best For |
|---|---|---|---|---|---|
| Llama 3.1 | 8B | 4-bit | ~5 GB | 38 tok/s | Fast chat, Q&A, lightweight tasks |
| Llama 3.1 | 8B | 8-bit | ~9 GB | 25 tok/s | Higher quality chat |
| Llama 3.1 | 70B | 4-bit | ~40 GB | 4.4 tok/s | Complex reasoning, large context |
| DeepSeek R1 | 14B | 4-bit | ~9 GB | 20 tok/s | Advanced reasoning, math, code |
| DeepSeek R1 | 14B | 8-bit | ~15 GB | 13 tok/s | Higher quality reasoning |
| DeepSeek R1 | 70B | 4-bit | ~40 GB | ~15 tok/s | Enterprise reasoning & analysis |
| DeepSeek R1 | 70B | 8-bit | ~75 GB | ~8 tok/s | Maximum quality reasoning |
| Qwen3 | 32B | 4-bit | ~20 GB | 9.4 tok/s | Multilingual, coding, analysis |
| Qwen3 | 32B | 8-bit | ~35 GB | 6.2 tok/s | High quality multilingual tasks |
| Gemma 3 | 12B | 4-bit | ~7 GB | 24 tok/s | Google model, fast general use |
| Gemma 3 | 27B | 4-bit | ~16 GB | 10.8 tok/s | Higher quality Google model |
| GPT-OSS | 20B | MXFP4 | ~12 GB | 58 tok/s | OpenAI open model, fastest option |
| GPT-OSS | 120B | MXFP4 | ~70 GB | 41 tok/s | Flagship open model, GPT-4 class |
| Qwen3 235B (dual Spark) | 235B | 4-bit MoE | ~256 GB | ~23 tok/s | Flagship MoE — requires 2x DGX Spark |
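The "Memory Used" column follows directly from parameter count and quantization width, plus headroom for activations and KV cache. A rough sizing helper illustrates the arithmetic (a sketch; the ~15% overhead factor is our assumption, not a vendor figure):

```python
def model_memory_gb(params_billions: float, bits_per_weight: int,
                    overhead: float = 0.15) -> float:
    """Approximate resident memory for an LLM.

    params_billions : model size in billions of parameters
    bits_per_weight : quantization width (4-bit, 8-bit, ...)
    overhead        : headroom for KV cache / activations (assumption)
    """
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9

# Llama 3.1 8B at 4-bit: ~4.6 GB, in line with the ~5 GB row above
print(f"{model_memory_gb(8, 4):.1f} GB")
# Llama 3.1 70B at 4-bit: ~40 GB, matching the table
print(f"{model_memory_gb(70, 4):.1f} GB")
```

The same arithmetic shows why 70B is the practical ceiling for 8-bit inference on a single 128 GB unit, and why the 235B MoE model needs a dual-Spark cluster.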
Run State-of-the-Art AI Models on Dedicated Private Infrastructure — No Limits, No Queues, Full Performance
Network architecture of your dedicated AI infrastructure
Dedicated AI Server · Network Architecture
DGX Spark · OPNsense · HyperFlex · PostgreSQL + pgvector
The NVIDIA DGX Spark was built for one thing: running large AI models at full speed. Paired with RackNation's HyperFlex 2.0 cloud infrastructure, you get the best of both worlds — dedicated GPU horsepower for your models, and flexible cloud compute for your PostgreSQL or ChromaDB databases, web interfaces, and application logic. One cohesive private stack, fully under your control, hosted in Costa Rica.
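In a stack like this, the PostgreSQL + pgvector side handles retrieval: documents are stored as embedding vectors and ranked by cosine distance (pgvector's `<=>` operator). What that ranking computes, shown as a minimal pure-Python sketch with made-up three-dimensional embeddings (real embeddings from a model on the DGX Spark would have hundreds of dimensions):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as pgvector's <=> operator defines it:
    1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy document store: (text, embedding) pairs -- vectors are invented
docs = [
    ("invoice policy",  [0.9, 0.1, 0.0]),
    ("vacation policy", [0.1, 0.9, 0.1]),
    ("server uptime",   [0.0, 0.2, 0.9]),
]

# Pretend embedding of the query "how are invoices handled?"
query = [0.85, 0.15, 0.05]

# Rank by ascending cosine distance, conceptually like:
#   SELECT text FROM docs ORDER BY embedding <=> :query LIMIT 1;
best = min(docs, key=lambda d: cosine_distance(query, d[1]))
print(best[0])  # -> invoice policy
```

In production, pgvector performs this search inside PostgreSQL with index support, so the application tier on HyperFlex only issues a query and passes the retrieved text to the model on the DGX Spark.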
The AI train is leaving. Don't let Big Cloud sell you a ticket you can't afford.