HelixML

Sovereign Server

A 4U rack server with 8× NVIDIA RTX 6000 Pro GPUs and 768 GB VRAM, Helix preloaded. Ship it to your data centre, power it on, run hundreds of concurrent AI agents with zero cloud dependency.

Overview

We ship a fully configured 4U rack server to your data centre with Helix pre-installed and ready to run. Plug it in, power it on, and your team has a private AI agent fleet — hundreds of concurrent agents, each with their own GPU-accelerated desktop — with zero cloud dependency. No API keys, no token metering, no data leaving your building.

It's the fastest path to owning your AI stack. No Kubernetes expertise required. No cloud accounts. No configuration. Just hardware, software, and onboarding — in one package.

Learn why digital sovereignty matters →


The hardware

The Sovereign Server is built on the Gigabyte G494-SB4 — a 4U GPU-optimised server platform designed for AI workloads.

Base configuration:

GPUs: 8× NVIDIA RTX 6000 Pro (Blackwell generation, 96 GB GDDR7 each, 768 GB total VRAM)
CPU: Dual Intel Xeon 6 Series processors
Memory: 256 GB+ DDR5 ECC Registered
Storage: NVMe SSD (configured to workload)
Network: Dual 10GbE onboard
Power: Quad 3000W redundant (80+ Titanium)
Form factor: 4U rackmount, standard 19″

Configurations are customisable. We work with you during onboarding to spec the server to your workload — more memory, more storage, different networking — whatever your environment needs.


What's included

Hardware — The server itself, fully assembled, tested, and burned in before shipping.

Software — Helix pre-installed and configured. The entire stack — inference, RAG, agents, agent desktops, fleet orchestration, observability — ready to go on first boot.

Onboarding — A discounted 8-week structured onboarding programme with the Helix team. We help you integrate with your existing git workflows, CI/CD pipelines, SSO provider, and communication tools (Slack/Teams). Weekly check-ins and a final readout.

First-year licence — Your first year of the Helix enterprise licence is included in the price. After year one, you renew annually.

Warranty — 3-year return-to-base hardware warranty included. On-site support upgrades available.


The economics

Cloud AI is expensive and getting more expensive. We're hearing from teams spending $3,000 per developer per month on tools like Claude Code and Cursor — and that number keeps climbing. Per-seat and per-token pricing means your costs scale linearly with adoption — exactly when you want your team leaning into AI the hardest.

The Sovereign Server flips that.

One-time hardware cost, predictable annual licence. No token metering. No per-API-call billing. No surprise invoices. No vendor deciding to double their prices overnight.

A single Sovereign Server comfortably supports 20–30+ developers and can run hundreds of concurrent AI agents, each with their own GPU-accelerated desktop. 768 GB of VRAM and hardware video encoding — this box has real density.

At typical enterprise AI spend:

Scenario: cloud AI cost per year vs. Sovereign Server cost per year

10 developers, $3,000/month each: $360,000 vs. electricity + licence
20 developers, $3,000/month each: $720,000 vs. electricity + licence
30 developers, $3,000/month each: $1,080,000 vs. electricity + licence

At $3,000/developer/month, a team of 20 is spending $720,000 per year on cloud AI. The Sovereign Server pays for itself in under three months — and then keeps running for a decade. Even at more modest usage of $1,000/developer/month, a 20-person team recoups the hardware cost within the first year.
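The payback arithmetic above is simple to check. A minimal sketch, using the ~$175,000 all-in price quoted in the Pricing section below:

```python
# Payback estimate: months of avoided cloud-AI spend needed to recover
# the one-time cost (hardware + onboarding + first-year licence, USD).
SERVER_COST = 175_000

def payback_months(developers: int, monthly_spend_per_dev: int) -> float:
    """Months until avoided cloud spend equals the server cost."""
    monthly_cloud_spend = developers * monthly_spend_per_dev
    return SERVER_COST / monthly_cloud_spend

print(round(payback_months(20, 3_000), 1))  # 20 devs at $3,000/month -> 2.9 months
print(round(payback_months(20, 1_000), 1))  # 20 devs at $1,000/month -> 8.8 months
```

Electricity and the annual licence renewal sit outside this sketch, but both are fixed, known costs rather than usage-metered ones.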

After the hardware cost is recovered, your ongoing costs are predictable: an annual Helix licence with per-seat pricing you negotiate up front — not per-token billing that spikes every time your team leans harder into AI. Want to go all-in? Talk to us about an org-wide unlimited licence. Either way, the compute is already paid for, and you know what you're paying before the year starts.

That's without factoring in the cost of a data breach, a compliance failure, or a vendor pulling the rug on your API access.


Sovereignty by default

The Sovereign Server isn't just cheaper. It's a different architecture.

Your infrastructure, your jurisdiction. The server sits in your data centre, in your country, under your legal framework. Not "your region" of someone else's cloud.

No external API calls. Every prompt, every response, every document stays on your network. Nothing leaves.

Open-weight models that rival the best. Llama, Qwen, Mistral, DeepSeek, Kimi — the latest open-weight models now match or exceed Claude and OpenAI on most benchmarks. You choose which to run. Swap models without asking permission, and verify you're running exactly what you think you're running. No proprietary black boxes — and no reason to settle for less capable models just because you're running locally.

Works offline, air-gap ready. Helix is designed to run fully disconnected. There's an optional version-update check, but no mandatory telemetry, no usage data collection, no licence heartbeat. Disconnect the network cable and it keeps running.

No vendor kill switch. We can't revoke your access, force a model update, or shut you down because your use case violates an acceptable use policy. The server is yours.

Full auditability. Every interaction logged locally. Prompts, responses, user identity, timestamps, model version. Your compliance team gets full visibility.

Cloud AI providers can't offer any of this. Their business model depends on you not having it.

Read the full digital sovereignty case →


What you can run

The Sovereign Server runs the full Helix stack. Everything you'd get on Helix Cloud or a self-managed Kubernetes deployment — but on dedicated hardware with no shared tenancy.

  • Private inference — Run state-of-the-art open-weight LLMs locally with vLLM and Ollama backends. The latest models from Meta, Alibaba, Mistral, and DeepSeek are competitive with Claude and OpenAI on coding, reasoning, and language tasks — and you can run them on your own hardware. OpenAI-compatible API, so your existing integrations work without changes.
  • RAG pipelines — Text and vision RAG over your internal documents, PDFs, wikis.
  • AI agents — Autonomous agents with web search, browser, API calling, MCP integration.
  • Agent desktops — Full GPU-accelerated streaming desktops for every agent. Watch them work, pair-program when they need help.
  • Fleet orchestration — Spec coding, kanban pipeline, human-in-the-loop review gates. Manage dozens of agents from a single dashboard.
  • Observability & evals — Full visibility into what your agents are doing and how well they're doing it.
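Because the inference endpoint is OpenAI-compatible, pointing an existing integration at the server is a one-line change. A minimal sketch using only the Python standard library; the host name, port, path, and model name below are illustrative placeholders, not Helix defaults:

```python
import json
from urllib import request

# Hypothetical internal endpoint: an OpenAI-compatible server exposes the
# standard /v1/chat/completions route, so the payload shape is unchanged
# from what cloud integrations already send.
HELIX_URL = "http://helix.internal:8080/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "llama-3.3-70b") -> request.Request:
    """Construct the standard chat-completions POST without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        HELIX_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarise our deployment runbook.")
```

The same applies to OpenAI client libraries: swapping the base URL to the internal endpoint is typically the only change needed.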

With 8× RTX 6000 Pro GPUs and 768 GB of total VRAM, you can run large models (70B+ parameters) at production throughput while simultaneously running hundreds of concurrent agent desktops. Hardware video encoding means each desktop streams at 60fps with minimal CPU overhead.
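A back-of-envelope check of why 70B+ models fit comfortably: model weights dominate VRAM use, at roughly two bytes per parameter in FP16. This sketch counts weights only and ignores KV cache and activation overhead, so real deployments need headroom on top:

```python
# Rough VRAM needed for model weights alone, by precision.
BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}

def weights_vram_gb(params_billions: float, dtype: str = "fp16") -> float:
    """Approximate GB of VRAM consumed by the weights of a model."""
    return params_billions * BYTES_PER_PARAM[dtype]

print(weights_vram_gb(70))           # 140.0 GB in FP16 -- well inside 768 GB
print(weights_vram_gb(70, "int4"))   # 35.0 GB quantised
```

Even a 70B model in full FP16 uses under a fifth of the total VRAM, leaving the rest for KV cache, additional models, and agent desktops.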

Continuous updates, latest models. Helix is actively developed — you can upgrade the software at any time to pick up support for new models as they're released. New open-weight models are landing every few weeks, and your server keeps pace. No waiting for a vendor to decide to "support" something. If it runs on your hardware, you can run it.


Easily expandable

The Sovereign Server is a single machine, but it's not a dead end. Helix is designed to scale horizontally.

Need more compute? Just add more servers. Helix's architecture separates the control plane from GPU runners and agent sandboxes — so you can connect additional GPU servers to a single Helix control plane without replacing anything. Your second server doubles your capacity. Your third triples it. Each one slots in alongside the first.

For larger deployments, we recommend moving the Helix control plane onto Kubernetes, which lets you manage a cluster of GPU servers as a single fleet. We can help you configure this — whether that's a handful of rack servers in your data centre or a larger distributed deployment across multiple sites.

Start with one box. Scale when you need to. No rip-and-replace.


Who it's for

Regulated industries — Finance, healthcare, legal, defence, government. If you're subject to GDPR, NIS2, DORA, or the EU AI Act, the Sovereign Server gives you compliance by architecture rather than by contract.

Organisations leaving the cloud — If you've done the maths on cloud AI spend and decided to bring it in-house, the Sovereign Server is the fastest path. No Kubernetes expertise required.

Air-gapped environments — Classified and high-security environments where no outbound network access is permitted. The Sovereign Server runs fully disconnected after initial setup.

Teams scaling AI adoption — If your developers are already running AI agents on personal machines and you need to bring that inside your security perimeter with proper controls, this is the enterprise answer.


Pricing

~$175,000 total: ~$100,000 for the hardware (CyberServe appliance, yours to keep), plus $75,000 for the 8-week onboarding pilot and the first-year enterprise licence. Helix arrives pre-installed and configured.

Annual licence renewal after year one with per-seat pricing. For larger organisations, ask us about org-wide unlimited licences. Contact us for licence pricing.

Custom configurations available. If you need different GPU specs, more memory, or a multi-server deployment, we'll work with you to spec the right setup.


Get started

Ready to take control of your AI infrastructure?

Talk to us about a Sovereign Server →

Already have your own hardware? You can also deploy Helix on any Linux server or Kubernetes cluster. See Linux & Kubernetes docs →