Private GenAI Stack
Everything you could do with a private GenAI platform — LLMs, RAG, vision, agents, API tools — plus fleet orchestration for autonomous agent computers. Your infrastructure, your data, your control.
Helix started here
Before agent computers, before fleet orchestration, before any of that — Helix was a private GenAI stack. A way to run large language models, build intelligent agents, connect RAG pipelines, and process documents with vision models — all on your own infrastructure. No API keys to OpenAI. No data leaving your network. No vendor lock-in.
That hasn't changed. Everything the platform could do before, it still does. We've just added an entirely new layer on top.
What the private GenAI stack gives you
Large language models on your hardware
Run open-source LLMs — Llama, Mistral, DeepSeek, Qwen, and more — on your own GPUs. Helix handles model serving, GPU scheduling, and inference optimization. You choose the models. You control the versions. You own the weights.
- Multi-model support — run different models for different use cases, swap them without changing application code
- GPU scheduler — intelligent routing across available GPUs, automatic queuing, keeping GPUs busy instead of idle
- Quantization and optimization — run larger models on smaller hardware with minimal quality loss
- Air-gap ready — models run entirely offline, no outbound internet required
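To make "swap models without changing application code" concrete, here is a minimal sketch that assumes the stack exposes an OpenAI-style chat endpoint on localhost. The URL, environment variable, and model names below are illustrative assumptions, not documented Helix API details:

```python
# Sketch: calling a local OpenAI-compatible chat endpoint.
# HELIX_URL and the model names are assumptions for illustration.
import json
import os
import urllib.request

HELIX_URL = os.environ.get("HELIX_URL", "http://localhost:8080/v1/chat/completions")

def build_chat_request(prompt: str, model: str) -> dict:
    """Build an OpenAI-style chat payload; swapping `model` needs no code changes."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str, model: str = "llama3:8b") -> str:
    """POST the request to the local gateway and return the reply text."""
    payload = json.dumps(build_chat_request(prompt, model)).encode()
    req = urllib.request.Request(
        HELIX_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because only the `model` string changes, moving a use case from Mistral to Qwen is a config edit, not a code change.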
RAG pipeline — text and vision
Connect your knowledge base to your models. Helix includes a full retrieval-augmented generation pipeline with a built-in vector database, chunking, and retrieval — plus vision RAG for documents that aren't just text.
- Vector database — built-in, no external dependencies
- Document ingestion — PDF, Word, HTML, Markdown, plain text
- Vision RAG — extract structured data from scanned documents, images, charts, and complex layouts that defeat traditional OCR
- Automatic chunking and embedding — configurable strategies for different document types
- Source attribution — every answer traces back to the document and page it came from
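The retrieve step above can be sketched in a few lines: split documents into chunks, embed them, and rank chunks by cosine similarity to the query. The hashing embedder below is a toy stand-in for a real embedding model, and the fixed-size chunker stands in for the configurable strategies mentioned above:

```python
# Sketch of chunk -> embed -> retrieve. The hashing "embedder" is a toy
# stand-in for a real embedding model; everything else is generic RAG.
import hashlib
import math

DIM = 64

def embed(text: str) -> list[float]:
    """Toy bag-of-words hashing embedder, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(text: str, size: int = 40) -> list[str]:
    """Fixed-size word chunks; real pipelines use configurable strategies."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the top-k chunks by cosine similarity to the query."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(c))), c) for c in chunks]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [c for _, c in scored[:k]]
```

A production pipeline would also carry document and page metadata alongside each chunk, which is what makes the source attribution above possible.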
Agent framework
Build AI agents that can use tools, call APIs, and accomplish multi-step tasks. Define agents with system prompts and API tool specs — the LLM figures out how to orchestrate them.
- API tool integration — connect any API with a Swagger/OpenAPI spec, no custom SDK needed
- Multi-turn conversations — agents maintain context across complex interactions
- Skills library — pre-built integrations for common enterprise tasks
- RAG as a tool — agents can query your knowledge base as part of their workflow
- Vision as a tool — agents can process images and documents as part of their reasoning
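The dispatch step in an agent loop can be sketched as follows: the model emits a tool name and arguments, and the framework looks the tool up in a registry and invokes it. The `lookup_order` tool and the JSON call format here are illustrative assumptions, standing in for tools generated from a Swagger/OpenAPI spec:

```python
# Sketch of tool dispatch in an agent loop. Tool names and the JSON
# call shape are illustrative, not Helix's actual wire format.
import json
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}

def tool(name: str):
    """Register a callable under the name the LLM will reference."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("lookup_order")
def lookup_order(order_id: str) -> str:
    # Hypothetical API-backed tool; a real one would call the endpoint
    # described in its OpenAPI spec.
    return json.dumps({"order_id": order_id, "status": "shipped"})

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and run the named tool."""
    call = json.loads(model_output)
    return TOOLS[call["tool"]](**call["arguments"])
```

RAG and vision slot into the same registry: a knowledge-base query or an image-understanding call is just another named tool the model can invoke mid-reasoning.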
Evals and CI/CD for AI
Test your agents before they hit production. Helix includes an evaluation framework that lets you define test cases, run them automatically, and catch regressions before deployment.
- Automated test suites — define expected inputs and outputs, run on every change
- Regression detection — know immediately when a model swap or prompt change breaks something
- CI/CD integration — plug into your existing pipelines
- Quality metrics — track accuracy, latency, and cost across deployments
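The test-suite idea above can be sketched as a small harness: each case pairs an input with a check on the agent's output, and the suite reports failures so CI can block a deploy. The `agent` callable and the cases below are illustrative assumptions:

```python
# Sketch of an eval suite for CI. The agent callable and cases are
# illustrative; a real suite would also track latency and cost.
from typing import Callable

def run_suite(agent: Callable[[str], str], cases: list[dict]) -> list[str]:
    """Run every case; return the names of the cases that failed."""
    failures = []
    for case in cases:
        output = agent(case["input"])
        if case["expect"] not in output:
            failures.append(case["name"])
    return failures

CASES = [
    {"name": "refund_policy", "input": "What is the refund window?",
     "expect": "30 days"},
    {"name": "greeting", "input": "hello", "expect": "Hi"},
]
```

Wired into a pipeline as `sys.exit(1 if run_suite(agent, CASES) else 0)`, a model swap or prompt change that breaks a case fails the build before it reaches production.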
And now: agent computers on top
Everything above is the foundation. On top of it, Helix now offers something no other private GenAI stack does: agent computers.
Full GPU-accelerated desktop environments — browser, terminal, filesystem, GUI applications — where autonomous agents do real work. Not a chat interface. Not a terminal session. A complete computer per agent, fully isolated, running on your infrastructure.
This is the layer that turns a GenAI platform into a fleet orchestration system:
| Private GenAI stack | + Agent computers |
|---|---|
| Chat with LLMs | Agents that write code, run tests, deploy |
| RAG-powered Q&A | Agents that research, synthesize, produce documents |
| API tool agents | Agents with full desktop environments and GUI access |
| Single-user interactions | Fleet of 15+ agents working in parallel |
| Prompt-and-response | Spec coding → kanban pipeline → human review → merge |
You don't have to choose between a private GenAI platform and agent computers. Helix is both.
Why it matters that it's private
Every API call to a third-party AI service is a dependency you don't control. Every model update could break your workflows. Every rate limit could slow your business.
With Helix on your infrastructure:
- Your data never leaves your network — prompts, responses, documents, agent activity. All of it stays on hardware you control.
- No vendor lock-in — swap models freely, run multiple models simultaneously, never worry about an API being deprecated.
- Predictable costs — no per-token pricing surprises. Your hardware, your electricity, your budget.
- Compliance by default — SOC 2 Type II, ISO 27001 certified. HIPAA ready. Air-gap deployable for classified environments.
- No rate limits — your GPUs, your throughput. Scale by adding hardware, not by negotiating with a vendor.
Deployment options
On your Mac: The full Helix stack — LLMs, RAG, agents, and agent computers — running on Apple Silicon. $299/year. Start 24-hour free trial →
On Helix Cloud: Managed infrastructure, zero setup. Same capabilities, we handle the GPUs. Join the waitlist →
On your Kubernetes cluster: Enterprise deployment with RBAC, SSO, audit logging, and unlimited agents. Air-gap support available. Talk to us about an enterprise pilot →
From GenAI platform to fleet orchestration
The teams that deployed private GenAI stacks early got a head start. The teams that are adding agent computers to that stack now are compounding that advantage.
Helix gives you both layers in one platform — the private GenAI foundation your compliance team requires, and the agent computer fleet your engineering team is already asking for.