How to set up private AI for your company — the complete guide
A practical walkthrough for CISOs, CTOs, and IT leads who want enterprise AI without sending company data to OpenAI, Anthropic, or Google. Covers self-hosted vs cloud, deployment patterns, and the questions your security team will ask.
If you're reading this, someone on your team — probably your CEO — has asked: "Why can't we just use ChatGPT for everything?" And you, the person responsible for keeping data safe, said: "Because we can't send our contracts, emails, and financial data to OpenAI's servers."
That conversation is happening in every company with more than 20 employees. Here's the guide for what to do about it.
What "private AI" actually means
The term gets thrown around loosely. Let's be precise. A private AI system is one where:
- Your data never trains anyone's model. Not the vendor's, not a third party's, not a research partner's. Contractually guaranteed and technically enforced.
- Your prompts and responses are not visible to the vendor. No human review, no logging for model improvement, no cross-customer caching.
- You control where the data lives. Either on your infrastructure (self-hosted) or in a dedicated, single-tenant instance that you can audit.
If any of these three conditions isn't met, it's not private AI — it's just AI with a privacy policy.
The three deployment models
1. Self-hosted (air-gapped capable)
You run everything: the LLM inference endpoint, the vector database, the orchestration layer, the frontend. Nothing leaves your network.
Best for: Regulated industries (healthcare, legal, financial services), defense contractors, companies with strict data residency requirements, anyone who needs to operate without internet.
What you need: Docker or Kubernetes, 32+ GB RAM, a GPU if you want local inference (or an Azure OpenAI endpoint you control), and someone to maintain it.
Example: Doyna Self-Hosted runs as a Docker Compose stack with PostgreSQL, Qdrant, Neo4j, Redis, and a FastAPI backend. Total setup time: about a day.
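As a rough sketch, a stack like that might be wired together as follows. Service names, image tags, ports, and environment values here are illustrative placeholders — not Doyna's actual configuration:

```yaml
# Illustrative docker-compose.yml for a self-hosted private AI stack.
# All images, ports, and credentials below are placeholder assumptions.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me
    volumes:
      - pgdata:/var/lib/postgresql/data
  qdrant:            # vector database for semantic search
    image: qdrant/qdrant:latest
  neo4j:             # graph store for entity relationships
    image: neo4j:5
  redis:             # cache / task queue
    image: redis:7
  api:               # FastAPI backend
    build: ./backend
    ports:
      - "8000:8000"
    depends_on: [postgres, qdrant, neo4j, redis]
volumes:
  pgdata:
```

The point of the exercise: every service binds inside your network, so "nothing leaves your network" is a property you can verify from the compose file and your firewall rules, not a promise in a privacy policy.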
2. Managed cloud (single-tenant)
A vendor runs the infrastructure for you, but your data lives in a dedicated instance — not shared with other customers. The vendor handles updates, scaling, and monitoring. You get the privacy guarantees of self-hosted without the ops burden.
Best for: Teams of under 200 people who want private AI but don't have a DevOps team. Companies that need to move fast but still answer to a CISO.
What to look for: Single-tenant architecture (not multi-tenant with "data isolation"), contractual zero-training clause, EU or region-specific hosting options, SOC 2 or equivalent compliance.
Example: Doyna Cloud starts at €23/month per seat. Each customer gets their own isolated instance in the EU. No shared databases, no shared LLM context.
3. "Private" features on public platforms
Products like ChatGPT Enterprise, Microsoft Copilot, and Google Gemini for Workspace offer opt-out from training and some data isolation. But:
- Your data still flows through their multi-tenant infrastructure
- You can't audit the physical data path
- You're trusting their privacy policy, not your own infrastructure
- You have zero control if they change terms
This is fine for low-sensitivity use cases. It's not private AI in the meaningful sense.
What your security team will ask
Having deployed private AI at multiple companies, we've seen the same questions come up in every security review — here they are, with the answers your CISO expects.
"Where does the data go?"
Good answer: "The data stays in our AWS/Azure/GCP VPC (self-hosted) or in a single-tenant instance in the EU that we can audit." Bad answer: "The vendor says they don't use our data for training."
"What LLM are we calling?"
Good answer: "Azure OpenAI in our own tenant, or a self-hosted model via vLLM/Ollama." Bad answer: "OpenAI's API directly."
"What happens if the vendor goes out of business?"
Good answer: "We have a code escrow agreement and a 60-day data export window. On self-hosted, we already have everything." Bad answer: "They're well-funded, they won't go out of business."
"Is our data used to improve the model?"
Good answer: "No. Zero-training is contractually guaranteed and technically enforced by single-tenant isolation. There is no feedback loop from our usage to any model weights." Bad answer: "We opted out of the training checkbox."
Getting started
If you're evaluating private AI for your company, here's a 5-step process:
1. Inventory your data sources. What do you want the AI to search? Email, documents, CRM, meetings, chat? The more sources you connect, the more useful the AI becomes.
2. Define your deployment preference. Self-hosted or cloud? This depends on your compliance requirements, team size, and ops capacity.
3. Evaluate vendors on the three criteria above. Data training policy, prompt visibility, data residency control.
4. Run a 7-day pilot with a small team on real (not synthetic) data. The AI is only useful if it actually has your context.
5. Measure ROI. Track: time saved per person per week, deals with better context, meetings summarized automatically, documents generated vs manually written.
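For the ROI step, a back-of-the-envelope model is enough to frame the pilot. The numbers below (seats, hours saved, loaded hourly cost, seat price) are placeholder assumptions — replace them with what you actually measure:

```python
# Sketch: rough monthly ROI for a private AI pilot.
# All inputs are assumptions to be replaced with your pilot's measurements.

def monthly_roi(seats: int, hours_saved_per_week: float,
                loaded_hourly_cost: float, seat_price: float) -> float:
    """Net monthly value: time saved (at ~4 weeks/month) minus seat cost."""
    value = seats * hours_saved_per_week * 4 * loaded_hourly_cost
    cost = seats * seat_price
    return value - cost

# e.g. 10 seats, 2 h/week saved each, €60/h loaded cost, €23/seat/month
print(monthly_roi(10, 2.0, 60.0, 23.0))  # → 4570.0
```

Even with conservative inputs, time saved usually dominates seat cost by an order of magnitude — which is why measuring hours saved per person per week is the one metric worth tracking rigorously during the pilot.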
Doyna offers a 7-day free trial with no credit card, no sales call. Connect your Gmail or Outlook, load your email, and see what private AI feels like on your own data.