| Component | Minimum | Recommended for Production |

- **OS**: Ubuntu 22.04 LTS or 24.04 LTS, RHEL 9, Rocky Linux 9

2. Technical Measures (Art. 32 GDPR)

- **Encryption at rest** for all disks, databases, vector stores, and backups (LUKS, AES-256)

- **Authorized Models Registry** listing allowed models with engineering justification

Hermes Agent and GDPR: how to deploy it on-premise for corporate data compliance

Q: 1. Data Processing Mapping

- **Agent Scope**: comprehensive list of finalities (customer support, data analysis, RAG, etc.)

Q: 3. Organizational Measures

- **DPO Appointment** (where mandated by industry sector)

One of the most frequent questions we receive at Hermes Agent Experts is: “Does my data leave the company?”. The short answer is no, if you deploy Hermes Agent on-premise. In this article, we look at how to configure Hermes Agent for GDPR compliance, NIS2, and regulated industries (healthcare, finance, public sector, defense), with a hands-on focus on what is truly needed in production.

The Problem: Corporate Data and Cloud Models

The adoption of generative AI in European enterprises is held back by one specific issue: corporate data is an invaluable asset, and sending it to an American API (OpenAI, Anthropic, Google) or a Chinese API (DeepSeek, Qwen) presents significant risks:

GDPR Violations (transfer of data to third-party countries without adequate protections under Schrems II)
NDA Violations with clients and partners
Loss of Competitive Advantage (prompts and corporate datasets can be used for model training, even when vendors claim otherwise)
Non-compliance with Sector-specific Regulations (healthcare, finance, public sector, and defense have strict in-house restrictions)

Hermes Agent was built specifically to solve this: it is an open-source agent that runs wherever you decide, speaks to any model (local or API-based), and lets you build an AI infrastructure completely under your control.

What “On-Premise” Means for Hermes Agent

“On-premise” in the context of Hermes Agent refers to three concrete aspects:

The agent runs on your server (physical, VPS, private cloud, or hybrid). No API calls to proprietary orchestration services.
The model runs on your hardware (or your private cloud). Self-hosted via vLLM, Ollama, llama.cpp, or TGI.
Data remains in your storage (PostgreSQL, Qdrant, local file system, or on-prem S3-compatible systems like MinIO).

There are no “phone home” components by default. No hidden telemetry. The codebase is open-source (MIT-like license) and completely auditable.

On-Premise Reference Architecture

[ User PC ] ←HTTPS→ [ Reverse proxy (Caddy/Traefik) ]
                              ↓
              [ Hermes Agent Core (Docker/K8s) ]
                              ↓
       ┌──────────┬──────────┬──────────┬──────────┐
       ↓          ↓          ↓          ↓          ↓
   [ LLM    ]  [ Vector ]  [ SQL DB ]  [ Tools ]  [ MCP    ]
   [ self-h.]  [ Store  ]  [          ]  [        ]  [ servers]
       ↓          ↓          ↓          ↓          ↓
   [ GPU     ]  [ /data  ]  [ /db    ]  [ /tmp  ]  [        ]
   [ server  ]  [        ]  [        ]  [        ]  [        ]

Everything sits inside your corporate perimeter. The reverse proxy handles TLS, authentication (SSO/LDAP/OAuth), and rate limiting. Hermes Agent communicates with self-hosted models via a local API (OpenAI-compatible), with the vector store via a private API, and with corporate tools via API/MCP.

Minimum Requirements for an On-Premise Deployment

Hardware

Component	Minimum	Recommended for Production
CPU	8 vCPUs (for small models)	32+ vCPUs
RAM	32 GB	128+ GB
GPU	Optional (quantized models on CPU)	1-4× NVIDIA L40S / A100 / H100
Storage	500 GB SSD	2-10 TB NVMe (for vector store and logging)
Network	1 Gbps	10 Gbps

Software

OS: Ubuntu 22.04 LTS or 24.04 LTS, RHEL 9, Rocky Linux 9
Containers: Docker + Docker Compose, or Kubernetes (Rancher, OpenShift, K3s)
LLM Serving: vLLM (recommended), Ollama (dev), TGI (HuggingFace), or llama.cpp (edge)
Vector Store: Qdrant (recommended), Weaviate, pgvector
Database: PostgreSQL 15+
Observability: Prometheus + Grafana + Loki (or Elastic Stack)
Auth: Keycloak, Authentik, or your corporate IdP
TLS: Internal Let’s Encrypt or corporate PKI

Recommended LLMs for On-Premise (2026)

Model	Parameters	Use case	Minimum VRAM
Llama 3.3 70B Instruct	70B (Q4)	General purpose, top-tier quality	48 GB
Qwen 2.5 72B Instruct	72B (Q4)	Multilingual (IT/EN/FR/ES/DE/CN)	48 GB
Mistral Large 2 (123B)	123B (Q4)	Complex reasoning, coding	80 GB
DeepSeek-V3	67B MoE	Coding + reasoning, highly cost-effective	48 GB
Mixtral 8x22B	141B (MoE active 39B)	Balanced quality and inference speed	48 GB
Llama 3.1 8B Instruct	8B (Q4)	Edge, simple tasks, ultra-low VRAM	8 GB
Phi-3 Medium	14B	Compact, outstanding quality	12 GB

For the European market, Qwen 2.5 72B is regularly the best choice: excellent quality in English, Italian, German, French, and Spanish, performing comparable to Llama 3 70B on MMLU benchmarks and remaining much more cost-effective to host.

Here is the operational checklist we execute for every client deployment:

1. Data Processing Mapping

Agent Scope: comprehensive list of finalities (customer support, data analysis, RAG, etc.)
Processed Data Types: classifications (PII, health data, financial data, code, internal documentation)
Legal Basis: contractual execution, legitimate interest, or consent
Retention: established storage durations for logs, vector stores, and backup structures
Registry of Processing Activities (Art. 30 GDPR) updated

Encryption at rest for all disks, databases, vector stores, and backups (LUKS, AES-256)
Encryption in transit (TLS 1.3, internal mTLS)
Data Segregation by corporate department/project boundaries
Access Control: RBAC, principle of least privilege, and MFA policies for all administrators
Comprehensive Audit Logging: signed and retained for compliance mandates
Encrypted Backups, regularly tested and kept off-site
Annual Penetration Testing
Continuous Vulnerability Scanning (Trivy, Snyk, OpenSCAP)

3. Organizational Measures

DPO Appointment (where mandated by industry sector)
DPA (Data Processing Agreements) signed with all subprocessors (e.g., hosting providers, LLM companies if API used)
Employee Training on secure AI usage patterns
Acceptable Use Policy signed by all active users
Automated DSR Workflows (handling rights of access, rectification, erasure, portability)
Data Breach Notification Procedure configured for triggers within 72 hours (Art. 33 GDPR)

4. Model Governance

Authorized Models Registry listing allowed models with engineering justification
Change Management protocols governing model updates
Model Traceability: linking specific generated outputs back to the engine version/weight hash (essential for auditing)
Human-in-the-loop Sample Reviews to ensure quality and compliance
Periodic Red Teaming exercises targeting prompt injections and jailbreaks

Hands-On: Minimal docker-compose.yml

Here is a starting docker-compose.yml for a secure, compliant on-premise installation:

version: '3.8'

services:
  # Self-hosted LLM (vLLM with quantized Qwen 2.5 72B)
  llm:
    image: vllm/vllm-openai:latest
    runtime: nvidia
    environment:
      - MODEL=Qwen/Qwen2.5-72B-Instruct-AWQ
    volumes:
      - /opt/models:/models
    ports:
      - "8000:8000"
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]

  # Hermes Agent Core
  hermes:
    image: nousresearch/hermes-agent:latest
    environment:
      - HERMES_LLM_BASE_URL=http://llm:8000/v1
      - HERMES_VECTOR_STORE=qdrant
      - HERMES_QDRANT_URL=http://qdrant:6333
      - HERMES_DB_URL=postgresql://hermes:***@db:5432/hermes
    volumes:
      - /opt/hermes/data:/data
      - /opt/hermes/skills:/skills
    ports:
      - "8080:8080"
    depends_on:
      - llm
      - qdrant
      - db

  # Secure Vector Store
  qdrant:
    image: qdrant/qdrant:latest
    volumes:
      - /opt/qdrant:/qdrant/storage
    ports:
      - "6333:6333"

  # Core Database
  db:
    image: postgres:15
    environment:
      - POSTGRES_DB=hermes
      - POSTGRES_USER=hermes
      - POSTGRES_PASSWORD=***
    volumes:
      - /opt/pgdata:/var/lib/postgresql/data

  # TLS Reverse Proxy
  proxy:
    image: caddy:2
    volumes:
      - /opt/caddy/Caddyfile:/etc/caddy/Caddyfile
      - /opt/caddy/data:/data
    ports:
      - "443:443"

All micro-frontend and engine traffic stays enclosed within the internal container network. Only the secure TLS reverse proxy is exposed externally, enforcing corporate SSO.

Audit Logs: What to Trace

Hermes Agent details can be configured for granular tracing. The following events must be stored:

User Identity: who executed the instruction (User ID, IP address, timestamp, user agent string)
User input: actual prompt (subject to internal DLP - data loss prevention filters)
Tool calls: which tools were invoked along with their specific arguments
Agent response: exact generated outputs
Model engine: exact model variant hash used
Inference Latency: total execution/response times
Status Outcome: success, error code, fallback, or human escalation event
Token consumption tracking: total input/output tokens and execution cost metrics

These application audit logs should be automatically routed directly into your corporate SIEM (Splunk, Elastic, Datadog) for log correlation.

Exercising the Right to be Forgotten (DSR)

When a customer or employee exercises their right to erasure (Art. 17 GDPR), we must execute workflows to securely scrape all associated records from:

Authentication systems (deactivating their profile, purging active tokens)
Vector index store (removing all document chunks mapped to their user ID)
Chat history databases (purging conversation threads)
Application logs (running GDPR pseudonymization scripts on logs)
Backups (data is overwritten according to standard backup rotation schedules)

We build automated, idempotent scripts that execute these purges cleanly, leaving an auditable validation of completed deletion.

NIS2 and Highly Regulated Environments

For sectors under the NIS2 Directive (energy, transportation, health, digital infrastructure, public sector, etc.), Hermes Agent can be hardened to guarantee:

Incident notifications within 24 hours (early alerting) and 72 hours (comprehensive notification) to target regulators
Cyber security risk posture tracking: treating the AI controller as a critical corporate asset
Business continuity plans: highly available active-active clusters, cold recovery procedures, precise RTO/RPO metrics
Supply chain verification: complete SBOM analysis of all core models, deep dependencies, and system container files

When Hybrid Cloud Makes Sense (And When It Doesn’t)

Pure On-Premise deployments are essential for:

Healthcare, finance, public sector, and defense industries
Entities legally bound by national data sovereignty regulations
Workflows that process trade secrets, patents, or intellectual property

Hybrid Cloud deployments (self-hosted models paired with highly specific vendor APIs) make sense for:

SMBs looking to minimize upfront server investments
Highly complex multi-modal tasks requiring global frontier models
Non-sensitive corporate tasks (e.g., market research, generating marketing copy, public documentation)

Pure, Commercial Cloud APIs are only appropriate for:

Open-access public data assets
Early sandbox experimentation
Companies utilizing verified enterprise non-training configurations residing in adequacy-approved countries

Licensing and Implementation Costs

An on-premises deployment represents an upfront investment (servers or dedicated private cloud resources) and maintenance effort. However, the 3-year TCO is regularly highly competitive with per-seat private cloud subscriptions, especially for organizations with steady high workloads.

Every deployment is quoted on a bespoke basis to align perfectly with the target infrastructure, integrations, and compliance requirements.

Studio Synapse offers a free 30-minute technical assessment and consulting session without any obligation. This session is designed to explore your use case, outline an appropriate deployment architecture, and provide a tailor-made quote. Contact us directly at contatti@hermesagentexperts.com.

Frequently Asked Questions

Is Hermes Agent GDPR compliant?

Hermes Agent is built to be GDPR-friendly by design: it can be installed completely on-premise with self-hosted models, so personal data never leaves the client's network. Full GDPR compliance is a factor of overall system setup: establishing processing registers, implementing strict access boundaries, defining data life cycles, and setting up DSR procedures. We deliver the entire compliant stack out of the box.

Can I use Hermes Agent with healthcare data?

Yes. In fully on-premise deployments with local self-hosted models. For health data ( EHR, diagnosis scripts, doctor reports) strict compliance mandates that data remains completely in-house and cannot be processed via third-party web APIs. Hermes Agent was constructed precisely for such restricted use cases.

Can Hermes Agent run within a VPS, or does it require dedicated local servers?

Hermes Agent is infrastructure-agnostic. It runs seamlessly on cloud VPS interfaces, private clouds, physical bare metal hardware, or hybrid configs. The ultimate choice depends on your security profile: defense, healthcare, and finance sectors frequently demand physical, offline private servers; for mid-market SMBs, a trusted EU-based cloud VPS hosting provider (such as Hetzner or OVH) with encrypted backups is excellent.

Which models can I run locally with Hermes Agent?

Any open-weights model compatible with standard hosting architectures: Llama 3 (Meta), Mistral, Mixtral, Qwen 2.5 (Alibaba), DeepSeek-V3, Gemma (Google), or Phi-3 (Microsoft). Performance matches closed commercial GPT-4 models on typical enterprise tasks. For edge queries requiring frontier models, we can leverage zero-data-retention APIs from OpenAI, Anthropic, or Google.

How is data retention handled in Hermes Agent?

By default, Hermes Agent does not store user data in permanent memory unless you explicitly enable long-term user memory tracking. Chat threads are ephemeral. Log tracing retention is configurable (typically 30 to 90 days), and index datasets within the vector DB follow custom retention policies, strictly respecting the GDPR minimization principle.

Does the agent support Right to be Forgotten (DSR) workflows?

Yes, via comprehensive, preconfigured cleanup scripts. These workflows scrub the user from authentication systems, purge their chunked history inside vector databases, delete relational database logs, and cleanly pseudonymize operational logs. We deliver a traceable script process for verified auditing.

What details are tracked in the audit logs?

The agent logs every key transaction: auth token context, user prompts, called core tools, read/write actions, model engines, processing latency, and token totals. Logs are cryptographically signed and archived. They connect natively to enterprise SIEM platforms such as Splunk, Elastic, or Datadog.

Is Hermes Agent approved for regulated sectors (banking, public sector, insurance)?

Yes, with custom configurations. We match Bank of Italy regulations (Circular 285) and international DORA mandates for financial institutions, IVASS policies for insurance groups, and the Italian Cloud strategy / AgID guidelines for Government installations. Our mix of self-hosted engines, encrypted storage, and end-to-end security satisfies strict compliance profiles.

What is the cost of an enterprise on-premise installation?

Every project is quoted on a bespoke basis according to integration complexity, legacy systems, security parameters, and requested SLA tiers. Studio Synapse provides detailed estimates only following an initial technical discovery. We offer a free 30-minute technical assessment and consulting session to map your project requirements.

What compliance standards do your installation processes follow?

At Studio Synapse, we function under processes mapped to ISO/IEC 27001 standards and execute according to OWASP and NIST CSF cybersecurity frameworks. We happily sign NDAs, DPAs (Data Processing Agreements), and support external compliance audits. For custom regional or medical requirements, we work alongside specialist cybersecurity legal counsels.

Conclusion

Hermes Agent is, today, one of the most effective technical answers to the data sovereignty challenge in the era of generative AI. On-premise, self-hosted, open-source, and model-agnostic: a combined feature set that enables European companies to implement AI without making concessions on compliance.

At Hermes Agent Experts (a brand of Studio Synapse), we deliver, configure, and manage on-premise Hermes Agent clusters for clients across Europe with an unwavering focus on compliance by design. For an honest evaluation of your specific operational goals, contact us directly at contatti@hermesagentexperts.com — we respond within one business day.