Ventrin Documentation

What Ventrin is

Ventrin is a policy layer that sits between your team and any LLM provider. It has two integration surfaces:

Browser extension: intercepts prompts in ChatGPT, Claude, Gemini, and Microsoft Copilot. Before each prompt reaches the provider, Ventrin scans it and allows, warns on, rewrites, or blocks it. The user stays in their existing AI tool, with nothing new to learn.
API gateway: a drop-in replacement for POST /v1/chat/completions. Your internal apps point at Ventrin instead of OpenAI/Anthropic/Gemini; Ventrin evaluates the prompt, forwards it to the provider using your stored credential, and returns the answer along with a structured decision record.

Both surfaces share one policy engine, one set of policy packs, and one audit log. A decision a team member sees in ChatGPT comes from the same rules as a decision an app receives via the gateway.

Not an LLM wrapper. Ventrin does not run inference. You keep your existing provider contracts; we only govern what leaves your perimeter and in what shape.

How it works

Detect. A four-layer engine (regex for credentials, weighted term banks, context boosters, total score against thresholds) scans the outbound prompt.
Decide. Allow, warn, rewrite, or block based on which thresholds the score crosses and whether any hard-block pattern fired (credentials always block).
Rewrite or forward. If the decision is "rewrite", a deterministic anonymiser removes direct identifiers and swaps them for natural-language abstractions. If the decision is "allow" or "warn", the prompt is forwarded unchanged. If the decision is "block", nothing is forwarded.
Log. Every decision writes one audit row with the decision, categories, score, and a masked preview.

Detection runs entirely server-side for API traffic, and partly in the browser for extension traffic (with a belt-and-braces verifier on the server). Content that triggers a hard block never leaves Ventrin's perimeter.

Quickstart

The shortest path from zero to a protected team:

Sign up at app.ventrin.com.
Create a workspace. Pick a vertical (legal / healthcare / generic); this determines the starter policy pack.
Under API Access → Providers, add at least one provider API key (OpenAI / Anthropic / Gemini). Keys are encrypted before storage.
Under API Access → API Keys, generate a Ventrin key. Save it — the plaintext is shown once.
Under API Access → Settings, enable the gateway.
Point your app at https://www.ventrin.com/v1/chat/completions using the Ventrin key, or install the Chrome extension and sign in.
Send a test prompt. Open Events in the dashboard — you should see the event appear within a few seconds.
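
As a rough illustration of the "point your app at Ventrin" step, the sketch below builds (but does not send) a gateway request. The helper name buildGatewayRequest and the placeholder key vtk_example are made up for this example; the endpoint URL and the Authorization: Bearer scheme are the ones described in this documentation.

```javascript
// Sketch only: builds a request object without sending it.
// buildGatewayRequest is a hypothetical helper, not part of any Ventrin SDK.
function buildGatewayRequest(ventrinKey, userPrompt) {
  return {
    url: "https://www.ventrin.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${ventrinKey}`,
      },
      body: JSON.stringify({
        model: "gpt-4o-mini", // any model your stored provider key can access
        messages: [{ role: "user", content: userPrompt }],
      }),
    },
  };
}

const req = buildGatewayRequest("vtk_example", "Hello from the quickstart");
// To send it from an async context: const res = await fetch(req.url, req.init);
console.log(req.init.headers.Authorization); // Bearer vtk_example
```

Keeping the request construction separate from the network call makes it easy to unit-test the payload before any prompt leaves your app.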

First sign-in and onboarding

First-time users see an onboarding wizard that covers four things: workspace name, vertical (used to pre-select a policy pack), a starter policy review, and an invite flow for the rest of the team.

You can re-run onboarding at any time via Settings → Re-run onboarding, which is useful if you've changed vertical or want to show the product to a colleague.

Overview

Path: /

Home screen for any signed-in user. Shows recent activity, a 7-day KPI set (events, blocks, sanitisations), and quick links into Events, Policies, and API Access.

End users see only their own events; admins see everyone's.

Events & Flagged

Path: /events — sub-tab: Flagged (admins only, formerly /false-flags).

Events tab

Every evaluated prompt writes one event row. The table shows:

Timestamp, user, destination tool (ChatGPT / Claude / Gemini / Copilot)
Decision (allow / warn / sanitise / block) and risk score
Masked preview, plus a "Reveal" button for admins (audit-logged and step-up-gated)
If a rewrite was suggested: the rewritten text alongside the original

Filter chips

Click a decision chip to filter to that class, and/or a tool chip to filter to one destination. Chips toggle.

Reveal

Clicking the eye icon next to an event title calls the revealPrompt callable. You'll be asked for a reason (min 3 chars) which is stored on an audit row (promptRevealLogs). The session must have re-authenticated within the last 5 minutes — if not, you'll need to sign out and back in first.

Flagged tab

Users can report an intervention as a false positive from the extension. Those reports land here for admin triage: mark as False positive (will feed back into policy tuning) or Uphold rule.

Policies

Path: /policies — sub-tabs: Rules and Packs.

Rules tab

Every rule has: a name, category, severity (low/medium/high/critical), action (allow/warn/suggest_rewrite/block), enabled toggle, and one or more patterns (regex, keyword, or context). You can edit severity and action inline; pattern editing is coming in a later release.

Packs tab

Packs are the higher-level grouping. A workspace has a set of active packs; the engine composes them at evaluation time using max-weight (safest wins) semantics. Three packs ship today:

General — PII, credentials, basic company-internal. Always on.
Legal — client / matter / counterparty language, privileged text, case numbers.
Healthcare — patient context, NHS / MRN identifiers.

Each pack has per-category toggles, an allowlist of phrases to never flag, and a strictness multiplier (0.5–2.0).

Protections

Path: /protections — sub-tabs: Domains and Tracked Names.

Domains tab

Per-tool access mode (allowed, allowed + warn, strict, blocked) and optional per-team overrides. Use this to lock down a specific tool for a specific team (e.g. block Gemini for the regulated-case team while permitting ChatGPT workspace-wide).

Tracked Names tab

Workspace-specific terms Ventrin should always anonymise. Typically: client names, internal project codenames. Each term gets a mapping (e.g. "Acme Ltd → the organisation"); when a prompt contains the term, the rewriter uses the mapping.

Users & Teams

Path: /users — sub-tabs: Members and Teams.

Members tab

Workspace members and pending invites. Admin actions: invite, revoke invite, change team assignment, change role.

Teams tab

Group members for scoped policies and domain overrides. Each team has an optional strictness boost (−1 to +2) that biases thresholds tighter or looser for that team only.

API Access

Path: /api-access — tabs: API Keys, Providers, Logs, Settings.

API Keys tab

Create, rotate, and revoke Ventrin keys (vtk_…). Each key carries:

Label — free text for you to find it later.
Policy packs — which packs apply to requests on this key (empty = workspace default).
Allowed providers — restrict to a subset.
Bound provider key — pin to one specific stored provider credential.
RPM limit — per-key rate cap. Empty = workspace default.

The plaintext value is shown once at creation. To rotate without downtime, click Rotate: a new key is generated and the old one keeps working for 24 hours while you redeploy.

Providers tab

Stored API credentials for OpenAI / Anthropic / Gemini, encrypted with AES-256-GCM before writing to Firestore. You can add multiple keys per provider and mark one default. A shape warning appears if a pasted key's prefix doesn't match the selected provider (e.g. you chose OpenAI but pasted a Gemini key).

Logs tab

Every gateway request writes one row. Columns:

Time, decision, provider, model, service tag, score, latency
Top matched categories
Masked preview of the prompt

Filterable by decision, provider, and API key. The full body is stored (encrypted) only when the workspace enables Full body logging. Revealing a full body requires step-up auth and is audited.

Settings tab

Workspace-level gateway config:

Gateway enabled — master switch. When off, every request returns 403.
Per-key RPM default — used by keys with no explicit limit.
Workspace RPM — sum across all keys.
Strict mode — warnings auto-escalate to blocks.
Log mode — masked preview only (default) or full encrypted body (opt-in).

Settings

Path: /settings

Workspace identity + global posture:

Workspace name, vertical, business context (one-line description used to tune anonymiser wording)
Anonymisation mode — suggest, auto, or block if unsafe
Reveal requires confirmation (default: on)
Allow user "continue anyway" override
Fail-closed on evaluation error — if the policy call fails, block rather than allow
Prompt storage mode — full encrypted, full + masked preview, or reduced

Installing the extension

Download the zip from ventrin.com/download and unzip it, or use the unpacked build in extension/dist.
Open chrome://extensions, enable Developer Mode, click Load unpacked, and point it at that folder.
Pin Ventrin to the toolbar. A red "!" badge means not signed in.

The extension supports Chrome and Chromium-based browsers (Edge, Brave, Arc). Firefox support is on the roadmap.

Signing in

Click the extension icon. You'll see sign-in / sign-up options. If you already have a workspace from the dashboard, use the same email + password; otherwise sign up and redeem your workspace invite token.

Your ID token is cached in chrome.storage.local and refreshed every 30 minutes. If the token expires and the service worker is asleep, the content script will re-fetch it on the next prompt.

The intervention panel

When you press Send on ChatGPT / Claude / Gemini / Copilot, Ventrin evaluates your prompt before it leaves the page. One of four things happens:

Allowed — no intervention; the prompt flows to the provider.
Warn — a toast appears briefly; the prompt still flows. The warning is logged so admins can see patterns.
Suggest rewrite — a side panel appears with an anonymised version. Options: Use safe version (replaces the input and re-submits), Edit manually (shows the diff and lets you fix it), Continue anyway (if policy permits — logged distinctly), Report false flag.
Block — send is cancelled, the panel explains why, and offers a one-click fix when possible.

If the extension detects a credential in your prompt (API key, password, bearer token), it hard-blocks regardless of policy — there's no safe version to offer.

File / document scan

The extension intercepts file uploads to ChatGPT / Claude / Gemini when the page exposes a file input. Documents are scanned client-side (no upload until the scan passes). Supported formats: plain text, Markdown, PDFs, and .docx.

If the document contains credentials or clear direct identifiers, the upload is cancelled and the user sees a notice. For softer signals, the file is flagged but the user can choose to continue.

Troubleshooting

The extension shows a red "!" badge and prompts aren't being evaluated: You're signed out. Click the extension icon → sign in with your workspace credentials.
The panel doesn't appear when I press Send: Check the extension is enabled for the current site. ChatGPT and Claude occasionally change their submit-button selectors; we update the extension as they change, but reload the page first.
Prompts feel slow after I disabled or removed the extension: Fixed in the April 2026 release. Old versions' page-world hook kept waiting for a content script that's no longer running; we now retire the hook within 3 seconds on disable. Reload open tabs if you're on an older build.
Ventrin is blocking a prompt that seems fine: Use Report false flag in the panel. A workspace admin reviews these under Events → Flagged and can adjust the relevant rule or add an allowlist phrase.

Four layers

Hard patterns

Regex for credentials, API keys, private keys, bearer tokens, and password mentions. The first match returns blocked with hardBlock: true; hard-pattern matches bypass scoring entirely.

Weighted terms

Per-pack keyword banks (legal, healthcare, general). Duplicate terms across packs resolve to the maximum weight.

Context boosters

Adjacency phrases that amplify the weights of nearby terms. This prevents "contract" alone from triggering a legal alert.

Scoring

The weighted-term and context-booster scores (layers 2 and 3) sum into a total, which is compared against the pack thresholds: warn, sanitise, block.
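
The first three layers can be sketched roughly as follows. This is an illustration under stated assumptions, not Ventrin's engine: the function name scorePrompt, the single hard pattern, the term weights, the booster phrase and its multiplier, and the 40-character proximity window are all invented for the example.

```javascript
// Hypothetical sketch of layers 1 to 3. Patterns, weights, booster phrases,
// the multiplier, and the 40-character window are made-up illustrative values.
const hardPatterns = [/-----BEGIN [A-Z ]*PRIVATE KEY-----/];

function scorePrompt(prompt, terms, boosters, windowChars = 40) {
  // Layer 1: the first hard-pattern match blocks immediately and skips scoring.
  if (hardPatterns.some((re) => re.test(prompt))) {
    return { decision: "blocked", hardBlock: true };
  }
  const text = prompt.toLowerCase();
  let total = 0;
  for (const [term, weight] of Object.entries(terms)) {
    const at = text.indexOf(term); // first occurrence only, for brevity
    if (at === -1) continue;
    let factor = 1;
    for (const [phrase, multiplier] of Object.entries(boosters)) {
      const boosterAt = text.indexOf(phrase);
      // Layer 3: a booster amplifies a term only when it appears nearby.
      if (boosterAt !== -1 && Math.abs(boosterAt - at) <= windowChars) {
        factor = Math.max(factor, multiplier);
      }
    }
    total += weight * factor; // layer 2 weight, amplified by layer 3
  }
  return { hardBlock: false, total };
}

const terms = { contract: 5, privileged: 20 };
const boosters = { "our client": 3 };
console.log(scorePrompt("Explain what a contract is.", terms, boosters).total); // 5
console.log(scorePrompt("Review our client contract.", terms, boosters).total); // 15
```

The two sample prompts show the point of boosters: the same term scores low on its own and higher when client context sits next to it.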

Policy packs

A pack is a versioned JSON bundle with patterns, terms, boosters, hard blocks, and rewrite rules. Packs compose: a workspace enables N packs and the engine merges them at evaluation time (max-weight per term; most-sensitive threshold wins).

Three packs ship today (see Policies).
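
The max-weight merge can be pictured with a small sketch. The pack object shape ({ name, terms }) and the weights are assumptions for illustration; the real pack JSON schema is not shown in this documentation.

```javascript
// Hypothetical sketch: merge per-pack term banks, keeping the highest
// weight when the same term appears in more than one pack.
function mergeTermBanks(packs) {
  const merged = {};
  for (const pack of packs) {
    for (const [term, weight] of Object.entries(pack.terms)) {
      merged[term] = term in merged ? Math.max(merged[term], weight) : weight;
    }
  }
  return merged;
}

const generalPack = { name: "general", terms: { confidential: 10, contract: 2 } };
const legalPack = { name: "legal", terms: { contract: 5, privileged: 20 } };
console.log(mergeTermBanks([generalPack, legalPack]));
// { confidential: 10, contract: 5, privileged: 20 }
```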

Thresholds & strict mode

General pack defaults:

warn — total ≥ 10
sanitise — total ≥ 40
block — total ≥ 85

When multiple packs are active, each threshold is taken as the minimum across packs — safer wins. Strict mode (on the gateway) promotes every warn to block, useful for internal tools where even low-signal content should halt.
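
A minimal sketch of the threshold rules, assuming a simple pack shape: only the General defaults (10 / 40 / 85) come from this page, while the second pack's numbers and the function names mergeThresholds and classify are invented for illustration.

```javascript
// Hypothetical sketch: take the minimum of each threshold across active
// packs, then classify a total, with strict mode promoting warn to block.
function mergeThresholds(packs) {
  return {
    warn: Math.min(...packs.map((p) => p.thresholds.warn)),
    sanitise: Math.min(...packs.map((p) => p.thresholds.sanitise)),
    block: Math.min(...packs.map((p) => p.thresholds.block)),
  };
}

function classify(total, thresholds, strictMode = false) {
  if (total >= thresholds.block) return "block";
  if (total >= thresholds.sanitise) return "sanitise";
  if (total >= thresholds.warn) return strictMode ? "block" : "warn";
  return "allow";
}

const activePacks = [
  { name: "general", thresholds: { warn: 10, sanitise: 40, block: 85 } },
  { name: "example", thresholds: { warn: 8, sanitise: 45, block: 70 } }, // invented
];
const merged = mergeThresholds(activePacks);
console.log(merged);                    // { warn: 8, sanitise: 40, block: 70 }
console.log(classify(9, merged));       // warn
console.log(classify(9, merged, true)); // block
```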

Rewrite engine

When a score crosses the sanitise threshold:

Deterministic anonymise — regex + NER spans mapped to industry-appropriate natural language ("our client", "the organisation", "the matter").
Coref-lite — subsequent pronouns that refer to replaced entities are collapsed to the abstraction.
Quasi-identifier check — combinations like "small firm + London + asylum" flag a residual-risk warning even when no single identifier remains.
Role-aware polish — fixes awkward grammar left behind, keeps first-person voice ("I am acting for a client…" not "our client").
Verifier pass — re-scans the rewritten text; if any direct identifier still matches, blocks instead of forwarding.
Optional LLM polish — Gemini Flash smooths the sanitised output for naturalness. The original prompt is never sent; only the already-sanitised text.
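
The verifier pass in particular is easy to picture in code. The sketch below is an assumption-laden illustration: the two identifier patterns are simple stand-ins rather than Ventrin's detectors, and verifyRewrite is a hypothetical name. It only shows the rule that a rewrite with a surviving direct identifier is blocked rather than forwarded.

```javascript
// Hypothetical sketch of the verifier pass. The identifier patterns are
// illustrative stand-ins, not Ventrin's real detectors.
const identifierPatterns = [
  /[\w.+-]+@[\w-]+\.[\w.-]+/,      // email-like strings
  /\b\d{3}[ -]?\d{3}[ -]?\d{4}\b/, // 10-digit numbers in 3-3-4 grouping
];

function verifyRewrite(rewritten) {
  // Re-scan the rewritten text; any surviving identifier means block.
  const leaked = identifierPatterns.some((re) => re.test(rewritten));
  return leaked
    ? { status: "blocked" }
    : { status: "sanitised", sanitised_prompt: rewritten };
}

console.log(verifyRewrite("Email our client at jane@example.com").status); // blocked
console.log(verifyRewrite("Email our client about the matter").status);    // sanitised
```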

Tracked names

Workspace-specific terms the generic packs wouldn't catch. Add the exact string (case-insensitive) and a natural-language replacement. The rewriter uses your mapping instead of the default.

Examples: Acme Ltd → the organisation, Project Aurora → the internal project, Mary Jones → the patient.
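
A tracked-name mapping behaves roughly like a case-insensitive find-and-replace. The sketch below assumes plain substring matching and a longest-term-first order; both are guesses made for the example, and applyTrackedNames is a hypothetical helper.

```javascript
// Hypothetical sketch of tracked-name replacement (case-insensitive, exact string).
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function applyTrackedNames(prompt, mappings) {
  let out = prompt;
  // Replace longer terms first so "Acme Ltd" wins over a shorter "Acme".
  const termsByLength = Object.keys(mappings).sort((a, b) => b.length - a.length);
  for (const term of termsByLength) {
    const pattern = new RegExp(escapeRegExp(term), "gi");
    // A function replacer avoids special handling of "$" in the replacement.
    out = out.replace(pattern, () => mappings[term]);
  }
  return out;
}

const mappings = {
  "Acme Ltd": "the organisation",
  "Project Aurora": "the internal project",
};
console.log(applyTrackedNames("Draft a memo to ACME LTD about project aurora.", mappings));
// Draft a memo to the organisation about the internal project.
```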

API overview

A drop-in replacement for OpenAI's /v1/chat/completions that runs every request through Ventrin's policy engine before forwarding to your chosen provider. The wire format is OpenAI-compatible; the difference is a mandatory status field on every response.

Full API reference: /api-guide.

Authentication

Send a Ventrin key (vtk_…) in either the Authorization: Bearer header or the X-Ventrin-Api-Key header. Only the SHA-256 hash of the key is persisted; the plaintext is shown once at creation.

If you leak a key, revoke it under API Access → API Keys → Revoke. To rotate with zero downtime, click Rotate: a new key is issued and the old one keeps working for 24 hours.

Provider keys & per-request selection

A workspace can store many provider credentials — for example three OpenAI keys for three projects. The credential used for a request is resolved in this order:

metadata.provider_key_id in the request body
X-Ventrin-Provider-Key header
The provider key bound to the authenticating API key (set in the dashboard)
The isDefault provider key for the inferred provider
Any enabled provider key for that provider
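
The resolution order above can be read as a chain of fallbacks. In this sketch the record fields (id, provider, enabled, isDefault, boundProviderKeyId) and the choice to fall through silently when an explicit id is not found are assumptions; the gateway's actual behaviour for an unknown id is not documented here.

```javascript
// Hypothetical sketch of the resolution order. Field names are assumptions
// about how stored records might look; headers are assumed to be lowercased.
function resolveProviderKey({ body, headers, apiKey, providerKeys, provider }) {
  const byId = (id) => providerKeys.find((k) => k.id === id && k.enabled);
  const candidates = providerKeys.filter((k) => k.provider === provider && k.enabled);
  return (
    byId(body?.metadata?.provider_key_id) ??   // 1. request body
    byId(headers["x-ventrin-provider-key"]) ?? // 2. header
    byId(apiKey.boundProviderKeyId) ??         // 3. key bound to the API key
    candidates.find((k) => k.isDefault) ??     // 4. default for the provider
    candidates[0] ??                           // 5. any enabled key
    null
  );
}

const providerKeys = [
  { id: "pk_a", provider: "openai", enabled: true, isDefault: true },
  { id: "pk_b", provider: "openai", enabled: true, isDefault: false },
];
const picked = resolveProviderKey({
  body: { metadata: {} },
  headers: { "x-ventrin-provider-key": "pk_b" },
  apiKey: { boundProviderKeyId: "pk_a" },
  providerKeys,
  provider: "openai",
});
console.log(picked.id); // pk_b (the header outranks the bound key)
```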

Decision statuses

allowed: Forwarded unchanged. Response mirrors the provider's native body.
warn: Forwarded unchanged; decision flagged. In strict mode this is escalated to blocked.
sanitised: Rewritten before forwarding. Response includes sanitised_prompt and transformations.
blocked: Not forwarded. Response includes message and, when safe, a suggested_safe_version.
error: Non-2xx HTTP. error.code is machine-readable.

Streaming

Set "stream": true in the request. The response becomes Server-Sent Events. The first event is ventrin.decision; subsequent events are the provider's native SSE frames, forwarded unchanged.

If the provider stream is idle for 30 seconds, Ventrin closes the connection with a ventrin.error event and logs the reason.
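
A client consuming the stream might separate the decision from the provider frames along these lines. The sketch parses an already-buffered payload and assumes the decision arrives as a frame with "event: ventrin.decision" and a JSON data line carrying a status field; the exact framing and payload fields are not specified above, so treat the sample payload as invented.

```javascript
// Hypothetical sketch: split a buffered SSE payload into events.
function parseSse(text) {
  return text
    .split(/\r?\n\r?\n/)
    .filter((frame) => frame.trim() !== "")
    .map((frame) => {
      let event = "message"; // SSE default when no event field is present
      const dataLines = [];
      for (const line of frame.split(/\r?\n/)) {
        if (line.startsWith("event:")) event = line.slice(6).trim();
        else if (line.startsWith("data:")) dataLines.push(line.slice(5).replace(/^ /, ""));
      }
      return { event, data: dataLines.join("\n") };
    });
}

// Invented sample payload for illustration.
const sample =
  'event: ventrin.decision\ndata: {"status":"allowed"}\n\n' +
  'data: {"choices":[{"delta":{"content":"Hi"}}]}\n\n' +
  "data: [DONE]\n\n";

const [decision, ...providerFrames] = parseSse(sample);
console.log(decision.event, JSON.parse(decision.data).status); // ventrin.decision allowed
console.log(providerFrames.length); // 2
```

Reading the decision first lets a client stop early on a blocked request before it starts rendering provider output.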

Rate limits & request shape

Per-key RPM (defaults to workspace defaultRpmLimit)
Per-workspace RPM
Per-IP cap (120 / min; failed-auth attempts trigger a temporary IP ban)
Max request body: 256 KB
Max messages.length: 50
Max latest-user-message length: 60 000 chars
Max per-request duration: 540 s
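
Callers may want to check these shape limits before sending. The following pre-flight sketch mirrors the documented numbers, with some assumptions: 256 KB is taken as 256 × 1024 bytes of serialised JSON, message content is assumed to be a plain string, and checkRequestShape is a hypothetical helper rather than an SDK function.

```javascript
// Hypothetical pre-flight check mirroring the documented limits.
const LIMITS = { maxBodyBytes: 256 * 1024, maxMessages: 50, maxLatestUserChars: 60000 };

function checkRequestShape(body) {
  const problems = [];
  const bytes = new TextEncoder().encode(JSON.stringify(body)).length;
  if (bytes > LIMITS.maxBodyBytes) problems.push("body exceeds 256 KB");
  if (body.messages.length > LIMITS.maxMessages) problems.push("more than 50 messages");
  // Find the most recent message with role "user".
  const latestUser = [...body.messages].reverse().find((m) => m.role === "user");
  if (latestUser && latestUser.content.length > LIMITS.maxLatestUserChars) {
    problems.push("latest user message exceeds 60 000 chars");
  }
  return problems;
}

const tooMany = {
  model: "gpt-4o-mini",
  messages: Array.from({ length: 51 }, () => ({ role: "user", content: "hi" })),
};
console.log(checkRequestShape(tooMany)); // [ 'more than 50 messages' ]
```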

Node SDK

A typed client is published as @ventrin/sdk-node. It wraps fetch with retry on 5xx, a non-streaming timeout, and a discriminated GatewayResponse union so TypeScript forces you to handle every decision status.

javascript

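// Top-level await requires an ES module (for example a .mjs file).
import { Ventrin } from "@ventrin/sdk-node";

// Read the Ventrin key (vtk_…) from the environment rather than hard-coding it.
const client = new Ventrin({ apiKey: process.env.VENTRIN_KEY });

const r = await client.chatCompletions({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Summarise arbitration." }],
  metadata: { service: "assistant" },
});

// showUser stands for your application's own display function; it is not part of the SDK.
// Branch on status so every decision is handled explicitly.
if (r.status === "blocked") showUser(r.message, { retryWith: r.suggested_safe_version });
else if (r.status === "sanitised") showUser(r.provider_response, { rewritten: r.sanitised_prompt });
else if (r.status === "error") throw new Error(r.error.message);
else /* allowed | warn */ showUser(r.provider_response);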

Encryption at rest

Prompts (when full-body logging is on) and provider credentials are encrypted with AES-256-GCM before writing to Firestore. The key is held server-side in the Cloud Function environment and never returned to the client. Moving to per-workspace data-encryption keys backed by Google Secret Manager is on the roadmap.

Audit & reveal

Every decrypt of a stored prompt writes an audit row (promptRevealLogs or gatewayRevealLogs) containing the admin user id, the event/log id, a mandatory reason (3 chars minimum), and the timestamp.

Revealing requires step-up auth — the session must have re-authenticated within the last 5 minutes. An older session returns requires_step_up; sign out and back in to continue.
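
The two preconditions for a reveal (a reason of at least 3 characters and a sign-in no older than 5 minutes) can be sketched as a single check. The function name checkReveal, its argument shape, the trimming of the reason, and the reason_too_short error code are assumptions; only requires_step_up appears in this documentation.

```javascript
// Hypothetical sketch of the reveal pre-checks; names and shapes are assumptions.
const STEP_UP_WINDOW_MS = 5 * 60 * 1000;

function checkReveal({ lastAuthAtMs, reason }, nowMs = Date.now()) {
  if (typeof reason !== "string" || reason.trim().length < 3) {
    return { ok: false, error: "reason_too_short" }; // invented error code
  }
  if (nowMs - lastAuthAtMs > STEP_UP_WINDOW_MS) {
    return { ok: false, error: "requires_step_up" };
  }
  return { ok: true };
}

const now = Date.now();
console.log(checkReveal({ lastAuthAtMs: now - 60000, reason: "DSAR review" }, now));
// { ok: true }
console.log(checkReveal({ lastAuthAtMs: now - 600000, reason: "DSAR review" }, now));
// { ok: false, error: 'requires_step_up' }
```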

GDPR erasure

Two admin callables handle right-to-erasure requests:

purgeEventsForUser({ userId, reason }) — deletes every events row in this workspace for the given user id.
purgeGatewayLogsForUser({ service, requestIdPrefix, reason }) — deletes gateway logs matching a service tag and/or a requestId prefix.

Both require workspace-admin, step-up auth, and a non-empty reason. Deletions are batched (400 per commit). Each call returns the total count plus a sample of deleted document ids for the audit record.
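
The batching behaviour can be illustrated without touching a database. This sketch only plans the batches; planPurge is a hypothetical helper, and the sample size of 5 is an assumption, since the size of the returned sample is not documented.

```javascript
// Hypothetical sketch: split document ids into commits of at most 400
// and prepare the summary returned for the audit record.
function planPurge(docIds, batchSize = 400, sampleSize = 5) {
  const batches = [];
  for (let i = 0; i < docIds.length; i += batchSize) {
    batches.push(docIds.slice(i, i + batchSize));
  }
  return { batches, total: docIds.length, sample: docIds.slice(0, sampleSize) };
}

const ids = Array.from({ length: 950 }, (_, i) => `evt_${i}`);
const plan = planPurge(ids);
console.log(plan.batches.map((b) => b.length)); // [ 400, 400, 150 ]
console.log(plan.total);                        // 950
```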

What we cannot guarantee

Policy enforcement against prompt-injection — we can narrow, not eliminate.
Zero reach of content into provider-side logs when a rewrite fails to fully scrub.
Perfect detection across every language — English is the strongest today.
Policy outcomes remaining stable across pack version upgrades.

See the full residual-risk list in our most recent security audit.

Rotating keys

Ventrin API keys: click Rotate on the row. The new plaintext is shown once; the old key keeps working for 24 hours. Push a deploy with the new key and confirm traffic has shifted; the old key stops working automatically when the 24 hours elapse.

Provider keys: add the new provider key, mark it default, then revoke the old one after traffic has moved over (use the Logs tab to verify the new key is being resolved).
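
The 24-hour overlap for rotated Ventrin keys amounts to a simple time check. The key record fields (status, rotatedAtMs) and the function name isKeyAccepted below are assumptions made for illustration.

```javascript
// Hypothetical sketch; the key record fields (status, rotatedAtMs) are assumptions.
const ROTATION_GRACE_MS = 24 * 60 * 60 * 1000;

function isKeyAccepted(key, nowMs = Date.now()) {
  if (key.status === "revoked") return false; // revocation is immediate
  if (key.status === "rotated") return nowMs - key.rotatedAtMs < ROTATION_GRACE_MS;
  return key.status === "active";
}

const oldKey = { status: "rotated", rotatedAtMs: 0 };
console.log(isKeyAccepted(oldKey, 23 * 3600 * 1000)); // true
console.log(isKeyAccepted(oldKey, 25 * 3600 * 1000)); // false
```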

Reading logs

Events — human events from the extension. Filter by decision class and tool.

API Access → Logs — machine traffic from the gateway. Filter by decision, provider, and API key. Timing breakdown per row (policyMs, rewriteMs, providerMs, totalMs) makes it obvious when a customer's prompt is slow because of the upstream provider vs. our pipeline.
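
One way to use the timing breakdown is to attribute latency to the pipeline or the provider. The sketch assumes the four fields are in milliseconds and that the stages do not overlap; splitLatency is a hypothetical helper and the sample row is invented.

```javascript
// Hypothetical sketch: attribute a request's latency to pipeline vs provider.
function splitLatency({ policyMs, rewriteMs, providerMs, totalMs }) {
  const pipelineMs = policyMs + rewriteMs;
  return {
    pipelineMs,
    providerMs,
    otherMs: totalMs - pipelineMs - providerMs, // assumes stages do not overlap
    dominant: providerMs >= pipelineMs ? "provider" : "pipeline",
  };
}

console.log(splitLatency({ policyMs: 12, rewriteMs: 30, providerMs: 1800, totalMs: 1850 }));
// { pipelineMs: 42, providerMs: 1800, otherMs: 8, dominant: 'provider' }
```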

Alerts

Cloud Logging emits a structured ventrin: "reveal" record every time an admin reveals a stored prompt. Wire a log-based alert from this record to an email / Slack / PagerDuty channel — see MONITORING_RUNBOOK.md.

Rate-anomaly and billing-anomaly alerts are planned but not shipped.

Glossary

Workspace: The tenant boundary. One workspace owns its users, policies, events, keys, and provider credentials. Workspaces are fully isolated.
Policy pack: A versioned bundle of patterns, terms, boosters, hard-blocks, and thresholds. A workspace composes multiple packs.
Hard block: A pattern (credentials, private keys) that forces an immediate block regardless of score. The request never leaves Ventrin's perimeter.
Rewrite: A deterministic anonymisation of the prompt that preserves intent while removing identifiers. Optionally LLM-polished.
Intervention panel: The UI the browser extension shows when a decision is warn/rewrite/block.
Step-up auth: The requirement that sensitive actions (reveal, GDPR erasure) run within 5 minutes of a fresh sign-in.
Sanitise threshold: The score at or above which the engine attempts to rewrite the prompt before forwarding.
Strict mode: A gateway setting that promotes every warn outcome to blocked.
Gateway: The public API endpoint at POST /v1/chat/completions.
Provider key: Your OpenAI / Anthropic / Gemini credential, stored encrypted.
Ventrin key: A vtk_… token that authenticates a caller to the Ventrin gateway.

FAQ

Do you train on our prompts?: No. Ventrin never trains on user content. The rewriter's optional LLM call goes to Gemini Flash with your API key; the request is not used to train models per Google's terms for that endpoint.
Can I use my own OpenAI key?: Yes, that's the standard setup. You store your provider key under API Access → Providers; Ventrin never uses its own credentials to call providers on your behalf.
What browsers are supported?: Any Chromium browser (Chrome, Edge, Brave, Arc). Firefox is on the roadmap; Safari requires a separate App Store submission and is not prioritised for early access.
Does Ventrin see the text of my prompts?: Briefly, yes — every prompt is scanned server-side. With default logging, only a masked preview is persisted. Full-body logging is opt-in per workspace and always encrypted at rest.
How does Ventrin behave when it's down?: The gateway returns an error. The browser extension respects your workspace's fail-closed setting: either block the prompt (strict) or allow it with a warning toast (default).
What happens on the free trial?: You get the full feature set with a modest RPM limit and a capped seat count. No credit card is required until you move to a paid plan.

Support

Email hello@ventrin.app for product questions and security@ventrin.app for responsible disclosure. When reporting an issue, include your workspace id and (if relevant) the request_id from the affected event.

Our responsible-disclosure policy is available at security.txt.