For teams shipping AI features on regulated data

PII Guardrails for LLM Apps: Without Sending Prompts to a Third-Party API

Your security team won't let you send customer data to OpenAI. Your roadmap says “chatbot in Q3.” Philter AI Proxy closes the gap - a drop-in middleware that strips PII out of prompts before they reach the LLM provider, then puts it back on the way out. Runs in your VPC. Released under the permissive and business-friendly Apache license.

Drop-in proxy for OpenAI, Anthropic, Gemini, Ollama - point your SDK at it, nothing else changes
PII stripped before prompts leave your perimeter
RAG ingestion redaction - vector stores can't leak what was never written
Training-data prep - aggressive redaction for fine-tuning corpora

Permissive Apache license Runs in your VPC No third-party API HIPAA / GDPR / CCPA ready Redacting PII and PHI since 2017

Built for the AI privacy patterns showing up in production

Chatbots on regulated data
RAG over customer documents
Internal AI assistants
Training-corpus prep
Agent tool-call pipelines
Voice-AI transcripts

Try it live

Try it out! Select one of the industries and click Redact to redact the text.

Input

Patient Margaret Collins, born on 04/12/1978, with SSN 523-88-4021 was admitted to the ER at St. Luke's Medical Center. Her primary care physician, Dr. Howard Banks, can be reached at hbanks@stlukesmed.org or (555) 342-9187.

Redacted output

The redacted text appears here after you click Redact.

Do not enter PHI or PII.

Why AI teams pick Philter AI Proxy

Three failure modes, one engine

Prompts to hosted LLMs, ingestion into vector stores, training corpora - same PII problem in three shapes. Philter handles all three with one policy surface and one audit trail.

Drop-in, not rewrite

Point your existing OpenAI/Anthropic/Gemini SDK at the proxy URL. The rest of your app doesn't change. How it works.

Reversible when you need it

Redact with an encryption strategy on the way out so the model reasons over encrypted stand-ins, never the real identities. When the answer has to show real values, your app restores them through Philter's governed, audited re-identify API. The provider never receives the originals.

Defends against embedding inversion

Embedding inversion attacks can reconstruct text from vectors. The fix is to redact before embedding - not after. Philter does that as a first-class ingestion step.

Self-hosted, not someone else's privacy SaaS

The proxy runs in your VPC alongside the rest of your AI stack - not in a third-party tenant. Prompts and responses never leave your perimeter on their way to or from the LLM provider.

Provider-agnostic

Works with OpenAI, Anthropic, Gemini, and self-hosted LLMs via Ollama. Swap providers without re-doing the PII layer; the policy travels with the proxy, not the model.

What we’ll cover on the call

Where the PII actually flows: prompts, retrieval context, tool calls, logs, vector stores, training data - we’ll map your real surface, not a generic diagram.
The hosted-LLM BAA / DPA chain: which providers have which agreements, what they cover, what they don’t, and where the residual risk sits.
Proxy vs. inline integration: which fits your existing AI architecture cleanest, and the trade-offs.
Reversible-vs-irreversible redaction: when to use which, and how to keep the user-facing output natural without leaking on the LLM side.
A concrete next step: one of three concrete patterns to ship, in priority order.

No code review or sample data needed for the conversation - just an architecture sketch and an honest take on your timeline.

Make security review the easy part of your AI launch

30 minutes with Jeff - bring your AI architecture, leave with a concrete privacy plan that survives security review. Whether or not Philter is the right answer for your stack. Detection is probabilistic, so measure precision and recall on your own data before production; because Philter runs in your VPC, you remain the data controller for the output.

Or deploy Philter yourself →