Drop-in for major providers
Speaks the OpenAI, Anthropic, and Amazon Bedrock wire protocols, plus any OpenAI-compatible provider such as Mistral, Cohere, vLLM, or LM Studio. Your existing SDK code stays as it is; the only thing you change is the base URL.
Redact PII before it reaches the LLM
Philter AI Proxy is an airlock for your large language model (LLM) traffic: prompts pass through to the model, but sensitive data never crosses your network boundary. It sits between your application and your LLM providers (OpenAI, Anthropic Claude, Amazon Bedrock, and OpenAI-compatible providers such as Mistral, Cohere, and vLLM) for both generative AI and RAG workloads. Prompts get redacted before they leave your network; responses get scanned on the way back.
Strips PII and PHI from prompts before they're forwarded to the model. Names, SSNs, MRNs, account numbers: all replaced according to your policy.
API key authentication and mutual TLS (mTLS) are both supported, independently or together. Per-key rate limiting and per-key policy overrides let you grant different access levels to different clients without running separate proxy instances.
Every redaction is logged with timestamp, entity type, and direction (in/out). The exact paper trail HIPAA and GDPR auditors expect for AI workloads.
Uses the same Phileas policies as the rest of the Philterd toolkit. Define once; apply across redaction, discovery, monitoring, and now AI traffic.
Runs inside your perimeter. The proxy is the last hop before LLM-bound traffic leaves your network, and the first hop on the way back: the airlock that sensitive data cannot cross.
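As a rough illustration of the base-URL switch described in the drop-in item: recent versions of the official OpenAI and Anthropic SDKs read their base URL from an environment variable when none is set in code, so pointing an application at the proxy can be a deployment-time change. The sketch below assumes the proxy is reachable at http://localhost:8080 (the port published in the Docker quick start) and mirrors each provider's upstream paths. Both are assumptions, and PROXY_URL is a hypothetical variable name, so check the installation guide for the actual routes.

```shell
# Assumption: the proxy listens here; override PROXY_URL for your deployment.
PROXY_URL="${PROXY_URL:-http://localhost:8080}"

# Read by the official OpenAI SDKs when no base URL is passed in code.
# Assumption: the proxy serves OpenAI-style routes under /v1, as api.openai.com does.
export OPENAI_BASE_URL="${PROXY_URL}/v1"

# Read by the official Anthropic SDKs; the SDK appends its own /v1/... paths.
export ANTHROPIC_BASE_URL="${PROXY_URL}"
```

Applications that build their client with an explicit base URL would change that one argument instead.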
Production-ready from day one. The proxy exposes the signals your platform team needs to operate it with confidence.
A /metrics endpoint exposes request counts, redaction latency, token usage (prompt and completion), and error rates, all labeled by provider and model. Drop it into your existing Grafana stack without custom instrumentation.
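Assuming /metrics serves the Prometheus text exposition format (the page does not name the format, though it is the usual choice for Grafana stacks), a quick shell check can total one metric across all of its provider and model labels. This is a minimal sketch: sum_metric is a hypothetical helper, and it assumes samples carry no trailing timestamp.

```shell
# Sum every sample of one metric across all label sets.
# Usage: sum_metric METRIC_NAME < metrics.txt
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)           # drop the {label="..."} part, if any
      if (metric == name) total += $NF   # last field is the sample value
    }
    END { print total + 0 }              # prints 0 when the metric is absent
  '
}
```

Against a running proxy this might be used as `curl -s http://localhost:8080/metrics | sum_metric proxy_requests_total`, where proxy_requests_total is a made-up metric name; substitute whatever names the endpoint actually reports.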
Every request is written as a JSONL record: entity types redacted, direction (inbound or outbound), model, policy, document ID, latency, client IP, and HTTP status. The exact audit trail HIPAA and SOC 2 reviewers expect for AI workloads.
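Because each request is one JSON object per line, ordinary line-oriented tools are enough for quick spot checks of the audit log. The sketch below counts records by direction. It assumes the key is spelled direction and the values are inbound and outbound, which follows the description above but is not confirmed by it; count_direction and audit.jsonl are hypothetical names. For anything beyond counting, a real JSON parser is the safer choice.

```shell
# Count JSONL records whose "direction" field has the given value.
# Usage: count_direction inbound < audit.jsonl
count_direction() {
  # grep -c exits non-zero when the count is 0; "|| true" keeps that from
  # aborting scripts that run under "set -e". The count is still printed.
  grep -Ec "\"direction\"[[:space:]]*:[[:space:]]*\"$1\"" || true
}
```

For example, `count_direction outbound < audit.jsonl` would report how many logged records concern responses on the way back.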
/health checks Philter backend reachability and returns structured JSON. Wire it into your load balancer, Kubernetes liveness probe, or uptime monitor to catch backend connectivity issues before they affect clients.
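For deployments without a load balancer or Kubernetes probe, a startup script can poll /health before sending traffic. This is a minimal sketch, assuming the endpoint answers with a non-2xx status when the Philter backend is unreachable. The page only says it returns structured JSON, so verify that behavior, or inspect the response body instead, before relying on it. wait_for_health is a hypothetical helper.

```shell
# Poll a health URL until it answers with a 2xx status or attempts run out.
# Usage: wait_for_health URL [ATTEMPTS]   (default: 30 attempts, 1s apart)
wait_for_health() {
  url="$1"
  attempts="${2:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (status 400 and above)
    if curl -fsS --max-time 2 -o /dev/null "$url"; then
      return 0
    fi
    # sleep only between attempts, not after the last one
    [ "$i" -lt "$attempts" ] && sleep 1
    i=$((i + 1))
  done
  return 1
}
```

A deploy script might then run `wait_for_health http://localhost:8080/health 10 || exit 1` right after starting the container.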
The proxy ships as a multi-arch image on Docker Hub at philterd/philter-ai-proxy. Pull it, point a config file at your Philter instance and chosen provider, and run.
# 1. Pull the image from Docker Hub
docker pull philterd/philter-ai-proxy
# 2. Grab a starting config and edit it for your Philter and provider
curl -O https://raw.githubusercontent.com/philterd/philter-ai-proxy/main/config.example.yaml
mv config.example.yaml config.yaml
# 3. Run it: expose port 8080 and mount your config
docker run -p 8080:8080 \
  -v "$(pwd)/config.yaml:/app/config.yaml:ro" \
  -e PHILTER_PROXY_CONFIG=/app/config.yaml \
  philterd/philter-ai-proxy
Prefer Compose? The repo ships a docker-compose.yaml: clone it and run docker compose up. Every configuration option is documented in the installation guide.
If something here isn’t covered, get in touch and we’ll answer.
Three ways to get going: deploy the open source yourself, spin it up from a cloud marketplace, or work with our team directly. Pick the path that fits.