Architecture

How the Philterd Toolkit Fits Together

This page maps how our open source redaction software fits into a complete privacy stack.


Component overview

Three primary data paths, one shared policy layer, and five cross-cutting operational tools.

Your Application & Data Sources: application text · logs · documents · LLM prompts · event streams

LLM Traffic

Philter AI Proxy

Drop-in proxy for OpenAI, Anthropic, Amazon Bedrock, and any OpenAI-compatible provider. PII is redacted before prompts leave your network.

Core Redaction Engine

Philter & Phileas

Philter is the self-hosted HTTP API; Phileas is the embeddable library (Java, Python, .NET). Both are powered by PhEye NLP models and driven by Phileas policies.

Discovery at Rest

Phinder

Crawls local filesystems. Maps where sensitive data lives across your infrastructure before it becomes a compliance finding.

Cross-cutting: operational tools used at every stage

Policy Layer

Shared Phileas policy format. One definition applied by all products.

Philter Scope

Precision, recall, and F1 measurement. Fail the build when accuracy regresses.

Phield

Production PII flow monitoring and anomaly alerting.

Arbiter

Human review, structured exemptions, and audit trail.

Philter Diffuse

Differential privacy for safe aggregate analytics.

The shared policy layer

The single most important architectural property of the toolkit: every product that performs or tests redaction consumes the same Phileas policy format. You define your redaction rules once, version them like code, and they apply everywhere.

Policy Editor: Author policies visually
Phileas Policy JSON: Version-controlled config file
Philter / Phileas: Executes redaction at runtime
Philter Scope: Tests the policy in CI/CD
Philter AI Proxy: Applies policy to LLM traffic

Pre-built policies for HIPAA Safe Harbor, PCI DSS, GLBA, clinical notes, and more are available in the Redaction Policy Library.
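As a rough illustration of "define once, version like code", the sketch below writes a small policy to a JSON file and has two consumers load that same file. The policy name and the field names (`identifiers`, `emailAddressFilterStrategies`, `redactionFormat`, and so on) are assumptions made for illustration and should be checked against the actual Phileas policy schema.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical policy. The field names are assumptions for illustration;
# validate them against the real Phileas policy schema before use.
policy = {
    "name": "support-logs",
    "identifiers": {
        "emailAddress": {
            "emailAddressFilterStrategies": [
                {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
            ]
        },
        "ssn": {
            "ssnFilterStrategies": [
                {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
            ]
        },
    },
}


def write_policy(directory: Path, policy: dict) -> Path:
    """Serialize with sorted keys so version-control diffs stay small."""
    path = directory / f"{policy['name']}.json"
    path.write_text(json.dumps(policy, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return path


def load_policy(path: Path) -> dict:
    """Every consumer (runtime, CI, proxy) reads the same file."""
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    policy_path = write_policy(Path(tmp), policy)
    runtime_view = load_policy(policy_path)  # what the redaction engine would load
    ci_view = load_policy(policy_path)       # what an accuracy test would load
    same_rules = runtime_view == ci_view == policy
```

Because every consumer reads one file, a reviewed change to that file is the only way the rules can change.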

A continuous privacy lifecycle

The toolkit is a loop, not a one-shot pipeline. Discovery feeds redaction, redaction feeds review and monitoring, and what you learn feeds the next scan. One shared Phileas policy links every stage.

Every stage runs inside your own perimeter, driven by one shared policy.

Deployment topologies

Not every team needs every product. These three topologies cover the most common starting configurations; teams typically expand from Minimal toward Full Suite as the work matures.

Minimal

New to redaction or a single focused use case

Philter (HTTP API) or Phileas (embedded library)
PhEye NLP models (bundled with Philter)
Redaction Policy Editor

Deploys in under 5 minutes from the AWS, GCP, or Azure Marketplace. Suitable for log redaction, document pipelines, and single-system use cases.
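In a Minimal deployment, an application might send text to Philter over HTTP along the following lines. This is a sketch only: the base URL, the `/api/filter` path, the `p` query parameter for the policy name, and the plain-text request body are assumptions, so confirm them against the API documentation for your Philter version. The request is built but not sent, so the snippet runs without a Philter instance.

```python
import urllib.parse
import urllib.request


def build_filter_request(base_url: str, text: str, policy_name: str) -> urllib.request.Request:
    """Build (but do not send) a redaction request.

    The endpoint path and the 'p' parameter are assumptions for illustration.
    """
    query = urllib.parse.urlencode({"p": policy_name})
    return urllib.request.Request(
        url=f"{base_url}/api/filter?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_filter_request(
    "http://localhost:8080",           # hypothetical self-hosted Philter address
    "Contact jane.doe@example.com",    # text to redact
    "default",                         # hypothetical policy name
)

# To send it against a running instance:
#   with urllib.request.urlopen(request) as response:
#       redacted = response.read().decode("utf-8")
```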

Standard

Production deployment with AI workload coverage

Everything in Minimal
Philter AI Proxy for LLM prompt and response redaction
Philter Scope for CI/CD policy regression testing
Phield for production PII flow monitoring

Covers the two highest-priority concerns for most production teams: AI data egress and detection-accuracy regression.
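To make "detection-accuracy regression" concrete, here is a minimal sketch of the kind of check a CI job could run: score detected spans against a gold standard and return a non-zero exit code when F1 drops below a stored baseline. It is not Philter Scope's implementation; the span format, baseline, and tolerance are hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity type); hypothetical format


def score(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over labeled spans."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def gate(f1: float, baseline_f1: float, tolerance: float = 0.01) -> int:
    """Return a process exit code: 0 to pass, 1 to fail the build."""
    return 0 if f1 >= baseline_f1 - tolerance else 1


gold = {(8, 24, "email"), (40, 51, "ssn"), (60, 72, "phone")}
predicted = {(8, 24, "email"), (40, 51, "ssn"), (80, 90, "phone")}

precision, recall, f1 = score(predicted, gold)
exit_code = gate(f1, baseline_f1=0.90)

# In a CI script, finish with: sys.exit(exit_code)
```

Here two of the three predicted spans match the gold standard, so F1 falls well below the 0.90 baseline and the gate returns 1.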

Full Suite

Enterprise or heavily regulated deployment

Everything in Standard
Phinder for sensitive data discovery at rest
Arbiter for human-in-the-loop review and attestation
Philter Diffuse for differentially private aggregate analytics

Suitable for HIPAA, FedRAMP, and regulated-AI workloads where human attestation, discovery inventory, and provable privacy bounds are required.
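For readers unfamiliar with what "provable privacy bounds" means in practice, the sketch below applies the standard Laplace mechanism to a single count. It illustrates the general technique only and is not a description of how Philter Diffuse works; it assumes each individual changes the count by at most 1 (sensitivity 1), and the function names are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy.

    Assumes sensitivity 1: adding or removing one person changes the
    count by at most 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(42)  # seeded only to make the example repeatable
noisy = private_count(true_count=128, epsilon=0.5, rng=rng)
```

A smaller epsilon gives a stronger privacy guarantee and noisier counts. A production system would also track the cumulative privacy budget across queries and use a cryptographically secure noise source.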

Not sure which topology fits your team? Walk the product journey to find your starting point.

Data flow across the toolkit

Where the component view above shows the boxes, this shows the arrows: how data actually moves between the products. A request enters through one of three front doors (the AI Proxy for LLM traffic, the MCP server for AI agents, or an SDK), the core engine performs detection and redaction under a shared policy, and the review and monitoring tools observe the result. Every hop stays inside your own perimeter.

Data-flow diagram of the Philterd toolkit. Your applications, pipelines, and AI agents send data through one of three front doors: the Philter AI Proxy, which redacts prompts before they reach LLM providers (OpenAI, Anthropic, Bedrock, Gemini, Ollama) and scans the responses on the way back; the Philter MCP server; or the SDKs and connectors. All three call Philter, the self-hosted redaction API. Philter uses the Phileas embeddable library, which calls the PhEye NLP model server, which loads the PII lenses. Policies authored in the Redaction Policy Editor or PhiSQL configure Philter. Phinder scans local files through Phileas to find PII at rest. Philter feeds the review and monitoring tools: Arbiter for human review, Philter Scope for accuracy measurement against a gold standard, and Phield for production monitoring, which raises PagerDuty and Slack alerts and passes aggregate counts to Philter Diffuse for differential privacy.

Three entry paths, one shared redaction engine and policy format, and the operational tools that observe what flows through. All self-hosted.
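The proxy path in the flow above can be sketched in a few lines: redact the prompt, forward only the redacted text, then scan the response on the way back. The regular expressions here are a deliberately simple stand-in for the real detection engine, and every name in the snippet (`redact`, `proxied_completion`, `fake_provider`) is hypothetical.

```python
import re
from typing import Callable, List

# Simple stand-in detectors; a real deployment would call the redaction
# engine under a shared policy instead of using these patterns.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    """Replace detected values with typed placeholders."""
    text = EMAIL.sub("{{{REDACTED-email}}}", text)
    return SSN.sub("{{{REDACTED-ssn}}}", text)


def proxied_completion(prompt: str, provider: Callable[[str], str], observed: List[str]) -> str:
    """Redact outbound, forward, then scan the inbound response."""
    safe_prompt = redact(prompt)
    observed.append(safe_prompt)      # what monitoring would see: redacted text only
    response = provider(safe_prompt)  # the provider never receives the raw prompt
    return redact(response)


def fake_provider(prompt: str) -> str:
    """Stand-in for an LLM provider; echoes what it received."""
    return f"echo: {prompt}"


sent: List[str] = []
answer = proxied_completion(
    "Email jane.doe@example.com about SSN 123-45-6789", fake_provider, sent
)
```

The point of the shape is that redaction happens before the network boundary, so neither the provider nor the monitoring record ever holds the original values.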

All components at a glance

Every product in the toolkit, its role, and where it fits in the architecture.

Product | Role | Layer
Philter | Self-hosted PII redaction HTTP API | Core
Phileas | Embeddable redaction library (Java, Python, .NET) | Core
PhEye | NLP models and model server powering Philter and Phileas | Core
Philter AI Proxy | Drop-in proxy redacting PII from LLM prompts and responses | LLM
Phinder | Discovery scanner for PII at rest across local files | Discovery
Policy Editor | Visual no-code builder for Phileas policy files | Cross-cutting
Philter Scope | Precision, recall, and F1 benchmarking for policies in CI/CD | Cross-cutting
Phield | Production PII flow monitoring with PagerDuty and Slack alerting | Cross-cutting
Arbiter | Human review UI with structured exemption codes and audit trail | Cross-cutting
Philter Diffuse | Differential privacy for aggregate PII analytics | Cross-cutting
Search Redact | Query-time redaction of OpenSearch and Elasticsearch results | Integration

Not sure where to start?

Most teams start with Philter or Phileas and add from there. If you want a guided read, the product journey walks through each stage and when to adopt it.
