Philter 3.1.0

Philter 3.1.0 is now available on all three major cloud marketplaces.

What’s new in 3.1.0

Philter 3.1.0 is built on Phileas 2.12.0, which brings:

Filter priorities. Each filter can now have its own priority, which acts as a tie-breaker when two filters identify the same text. For example, if you’re using the phone-number filter and an ID filter for 10-digit numbers, both may flag the same span of text. The filter priority decides which label wins.
Zip code validation. The zip-code filter can now optionally validate zip codes against an internal database. When enabled, a string that looks like a zip code but doesn’t actually exist won’t be redacted, reducing false positives on otherwise-numeric data.
Per-filter context window sizes. The window size is roughly the number of words surrounding PII that the engine uses for contextual disambiguation. Previously every filter shared one window size; now each filter can set its own. Tighten the window where you want strict matching; widen it where surrounding context matters.
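
To make the first two features concrete, here is a minimal Python sketch of priority-based tie-breaking and zip code validation. It is an illustration only, not Phileas code: the `Span` type, `resolve_overlaps`, `is_real_zip`, and the tiny `KNOWN_ZIP_CODES` set are hypothetical names, and the rule that a higher number wins is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int     # inclusive character offset
    end: int       # exclusive character offset
    label: str     # name of the filter that produced the span
    priority: int  # assumption: a higher number wins a tie


def resolve_overlaps(spans):
    """Keep the highest-priority span wherever spans overlap."""
    kept = []
    # Visit spans from highest to lowest priority and keep a span
    # only if it does not overlap one that was already kept.
    for span in sorted(spans, key=lambda s: (-s.priority, s.start)):
        if all(span.end <= k.start or span.start >= k.end for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s.start)


# Hypothetical stand-in for the internal zip code database.
KNOWN_ZIP_CODES = {"90210", "10001"}


def is_real_zip(candidate, validate=True):
    """With validation on, only zip codes found in the database count as PII."""
    if not validate:
        return True
    return candidate in KNOWN_ZIP_CODES


# "Call 5551234567 today": both filters flag characters 5 to 15.
spans = [
    Span(5, 15, "phone-number", priority=10),
    Span(5, 15, "id", priority=5),
]
winner = resolve_overlaps(spans)[0]  # the phone-number span
```

In this sketch the phone-number label wins because its priority is higher, and a string such as `00000` is left alone when validation is enabled because it is not in the lookup set.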

What Philter is (in case you’re new here)

Philter is open source software that redacts PII and PHI from text and PDF documents. It runs entirely inside your cloud. Your data never leaves your perimeter, never reaches a third-party API, and never lands in someone else’s logs. A REST API takes text in and returns redacted text out:

$ curl http://localhost:8080/api/filter \
    --data "His SSN was 123-45-6789." \
    -H "Content-type: text/plain"

His SSN was ***********.
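
The same call can be made from any language with an HTTP client. As a minimal sketch, here it is in Python using only the standard library. The endpoint and content type come from the curl example above; the function names and the timeout are illustrative assumptions.

```python
import urllib.request


def build_filter_request(text, base_url="http://localhost:8080"):
    """Build the POST request that the curl example sends."""
    return urllib.request.Request(
        url=f"{base_url}/api/filter",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def redact(text, base_url="http://localhost:8080", timeout=10):
    """Send text to a running Philter instance and return the response body."""
    request = build_filter_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# With Philter listening on localhost:8080:
# print(redact("His SSN was 123-45-6789."))
```

Splitting request construction from sending keeps the network call in one place, which makes the client easy to test without a running instance.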

Under the hood, Philter is a managed wrapper around Phileas, the open source library that handles detection and redaction. Philter adds the HTTP API, the NLP models, a Java SDK (any other language calls the API via the OpenAPI spec), and the marketplace-ready packaging.

Why use Philter

Three things distinguish it from cloud SaaS PII APIs and from hand-rolled regex:

Self-hosted. Runs inside your VPC, on-prem, or air-gapped. No outbound network calls in steady state. No data leaves your perimeter.
Hybrid detection. Combines deterministic pattern matching (SSNs, credit cards, phone numbers) with contextual NLP (names, locations, organizations). Pattern matching catches what NLP misses; NLP catches what patterns can’t.
Per-entity policy. Each entity type has its own handling strategy: mask, encrypt (including format-preserving encryption), replace with synthetic values, drop, or hash. Configured via a JSON policy file, not code.
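
To illustrate the per-entity idea, here is a minimal Python sketch that maps an entity type to a handling strategy and applies it to each match. This is not Philter's policy schema or its detection logic: the `POLICY` dictionary, the strategy names as strings, and the simplistic SSN regex are assumptions made for the example, and encryption and synthetic replacement are left out to keep it short.

```python
import hashlib
import re

# Simplistic stand-in for a deterministic pattern filter.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical policy: entity type -> handling strategy.
POLICY = {"ssn": "mask"}


def apply_strategy(value, strategy):
    """Transform one detected value according to its strategy."""
    if strategy == "mask":
        return "*" * len(value)
    if strategy == "hash":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    if strategy == "drop":
        return ""
    raise ValueError(f"unknown strategy: {strategy}")


def redact_text(text, policy):
    """Replace every SSN match using the strategy the policy assigns to it."""
    pieces = []
    cursor = 0
    for match in SSN_PATTERN.finditer(text):
        pieces.append(text[cursor:match.start()])
        pieces.append(apply_strategy(match.group(), policy["ssn"]))
        cursor = match.end()
    pieces.append(text[cursor:])
    return "".join(pieces)


masked = redact_text("His SSN was 123-45-6789.", POLICY)
# masked == "His SSN was ***********."
```

Changing the handling for an entity type then means editing the policy data rather than the redaction code, which is the point of keeping the policy in a separate file.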

How to get started

Three deployment paths, each fitting a different operational profile:

Cloud marketplaces. One-click deployment from the AWS, GCP, or Azure marketplaces (links above). Per-instance-hour billing through your existing cloud account; no procurement contract required. Fastest path to a running Philter instance.
Container. Pull the Docker image and deploy via your existing Kubernetes, ECS, or Cloud Run pipeline. Maximum control over the deployment.
From source. Apache 2.0 licensed on GitHub. Read the code, build it yourself, or fork it. The underlying Phileas library can also be embedded directly in Java/Python/.NET applications without the Philter wrapper.

For a full overview of features, deployment options, and the rest of the Philterd toolkit, see the Philter product page, or jump straight to the user’s guide for configuration details.
