Redaction Policies

Redaction policies, ready to deploy

A community-curated library of Philter and Phileas policies covering HIPAA, PCI DSS, bankruptcy court filings, AI training prep, and more. The library is released under the permissive, business-friendly Apache License, Version 2.0. Download a policy as-is, fork it, or contribute your own.

Legal v1.0.0

Bankruptcy Rule 9037 (FRBP 9037) Court Filing Redaction

Redact bankruptcy court filings per FRBP 9037: keep only the last four digits of SSNs, TINs, and account numbers, only the year of birth dates, and only the initials of minors' names.

FRBP 9037, bankruptcy, court filings, privacy, PII

Philterd
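
As a rough illustration of the partial-redaction rules this policy describes (last four digits, year only, initials), here is a minimal Python sketch. The regular expressions and function names are assumptions made for this example. They are not taken from the policy file or from Philter, and real filings contain more formats than these patterns cover.

```python
import re

# Hypothetical patterns for illustration only; real filings use more formats.
SSN = re.compile(r"\b[0-9]{3}-[0-9]{2}-([0-9]{4})\b")
DOB = re.compile(r"\b[0-9]{1,2}/[0-9]{1,2}/([0-9]{4})\b")


def keep_last_four(text: str) -> str:
    """Mask each dashed SSN except for its last four digits."""
    return SSN.sub(r"XXX-XX-\1", text)


def keep_year_only(text: str) -> str:
    """Reduce each MM/DD/YYYY date to its four-digit year."""
    return DOB.sub(r"\1", text)


def to_initials(name: str) -> str:
    """Reduce an already-identified minor's name to initials."""
    return "".join(part[0].upper() + "." for part in name.split())
```

For example, keep_last_four("SSN 123-45-6789") returns "SSN XXX-XX-6789" and to_initials("Jane Doe") returns "J.D.".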

General v1.0.0

Brazilian Identifiers (CPF, CNPJ)

Redact Brazilian CPF (individual) and CNPJ (company) tax identifiers, formatted or unformatted, validated by their mod-11 check digits.

Brazil, CPF, CNPJ, tax ID, national ID

Philterd
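
The mod-11 check mentioned in this card can be sketched for CPF as follows. This is a minimal illustration of the publicly documented CPF algorithm, not the code Philter runs; the function name is hypothetical, and CNPJ (which uses a different weight sequence) is not covered.

```python
import re

CPF_SHAPE = re.compile(r"[0-9]{3}\.?[0-9]{3}\.?[0-9]{3}-?[0-9]{2}")


def cpf_is_valid(value: str) -> bool:
    """Check both CPF check digits (hypothetical helper, not Philter's API)."""
    if not CPF_SHAPE.fullmatch(value.strip()):
        return False
    digits = [int(c) for c in value if c.isdigit()]
    # Repeated-digit values such as 111.111.111-11 pass the arithmetic
    # but are conventionally rejected.
    if len(set(digits)) == 1:
        return False
    for n in (9, 10):
        # Weights run from n + 1 down to 2 over the first n digits.
        total = sum(d * w for d, w in zip(digits[:n], range(n + 1, 1, -1)))
        # A remainder of 10 maps to check digit 0.
        if (total * 10) % 11 % 10 != digits[n]:
            return False
    return True
```

With the widely circulated example value, cpf_is_valid("529.982.247-25") returns True, while changing the last digit to 6 returns False.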

General v1.0.0

Canadian Social Insurance Number (SIN) Redaction

Redact Canadian Social Insurance Numbers (SIN), accepting formatted and unformatted nine-digit values and rejecting Luhn-invalid look-alikes.

Canada, SIN, Social Insurance Number, National ID, Luhn

Philterd
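
To show what rejecting Luhn-invalid look-alikes means in practice, here is a minimal sketch of the standard Luhn checksum applied to a nine-digit candidate. The function names and the accepted separators (space or hyphen) are assumptions for this example, not details taken from the policy.

```python
import re

SIN_SHAPE = re.compile(r"[0-9]{3}[ -]?[0-9]{3}[ -]?[0-9]{3}")


def luhn_ok(number: str) -> bool:
    """Standard Luhn checksum over the digits in `number`."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_sin(value: str) -> bool:
    """Nine digits, optionally grouped in threes, that also pass Luhn."""
    return bool(SIN_SHAPE.fullmatch(value.strip())) and luhn_ok(value)
```

The commonly cited fictitious SIN 046 454 286 passes this check; 046 454 287 has the right shape but fails the checksum.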

General v1.0.0

CCPA / CPRA Consumer Privacy Redaction

Redact personal information and sensitive personal information as defined by the California Consumer Privacy Act (CCPA/CPRA) from consumer records.

CCPA, CPRA, California, consumer privacy, personal information, sensitive personal information, privacy

Philterd

Healthcare v1.0.0

Clinical Notes De-Identification (Date-Shifted)

De-identify clinical notes for research, ML training, or analytics — preserving temporal relationships via per-patient date shifting.

HIPAA, PHI, clinical notes, date shifting, research

Philterd
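
One common way to implement per-patient date shifting is to derive a stable offset from the patient identifier and a secret key, then apply that same offset to every date in the patient's record. The sketch below assumes that approach; it is not a description of how Philter shifts dates, and the function names, the HMAC construction, and the 365-day window are all choices made for this example.

```python
import hashlib
import hmac
from datetime import date, timedelta


def patient_offset_days(patient_id: str, secret: bytes, max_days: int = 365) -> int:
    """Derive a stable offset in [-max_days, max_days] for one patient."""
    digest = hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256).digest()
    span = 2 * max_days + 1
    return int.from_bytes(digest[:8], "big") % span - max_days


def shift_date(d: date, patient_id: str, secret: bytes) -> date:
    """Shift a date by the patient's offset; intervals between dates survive."""
    return d + timedelta(days=patient_offset_days(patient_id, secret))
```

Because every date for a given patient moves by the same number of days, the interval between an admission and a discharge is unchanged after shifting, which is the temporal relationship the policy aims to preserve.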

Contact Center v1.0.0

Contact Center Call Recording Transcripts

Strip cardholder data and PII from contact-center call transcripts — primarily PAN, CVV, SSN, account numbers — to reduce PCI DSS scope and meet QA privacy requirements.

PCI DSS, contact center, call recording, transcripts, PCI scope reduction, QA

Philterd

Education v1.0.0

FERPA Student Records Redaction

Remove personally identifiable information from student educational records per FERPA (20 USC 1232g; 34 CFR Part 99).

FERPA, education, K-12, higher-ed, student records, 20 USC 1232g

Philterd

Legal v1.0.0

FRCP 5.2 Federal Civil Filing Redaction

Redact federal civil filings per FRCP 5.2: keep only the last four digits of SSNs, TINs, and account numbers, only the year of birth dates, and only the initials of minors' names.

FRCP 5.2, federal civil, court filings, privacy, PII

Philterd

General v1.0.0

French Identifiers (NIR, SIREN, SIRET)

Redact French social-security numbers (NIR) and business identifiers (SIREN, SIRET), validated by their control key or Luhn check.

France, NIR, INSEE, SIREN, SIRET, social security, business ID

Philterd
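
For the NIR, the control key is the last two digits and equals 97 minus the remainder of the first 13 digits divided by 97. The sketch below illustrates that rule, including the usual substitution for Corsican department codes (2A becomes 19, 2B becomes 18). The function name and the accepted separators are assumptions for this example, and the Luhn check used for SIREN and SIRET is not shown.

```python
import re

NIR_SHAPE = re.compile(r"[0-9]{5}([0-9]{2}|2A|2B)[0-9]{6}[0-9]{2}")


def nir_is_valid(value: str) -> bool:
    """Validate the two-digit control key of a 15-character NIR."""
    compact = re.sub(r"[\s.-]", "", value).upper()
    if not NIR_SHAPE.fullmatch(compact):
        return False
    body, key = compact[:13], int(compact[13:])
    # Corsican departments are written 2A / 2B and replaced before the division.
    body = body.replace("2A", "19").replace("2B", "18")
    return 97 - int(body) % 97 == key
```

For the documentation-style example 2 69 05 49 588 157 80, the first 13 digits leave a remainder of 17 when divided by 97, so the expected key is 80 and the value validates.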

General v1.0.0

GDPR Personal Data Redaction

Redact personal data and special-category data as defined by the EU General Data Protection Regulation (GDPR) from documents and records.

GDPR, EU, personal data, data protection, Article 4, Article 9, privacy

Philterd

General v1.0.0

General-Purpose Starter Policy

A balanced starting policy covering common PII types — names, contact info, government IDs, payment data — with no vertical-specific tuning.

starter, general, default

Philterd

General v1.0.0

German Identifiers (Steuer-ID, Personalausweis)

Redact German tax identification numbers (Steuer-ID / IdNr) and national ID card numbers (Personalausweis), validated by their check digits.

Germany, Steuer-ID, IdNr, Personalausweis, national ID, tax ID

Philterd
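
The Steuer-ID check digit follows the ISO 7064 MOD 11,10 scheme. The sketch below covers only that check digit; the additional digit-frequency rules that apply to the Steuer-ID and the separate Personalausweis check are left out. The function name is hypothetical and this is not the code the policy executes.

```python
import re


def steuer_id_check_digit_ok(value: str) -> bool:
    """Verify the 11th digit of a Steuer-ID using ISO 7064 MOD 11,10."""
    compact = value.replace(" ", "")
    if not re.fullmatch(r"[1-9][0-9]{10}", compact):
        return False  # must be 11 digits and must not start with 0
    digits = [int(c) for c in compact]
    product = 10
    for d in digits[:10]:
        total = (d + product) % 10
        if total == 0:
            total = 10
        product = (total * 2) % 11
    check = 11 - product
    if check == 10:
        check = 0
    return check == digits[10]
```

The frequently used test value 86095742719 passes; changing its last digit makes the check fail.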

Finance v1.0.0

GLBA Nonpublic Personal Information (NPPI) Redaction

Redact Nonpublic Personal Information (NPPI) from financial customer records under the Gramm-Leach-Bliley Act (15 USC 6801-6809).

GLBA, NPPI, financial privacy, Safeguards Rule, 15 USC 6801, banking

Philterd

Healthcare v1.0.0

HIPAA Safe Harbor De-Identification

Remove all 18 HIPAA Safe Harbor identifiers from clinical text per 45 CFR 164.514(b)(2).

HIPAA, Safe Harbor, PHI, 45 CFR 164.514, de-identification

Philterd

AI Training v1.0.0

LLM Training Data Preparation

Aggressive PII redaction for documents being fed into LLM training, fine-tuning, or RAG vector stores — preserves semantic structure with type tokens.

AI, LLM, fine-tuning, training data, RAG, ingestion

Philterd
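
Replacing each identifier with a token that names its type keeps sentences readable for a model while removing the value itself. The sketch below shows the idea with two toy regular expressions; the patterns, the token format, and the function name are assumptions for this example and are far narrower than the detection Philter performs.

```python
import re

# Toy detectors for illustration; a real policy covers many more entity types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b"),
}


def replace_with_type_tokens(text: str) -> str:
    """Swap each detected value for a bracketed token naming its type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

For example, "Email jane@example.com, SSN 123-45-6789." becomes "Email [EMAIL], SSN [SSN]." and the sentence structure around the identifiers is untouched.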

Healthcare v1.0.0

Medical Chatbot — User Input Redaction

Redact PHI from user messages to a healthcare chatbot before they reach the LLM — preserves clinical meaning while removing identifiers.

HIPAA, PHI, chatbot, LLM, conversational AI, RAG

Philterd

Finance v1.0.0

PCI DSS Scope Reduction

Strip cardholder data (PAN, CVV, expiration) from logs, transcripts, and tickets to reduce PCI DSS scope per Requirement 3.4.

PCI DSS, cardholder data, PAN, scope reduction, Req 3.4

Philterd

Finance v1.0.0

SOX Financial Records Redaction

Redact personal and account identifiers from financial records and audit workpapers under Sarbanes-Oxley while preserving the financial figures auditors need.

SOX, Sarbanes-Oxley, financial reporting, audit, internal controls, SEC, 15 USC 7201

Philterd

General v1.0.0

Spanish Identifiers (DNI, NIE, CIF)

Redact Spanish personal and organization identifiers (DNI, NIE, CIF), validated by their control letter or character.

Spain, DNI, NIE, CIF, national ID, tax ID

Philterd
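
The DNI and NIE control letter is determined by the number modulo 23, looked up in a fixed 23-letter table. The sketch below illustrates that rule; for an NIE the leading X, Y, or Z is first replaced by 0, 1, or 2. The function name is hypothetical, and the CIF control character, which follows a different scheme, is not covered.

```python
import re

CONTROL_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"
DNI_NIE_SHAPE = re.compile(r"([XYZ][0-9]{7}|[0-9]{8})([A-Z])")


def dni_nie_is_valid(value: str) -> bool:
    """Check the control letter of a Spanish DNI or NIE."""
    compact = re.sub(r"[\s-]", "", value).upper()
    match = DNI_NIE_SHAPE.fullmatch(compact)
    if not match:
        return False
    number, letter = match.groups()
    # NIE prefixes map to a leading digit before the modulo is taken.
    number = number.translate(str.maketrans("XYZ", "012"))
    return CONTROL_LETTERS[int(number) % 23] == letter
```

The textbook example 12345678Z validates because 12345678 mod 23 is 14 and the letter at index 14 is Z.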

Legal v1.0.0

State Court Filings Baseline Redaction

A starting policy for state-court PII redaction that covers the most common state requirements; tune it for your specific jurisdiction.

state court, court filings, privacy, PII, baseline

Philterd

Finance v1.0.0

SWIFT / BIC Codes

Redact SWIFT/BIC bank and business identifier codes (ISO 9362), validated structurally including a valid ISO 3166 country segment.

SWIFT, BIC, ISO 9362, banking, wire transfer

Philterd
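
Structural validation of a BIC can be sketched as a shape check plus a lookup of the country segment. The example assumes the commonly seen layout (four letters, a two-letter country code, a two-character location code, and an optional three-character branch code) and uses a deliberately truncated country list. Both the function name and that list are illustrative, not taken from the policy.

```python
import re

BIC_SHAPE = re.compile(r"([A-Z]{4})([A-Z]{2})([A-Z0-9]{2})([A-Z0-9]{3})?")

# Truncated subset for illustration; a real check would load every
# ISO 3166-1 alpha-2 country code.
COUNTRY_CODES = {"BR", "CA", "DE", "ES", "FR", "GB", "US"}


def bic_is_plausible(value: str) -> bool:
    """Check the 8- or 11-character shape and the country segment."""
    match = BIC_SHAPE.fullmatch(value.strip())
    # Lowercase input is rejected on purpose to avoid matching ordinary words.
    return bool(match) and match.group(2) in COUNTRY_CODES
```

Checking the country segment matters because many ordinary eight-letter uppercase strings otherwise have a BIC-like shape.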

Using a policy

Every policy is a single JSON file. Download it, upload it to your Philter instance, and reference it by name from the redaction API.

# 1. Download the policy
curl -O https://raw.githubusercontent.com/philterd/pii-redaction-policies/main/policies/philterd/healthcare/hipaa-safe-harbor.json

# 2. Upload to your Philter instance
curl -X POST http://localhost:8080/api/policies \
     -H "Content-Type: application/json" \
     --data @hipaa-safe-harbor.json

# 3. Redact text using the policy (the URL is quoted so the shell does not interpret "?")
curl "http://localhost:8080/api/filter?p=hipaa-safe-harbor" \
     --data "Patient John Smith was discharged on 2025-03-14." \
     -H "Content-Type: text/plain"
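
If you would rather call the API from code than from curl, steps 2 and 3 can be sketched in Python with the standard library alone. The endpoints, port, and headers mirror the curl commands above; the function names are made up for this example, and the response format is not assumed, so the sketch simply returns the response body as text.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # same local instance as the curl example


def upload_request(policy_json: bytes) -> urllib.request.Request:
    """Build the POST that uploads a policy file's contents."""
    return urllib.request.Request(
        f"{BASE_URL}/api/policies",
        data=policy_json,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def filter_request(policy_name: str, text: str) -> urllib.request.Request:
    """Build the POST that redacts `text` with the named policy."""
    query = urllib.parse.urlencode({"p": policy_name})
    return urllib.request.Request(
        f"{BASE_URL}/api/filter?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send a request to a running Philter instance and return the body."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With an instance running, send(filter_request("hipaa-safe-harbor", "Patient John Smith was discharged on 2025-03-14.")) performs the same call as step 3.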

No Philter instance yet? Deploy one in 5 minutes →

Contributing

The library lives at github.com/philterd/pii-redaction-policies. PRs welcome: bring your own vertical, your own custom identifiers, your own edge cases.

Why contributing matters

The library is more useful the more eyes are on it. Every policy you contribute saves another team (in healthcare, finance, legal, government, AI training) from rebuilding the same thing privately and often incorrectly. A rising tide lifts all boats.

You save peers time. A FERPA policy you write for K-12 student records is the starting point another district uses next week. A call-center PCI policy you tune today is the one a peer at a different bank doesn’t have to invent from scratch.
You get better policies. Public PR review pulls in compliance officers, security engineers, and practitioners from outside your team. Your draft comes out tighter than anything a single team would ship alone.
You get credit. Your name (or your org’s) lands in the policy’s author field and shows up on the policy’s page right here on philterd.ai. Durable attribution, not a buried changelog entry.
You compound the library. Healthcare PHI patterns inform finance NPPI work. Legal redaction patterns inform government FOIA prep. A contribution in one vertical strengthens the adjacent ones.
You make the whole ecosystem safer. Every team that adopts a vetted, peer-reviewed policy is one fewer team rolling their own regex and missing identifiers in production. Privacy is a collective-action problem; this is the collective action.

How review works

Every contribution is reviewed for schema compliance, sidecar metadata completeness, and golden-file validation against a representative input. See CONTRIBUTING.md for the file layout, metadata schema, and review process.
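
Golden-file validation generally means running a fixed input through the policy and comparing the output with a stored expected result. The sketch below shows that comparison in its simplest form. The redact callable stands in for whatever calls your Philter instance, and the function name is hypothetical; the repository's actual test harness is described in CONTRIBUTING.md and may work differently.

```python
import difflib
from typing import Callable


def golden_diff(redact: Callable[[str], str], source: str, golden: str) -> list[str]:
    """Return a unified diff between expected and actual output.

    An empty list means the redacted output matches the golden text.
    """
    actual = redact(source).splitlines()
    expected = golden.splitlines()
    return list(
        difflib.unified_diff(expected, actual, "golden", "actual", lineterm="")
    )
```

Returning the diff rather than a boolean makes a failing review easier to act on, because it shows exactly which identifiers were missed or over-redacted.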

Policies must conform to the Phileas redaction policy JSON schema.

Frequently asked questions

If something here isn’t covered, get in touch and we’ll answer.

What is a redaction policy?

A redaction policy is a single JSON file that tells Philter and Phileas which types of PII and PHI to find and how to handle each one (redact, mask, encrypt, replace with a synthetic value, or pass through). Every policy in this library conforms to the Phileas redaction policy JSON schema.

How do I use a policy from this library?

Download the JSON file, upload it to your Philter instance, and reference it by name from the redaction API. The exact commands are in the Using a policy section above. No Philter instance yet? Deploy one in about five minutes.

What is the difference between Philterd and Community policies?

Each policy card carries a badge. Philterd policies are written and maintained by the Philterd team; Community policies are contributed by practitioners outside Philterd and reviewed before they are merged. Both are released under the same permissive Apache license and held to the same schema and validation standards.

Are the policies free to use?

Yes. The entire library is released under the permissive, business-friendly Apache License, Version 2.0. Download a policy as-is, fork it, adapt it to your own data, and ship it in production with no fees and no vendor lock-in.

Does using a HIPAA or PCI policy make me compliant?

No policy can do that on its own. These are vetted, peer-reviewed starting points that encode the entity types and handling rules a given framework calls for, not a legal guarantee of compliance. Treat any policy as a baseline: tune it against your real data, measure the results (Philter Scope scores precision and recall per entity type), and have the right people review it before you rely on it.
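
To make "precision and recall per entity type" concrete, here is a minimal sketch that scores detected spans against hand-labeled spans using exact matching. It illustrates the metrics only and does not describe how Philter Scope computes them; the span representation and function name are assumptions for this example.

```python
Span = tuple[str, int, int]  # (entity_type, start_offset, end_offset)


def precision_recall(
    expected: set[Span], detected: set[Span]
) -> dict[str, tuple[float, float]]:
    """Score exact-match spans per entity type as (precision, recall)."""
    scores: dict[str, tuple[float, float]] = {}
    for entity_type in sorted({span[0] for span in expected | detected}):
        exp = {s for s in expected if s[0] == entity_type}
        det = {s for s in detected if s[0] == entity_type}
        true_positives = len(exp & det)
        # By convention here, an empty denominator scores 1.0.
        precision = true_positives / len(det) if det else 1.0
        recall = true_positives / len(exp) if exp else 1.0
        scores[entity_type] = (precision, recall)
    return scores
```

Low recall for a type means identifiers are leaking through; low precision means the policy is removing text it should have kept.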

Can I contribute my own policy?

Yes, and contributions are welcome. The library lives at github.com/philterd/pii-redaction-policies; open a pull request with your policy and its metadata. See CONTRIBUTING.md for the file layout, metadata schema, and review process. Your name or your organization is credited in the policy's author field and on its page here.

What if the policy I need isn't in the library?

If you have a specific compliance framework or vertical use case that the library doesn't cover, the Philterd team can build a custom policy and tune it against your real data. Talk to the team about a custom policy.

Need a policy that isn't here?

If you have a specific compliance framework or vertical use case in mind, the Philterd team can build a custom policy and tune it against your real data.