Talk to an Expert

Tell us about your stack and the privacy problems you're trying to solve. We typically respond within one business day.

Prefer email? support@philterd.ai

Case Studies

Redaction in Production

How organizations use the Philterd toolkit to solve real PII and PHI redaction problems.

Healthcare · Phileas · PhEye

Multilingual Patient Chatbot

Challenge

The organization operated a patient-facing chatbot that triaged symptoms and routed conversations to clinical staff. Patients routinely typed Social Security numbers, dates of birth, medication names tied to specific conditions, and insurance member IDs into the chat window. The system needed to handle both English and French input with equal accuracy, and redaction had to happen in real time before messages were persisted or forwarded to a human agent.

Solution

Phileas was embedded directly in the chatbot's message-processing layer. Pattern-based filters handled structured identifiers (SSNs, phone numbers, dates of birth, insurance IDs) in both languages, while PhEye's NLP models detected unstructured PHI, such as names, addresses, and clinical references, that patterns alone would miss. Language detection routed each message to the appropriate model. The entire stack ran inside the organization's cloud environment, with no data leaving its perimeter.
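To make the pattern-plus-routing idea concrete, here is a minimal sketch in plain Python. It is not the Phileas or PhEye API: the pattern sets, the stopword-based `detect_language` helper, and the `{{LABEL}}` placeholder format are all assumptions made for illustration, and the NLP stage is left out entirely. It only shows how a detected language can select which structured-identifier patterns run before a message is stored or forwarded.

```python
import re

# Hypothetical patterns for illustration; a real deployment would rely on
# the redaction library's own filters rather than hand-written regexes.
COMMON_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

# Date formats differ by language, so they are selected after detection.
DATE_PATTERNS = {
    "en": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "fr": re.compile(
        r"\b\d{1,2}\s+(?:janvier|février|mars|avril|mai|juin|juillet|août|"
        r"septembre|octobre|novembre|décembre)\s+\d{4}\b",
        re.IGNORECASE,
    ),
}

# Crude stand-in for a real language detector (assumption, not a library).
FRENCH_HINTS = {"je", "le", "la", "les", "est", "mon", "ma", "né", "née", "et", "suis"}


def detect_language(text: str) -> str:
    """Return 'fr' if at least two French hint words appear, else 'en'."""
    words = set(re.findall(r"\w+", text.lower()))
    return "fr" if len(words & FRENCH_HINTS) >= 2 else "en"


def redact_message(text: str) -> tuple[str, str]:
    """Detect the language, then replace each pattern match with a label."""
    lang = detect_language(text)
    patterns = dict(COMMON_PATTERNS, DATE=DATE_PATTERNS[lang])
    for label, pattern in patterns.items():
        text = pattern.sub("{{" + label + "}}", text)
    return lang, text


if __name__ == "__main__":
    print(redact_message("My SSN is 123-45-6789 and I was born 03/14/1985"))
    print(redact_message("Je suis née le 3 mars 1985, mon numéro est 555-123-4567"))
```

In this sketch the redaction step runs before anything is persisted, which mirrors the ordering described above; names, addresses, and clinical references would still need a model-based pass.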

Result

Redaction runs inline with sub-100 ms latency per message. English and French input is detected and handled automatically, so end users never have to select a language. Chat transcripts stored for analytics and quality assurance contain no recoverable PHI, satisfying the organization's HIPAA and privacy obligations.

Healthcare · Philter

EHR-to-Database Data Pipeline

Challenge

Clinical notes, discharge summaries, and radiology reports were extracted from an EHR and streamed through an AWS data pipeline into a database used by research and analytics teams. The narrative text contained dense PHI: patient names, physician names, dates, facilities, and medical record numbers embedded in free-form prose. The organization needed to de-identify this text before it reached the analytics database so downstream consumers could work with the data without HIPAA restrictions.

Solution

Philter was deployed as a service within the AWS pipeline. As documents flowed from the EHR extract into the pipeline, each record's narrative fields were sent to Philter for redaction before being written to the analytics database. Philter's combination of pattern matching and NLP handled both the structured identifiers (MRNs, dates, phone numbers) and the unstructured names and locations that appear unpredictably in clinical prose. The deployment ran entirely within the organization's VPC, with no data leaving its AWS environment.
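The redact-before-write step can be sketched as follows. This is an illustration under stated assumptions, not Philter's API: `stand_in_redactor` is a placeholder for the call to the redaction service, the field names in `NARRATIVE_FIELDS` and the `notes` table are hypothetical, and an in-memory SQLite database stands in for the analytics database. The point is only the ordering: narrative text passes through the redactor before any insert happens.

```python
import re
import sqlite3
from typing import Callable, Iterable

# Hypothetical names of the free-text fields in each extracted record.
NARRATIVE_FIELDS = ("clinical_note", "discharge_summary", "radiology_report")


def stand_in_redactor(text: str) -> str:
    """Placeholder for the redaction service call; masks MRN-like tokens only."""
    return re.sub(r"\bMRN[:\s]*\d+\b", "{{MRN}}", text)


def load_records(
    records: Iterable[dict],
    redact: Callable[[str], str],
    conn: sqlite3.Connection,
) -> int:
    """Redact every narrative field, then write it to the analytics table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (record_id TEXT, field TEXT, body TEXT)"
    )
    written = 0
    for record in records:
        for field in NARRATIVE_FIELDS:
            raw = record.get(field)
            if raw is None:
                continue
            # Only the redacted text is ever passed to the database.
            conn.execute(
                "INSERT INTO notes VALUES (?, ?, ?)",
                (record["record_id"], field, redact(raw)),
            )
            written += 1
    conn.commit()
    return written


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    sample = [{"record_id": "r1", "clinical_note": "Patient MRN: 448210 seen for follow-up."}]
    print(load_records(sample, stand_in_redactor, db))
    print(db.execute("SELECT body FROM notes").fetchall())
```

Passing the redactor in as a callable keeps the pipeline code independent of how redaction is performed, so the placeholder could be swapped for a network call to a service running inside the same VPC.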

Result

The analytics database receives fully de-identified text. Research and analytics teams query the data without per-row HIPAA access controls. The pipeline processes thousands of documents daily, and redaction adds minimal latency to the overall flow.

Have a similar problem?

Tell us about your data pipeline, your compliance requirements, and where PII is getting through. We'll show you how to stop it.