Apache NiFi and Philter
How to use Philter to redact PII and PHI inside an Apache NiFi data flow, either through Philter's API or with an embedded NiFi processor.
Reference and how-to
Maintained guides to PII redaction with the Philterd toolkit: how redaction policies work, how to author them, and how the pieces fit together.
How to configure a Philter deployment for HIPAA: encryption of data at rest and in motion across AWS, Azure, and Google Cloud.
How to deploy Philter in AWS with a CloudFormation template: finding the Philter AMI, editing the template, and launching the stack.
How to replace Philter's default self-signed SSL certificate with a signed certificate from a trusted authority, using a Java keystore.
How to run an Apache reverse proxy in front of Philter for SSL termination, access control, and access logging.
How to manage Philter's configuration across instances in an auto-scaling environment, using a pre-baked machine image or an external properties file.
How to monitor a Philter deployment in AWS: CloudWatch Logs for application logs, load balancer health checks for availability, and CloudWatch Metrics.
Embeddings look like just numbers, but research shows they are partially invertible. A practical defense guide for vector stores against PII recovery attacks.
Three acronyms that are used interchangeably but shouldn't be. A reference for engineers and compliance leads, with the regulatory and architectural take on each.
Every prompt sent to an LLM is a data egress point. Six concrete patterns for structuring prompts, redacting inputs, and scanning outputs so PII doesn't leak.
Amazon Kinesis Firehose is a managed streaming service that moves data from sources to destinations like S3 and Redshift. This post redacts PII in that stream.
How to configure a Valkey cache so Philter maintains referential integrity (consistent replacement values) across documents and contexts in a cluster.
Map Philter AI Proxy features to SOC 2 Trust Services Criteria and HIPAA Security Rule safeguards, with guidance for your own attestations.
PDFs leak redacted text in unexpected ways: invisible text layers, embedded files, and metadata. Why PDF redaction is harder than it looks, with Philter's fix.
What a redaction policy is, how the JSON schema is structured, and how to use it to control exactly which PII is detected and how each type is redacted.
Comparing the two main approaches to redacting PII and PHI: an LLM versus pattern-based rules. How each handles accuracy, cost, and GDPR or HIPAA compliance.
How to call Philter from a Microsoft Power Automate (Flow) automation to redact PII and PHI from text, using a simple HTTP action.
Data redaction removes sensitive information from documents and datasets, but covers more techniques than most realize. A guide to strategies and trade-offs.
Format-preserving encryption (FPE) encrypts a value so the ciphertext keeps its shape and won't break downstream systems. A guide with credit-card examples.
PII is the term everyone uses and few define the same way. A practitioner's guide to what counts as PII, how to find it in real data, and how to handle it.
A self-hosted PII redaction vendor never touches your data, so there is no business-associate or processor relationship to govern. With definitions.
A hands-on walkthrough from empty file to working redaction policy: detect an entity, apply it with Philter, change how it redacts, and handle false positives.