Talk to an Expert

Tell us about your stack and the privacy problems you're trying to solve. We typically respond within one business day.

Jeff Zemerick is the founder of Philterd, LLC, the consultancy and open source software company behind the Philterd PII redaction toolkit. Jeff has been working in software engineering, search, NLP, and data privacy for 20+ years; the rest of this page is the short version of what that’s looked like.

More about Jeff: details on the consulting practice, certifications, conference history, and the broader range of industries Jeff works in are available at jeffzemerick.dev. Connect on LinkedIn.

What Jeff works on

Privacy engineering for cloud and AI workloads. Jeff designs PII protection architectures for healthcare, finance, legal, and government teams — combining the open source Philterd toolkit with the deployment patterns documented across this blog. Most engagements end with the client’s own engineers owning a privacy stack they can extend without future vendor dependence.

Apache OpenNLP. Jeff serves as the PMC Chair of Apache OpenNLP and is a member of the Apache Software Foundation. He has contributed to OpenNLP for over 15 years, including release stewardship, contributor mentorship, and the deep-learning integration work that connects OpenNLP to modern transformer models.

Open source maintenance. In addition to OpenNLP, Jeff is the primary developer behind the Phileas library, the Philter API, PhEye model server, Phinder, Phield, Philter AI Proxy, Philter Scope, Philter Diffuse, and the Redaction Policy Editor — the full Philterd open source toolkit, all Apache 2.0.

Background

17 years on AWS and GCP. Jeff holds 16 active AWS certifications, has authored questions for AWS certification exams, and is an AWS Community Builder. Jeff also holds 8 Google Cloud certifications and works regularly with both clouds (and Azure) in the consulting practice.

20+ years of NLP. Before privacy specifically, Jeff worked in search, information retrieval, and NLP — including search-relevance, vector search, multilingual document processing, and the production NLP pipelines that eventually became the underpinnings of Phileas.

International conference speaker. Jeff has presented at OpenSearchCon, ApacheCon, Berlin Buzzwords, the Linux Foundation Open Source Summit, the Apache Community Over Code conference, Strata Data, Activate, HashiTalks, the ONNX Community Meetup, and others — on PII redaction, NLP pipelines, vector search, and the convergence of cloud, AI, and privacy.

On the blog

Jeff writes all the posts on this blog. The high-level themes are PII redaction (foundations, tooling, vertical applications), the hybrid approach of combining pattern matching with AI that Philterd has championed since 2017, the architecture patterns for self-hosted privacy infrastructure, and, increasingly, the privacy implications of generative AI workloads, particularly RAG systems.

The full blog index covers PII fundamentals, vertical-specific architectures (healthcare, finance, legal, insurance, education, government), integration patterns (Kafka, Snowflake, Trino, Graylog), and the broader compliance landscape.

Connect

Jeff maintains a separate personal site at jeffzemerick.dev that goes deeper into the consulting practice, certifications, conference history, and other industries beyond what’s covered here. For day-to-day updates and conversation, connect on LinkedIn. For privacy-engineering inquiries directly related to Philterd software or services, the contact form on this site (or any of the page CTAs) is the fastest path.