How to Redact PII Before Sending to an LLM: Chat, RAG, and AI Agents
Every prompt your application sends to a hosted model leaves your network. If that prompt contains a customer name, a Social Security number, a medical record number, or an account ID, you have transmitted personal data to a third party. The model’s own safety filters do not change that, because they run after the bytes have already crossed the wire. The only way to keep sensitive data inside your perimeter is to redact it before the request is sent.
This guide shows how to do that across the three places it actually matters: chat assistants, RAG pipelines, and AI agents. You get the architecture, a concrete step-by-step pipeline with a real request and response, and the policy and compliance details that a security reviewer will ask about.
Why “redact before sending” is the only version that counts
There is a category difference between two designs that sound similar. In the first, the provider filters PII out of responses and flags it in prompts. In the second, the PII never reaches the provider at all. Only the second is a property of where the data flows, and only the second keeps your data inside your network.
This is not a philosophical point. The location of redaction sets your compliance boundary, and the boundary is what an auditor examines. Under HIPAA, a hosted model that receives protected health information is handling PHI on your behalf. Under GDPR, personal data that crosses into a third-party system has left your processing boundary. Under PCI DSS, cardholder data that reaches an external service pulls that service into your assessment scope. In every case the question is not “does the provider have a filter,” it is “does the sensitive data reach the provider at all.” Redacting on the way out is the only answer that makes the boundary defensible. We go deeper on why redaction has to happen inside your perimeter in Why API-based redaction is a security antipattern.
Which AI support platforms let you redact PII before sending to an LLM?
This is the question most teams actually type into a search bar, so it deserves a direct answer. Most hosted AI support platforms redact after your text has already reached their infrastructure. That can satisfy a “we don’t train on your data” checkbox, but it does not keep the data inside your perimeter, and it does not shrink your compliance boundary, because the sensitive bytes already crossed the wire to get there.
To redact PII before sending to an LLM, you need a redaction step that runs inside your own network, in front of whatever model you call. That is exactly what Philter and the Philter AI Proxy provide. Both are self-hosted and open source, so the redaction happens in infrastructure you control, before any request leaves for OpenAI, Anthropic, Amazon Bedrock, or an OpenAI-compatible endpoint. The platform question and the architecture question turn out to be the same question: redaction has to live on your side of the boundary.
The three places PII leaks on its way to a model
“Send text to an LLM” hides three different data paths, and each one needs the redaction stage in a slightly different spot.
Chat assistants and support copilots
The simplest case. A user types a message, you build a prompt, you call the model. The PII enters in the user turn, so you redact the user turn (and any account context you attach) before the model call. This is the pattern most “redact PII before sending to an LLM” searches are really about, and it is a few lines of code, shown below.
RAG pipelines
Retrieval-augmented generation has a trap: the PII often is not in the user’s question, it is in your own knowledge base. You retrieve chunks from your documents, concatenate them into the prompt, and ship the whole thing to the model. If those documents contain customer records, ticket history, or clinical notes, your retrieval step just pulled PHI into the prompt that the user never typed. So in a RAG pipeline you redact the retrieved chunks after retrieval and before assembly, in addition to redacting the user query. The knowledge base is the leak you did not think to check.
AI agents and multi-step tools
Agents are where this matters most, which is why “pii redaction pipeline llm agents” is a query worth answering carefully for 2025 and 2026. A single user turn fans out into many model calls: tool selection, tool outputs fed back as context, retrieval lookups, intermediate reasoning, and a final synthesis. Every one of those hops is a fresh outbound request, and every one is a new chance to leak the PII that entered at step one, or to pull in new PII from a tool result. Redacting only the first user message is not enough. The redaction stage has to sit at every outbound model call, and it has to be a hard dependency in the path rather than optional middleware someone can forget to enable.
A step-by-step PII redaction pipeline, with a sample request and response
Here is the whole thing end to end, using a support chat message that contains four different entity types.
Step 1. The raw user message (before any redaction):
Hi, I'm Karen Wallace. My card 4929 1421 0892 1234 was double charged,
my email is karen.wallace@example.com, and my record number is MRN-882201.
Step 2. Send it to Philter, inside your network. Philter is a self-hosted redaction engine; you call it over HTTP and it returns redacted text according to a policy you control. The c parameter is a context label for your audit log; p selects the policy.
curl "http://philter.internal:8080/api/filter?c=support-chat&p=llm-outbound" \
  --data-binary @message.txt \
  -H "Content-Type: text/plain"
Step 3. Philter’s response, redacted per your policy:
Hi, I'm Dana Fletcher. My card ****-****-****-1234 was double charged,
my email is user-7f3a2c@redacted.example, and my record number is MRN-******.
Notice what the policy did, per entity type: the name became a consistent synthetic name (so the model still reads coherent text and can refer back to “Dana Fletcher”), the card was masked but kept its last four digits for support context, the email was tokenized, and the medical record number was masked. The text is still useful to the model; the sensitive values are gone.
Step 4. Now call the model with the redacted prompt. In Python, the redaction call sits on the line before the model call, so the PII is gone before the SDK ever opens its TLS connection:
import os

import requests
from openai import OpenAI

PHILTER = "http://philter.internal:8080/api/filter"
KEY = os.environ["OPENAI_API_KEY"]  # or load it from your secret manager
client = OpenAI(api_key=KEY)

def redact(text: str) -> str:
    # One call returns the redacted text. The policy ("p") decides how each
    # entity type is handled: mask, replace with synthetic, drop, or encrypt.
    resp = requests.post(
        PHILTER,
        params={"c": "support-chat", "p": "llm-outbound"},
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        timeout=5,
    )
    # Fail closed: an HTTP error raises here instead of returning an error
    # body that would otherwise be treated as redacted text.
    resp.raise_for_status()
    return resp.text

def ask_llm(user_message: str) -> str:
    safe_prompt = redact(user_message)              # PII removed here,
    response = client.chat.completions.create(      # before this line runs.
        model="gpt-4o",
        messages=[{"role": "user", "content": safe_prompt}],
    )
    # content can be None (for example on a refusal), so fall back to "".
    return redact(response.choices[0].message.content or "")  # scan the reply too
The same redact() function guards the response on the way back, so a model that echoes or hallucinates sensitive data cannot push it into your logs or UI.
The chat case above is the simplest call site. The other two paths reuse the exact same redact() function; only where you call it moves.
RAG: redact the question and every retrieved chunk
In retrieval-augmented generation the PII is usually not in the question, it is in the documents you retrieve. Redact both the user question and each chunk before they are assembled into the prompt:
def answer_with_rag(question: str) -> str:
    safe_question = redact(question)
    chunks = retrieve(safe_question)              # your vector search
    safe_chunks = [redact(c) for c in chunks]     # the knowledge base is the real leak
    prompt = build_prompt(safe_question, safe_chunks)  # your prompt template
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return redact(response.choices[0].message.content or "")
Agents: redact at every outbound hop
An agent turns one user request into many model calls, and a tool result at step five can introduce PII the user never typed. Route every call through one guard so redaction is a hard dependency, not optional middleware someone can forget to enable:
def guarded_call(messages: list[dict]) -> str:
    # Redact every message right before the model sees it, on every hop, so PII
    # introduced by a tool result at step 5 is already gone by step 6.
    safe = [
        # Assistant tool-call messages can carry content=None, so only string
        # content is redacted here; other shapes need their own handling.
        {**m, "content": redact(m["content"])} if isinstance(m.get("content"), str) else m
        for m in messages
    ]
    response = client.chat.completions.create(model="gpt-4o", messages=safe)
    return redact(response.choices[0].message.content or "")
Send tool selection, tool-output synthesis, and the final answer through guarded_call, so a newly added tool cannot bypass the redaction stage.
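In agent traffic, message content is not always a plain string. In the OpenAI chat format an assistant message that only carries tool calls has content set to None, and multimodal messages use a list of typed parts. The following is a minimal sketch of a per-message helper that handles those shapes. The name redact_message is hypothetical, and the redact callable is injected so the helper can be exercised without a running Philter instance.

```python
from typing import Any, Callable


def redact_message(message: dict[str, Any], redact: Callable[[str], str]) -> dict[str, Any]:
    """Return a copy of one chat message with its text content redacted."""
    content = message.get("content")

    # Plain string content: the common case for user, system, and tool messages.
    if isinstance(content, str):
        return {**message, "content": redact(content)}

    # List-of-parts content: redact the text parts, leave other parts as they are.
    if isinstance(content, list):
        parts = []
        for part in content:
            if isinstance(part, dict) and part.get("type") == "text":
                parts.append({**part, "text": redact(part["text"])})
            else:
                parts.append(part)  # e.g. images: NOT inspected by this sketch
        return {**message, "content": parts}

    # content is None (e.g. an assistant message that only holds tool calls).
    return dict(message)
```

This sketch does not look inside tool-call arguments or image parts. If your tools accept free text, those fields would need the same treatment before the request leaves your network.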
Building a PII redaction pipeline for LLM agents in 2025 and 2026
The durable pattern, the one that still makes sense as agent frameworks churn, has four properties:
- The redaction stage is self-hosted. It runs in your VPC, so sensitive data is never sent to a vendor in order to be redacted, which would just relocate the exposure instead of removing it.
- Detection is not another LLM. Philter detects PII and PHI with purpose-built NLP models plus pattern matching, which keeps the privacy layer deterministic and inspectable instead of inheriting the non-determinism of the model it protects. We explain why this matters in Why using an LLM to redact PII and PHI is a bad idea.
- It is a hard dependency at every outbound hop. Not a sidecar someone can disable, and not “only the first user turn.” Every tool call and retrieval result passes through it.
- The policy is a versioned artifact. “What did we redact, and how, on this date” is answerable with a diff, which is what an auditor wants.
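One way to read "hard dependency" is that the pipeline fails closed: if the redaction stage is down or errors out, the model call does not happen at all. Here is a minimal sketch of that idea. The names fail_closed and RedactionUnavailable are hypothetical, and both the redact and send callables are stand-ins for your own redaction call and model call.

```python
from typing import Callable


class RedactionUnavailable(RuntimeError):
    """Raised when the redaction stage cannot vouch for the outbound text."""


def fail_closed(
    redact: Callable[[str], str],
    send: Callable[[str], str],
) -> Callable[[str], str]:
    """Wrap a model call so it can only ever receive redacted text."""

    def guarded(text: str) -> str:
        try:
            safe = redact(text)
        except Exception as exc:
            # Do NOT fall back to the raw text: refusing the request is the
            # safe default when redaction cannot be confirmed.
            raise RedactionUnavailable("redaction failed; request not sent") from exc
        return send(safe)

    return guarded
```

The design choice is that an outage in the redaction service turns into a visible application error rather than a silent leak.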
The architecture
The pattern is the same whether you redact inline or through the proxy: a deterministic redaction stage is the last hop before LLM-bound traffic leaves your network, and the first hop on the way back.
your app / agent  ->  [ redaction ]  ->  LLM provider
                            |
                redaction policy (versioned)
                audit log (per entity, per direction)
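To make the "versioned policy" and "audit log per entity, per direction" boxes concrete, here is a rough sketch of what one audit line could contain. Everything in it is an assumption for illustration: the field names, the policy_fingerprint and audit_record helpers, and the idea that your redaction step gives you a list of detected entity types. The policy JSON used in any example is a placeholder, not Philter's policy schema.

```python
import hashlib
import json
from collections import Counter
from datetime import datetime, timezone


def policy_fingerprint(policy_json: str) -> str:
    """Short, stable hash of a policy document, independent of key order."""
    canonical = json.dumps(json.loads(policy_json), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def audit_record(direction: str, context: str, entity_types: list[str], policy_json: str) -> str:
    """Build one JSON audit line. It records entity counts, never entity values."""
    if direction not in ("outbound", "inbound"):
        raise ValueError("direction must be 'outbound' or 'inbound'")
    return json.dumps(
        {
            "ts": datetime.now(timezone.utc).isoformat(),
            "direction": direction,
            "context": context,
            "policy": policy_fingerprint(policy_json),
            "entities": dict(Counter(entity_types)),
        },
        sort_keys=True,
    )
```

Logging the fingerprint next to each record ties every redaction to the exact policy revision that produced it, which you can then look up in version control.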
If you would rather not touch each call site, the Philter AI Proxy does the same thing transparently. You point your existing SDK at the proxy and nothing else changes:
# Before
client = OpenAI(api_key=KEY)
# After: one URL. Prompts are redacted before they leave your network.
client = OpenAI(api_key=KEY, base_url="https://philter-proxy.internal/v1")
The proxy speaks the OpenAI, Anthropic, and Amazon Bedrock wire protocols, plus any OpenAI-compatible provider, so the same one-line change works whether you call GPT-4o, Claude, a Bedrock model, or a self-hosted vLLM endpoint.
Policy: how each entity type is handled
Redaction is rarely “replace everything with asterisks.” A workable policy treats entity types differently: mask all but the last four digits of a credit card number, replace names with consistent synthetic names so the model still reads coherent text, shift dates rather than dropping them, and encrypt identifiers you need to re-link later. Philter expresses all of this in a JSON policy, and the Redaction Policy Editor makes per-entity strategies a clickable choice rather than a hand-edited file. Because the policy is a versioned artifact, “what did we redact, and how, on this date” is a question you can answer with a diff.
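To show what "different strategies per entity type" means in practice, here is a toy Python sketch of three of them: partial masking, consistent pseudonyms, and date shifting. It is not Philter's implementation or policy format (Philter expresses these in a JSON policy). The function names, the tiny name list, and the keyed-hash approach are all assumptions made for illustration.

```python
import hashlib
import hmac
import re
from datetime import date, timedelta

# Illustrative pool only; a real pool would be much larger to avoid collisions.
SYNTHETIC_NAMES = ["Dana Fletcher", "Morgan Reyes", "Alex Whitfield", "Jordan Pike"]


def mask_card(number: str) -> str:
    """Mask a card number but keep the last four digits for support context."""
    digits = re.sub(r"\D", "", number)
    return "****-****-****-" + digits[-4:]


def pseudonym(name: str, key: bytes) -> str:
    """Map a real name to the same synthetic name every time, using a keyed hash."""
    digest = hmac.new(key, name.lower().encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:4], "big") % len(SYNTHETIC_NAMES)
    return SYNTHETIC_NAMES[index]


def shift_date(original: date, offset_days: int) -> date:
    """Shift a date by a fixed offset so intervals between dates are preserved."""
    return original + timedelta(days=offset_days)


print(mask_card("4929 1421 0892 1234"))  # ****-****-****-1234
```

Consistency is the point of the keyed hash: the same input name always maps to the same replacement, so the model can refer back to it across turns, while the key stays inside your network.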
Proxy, SDK, or gateway: where to put the redaction step
There are three places the redaction stage can live, and the right one depends on how much control you have over the call site.
- Inline from your code (SDK style). You call Philter on the line before each model call, as in the Python above. This gives you the most control: you decide exactly which fields are redacted and can branch per call. Use it when you own the code making the model calls.
- In front of the SDK (proxy style). You point your existing OpenAI, Anthropic, or Bedrock client at the Philter AI Proxy by changing one base URL. Every call through that client is redacted whether or not the developer remembered to add a redaction line. Use it when you have many call sites, third-party libraries you cannot edit, or you want redaction enforced centrally rather than trusted to each caller.
- At the network gateway. You force all egress to model providers through the proxy at the network layer, so even a forgotten or rogue client cannot reach a provider un-redacted. Use it when redaction is a hard compliance control, not a best-effort convenience.
The deeper your compliance requirement, the further down this list you should push the enforcement point: inline is the easiest to add, the gateway is the hardest to bypass.
Re-identification: getting the original values back after the model responds
Sometimes the redacted prompt is enough and you never need the real values again. Other times the application has to show the user their own data. Both are supported, and the difference is the filter strategy in the policy.
For the reversible case, replace each entity with a consistent token (or a value encrypted with a key you hold) instead of a static mask. The token-to-value map stays inside your network. When the model responds, you swap the real values back in before rendering. The model reasons over “customer DANA-FLETCHER-01” and “card ending 1234,” never the real identity, but the user still sees their actual name and account. For the irreversible case (analytics, training data, logs), use masking or synthetic replacement and keep no map, so there is nothing to re-link.
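The reversible case can be sketched with a small in-memory token map. This is a simplified illustration, not how Philter stores or encrypts values: the TokenVault class, its token format, and the assumption that detection has already identified each value and its entity type are all hypothetical.

```python
import re
import secrets


class TokenVault:
    """In-memory token <-> value map. In production this stays inside your network."""

    _TOKEN_RE = re.compile(r"\[[A-Z]+_[0-9a-f]{8}\]")

    def __init__(self) -> None:
        self._by_value: dict[tuple[str, str], str] = {}
        self._by_token: dict[str, str] = {}

    def tokenize(self, kind: str, value: str) -> str:
        """Return a consistent token for (kind, value); kind is uppercase, e.g. 'NAME'."""
        key = (kind, value)
        if key not in self._by_value:
            token = f"[{kind}_{secrets.token_hex(4)}]"  # random, so it reveals nothing
            self._by_value[key] = token
            self._by_token[token] = value
        return self._by_value[key]

    def restore(self, text: str) -> str:
        """Swap known tokens in a model response back to the original values."""
        return self._TOKEN_RE.sub(
            lambda m: self._by_token.get(m.group(0), m.group(0)), text
        )


vault = TokenVault()
token = vault.tokenize("NAME", "Karen Wallace")
reply = f"Hello {token}, your refund is on its way."  # what the model might return
print(vault.restore(reply))  # Hello Karen Wallace, your refund is on its way.
```

For the irreversible case you would simply never build the map, which is what guarantees there is nothing to re-link.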
Latency and cost of the redaction hop
The most common objection is that adding a redaction step will slow every model call. In practice the overhead is small. Detection runs on purpose-built NLP models and pattern matching, not a second LLM call, so a typical prompt adds single-digit to low-tens of milliseconds, which is noise next to the model’s own inference time. Because the redaction service is self-hosted next to your application, there is no extra round trip to a third-party API, and there are no per-token redaction fees: you run the software on infrastructure you already pay for.
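Latency depends on your hardware, policy, and prompt size, so it is worth measuring on your own deployment rather than taking a number on faith. Below is a small timing helper, offered as a sketch. The name measure_ms is hypothetical, and fn stands in for whatever redaction call you use, such as the redact() function shown earlier.

```python
import time
from statistics import median
from typing import Callable


def measure_ms(fn: Callable[[str], str], sample: str, runs: int = 20) -> float:
    """Return the median wall-clock time of fn(sample) in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)
```

Using the median rather than the mean keeps one slow outlier, such as a cold connection on the first call, from skewing the result. Run it with a representative prompt and compare the figure with your model's typical response time.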
FAQ
Which AI support platforms let me redact PII before sending to an LLM?
Most hosted platforms redact after your data reaches their infrastructure, which does not keep it inside your perimeter. To redact before sending, you need a redaction step that runs in your own network in front of the model. The Philter AI Proxy does this for OpenAI, Anthropic Claude, Amazon Bedrock, and any OpenAI-compatible provider; you change one base URL and your existing code keeps working.
How do I build a PII redaction pipeline for LLM agents in 2025 and 2026?
Put a deterministic redaction stage between your agent and every outbound model call, and make it a hard dependency. For multi-step agents this matters more, because every tool call and retrieval result is a new chance to leak. The durable pattern is a self-hosted redaction service called inline or through the proxy, driven by a versioned, auditable policy, with detection that is NLP and pattern matching rather than a second LLM.
Does redaction work for RAG and AI agents, not just chat?
Yes. In chat you redact the user turn; in RAG you redact retrieved chunks before assembling the prompt, because your knowledge base is often where the PII lives; in agents you redact at every outbound hop. The same self-hosted Philter service covers all three.
Is there a redaction tool for GDPR in Python?
Yes. Philter exposes an HTTP API you call from Python with requests, so redaction is a few lines in any service or notebook. Redacting or pseudonymizing personal data before it reaches a third-party model keeps that data inside your processing boundary and supports data-minimization obligations. Philter is self-hosted, so personal data never traverses a vendor’s network to be redacted, and every redaction can be logged for your Data Protection Officer.
How do I redact PII before sending text to ChatGPT or the OpenAI API?
Put a redaction step between your application and the API. Self-host Philter and call it on the line before the model call, or run the Philter AI Proxy and point your existing OpenAI SDK at the proxy’s base_url. Either way the prompt is redacted inside your network, so the cleartext PII never reaches OpenAI’s servers and your code is otherwise unchanged.
Can I redact PII before sending to Anthropic Claude or Amazon Bedrock?
Yes. The approach is provider-agnostic because redaction happens before the request leaves your boundary. The Philter AI Proxy speaks the Anthropic and Amazon Bedrock wire protocols (plus OpenAI-compatible providers like Mistral, Cohere, and vLLM), so the same one-URL change works for Claude, a Bedrock model, or a self-hosted endpoint.
Can I get the original values back after the LLM responds?
Yes, with a reversible strategy. Replace each entity with a consistent token (or an encrypted value), keep the token-to-value map inside your network, and swap the real values back into the model’s response. The LLM only ever sees the tokens, but your application still returns usable output. Use irreversible masking when you never need the original back.
Does redacting PII before an LLM call add much latency?
Very little. Detection runs in-network on purpose-built NLP and pattern matching rather than a second model call, so a typical prompt adds single-digit to low-tens of milliseconds, negligible next to LLM inference time. Because the redaction service is self-hosted next to your application, there is no extra round trip to a third-party API.
Deploy it from your cloud marketplace
You do not have to build this from scratch or wait on procurement. Philter is available on the AWS Marketplace, Google Cloud Marketplace, and the Microsoft Azure Marketplace, with per-hour billing inside your existing cloud account. Launch it into your own VPC, point your pipeline or the Philter AI Proxy at it, and your prompts are redacted before they ever reach the model. That is the difference between “the provider promises to filter it” and “we never sent it.”