PII in Vector Embeddings: A Defense Guide
"It's just an array of floats" is the most reassuring sentence a vector-store skeptic can hear — and the most misleading. Sentence embeddings produced by modern models are partially invertible: an attacker with access to the embeddings (but not the original text) can reconstruct meaningful approximations of the source. For teams storing embeddings of sensitive data, this turns "we don't expose the raw text" from a complete defense into half of one.
This post is a practical defense guide: the threat model, the research, and four layered mitigations, ordered by effort vs. impact, that together cover the realistic attack surface.
What "embedding inversion" actually means
An embedding is a fixed-length vector (typically 384, 768, 1024, or 1536 floats) produced by a model from a chunk of text. Two pieces of text that are semantically similar end up close together in vector space; two that are semantically distinct end up far apart. That's the property that makes embeddings useful for retrieval.
The unintended consequence: semantic similarity is a strong proxy for content similarity. If an attacker has your embedding, they can:
- Generate large numbers of candidate sentences.
- Embed each candidate with the same model you used.
- Compute the distance to your embedding.
- Keep the candidates that score closest; they are approximations of your original text.
Recent research (Morris et al., 2023; Song & Raghunathan, 2020) has shown that attacks in this family, including learned inversion models that replace brute-force candidate search, can recover not just the topic of an embedded sentence but often substantial portions of the original wording, including names, dates, account numbers, and other PII. For embeddings of short, structured text (like a customer support ticket title), the recovered text can come close to the original.
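The candidate-ranking loop described above can be sketched in a few lines. This is a toy illustration, not a reproduction of the published attacks: `toy_embed` is a hypothetical hashed bag-of-words stand-in for the victim's real embedding model, and the candidate list is hand-written rather than generated.

```python
import hashlib
import math

DIM = 64  # toy dimensionality; real models typically use 384 to 1536


def toy_embed(text: str) -> list:
    """Hypothetical stand-in embedder: hashed bag of words, L2-normalized.

    Assumption: a real attacker would call the same model the victim used.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b) -> float:
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_candidates(target, candidates):
    """Embed every candidate and sort by similarity to the leaked vector."""
    scored = [(cosine(target, toy_embed(c)), c) for c in candidates]
    return sorted(scored, reverse=True)


# The attacker holds only this vector, never the text that produced it.
leaked = toy_embed("reset password for account 4417 belonging to jane doe")

candidates = [
    "quarterly revenue grew in the third quarter",
    "reset password for account belonging to john smith",
    "reset password for account 4417 belonging to jane doe",
    "schedule a meeting with the legal team",
]

best_score, best_guess = rank_candidates(leaked, candidates)[0]
print(best_guess)
```

A brute-force loop like this only works when the candidate pool happens to contain something close to the original, which is why practical attacks train a model to propose and refine candidates.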
The threat model
Three realistic attack scenarios:
- Vector store breach. The most direct: someone exfiltrates your Chroma / Qdrant / Milvus / pgvector database. Whether the chunk metadata went with it or not, the embeddings themselves are a recovery vector.
- Insider read access. An engineer with legitimate read access to the vector store has effective read access to (approximations of) the underlying text. This is rarely modeled because the embeddings are "just numbers."
- Hosted vector database leak. If you're using a SaaS vector database (Pinecone, Weaviate Cloud, etc.), your embeddings sit on someone else's infrastructure. Their security is your security.
Notably not in scope: inversion attacks against the embedding API itself (those exist, but they require model-specific work and aren't usually the cheapest attack path) and online attacks against a live retrieval API (which are constrained by the retrieval surface and are a separate problem).
Defense 1: redact before you embed
The strongest mitigation by far. If the source text doesn't contain PII when it's embedded, the embedding can't leak PII even under perfect inversion. This is the standard privacy-aware RAG ingestion pattern:
    # redact_via_philter, chunker, and embedder are assumed to be
    # defined elsewhere in the ingestion pipeline.
    def ingest_document(text: str, doc_id: str, store):
        # 1. Redact the source text with Philter
        redacted = redact_via_philter(text)
        # 2. Chunk the redacted text
        chunks = chunker.split_text(redacted)
        # 3. Embed the redacted chunks
        embeddings = embedder.embed_documents(chunks)
        # 4. Store embeddings + redacted text
        store.add(
            embeddings=embeddings,
            metadatas=[{"doc_id": doc_id, "chunk_id": i}
                       for i in range(len(chunks))],
            documents=chunks,  # also redacted
        )

This is the single highest-leverage defense. Nothing else on this list matters as much as getting the redact-before-embed step right. Philter and Phileas both fit naturally into ingestion pipelines for this purpose.
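For a sense of what the redaction step produces, here is a deliberately simple regex stand-in. It is not how Philter or Phileas work internally and it is far less thorough than either; `PATTERNS`, `redact`, and the `{{TYPE}}` placeholder format are assumptions made up for this sketch.

```python
import re

# Hypothetical patterns for illustration only; a real redactor covers
# many more entity types and uses more than regular expressions.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
]


def redact(text: str) -> str:
    """Replace each matched span with a typed placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub("{{" + label + "}}", text)
    return text


raw = "Contact jane.doe@example.com or 555-867-5309 about SSN 123-45-6789."
redacted = redact(raw)
print(redacted)
# Contact {{EMAIL}} or {{PHONE}} about SSN {{SSN}}.
```

Whatever is embedded after this step carries the placeholder, not the value, so even a perfect inversion recovers `{{EMAIL}}` rather than the address.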
One subtle trade-off: embeddings of redacted text are different from embeddings of raw text. Retrieval quality on redacted embeddings is sometimes slightly degraded (the model loses the specific named entity as a signal). For most workloads the difference is negligible; for entity-heavy retrieval (e.g. "find documents about this specific person"), benchmark before assuming.
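One way to run that benchmark is to compare recall@k for the same labeled queries against a raw-text index and a redacted-text index. The sketch below assumes you already have ranked results from both; `recall_at_k` is a hypothetical helper, and the query IDs, document IDs, and rankings are invented purely to show the calculation.

```python
def recall_at_k(retrieved: dict, relevant: dict, k: int) -> float:
    """Fraction of queries whose known-relevant document is in the top k."""
    hits = sum(1 for query, docs in retrieved.items()
               if relevant[query] in docs[:k])
    return hits / len(retrieved)


# Made-up data: the one document each query is supposed to find.
relevant = {"q1": "d1", "q2": "d4", "q3": "d7"}

# Made-up ranked results from an index built on raw text...
raw_run = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d5", "d4", "d6"],
    "q3": ["d7", "d8", "d9"],
}
# ...and from an index built on redacted text.
redacted_run = {
    "q1": ["d1", "d3", "d2"],
    "q2": ["d6", "d5", "d4"],
    "q3": ["d8", "d9", "d2"],
}

raw_recall = recall_at_k(raw_run, relevant, k=2)
redacted_recall = recall_at_k(redacted_run, relevant, k=2)
print(f"raw={raw_recall:.2f} redacted={redacted_recall:.2f}")
```

If the gap on your own entity-heavy queries is large, that is the signal to look at the noise-based defense in the next section instead of, or alongside, redaction.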
Defense 2: differential privacy on the embeddings themselves
If you can't redact before embedding (for example, if the model's retrieval quality requires the entity-level signal), the next defense is to add calibrated noise to the embeddings before storage. Local Differential Privacy applied at the embedding layer makes inversion attacks substantially harder by destroying the fine-grained signal that inversion exploits while preserving the coarse-grained signal that retrieval needs.
Philter Diffuse implements the formal ε-budget framework for this. The trade-off is explicit: lower ε means more noise, stronger privacy, weaker retrieval; higher ε means less noise, weaker privacy, stronger retrieval. You pick the operating point. The math behind the trade-off is auditable — this is the difference between "we add some noise" and "we add a measured amount of noise with a formal privacy guarantee."
    from philter_diffuse import LocalDP

    # Apply noise at ingestion time
    ldp = LocalDP(epsilon=1.0)  # tune ε per workload
    noisy_embedding = ldp.privatize(embedding)
    store.add(embeddings=[noisy_embedding], ...)

For most enterprise workloads, ε between 0.5 and 2.0 is the practical operating range: meaningful resistance to inversion attacks without destroying retrieval quality.
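To make the ε trade-off concrete, the sketch below applies the textbook Laplace mechanism to a clipped embedding. It is an illustration under stated assumptions, not Philter Diffuse's implementation: `clip_l1` and `laplace_privatize` are hypothetical helpers, and the L1 clipping bound of 1.0 is an arbitrary choice.

```python
import random


def clip_l1(vec, bound=1.0):
    """Scale the vector down so its L1 norm is at most `bound`."""
    norm = sum(abs(v) for v in vec)
    if norm <= bound:
        return list(vec)
    return [v * bound / norm for v in vec]


def laplace_privatize(vec, epsilon, bound=1.0, rng=random):
    """Add per-coordinate Laplace noise calibrated to epsilon.

    After clipping, any two inputs differ by at most 2 * bound in L1,
    so a noise scale of 2 * bound / epsilon gives epsilon-local-DP.
    """
    clipped = clip_l1(vec, bound)
    scale = 2.0 * bound / epsilon
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return [v + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
            for v in clipped]


def mean_abs_noise(vec, epsilon, rng):
    """Average absolute perturbation per coordinate, for comparison."""
    clipped = clip_l1(vec)
    noisy = laplace_privatize(vec, epsilon, rng=rng)
    return sum(abs(n - c) for n, c in zip(noisy, clipped)) / len(vec)


rng = random.Random(0)
embedding = [0.1] * 256
strong_privacy = mean_abs_noise(embedding, epsilon=0.5, rng=rng)
weak_privacy = mean_abs_noise(embedding, epsilon=2.0, rng=rng)
print(strong_privacy > weak_privacy)
```

Because the noise scale is 2·bound/ε, halving ε doubles the noise. Per-coordinate Laplace noise of this kind is heavy in high dimensions, so read this as showing the direction of the trade-off, not a tuned mechanism.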
Defense 3: keep the vector store inside your perimeter
The simplest architectural mitigation: don't send your embeddings to a hosted vector database. Self-host Chroma, Qdrant, Milvus, or pgvector inside the same VPC as your application. Embeddings stay where the application stays; vector-store breach risk reduces to general application-infrastructure breach risk.
This generalizes the same argument we made against SaaS PII redaction APIs: any time data leaves your perimeter, your exposure grows to the union of your attack surface and the vendor's. Self-hosting shrinks it back to just yours.
For regulated workloads (healthcare, finance, government), self-hosting is usually not a choice but a requirement: the standard contracts with hosted vector providers often don't include BAAs (business associate agreements) or the equivalent for NPPI (nonpublic personal information).
Defense 4: encrypt embeddings at rest, access-control the index
Belt-and-suspenders measures that catch the cases the first three defenses miss:
- Storage-layer encryption. Use your vector database's transparent disk encryption (most major options support this) so a stolen disk image doesn't yield the embeddings. The encryption is invisible to the application but blocks the offline-attack path.
- Tight read-access controls. Treat read access to the vector store the same way you treat read access to the raw documents. The legitimate consumers of the embeddings are the retrieval service and (during debugging) a small set of named engineers — not the broader engineering organization.
- Per-tenant or per-user isolation for multi-tenant applications. Exposing one tenant's embeddings through another tenant's query is a breach in its own right. Index-per-tenant or row-level filtering should be enforced at the retrieval layer, not delegated to application logic.
- Audit logging on the retrieval API. Every retrieval query is a small information disclosure. Logging which queries hit which chunks (without logging the chunks themselves) gives you the timeline regulators will ask for after an incident.
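The tenant-isolation and audit-logging points above can be combined in a thin retrieval wrapper. The following is a minimal in-memory sketch, assuming a hypothetical `TenantScopedRetriever` class and `Chunk` record; a real deployment would push the tenant filter into the vector database's own query and send audit records to a proper log sink.

```python
import json
import math
import time
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    tenant_id: str
    embedding: Sequence[float]
    text: str


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TenantScopedRetriever:
    """Applies the tenant filter itself so callers cannot skip it."""

    def __init__(self, chunks, audit_sink: Callable[[str], None]):
        self._chunks = list(chunks)
        self._audit_sink = audit_sink

    def search(self, tenant_id: str, user: str, query_embedding, k: int = 3):
        # Filter before ranking: other tenants' vectors are never scored.
        visible = [c for c in self._chunks if c.tenant_id == tenant_id]
        ranked = sorted(
            visible,
            key=lambda c: cosine(query_embedding, c.embedding),
            reverse=True,
        )[:k]
        # Log which chunks were returned, never the chunk text itself.
        self._audit_sink(json.dumps({
            "ts": time.time(),
            "tenant": tenant_id,
            "user": user,
            "chunk_ids": [c.chunk_id for c in ranked],
        }))
        return ranked


audit_lines = []
retriever = TenantScopedRetriever(
    [
        Chunk("a1", "acme", (1.0, 0.0), "acme invoice text"),
        Chunk("a2", "acme", (0.0, 1.0), "acme contract text"),
        Chunk("g1", "globex", (1.0, 0.0), "globex payroll text"),
    ],
    audit_sink=audit_lines.append,
)
results = retriever.search("acme", "alice", (1.0, 0.0), k=2)
print([c.chunk_id for c in results])
```

The audit record holds chunk IDs only, so the log itself does not become a second copy of the sensitive text.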
Where the state of the art is heading
Two research directions worth tracking:
- Inversion-resistant embedding models. Some recent work explores training embedding models with explicit inversion-resistance objectives — producing embeddings that retain retrieval utility while being substantially harder to invert. Not yet a production-ready option, but the gap is closing.
- Fully homomorphic encryption (FHE) for retrieval. Encrypted embeddings + encrypted queries + encrypted retrieval; the server never sees plaintext anywhere. FHE is still slow for production retrieval, but the performance gap is narrowing year-over-year.
For now, the right answer for any sensitive-data RAG workload is the four-layer defense above: redact at ingestion, apply differential privacy if you can't redact, self-host the vector store, encrypt at rest and access-control tightly.
The bottom line
Vector embeddings aren't "just numbers" any more than a hashed password is "just a hash." Both are recoverable under realistic attack conditions. The right response isn't to abandon RAG — the architecture is too useful — it's to design embedding pipelines with the same threat model you'd apply to any other PII-handling system.
Redact what you can, add noise to what you can't, keep the store inside your perimeter, and apply layered defenses. The Philterd toolkit covers each of these layers (Philter / Phileas for redaction, Philter Diffuse for differential privacy on embeddings, Phinder for discovery, Phield for monitoring) with open source software and no third-party data path.
For a deeper architectural pattern, see "Building a Privacy-Aware RAG System" — the embedding-defense story sits inside the broader RAG-defense story.