PII Lens

English Names (Extra Small)

Ultra-light English person-name detector, fine-tuned from DeBERTa-v3-xsmall on NVIDIA Nemotron-PII. The smallest size in the ph-eye-pii-en family, for the most constrained on-device use; Phileas handles structured identifiers via its pattern-based layer.

  • Status: available
  • License: CC-BY-4.0
  • Version: 1.0.0
  • Updated: 2026-06-18
  • PhEye compatibility: >=1.0.0
  • Languages: en
  • Model size: 90 MB (int8 ONNX) / 294 MB (PyTorch)
  • Author: Philterd

Entities detected

  • PERSON

When to load this lens

Load this lens for English person-name detection where footprint and latency matter most — edge, in-browser, or CPU at high throughput — and the small lens is still too heavy. It ships a quantized int8 ONNX graph and runs in-process. It is the lowest-capacity size, so real-text recall is lower than the larger sizes; prefer small/medium/large when your budget allows.

Pairs well with

  • English Names (Small): Low-latency English person-name detector, fine-tuned from GLiNER small on NVIDIA Nemotron-PII. The on-device size in the ph-eye-pii-en family; Phileas handles structured identifiers via its pattern-based layer.
  • English Names (Medium): Mid-size English person-name detector, fine-tuned from GLiNER medium on NVIDIA Nemotron-PII. The recommended default in the ph-eye-pii-en family; Phileas handles structured identifiers via its pattern-based layer.
  • English Names (Large): Highest-capacity English person-name detector, fine-tuned from GLiNER large on NVIDIA Nemotron-PII. The server-side size in the ph-eye-pii-en family; Phileas handles structured identifiers via its pattern-based layer.

What this lens detects

  • PERSON: people’s names as they appear in English text.

This is a name-only lens. Emails, phone numbers, SSNs, credit cards, IP addresses, and other structured PII follow regular patterns and are detected by Phileas’s pattern-based (regex, checksum, and dictionary) layer, not by this model. Compose this lens with that layer for full coverage.
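As a rough illustration of that composition, the sketch below merges PERSON spans with spans from two toy regular expressions and redacts the union. It does not use the Phileas or PhEye APIs: `Span`, `PATTERNS`, `pattern_spans`, `merge_spans`, and `redact` are hypothetical names, the patterns are far simpler than a real pattern layer, and the PERSON span is hard-coded to stand in for model output.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    start: int
    end: int
    label: str


# Toy patterns for illustration only; a production pattern layer also
# uses checksums and dictionaries and covers many more identifier types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def pattern_spans(text):
    """Return one Span per regex match, labelled by pattern name."""
    return [
        Span(m.start(), m.end(), label)
        for label, rx in PATTERNS.items()
        for m in rx.finditer(text)
    ]


def merge_spans(spans):
    """Sort by start (longest first on ties) and drop overlapping spans."""
    kept = []
    for s in sorted(spans, key=lambda s: (s.start, -(s.end - s.start))):
        if not kept or s.start >= kept[-1].end:
            kept.append(s)
    return kept


def redact(text, spans):
    """Replace each kept span with a {{LABEL}} placeholder."""
    out, cursor = [], 0
    for s in merge_spans(spans):
        out.append(text[cursor:s.start])
        out.append("{{" + s.label + "}}")
        cursor = s.end
    out.append(text[cursor:])
    return "".join(out)


text = "Contact Jane Doe at jane.doe@example.com, SSN 123-45-6789."
name_spans = [Span(8, 16, "PERSON")]  # hypothetical stand-in for model output
redacted = redact(text, name_spans + pattern_spans(text))
print(redacted)  # Contact {{PERSON}} at {{EMAIL}}, SSN {{SSN}}.
```

Dropping overlapping spans in favour of the earliest, longest one is just one simple merge policy; a real pipeline may resolve conflicts differently.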

Why this lens

This is the ultra-light member of the ph-eye-pii-en family, fine-tuned from microsoft/deberta-v3-xsmall on the synthetic nvidia/Nemotron-PII dataset. It is the smallest size, with roughly half the parameters and latency of the small lens, and ships a quantized int8 ONNX graph (about 90 MB) for in-process, CPU-friendly inference. Like its siblings it is recall-leaning by design: in redaction a missed name is a leak, while an extra span is only over-redaction. A confidence threshold around 0.50 is a sensible starting operating point; lower it to push recall higher, at the cost of more over-redaction.
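To make the threshold trade-off concrete, here is a minimal sketch of filtering detections by confidence. It assumes each detection carries a score between 0 and 1; `ScoredSpan` and `keep_spans` are hypothetical names rather than part of PhEye, and the detections and scores are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredSpan:
    text: str
    score: float  # assumed model confidence in [0, 1]


def keep_spans(spans, threshold=0.50):
    """Keep only spans whose confidence meets the threshold."""
    return [s for s in spans if s.score >= threshold]


# Invented detections; real scores depend on the model and the input text.
detections = [
    ScoredSpan("Maria Alvarez", 0.93),
    ScoredSpan("Jordan", 0.58),
    ScoredSpan("Paris", 0.41),
]

default = keep_spans(detections)               # starting point of 0.50
recall_leaning = keep_spans(detections, 0.35)  # lower threshold, more spans

print([s.text for s in default])         # ['Maria Alvarez', 'Jordan']
print([s.text for s in recall_leaning])  # ['Maria Alvarez', 'Jordan', 'Paris']
```

In this toy data the lower threshold also admits "Paris", which shows how extra recall can bring extra over-redaction.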

When to use this

  • The most constrained deployments — edge, in-browser (WASM), or memory-tight CPU — where the small lens is too large.
  • High-throughput, cost-sensitive pipelines where per-document latency dominates.
  • As the English name detector composed with Phileas’s pattern-based detection for structured PII.

When a missed name is costly and footprint allows, prefer ph-eye-pii-en-small, -medium, or -large, which have higher real-text recall.

Known limitations

  • Names only. This lens detects PERSON. Other PII is handled by Phileas’s pattern-based detection; compose accordingly.
  • English only. For other languages, load the corresponding language lens when available.
  • Lowest capacity. As the smallest size, its real-text recall is lower than the larger ph-eye-pii-en sizes — it misses more names on real-world documents. Choose it when footprint and latency dominate; otherwise prefer a larger size.
  • Trained on synthetic data. Reported accuracy is in-distribution on Nemotron-PII and is a ceiling, not a production guarantee; validate precision and recall on your own text. The model is recall-leaning, so expect some over-redaction and tune the threshold to your precision/recall balance.
  • Attribution required. The lens is licensed CC-BY-4.0, and the Nemotron-PII training data requires attribution to NVIDIA.
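One way to validate on your own text is to label a small sample and compare exact-match span precision and recall at a few thresholds. The sketch below assumes gold and predicted spans are (start, end) character offsets. `precision_recall` and `sweep` are hypothetical helpers, and the numbers are invented, not measurements of this lens.

```python
def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of (start, end) spans."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 1.0
    recall = tp / len(gold) if gold else 1.0
    return precision, recall


def sweep(scored, gold, thresholds):
    """Return (threshold, precision, recall) rows, rounded to two places."""
    rows = []
    for t in thresholds:
        predicted = {span for span, score in scored if score >= t}
        p, r = precision_recall(predicted, gold)
        rows.append((t, round(p, 2), round(r, 2)))
    return rows


# Invented labelled sample: three gold name spans and four scored predictions.
gold = {(0, 10), (25, 37), (50, 58)}
scored = [
    ((0, 10), 0.91),
    ((25, 37), 0.47),
    ((40, 45), 0.55),  # false positive
    ((50, 58), 0.30),
]

results = sweep(scored, gold, [0.25, 0.50, 0.75])
for t, p, r in results:
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

Exact matching is strict; for redaction it may be more useful to count partially covered names separately, since a partial match still leaks part of the name.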

Use this lens with PhEye, Phileas, or Philter

PhEye loads this lens at configuration time and exposes it to Phileas and Philter automatically. Have questions about a specific deployment? Talk to the team.