Migration guide
Most teams move from AWS Comprehend to Philter when their volume crosses the break-even point on per-character pricing, when they need to support a second cloud, or when their data-residency posture rules out a multi-tenant API. This guide covers the concept mapping, the migration steps, and the cost math.
Teams give three reasons for migrating off Comprehend, listed here in roughly the order we hear them.
Comprehend's per-100-character billing is cheap at low volume and expensive at high volume. At around 100M units per month, Comprehend costs roughly $10,000/month at $0.0001 per unit, while an HA Philter deployment costs about $720/month, roughly 14× less. See the worked TCO example.
Comprehend is AWS-only. A move to multi-cloud (or a customer-mandated GCP/Azure deployment) means rebuilding the redaction layer or running a parallel solution. Philter runs on any cloud, on premises, or air-gapped, all from the same container image.
BAA or contractual residency requirements that started at "data stays in our AWS account" often evolve into "data stays in our VPC." Comprehend's multi-tenant endpoint sits outside the VPC; Philter sits inside it.
How Comprehend concepts translate to Philter equivalents. Most concepts have direct analogs; a few open up new capabilities Comprehend does not have.
| AWS Comprehend | Philter | Notes |
|---|---|---|
| DetectPiiEntities API call | POST /api/filter | Same intent (find PII in text), different response shape. Philter returns redacted text by default; the entity list is available with an additional parameter. |
| Built-in entity types (NAME, ADDRESS, SSN, etc.) | Default policy entities + custom entities | Philter ships analogous defaults plus a full policy engine: dictionaries, regex, identifier patterns, severity thresholds. |
| Confidence threshold tuning | Per-entity confidence + severity in policy | Philter exposes finer control: confidence and severity are configurable per entity type, with different thresholds for different policies. |
| RedactionConfig (mask / replace) | Filter strategies (mask, redact, encrypt, FPE, replace, abbreviate, pass through) | Philter supports more strategies, including format-preserving encryption and synthetic-value replacement. |
| Async StartPiiEntitiesDetectionJob | Direct API calls or queue worker pattern | Philter's API is synchronous per request, and a single instance saturates at high throughput. For batch jobs, run multiple instances behind a queue. |
| IAM role + Comprehend service permission | Network policy + Philter authentication | Access control moves from IAM to network and (optionally) API key authentication on the Philter instance. |
| CloudWatch metrics | Prometheus + structured logs | Philter exposes Prometheus metrics and structured JSON logs. Wire them into your existing observability stack. |
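To make the first row of the table concrete, here is a minimal sketch of building a `POST /api/filter` request with the Python standard library. The host name, the `text/plain` content type, and any query parameters are assumptions for illustration; the guide does not specify them, so check them against the documentation for your Philter version. The function only builds the request and does not send it.

```python
from typing import Dict, Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_filter_request(
    base_url: str, text: str, params: Optional[Dict[str, str]] = None
) -> Request:
    """Build (but do not send) a POST /api/filter request carrying raw text.

    `params` is passed through untouched: query parameter names are left to
    the caller because this guide does not specify them.
    """
    url = base_url.rstrip("/") + "/api/filter"
    if params:
        url += "?" + urlencode(params)
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},  # assumed content type
        method="POST",
    )


# Hypothetical internal host name for a Philter instance inside the VPC.
req = build_filter_request("https://philter.internal:8080", "Call Jane Doe at 555-0100.")
print(req.get_method(), req.full_url)
```

Sending the request is then a matter of `urllib.request.urlopen(req)` (or the HTTP client you already use); how you parse the response depends on whether you ask for redacted text or the entity list.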
A safe migration runs Philter in shadow mode against your existing Comprehend traffic, validates parity on a sample, and then cuts over. Most teams complete the migration in two to four weeks.
List every place your code calls DetectPiiEntities or StartPiiEntitiesDetectionJob. Catalog the policies (which entity types are checked, which redaction transforms are applied) and the volumes per integration point.
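One way to start the inventory is a plain text search for the two API names. The sketch below looks for both the PascalCase operation names and the snake_case method names used by boto3; the file paths and snippets are hypothetical, and a real run would read files from your repositories instead of an in-memory dict.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# PascalCase operation names plus the boto3 snake_case method names.
CALL_PATTERN = re.compile(
    r"\b(DetectPiiEntities|StartPiiEntitiesDetectionJob|"
    r"detect_pii_entities|start_pii_entities_detection_job)\b"
)


def find_comprehend_calls(sources: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (path, line_number, api_name) for every matching call site."""
    hits = []
    for path, text in sorted(sources.items()):
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in CALL_PATTERN.finditer(line):
                hits.append((path, lineno, match.group(1)))
    return hits


# Hypothetical sources standing in for files read from disk.
sources = {
    "billing/redact.py": "resp = client.detect_pii_entities(Text=t, LanguageCode='en')\n",
    "etl/batch.py": "import boto3\n\njob = client.start_pii_entities_detection_job(**cfg)\n",
}
hits = find_comprehend_calls(sources)
calls_per_file = Counter(path for path, _, _ in hits)
for path, lineno, api in hits:
    print(f"{path}:{lineno}: {api}")
```

A text search will miss calls made through wrappers or dynamic dispatch, so treat the output as a starting list and confirm it against CloudTrail or your own request logs.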
Deploy Philter from the AWS Marketplace into your VPC. Configure one policy per integration point, mapped from the Comprehend configuration. No changes to application code yet.
For a sample of production traffic, send the same text to both Comprehend and Philter. Diff the results. Tune Philter's policy to close any meaningful gaps (typically: a custom regex for an internal identifier Comprehend's built-ins don't cover).
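The diff step can be sketched as a set comparison over entity spans. The Comprehend side uses the `Type`, `Score`, `BeginOffset`, and `EndOffset` fields of the `DetectPiiEntities` response. The Philter side is assumed to have been normalized already into `(type, begin, end)` tuples; that shape is hypothetical, as is everything in the type map beyond the NAME to PERSON pairing.

```python
from typing import Dict, Iterable, List, Set, Tuple

Span = Tuple[str, int, int]  # (entity_type, begin_offset, end_offset)

# Maps Comprehend type names onto the names used on the Philter side.
# Extend this for the entity types in your own policies.
TYPE_MAP = {"NAME": "PERSON"}


def comprehend_spans(entities: Iterable[dict], min_score: float = 0.0) -> Set[Span]:
    """Normalize Comprehend entities into comparable spans."""
    return {
        (TYPE_MAP.get(e["Type"], e["Type"]), e["BeginOffset"], e["EndOffset"])
        for e in entities
        if e["Score"] >= min_score
    }


def diff_spans(comprehend: Set[Span], philter: Set[Span]) -> Dict[str, List[Span]]:
    """Split spans into those both services found and those only one found."""
    return {
        "agreed": sorted(comprehend & philter),
        "only_comprehend": sorted(comprehend - philter),
        "only_philter": sorted(philter - comprehend),
    }


text = "Call Jane Doe, SSN 123-45-6789, ref ACCT-0042."
comprehend_entities = [  # illustrative values, not real service output
    {"Type": "NAME", "Score": 0.99, "BeginOffset": 5, "EndOffset": 13},
    {"Type": "SSN", "Score": 0.98, "BeginOffset": 19, "EndOffset": 30},
]
philter_spans = {("PERSON", 5, 9), ("SSN", 19, 30)}  # hypothetical normalized output

report = diff_spans(comprehend_spans(comprehend_entities), philter_spans)
for bucket, spans in report.items():
    print(bucket, [(t, text[b:e]) for t, b, e in spans])
```

Exact-match comparison is deliberately strict: a boundary difference on the same name shows up in both "only" buckets, which is usually what you want to review by hand during tuning.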
Switch one integration point at a time from Comprehend to Philter. Monitor entity-type counts via Phield or your own metrics. Roll back instantly if anything looks off.
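A per-integration-point switch keeps cutover and rollback to a single flag flip. The sketch below is one possible shape, not a Philter feature: `RedactionRouter` and the integration point names are hypothetical, and the two backends are stand-in functions where your real Comprehend and Philter clients would go.

```python
from typing import Callable, Dict

Redactor = Callable[[str], str]


class RedactionRouter:
    """Routes each integration point to Comprehend or Philter, one flag per point."""

    def __init__(self, comprehend: Redactor, philter: Redactor) -> None:
        self._backends: Dict[str, Redactor] = {"comprehend": comprehend, "philter": philter}
        self._active: Dict[str, str] = {}  # integration point -> backend name

    def cut_over(self, point: str) -> None:
        self._active[point] = "philter"

    def roll_back(self, point: str) -> None:
        self._active[point] = "comprehend"

    def redact(self, point: str, text: str) -> str:
        # Points that have not been cut over keep using Comprehend.
        backend = self._active.get(point, "comprehend")
        return self._backends[backend](text)


# Stand-in backends that tag their output so the routing is visible.
router = RedactionRouter(comprehend=lambda t: "C:" + t, philter=lambda t: "P:" + t)
router.cut_over("support-tickets")
print(router.redact("support-tickets", "hello"))  # routed to Philter
print(router.redact("chat-logs", "hello"))        # still on Comprehend
router.roll_back("support-tickets")
```

In production the flag state would live in whatever configuration store you already use, so that a rollback does not require a deploy.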
Once all integration points are on Philter and stable, remove the IAM permissions for Comprehend PII detection. Comprehend PII detection charges should drop to zero on the next full invoice.
Comprehend's call path is application → AWS regional endpoint → multi-tenant Comprehend → response. Philter's call path is application → Philter container in your VPC → response. The data never leaves your account. For high-availability deployments, run two or more Philter instances behind an internal load balancer; throughput scales horizontally.
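For batch workloads that previously used an asynchronous Comprehend job, a queue of documents drained by several workers is one way to keep multiple Philter instances busy. The sketch below uses an in-process queue and threads as a stand-in for a real message queue; `run_batch` is a hypothetical helper, and `redact` is whatever function makes the synchronous call to your load-balanced Philter endpoint.

```python
import queue
import threading
from typing import Callable, List, Tuple


def run_batch(docs: List[str], redact: Callable[[str], str], workers: int = 4) -> List[str]:
    """Redact documents concurrently; results come back in input order."""
    tasks: "queue.Queue[Tuple[int, str]]" = queue.Queue()
    for item in enumerate(docs):
        tasks.put(item)
    results: List[str] = [""] * len(docs)

    def worker() -> None:
        while True:
            try:
                index, text = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            # One synchronous call per document. No retry or error handling
            # here: an exception ends this worker and leaves the slot empty.
            results[index] = redact(text)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Stand-in redactor; replace with a call to your Philter endpoint.
redacted = run_batch(["alpha", "beta", "gamma"], redact=lambda s: s.upper(), workers=2)
print(redacted)
```

Because throughput scales with the number of instances behind the load balancer, the worker count is the knob to tune against instance CPU rather than a fixed value.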
Comprehend bills per 100-character unit at $0.0001/unit on the standard tier. Philter on the AWS Marketplace bills per instance-hour at $0.49/hr. For a workload processing 5M documents per day at ~600 characters each (~3B characters/day, or ~90B characters and ~900M Comprehend units over a 30-day month), Comprehend bills approximately $90,000/month at that rate; an HA Philter deployment (two t3.xlarge instances at $0.49/hr each) bills approximately $720/month, roughly 125× less. The break-even sits around 5-10M Comprehend units/month depending on document size.
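The arithmetic can be worked through from the two quoted rates. This sketch assumes a flat $0.0001 per unit with no volume tiers or per-request minimums, rounds each document up to whole 100-character units, and uses a 30-day month of 730 instance-hours; all four are simplifying assumptions, so treat the output as an estimate.

```python
COMPREHEND_RATE_PER_UNIT = 0.0001  # USD per 100-character unit (rate quoted above)
PHILTER_RATE_PER_HOUR = 0.49       # USD per instance-hour (rate quoted above)
DAYS_PER_MONTH = 30                # assumption
HOURS_PER_MONTH = 730              # assumption: average month


def units_per_document(chars: int) -> int:
    """Characters rounded up to whole 100-character units (assumed rounding)."""
    return -(-chars // 100)


def comprehend_monthly_cost(docs_per_day: int, chars_per_doc: int) -> float:
    units = docs_per_day * DAYS_PER_MONTH * units_per_document(chars_per_doc)
    return units * COMPREHEND_RATE_PER_UNIT


def philter_monthly_cost(instances: int = 2) -> float:
    return instances * PHILTER_RATE_PER_HOUR * HOURS_PER_MONTH


def break_even_units(instances: int = 2) -> float:
    """Monthly Comprehend units at which both bills are equal."""
    return philter_monthly_cost(instances) / COMPREHEND_RATE_PER_UNIT


comprehend = comprehend_monthly_cost(docs_per_day=5_000_000, chars_per_doc=600)
philter = philter_monthly_cost(instances=2)
print(f"Comprehend: ${comprehend:,.0f}/month")           # Comprehend: $90,000/month
print(f"Philter HA: ${philter:,.0f}/month")              # Philter HA: $715/month
print(f"Ratio: {comprehend / philter:.0f}x")             # Ratio: 126x
print(f"Break-even: {break_even_units():,.0f} units")    # Break-even: 7,154,000 units
```

The Philter figure covers only the quoted hourly rate for two instances; add any infrastructure charges that rate does not include when comparing against your own invoice.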
Comprehend's NAME entity covers persons; Philter's PERSON entity does the same but with slightly different boundary detection. Run shadow mode to catch the cases where the difference matters for your text.

A 30-minute call with Jeff covers your current setup, the migration path that fits your stack, and where the gotchas usually live. No sales pitch.