Migration guide

Migrate from Google Cloud DLP to Philter

Most teams move from Google Cloud DLP to Philter when their costs cross the break-even on per-unit billing, when they need to support a non-GCP cloud, or when the multi-tenant DLP endpoint becomes a problem for their data-residency or BAA posture. This guide covers the concept mapping, the migration steps, and the cost math.


Why teams migrate

These are the reasons teams give for migrating off Cloud DLP, ordered by how often we hear them.

Per-unit pricing hits a ceiling

Cloud DLP charges per unit of content inspected. At enterprise volumes, the bill grows linearly with volume, while the cost of a per-instance Philter deployment stays flat. See the worked TCO example.

Multi-cloud or hybrid stack

Cloud DLP is GCP-only. A second cloud or on-premise workload means rebuilding the redaction layer. Philter runs on any cloud, on-premise, and air-gapped from the same container image.

Data residency requirements

Cloud DLP is a multi-tenant managed service. For HIPAA-regulated PHI, certain financial data, or contractual residency requirements, sending raw text to a multi-tenant Google endpoint is exactly the opposite of what you want. Philter runs inside your VPC.

Concept mapping

How Cloud DLP concepts translate to Philter equivalents. The mapping is mostly direct; Philter exposes more granular policy controls.

Google Cloud DLP	Philter	Notes
`InspectContent` API	`POST /api/filter`	Same intent (find PII in text), simpler response shape. Philter returns redacted text by default; entity-level findings are available with an additional parameter.
`DeidentifyContent` API	`POST /api/filter` with a policy that includes filter strategies	Cloud DLP separates inspection and de-identification. Philter combines them in a single call driven by the policy.
InfoTypes (PERSON_NAME, EMAIL_ADDRESS, US_SOCIAL_SECURITY_NUMBER, etc.)	Policy entity types (PERSON, EMAIL_ADDRESS, SSN, etc.)	Direct one-to-one mapping for most common infoTypes. Philter ships additional domain-specific entities out of the box.
Custom infoTypes (regex, dictionary, large custom dictionary)	Custom identifier definitions in policy JSON	Define your own regex, dictionaries, or identifier patterns directly in the policy file.
Likelihood threshold	Per-entity confidence threshold + severity	Philter exposes the same idea with finer control. Each entity type gets its own threshold, configurable per policy.
Primitive transformations (redact, mask, replace, crypto hash, date shift)	Filter strategies (mask, redact, encrypt, FPE, replace, abbreviate, pass through)	Philter supports the same transformation set plus format-preserving encryption and synthetic-value replacement.
Stored infoTypes (large dictionary lookup)	Dictionary files referenced by the policy	Same capability. Philter loads dictionaries from local files or object storage; no separate stored-infoType API to manage.
Cloud Storage / BigQuery scanning	Phinder discovery scanner	Philter's sister tool Phinder handles bulk discovery across storage backends, including S3, GCS, and Azure Blob.
DLP audit logs in Cloud Logging	Structured Philter logs (JSON to stdout)	Wire Philter logs into your existing logging stack (Cloud Logging, CloudWatch, Datadog, Splunk).
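
As a rough illustration of the infoType row above, the sketch below splits the infoTypes in a Cloud DLP inspect configuration into those with a known Philter equivalent and those that still need a decision. Only the three pairs named in the table are included. The helper name `translate_info_types` is hypothetical, and the output is a plain list of entity names rather than Philter's actual policy JSON.

```python
# Pairs taken from the concept-mapping table; extend with your own infoTypes.
INFOTYPE_TO_PHILTER = {
    "PERSON_NAME": "PERSON",
    "EMAIL_ADDRESS": "EMAIL_ADDRESS",
    "US_SOCIAL_SECURITY_NUMBER": "SSN",
}


def translate_info_types(inspect_config):
    """Return (mapped Philter entity names, unmapped Cloud DLP infoType names).

    `inspect_config` is assumed to be a dict shaped like a Cloud DLP
    inspect config: {"info_types": [{"name": "PERSON_NAME"}, ...]}.
    """
    mapped, unmapped = [], []
    for info_type in inspect_config.get("info_types", []):
        name = info_type["name"]
        if name in INFOTYPE_TO_PHILTER:
            mapped.append(INFOTYPE_TO_PHILTER[name])
        else:
            # No known equivalent yet: surface it instead of guessing.
            unmapped.append(name)
    return mapped, unmapped


example_config = {
    "info_types": [
        {"name": "PERSON_NAME"},
        {"name": "US_SOCIAL_SECURITY_NUMBER"},
        {"name": "PHONE_NUMBER"},  # deliberately absent from the mapping
    ]
}
mapped, unmapped = translate_info_types(example_config)
print("mapped:", mapped)
print("needs review:", unmapped)
```

Reporting unmapped infoTypes separately keeps the inventory honest: anything in that list needs a Philter entity or a custom identifier before cutover.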

Migration steps

A safe migration runs Philter in shadow mode against your existing Cloud DLP traffic, validates parity on a sample, and then cuts over. Most teams complete the migration in two to four weeks.
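
One possible shape for the shadow-mode wrapper is sketched below. It assumes you already have two callables, `primary` (your existing Cloud DLP call) and `shadow` (your Philter call), plus a `record` function that stores both results for later comparison. All three names are placeholders for your own code, not part of either product.

```python
import hashlib


def in_shadow_sample(request_id, percent):
    """Deterministically select roughly `percent`% of requests by hashing the ID."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def redact_with_shadow(text, request_id, primary, shadow, record, percent=5):
    """Serve the primary result; also run the shadow path for sampled requests."""
    result = primary(text)
    if in_shadow_sample(request_id, percent):
        try:
            shadow_result = shadow(text)
        except Exception as exc:
            # A shadow failure must never affect the caller; record it instead.
            shadow_result = exc
        record(request_id, result, shadow_result)
    return result
```

Hashing the request ID, rather than sampling at random, means the same request is consistently in or out of the sample, which makes reruns and debugging easier.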

Inventory your DLP integration points

List every place your code calls InspectContent or DeidentifyContent. Catalog the infoTypes checked, the transformations applied, the likelihood thresholds, and the volumes per integration point.

Deploy Philter alongside Cloud DLP

Deploy Philter from the Google Cloud Marketplace into your VPC. Build the initial policy by translating each infoType configuration into a Philter entity entry. No changes to application code yet.

Translate custom infoTypes to Philter custom identifiers

Each custom infoType (regex, dictionary, stored dictionary) becomes a custom identifier or dictionary entry in the Philter policy JSON. The translation is mostly mechanical; use the Redaction Policy Editor for the interactive version.

Run shadow mode

For a sample of production traffic, send the same text to both Cloud DLP and Philter. Diff the results. Tune Philter's policy until the parity is acceptable. Pay attention to confidence thresholds; Cloud DLP's likelihood levels are coarser than Philter's per-entity thresholds.

Cut over per integration point

Switch one integration point at a time from Cloud DLP to Philter. Monitor entity-type counts via Phield or your own metrics. Roll back instantly if anything looks off.

Decommission Cloud DLP access

Once all integrations are stable on Philter, remove the IAM permissions for the DLP API. The bill should drop to zero on the next invoice.
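
For the "diff the results" part of the shadow-mode step, a minimal comparison might look like the sketch below. It assumes you have already normalized both systems' findings into `(entity_type, start, end)` tuples that use Philter entity names and the same character offsets. That normalization is your own code and is not shown here.

```python
from collections import Counter


def diff_findings(dlp_findings, philter_findings):
    """Compare two collections of (entity_type, start, end) tuples for one document."""
    dlp, philter = set(dlp_findings), set(philter_findings)
    return {
        "matched": sorted(dlp & philter),
        "only_dlp": sorted(dlp - philter),        # candidates for Philter misses
        "only_philter": sorted(philter - dlp),    # extra detections to review
    }


def missed_by_type(diffs):
    """Count, per entity type, the findings that only Cloud DLP reported."""
    counts = Counter()
    for diff in diffs:
        counts.update(entity for entity, _, _ in diff["only_dlp"])
    return counts


dlp = [("PERSON", 0, 10), ("SSN", 20, 31)]
philter = [("PERSON", 0, 10), ("EMAIL_ADDRESS", 40, 58)]
result = diff_findings(dlp, philter)
print(result["only_dlp"])
print(missed_by_type([result]))
```

Exact span matching is deliberately strict. If the two systems draw entity boundaries differently, you may prefer an overlap-based match before tuning thresholds.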

Architecture changes

Cloud DLP's call path is application → GCP regional endpoint → multi-tenant DLP service → response. Philter's call path is application → Philter container in your VPC → response. The data never leaves your project's network. For HA deployments, run two or more Philter instances behind an internal load balancer; throughput scales horizontally.
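
To make the Philter call path concrete, here is a minimal sketch of a client that posts text to `POST /api/filter` through an internal load balancer. The hostname, port, and `text/plain` content type are assumptions for illustration; check them against your own deployment before relying on this.

```python
import urllib.request

# Hypothetical address of the internal load balancer in front of Philter.
PHILTER_BASE_URL = "https://philter.internal.example:8080"


def build_filter_request(text, base_url=PHILTER_BASE_URL):
    """Build (but do not send) a POST /api/filter request carrying raw text."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/filter",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},  # assumed content type
        method="POST",
    )


def redact(text, timeout=5.0):
    """Send the text to Philter inside the VPC and return the response body."""
    with urllib.request.urlopen(build_filter_request(text), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Because the client only knows the load balancer address, adding or removing Philter instances behind it requires no application change.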

Cost comparison

Cloud DLP charges per content unit inspected, with separate pricing for inspection and de-identification. At production volumes (1M+ documents per day at typical document sizes), the bill scales into the tens or hundreds of thousands of dollars per month. Philter on the Google Cloud Marketplace bills per instance-hour at $0.49/hr; an HA deployment of two instances costs approximately $720/month (2 × $0.49 × about 730 hours), flat regardless of volume. The break-even point typically lands around 5-10M units per month, depending on document size and which DLP operations you use. See the worked TCO comparison for the math.
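
The arithmetic behind those figures can be sketched as follows. The $0.49 hourly rate and the two-instance HA deployment come from this guide; the 730-hour month is an averaging assumption, and the per-unit DLP price is a placeholder you should replace with the rate on your own invoice, not a published Google price.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8,760 hours / 12


def philter_monthly_cost(instances, hourly_rate=0.49):
    """Flat monthly cost of running `instances` Philter instances."""
    return instances * hourly_rate * HOURS_PER_MONTH


def break_even_units(flat_monthly_cost, dlp_price_per_unit):
    """Monthly unit count at which per-unit billing equals the flat cost."""
    return flat_monthly_cost / dlp_price_per_unit


flat = philter_monthly_cost(instances=2)
print(f"Philter HA deployment: about ${flat:,.2f} per month")

# Placeholder per-unit price; substitute your effective Cloud DLP rate.
placeholder_price = 0.0001
units = break_even_units(flat, placeholder_price)
print(f"Break-even at about {units:,.0f} units per month")
```

With a 730-hour month the flat cost comes out slightly under the rounded $720 quoted above; a 31-day month pushes it slightly over.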

Common pitfalls

Treating likelihood levels as one-to-one. Cloud DLP returns likelihood as one of five discrete levels (VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY). Philter returns continuous confidence scores. Translate carefully: a Cloud DLP "LIKELY" threshold is roughly Philter's 0.7-0.85 range, but tune against your data.
Forgetting about scanning workloads. Cloud DLP is often used both for inline redaction and for bulk scanning (Cloud Storage, BigQuery). The inline workload goes to Philter; the scanning workload goes to Phinder. Plan both halves of the migration.
Skipping the per-entity strategy design. Cloud DLP's primitive transformations are simple. Philter's filter strategies are richer. Use the migration as an opportunity to design policies that fit downstream consumers, not just to reproduce the previous behavior.
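
For the likelihood pitfall, one way to seed per-entity thresholds before shadow-mode tuning is sketched below. Only the LIKELY value is anchored to this guide (the low end of the 0.7-0.85 range). The other numbers are evenly spaced placeholders invented for illustration, and the output is a plain dict rather than Philter's policy format.

```python
# Starting points only. LIKELY uses the low end of the guide's 0.7-0.85 range;
# every other value is an illustrative placeholder to be tuned on your data.
STARTING_THRESHOLDS = {
    "VERY_UNLIKELY": 0.10,
    "UNLIKELY": 0.30,
    "POSSIBLE": 0.50,
    "LIKELY": 0.70,
    "VERY_LIKELY": 0.90,
}


def starting_threshold(min_likelihood):
    """Return a first-guess confidence threshold for a Cloud DLP likelihood level."""
    try:
        return STARTING_THRESHOLDS[min_likelihood]
    except KeyError:
        raise ValueError(f"unknown likelihood level: {min_likelihood!r}") from None


def thresholds_for(entity_types, min_likelihood):
    """Give every entity type the same starting threshold, ready for per-entity tuning."""
    threshold = starting_threshold(min_likelihood)
    return {entity: threshold for entity in entity_types}


print(thresholds_for(["PERSON", "SSN"], "LIKELY"))
```

Starting each entity type at the same value and then adjusting them individually from shadow-mode diffs keeps the tuning history easy to follow.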

Plan the migration with the team that built Philter

A 30-minute call with Jeff covers your current setup, the migration path that fits your stack, and where the gotchas usually live. No sales pitch.
