Healthcare · Philterd

HIPAA Safe Harbor De-Identification

Remove all 18 HIPAA Safe Harbor identifiers from clinical text per 45 CFR 164.514(b)(2).

v1.0.0 · Updated 2026-05-18 · Philter >=3.0.0 · By Philterd
Tags: HIPAA · Safe Harbor · PHI · 45 CFR 164.514 · de-identification

The policy

The full hipaa-safe-harbor.json file, identical to the downloadable version.

{
  "name": "hipaa-safe-harbor",
  "config": {
    "splitting": {
      "enabled": false,
      "threshold": 4000
    }
  },
  "ignored": [],
  "ignoredPatterns": [],
  "identifiers": {
    "age": {
      "ageFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}", "conditions": "context == \"age\" > 89"}
      ]
    },
    "date": {
      "onlyValidDates": true,
      "dateFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "phoneNumber": {
      "phoneNumberFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "emailAddress": {
      "emailAddressFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "ssn": {
      "ssnFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "ipAddress": {
      "ipAddressFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "url": {
      "urlFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "zipCode": {
      "zipCodeFilterStrategies": [
        {"strategy": "TRUNCATE", "truncateDigits": 3}
      ]
    },
    "personsName": {
      "personsFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "hospital": {
      "hospitalFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "city": {
      "cityFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "county": {
      "countyFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
      ]
    },
    "identifiers": [
      {
        "id": "mrn",
        "pattern": "\\bMRN[\\s:#]*\\d{5,}\\b",
        "caseSensitive": false,
        "identifierFilterStrategies": [
          {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-MRN}}}"}
        ]
      },
      {
        "id": "account-number",
        "pattern": "\\bACCT[\\s:#]*\\d{6,}\\b",
        "caseSensitive": false,
        "identifierFilterStrategies": [
          {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-ACCOUNT}}}"}
        ]
      }
    ]
  }
}

Example

Input

Patient John Smith (DOB 1985-04-12, MRN 47291) was seen at Mercy Hospital on 2025-03-14. Call (555) 123-4567 with questions.

Output

Patient {{{REDACTED-name}}} (DOB {{{REDACTED-date}}}, {{{REDACTED-MRN}}}) was seen at {{{REDACTED-hospital}}} on {{{REDACTED-date}}}. Call {{{REDACTED-phone-number}}} with questions.
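
The two custom regex identifiers (mrn and account-number) can be tried outside Philter before deploying. This is a minimal sketch, assuming Python's re module treats these simple patterns the same way Philter's regex engine does (an assumption, not something this page confirms). It simulates only the two custom identifiers, not the built-in name, date, hospital, or phone filters, and the helper name redact_custom is illustrative.

```python
import re

# Patterns copied from the policy's custom identifiers (JSON escaping removed).
# "caseSensitive": false in the policy is mirrored here with re.IGNORECASE.
CUSTOM_IDENTIFIERS = [
    (re.compile(r"\bMRN[\s:#]*\d{5,}\b", re.IGNORECASE), "{{{REDACTED-MRN}}}"),
    (re.compile(r"\bACCT[\s:#]*\d{6,}\b", re.IGNORECASE), "{{{REDACTED-ACCOUNT}}}"),
]


def redact_custom(text: str) -> str:
    """Apply only the two custom regex identifiers; built-in filters are not simulated."""
    for pattern, replacement in CUSTOM_IDENTIFIERS:
        text = pattern.sub(replacement, text)
    return text


sample = "Patient seen today (MRN 47291, acct# 00123456)."
print(redact_custom(sample))
# Patient seen today ({{{REDACTED-MRN}}}, {{{REDACTED-ACCOUNT}}}).

# A four-digit MRN falls below the 5-digit minimum and is left untouched.
print(redact_custom("MRN 1234"))
```

A check like this is useful mainly for catching identifiers that are shorter than the pattern's minimum digit count or that use a different prefix.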

Entities this policy acts on

NAME · AGE · DATE · PHONE · EMAIL · SSN · MRN · ACCOUNT · URL · IP · ZIP · HOSPITAL · CITY · COUNTY

What this policy does

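Targets the 18 identifiers enumerated under the HIPAA Safe Harbor method (45 CFR 164.514(b)(2)). Some are redacted out of the box; others need per-deployment additions or fall outside text redaction, as noted per row: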

| # | Identifier | How this policy handles it |
|---|---|---|
| 1 | Names | personsName → REDACT |
| 2 | Geographic subdivisions smaller than a state | city, county → REDACT; zipCode → truncate to 3 digits |
| 3 | All elements of dates (except year) for dates directly related to an individual | date → REDACT (covers admission, discharge, birth, death) |
| 4 | Telephone numbers | phoneNumber → REDACT |
| 5 | Fax numbers | phoneNumber → REDACT (Philter classifies as phone) |
| 6 | Email addresses | emailAddress → REDACT |
| 7 | Social Security numbers | ssn → REDACT |
| 8 | Medical record numbers | custom mrn identifier → REDACT |
| 9 | Health plan beneficiary numbers | custom account-number identifier → REDACT (tune the regex for your plan’s format) |
| 10 | Account numbers | custom account-number identifier → REDACT |
| 11 | Certificate/license numbers | needs custom identifier per deployment |
| 12 | Vehicle identifiers and serial numbers | add vin filter if applicable |
| 13 | Device identifiers and serial numbers | add custom identifier per deployment |
| 14 | Web URLs | url → REDACT |
| 15 | IP addresses | ipAddress → REDACT |
| 16 | Biometric identifiers | out of scope for text redaction |
| 17 | Full-face photos | out of scope for text redaction |
| 18 | Any other unique identifying number, characteristic, or code | add custom identifiers per deployment |

The age filter triggers only on ages > 89 per Safe Harbor §164.514(b)(2)(i)(C) — ages 90 and above must be aggregated into a single category of “90 or older.”
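
The age threshold can be sketched as follows. This is a hypothetical helper that illustrates the aggregation the regulation describes; the policy itself redacts matching ages rather than relabeling them, and the function name and label string are illustrative only.

```python
def generalize_age(age: int) -> str:
    """Return the age unchanged up to 89; collapse 90 and above into one category."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return "90 or older" if age > 89 else str(age)


for age in (45, 89, 90, 104):
    print(age, "->", generalize_age(age))
```

The boundary sits between 89 and 90: an age of 89 passes through unchanged, while 90 is aggregated.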

When to customize

  • MRN format. The default regex matches MRN 47291, MRN: 47291, MRN# 47291 with 5+ digits. If your EHR uses a different prefix (e.g., HRN, PTID) or a fixed-width format, update the pattern field in the custom mrn identifier.
  • Account number format. Same caveat — the default \bACCT[\s:#]*\d{6,}\b is illustrative. Replace with your billing system’s actual account-number pattern.
  • Date treatment. Safe Harbor requires removing all elements of a date except the year for dates directly related to an individual, so the year itself may be kept unless it would reveal an age over 89. If you need year-only date generalization (e.g., to preserve temporal ordering for cohort analysis), switch the date strategy from REDACT to a custom replacement format.
  • Geographic granularity. ZIP codes are truncated to 3 digits. Safe Harbor allows the first 3 digits only if the area formed by all ZIP codes sharing those digits contains more than 20,000 people; otherwise the 3 digits must be changed to 000 (§164.514(b)(2)(i)(B)). If your dataset includes ZIPs from sparsely populated regions, replace those prefixes with 000.
  • Identifiers 11, 13, 18. Add custom regex identifiers for any deployment-specific codes (insurance member IDs, device serial numbers, internal patient identifiers).
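
Custom identifiers for a deployment can follow the shape of the existing mrn and account-number entries. This is a minimal sketch that builds such entries and checks their patterns before they go into the policy. The device-serial identifier, its SN prefix, and its digit count are hypothetical and must be replaced with your own formats; the helper name is illustrative; and Python's re is assumed to behave like Philter's regex engine for these simple patterns.

```python
import json
import re


def custom_identifier(identifier_id: str, pattern: str, label: str) -> dict:
    """Build an entry shaped like the policy's existing custom identifiers."""
    re.compile(pattern)  # fail early on an invalid regex
    return {
        "id": identifier_id,
        "pattern": pattern,
        "caseSensitive": False,
        "identifierFilterStrategies": [
            {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-" + label + "}}}"}
        ],
    }


# Hypothetical device serial format: "SN" followed by 8 or more digits.
device = custom_identifier("device-serial", r"\bSN[\s:#]*\d{8,}\b", "DEVICE")

# MRN variant that also accepts the alternate prefixes mentioned above (HRN, PTID).
mrn = custom_identifier("mrn", r"\b(?:MRN|HRN|PTID)[\s:#]*\d{5,}\b", "MRN")

# Quick checks against sample strings before editing the policy file.
assert re.search(mrn["pattern"], "ptid: 4729100", re.IGNORECASE)
assert re.search(device["pattern"], "SN# 1234", re.IGNORECASE) is None

# json.dumps doubles the backslashes, matching the escaping used in the policy file.
print(json.dumps([mrn, device], indent=2))
```

The printed entries can then be placed in the identifiers array nested inside the policy's identifiers object, alongside or in place of the defaults.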

Compliance notes

  • This policy implements the Safe Harbor method under 45 CFR 164.514(b)(2). The alternative — the Expert Determination method under 164.514(b)(1) — requires a qualified statistician’s opinion and is out of scope for this policy.
  • Safe Harbor compliance also requires that the covered entity have no actual knowledge that the residual information could identify an individual (164.514(b)(2)(ii)). Automated redaction does not satisfy that requirement on its own; a human reviewer or risk-assessment process is still needed for production deployments.
  • The output dataset is considered de-identified under HIPAA Safe Harbor only after all 18 identifiers are removed and the no-actual-knowledge condition is met.

Use this policy

Download and load into your running Philter instance:

# Download the policy
curl -O https://raw.githubusercontent.com/philterd/pii-redaction-policies/main/policies/philterd/healthcare/hipaa-safe-harbor.json

# Upload to your Philter instance
curl -X POST http://localhost:8080/api/policies \
     -H "Content-Type: application/json" \
     --data @hipaa-safe-harbor.json

# Redact text using the policy
curl "http://localhost:8080/api/filter?p=hipaa-safe-harbor" \
     --data "your text here" \
     -H "Content-Type: text/plain"
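
The same filter call can be made from application code. This is a minimal sketch using only the Python standard library. The endpoint path and the p parameter are taken from the curl commands above; the function name, the default base URL, and the assumption that the response body is the redacted text are illustrative and not confirmed by this page.

```python
import urllib.parse
import urllib.request


def build_filter_request(
    text: str,
    policy: str = "hipaa-safe-harbor",
    base_url: str = "http://localhost:8080",
) -> urllib.request.Request:
    """Build (but do not send) a POST to the filter endpoint shown above."""
    query = urllib.parse.urlencode({"p": policy})
    return urllib.request.Request(
        url=f"{base_url}/api/filter?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_filter_request("Patient John Smith, MRN 47291.")
print(request.get_method(), request.full_url)

# To send it against a running instance (assumes the body comes back as text):
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the URL and headers easy to inspect in tests without a running Philter instance.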
