Contact Center · Philterd

Contact Center Call Recording Transcripts

Strip cardholder data and PII from contact-center call transcripts — primarily PAN, CVV, SSN, account numbers — to reduce PCI DSS scope and meet QA privacy requirements.

v1.0.0 Updated 2026-05-18 Philter >=3.0.0 By Philterd
PCI DSS · contact center · call recording · transcripts · PCI scope reduction · QA

The policy

The full call-recording-transcripts.json file, identical to the downloadable version.

{
  "name": "call-recording-transcripts",
  "config": {
    "splitting": {
      "enabled": true,
      "threshold": 16000
    }
  },
  "ignored": [],
  "identifiers": {
    "creditCard": {
      "onlyValidCreditCards": true,
      "creditCardFilterStrategies": [
        {"strategy": "MASK", "maskCharacter": "*", "maskLength": "SAME", "leaveCharacters": 4}
      ]
    },
    "ssn": {
      "ssnFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "[REDACTED-SSN]"}
      ]
    },
    "phoneNumber": {
      "phoneNumberFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "[REDACTED-PHONE]"}
      ]
    },
    "emailAddress": {
      "emailAddressFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "[REDACTED-EMAIL]"}
      ]
    },
    "date": {
      "onlyValidDates": true,
      "dateFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "[REDACTED-DATE]", "conditions": "context == \"birth\""}
      ]
    },
    "address": {
      "addressFilterStrategies": [
        {"strategy": "REDACT", "redactionFormat": "[REDACTED-ADDRESS]"}
      ]
    },
    "identifiers": [
      {
        "id": "cvv-spoken",
        "pattern": "\\b(?:CVV|CVC|CSC|security[\\s-]?code|three[\\s-]?digit[\\s-]?code|verification[\\s-]?code|card[\\s-]?code)[\\s:#]*\\d{3,4}\\b",
        "caseSensitive": false,
        "identifierFilterStrategies": [
          {"strategy": "REDACT", "redactionFormat": "[REDACTED-CVV]"}
        ]
      },
      {
        "id": "account-number-spoken",
        "pattern": "\\b(?:account|acct)[\\s:#]*(?:number|#)?[\\s:#]*\\d{6,}\\b",
        "caseSensitive": false,
        "identifierFilterStrategies": [
          {"strategy": "MASK", "maskCharacter": "*", "maskLength": "SAME", "leaveCharacters": 4}
        ]
      },
      {
        "id": "card-expiration-spoken",
        "pattern": "\\b(?:expires?|expiration|exp)[\\s:#]*(0[1-9]|1[0-2])[/\\-](\\d{2}|\\d{4})\\b",
        "caseSensitive": false,
        "identifierFilterStrategies": [
          {"strategy": "REDACT", "redactionFormat": "[REDACTED-EXP]"}
        ]
      }
    ]
  }
}
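
The three custom patterns are ordinary regular expressions, so they can be sanity-checked outside Philter before deployment. The sketch below is a minimal check using Python's re module. It assumes Python's regex dialect behaves the same as Philter's for these particular patterns, and the sample sentences and the helper name find_matches are made up for illustration.

```python
import re

# Patterns copied from the policy above, with the JSON escaping removed.
PATTERNS = {
    "cvv-spoken": r"\b(?:CVV|CVC|CSC|security[\s-]?code|three[\s-]?digit[\s-]?code|verification[\s-]?code|card[\s-]?code)[\s:#]*\d{3,4}\b",
    "account-number-spoken": r"\b(?:account|acct)[\s:#]*(?:number|#)?[\s:#]*\d{6,}\b",
    "card-expiration-spoken": r"\b(?:expires?|expiration|exp)[\s:#]*(0[1-9]|1[0-2])[/\-](\d{2}|\d{4})\b",
}

# "caseSensitive": false in the policy is approximated with re.IGNORECASE.
COMPILED = {name: re.compile(p, re.IGNORECASE) for name, p in PATTERNS.items()}


def find_matches(text: str) -> dict[str, list[str]]:
    """Return the full matched span(s) for each custom identifier."""
    return {name: [m.group(0) for m in rx.finditer(text)] for name, rx in COMPILED.items()}


# Hypothetical transcript fragment.
hits = find_matches("Security code 845, expires 12/27, acct # 00123456.")
print(hits)

# A filler word between the keyword and the digits is not covered by [\s:#]*,
# so this phrasing is not matched by the CVV pattern.
missed = find_matches("The security code is 845.")
print(missed["cvv-spoken"])
```

If your transcripts contain phrasings such as "security code is 845", the pattern would need an optional filler group; checking the patterns against real transcript samples is the safest way to find such gaps.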

Example

Input

Caller: My card number is 4532-1234-5678-9014, expires 12/27, security code 845, for account 9876543210. Phone is 555-867-5309 and SSN is 123-45-6789.

Output

Caller: My card number is ****-****-****-9014, [REDACTED-EXP], [REDACTED-CVV], for account ******3210. Phone is [REDACTED-PHONE] and SSN is [REDACTED-SSN].
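
The masked values in the output above keep the last four characters and leave separators in place. The following is a minimal sketch that reproduces that output shape for a single matched value. It is an illustration of the MASK settings (maskCharacter "*", maskLength "SAME", leaveCharacters 4), not Philter's implementation, and the function name mask_keep_last is hypothetical.

```python
def mask_keep_last(value: str, keep: int = 4, mask_char: str = "*") -> str:
    """Mask every letter or digit except the last `keep`, preserving separators."""
    alnum_count = sum(ch.isalnum() for ch in value)
    to_mask = max(alnum_count - keep, 0)
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)  # same length as the input: one mask char per character
            to_mask -= 1
        else:
            out.append(ch)  # separators and the trailing `keep` characters pass through
    return "".join(out)


print(mask_keep_last("4111-1111-1111-1111"))  # ****-****-****-1111
print(mask_keep_last("9876543210"))           # ******3210
```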

Entities this policy acts on

CREDIT_CARD · SSN · PHONE · EMAIL · DATE · ADDRESS · CVV · ACCOUNT · EXPIRATION

What this policy does

This policy is tuned for automated speech-to-text transcripts from contact-center platforms such as Genesys, NICE, Verint, Five9, Twilio Voice Intelligence, Amazon Transcribe Call Analytics, and Google Cloud Contact Center AI.

It differs from generic PCI policies by explicitly handling the patterns that appear when PII is spoken aloud:

  • PAN (card numbers) — masked to last 4 visible per PCI DSS Req 3.4. onlyValidCreditCards: true requires Luhn-valid sequences to avoid false positives on order IDs and reference numbers.
  • CVV / security code — fully redacted. Includes spoken variations: security code, three-digit code, verification code, card code — not just the literal CVV.
  • Expiration dates — fully redacted when preceded by expires, exp, or expiration (the spoken pattern; raw 12/27 standing alone is too risky to auto-detect).
  • Account numbers — masked to last 4 visible.
  • SSN — fully redacted.
  • DOB — redacted when context indicates a birth date (commonly spoken aloud during identity verification).
  • Phone, email, address — redacted (typically the customer’s own info, used for verification).
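
The Luhn requirement mentioned for PAN detection can be illustrated with a short checksum routine. This is a minimal sketch of the standard Luhn algorithm, not Philter's own code; the 13-digit minimum length and the function name luhn_valid are assumptions made for the example.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if len(digits) < 13:  # assumption: ignore sequences too short to be a PAN
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111-1111-1111-1111"))  # well-known test card number
print(luhn_valid("1234567890123456"))     # e.g. an order ID that happens to have 16 digits
```

A 16-digit order ID or reference number usually fails this check, which is why requiring Luhn validity cuts false positives.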

Customer name is intentionally preserved. QA, dispute resolution, and supervisor review typically need to know which customer the call was about. If you’re sharing transcripts externally (third-party analytics, ML training data), add personsName redaction.

When to use this

  • Pre-ingest redaction before transcripts land in Datadog, Splunk, Elastic, or any centralized logging / analytics platform. Keeping cardholder data out of those systems can take them out of PCI scope.
  • QA monitoring systems (NICE QM, Verint WFO, etc.) where supervisors review calls but shouldn’t have access to cardholder data.
  • Coaching and training corpora built from anonymized call transcripts.
  • Customer-service AI training (intent classification, summarization models fine-tuned on real calls).
  • Compliance audit responses where transcripts must be produced but PCI-sensitive content must be redacted first.

When to customize

  • Customer name handling. Preserved by default. For external sharing or ML training, add a personsName redaction rule.
  • Spoken-vs-written numbers. This policy operates on numeric strings (4532-1234...). Some transcripts include the spoken-out form (four five three two...). Modern STT engines mostly normalize spoken numerals to digits, but not all do — verify your transcript output. If you have raw spoken-form transcripts, add a number-normalization pre-processing step before Philter.
  • Account number format. Default \b(?:account|acct)[\s:#]*(?:number|#)?[\s:#]*\d{6,}\b is generic. Replace with your billing system’s actual format.
  • Address handling. Default fully redacts addresses, which is conservative. For internal QA where caller geographic region is operationally useful, swap to a REPLACE with [CITY, STATE] or similar.
  • Transcript speaker labels. Many STT pipelines produce labels like Agent: and Customer:. This policy doesn’t touch those, but if your transcripts use real names for agents, add personsName redaction or a custom rule.
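
For the spoken-vs-written case above, a number-normalization step could look like the following. This is a minimal sketch that assumes English transcripts with one word per digit; the function name normalize_spoken_digits and the three-word minimum are illustrative choices, and forms such as "double four" or "twenty-seven" are not handled.

```python
import re

# Assumption: the STT engine emits one English word per digit.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
_WORD = "|".join(DIGIT_WORDS)

# Only rewrite runs of three or more digit words, so ordinary phrases such as
# "one question" or "oh one moment" are left alone.
_RUN = re.compile(rf"\b(?:{_WORD})(?:[\s-]+(?:{_WORD})){{2,}}\b", re.IGNORECASE)


def normalize_spoken_digits(text: str) -> str:
    """Collapse runs of spoken digit words into a digit string."""
    def to_digits(match: re.Match) -> str:
        words = re.split(r"[\s-]+", match.group(0))
        return "".join(DIGIT_WORDS[word.lower()] for word in words)

    return _RUN.sub(to_digits, text)


print(normalize_spoken_digits("card number four five three two one two three four please"))
```

Run the normalizer before sending text to Philter so that the digit-based filters in this policy see numeric strings.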

Why this matters

Contact-center transcripts are one of the worst PCI scope-expansion hot spots. A typical contact center pipeline includes:

  • The call recording itself (audio file)
  • The auto-generated transcript
  • A summary or notes added by the agent (CRM, ticket system)
  • A copy in the QM platform for supervisor review
  • A copy in the data warehouse for analytics
  • Backups of all of the above

Without consistent redaction at the transcript-generation step, every one of those systems is in PCI scope. With it, the audio file may still be (depending on storage and access controls), but the downstream text systems can be de-scoped.

Real-world impact: removing 5 to 10 systems from PCI scope typically saves six figures per year in audit and remediation costs, and it removes the operational overhead of running each of those systems to PCI standards.

Compliance notes

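  • PCI DSS Requirement 3.4 (v3.2.1 numbering; Requirement 3.5.1 in v4.0) — PAN must be rendered unreadable wherever it is stored. Keeping only the last 4 digits visible is an accepted form of truncation under this requirement.
  • PCI DSS Requirement 3.2 (v3.2.1 numbering; Requirement 3.3.1 in v4.0) — Sensitive Authentication Data (CVV/CVC/CSC, full track data, PINs/PIN blocks) must not be stored after authorization. This policy fully redacts CVV references in transcripts.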
  • GLBA Safeguards Rule (16 CFR Part 314) may also require redaction of customer financial data beyond PCI scope. Pair this policy with glba-nppi-redaction.json for financial-services contact centers.
  • Call-recording laws (federal Wiretap Act, state two-party-consent laws like California’s Penal Code 632, Florida’s Chapter 934) are separate from PII redaction. This policy doesn’t address whether you should be recording in the first place — that’s a regulatory and customer-disclosure question your compliance team needs to handle.

Use this policy

Download and load into your running Philter instance:

# Download the policy
curl -O https://raw.githubusercontent.com/philterd/pii-redaction-policies/main/policies/philterd/contact-center/call-recording-transcripts.json

# Upload to your Philter instance
curl -X POST http://localhost:8080/api/policies \
     -H "Content-Type: application/json" \
     --data @call-recording-transcripts.json

# Redact text using the policy (the URL is quoted so the shell does not treat "?" as a glob)
curl "http://localhost:8080/api/filter?p=call-recording-transcripts" \
     --data "your text here" \
     -H "Content-Type: text/plain"
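
The same redaction call can be made from a pipeline step rather than the shell. The sketch below mirrors the curl command using only Python's standard library. It assumes the endpoint, query parameter, and content type shown above, and it assumes the response body is the redacted text; the function names build_filter_request and redact are hypothetical.

```python
import urllib.parse
import urllib.request


def build_filter_request(
    text: str,
    policy: str = "call-recording-transcripts",
    base_url: str = "http://localhost:8080",  # assumption: local Philter instance
) -> urllib.request.Request:
    """Build the POST request that mirrors the curl filter command."""
    query = urllib.parse.urlencode({"p": policy})
    return urllib.request.Request(
        url=f"{base_url}/api/filter?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def redact(text: str, **kwargs) -> str:
    """Send text to Philter and return the response body (assumed to be the redacted text)."""
    with urllib.request.urlopen(build_filter_request(text, **kwargs), timeout=10) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; redact() does.
request = build_filter_request("your text here")
print(request.get_method(), request.full_url)
```

Calling redact() on each transcript before it is written to a log or analytics sink keeps the unredacted text out of those systems.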
