What this policy does
Removes personal information (PI) and sensitive personal information (SPI) as defined by the California Consumer Privacy Act (CCPA), as amended by the California Privacy Rights Act (CPRA).
The CCPA defines personal information very broadly: information that identifies, relates to, or could reasonably be linked with a particular consumer or household (Cal. Civ. Code § 1798.140(v)). The CPRA added a sensitive personal information subcategory (§ 1798.140(ae)) covering SSNs, driver’s licence / state ID numbers, financial account access credentials, precise geolocation, and similar data.
This policy targets:
- Names — redacted (confidence-gated)
- Email, phone, postal address — redacted (identifiers under § 1798.140(v))
- SSNs, driver’s licence, passport, state ID — redacted (sensitive personal information)
- Birthdates — truncated to year only when context indicates a birth date
- Credit card numbers and account numbers — masked to last 4 visible (financial-account SPI)
- IP addresses, MAC addresses, URLs — redacted as unique / online identifiers
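The masking and truncation behaviours in the list above can be illustrated with a small Python sketch. This is not the policy engine's implementation; the function names, the ISO date format, the keyword list, and the 20-character context window are all assumptions chosen for illustration.

```python
import re

def mask_to_last4(number: str) -> str:
    """Replace every digit except the last four with '*', keeping separators."""
    total_digits = sum(ch.isdigit() for ch in number)
    seen = 0
    out = []
    for ch in number:
        if ch.isdigit():
            seen += 1
            # Only the final four digits stay visible.
            out.append(ch if seen > total_digits - 4 else "*")
        else:
            out.append(ch)
    return "".join(out)

# Hypothetical context gate: keywords that suggest a nearby date is a birth date.
BIRTH_CONTEXT = re.compile(
    r"\b(?:born|birth\s*date|date\s+of\s+birth|dob)\b", re.IGNORECASE
)
# Assumption: dates appear in ISO form (YYYY-MM-DD).
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

def truncate_birthdates(text: str, window: int = 20) -> str:
    """Reduce a date to its year only when a birth keyword precedes it."""
    def repl(match: re.Match) -> str:
        preceding = text[max(0, match.start() - window):match.start()]
        if BIRTH_CONTEXT.search(preceding):
            return match.group(1)  # keep the year only
        return match.group(0)      # leave other dates untouched
    return ISO_DATE.sub(repl, text)

print(mask_to_last4("4111 1111 1111 1111"))
print(truncate_birthdates("DOB: 1984-03-17. Invoice issued on 2023-05-02."))
```

The context gate is what keeps invoice dates, timestamps, and similar non-birth dates intact; a wider window catches more birth dates but also raises the chance of truncating an unrelated date.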
When to use this
- Fulfilling a consumer “right to know” / access request where another consumer’s or household’s PI must be redacted from the disclosed records
- Sharing consumer data with a service provider or contractor under a CCPA-compliant contract that limits use to business purposes
- De-identifying data so it falls outside CCPA scope (§ 1798.140(m)) before analytics or model training
- Data-broker and advertising pipelines where opt-out / “Do Not Sell or Share” signals require stripping identifiers
- Internal analytics across teams where full consumer identity isn’t needed
When to customize
- Household linkage. CCPA covers “household” data. If your records can be re-linked to a household through combinations of non-redacted fields (e.g. address + device), evaluate those combinations — field-level redaction alone may not de-identify.
- Precise geolocation. CPRA treats precise geolocation as SPI. This policy does not detect lat/long or GPS coordinates by default; add a custom identifier if your data contains them.
- State / consumer ID formats. The default `state-id` pattern is keyword-anchored. Replace it with the exact California DL/ID format if you only process California records.
- Name confidence threshold. Default redacts names above confidence 70. Adjust for precision vs recall.
- De-identification standard. To rely on the CCPA’s de-identified-data exemption you must also commit to not re-identifying and implement safeguards (§ 1798.140(m)); redaction is a necessary but not sufficient step.
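For the precise-geolocation item above, a custom identifier could start from a detector like the following Python sketch. It is standalone and does not use this policy's configuration schema; the `[GEO]` placeholder, the function names, and the rule of requiring at least four decimal places as a proxy for "precise" are illustrative assumptions, not a legal test.

```python
import re

# Decimal-degree pair such as "37.774929, -122.419416".
# Assumption: at least four decimal places signals a precise coordinate.
LATLON = re.compile(
    r"(?<![\d.])([+-]?\d{1,2}\.\d{4,})\s*,\s*([+-]?\d{1,3}\.\d{4,})(?!\d)"
)

def find_coordinates(text: str) -> list[tuple[int, int]]:
    """Return (start, end) spans of plausible latitude/longitude pairs."""
    spans = []
    for match in LATLON.finditer(text):
        lat, lon = float(match.group(1)), float(match.group(2))
        # Range check discards number pairs that cannot be coordinates.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            spans.append((match.start(), match.end()))
    return spans

def redact_coordinates(text: str, placeholder: str = "[GEO]") -> str:
    """Replace each detected coordinate pair with a placeholder."""
    out, last = [], 0
    for start, end in find_coordinates(text):
        out.append(text[last:start])
        out.append(placeholder)
        last = end
    out.append(text[last:])
    return "".join(out)

print(redact_coordinates("Pickup at 37.774929, -122.419416 today."))
```

A pattern like this will also match unrelated pairs of high-precision decimals, and it misses degrees/minutes/seconds notation and coordinates stored in separate fields, so it is best treated as a first pass to tune against real data.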
CCPA/CPRA vs GDPR
Both are broad consumer/data-subject privacy regimes, and the types of identifier they cover overlap heavily. Key differences for redaction purposes:
| | CCPA / CPRA | GDPR |
|---|---|---|
| Jurisdiction | California residents | EU/EEA data subjects |
| Unit of protection | Consumer and household | Natural person |
| Sensitive subcategory | “Sensitive personal information” (§ 1798.140(ae)) | “Special categories” (Art. 9) |
| Companion policy here | (this policy) | gdpr-personal-data.json |
If you operate in both regimes, the GDPR policy adds health/special-category detection; the two can be stacked on the same document.
Compliance notes
- Cal. Civ. Code § 1798.100 et seq. — the CCPA, effective 1 January 2020
- California Privacy Rights Act (CPRA) — amended the CCPA; most provisions effective 1 January 2023, enforced by the California Privacy Protection Agency (CPPA)
- § 1798.140(v) — definition of personal information
- § 1798.140(ae) — definition of sensitive personal information
- § 1798.140(m) — standard for de-identified data (outside CCPA scope)
- This policy is a baseline starting point, not legal advice or a de-identification certification. Assess re-identification and household-linkage risk for your specific dataset.
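One rough way to begin the household-linkage assessment is to count how many records share each combination of fields that survive redaction. The Python sketch below uses a k-anonymity-style count, which is a common heuristic and not something the CCPA prescribes; the field names, sample records, and the threshold `k` are hypothetical.

```python
from collections import Counter

def small_groups(records, quasi_identifiers, k=2):
    """Return field-value combinations shared by fewer than k records.

    Combinations that appear only once (or fewer than k times) are the
    ones most likely to be re-linked to a single consumer or household.
    """
    counts = Counter(
        tuple(record.get(field) for field in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}

# Hypothetical post-redaction records: names and emails are already removed,
# but ZIP code and device model remain.
records = [
    {"zip": "94103", "device": "A"},
    {"zip": "94103", "device": "A"},
    {"zip": "94110", "device": "B"},
]

print(small_groups(records, ["zip", "device"]))
```

Any combination this reports is a candidate for coarsening (for example, truncating the ZIP code) or for removing one of the fields before the data is shared.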