
General · Philterd

French Identifiers (NIR, SIREN, SIRET)

Redact French social-security numbers (NIR) and business identifiers (SIREN, SIRET), validated by their control key or Luhn check.

v1.0.0 · Updated 2026-06-15 · Philter · Requires Phileas 4.1.0+ (redaction policy schema 1.1.0 with the mod97 and luhn validators) · By Philterd
Tags: France, NIR, INSEE, SIREN, SIRET, social security, business ID

The policy

The full french-identifiers.json file, identical to the downloadable copy.

{
  "config": {
    "splitting": {
      "enabled": false,
      "threshold": 4000
    }
  },
  "ignored": [],
  "identifiers": {
    "identifiers": [
      {
        "classification": "french-nir",
        "pattern": "\\b[12]\\d{4}(?:\\d{2}|2[AB])\\d{8}\\b",
        "caseSensitive": false,
        "validator": {
          "name": "mod97",
          "params": {
            "variant": "nir"
          }
        },
        "identifierFilterStrategies": [
          {
            "strategy": "REDACT",
            "redactionFormat": "[REDACTED-FRENCH-NIR]"
          }
        ]
      },
      {
        "classification": "french-siret",
        "pattern": "\\b\\d{14}\\b",
        "caseSensitive": false,
        "validator": "luhn",
        "identifierFilterStrategies": [
          {
            "strategy": "REDACT",
            "redactionFormat": "[REDACTED-FRENCH-SIRET]"
          }
        ]
      },
      {
        "classification": "french-siren",
        "pattern": "\\b\\d{9}\\b",
        "caseSensitive": false,
        "validator": "luhn",
        "identifierFilterStrategies": [
          {
            "strategy": "REDACT",
            "redactionFormat": "[REDACTED-FRENCH-SIREN]"
          }
        ]
      }
    ]
  }
}

Example

Input

NIR 255081416802538, SIREN 732829320, SIRET 73282932000074.

Output

NIR [REDACTED-FRENCH-NIR], SIREN [REDACTED-FRENCH-SIREN], SIRET [REDACTED-FRENCH-SIRET].

Entities this policy acts on

FRENCH_NIR, FRENCH_SIREN, FRENCH_SIRET

What this policy does

Detects and redacts three French identifiers using Phileas’s generic identifier filter with a validator, so each match is kept only if its check passes:

  • NIR (the INSEE / social-security number, “numéro de sécurité sociale”): a 13-character body plus a two-digit control key, validated by the mod97 validator (nir variant). The Corsica department codes 2A and 2B are replaced with 19 and 18, respectively, before the key is checked.
  • SIREN: the 9-digit business registration number, validated by the luhn validator.
  • SIRET: the 14-digit establishment number (SIREN plus a 5-digit NIC), validated by the luhn validator.

Each is replaced with a distinct token.
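
The NIR control-key check described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not Phileas's implementation: the function name `nir_key_is_valid` is hypothetical, and the sketch assumes the key equals 97 minus the remainder of the 13-digit body divided by 97, which is consistent with the test vectors on this page.

```python
import re

# Shape check mirroring the policy's NIR pattern (case-insensitive,
# like "caseSensitive": false in the policy).
NIR_SHAPE = re.compile(r"[12]\d{4}(?:\d{2}|2[AB])\d{8}", re.IGNORECASE)


def nir_key_is_valid(nir: str) -> bool:
    """Return True if a 15-character NIR carries a correct control key.

    Hypothetical helper for illustration; not part of Phileas.
    """
    if not NIR_SHAPE.fullmatch(nir):
        return False
    body, key = nir[:13].upper(), int(nir[13:])
    # Corsica: substitute 2A -> 19 and 2B -> 18 so the body is numeric.
    if body[5:7] == "2A":
        body = body[:5] + "19" + body[7:]
    elif body[5:7] == "2B":
        body = body[:5] + "18" + body[7:]
    # Assumed rule: key = 97 - (body mod 97).
    return key == 97 - int(body) % 97


print(nir_key_is_valid("255081416802538"))  # True
print(nir_key_is_valid("220032A00801642"))  # True (Corsica, 2A -> 19)
print(nir_key_is_valid("255081416802539"))  # False (bad control key)
```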

Why the validators matter

A bare 9- or 14-digit pattern would over-match ordinary numbers. The validator keeps a match only if the check passes, so 732829321 (SIREN-shaped, but failing the Luhn check) is left in place while 732829320 is redacted. A checksum narrows matches but does not prove that a number is a real identifier: about one in ten random digit strings passes the Luhn check by chance, so validate against your own documents.
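
To make the filtering effect concrete, here is a minimal Python sketch of a standard Luhn check applied to the SIREN and SIRET values used on this page. The function name `luhn_is_valid` is hypothetical, and the code illustrates the checksum itself rather than how Phileas implements its luhn validator.

```python
def luhn_is_valid(digits: str) -> bool:
    """Return True if an all-digit string passes the Luhn checksum.

    Hypothetical helper for illustration; not part of Phileas.
    """
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit and
    # subtracting 9 when the doubled value exceeds 9.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(luhn_is_valid("732829320"))       # True  (valid SIREN)
print(luhn_is_valid("732829321"))       # False (bad checksum)
print(luhn_is_valid("73282932000074"))  # True  (valid SIRET)

# Of ten SIREN-shaped strings sharing the same first eight digits,
# exactly one passes, so the check rejects most over-matches.
print(sum(luhn_is_valid(str(n)) for n in range(732829320, 732829330)))  # 1
```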

Test vectors

  • NIR, valid: 255081416802538 (and Corsica 220032A00801642). Invalid (bad control key): 255081416802539.
  • SIREN, valid: 732829320. Invalid (bad checksum): 732829321.
  • SIRET, valid: 73282932000074. Invalid (bad checksum): 73282932000075.

Contextual cues

In free text, anchor on a nearby cue (“SIREN”, “SIRET”, “n° de sécurité sociale”) and capture only the identifier with groupNumber:

{
  "classification": "french-siren",
  "pattern": "SIREN[\\s:#-]*(\\d{9})",
  "caseSensitive": false,
  "groupNumber": 1,
  "validator": "luhn",
  "identifierFilterStrategies": [
    { "strategy": "REDACT", "redactionFormat": "[REDACTED-FRENCH-SIREN]" }
  ]
}

This trades recall for precision: identifiers that appear without a cue are no longer matched.
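
The effect of anchoring on a cue and redacting only the captured group can be approximated with Python's `re` module. This is a sketch under assumptions: the function name `redact_cued_siren` is hypothetical, the Luhn validation step is omitted for brevity, and the way Phileas applies groupNumber internally may differ.

```python
import re

# Same pattern as the contextual policy entry, case-insensitive.
CUE_PATTERN = re.compile(r"SIREN[\s:#-]*(\d{9})", re.IGNORECASE)
TOKEN = "[REDACTED-FRENCH-SIREN]"


def redact_cued_siren(text: str) -> str:
    """Replace only capture group 1, leaving the cue itself in place.

    Hypothetical helper for illustration; not part of Phileas.
    """
    def _replace(match: re.Match) -> str:
        whole = match.group(0)
        # Convert the group's absolute span to offsets inside the match.
        start = match.start(1) - match.start()
        end = match.end(1) - match.start()
        return whole[:start] + TOKEN + whole[end:]

    return CUE_PATTERN.sub(_replace, text)


print(redact_cued_siren("SIREN: 732829320, ref 732829320"))
# SIREN: [REDACTED-FRENCH-SIREN], ref 732829320
```

The second number is left untouched because no cue precedes it, which is the recall cost of this approach.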

Prerequisites

Use Phileas 4.1.0 or later, which provides redaction policy schema 1.1.0 and the mod97 and luhn validators. The example input and output were verified against Phileas 4.1.0.

Use this policy

Download the policy and load it into your running Philter instance:

# Download the policy
curl -O https://raw.githubusercontent.com/philterd/pii-redaction-policies/main/policies/philterd/general/french-identifiers.json

# Upload to your Philter instance
curl -X POST http://localhost:8080/api/policies \
     -H "Content-Type: application/json" \
     --data @french-identifiers.json

# Redact text using the policy
curl "http://localhost:8080/api/filter?p=french-identifiers" \
     --data "your text here" \
     -H "Content-Type: text/plain"
