The query language for PII operations
PhiSQL is a declarative query language for PII privacy operations across the Philterd toolkit. Write a few readable lines instead of hand-editing JSON, and PhiSQL compiles them to the same Phileas policy schema that Philter and Phileas already run. One language for redaction rules, version-controlled and reviewable like any other code.
REDACT SSN WITH MASK says exactly what it does. Policies become something a compliance reviewer can read in a pull request, not a wall of nested JSON.
The same policy drives detection and redaction in Philter and Phileas. Author once in PhiSQL; run everywhere the JSON policy runs.
Express HIPAA Safe Harbor, PCI DSS scope reduction, or court-filing redaction as a short, auditable policy. The shipped examples mirror the rules auditors cite.
.phisql files diff cleanly in Git. Policy changes go through the same review and CI pipeline as the rest of your code.
PhiSQL compiles to the Phileas JSON your stack already executes. Adopt it for authoring without changing anything downstream.
The specification, grammar, reference implementation, and examples are all open source under the permissive Apache 2.0 license.
PhiSQL is defined as an open specification, with a Java reference implementation that demonstrates it in working code. Both live in a single Apache 2.0 repository.
The specification is versioned under spec/v0.1/. It includes an ANTLR4 grammar and EBNF; a catalog of entity types, strategies, keywords, and predicates; and worked examples that pair each .phisql file with the JSON it compiles to.
The reference implementation is a Java parser and compiler published to Maven as ai.philterd:phisql. Build it with mvn verify in the reference/ directory, or pull it in as a dependency.
PhiSQL never adds capabilities the JSON schema does not already have. Anything you express in PhiSQL maps cleanly to a Phileas policy, so there is no lock-in and no second source of truth.
PhiSQL is an authoring layer, not a new runtime. Your .phisql source compiles to a standard Phileas JSON policy, which Philter, Phileas, and the rest of the toolkit already execute. The JSON schema stays the source of truth: Phileas JSON leads, PhiSQL follows.
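To make the "authoring layer" idea concrete, here is a minimal Python sketch of a source-to-JSON translation for a tiny slice of the language. It is an illustration only: the real compiler is the Java reference implementation, and the output shape used here (`name`, `rules`, `entity`, `strategy`, `confidence_gt`) is a hypothetical stand-in, not the Phileas policy schema. The toy handles only `POLICY <name>;` and simple `REDACT ... WITH ...` statements and rejects everything else.

```python
import json
import re

# Toy grammar (an assumption for illustration, far smaller than PhiSQL v0.1):
#   POLICY <name>
#   REDACT <entity>[, <entity>...] WITH <strategy> [WHERE CONFIDENCE > <n>]
POLICY_RE = re.compile(r"POLICY\s+(\w+)")
REDACT_RE = re.compile(
    r"REDACT\s+(?P<entities>\w+(?:\s*,\s*\w+)*)"
    r"\s+WITH\s+(?P<strategy>\w+)"
    r"(?:\s+WHERE\s+CONFIDENCE\s*>\s*(?P<threshold>\d+(?:\.\d+)?))?"
)


def strip_comments(source: str) -> str:
    """Drop everything after '--' on each line (toy: ignores quoted strings)."""
    return "\n".join(line.split("--", 1)[0] for line in source.splitlines())


def compile_policy(source: str) -> dict:
    """Translate the toy subset into a HYPOTHETICAL dict, not the Phileas schema."""
    policy = {"name": None, "rules": []}
    for raw in strip_comments(source).split(";"):
        statement = " ".join(raw.split())  # collapse whitespace and newlines
        if not statement:
            continue
        policy_match = POLICY_RE.fullmatch(statement)
        if policy_match:
            policy["name"] = policy_match.group(1)
            continue
        redact_match = REDACT_RE.fullmatch(statement)
        if redact_match is None:
            raise ValueError(f"unsupported statement: {statement!r}")
        # Emit one rule per listed entity type.
        for entity in re.split(r"\s*,\s*", redact_match.group("entities")):
            rule = {"entity": entity, "strategy": redact_match.group("strategy")}
            if redact_match.group("threshold") is not None:
                rule["confidence_gt"] = float(redact_match.group("threshold"))
            policy["rules"].append(rule)
    return policy


if __name__ == "__main__":
    source = """
    -- Minimal example: redact U.S. Social Security Numbers.
    POLICY ssn_only;
    REDACT SSN WITH MASK;
    """
    print(json.dumps(compile_policy(source), indent=2))
```

The point of the sketch is the direction of the dependency: the source text carries no behavior of its own, and everything it says ends up as plain data that a downstream engine interprets.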
PhiSQL v0.1 covers the redaction subset: REDACT, DEIDENTIFY, and IGNORE. Each query below is a complete, working policy drawn from the specification's worked examples.
Every example compiles to a standard Phileas JSON policy. The same rules you would otherwise hand-write in JSON, expressed in a few readable lines.
-- Minimal example: redact U.S. Social Security Numbers.
POLICY ssn_only;
REDACT SSN WITH MASK;
-- HIPAA Safe Harbor de-identification (45 CFR 164.514(b)(2)).
POLICY hipaa_safe_harbor
DESCRIPTION 'HIPAA Safe Harbor de-identification.';
DEIDENTIFY
    PHYSICIAN_NAME AS RANDOM_REPLACE,
    HOSPITAL AS RANDOM_REPLACE,
    DATE AS TRUNCATE,
    AGE AS REDACT,
    SSN AS REDACT,
    PHONE_NUMBER AS REDACT,
    EMAIL_ADDRESS AS REDACT,
    STREET_ADDRESS AS REDACT,
    CITY AS REDACT,
    STATE AS REDACT,
    ZIP_CODE AS REDACT;
-- PCI DSS v4.0 Req 3.2-3.4: PAN to last 4 only.
-- A WHERE predicate gates the rule on detection confidence.
POLICY pci_dss_scope_reduction
DESCRIPTION 'PCI DSS v4.0 scope reduction.';
REDACT CREDIT_CARD WITH LAST_4 WHERE CONFIDENCE > 0.85;
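As a rough illustration of what a confidence-gated LAST_4 rule means, the Python sketch below applies "keep the last four digits" only to detections whose confidence is strictly above the threshold. It is a sketch under stated assumptions, not how Phileas implements the strategy: the `Span` type, the `*` mask character, and the choice to leave separators in place are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Span:
    """A detected entity: character offsets plus detector confidence (hypothetical)."""
    start: int
    end: int
    confidence: float


def keep_last_four(value: str, mask_char: str = "*") -> str:
    """Mask every digit except the final four; separators are left untouched."""
    digits_to_mask = max(sum(ch.isdigit() for ch in value) - 4, 0)
    out = []
    for ch in value:
        if ch.isdigit() and digits_to_mask > 0:
            out.append(mask_char)
            digits_to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


def redact_cards(text: str, spans: List[Span], threshold: float = 0.85) -> str:
    """Apply keep_last_four to spans with confidence strictly above threshold."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        if span.confidence > threshold:
            masked = keep_last_four(text[span.start:span.end])
            text = text[:span.start] + masked + text[span.end:]
    return text


if __name__ == "__main__":
    text = "Card 4111 1111 1111 1111 and 4000-0000-0000-0002."
    first, second = text.index("4111"), text.index("4000")
    spans = [Span(first, first + 19, 0.97), Span(second, second + 19, 0.60)]
    print(redact_cards(text, spans))
```

In this model the low-confidence detection passes through unchanged, which is the trade-off a WHERE predicate exposes: a higher threshold means fewer false redactions but more missed values.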
-- Customer support tickets, with an allowlist for company names.
POLICY support_tickets
DESCRIPTION 'Customer support ticket redaction with allowlist.';
REDACT FIRST_NAME, SURNAME WITH STATIC_REPLACE(value='Customer', scope=document);
REDACT EMAIL_ADDRESS WITH MASK;
REDACT PHONE_NUMBER WITH MASK;
IGNORE TERMS ('Acme', 'AcmeCorp') FOR FIRST_NAME;
IGNORE TERMS ('Corp', 'Support', 'Engineering') FOR SURNAME;
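The allowlist behavior can be pictured as a filter that runs between detection and redaction. The Python sketch below is one possible reading of `IGNORE TERMS ... FOR <entity>`, assuming exact, case-sensitive matching scoped to a single entity type; the `Detection` type and the matching rules are assumptions made for illustration, not documented Phileas behavior.

```python
from typing import Dict, List, NamedTuple, Set


class Detection(NamedTuple):
    """A detected value and the entity type it was classified as (hypothetical)."""
    entity: str
    text: str


# Mirrors the two IGNORE TERMS clauses in the support_tickets policy.
IGNORE_TERMS: Dict[str, Set[str]] = {
    "FIRST_NAME": {"Acme", "AcmeCorp"},
    "SURNAME": {"Corp", "Support", "Engineering"},
}


def apply_ignore_terms(
    detections: List[Detection], ignore_terms: Dict[str, Set[str]]
) -> List[Detection]:
    """Drop detections whose text is allowlisted for their own entity type."""
    return [
        d for d in detections
        if d.text not in ignore_terms.get(d.entity, set())
    ]


if __name__ == "__main__":
    detections = [
        Detection("FIRST_NAME", "Acme"),     # allowlisted for FIRST_NAME
        Detection("FIRST_NAME", "Dana"),     # a real name, still redacted
        Detection("SURNAME", "Support"),     # allowlisted for SURNAME
        Detection("SURNAME", "Acme"),        # not allowlisted for SURNAME
    ]
    for kept in apply_ignore_terms(detections, IGNORE_TERMS):
        print(kept.entity, kept.text)
```

Under this reading, a term allowlisted for FIRST_NAME is still redacted if it is detected as a SURNAME, which is why the example policy carries a separate clause per entity type.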
-- Format-preserving encryption keeps the surface format of an
-- identifier while making the value cryptographically opaque.
POLICY fpe_ssn;
REDACT SSN WITH FPE_ENCRYPT;
The v0.1 draft is focused on redaction. Later versions extend the same language to the rest of the toolkit: discovery, monitoring, and benchmarking. The syntax below illustrates the direction and is not yet implemented.
-- Discovery (planned): inventory where PII lives.
FIND PII IN 's3://patient-records/' WHERE CONFIDENCE > 0.8;
-- Benchmarking (planned): score a policy on precision and recall.
BENCHMARK POLICY hipaa_safe_harbor AGAINST 'gold-standard/';
-- Monitoring (planned): alert on unexpected PII flow.
MONITOR PII ON 'kafka://topic/events' ALERT WHEN VOLUME > 1000;
Grammar and semantics for these statements are still being designed in the open. Follow the repository and its RFCs to weigh in.
If something here isn’t covered, get in touch and we’ll answer.
The v0.1 draft covers POLICY declarations, REDACT and DEIDENTIFY statements across entity types and strategies, WHERE predicates such as confidence thresholds, and IGNORE clauses for allowlisted terms and patterns.
Run mvn verify in the reference/ directory, or add ai.philterd:phisql as a Maven dependency. The compiler turns a .phisql file into a Phileas JSON policy.
Three ways to get going: deploy the open source yourself, spin it up from a cloud marketplace, or work with our team directly. Pick the path that fits.