PII in logs is the classic privacy gap: a team redacts carefully at the API layer, then a stack trace, a debug line, or an audit message writes a customer’s name, email, or SSN straight to disk, stdout, or a log shipper like Splunk, OpenSearch, or Datadog. Logs are copied, indexed, and retained widely, so a leak there is hard to take back.
The safest place to fix this is inside the logging framework itself, so the message is redacted before it is ever written. This guide shows how to run every log event through Phileas using a log4j2 rewrite policy and a logback converter. It is the companion to Phileas in Graylog: that post redacts logs after they land, this one redacts them before they leave the application.
The approach
Phileas is a Java library, so it embeds directly in your application alongside the logging framework. The integration has two pieces:
- A small, framework-neutral redactor that loads a policy once and exposes redact(String).
- A thin adapter at each framework’s extension point that passes the log message through the redactor.
Add Phileas as a dependency (see the Phileas documentation for coordinates and the embedding setup), then add the code below.
The shared redactor
This holds the Phileas filter pipeline, loads the policy from a file, and redacts a string. It also carries the single most important piece of this integration: a recursion guard. Phileas itself logs, so without a guard a Phileas-internal log event would hit the filter, which calls Phileas, which logs again, in an infinite loop. A ThreadLocal flag breaks that cycle.
import com.google.gson.Gson;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;
// The Phileas types used below (PlainTextFilterService, Policy, PhileasConfiguration,
// DefaultContextService, InMemoryVectorService) also need imports; take their packages
// from the Phileas documentation.

public final class LogRedactor {

    // Recursion guard: while we are redacting, any log event Phileas itself emits
    // passes through untouched instead of re-entering the filter.
    private static final ThreadLocal<Boolean> INSIDE = ThreadLocal.withInitial(() -> false);

    private final PlainTextFilterService filterService;
    private final Policy policy;

    private LogRedactor(Policy policy) {
        this.policy = policy;
        this.filterService = new PlainTextFilterService(
                new PhileasConfiguration(new Properties()),
                new DefaultContextService(), new InMemoryVectorService(), null);
    }

    public static LogRedactor fromFile(String path) {
        try {
            final String json = Files.readString(Path.of(path));
            return new LogRedactor(new Gson().fromJson(json, Policy.class));
        } catch (Exception e) {
            throw new IllegalStateException("could not load Phileas policy: " + path, e);
        }
    }

    public String redact(String message) {
        if (message == null || INSIDE.get()) {
            return message;
        }
        INSIDE.set(true);
        try {
            return filterService.filter(policy, "logs", message).getFilteredText();
        } catch (Exception e) {
            // Fail closed: a redaction failure is loud, not a silent leak of the original.
            return "[REDACTION FAILED]";
        } finally {
            INSIDE.set(false);
        }
    }
}
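The guard can be exercised without Phileas on the classpath. The sketch below is a stand-in, not the class above: GuardedRedactor and its delegate parameter are hypothetical names, and the Phileas call is replaced by a pluggable function so that a delegate can deliberately call back into redact, the way a Phileas-internal log event would.

```java
import java.util.function.UnaryOperator;

/** Stand-in for the redactor: same guard logic, Phileas call swapped for a function. */
final class GuardedRedactor {

    // True while the current thread is inside redact(); re-entrant calls pass through.
    private static final ThreadLocal<Boolean> INSIDE = ThreadLocal.withInitial(() -> false);

    // Hypothetical stand-in for the Phileas filter call.
    private final UnaryOperator<String> delegate;

    GuardedRedactor(UnaryOperator<String> delegate) {
        this.delegate = delegate;
    }

    String redact(String message) {
        if (message == null || INSIDE.get()) {
            return message; // null or re-entrant input is returned untouched
        }
        INSIDE.set(true);
        try {
            return delegate.apply(message);
        } catch (RuntimeException e) {
            return "[REDACTION FAILED]"; // fail closed, as in the redactor above
        } finally {
            INSIDE.set(false); // always reset, so the next event is redacted again
        }
    }
}
```

A test can pass a delegate that calls redact on the same instance. The inner call should return its input unchanged, the outer call should still return the redacted text, and a later call on the same thread should be redacted as usual.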
log4j2
In log4j2, the extension point that can change a message is a RewritePolicy attached to a RewriteAppender. A log4j2 Filter only decides accept or deny; it cannot alter the message, so the rewrite appender is the right choice.
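import java.util.Objects;
import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.appender.rewrite.RewritePolicy;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.config.plugins.PluginAttribute;
import org.apache.logging.log4j.core.config.plugins.PluginFactory;
import org.apache.logging.log4j.core.impl.Log4jLogEvent;
import org.apache.logging.log4j.message.SimpleMessage;

@Plugin(name = "PhileasRewritePolicy", category = "Core", elementType = "rewritePolicy", printObject = true)
public final class PhileasRewritePolicy implements RewritePolicy {
    private final LogRedactor redactor;

    private PhileasRewritePolicy(LogRedactor redactor) {
        this.redactor = redactor;
    }

    @Override
    public LogEvent rewrite(LogEvent event) {
        final String original = event.getMessage().getFormattedMessage();
        final String redacted = redactor.redact(original);
        // Null-safe comparison: a null formatted message comes back from redact() as null.
        if (Objects.equals(redacted, original)) {
            return event;
        }
        return new Log4jLogEvent.Builder(event).setMessage(new SimpleMessage(redacted)).build();
    }

    @PluginFactory
    public static PhileasRewritePolicy createPolicy(@PluginAttribute("policyFile") String policyFile) {
        return new PhileasRewritePolicy(LogRedactor.fromFile(policyFile));
    }
}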
Wire it into log4j2.xml. The root logger writes through the rewrite appender; Phileas’s own logger is routed straight to the plain appender so its events never re-enter redaction (a second layer of recursion protection on top of the ThreadLocal).
<Configuration packages="com.example.logging">
  <Appenders>
    <Console name="PLAIN" target="SYSTEM_OUT">
      <PatternLayout pattern="%d %-5level %logger - %msg%n"/>
    </Console>
    <Rewrite name="REDACTED">
      <PhileasRewritePolicy policyFile="/etc/phileas/log-policy.json"/>
      <AppenderRef ref="PLAIN"/>
    </Rewrite>
  </Appenders>
  <Loggers>
    <Logger name="ai.philterd" level="warn" additivity="false">
      <AppenderRef ref="PLAIN"/>
    </Logger>
    <Root level="info">
      <AppenderRef ref="REDACTED"/>
    </Root>
  </Loggers>
</Configuration>
logback
In logback, a Filter and TurboFilter also only accept or deny, so to transform the message you use a custom ClassicConverter and reference it in the pattern. The policy file path is passed as a converter option.
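import ch.qos.logback.classic.pattern.ClassicConverter;
import ch.qos.logback.classic.spi.ILoggingEvent;

public final class PhileasMessageConverter extends ClassicConverter {
    private LogRedactor redactor;

    @Override
    public void start() {
        // The first converter option is the policy file path from the pattern.
        redactor = LogRedactor.fromFile(getFirstOption());
        super.start();
    }

    @Override
    public String convert(ILoggingEvent event) {
        return redactor.redact(event.getFormattedMessage());
    }
}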
Register the converter with a conversion word and use it where %msg would normally go. As with log4j2, route Phileas’s own logger to a plain appender so it does not recurse.
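<configuration>
  <conversionRule conversionWord="redactedmsg" converterClass="com.example.logging.PhileasMessageConverter"/>
  <appender name="REDACTED" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d %-5level %logger - %redactedmsg{/etc/phileas/log-policy.json}%n</pattern>
    </encoder>
  </appender>
  <!-- Plain appender for Phileas's own events, which must bypass redaction. -->
  <appender name="PLAIN" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d %-5level %logger - %msg%n</pattern>
    </encoder>
  </appender>
  <logger name="ai.philterd" level="WARN" additivity="false">
    <appender-ref ref="PLAIN"/>
  </logger>
  <root level="INFO">
    <appender-ref ref="REDACTED"/>
  </root>
</configuration>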
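The converter covers only the formatted message. If MDC values can carry PII in your application, a helper along the following lines could produce a redacted copy of the context map. This is a sketch under assumptions: ContextRedaction and redactValues are hypothetical names, and the redactor is passed in as a plain function. In logback the map itself is available from ILoggingEvent.getMDCPropertyMap(); how the redacted copy is then written out depends on your pattern or encoder.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

final class ContextRedaction {
    private ContextRedaction() {}

    /** Returns a new map with every value passed through redact; keys and order are kept. */
    static Map<String, String> redactValues(Map<String, String> context, UnaryOperator<String> redact) {
        final Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : context.entrySet()) {
            out.put(entry.getKey(), redact.apply(entry.getValue()));
        }
        return out;
    }
}
```

The input map is left unmodified, so the application’s own view of its context does not change. Each value is one more redaction call per event, which adds to the per-event cost.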
The policy
The filter loads a standard Phileas redaction policy. For logs, prefer a lean, pattern-based policy: pattern types (emails, phone numbers, SSNs, card numbers) are cheap to evaluate on every event, while model-based name detection is far more expensive and usually not worth it on a high-volume log stream. A starting point:
{
  "name": "logs",
  "identifiers": {
    "emailAddress": { "emailAddressFilterStrategies": [ { "strategy": "REDACT", "redactionFormat": "[EMAIL]" } ] },
    "phoneNumber": { "phoneNumberFilterStrategies": [ { "strategy": "REDACT", "redactionFormat": "[PHONE]" } ] },
    "ssn": { "ssnFilterStrategies": [ { "strategy": "REDACT", "redactionFormat": "[SSN]" } ] },
    "creditCard": { "creditCardFilterStrategies": [ { "strategy": "REDACT", "redactionFormat": "[CARD]" } ] }
  }
}
See Writing your first redaction policy and the policy schema guide for the full set of entity types and strategies.
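One way to check a policy before trusting it is to run sample log lines through the redactor and look for values that survive. The following is a minimal sketch under stated assumptions: RedactionCheck and findLeaks are hypothetical names, not part of Phileas, and the redactor is passed in as a plain function so that the real redact method or a test stub can be supplied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class RedactionCheck {
    private RedactionCheck() {}

    /**
     * Runs each sample line through redact and reports every sensitive value
     * that is still present in the output. The samples map goes from a log line
     * to the values that must not survive redaction of that line.
     */
    static List<String> findLeaks(Map<String, List<String>> samples, UnaryOperator<String> redact) {
        final List<String> leaks = new ArrayList<>();
        for (Map.Entry<String, List<String>> sample : samples.entrySet()) {
            final String output = redact.apply(sample.getKey());
            for (String sensitive : sample.getValue()) {
                if (output != null && output.contains(sensitive)) {
                    leaks.add(sensitive + " leaked in: " + output);
                }
            }
        }
        return leaks;
    }
}
```

A build-time test could call findLeaks with representative lines from your own logs and fail when the returned list is not empty. A substring check only catches the values you list, so it complements a manual review of the output and does not replace it.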
Important tradeoffs
Running redaction on every log event has sharp edges. Decide each of these deliberately.
- Recursion guard. Phileas logs, so redaction can re-enter itself. Use both safeguards shown above: the ThreadLocal flag in the redactor, and routing the ai.philterd logger to a plain appender. Verify it with a test that intentionally triggers a Phileas-internal log during redaction.
- Performance and async. Redaction is work on the application’s path to writing a log line. Keep the policy lean, and consider asynchronous logging (log4j2’s async loggers, logback’s AsyncAppender) so the cost moves off the application thread. A bounded async queue trades a small chance of dropped log events under extreme load for not blocking the application; choose the trade-off that fits your risk.
- Fail open or fail closed. If Phileas throws, you can drop the event, pass the original through, or replace it with a marker. The example fails closed with [REDACTION FAILED] so a problem is visible rather than a silent leak. Pick the behavior your compliance posture calls for.
- What you redact. This redacts the formatted message. Structured context (MDC values), markers, and exception messages can also carry PII; extend the adapter to cover them if your logs put sensitive data there, and weigh the added cost.
- Validate it. Detection is probabilistic and configurable. Test the policy against representative log lines and confirm the output before relying on it. You remain responsible for what reaches your logs.
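The three failure behaviors named in the list can be made an explicit configuration choice instead of a hard-coded catch block. The sketch below assumes hypothetical names (FailureMode, FailureAwareRedactor) and again uses a plain function as a stand-in for the Phileas call; an empty Optional signals that there is no message to write, and it is up to the adapter to decide how to drop the event.

```java
import java.util.Optional;
import java.util.function.UnaryOperator;

/** What to emit when the redaction call throws. */
enum FailureMode { DROP, PASS_THROUGH, MARKER }

final class FailureAwareRedactor {
    // Hypothetical stand-in for the Phileas filter call.
    private final UnaryOperator<String> delegate;
    private final FailureMode mode;

    FailureAwareRedactor(UnaryOperator<String> delegate, FailureMode mode) {
        this.delegate = delegate;
        this.mode = mode;
    }

    /** Returns the redacted message, or the configured fallback if redaction throws. */
    Optional<String> redact(String message) {
        try {
            return Optional.ofNullable(delegate.apply(message));
        } catch (RuntimeException e) {
            switch (mode) {
                case DROP:
                    return Optional.empty();               // lose the event, leak nothing
                case PASS_THROUGH:
                    return Optional.ofNullable(message);   // fail open: original is written
                default:
                    return Optional.of("[REDACTION FAILED]"); // fail closed, but visible
            }
        }
    }
}
```

PASS_THROUGH keeps every log line at the price of a possible leak, while DROP and MARKER never write the original. Making the mode a constructor argument lets the same adapter serve environments with different compliance requirements.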
Where to go next
- Phileas and the redaction policy schema guide for everything the policy can do.
- Writing your first redaction policy to build the policy this filter loads.
- Phileas in Graylog for redacting logs that have already landed, the other end of the same pipeline.