Redacting PII in Logs with log4j2 and logback

How to redact PII from Java logs before they are written, with a Phileas-backed log4j2 rewrite policy and a logback converter, recursion guard included.

PII in logs is the classic privacy gap: a team redacts carefully at the API layer, then a stack trace, a debug line, or an audit message writes a customer’s name, email, or SSN straight to disk, stdout, or a log shipper like Splunk, OpenSearch, or Datadog. Logs are copied, indexed, and retained widely, so a leak there is hard to take back.

The safest place to fix this is inside the logging framework itself, so the message is redacted before it is ever written. This guide shows how to run every log event through Phileas using a log4j2 rewrite policy and a logback converter. It is the companion to Phileas in Graylog: that post redacts logs after they land; this one redacts them before they leave the application.

The approach

Phileas is a Java library, so it embeds directly in your application alongside the logging framework. The integration has two pieces:

A small, framework-neutral redactor that loads a policy once and exposes redact(String).
A thin adapter at each framework’s extension point that passes the log message through the redactor.

Add Phileas as a dependency (see the Phileas documentation for coordinates and the embedding setup), then add the code below.

The shared redactor

This holds the Phileas filter pipeline, loads the policy from a file, and redacts a string. It also carries the single most important piece of this integration: a recursion guard. Phileas itself logs, so without a guard a Phileas-internal log event would hit the filter, which calls Phileas, which logs again, in an infinite loop. A ThreadLocal flag breaks that cycle.

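import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

import com.google.gson.Gson;
// Phileas imports (PlainTextFilterService, Policy, PhileasConfiguration,
// DefaultContextService, InMemoryVectorService) are omitted here because their
// packages depend on the Phileas version; see the Phileas documentation.

public final class LogRedactor {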

    // Recursion guard: while we are redacting, any log event Phileas itself emits
    // passes through untouched instead of re-entering the filter.
    private static final ThreadLocal<Boolean> INSIDE = ThreadLocal.withInitial(() -> false);

    private final PlainTextFilterService filterService;
    private final Policy policy;

    private LogRedactor(Policy policy) {
        this.policy = policy;
        this.filterService = new PlainTextFilterService(
                new PhileasConfiguration(new Properties()),
                new DefaultContextService(), new InMemoryVectorService(), null);
    }

    public static LogRedactor fromFile(String path) {
        try {
            final String json = Files.readString(Path.of(path));
            return new LogRedactor(new Gson().fromJson(json, Policy.class));
        } catch (Exception e) {
            throw new IllegalStateException("could not load Phileas policy: " + path, e);
        }
    }

    public String redact(String message) {
        if (message == null || INSIDE.get()) {
            return message;
        }
        INSIDE.set(true);
        try {
            return filterService.filter(policy, "logs", message).getFilteredText();
        } catch (Exception e) {
            // Fail closed: a redaction failure is loud, not a silent leak of the original.
            return "[REDACTION FAILED]";
        } finally {
            INSIDE.set(false);
        }
    }
}
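
The guard can be exercised without Phileas on the classpath. The sketch below is a stand-alone illustration, not part of the integration: RecursionGuardDemo, its ENGINE stand-in, and the in-memory WRITTEN list are hypothetical names, and the stand-in simply logs while it works to imitate a library that logs during filtering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class RecursionGuardDemo {

    // Same idea as LogRedactor.INSIDE: true while this thread is inside redact().
    private static final ThreadLocal<Boolean> INSIDE = ThreadLocal.withInitial(() -> false);

    // Stand-in for an appender: collects what would have been written.
    static final List<String> WRITTEN = new ArrayList<>();

    // Hypothetical stand-in for the redaction engine. It logs while it works,
    // which is exactly the situation the guard exists for.
    private static final UnaryOperator<String> ENGINE = message -> {
        log("engine: filtering " + message.length() + " chars");
        return message.replace("alice@example.com", "[EMAIL]");
    };

    static String redact(String message) {
        if (message == null || INSIDE.get()) {
            return message; // re-entrant call: pass through untouched
        }
        INSIDE.set(true);
        try {
            return ENGINE.apply(message);
        } finally {
            INSIDE.set(false);
        }
    }

    // Stand-in for the logging framework: every event goes through redact().
    static void log(String message) {
        WRITTEN.add(redact(message));
    }

    public static void main(String[] args) {
        log("login by alice@example.com");
        WRITTEN.forEach(System.out::println);
    }
}
```

The engine's own log line is written unredacted and the application's line is written redacted. If the INSIDE check is removed from redact(), the same call recurses until the thread runs out of stack, which is the failure the guard prevents.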

log4j2

In log4j2, the extension point that can change a message is a RewritePolicy attached to a RewriteAppender. A log4j2 Filter only decides accept or deny; it cannot alter the message, so the rewrite appender is the right choice.

@Plugin(name = "PhileasRewritePolicy", category = "Core", elementType = "rewritePolicy", printObject = true)
public final class PhileasRewritePolicy implements RewritePolicy {

    private final LogRedactor redactor;

    private PhileasRewritePolicy(LogRedactor redactor) {
        this.redactor = redactor;
    }

    @Override
    public LogEvent rewrite(LogEvent event) {
        final String original = event.getMessage().getFormattedMessage();
        final String redacted = redactor.redact(original);
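        // redact() returns null for a null message; there is nothing to rewrite in that case.
        if (redacted == null || redacted.equals(original)) {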
            return event;
        }
        return new Log4jLogEvent.Builder(event).setMessage(new SimpleMessage(redacted)).build();
    }

    @PluginFactory
    public static PhileasRewritePolicy createPolicy(@PluginAttribute("policyFile") String policyFile) {
        return new PhileasRewritePolicy(LogRedactor.fromFile(policyFile));
    }
}

Wire it into log4j2.xml. The root logger writes through the rewrite appender; Phileas’s own logger is routed straight to the plain appender so its events never re-enter redaction (a second layer of recursion protection on top of the ThreadLocal).

<Configuration packages="com.example.logging">
  <Appenders>
    <Console name="PLAIN" target="SYSTEM_OUT">
      <PatternLayout pattern="%d %-5level %logger - %msg%n"/>
    </Console>
    <Rewrite name="REDACTED">
      <PhileasRewritePolicy policyFile="/etc/phileas/log-policy.json"/>
      <AppenderRef ref="PLAIN"/>
    </Rewrite>
  </Appenders>
  <Loggers>
    <Logger name="ai.philterd" level="warn" additivity="false">
      <AppenderRef ref="PLAIN"/>
    </Logger>
    <Root level="info">
      <AppenderRef ref="REDACTED"/>
    </Root>
  </Loggers>
</Configuration>

logback

In logback, Filter and TurboFilter likewise only accept or deny an event, so to transform the message you use a custom ClassicConverter and reference it in the pattern. The policy file path is passed as a converter option.

public final class PhileasMessageConverter extends ClassicConverter {

    private LogRedactor redactor;

    @Override
    public void start() {
        redactor = LogRedactor.fromFile(getFirstOption());
        super.start();
    }

    @Override
    public String convert(ILoggingEvent event) {
        return redactor.redact(event.getFormattedMessage());
    }
}

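Register the converter with a conversion word and use it where %msg would normally go. The converter only runs in patterns that reference it, so every appender that should be redacted must use %redactedmsg in place of %msg. As with log4j2, route Phileas’s own logger to a plain appender so it does not recurse.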

<configuration>
  <conversionRule conversionWord="redactedmsg" converterClass="com.example.logging.PhileasMessageConverter"/>

  <appender name="REDACTED" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d %-5level %logger - %redactedmsg{/etc/phileas/log-policy.json}%n</pattern>
    </encoder>
  </appender>

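  <!-- Plain appender for Phileas's own events: uses %msg, so it never calls the converter. -->
  <appender name="PLAIN" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d %-5level %logger - %msg%n</pattern>
    </encoder>
  </appender>

  <logger name="ai.philterd" level="WARN" additivity="false">
    <appender-ref ref="PLAIN"/>
  </logger>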

  <root level="INFO">
    <appender-ref ref="REDACTED"/>
  </root>
</configuration>

The policy

The filter loads a standard Phileas redaction policy. For logs, prefer a lean, pattern-based policy: pattern types (emails, phone numbers, SSNs, card numbers) are cheap to evaluate on every event, while model-based name detection is far more expensive and usually not worth it on a high-volume log stream. A starting point:

{
  "name": "logs",
  "identifiers": {
    "emailAddress": { "emailAddressFilterStrategies": [ { "strategy": "REDACT", "redactionFormat": "[EMAIL]" } ] },
    "phoneNumber":  { "phoneNumberFilterStrategies":  [ { "strategy": "REDACT", "redactionFormat": "[PHONE]" } ] },
    "ssn":          { "ssnFilterStrategies":          [ { "strategy": "REDACT", "redactionFormat": "[SSN]" } ] },
    "creditCard":   { "creditCardFilterStrategies":   [ { "strategy": "REDACT", "redactionFormat": "[CARD]" } ] }
  }
}

See Writing your first redaction policy and the policy schema guide for the full set of entity types and strategies.
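
Before relying on a policy, it can be checked against representative log lines. The sketch below is one way to do that, under stated assumptions: PolicyCheck and leaks are hypothetical names, the redactor is passed in as a plain function (in the real setup that would be LogRedactor::redact), and the regular expression in main is only a stand-in so the example runs without Phileas. It is not how Phileas detects anything. The sample lines are synthetic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class PolicyCheck {

    // samples maps a representative log line to the sensitive value it contains.
    // Returns the lines whose sensitive value survived redaction; empty means all passed.
    static List<String> leaks(UnaryOperator<String> redactor, Map<String, String> samples) {
        List<String> leaked = new ArrayList<>();
        samples.forEach((line, secret) -> {
            String out = redactor.apply(line);
            if (out != null && out.contains(secret)) {
                leaked.add(line);
            }
        });
        return leaked;
    }

    public static void main(String[] args) {
        // Stand-in redactor (assumption): masks only SSN-shaped values.
        UnaryOperator<String> standIn =
                s -> s.replaceAll("\\b\\d{3}-\\d{2}-\\d{4}\\b", "[SSN]");

        Map<String, String> samples = new LinkedHashMap<>();
        samples.put("user ssn 123-45-6789 verified", "123-45-6789");
        samples.put("callback to 555-867-5309 failed", "555-867-5309");

        // The phone-number line is reported because the stand-in does not cover it.
        leaks(standIn, samples).forEach(line -> System.out.println("LEAK: " + line));
    }
}
```

Running a check like this in the build, with lines copied from real log formats and filled with synthetic values, catches a policy that silently stops covering a type.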

Important tradeoffs

Running redaction on every log event has sharp edges. Decide each of these deliberately.

Recursion guard. Phileas logs, so redaction can re-enter itself. Use both safeguards shown above: the ThreadLocal flag in the redactor, and routing the ai.philterd logger to a plain appender. Verify it with a test that intentionally triggers a Phileas-internal log during redaction.
Performance and async. Redaction is work on the application’s path to writing a log line. Keep the policy lean, and consider asynchronous logging (log4j2’s async loggers, logback’s AsyncAppender) so the cost moves off the application thread. A bounded async queue has to either block the caller or drop events when it fills under extreme load; decide which of the two your risk tolerance allows and configure the queue accordingly.
Fail open or fail closed. If Phileas throws, you can drop the event, pass the original through, or replace it with a marker. The example fails closed with [REDACTION FAILED] so a problem is visible rather than a silent leak. Pick the behavior your compliance posture calls for.
What you redact. This redacts the formatted message. Structured context (MDC values), markers, and exception messages can also carry PII; extend the adapter to cover them if your logs put sensitive data there, and weigh the added cost.
Validate it. Detection is probabilistic and configurable. Test the policy against representative log lines and confirm the output before relying on it. You remain responsible for what reaches your logs.
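
For the "What you redact" item, the same redactor can be applied to context values and exception messages. The sketch below is framework-neutral and makes several assumptions: ContextRedactionDemo, redactValues, and redactedSummary are hypothetical helpers, the redactor is passed in as a function (LogRedactor::redact in the real setup), and reading the context map from a log event and writing it back are framework-specific steps that are not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

final class ContextRedactionDemo {

    // Returns a copy of the context with every non-null value redacted.
    // Keys are left alone; the input map is not modified.
    static Map<String, String> redactValues(Map<String, String> context,
                                            UnaryOperator<String> redactor) {
        Map<String, String> out = new LinkedHashMap<>();
        context.forEach((key, value) ->
                out.put(key, value == null ? null : redactor.apply(value)));
        return out;
    }

    // Builds a one-line summary of a throwable and its causes with each message
    // redacted. The depth cap guards against an unusually long or cyclic cause chain.
    static String redactedSummary(Throwable thrown, UnaryOperator<String> redactor) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        for (Throwable t = thrown; t != null && depth < 16; t = t.getCause(), depth++) {
            if (sb.length() > 0) {
                sb.append(" <- ");
            }
            sb.append(t.getClass().getName());
            if (t.getMessage() != null) {
                sb.append(": ").append(redactor.apply(t.getMessage()));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in redactor (assumption) so the example runs without Phileas.
        UnaryOperator<String> standIn = s -> s.replace("bob@example.com", "[EMAIL]");

        Map<String, String> context = new LinkedHashMap<>();
        context.put("user", "bob@example.com");
        context.put("requestId", "r-42");
        System.out.println(redactValues(context, standIn));

        Throwable failure = new IllegalStateException("lookup failed",
                new IllegalArgumentException("no account for bob@example.com"));
        System.out.println(redactedSummary(failure, standIn));
    }
}
```

Each extra field is another redaction call per event, so this multiplies the per-event cost by roughly the number of values covered.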

Where to go next

Phileas and the redaction policy schema guide for everything the policy can do.
Writing your first redaction policy to build the policy this filter loads.
Phileas in Graylog for redacting logs that have already landed, the other end of the same pipeline.

Frequently asked questions

Does this redact before logs are written or after?

Before. The filter runs inside the logging framework, so the message is redacted as the event is processed and the redacted text is what reaches the console, file, or log shipper. That is the safer architecture: sensitive data never lands in the log store. To redact logs that have already landed, see the Graylog post, which covers the other end of the pipeline.

Will this slow my application down?

Redaction runs on every log event, so it is not free. Use a lean policy for logs (pattern-based types are far cheaper than model-based detection), keep an eye on the per-event cost, and consider asynchronous logging so redaction does not block the application thread. Measure with your own policy and log volume.

What happens if Phileas fails on a log event?

That is a decision you make. The example here fails closed: if redaction throws, the message is replaced with [REDACTION FAILED] so a problem is loud rather than silently leaking the original. You can change it to drop the event or pass it through, depending on your risk tolerance.
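
The three behaviors can be made an explicit setting rather than a hard-coded catch block. The sketch below is one possible shape, not part of Phileas or either logging framework: FailureModeDemo, OnFailure, and redactOrHandle are hypothetical names, and the redactor is passed in as a function so the example runs on its own. An empty Optional means the caller should not write the event.

```java
import java.util.Optional;
import java.util.function.UnaryOperator;

final class FailureModeDemo {

    enum OnFailure {
        MARKER,       // fail closed: write a marker instead of the message
        DROP,         // fail closed: write nothing
        PASS_THROUGH  // fail open: write the original, unredacted message
    }

    static Optional<String> redactOrHandle(UnaryOperator<String> redactor,
                                           String message,
                                           OnFailure mode) {
        try {
            return Optional.ofNullable(redactor.apply(message));
        } catch (RuntimeException e) {
            switch (mode) {
                case DROP:
                    return Optional.empty();
                case PASS_THROUGH:
                    return Optional.ofNullable(message);
                default:
                    return Optional.of("[REDACTION FAILED]");
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a redactor that always fails (assumption, for illustration).
        UnaryOperator<String> broken = m -> {
            throw new IllegalStateException("engine unavailable");
        };
        for (OnFailure mode : OnFailure.values()) {
            System.out.println(mode + " -> " + redactOrHandle(broken, "ssn 123-45-6789", mode));
        }
    }
}
```

PASS_THROUGH keeps the log line useful but writes the unredacted text, so it only fits environments where availability of logs outweighs the leak risk.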