Frequently asked questions about Airlock. For questions not answered here, please contact Support.
How do I run Airlock?
Airlock is available on the cloud marketplaces as a virtual machine product.
Does Airlock use ChatGPT or other third-party APIs?
No. Airlock never transmits your text or documents to any third-party service.
Airlock can run in a firewalled or air-gapped environment. For example, if you are using AWS, you can deploy Airlock to a private subnet and use security groups and network ACLs to prevent any outbound traffic from the Airlock instance and its subnet. In fact, we recommend doing so to increase your overall security posture.
Airlock is built upon Phileas, an open source project for finding and redacting PII and PHI in text and documents. Everyone is welcome to review the Phileas code to learn how it works, to submit issues, and to contribute via pull requests. Phileas is licensed under the Apache License, Version 2.0.
Airlock can redact many types of PII, PHI, and other sensitive information. We are constantly adding new types of information and new ways of recognizing each type. For example, a person’s age may be written in many ways, and we add support for new forms as we discover them. If you wish to discuss these types of information in depth, please contact us.
Some of the types of PII, PHI, and sensitive information identified by Airlock are listed below:
You create policies that tell Airlock what types of PII and PHI to find. A policy lists the types of sensitive information (phone numbers, names, etc.), when to remove them, and how to remove them. You can have as many policies as you need and you can select which policy to apply when redacting text.
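To make the idea of a policy concrete, here is a minimal sketch of the concept in Python. It is not Airlock's policy format: the `Rule` and `Policy` classes, the field names, the entity type names, and the replacement strings are all hypothetical and chosen only to show "what to find, when to remove it, and how to remove it".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Rule:
    """One entry in a policy (hypothetical shape, not Airlock's schema)."""
    entity_type: str             # what to find, e.g. "phone-number"
    when: Callable[[str], bool]  # when to remove the matched text
    replacement: str             # how to remove it


@dataclass
class Policy:
    """A named list of rules (hypothetical)."""
    name: str
    rules: List[Rule] = field(default_factory=list)

    def replacement_for(self, entity_type: str, matched_text: str) -> Optional[str]:
        """Return the replacement for a match, or None to leave it unchanged."""
        for rule in self.rules:
            if rule.entity_type == entity_type and rule.when(matched_text):
                return rule.replacement
        return None


def always(_: str) -> bool:
    """Condition that removes every match of the rule's type."""
    return True


# Two hypothetical policies; the caller selects one by name when redacting.
POLICIES: Dict[str, Policy] = {
    "strict": Policy("strict", [
        Rule("phone-number", always, "[PHONE]"),
        # Remove every name except one that is explicitly allowed.
        Rule("person-name", lambda name: name != "Jane Doe", "[NAME]"),
    ]),
    "phones-only": Policy("phones-only", [
        Rule("phone-number", always, "[PHONE]"),
    ]),
}
```

With these definitions, `POLICIES["strict"].replacement_for("person-name", "John Smith")` returns `"[NAME]"`, while the same lookup against `"phones-only"` returns `None` because that policy has no rule for names.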
Airlock can be deployed to your cloud via the cloud's marketplace. See Airlock's home page for links to the cloud marketplaces.
Airlock uses state-of-the-art natural language processing (NLP) technology to identify sensitive information in text. These NLP methods use models trained on a large corpus of text. The process of applying the model to text is non-deterministic. Many factors can affect the identification of sensitive information in your text, such as how similar your text is to the corpus used to train the model, how the text is formatted, and the length of the text. For these reasons, it is important that you assess Airlock's performance on your own data before using it in a production system.
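One common way to run such an assessment is to hand-label a sample of your own documents and compare the labels with what was detected. The sketch below computes precision and recall over exact span matches. It is only an illustration: the `(start, end, type)` tuple representation and the `precision_recall` helper are assumptions made for this example and are not part of Airlock.

```python
from typing import Iterable, Tuple

# A detected or labeled span: (start offset, end offset, entity type).
Span = Tuple[int, int, str]


def precision_recall(predicted: Iterable[Span], expected: Iterable[Span]) -> Tuple[float, float]:
    """Compute precision and recall using exact span matches.

    Precision: share of detected spans that were actually labeled.
    Recall: share of labeled spans that were detected.
    """
    predicted_set, expected_set = set(predicted), set(expected)
    true_positives = len(predicted_set & expected_set)
    # With nothing predicted (or nothing expected) there is nothing to get wrong.
    precision = true_positives / len(predicted_set) if predicted_set else 1.0
    recall = true_positives / len(expected_set) if expected_set else 1.0
    return precision, recall


if __name__ == "__main__":
    # Hand-labeled spans for one document (hypothetical data).
    expected = [(5, 15, "person"), (23, 28, "person")]
    # Spans reported by the redaction engine for the same document (hypothetical data).
    predicted = [(5, 15, "person"), (30, 34, "person")]
    precision, recall = precision_recall(predicted, expected)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

For redaction, recall is usually the more important figure, since a missed entity means sensitive text is left in place. Exact matching is a strict criterion; a partial-overlap criterion may suit some data better.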
The confidence value in the filter strategy condition can be used to tune the NLP engine’s detection. Each identified entity has an associated confidence score between 0 and 100 indicating the model’s estimate that the text is actually an entity, with 0 being the lowest confidence and 100 the highest. The condition lets you ignore entities based on their confidence. For example, the condition confidence > 75 means that entities with a confidence of 75 or less will be ignored and entities with a confidence greater than 75 will be filtered from the text.
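The effect of a condition such as confidence > 75 can be illustrated with a small sketch. This is not Airlock's implementation: the `Entity` class, the function names, and the `[REDACTED]` replacement are assumptions made for the example. It shows that the comparison is strict, so an entity scoring exactly 75 is left in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    """A detected entity (hypothetical representation)."""
    text: str
    start: int         # offset of the first character
    end: int           # offset one past the last character
    confidence: float  # 0 (lowest) to 100 (highest)


def apply_confidence_condition(entities: List[Entity], threshold: float = 75.0) -> List[Entity]:
    """Keep only entities whose confidence is strictly greater than the threshold."""
    return [e for e in entities if e.confidence > threshold]


def redact(text: str, entities: List[Entity], replacement: str = "[REDACTED]") -> str:
    """Replace each entity span, working right to left so earlier offsets stay valid."""
    for e in sorted(entities, key=lambda e: e.start, reverse=True):
        text = text[:e.start] + replacement + text[e.end:]
    return text


if __name__ == "__main__":
    text = "Call John Smith or ask April."
    entities = [
        Entity("John Smith", 5, 15, 92.0),  # above the threshold: redacted
        Entity("April", 23, 28, 75.0),      # exactly 75: not greater than 75, ignored
    ]
    kept = apply_confidence_condition(entities, threshold=75.0)
    print(redact(text, kept))  # prints "Call [REDACTED] or ask April."
```

Raising the threshold reduces false positives at the cost of missing more true entities; lowering it does the opposite.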
Airlock supports several platforms. Which platform you use may be determined by your choice of cloud provider. See Airlock's home page for links to the cloud marketplaces.
You can view Airlock's license agreement here.