Synthesize / Redact

The Textual redact functionality allows you to identify entities in text, and then optionally tokenize or synthesize these entities to create a safe version of your unstructured text.

This functionality works on both raw strings and files, including PDF, DOCX, XLSX, and other formats.

Before you can use these functions, read the Getting started guide and create an API key.

When Textual operates on your data:

  1. It first identifies sensitive information. Textual can identify 30+ built-in entity types. You can also define your own custom entity types.

  2. It then uses the location of each identified entity to tokenize or synthesize the data.
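
The two steps above can be pictured with a small, self-contained sketch. This is a toy illustration only, not Textual's implementation or SDK: the regex detectors, the `Entity` class, the `identify` and `transform` helpers, the `[LABEL]` placeholder format, and the fixed replacement values are all hypothetical stand-ins chosen for the example. Textual itself uses its own entity detection and its own output formats.

```python
import re
from dataclasses import dataclass


@dataclass
class Entity:
    """A detected span: character offsets, a type label, and the matched text."""
    start: int
    end: int
    label: str
    text: str


# Hypothetical toy detectors. Real entity detection is far more capable.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

# Hypothetical fixed replacements used for the "synthesize" mode.
FAKE_VALUES = {
    "EMAIL_ADDRESS": "user@example.com",
    "PHONE_NUMBER": "555-010-0000",
}


def identify(text):
    """Step 1: find entities and record where they are located."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append(Entity(match.start(), match.end(), label, match.group()))
    return sorted(found, key=lambda e: e.start)


def transform(text, entities, mode="tokenize"):
    """Step 2: use the entity locations to rewrite the text.

    Assumes the entity spans are sorted and do not overlap.
    """
    pieces = []
    cursor = 0
    for entity in entities:
        pieces.append(text[cursor:entity.start])  # untouched text before the entity
        if mode == "tokenize":
            pieces.append(f"[{entity.label}]")  # placeholder token
        else:
            pieces.append(FAKE_VALUES[entity.label])  # realistic-looking fake value
        cursor = entity.end
    pieces.append(text[cursor:])  # remaining text after the last entity
    return "".join(pieces)


sample = "Contact jane.doe@corp.com or call 415-555-1234."
entities = identify(sample)
tokenized = transform(sample, entities, mode="tokenize")
synthesized = transform(sample, entities, mode="synthesize")

print(tokenized)    # Contact [EMAIL_ADDRESS] or call [PHONE_NUMBER].
print(synthesized)  # Contact user@example.com or call 555-010-0000.
```

Keeping detection separate from transformation is what makes it possible to choose, per entity type, whether a value is tokenized, synthesized, or left as-is.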

To learn about the different ways to configure your output, see Choosing tokenization or synthesis. To fine-tune how synthesized values are generated for specific entity types, see Customizing synthesis with generator metadata.