Data Security in IDP

This page describes data security when using Google's services to complement Appian's built-in document extraction features, specifically in the Intelligent Document Processing (IDP) application. Beginning with IDP 1.6, you can configure IDP to use only Appian to extract data from documents, keeping your data within Appian Cloud.

We want you to understand where your data goes when you use IDP. IDP protects your data with the security of both Appian and Google Cloud.

This page describes how your data moves between Appian and Google Cloud, as well as how it moves within Appian, when you use IDP to extract data from documents.

IDP is built using Appian's suite of document extraction features, which also uphold our high standards for security. We encourage you to learn more about data security in our document extraction features.

Classification model training

When a labeled document is uploaded, it is stored in a Google Cloud Storage bucket in the supported region where storage was originally provisioned. The documents are then converted to text, or "digitized," to form the dataset used to train the classification model.

Training and deploying the model can take up to 24 hours. Once the model is deployed, the documents and the dataset of digitized documents are deleted. The model and, until it is deleted, the dataset are stored and processed in the supported location that corresponds to the region of the storage bucket.
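The training lifecycle described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: RegionalBucket, TrainedModel, and train_classifier are hypothetical names, the bucket is an in-memory stand-in, and no Appian or Google Cloud API is called. It shows the three properties from the text: documents and dataset live in the bucket's region, the model is tied to that same region, and the documents and dataset are gone once the model exists.

```python
from dataclasses import dataclass, field


@dataclass
class RegionalBucket:
    """Hypothetical in-memory stand-in for a storage bucket pinned to one region."""
    region: str
    objects: dict = field(default_factory=dict)


@dataclass
class TrainedModel:
    """Hypothetical stand-in for a deployed classification model."""
    region: str
    labels: frozenset


def train_classifier(bucket, labeled_docs):
    """Store labeled documents, digitize them, "train", then delete the inputs.

    labeled_docs maps a document name to a (raw_bytes, label) pair.
    """
    # 1. Labeled documents are stored in the bucket, and so in its region.
    for name, (raw, _label) in labeled_docs.items():
        bucket.objects[f"docs/{name}"] = raw

    # 2. Digitize: convert each document to text to form the training dataset.
    dataset = [
        (raw.decode("utf-8", errors="replace"), label)
        for raw, label in labeled_docs.values()
    ]
    bucket.objects["dataset"] = dataset

    # 3. "Train" a model in the location matching the bucket's region.
    #    (Placeholder only: a real model would learn from the text.)
    model = TrainedModel(
        region=bucket.region,
        labels=frozenset(label for _text, label in dataset),
    )

    # 4. After training and deployment, delete the documents and the dataset.
    bucket.objects.clear()
    return model
```

Calling train_classifier with a bucket whose region is, say, "example-region" returns a model whose region is also "example-region" and leaves the bucket empty.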

Document type classification

After IDP is configured, users can upload documents directly in the IDP site, or the upload can be automated as part of a larger workflow or from external systems. The classification step sends each document to Google Cloud for AutoML Natural Language processing.

The uploaded document is stored in a Google Cloud Storage bucket in the supported region where storage was originally provisioned. Google's AutoML Natural Language digitizes the document content and classifies it into user-defined categories using a machine learning model that has been trained on a representative dataset. The results are then returned to Appian. After the model returns a prediction, which can take up to three minutes, the document is deleted.
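The store, classify, and delete sequence can be sketched in Python as follows. This is a minimal illustration, not IDP's implementation: classify_and_delete is a hypothetical function, the bucket is a plain dict, and predict is any caller-supplied function standing in for the trained model. The try/finally, which also removes the document if prediction fails, is an assumption of this sketch; the text above only states that the document is deleted after a prediction is returned.

```python
def classify_and_delete(bucket, name, raw, predict):
    """Store a document, classify its text, and always delete the stored copy.

    bucket  -- dict used as a hypothetical in-memory stand-in for a storage bucket
    name    -- document name
    raw     -- document content as bytes
    predict -- callable taking text and returning a category (stand-in model)
    """
    key = f"incoming/{name}"
    bucket[key] = raw  # the uploaded document is stored in the bucket
    try:
        # Digitize the content, then classify it into a user-defined category.
        text = raw.decode("utf-8", errors="replace")
        return predict(text)  # the result goes back to the caller
    finally:
        # Delete the document once prediction has finished (or failed).
        bucket.pop(key, None)
```

For example, with an empty dict as the bucket and a predict function that returns "invoice" when the text contains that word, the call returns "invoice" and the dict is empty again afterward.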

Built: Fri, Nov 19, 2021 (11:59:11 AM)
