Data Security in IDP [Intelligent Document Processing v1.4]
Google has deprecated legacy versions of AutoML services, which directly impacts IDP's core functionality.

Additionally, the IDP application was deprecated with Appian 23.2. Customers who wish to use the application will need to refactor plug-ins using AutoML.

Data security is important, and we want to make sure you understand where your data goes when you use IDP. IDP protects your data using the security controls of both Appian and Google Cloud.

This page describes how your data moves between Appian and Google Cloud, as well as how it moves within Appian.

Google Cloud Platform project setup for Appian AI

Security considerations during setup apply only to Appian AI customers. If you're using IDP with a Google Cloud Platform project you purchased separately, your configuration and security may be different.

There are two systems at play for IDP: Appian and Google Cloud Platform.

Each customer gets a separate Google Cloud Platform project, which segregates data flows, security, and storage. For Appian AI customers, each project is also created with privileges that grant access only to the needed APIs. The project also uses automatic provisioning of unique service accounts to access Google Cloud Platform resources: in this case, Storage, AutoML Natural Language, and Document AI. Automatic provisioning means that service account credentials are generated automatically and no other users see or have access to the credentials during the process, which makes setup more secure and more straightforward. Additionally, the Google Cloud Platform project is set up in the most applicable region or in the region you select (us-central1 or eu).
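
The per-customer isolation described above can be pictured as a small configuration record. The following is a minimal sketch, not Appian's provisioning code: the ProjectConfig class, its field names, and the example customer name are hypothetical, and the service identifiers are the public Google Cloud service names assumed to correspond to Storage, AutoML Natural Language, and Document AI.

```python
from dataclasses import dataclass

# Regions named on this page for Appian AI projects.
SUPPORTED_REGIONS = {"us-central1", "eu"}

# Services named on this page, written as Google Cloud service names
# (assumed mapping, for illustration only).
ALLOWED_APIS = frozenset({
    "storage.googleapis.com",     # Cloud Storage
    "automl.googleapis.com",      # AutoML Natural Language
    "documentai.googleapis.com",  # Document AI
})


@dataclass(frozen=True)
class ProjectConfig:
    """Hypothetical record for one customer's isolated project."""
    customer: str
    region: str
    enabled_apis: frozenset = ALLOWED_APIS

    def __post_init__(self):
        # Reject regions other than the supported ones.
        if self.region not in SUPPORTED_REGIONS:
            raise ValueError(f"unsupported region: {self.region}")
        # Reject any API that is not on the allow-list.
        extra = self.enabled_apis - ALLOWED_APIS
        if extra:
            raise ValueError(f"APIs outside the allow-list: {sorted(extra)}")


config = ProjectConfig(customer="example-customer", region="eu")
```

Constructing a ProjectConfig with any other region, or with an API outside the allow-list, raises a ValueError, which mirrors the idea that a project only has access to the APIs it needs.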

More on the services included with Appian AI

[Image: idp-ai-system.png]

Data is concealed through key parts of setup:

  • Provisioning: The entire Google Cloud Platform project setup process is automated with no human intervention.
  • Access: A limited number of support and engineering team members have access to metadata (customer name, support contact, and preferred region) to confirm the setup was complete and successful, with no access to underlying project resources or service account credentials.
  • Monitoring: We have monitoring dashboards to ensure we meet our service level agreements (SLAs) and monitor for problems with proactive alerts on performance, budget, and other diagnostic information.

Classification model training

Labeled documents are stored in a Google Cloud storage bucket when uploaded, in the supported region where storage was originally provisioned. The documents are then converted to their text form, or "digitized," to form a dataset for training the classification model. Once the model is trained and deployed, which can take up to 24 hours, the documents and the dataset of document digitizations are deleted. The model and (until it is deleted) the dataset are stored and processed in the supported location corresponding to the region of the storage bucket.

Document type classification

After IDP is configured, users can upload the documents directly in the IDP site, or the upload can be automated as part of a larger workflow or from external systems. The classification step sends the document to Google Cloud for AutoML Natural Language processing.

The uploaded document is stored in a Google Cloud storage bucket, in the supported region where storage was originally provisioned. Google's AutoML Natural Language digitizes the document content and classifies it into user-defined categories using a machine learning model that has been trained on a representative dataset. The results are then returned to Appian. After the model returns a prediction, which can take up to three minutes, the document is deleted.
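
The store, classify, return, and delete sequence above can be sketched as follows. This is an illustration only: InMemoryBucket stands in for a storage bucket, classify_and_delete and toy_predict are hypothetical names, and no real Google Cloud or Appian API is called. Deleting the document even when prediction fails is a design choice of this sketch, not something this page states.

```python
from typing import Callable, Dict


class InMemoryBucket:
    """Stand-in for a storage bucket; holds objects in a dict."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def upload(self, name: str, data: bytes) -> None:
        self._objects[name] = data

    def read(self, name: str) -> bytes:
        return self._objects[name]

    def delete(self, name: str) -> None:
        self._objects.pop(name, None)

    def __contains__(self, name: str) -> bool:
        return name in self._objects


def classify_and_delete(
    bucket: InMemoryBucket,
    name: str,
    data: bytes,
    predict: Callable[[bytes], str],
) -> str:
    """Upload a document, classify it, and always remove it afterwards."""
    bucket.upload(name, data)
    try:
        return predict(bucket.read(name))
    finally:
        # Runs whether prediction succeeds or raises, so the document
        # does not outlive the classification request.
        bucket.delete(name)


def toy_predict(data: bytes) -> str:
    # Toy rule standing in for a trained classification model.
    return "invoice" if b"Invoice" in data else "other"


bucket = InMemoryBucket()
label = classify_and_delete(bucket, "doc-001.pdf", b"Invoice #42", toy_predict)
```

After the call, the label has been returned to the caller and the bucket no longer holds the document.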

Data extraction

For extraction, IDP sends the document to Google Cloud Storage within your configured Google Cloud Platform project so that Document AI can process it.

The document is then analyzed using the Google Cloud Document AI API. This analysis data is stored in a JSON document in a Google Cloud storage bucket and sent back to Appian.

If you're using Appian AI, the uploaded document and JSON analysis document are deleted after 24 hours. If you are not using Appian AI and want to store the JSON analysis document only temporarily, you need to arrange deletion of the documents yourself.
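
If you manage your own Google Cloud Platform project, one way to arrange that deletion is an Object Lifecycle Management rule on the storage bucket. The sketch below builds such a rule in the JSON format Cloud Storage uses for lifecycle configuration. The one-day age mirrors the 24-hour retention described above and is an assumption to adjust, and the helper name build_delete_rule is hypothetical. Lifecycle age is counted in whole days, and Cloud Storage may take additional time to act once an object qualifies, so treat this as an approximation of a 24-hour policy rather than an exact timer.

```python
import json


def build_delete_rule(age_days: int = 1) -> dict:
    """Return a lifecycle configuration that deletes objects older than age_days."""
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    return {
        "rule": [
            {
                "action": {"type": "Delete"},
                "condition": {"age": age_days},
            }
        ]
    }


# Serialize the rule so it can be saved to a file such as lifecycle.json.
lifecycle_json = json.dumps(build_delete_rule(), indent=2)
print(lifecycle_json)
```

The resulting JSON could be saved to a file and applied to a bucket with Google's command-line tooling, for example gsutil lifecycle set lifecycle.json gs://BUCKET_NAME, where BUCKET_NAME is a placeholder for your own bucket.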

The auto-mapping learning of labels and values is stored in the Appian environment. The learning happens independently in each environment.

Appian does not share the extracted data with Google or use it beyond the customer instance. This data is not used to improve Document AI or improve mappings globally across Appian customers. The entire processing and improvements remain with and within your systems.

Document reconciliation

All downstream processing, such as validation, continuous learning of mappings, and improvement of extracted results, is performed and stored within your Appian Cloud instance. After a user completes the task to reconcile the document content with the extracted information, the document data is written to the database. This data can be referenced in other applications as well.

General data security and privacy statements

Outside of the document classification and extraction process, Appian and Google protect data security, privacy, and integrity as part of the IDP application. Throughout this section, we reference Google's and Appian's own resources to help summarize what their policies mean for your data.

Google's data security and privacy practices

Refer to Google's resources for up-to-date information and more details about Google Cloud Platform. Google Cloud Platform's Terms and Conditions, Data Privacy, and Access are summarized below for convenience.

At a high level, Google states that they do not access customer data in transit or at rest for any purpose other than to provide the requested services. By default, data is also encrypted in transit and at rest for security. In practice, this means that Google does not use customer data to improve the service: in this case, the machine learning models. Additionally, for Appian AI customers, Appian implemented a strict data retention policy that automatically deletes processed documents and results within 24 hours. This further secures customer data by making the encrypted data available only for a short period, long enough for Google's service to process the request and write the results to be consumed by Appian. Processing and communication all occur within the confines of the customer's own Appian Cloud instance and the customer's own Google Cloud Platform project.

Refer to section 5.2.1, Customer's Instructions, in Google's processing terms for more details.

Google provides access transparency logs, which expose any action taken on data or services, including reading of data, by anyone (Google, Appian, or otherwise).

How does Google treat your content?

Visit Google's site for the most up-to-date information regarding their security commitments. We've summarized some key points from the Google Vision data-processing FAQ and the AutoML FAQ to answer key questions.

  • Google won't use your content for any purpose except to provide the Cloud Vision API or AutoML service.
  • Google won't make content sent through these services available to the public or share them otherwise. If necessary, data may be shared with third-party vendors to provide aspects of the Cloud Vision API or AutoML services, such as data storage or transmission. In these cases, the data is shared only under contractually defined security and confidentiality conditions.
  • Google stores the content for a short period of time for analysis and to return results. The length of time depends on whether these actions take place asynchronously or immediately, but won't exceed a few hours. Metadata about the request is logged temporarily as well.
  • Google doesn't currently use your content to train or improve Google Vision or AutoML features.
  • Google doesn't claim ownership of the content you send to the Cloud Vision API or to AutoML.

Visit the Google Cloud Platform Security page to learn more about the security measures in place for Google's Cloud Services.

Appian's data security and privacy practices

Appian does not access customers' business data and has a strong security, privacy, and compliance posture. Visit the Appian Trust Center for more details.

Built: Fri, Mar 22, 2024 (05:03:50 PM)
