Data Security in Document Extraction

Data security is important, and we want to make sure you understand where your data goes when you use Appian document extraction features. Document extraction provides data privacy and protection by securing your data within Appian as well as Google Cloud.

This page describes how your data moves within Appian, as well as how it moves between Appian and Google Cloud, if you choose Google to extract your document data.

Appian's built-in document extraction

Appian provides built-in document extraction services for most documents and data types. If you choose to use Appian as your document extraction vendor, your data stays entirely within Appian Cloud. Appian does not access customers' business data and has a strong security, privacy, and compliance posture. Visit the Appian Trust Center for more details.

If you choose to use Google for document extraction, refer to the next section to understand how your data is secured.

Google's document extraction

Security considerations during setup apply only to Appian AI customers. If you're using document extraction with a Google Cloud Platform project you purchased separately, your configuration and security may be different.

There are two systems at play for document extraction: your Appian Cloud instance and your Google Cloud Platform project.

A separate Google Cloud Platform project is configured for each customer to segregate data flows, security, and storage. For Appian AI customers, each project is also created with privileges that provide access only to the needed APIs. The project also uses automatic provisioning of unique service accounts to access Google Cloud Platform resources: in this case, Storage, AutoML Natural Language, and Document AI. Automatic provisioning means that service account credentials are generated automatically and no other users see or have access to the credentials during the process, which makes setup more secure and straightforward. Additionally, the Google Cloud Platform project is set up in the most applicable region or in the region you select (us-central1 or eu).

More on the services included with Appian AI


Data is concealed through key parts of setup:

  • Provisioning: The entire Google Cloud Platform project setup process is automated with no human intervention.
  • Access: A limited number of support and engineering team members have access to metadata (customer name, support contact, and preferred region) to confirm the setup was complete and successful, with no access to underlying project resources or service account credentials.
  • Monitoring: We have monitoring dashboards to ensure we meet our service level agreements (SLAs) and monitor for problems with proactive alerts on performance, budget, and other diagnostic information.

Data extraction

For extraction, Appian sends the document to Google Cloud Storage within your configured Google Cloud Platform project so that Document AI can process it.

The document is then analyzed using the Google Cloud Document AI API. The analysis data is stored in a JSON document in a Google Cloud Storage bucket and sent back to Appian.
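
The round trip described above can be modeled in a few lines of Python. This is a toy, in-memory sketch and not Appian's or Google's implementation: `FakeBucket`, `analyze`, and `extract` are hypothetical names, the result shape is made up, and a single bucket is assumed for both the document and the JSON analysis as a simplification.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeBucket:
    """In-memory stand-in for a storage bucket in the customer's project."""
    objects: dict = field(default_factory=dict)

    def upload(self, name: str, data: bytes) -> None:
        self.objects[name] = data

    def download(self, name: str) -> bytes:
        return self.objects[name]


def analyze(document: bytes) -> dict:
    """Placeholder for the Document AI call; the result shape is invented."""
    return {"byte_count": len(document)}


def extract(bucket: FakeBucket, name: str, document: bytes) -> dict:
    # 1. Appian sends the document to storage in the customer's project.
    bucket.upload(name, document)
    # 2. The stored document is analyzed (stand-in for Document AI).
    analysis = analyze(bucket.download(name))
    # 3. The analysis is stored as a JSON document in a bucket.
    result_name = name + ".json"
    bucket.upload(result_name, json.dumps(analysis).encode("utf-8"))
    # 4. The JSON analysis is sent back to Appian.
    return json.loads(bucket.download(result_name))
```

After one call to `extract`, the bucket holds two objects: the uploaded document and its JSON analysis. Those are the two artifacts that retention and deletion apply to.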

If you're using Appian AI, the uploaded document and JSON analysis document are deleted after 24 hours. If you are not using Appian AI and you want to temporarily store the JSON analysis document, you are responsible for arranging the deletion of the documents.
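
If you manage your own Google Cloud Platform project and need to arrange deletion yourself, the selection logic could look like the following. This is a minimal sketch under stated assumptions, not the mechanism Appian uses: the `expired` function, the `RETENTION` constant, and the `(name, created_at)` input shape are all hypothetical, and the 24-hour window simply mirrors the Appian AI behavior described above.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention window, mirroring the 24-hour Appian AI policy.
RETENTION = timedelta(hours=24)


def expired(objects, now=None, retention=RETENTION):
    """Return the names of objects created more than `retention` ago.

    `objects` is an iterable of (name, created_at) pairs with
    timezone-aware datetimes; this shape is an assumption of the sketch.
    """
    now = now or datetime.now(timezone.utc)
    return [name for name, created_at in objects if now - created_at > retention]


# Example listing with a fixed "now" so the result is reproducible.
now = datetime(2023, 9, 22, 12, 0, tzinfo=timezone.utc)
listing = [
    ("invoice.pdf", now - timedelta(hours=30)),
    ("invoice.pdf.json", now - timedelta(hours=30)),
    ("receipt.pdf", now - timedelta(hours=2)),
]
print(expired(listing, now=now))  # ['invoice.pdf', 'invoice.pdf.json']
```

With the `google-cloud-storage` client library, the listing could come from `bucket.list_blobs()` (each blob exposes `time_created`) and each expired object could be removed with `blob.delete()`. Cloud Storage Object Lifecycle Management can also delete objects by age without custom code, although its `age` condition is measured in days rather than hours.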

The auto-mapping learning of labels and values is stored in the Appian environment. The learning happens independently in each environment.

Appian does not share the extracted data with Google or use it beyond the customer instance. This data is not used to improve Document AI or to improve mappings globally across Appian customers. All processing and improvements remain within your systems.

Document reconciliation

All downstream processing, such as validation, continuous learning of mappings, and improvement of extracted results, happens within your Appian Cloud instance, and the resulting data is stored there. After a user completes the task to reconcile the document content with the extracted information, the document data is written to the database. This data can be referenced in other applications as well.

Data security in IDP

See Data Security in IDP for additional information about data storage, retention, and access in the application's common workflows.

General data security and privacy statements

Both Appian and Google protect data security, privacy, and integrity as part of their respective document extraction services. Throughout this section, we reference the following resources to help summarize what the policies mean for your data:

Google's data security and privacy practices

Refer to Google's resources for up-to-date information and more details about Google Cloud Platform. Google Cloud Platform's Terms and Conditions and Data Privacy and Access policies are summarized below for convenience.

At a high level, Google states that they do not access customer data in transit or at rest for any purpose other than to provide the requested services. By default, data is also encrypted in transit and at rest. In practice, this means that Google does not use customer data to improve the service: in this case, the machine learning models. Additionally, for Appian AI customers, Appian implemented a strict data retention policy that automatically deletes processed documents and results within 24 hours. This further secures customer data by making the encrypted data available only for a short period, sufficient for Google's service to process the request and write the results to be consumed by Appian. Processing and communication all occur within the confines of the customer's own Appian Cloud instance and the customer's own Google Cloud Platform project.

Refer to section 5.2.1, Customer's Instructions, in Google's processing terms for more details.

Google provides Access Transparency logs, which record any action, including reading data, taken by anyone (Google, Appian, or otherwise) on both data and services.

How does Google treat your content?

Visit Google's site for the most up-to-date information regarding their security commitments. We've summarized some key points from the Google Vision data-processing FAQ and the AutoML FAQ to answer key questions.

  • Google won't use your content for any purpose except to provide the Cloud Vision API or AutoML service.
  • Google won't make content sent through these services available to the public or share them otherwise. If necessary, data may be shared with third-party vendors to provide aspects of the Cloud Vision API or AutoML services, such as data storage or transmission. In these cases, the data is shared only under contractually defined security and confidentiality conditions.
  • Google stores the content for a short period of time for analysis and to return results. The length of time depends on whether these actions take place asynchronously or immediately, but won't exceed a few hours. Metadata about the request is logged temporarily as well.
  • Google doesn't currently use your content to train or improve Google Vision or AutoML features.
  • Google doesn't claim ownership of the content you send to the Cloud Vision API or to AutoML.

Visit the Google Cloud Platform Security page to learn more about the security measures in place for Google's Cloud services.

Appian's data security and privacy practices

Appian does not access customers' business data and has a strong security, privacy, and compliance posture. Visit the Appian Trust Center for more details.

Built: Fri, Sep 22, 2023 (07:59:10 PM)