
Using IDP with External Systems

Google has deprecated legacy versions of AutoML services, which directly impacts IDP's core functionality.

Additionally, the IDP application was deprecated with Appian 23.2. Customers who wish to use the application will need to refactor plug-ins using AutoML.

Introduction

Many organizations use external systems to manage documents alongside the Appian platform. Intelligent Document Processing (IDP) comes with web APIs out of the box for integrating with other systems. You can use these web APIs to start document processing, check processing status, and retrieve the extracted document data. This page provides guidance for invoking IDP's web APIs. If you want to use IDP directly in the Intelligent Document Processing site, refer to the Intelligent Document Processing User Guide. If you want to invoke IDP as part of a larger workflow, refer to Using IDP in a Subprocess.

Authentication and security

For each IDP web API, you can use any authentication method available for Appian web APIs. We recommend using the API Key authentication method. The service account that uses the API Key authentication must be a member of the DU Web API Users group to invoke the web APIs.
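As a rough illustration of API key authentication, the sketch below builds (but does not send) an authenticated request using only the Python standard library. The header name `Appian-API-Key`, the host `example.appiancloud.com`, and the helper name `build_idp_request` are assumptions for illustration; confirm the header name against your Appian version's web API documentation.

```python
import urllib.request


def build_idp_request(base_url, path, api_key, method="GET", data=None):
    """Build an authenticated request for an IDP web API without sending it.

    Assumption: the API key is passed in an "Appian-API-Key" header.
    """
    url = base_url.rstrip("/") + path
    headers = {"Appian-API-Key": api_key}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Hypothetical host and key; the service account behind the key must be
# a member of the DU Web API Users group.
request = build_idp_request(
    "https://example.appiancloud.com",
    "/suite/webapi/du-is-job-done?jobGuid=JOB_GUID",
    "YOUR_API_KEY",
)
```

Passing `request` to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the headers easy to inspect before any network call is made.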

Upload documents to IDP

If you are an Appian developer who wants to classify and extract data from documents located in another system, you can upload these documents to IDP using web APIs.

There are two methods to upload documents to IDP web APIs: Upload Document and Upload Zip. You can use Upload Document to process a single document, and you can use Upload Zip to process a batch of documents.

For both methods, before using any of the request data, make the following replacements:

  • CHANNEL_ID (optional): The document channel ID associated with the document. The document channel ID can be found in the dudocchannel table in the database. If no document channel ID is provided, then the default channel ID will be used. If DOC_TYPE_ID is provided, then the CHANNEL_ID will be ignored.
  • DOC_TYPE_ID (optional): The document type ID associated with the document. The document type ID can be found in the dudoctype table in the database. This value should only be provided if the document type is known.
  • DOCUMENT_NAME (optional): The name of the document with the file extension (filename.pdf). We recommend providing this value to match the files with the document data result when retrieving the document data. It also increases readability for users. If no name is provided, the document is automatically named Untitled document from web API.
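The replacement rules above can be sketched as a small path builder. The function name `upload_path` is hypothetical. Because the source states that CHANNEL_ID is ignored when DOC_TYPE_ID is provided, this sketch simply leaves `channelId` out in that case; sending both should behave the same way.

```python
from urllib.parse import urlencode


def upload_path(endpoint, channel_id=None, doc_type_id=None):
    """Return the web API path with only the query parameters that apply."""
    params = {}
    if doc_type_id is not None:
        # A known document type takes precedence; the channel would be ignored.
        params["docTypeId"] = doc_type_id
    elif channel_id is not None:
        params["channelId"] = channel_id
    # With neither value, IDP falls back to the default channel.
    query = urlencode(params)
    return f"/suite/webapi/{endpoint}" + (f"?{query}" if query else "")


path = upload_path("du-upload-document-for-understanding", channel_id=3)
```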

Method 1: Upload Document

HTTP method and path:

POST /suite/webapi/du-upload-document-for-understanding?channelId=CHANNEL_ID&docTypeId=DOC_TYPE_ID

HTTP Headers:

Appian-Document-Name: DOCUMENT_NAME

The request body must contain a binary document in a format that is supported by Appian Document Extraction.

You should receive a JSON response similar to the following:

{"jobGuid":"1F635AA6-7431-9555-664F-7E41306D4108"}

The Job GUID can be used to check on the status of the document being processed.
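A minimal sketch of the single-document upload, assuming API key authentication through an `Appian-API-Key` header and an `application/octet-stream` content type (neither is stated on this page). The helper names and the host are hypothetical. The request is only built here; `urllib.request.urlopen(request)` would send it.

```python
import json
import urllib.request
from urllib.parse import urlencode

UPLOAD_PATH = "/suite/webapi/du-upload-document-for-understanding"


def build_upload_request(base_url, api_key, file_bytes,
                         document_name=None, doc_type_id=None):
    """Build the POST request that uploads one document to IDP."""
    url = base_url.rstrip("/") + UPLOAD_PATH
    if doc_type_id is not None:
        url += "?" + urlencode({"docTypeId": doc_type_id})
    headers = {
        "Appian-API-Key": api_key,                   # assumed header name
        "Content-Type": "application/octet-stream",  # assumed content type
    }
    if document_name:
        headers["Appian-Document-Name"] = document_name
    return urllib.request.Request(url, data=file_bytes, headers=headers, method="POST")


def parse_job_guid(response_body):
    """Extract the job GUID from the upload response body."""
    return json.loads(response_body)["jobGuid"]


request = build_upload_request(
    "https://example.appiancloud.com", "YOUR_API_KEY",
    b"%PDF-1.7 ...", document_name="INV-12.pdf",
)
```

Keep the value returned by `parse_job_guid`, since both the status and data web APIs take it as the `jobGuid` parameter.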

Method 2: Upload ZIP

HTTP method and path:

POST /suite/webapi/du-upload-zip-for-understanding?docTypeId=DOC_TYPE_ID

HTTP Headers:

Appian-Document-Name: DOCUMENT_NAME

The request body must contain the binary content of a zip file. The zip file must contain at least one file that is supported by Appian Document Extraction in its top-level directory.

You should receive a JSON response similar to the following:

{"jobGuid":"1F635AA6-7431-9555-664F-7E41306D4108"}

The Job GUID can be used to check on the status of the batch of documents being processed.
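Because supported files must sit in the top-level directory of the zip, it can help to build the archive programmatically. The sketch below is one way to do that in memory. The function name `make_flat_zip` is hypothetical, and the sketch does not handle two files that share the same base name.

```python
import io
import os
import zipfile


def make_flat_zip(files):
    """Pack {filename: bytes} into a zip with every entry at the top level."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            # Drop any directory prefix so the entry lands in the zip root.
            archive.writestr(os.path.basename(name), content)
    return buffer.getvalue()


zip_bytes = make_flat_zip({
    "invoices/INV-12.pdf": b"%PDF-1.7 ...",
    "receipt.pdf": b"%PDF-1.7 ...",
})
```

The resulting `zip_bytes` would be sent as the body of the POST request to `du-upload-zip-for-understanding`.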

Retrieve status of job

Once you upload documents for processing, you can check on the status of the job using the Is Job Done web API.

Before using any of the request data below, make the following replacements:

  • JOB_GUID (required): The job GUID for the job for which you want to retrieve the status. The job GUID is returned in the response when uploading documents for processing.

HTTP method and path:

GET /suite/webapi/du-is-job-done?jobGuid=JOB_GUID

You should receive a JSON response similar to the following:

{"isJobDone": false}

Once isJobDone is set to true, you can retrieve the extracted data for the processed documents.
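Since processing is asynchronous, a client typically polls this web API until the job finishes. The sketch below shows one possible polling loop. The interval and attempt limit are arbitrary assumptions, and `fetch_status` is a hypothetical caller-supplied function that performs the GET request and returns the response body as text.

```python
import json
import time


def wait_until_done(fetch_status, interval_seconds=5.0, max_attempts=60,
                    sleep=time.sleep):
    """Poll until isJobDone is true; return False if attempts run out."""
    for attempt in range(max_attempts):
        body = fetch_status()
        if json.loads(body).get("isJobDone") is True:
            return True
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking again
    return False


# Stand-in for real HTTP calls: two "not done" answers, then "done".
responses = iter(['{"isJobDone": false}', '{"isJobDone": false}', '{"isJobDone": true}'])
done = wait_until_done(lambda: next(responses), sleep=lambda seconds: None)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP client and lets it be exercised without waiting.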

Retrieve data for documents

Once the job is done, you can retrieve the classification and extraction data for the processed documents using the Get Data for Job web API.

Before using any of the request data below, make the following replacements:

  • JOB_GUID (required): The job GUID for the job for which you want to retrieve the data. The job GUID is returned in the response when uploading documents for processing.

HTTP method and path:

GET /suite/webapi/du-get-data-for-job?jobGuid=JOB_GUID

You should receive a JSON response similar to the following:

[
  {
    "understandingId": 465,
    "documentId": 11454,
    "documentName": "INV-12",
    "documentType": "Invoice",
    "entityId": 103,
    "data": {
      "invoiceNumber": "12",
      "invoiceDate": "10/09/2019",
      "total": "$17600.00",
      "supplier": "Acme Corporation"
    }
  },
  {
    "understandingId": 466,
    "documentId": 11455,
    "documentName": "Supper Kitchen",
    "documentType": "Receipt",
    "entityId": 142,
    "data": {
      "receiptDate": "12/27/2019",
      "subTotal": "$124.19",
      "tax": "10.56",
      "total": "$134.75"
    }
  }
]

This response contains the following information for each document:

| Field | Name | Description |
|---|---|---|
| understandingId | Document processing identifier | Unique identifier of the document among all documents processed |
| documentId | Appian document identifier | Unique identifier of the document among all Appian documents |
| documentName | Document name | Name of the document |
| documentType | Document type classification | Name of the document type classification |
| entityId | Entity identifier | Unique identifier for the document among all processed documents of that particular document type |
| data | Extracted data | Fields and values of the extracted document data |

If the job contained multiple files, you can use the documentName to match the document information with the correct file.
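One way to do this matching is to index the response by documentName. In the sample response the names appear without a file extension, so this sketch tries the full file name first and then the name without its extension. Whether IDP strips the extension is an assumption drawn only from that sample. Both function names are hypothetical.

```python
import json
import os


def index_by_document_name(response_body):
    """Group the documents in a Get Data for Job response by documentName."""
    by_name = {}
    for document in json.loads(response_body):
        by_name.setdefault(document["documentName"], []).append(document)
    return by_name


def find_for_file(by_name, filename):
    """Return the entries for a file, trying the exact name and then the stem."""
    stem = os.path.splitext(filename)[0]
    return by_name.get(filename) or by_name.get(stem) or []


body = '[{"documentName": "INV-12", "documentType": "Invoice", "data": {"total": "$17600.00"}}]'
matches = find_for_file(index_by_document_name(body), "INV-12.pdf")
```

Each name maps to a list because nothing on this page guarantees that document names are unique within a job.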
