The Intelligent Document Processing (IDP) application extracts information from documents so that the data can be digitized.
This process requires users to upload documents, classify them, and reconcile the extracted data.
This document teaches users how to use IDP to upload, classify, and reconcile documents, as well as how to view information about processed documents, such as document status and the extracted information. It also provides an overview of the document processing metrics.
Not all users will have access to all actions and views. See the groups reference page for more information on what type of access each security group provides.
IDP transforms unstructured data from PDF documents into structured data. The application only accepts documents in PDF format, so you may need to convert your documents before you get started. Appian Community offers multiple plug-ins that convert documents from other file formats to PDF.
We don't recommend converting Excel files to PDF for use in IDP. Instead, Appian can parse information from Excel files using rules. Use the Excel Tools plug-in to extract information from this file format.
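Because IDP only accepts PDFs, it can help to sort files by format before uploading. The following is a minimal sketch, not part of IDP: it assumes that a file starting with the standard `%PDF-` header is a PDF, and the function names `looks_like_pdf` and `split_by_format` are hypothetical.

```python
from pathlib import Path

# Standard marker found at the start of a PDF file.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes begin with the PDF header marker."""
    return data.startswith(PDF_MAGIC)


def split_by_format(paths):
    """Partition paths into (pdfs, needs_conversion) by reading each file's header."""
    pdfs, needs_conversion = [], []
    for path in paths:
        with open(path, "rb") as fh:
            head = fh.read(len(PDF_MAGIC))
        target = pdfs if looks_like_pdf(head) else needs_conversion
        target.append(Path(path))
    return pdfs, needs_conversion
```

Files in the second list would need to be converted, or handled with the Excel Tools plug-in in the case of spreadsheets, before they are sent to IDP.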
If you need to upload documents manually, IDP features an easy-to-use upload form for document processing. You can also use IDP as a subprocess in a larger workflow, or upload documents automatically from external systems using a web API.
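As a rough illustration of the automated route, the sketch below prepares an HTTP request that an external system might send. Everything specific in it is an assumption rather than documented IDP behavior: the URL, the `X-File-Name` header, the bearer-token authorization, and the raw PDF request body are all placeholders. Check the web API configured in your environment for the actual contract.

```python
import urllib.request

# Hypothetical endpoint: replace with the web API URL configured in your environment.
UPLOAD_URL = "https://example.com/idp/upload"


def build_upload_request(pdf_bytes: bytes, file_name: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one PDF document."""
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError(f"{file_name} does not look like a PDF")
    return urllib.request.Request(
        UPLOAD_URL,
        data=pdf_bytes,
        method="POST",
        headers={
            "Content-Type": "application/pdf",
            "X-File-Name": file_name,              # hypothetical header
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The function stops short of that so the request can be inspected first.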
To upload documents manually:
After you upload the documents, the classification and extraction process will start.
The status of each document displays on the DOCUMENTS tab, along with other information and metrics about each document.
In the top-right corner of the page, click refresh to view the updated status. You can also use the filters at the top of the page to find specific documents.
The possible statuses are:
The Reconciled By column shows who completed the reconciliation task for a document. If the document type has automatic validations set up and they pass, the reconciliation task is skipped and Straight Through Processed appears in this column. If automatic validation fails, a user must complete the reconciliation task, and their name appears in this column.
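The rule for the Reconciled By column can be summarized in a few lines. This is only a sketch of the behavior described above, not IDP's implementation. The function and parameter names are hypothetical.

```python
from typing import Optional

STRAIGHT_THROUGH = "Straight Through Processed"


def reconciled_by(has_auto_validations: bool,
                  validations_passed: bool,
                  reviewer: Optional[str]) -> Optional[str]:
    """Return the value shown in the Reconciled By column.

    reviewer is the user who completed the reconciliation task,
    or None if the task has not been completed yet.
    """
    if has_auto_validations and validations_passed:
        # The reconciliation task is skipped entirely.
        return STRAIGHT_THROUGH
    # Validation failed or was never configured: a user reconciles the document.
    return reviewer
```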
After documents are uploaded, tasks are assigned to users so that they can classify documents and confirm or correct the extracted information. See Tasks for more information on using tasks in Appian.
If any documents are in the Pending Classification or Pending Reconciliation status, you can classify and reconcile them on the TASKS tab.
In this tab, you can search for tasks by Task Name and filter by Task Type (Classification or Reconciliation), Document Channel (if configured), or who the task is Assigned To.
Classification tasks are represented by icons:
When a document doesn't meet the minimum confidence threshold during auto-classification, IDP automatically creates a task so that a user can classify the document manually.
These documents will be in the Pending Classification status.
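The threshold check can be pictured as follows. This is a minimal sketch under stated assumptions: confidence scores are taken to be numbers between 0 and 1, the 0.8 threshold in the usage example is made up, and both function names are hypothetical.

```python
def needs_manual_classification(confidence: float, min_confidence: float) -> bool:
    """A document needs a manual task when its score falls below the threshold."""
    return confidence < min_confidence


def manual_classification_queue(predictions: dict, min_confidence: float) -> list:
    """Return the names of documents that would be in Pending Classification.

    predictions maps a document name to its auto-classification confidence.
    """
    return [
        name
        for name, confidence in predictions.items()
        if needs_manual_classification(confidence, min_confidence)
    ]


# Usage with an illustrative threshold of 0.8 (not an IDP default).
queue = manual_classification_queue({"invoice.pdf": 0.95, "scan_017.pdf": 0.42}, 0.8)
```

In this example only `scan_017.pdf` ends up in the queue, because its score is below the threshold.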
To complete a manual classification task:
If the document is invalid, click INVALIDATE. For example, if an unsigned process order is uploaded instead of a signed one, you can classify it as invalid. Invalid documents won't go through data extraction and reconciliation.
All documents need to be reviewed by a user for accuracy and to fill in any missing fields. This is called reconciliation. After a document is uploaded, Appian Document Extraction runs and extracts data from the document. Extraction usually takes about 2 to 5 minutes. When it finishes, a reconciliation task is automatically generated.
While data is being extracted, these documents are in the Auto-Extracting status. After extraction is complete, they are in the Pending Reconciliation status.
To complete the reconciliation task:
The status of reconciled documents changes to Completed, and the extracted data is written to the database.
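The status progression described in this section can be modeled as a small state machine. The sketch below covers only the three statuses named here and ignores classification and straight-through processing. The `Status` enum and `advance` function are hypothetical names, not part of IDP.

```python
from enum import Enum


class Status(Enum):
    AUTO_EXTRACTING = "Auto-Extracting"
    PENDING_RECONCILIATION = "Pending Reconciliation"
    COMPLETED = "Completed"


# Each status maps to the one that follows it; Completed is terminal.
NEXT_STATUS = {
    Status.AUTO_EXTRACTING: Status.PENDING_RECONCILIATION,   # extraction finished
    Status.PENDING_RECONCILIATION: Status.COMPLETED,         # user reconciled the document
}


def advance(status: Status) -> Status:
    """Return the status that follows the given one, or raise if there is none."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.value} is a final status")
    return NEXT_STATUS[status]
```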
To view the information that was input for a document, go to the DOCUMENTS tab and click the document name.
The Summary tab lists Overview information about the document at the top of the page. It also displays the data that was extracted from the document, along with a document viewer.
To edit the information that was extracted, click the edit button. Then update the information in the fields.
The METRICS tab is used for reporting and governance so that users can see how well the application is performing.
You can filter the information by Document Channel (if configured) and Document Type, and you can limit the view to documents processed in the Past 3 Months or the Past 6 Months.
The first section on this page shows some key performance indicators for documents that have completed processing, including:
The next section displays charts that show:
At the bottom of the page is a grid that shows the metadata for processed documents.