
Intelligent Document Processing

Introduction

Up to this point, companies that needed to extract data from documents and forms had two options: slow, labor-intensive manual entry or outdated, hard-to-customize optical character recognition software.

But if you have Appian, you have another option: intelligent document processing (IDP). With IDP, you can automate many of the most common tasks involving the documents your business receives. IDP is no longer a single application for document classification and extraction. Instead, it is a suite of capabilities within Appian that lets you automate the most labor-intensive parts of document management.

What's possible? We're glad you asked. With Appian, you can:

| If you want to... | Use | Additional details |
| --- | --- | --- |
| Apply optical character recognition (OCR) | Document extraction AI skill; Document classification AI skill | |
| Recognize handwriting | Document extraction AI skill; Document classification AI skill | |
| Recognize text in multiple languages | Document extraction AI skill; Document classification AI skill | |
| Extract text in multiple languages | Document extraction AI skill | See supported languages. |
| Integrate with bring-your-own OCR services | Integration object | Set up integration with extraction OCR engine. |
| Extract data from documents | Document extraction AI skill | |
| Extract data from tables | Document extraction AI skill | |
| Extract data from tables across multiple pages | Document extraction AI skill; PDF Tools plug-in | See Implementation patterns below. |
| Extract data from tables with multi-line rows | Document extraction AI skill | |
| Extract data from tables without grid lines | Document extraction AI skill | |
| Extract data from tables with merged cells | Document extraction AI skill | |
| Merge documents | Document classification AI skill; PDF Tools plug-in | See Implementation patterns below. |
| Split documents | Document classification AI skill; PDF Tools plug-in | See Implementation patterns below. |
| Convert documents | Image files: PDF Tools plug-in; HTML files: HTML to PDF plug-in; MS Word files: Dynamic Document Generator plug-in | Not recommended for use with Excel. See Using Excel with Appian. |
| Deskew documents | Document extraction AI skill | Deskewing occurs during OCR. The image shown to end users is not deskewed, but the text is identified accurately. |
| Capture documents | Image files: PDF Tools plug-in; HTML files: HTML to PDF plug-in; MS Word files: Dynamic Document Generator plug-in; uploading on an interface with a!fileUploadField(); using a Document Generation smart service; receiving a binary or Base64 document through an integration; from a robotic task using document actions | Not recommended for use with Excel. See Using Excel with Appian. |
| Implement capture rules, such as thresholds to accept a document (including quality) | Document extraction AI skill; Expression rules | See Implementation patterns below. |
| Validate documents | Document extraction AI skill; Expression rules | See Implementation patterns below. |
| Adjust images | Document extraction AI skill | Occurs during OCR. The image isn't updated for end users, but the text is accurately identified. |
| Classify documents | Document classification AI skill | |
| Manage metadata | Edit Document Properties smart service; Records | Can be used in conjunction with AI skills. |
| Store documents | Document folders | |
| Apply document retention rules | Delete Document | Customers can control how long documents are stored in the Appian platform. |
| Apply rules for archiving | Records; Folder properties | |
| Apply legal holds | Records; Folder properties | Configure security on any records or folders containing legal data. |
| Search documents using metadata | Integration object | |

Implementation patterns

Extract data from tables across multiple pages

To extract data from tables that span multiple pages in a document:

  1. Use the PDF Tools plug-in to split the document into individual pages.
  2. Use the document extraction AI skill to extract table data from each page.
  3. Combine the results using post-processing logic.
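
The post-processing in step 3 depends on how your tables are laid out. As a rough, language-agnostic sketch (written in Python rather than as an Appian expression, and not an Appian API), the function below concatenates per-page rows in page order and skips header rows that repeat on later pages. The row shape (a dictionary keyed by column name) and the name `combine_page_tables` are assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, str]


def combine_page_tables(pages: List[List[Row]]) -> List[Row]:
    """Concatenate per-page rows in page order, skipping repeated header rows."""
    combined: List[Row] = []
    for page_rows in pages:
        for row in page_rows:
            if not row:
                # Skip empty rows produced by blank table lines.
                continue
            # Assumption: a repeated header appears as a row whose values
            # equal its own column names (ignoring case and whitespace).
            is_header = all(
                value.strip().lower() == key.strip().lower()
                for key, value in row.items()
            )
            if not is_header:
                combined.append(row)
    return combined


# Hypothetical extraction results for a two-page table.
pages = [
    [{"Item": "Widget", "Qty": "2"}, {"Item": "Gadget", "Qty": "1"}],
    [{"Item": "Item", "Qty": "Qty"}, {"Item": "Bolt", "Qty": "10"}],
]
combined = combine_page_tables(pages)
```

In Appian, the equivalent logic would typically live in an expression rule that runs after extraction has completed for every page.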

Merge documents

When documents are received in packets containing multiple document types:

  1. Split the file into individual pages.
  2. Classify each page into its appropriate document type using the document classification AI skill.
  3. Combine like pages into new files.
  4. Send the newly composed files to the document extraction AI skill created for the corresponding document type.
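
Step 3 amounts to grouping page numbers by their classified type before composing new files. A minimal sketch of that grouping follows, in Python for illustration only. The input shape (pairs of page number and document type) and the name `group_pages_by_type` are hypothetical, and the actual file composition would be handled by the PDF Tools plug-in.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_pages_by_type(classified: List[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Map each document type to the sorted page numbers classified as that type."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for page_number, doc_type in classified:
        groups[doc_type].append(page_number)
    # Sort so each composed file keeps the original page order.
    return {doc_type: sorted(page_numbers) for doc_type, page_numbers in groups.items()}


# Hypothetical classification results for a four-page packet.
classified_pages = [(1, "invoice"), (2, "receipt"), (3, "invoice"), (4, "receipt")]
merge_plan = group_pages_by_type(classified_pages)
```

Each entry in `merge_plan` describes one new file to compose, for example pages 1 and 3 for the invoice type.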

Split documents

When documents are received in packets containing multiple document types:

  1. Split the file into individual pages.
  2. Classify each page into its appropriate document type using the document classification AI skill.
  3. Send the split files to the document extraction AI skill created for the corresponding document type.
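
The routing in step 3 can be pictured as a lookup from document type to extraction skill. The sketch below is illustrative Python, not Appian configuration. The skill names, the `Manual Review` fallback, and the function `route_pages` are all assumptions. In particular, the source does not say how to handle pages with an unexpected type, so the fallback is one possible choice.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from classified document type to extraction skill name.
SKILL_BY_TYPE: Dict[str, str] = {
    "invoice": "Invoice Extraction",
    "receipt": "Receipt Extraction",
}


def route_pages(
    classified: List[Tuple[int, str]],
    skill_by_type: Dict[str, str],
    fallback: str = "Manual Review",
) -> List[Tuple[int, str]]:
    """Pair each page number with the skill for its type, or the fallback if none matches."""
    return [
        (page_number, skill_by_type.get(doc_type, fallback))
        for page_number, doc_type in classified
    ]


routes = route_pages([(1, "invoice"), (2, "receipt"), (3, "unknown")], SKILL_BY_TYPE)
```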

Implement capture rules

  1. Add the Extract from Document smart service to a process.
  2. Configure the Confidence Threshold input and Confidence Score output according to your requirements.
  3. Configure additional verification using expression rules or user input tasks to confirm documents near the confidence threshold meet your requirements.
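
One way to reason about step 3 is as a three-way decision: accept documents well above the threshold, send those near it to review, and reject the rest. The Python sketch below shows that logic under stated assumptions. It treats the confidence score as a value between 0 and 1, and the `review_margin` parameter and the function name `capture_decision` are hypothetical. Check the Extract from Document smart service documentation for the actual scale of the Confidence Score output.

```python
def capture_decision(confidence: float, threshold: float, review_margin: float = 0.1) -> str:
    """Return 'accept', 'review', or 'reject' for a document's confidence score."""
    # Assumption: scores are on a 0-to-1 scale.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= threshold + review_margin:
        return "accept"   # comfortably above the threshold
    if confidence >= threshold - review_margin:
        return "review"   # near the threshold: confirm with a rule or a user task
    return "reject"       # well below the threshold


decisions = [capture_decision(score, threshold=0.8) for score in (0.95, 0.75, 0.5)]
```

In a process model, the "review" branch would correspond to the expression rule or user input task described in step 3.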

Validate documents

  1. Extract data from the documents using the document extraction AI skill and Extract from Document smart service.
  2. To verify the completeness or accuracy of required fields, use expression rules or user input tasks.
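
A completeness check like the one in step 2 usually reduces to listing the required fields that came back empty. The sketch below shows the idea in Python. In Appian, this logic would be written as an expression rule. The field names and the function `missing_required_fields` are made up for illustration.

```python
from typing import Dict, List, Optional


def missing_required_fields(
    extracted: Dict[str, Optional[str]], required: List[str]
) -> List[str]:
    """Return the required field names that are absent, null, or blank."""
    return [
        name
        for name in required
        if not (extracted.get(name) or "").strip()
    ]


# Hypothetical extraction result with one blank, one null, and one absent field.
extracted = {"invoiceNumber": "INV-001", "total": " ", "vendor": None}
required = ["invoiceNumber", "total", "vendor", "dueDate"]
missing = missing_required_fields(extracted, required)
```

An empty result means every required field was populated. A non-empty result can be used to route the document to a user input task for correction.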
