Google has deprecated legacy versions of AutoML services, which directly impacts IDP's core functionality. Additionally, the IDP application was deprecated with Appian 23.2. Customers who wish to use the application will need to refactor plug-ins using AutoML.
Each organization's documents are unique. Appian's Intelligent Document Processing (IDP) is flexible enough to allow you to tailor the application to your organization's needs.
IDP comes out of the box with Invoice, Purchase Order, Claim, and Receipt document types. Your organization may need to process documents of a different type. This page provides instructions for Appian developers to add more document types. See Appian Document Extraction for more information on what types of documents work best for data extraction.
To enable managers to easily derive insights from the reporting in the Metrics tab, we recommend limiting the number of document types to four, excluding the invalid document type. If you need more than four document types, we recommend adding another document channel.
Note: These instructions are specific to MySQL databases. If you use a different database, you may need to modify the steps.
If you want to modify the fields of existing document types, refer to Modifying Fields for Document Types.
The `dudoctype` reference table stores all of the document types used in IDP. To add a document type, you must first add a row to this database table to define the new document type.
To add the new document type to the reference table:
Add a row to the `dudoctype` table with the following column values:
- `doctypeid`: Set this to `NULL`. When set to `NULL`, an auto-incrementing ID will automatically be assigned.
- `doctypename`: The document type name. This name must be unique among all document type names.
- `choiceindex`: The index that is referenced in expression rules to select the corresponding document type and data store entity. The `choiceindex` for a newly added document type must be one more than the highest `choiceindex` for that `channelid`.
- `doctypestatus`: Initially, set this to `Inactive`. When you configure the application, you will select which document types you want to process. This column updates to `Active` for document types that are selected.
- `isinvalidtype`: Used to identify the Invalid document type. Unless you are adding an Invalid document type for a new document channel, set this to `0`. There can be only one invalid document type for each document channel.
- `channelid`: The value from the `channelid` column of the `dudocchannel` reference table. Use the value for the document channel that you are adding the document type to. For the Standard channel that comes out of the box, this value is `1`.
Let's say your organization needs to process medical device licenses and you want to add this document type to the first document channel. The highest `choiceindex` for channel ID `1` is `4`, which means the `choiceindex` for the new document type would be `5`. If you were adding the new document type to channel ID `2` instead, the `choiceindex` would be `4` because the highest `choiceindex` for channel ID `2` is `3`.
doctypeid | doctypename | choiceindex | doctypestatus | isinvalidtype | channelid |
---|---|---|---|---|---|
0 | NULL | 0 | Active | 1 | 1 |
1 | Invoice | 1 | Active | 0 | 1 |
2 | Purchase Order | 2 | Active | 0 | 1 |
3 | Claim | 3 | Active | 0 | 1 |
4 | Receipt | 4 | Active | 0 | 1 |
5 | NULL | 0 | Active | 1 | 2 |
6 | Partnership Application | 1 | Active | 0 | 2 |
7 | Statement of Work | 2 | Active | 0 | 2 |
8 | Retirement Application | 3 | Active | 0 | 2 |
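The rule for picking the next `choiceindex` can be sketched in a few lines of Python. This is only an illustration: it assumes the rows above have been copied into a list of (`choiceindex`, `channelid`) pairs, and the function name `next_choice_index` is hypothetical rather than part of IDP.

```python
# (choiceindex, channelid) pairs copied from the dudoctype table above.
ROWS = [
    (0, 1), (1, 1), (2, 1), (3, 1), (4, 1),  # channel 1
    (0, 2), (1, 2), (2, 2), (3, 2),          # channel 2
]


def next_choice_index(rows, channel_id):
    """Return one more than the highest choiceindex used in the channel."""
    used = [choice for choice, channel in rows if channel == channel_id]
    if not used:
        raise ValueError(f"no document types found for channel {channel_id}")
    return max(used) + 1


print(next_choice_index(ROWS, 1))  # 5
print(next_choice_index(ROWS, 2))  # 4
```

The values match the example: a new document type gets `5` in channel 1 and `4` in channel 2.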
To add a new document type called `Medical Device License`, update the `dudoctype` table by executing a database command like the following. Note that this example uses MySQL syntax.
```sql
INSERT INTO `dudoctype` (`doctypeid`, `doctypename`, `choiceindex`, `doctypestatus`, `isinvalidtype`, `channelid`)
VALUES (NULL, 'Medical Device License', 5, 'Inactive', 0, 1);
```
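If you would rather not hard-code the `choiceindex`, it can be computed in the same statement. The sketch below rests on several assumptions and is not part of IDP: it uses Python's built-in `sqlite3` module with an in-memory table as a stand-in for the MySQL database, the table definition is simplified, and the helper name `add_doc_type` is hypothetical. The SQL may need adjusting for your database.

```python
import sqlite3


def add_doc_type(conn, name, channel_id):
    """Insert an Inactive document type with the next choiceindex for its channel."""
    conn.execute(
        """
        INSERT INTO dudoctype
            (doctypeid, doctypename, choiceindex, doctypestatus, isinvalidtype, channelid)
        SELECT NULL, ?, MAX(choiceindex) + 1, 'Inactive', 0, ?
        FROM dudoctype
        WHERE channelid = ?
        """,
        (name, channel_id, channel_id),
    )
    conn.commit()


# Simplified stand-in for the real table (assumed column types).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dudoctype (
        doctypeid INTEGER PRIMARY KEY,
        doctypename TEXT,
        choiceindex INTEGER,
        doctypestatus TEXT,
        isinvalidtype INTEGER,
        channelid INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO dudoctype VALUES (?, ?, ?, ?, ?, ?)",
    [
        (0, None, 0, "Active", 1, 1),
        (1, "Invoice", 1, "Active", 0, 1),
        (2, "Purchase Order", 2, "Active", 0, 1),
        (3, "Claim", 3, "Active", 0, 1),
        (4, "Receipt", 4, "Active", 0, 1),
        (5, None, 0, "Active", 1, 2),
        (6, "Partnership Application", 1, "Active", 0, 2),
        (7, "Statement of Work", 2, "Active", 0, 2),
        (8, "Retirement Application", 3, "Active", 0, 2),
    ],
)

add_doc_type(conn, "Medical Device License", 1)
row = conn.execute(
    "SELECT choiceindex, doctypestatus FROM dudoctype WHERE doctypename = ?",
    ("Medical Device License",),
).fetchone()
print(row)  # (5, 'Inactive')
```

Computing the index from the table avoids a mismatch if another document type was added to the channel since you last checked.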
In order to create a new document type, you will need to create a custom data type (CDT) with fields for the information you want to extract from the new document type.
See Appian Document Extraction to learn more about best practices for creating a new CDT and setting up the CDT fields for document extraction.
To create a CDT for the new document type:
1. In the Intelligent Document Processing (IDP) application, create a CDT for the new document type by duplicating one of the original CDTs.
   Tip: Why should you duplicate an existing data type? Fields of type `Text` automatically use `VARCHAR(255)` for the column definition in the associated database table. We recommend updating the column definition in the XSD to use `text`, which has a much larger character limit and prevents problems when writing more than 255 characters to the table. Instead of editing the XSD, it is easier to duplicate an original CDT.
2. Namespace: `urn:com:appian:types:DU`.
3. Name: `DU_<DocumentType>`. Replace `<DocumentType>` with the name of the new document type. For example, `DU_MedicalDeviceLicense`.
4. [...] `Text`. If your document contains a checkbox, create a field for every checkbox option, even mutually exclusive options. Select `Boolean` as the type for each one.
5. [...] `id` field.
6. [...] `text`.
Note: If you are duplicating a nested CDT for table extraction, you'll need to manually modify the table name in the XSD file. Download the XSD and modify the table name (annotated with `@Table`) within it. Save the file, click Create New Version from XSD in the settings menu, and upload the new version of the XSD.
To create a new Medical Device License document type, you would create a new CDT with the name `DU_MedicalDeviceLicense`. In this example, we duplicated the `DU_Invoice` data type.
In that CDT, you would update the fields with the information you want to extract from a medical device license, such as `licenseNumber`, `licenseType`, and `issueDate`.
Now that you have created the CDT, you must create a data store entity in the `DU Data Store` and verify the data store to create a new database table.
See Managing Data Stores for more information about editing data stores.
To create a data store entity and verify the data store:
1. In the Intelligent Document Processing (IDP) application, open the `DU Data Store` object.
2. Add a data store entity for the new CDT.
3. Verify the data store to create the new database table.
To create the data store entity and table for the Medical Device License document type, add it to the `DU Data Store` object using the `DU_MedicalDeviceLicense` CDT and verify the data store.
In order to refer to the data store entity in other Appian objects, you will need to create a new constant that points to the data store entity that you just created for the new document type.
To create a constant for the new data store entity:
1. In the Intelligent Document Processing (IDP) application, create a new constant.
2. Name the constant with `DSE` (for data store entity) at the end of the name. For example, `DU_NEW_DOC_TYPE_DSE`.
3. For the Data Store, select `DU Data Store`.
4. For the data store entity, select the entity you created for the new document type.
5. Select the `DU Rules and Constants` folder and click CREATE.
For the Medical Device License document type, you would create a constant and name it `DU_MEDICAL_DEVICE_LICENSE_DSE`, select `DU Data Store` for the Data Store, select `DU_MedicalDeviceLicense` for the data store entity, and click CREATE.
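The naming conventions used on this page (`DU_<DocumentType>` for the CDT and a `DSE` suffix for the constant) can be sketched as two small Python helpers. The function names are hypothetical, and the sketch assumes the document type name is already written as capitalized words separated by spaces.

```python
def cdt_name(doc_type):
    """Build the CDT name, e.g. 'Purchase Order' -> 'DU_PurchaseOrder'."""
    return "DU_" + "".join(doc_type.split())


def dse_constant_name(doc_type):
    """Build the constant name, e.g. 'Purchase Order' -> 'DU_PURCHASE_ORDER_DSE'."""
    return "DU_" + "_".join(word.upper() for word in doc_type.split()) + "_DSE"


print(cdt_name("Medical Device License"))           # DU_MedicalDeviceLicense
print(dse_constant_name("Medical Device License"))  # DU_MEDICAL_DEVICE_LICENSE_DSE
```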
After you create the objects for the new document type (the CDT and the data store entity constant), you need to update existing expression rules to use them. This allows the new document type to be used in the application.
The `DU_returnDataTypeForChoiceIndex` expression rule returns the CDT for a document type. Given an index, it returns the CDT that matches the index in an array of CDTs. It is used to dynamically invoke the correct CDT for the document type when performing the reconciliation task.
In order for it to return the custom data type for the new type of document, you will need to add the CDT you created to the expression rule.
To add the new CDT to the expression rule:
1. Open the `DU_returnDataTypeForChoiceIndex` expression rule.
2. In the `choose()` function, add the new CDT as the last item in the array, using the `'type!{urn:com:appian:types:DU}DU_NewDocumentType'()` convention.
To update the `DU_returnDataTypeForChoiceIndex` expression rule for the Medical Device License document type, you would add the `type!DU_MedicalDeviceLicense()` CDT as the last item in the `choose()` function.
Tip: Entering `type!DU_` and selecting `DU_MedicalDeviceLicense` from the auto-suggest list will automatically convert `type!DU_MedicalDeviceLicense()` to `'type!{urn:com:appian:types:DU}DU_MedicalDeviceLicense'()`.
```
if(
  rule!DU_isInvalidDocTypeId(docTypeId: ri!choiceIndex),
  {},
  choose(
    ri!channelId,
    /*Channel ID 1*/
    choose(
      ri!choiceIndex,
      'type!{urn:com:appian:types:DU}DU_Invoice'(),
      'type!{urn:com:appian:types:DU}DU_PurchaseOrder'(),
      'type!{urn:com:appian:types:DU}DU_Claim'(),
      'type!{urn:com:appian:types:DU}DU_Receipt'(),
      /* New document type, added as the last item */
      'type!{urn:com:appian:types:DU}DU_MedicalDeviceLicense'()
    )
  )
)
```
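To see why the new CDT has to be the last item, it can help to model the lookup outside Appian. The Python sketch below is an illustration only: `choose` here is a hypothetical stand-in for a 1-based positional lookup, the type names are plain strings, and the Python error simply stands in for whatever failure the real rule would report.

```python
def choose(key, *choices):
    """Return the choice at 1-based position `key`."""
    if not 1 <= key <= len(choices):
        raise IndexError(f"choose() key {key} is outside 1..{len(choices)}")
    return choices[key - 1]


def data_type_for(choice_index, types):
    """Look up the CDT name registered at the given choiceindex."""
    return choose(choice_index, *types)


CHANNEL_1_TYPES = ["DU_Invoice", "DU_PurchaseOrder", "DU_Claim", "DU_Receipt"]

print(data_type_for(1, CHANNEL_1_TYPES))  # DU_Invoice

# Before the rule is updated, choiceindex 5 has nothing to return.
try:
    data_type_for(5, CHANNEL_1_TYPES)
except IndexError as err:
    print("before update:", err)

# Appending the new CDT makes position 5 line up with choiceindex 5.
updated = CHANNEL_1_TYPES + ["DU_MedicalDeviceLicense"]
print(data_type_for(5, updated))  # DU_MedicalDeviceLicense
```

Because the position in the array is the only link to the `choiceindex` column, inserting the new item anywhere other than the end would shift the existing document types.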
The `DU_returnDataStoreEntityForChoiceIndex` expression rule returns the data store entity for a document type. Given an index, it returns the data store entity that matches the index in an array of data store entities. It is used to dynamically invoke the correct data store entity when writing or querying the document data.
In order for it to return the data store entity for the new type of document, you will need to add the new data store entity constant to the expression rule.
To add the new data store entity to the expression rule:
1. Open the `DU_returnDataStoreEntityForChoiceIndex` expression rule.
2. In the `choose()` function, add the new constant for the data store entity as the last item in the array.
To update the `DU_returnDataStoreEntityForChoiceIndex` expression rule for the Medical Device License document type, you would add the `cons!DU_MEDICAL_DEVICE_LICENSE_DSE` constant as the last item in the `choose()` function.
```
choose(
  ri!choiceIndex,
  cons!DU_INVOICE_DSE,
  cons!DU_PURCHASE_ORDER_DSE,
  cons!DU_CLAIM_DSE,
  cons!DU_RECEIPT_DSE, /* name assumed; use your Receipt data store entity constant */
  cons!DU_MEDICAL_DEVICE_LICENSE_DSE
)
```
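Since both expression rules depend on array position, a quick consistency check between the `dudoctype` rows and the rule contents can catch a missed update. The following Python sketch is hypothetical: `find_unmapped` is not an IDP function, the rows are typed in by hand, and `DU_RECEIPT_DSE` is an assumed constant name.

```python
def find_unmapped(doc_types, rule_entries):
    """Return names of document types whose choiceindex has no rule entry.

    doc_types: (doctypename, choiceindex, isinvalidtype) rows for one channel.
    rule_entries: the items listed after ri!choiceIndex in the choose() call.
    """
    return [
        name
        for name, choice_index, is_invalid in doc_types
        if not is_invalid and choice_index > len(rule_entries)
    ]


# Channel 1 rows after the new document type has been added to dudoctype.
CHANNEL_1 = [
    (None, 0, 1),  # Invalid document type, skipped by the check
    ("Invoice", 1, 0),
    ("Purchase Order", 2, 0),
    ("Claim", 3, 0),
    ("Receipt", 4, 0),
    ("Medical Device License", 5, 0),
]

# Rule contents before the update (DU_RECEIPT_DSE is an assumed name).
entries = ["DU_INVOICE_DSE", "DU_PURCHASE_ORDER_DSE", "DU_CLAIM_DSE", "DU_RECEIPT_DSE"]
print(find_unmapped(CHANNEL_1, entries))  # ['Medical Device License']

entries.append("DU_MEDICAL_DEVICE_LICENSE_DSE")
print(find_unmapped(CHANNEL_1, entries))  # []
```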
Before you upload any documents of the new type to IDP, you will need to configure the application to use the new document type.
IDP provides a method for updating the configuration of the application in the Configure tab. See the Configuring IDP page for instructions on how to edit the configuration of the application. You will need to activate the new document type and upload example documents of the new document type to train the classification machine learning model.
After the training is complete, you can start processing documents of the new type.
Note: Do not upload documents for your new document type for processing in IDP until after training is complete. The AI classification model will perform poorly on these document types and users will not be able to correct the classification until the training is finished.
Adding a Document Type