Estimated time to complete this tutorial: 1 hour
User experience level: Beginner
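In this tutorial, you'll build two AI skills: a document classification skill that identifies each incoming document as an invoice or a purchase order, and a document extraction skill that extracts data from invoices.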
By delegating these tasks to AI, the process proceeds much faster and with a higher degree of accuracy than if this classification were done manually by a person.
Tip: Looking to start with just document extraction? Check out the Document Extraction Tutorial to build a process that extracts data from documents and writes that data to records.
Acme Logistics is a shipping and receiving company that manages inventory for its customers. In addition to physical items, Acme has to manage and act on documents such as invoices and purchase orders. Different teams act on different document types: the Fulfillment department reviews purchase orders, while an Appian process extracts data from invoices. Customers and vendors submit these documents through Acme's website, so all documents have one submission channel.
This large bucket of documents presents a challenge for Acme because different departments use each of these documents. It takes time to analyze each incoming document and know which department should be notified, so Acme is eager to see how Appian can help.
Acme's current intake process entails more than just classifying documents. First, a vendor submits a form with documents attached. Then, an analyst receives a notification that new documents have been submitted and need to be evaluated. The analyst manually reviews each document and forwards it to the appropriate department, repeating the process for every document that comes in. This takes at least 2-3 minutes per document, meaning the average analyst can review at most 20-30 documents an hour. However, analysts have other important tasks to work on, so Acme is eager to find a way to speed up the review and notification process.
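In this tutorial, you're building a process to replace most of the manual work that the analyst has to do. To automate this process in Appian, you'll need to create and train the AI skills, set up a record type and folder to store submitted documents, build the interfaces, and create a process model that classifies each document and routes it down the appropriate path.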
This tutorial is designed to be used with Appian 23.2 and later.
This tutorial assumes you have an Appian application created already. We'll walk you through creating each of the design objects you need to automate document classification.
Tip: Objects in this tutorial use the AL prefix. If you're creating objects in an application that uses a different prefix, use your application's prefix in new object names.
Acme analysts spend too much time reviewing incoming documents to determine which department each one should go to. To support a large business, we want to save time by automating part of their manual process: classifying documents attached to a form submission.
Before the AI skill can serve its purpose, it needs to learn a lot about the documents your business encounters. One of your first steps should be to build a complete and representative dataset to train the model. The model can only learn from the documents you provide to it, so it's important to have a large number and variety of realistic examples.
Acme most commonly receives two types of documents: invoices and purchase orders.
We've provided sample Invoices and Purchase Orders for you. Download these files to your computer, since you'll use them to train the AI skill in a later step. Unzip the compressed folders, as you'll need to upload the documents individually and not as a ZIP file.
Acme has clearly defined the use case and collected example documents, so you're ready to build a custom machine learning model in Appian using the AI skill design object. Yes, it's really that straightforward!
Configure the following properties:
Property | Value |
---|---|
Name | AL_IncomingDocument |
Description (Optional) | AI skill to classify documents Acme receives. |
In this step, you'll begin telling the skill about each type of document that is typically attached to a form submission.
Earlier, you identified that most claims have two types of documents attached: invoices and purchase orders.
You'll need to train the model to recognize each of these documents. To do that, you'll first create a document type and add examples of those documents.
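To create a document type:
Create one document type named Invoice and add the sample invoices to it.
Create a second document type named Purchase Order and add the sample purchase orders to it.
Training takes a few minutes to complete. While you wait, create the document extraction AI skill.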
Configure the following properties:
Property | Value |
---|---|
Name | AL_ExtractInvoice |
Description (Optional) | AI skill to extract data from invoices Acme receives. |
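Add the following seven fields, selecting Text as the type for all seven:
Field name | Data type |
---|---|
companyName | Text |
companyEmail | Text |
companyPhone | Text |
companyAddress | Text |
invoiceNumber | Text |
date | Text |
total | Text |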
Tip: You're only going to define the document structure for invoices, because Appian will extract data from invoices. You're not using Appian to extract data from purchase orders, so you don't need to create a document extraction AI skill for purchase orders.
After you've given the document classification model a few minutes to learn how to recognize the document types, you're ready to review training metrics.
When the model is finished training, it returns multiple metrics on how well it is able to classify your documents. Click View Additional Metrics to see this data.
There is no single metric that can tell you that your model is ready to use in production. Instead, you'll want to evaluate this data based on your use case and consideration of how your process might be impacted.
For this tutorial, you're building a model to classify two types of documents (listed above). Acme receives invoices and purchase orders in equal numbers. However, the documents are reviewed by human beings at different stages in their respective processes. Data is extracted from invoices and saved to an interface, then a person reviews it. Purchase orders are reviewed immediately.
It's not very disruptive if an invoice is misclassified as a purchase order because a person will notice and correct it early. However, if a purchase order is misclassified as an invoice, it can cause some issues in that process. Therefore, you're paying special attention to how often the model predicts Invoice for a document that is really a purchase order. Each such mistake lowers the precision for invoices (fewer of the documents predicted as invoices really are invoices) and the recall for purchase orders (fewer of the actual purchase orders are identified as such), so those are the two metrics to watch most closely.
The training metrics reveal that the model is able to accurately classify each document type most of the time. Acme is concerned about purchase orders being misclassified as invoices. So, you want to dive deeper into how that document type is performing.
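To make the link between one kind of mistake and the per-type metrics concrete, here is a minimal Python sketch that computes precision and recall from a confusion matrix. It runs outside Appian. The counts are hypothetical and are not taken from Acme's model or from Appian's metrics screen.

```python
# Hypothetical counts for illustration only.
# confusion[actual][predicted] = number of documents
confusion = {
    "Invoice": {"Invoice": 48, "Purchase Order": 2},
    "Purchase Order": {"Invoice": 5, "Purchase Order": 45},
}


def precision(confusion, label):
    """Of the documents predicted as `label`, the fraction that really are `label`."""
    predicted = sum(row[label] for row in confusion.values())
    return confusion[label][label] / predicted if predicted else 0.0


def recall(confusion, label):
    """Of the documents that really are `label`, the fraction predicted as `label`."""
    actual = sum(confusion[label].values())
    return confusion[label][label] / actual if actual else 0.0


for label in confusion:
    print(
        f"{label}: precision={precision(confusion, label):.3f}, "
        f"recall={recall(confusion, label):.3f}"
    )
```

In this made-up example, the five purchase orders predicted as invoices reduce both the invoice precision (48 of 53) and the purchase order recall (45 of 50), while the other two metrics depend only on the two invoices predicted as purchase orders.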
See Evaluate Model Performance to learn more about each metric and when you might be interested in them.
Once the model training returns metrics that meet your requirements, you'll publish that version of the model to make it available to use in the Classify Documents smart service in a process model.
That's it! Now the model can be called from a smart service. So let's build the process that will use the published model in this AI Skill.
This section walks through each part of building the classification process.
Before you begin to accept documents, you'll need a place to store them. Acme wants to store document data in a record type. So let's make one!
In Create Record Type, configure the following properties:
Property | Value |
---|---|
Name | AL Document |
Display Name (Plural) | Documents |
Description | A record type to store data on documents uploaded to Acme Logistics. |
On the Create Data Model page, keep the default settings for the following fields:
Field name | Data type |
---|---|
id | Number (Integer) |
createdBy | User |
createdOn | Date and Time |
modifiedBy | User |
modifiedOn | Date and Time |
Click NEW FIELD to create two new fields in the model:
Field name | Data type |
---|---|
documentId | Number (Integer) |
documentName | Text |
If you plan to use this record type outside of this tutorial, we recommend downloading the database script.
To keep things organized, create a folder in your application to store the document files.
Configure the following properties:
Property | Value |
---|---|
Type | Document Folder |
Name | AL Uploaded Documents |
Description | Folder containing documents submitted via Acme's website. |
Parent Folder | AL Knowledge Center |
To reference the folder in your interface, you'll need to create a constant.
Configure the following properties:
Property | Value |
---|---|
Name | AL_UPLOADED_DOCUMENTS |
Description (optional) | Constant referencing the AL Uploaded Documents folder. |
Type | Document or Folder |
Value | AL Uploaded Documents |
Acme's customers submit a form to begin the claims process. You can create an interface to collect and save all of the necessary information, including documents.
Configure the following fields:
Property | Value |
---|---|
Name | AL_IntakeForm |
Description (Optional) | Interface to allow vendors to upload documents. |
Save In | Select the Rules & Interfaces folder in your application. |
Click New Rule Input and configure the following fields:
Property | Value |
---|---|
Name | record |
Description (Optional) | Leave blank. |
Type | AL Document (record type) |
Click New Local Variable and configure the following fields:
Property | Value |
---|---|
Name | document |
Value | Leave blank. |
This variable will temporarily store your document as you work through the form.
Click CREATE AND ADD ANOTHER.
This variable will format your data so it saves properly into the record rule input when you click Submit.
Property | Value |
---|---|
Name | toSave |
Value | See below. |
Tip: This local variable maps data to record type fields, so you'll use the recordType! domain in the definition. You'll need to manually type the field references in order to reference the UUID for the record type you created earlier. We've included the excerpt here for your reference.
a!forEach(
  items: local!document,
  expression: {
    recordType!AL Document(
      recordType!AL Document.fields.createdBy: loggedInUser(),
      recordType!AL Document.fields.documentId: fv!item,
      recordType!AL Document.fields.documentName: "User uploaded document",
      recordType!AL Document.fields.createdOn: now(),
      recordType!AL Document.fields.modifiedBy: null,
      recordType!AL Document.fields.modifiedOn: null
    )
  }[1]
)
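For readers less familiar with a!forEach, the mapping above can be sketched in Python as building one record-like dictionary per uploaded document. This is only an analogy: the function name, the sample document ID, and the username are hypothetical, and a plain dictionary stands in for the AL Document record type.

```python
from datetime import datetime


def build_records(document_ids, current_user):
    """Build one record-like dict per uploaded document ID (mirrors the loop above)."""
    return [
        {
            "createdBy": current_user,
            "documentId": doc_id,  # plays the role of fv!item
            "documentName": "User uploaded document",
            "createdOn": datetime.now(),
            "modifiedBy": None,
            "modifiedOn": None,
        }
        for doc_id in document_ids
    ]


# Hypothetical document ID and username, for illustration only.
records = build_records([101], current_user="vendor.user")
print(records[0]["documentId"], records[0]["documentName"])
```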
Submit to Acme Logistics
In the COMPONENT CONFIGURATION, configure the following properties:
Property | Value |
---|---|
Label Position | Hidden |
Display Value | Click Edit as Expression and enter: "Thank you for contacting Acme! Upload your document and we'll be in touch" |
Read-only | Checked |
In the COMPONENT CONFIGURATION:
Property | Value |
---|---|
Target Folder | AL_UPLOADED_DOCUMENTS |
Selected Files | local!document |
Save Files To | local!document |
Click the Submit button component to configure it:
Property | Value |
---|---|
Save Value To | Click Edit as Expression and enter: a!save(target: ri!record, value: local!toSave) |
If the incoming documents are meant for the fulfillment department, you want to present them with the data they need. In this step, you'll create a form to display as part of the user input task when a purchase order comes into the Acme website.
Configure the following fields:
Property | Value |
---|---|
Name | AL_FulfillmentTask |
Description (Optional) | Interface to show purchase order data for Fulfillment Dept. |
Save In | Select the interfaces folder in your application. |
Click New Rule Input and configure the following fields:
Property | Value |
---|---|
Name | record |
Description (Optional) | Leave blank. |
Type | AL Document (record type) |
In the COMPONENT CONFIGURATION, configure the following properties:
Property | Value |
---|---|
Label Position | Hidden |
Display Value | Click Edit as Expression and enter: "Someone submitted a purchase order via Acme's website. Review the document to confirm its validity." |
Read-only | Checked |
In the COMPONENT CONFIGURATION, configure the following properties:
Property | Value |
---|---|
Label Position | Hidden |
Document | Click Edit as Expression and enter: ri!record > fields > documentId. Appian automatically formats this selection so the expression appears as: ri!record[AL Document.documentId]. |
A process model is the primary tool in Appian for building a process. For this tutorial, our process model will incorporate the AI Skill we created earlier, but first we need to create the design object.
To create this process model:
Configure the following properties:
Property | Value |
---|---|
Name | AL Claim Intake |
Description (Optional) | Process to collect and verify incoming claims from Acme customers. |
The process kicks off when a user submits the start form. Configure the Start node to use the form you created:
Enter AL, then select AL_IntakeForm when it displays in the dropdown list.
To save the data that your customers input into the start form, you'll want to write records next. This will save all data, including the documents they attached to the claim.
This takes the data stored in the record rule input in the interface, passes it to the record process variable, and saves it to the AL Document record type.
Now that we have document data saved, we can pass it to the AI skill through the Classify Documents smart service. The smart service will return predictions for each document type, grouped by the model's confidence in those predictions.
To add the Classify Documents smart service:
Select the AL_IncomingDocument AI skill.
Enter 90 to indicate your threshold for the model's confidence in its prediction.
Verify the following parameters for the new process variable:
Property | Value |
---|---|
Name | AboveThreshold |
Type | ClassificationResult |
Value | Leave empty |
Verify the following parameters for the new process variable:
Property | Value |
---|---|
Name | BelowThreshold |
Type | ClassificationResult |
Value | Leave empty |
Verify the following parameters for the new process variable:
Property | Value |
---|---|
Name | Failed |
Type | ClassificationResult |
Value | Leave empty |
This way, the results of document classification are saved into process variables and can be referenced later in the process. If we don't save the outputs of this node, the values can't be referenced in the following gateways.
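As a rough mental model of how the three outputs relate to the threshold, the following Python sketch sorts predictions into above-threshold, below-threshold, and failed groups. It is not Appian's implementation. The dictionary layout, the 0-100 confidence scale, the treatment of a prediction exactly at the threshold, and the use of a missing confidence to represent a failure are all assumptions made for illustration.

```python
def split_by_threshold(predictions, threshold=90):
    """Group predictions by confidence relative to the threshold (assumed semantics)."""
    above, below, failed = [], [], []
    for prediction in predictions:
        confidence = prediction.get("confidence")
        if confidence is None:
            # Assumption: no confidence means the document could not be classified.
            failed.append(prediction)
        elif confidence >= threshold:
            # Assumption: a prediction exactly at the threshold counts as above it.
            above.append(prediction)
        else:
            below.append(prediction)
    return above, below, failed


# Hypothetical predictions, for illustration only.
predictions = [
    {"document": "doc-1", "type": "Invoice", "confidence": 97},
    {"document": "doc-2", "type": "Purchase Order", "confidence": 62},
    {"document": "doc-3", "type": None, "confidence": None},
]
above, below, failed = split_by_threshold(predictions)
print(len(above), len(below), len(failed))
```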
If the model's prediction isn't up to our standards (that is, the model isn't very confident in its predictions), we want to notify a human being so they can double-check the classification. But if the model is confident, we're glad to let the process finish by notifying the proper team.
We'll build logic into the process using gateways. Depending on the result of an expression, we can instruct the process to follow one of any number of paths.
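From here to the end of the tutorial, you'll be building three paths: one that emails an analyst when a prediction falls below the threshold, one that extracts data from invoices, and one that notifies the Fulfillment department about a new purchase order.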
Let's get started on adding the gateway and the three paths. Then, we can configure the XOR gateway to route to each one when appropriate.
Add an XOR gateway to the process model.
You'll configure the XOR node's pathways after you add the remaining nodes.
We want the analyst to receive an email if any predictions fall below the threshold, so they can review the document manually.
Add a Send Email node and name it Notify Analyst.
Email Configuration:
Property | Value |
---|---|
From | Select Process Model |
To | Type or search for the name of the analyst to receive the task. For demonstration purposes, you can use your own email address. |
Subject | "For Review: Document type prediction below threshold." |
Message Body:
Enter the following:
= "Dear Analyst, the following document type was predicted with a confidence level below your threshold. Review the prediction to confirm it's correct." & char(10),
char(10),
"Predicted type: " & pv!BelowThreshold.type,
char(10),
char(10),
a!linkField(
  label: "Download document",
  labelPosition: "ABOVE",
  links: {
    a!documentDownloadLink(
      label: "Download",
      document: pv!BelowThreshold.classifiedItem
    )
  }
)
Add the Extract from Document smart service and select the AL_ExtractInvoice AI skill.
Configure the following parameters for the new process variable:
Property | Value |
---|---|
Name | ExtractedData |
Type | Text |
Value | Leave empty |
Note: For the purpose of this tutorial, we aren't going to write the extracted data to a record type or other data source. But in a real-world business process, you'd add another node in the process to save this data for future use.
Add a User Input Task node.
In the General tab, configure the following parameters:
Property | Value |
---|---|
Name | Notify User |
Description | Notifies the fulfillment department that there is a new purchase order to review. |
Task Display Name | New purchase order received |
Default Task Priority | Normal |
Select the AL_FulfillmentTask interface as the form for the task.
Now that we have built the three possible paths, we can direct the process model to any of them based on the results of the Classify Documents smart service.
In the General tab, configure the following parameters:
Property | Value |
---|---|
Name | Classification Result? |
Description | Checks the result of document classification |
Click NEW CONDITION and configure the following parameters:
Property | Value |
---|---|
Condition | Open the Expression editor and enter a!isNotNullOrEmpty(pv!BelowThreshold) |
Result | Select the Notify Analyst node. |
Path Label | Low confidence |
Click NEW CONDITION and configure the following parameters:
Property | Value |
---|---|
Condition | Open the Expression editor and enter pv!AboveThreshold.type = "Invoice" |
Result | Select the Extract from Document node. |
Path Label | Invoice |
Click NEW CONDITION and configure the following parameters:
Property | Value |
---|---|
Condition | Open the Expression editor and enter pv!AboveThreshold.type = "Purchase Order" |
Result | Select the Notify User node. |
Path Label | Purchase Order |
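The three conditions above can be read as an ordered decision, sketched here in Python. This is an illustration rather than Appian code, and it assumes the gateway checks conditions in the order listed and follows the first one that is true. The function name and arguments are hypothetical stand-ins for the process variables.

```python
def route(below_threshold, above_threshold_type):
    """Return the name of the node the gateway would route to (assumed first-match order)."""
    if below_threshold:  # stands in for a!isNotNullOrEmpty(pv!BelowThreshold)
        return "Notify Analyst"
    if above_threshold_type == "Invoice":
        return "Extract from Document"
    if above_threshold_type == "Purchase Order":
        return "Notify User"
    return None  # no condition matched; the tutorial does not define this case


print(route([], "Invoice"))
print(route([], "Purchase Order"))
print(route([{"type": "Invoice"}], None))
```

Placing the low-confidence check first means a document with any below-threshold prediction goes to the analyst before either of the type checks is considered.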
Tip: Encounter an error? Check that you published the model in the AL_IncomingDocument AI skill.
You just built, trained, and used a custom machine learning model in Appian.
Build a Doc Classification Process with AI Skill