Transform and Load Data

Process Mining is deprecated with Appian 24.2 and will no longer be available in an upcoming release.

Instead, we encourage customers to use Process HQ to explore and analyze business processes and data.

Transformation Projects

After you've created your data sets, the next step is to create a transformation project.

On the Transformation Projects tab, you can create transformation projects to transform data sets into event logs. Mining Prep must transform data sets into event logs for process mining analysis. An event log is a list of events that process mining uses to analyze processes. Events represent activities in process mining. Activities are tasks in a business process that may be automated or performed by a human. They are often associated with start and end time stamps, who performed the activity, or how much the activity cost.

Transformation Projects

To create a transformation project, click the Add button on the Transformation Projects tab. You can also create a transformation project directly from a data set by clicking the Create Transformation Project button.

Transformation projects consist of:

Data Sets
Building Blocks
Transformation Actions

Once you've built the transformation project, you can click the Transform & Load button to execute the transformation actions and load the event log into Process Mining.

Transformation Project

Transformation Project Statuses

Transformation project statuses indicate the execution status of the project whenever you manually or automatically execute a transformation project.

Statuses display in the Status column alongside a time stamp that indicates when the status occurred:

Running: The transformation project is currently extracting, transforming, and loading into Process Mining.
Completed: The transformation project successfully executed, and the event log is available in Process Mining.
Failed: The transformation project did not execute successfully due to an error.

Building blocks

Use building blocks to organize your transformation projects. A building block is automatically created when you add a transformation project.

Building blocks can contain one or more data sets.

In each building block, you identify the case ID, events, and attributes for the data set. You must define these properties before you can run the final transformation. You also add transformation actions to building blocks. The final transformation is comprised of a series of transformation actions. A transformation action defines a single data manipulation instance.

Tip: Transformation actions run inside the building blocks in which they are defined. For smaller projects, it may make sense to keep all data sets in a single building block, so you only have to add the transformation actions once. As projects grow, you may wish to add more building blocks to more clearly organize data sets. However, when you have multiple building blocks, you will have to add transformation actions to each block.

Building blocks can contain one or many data sets. If you add multiple data sets to a single building block, they must share a common column of values to serve as a key.

During the final transformation, Mining Prep automatically combines all valid building blocks into one event log for process mining analysis.

Add data sets to building blocks

After you add a new building block, you can choose to add an existing or new data set.

To add an existing data set to a building block:

Select the data set from the drop-down field in the USE DATA SOURCE panel.
Click Use Data Source.

mp-select-data-source

To add a new data set to a building block:

Click Upload Data Source in the USE DATA SOURCE panel.
Click the Select file button and select the CSV file.
The fields on the window automatically populate. Edit the auto-populated values as necessary.
Click the Upload button.

mp-upload-data-source

Transformation actions

Transformation actions define changes to the data's format, structure, and name in the event log. You add transformation actions from the building-block level, and these actions only apply to data inside the building block.

To create a complete event log that process mining can analyze, it must contain the following items:

Case ID
Events
Case Attributes (optional)

Tip: Although case attributes are optional, they can be helpful during the process mining phase. Case attributes that you define during Mining Prep are available as filters in process mining and can be used as dimensions when creating charts in a dashboard.

It is common to need to transform your data set so that you can define these properties. For example, events must contain columns in a datetime format to serve as time stamps. You can add a date conversion transformation action to a column so that it contains the correct data type. See Identify data properties for more information.

Transformation building blocks store transformation actions. You access these actions from a building block.

To add a transformation action:

Add a data set to the building block if you haven't already.
Click the Add button in the Transformation Actions panel.
Select the transformation action you want to add.
Complete the required fields in the right-hand panel.
Click Preview to verify the transformation result.
Click Save to add the transformation action to the building block.

Mining prep supports the following transformation actions:

Merge Columns
Match Columns
Replace Values
Filter
Add Data
Rename Column
Date Conversion
Match Pattern

Add transformation drop down

Merge columns

The merge columns transformation action adds a new column that combines values from two or more columns, separated by characters you define.

For example, if you have an attribute that's associated with an ID, you can combine these into a single column with the merge transformation action.

Merge column transformation action

To merge columns:

In your building block, click ADD > Merge Columns.
In Select columns to merge, select two or more columns to combine.
In New column name, enter the name of the new merged column.
In Separator, enter the text that will divide the column values. For example, you can separate values by -.
Click Preview to verify the name change.
Click Save.

Match columns

The match columns transformation action adds new columns and populates them according to match conditions.

To create a new column, enter the new column name in the New Column Name field.

Screenshot of New Column Name field with string type

After creating the column, you need to specify its data type in the Column Type field. In the preceding example, the new column is a string type. The following table describes the column types you can select:

Type	Description	Example
String	Text or combinations of alphabetical and numerical characters.	`jdoe44`
Integer	Whole, non-decimal numbers.	`8675`
Double	Double-precision floating-point numbers.	`1403.22`

After selecting the data type, you can define the conditions that determine how the match column populates. Each condition starts with a comparison between two columns. If the comparison is true, the data from the Then field is returned; if the comparison is false, the data from the IF NOT THEN field is returned.

The comparison operators you can choose from are:

Comparison Operators	Description	Example
=	is equal to	`1=2` returns `false`
!=	is not equal to	`1!=2` returns `true`
Contains	contains	`"booker12" contains b` returns `true`
Is null	Empty cell	`"Jane Doe" is null` returns `false`
Is not null	Not empty cell	`"Jane Doe" is not null` returns `true`

mp-add-condition

If you have more than one condition you want to define, click Add Condition to add additional conditions.

Replace values

The replace values transformation action replaces values in existing columns based on conditions you define. For example, you may need to replace a special character or abbreviation in an attribute column.

To start, select an existing column from the Select Column field.

Screenshot of Select Column field

After selecting the column, you can define the conditions that determine how the column values are replaced. Each condition starts with a comparison between two columns. If the comparison is true, the data from the Then field is returned; if the comparison is false, the data from the IF NOT THEN field is returned.

The comparison operators you can choose from are:

Comparison Operators	Description	Example
=	is equal to	`1=2` returns `false`
!=	is not equal to	`1!=2` returns `true`
Contains	contains	`"booker12" contains b` returns `true`
Is null	Empty cell	`"Jane Doe" is null` returns `false`
Is not null	Not empty cell	`"Jane Doe" is not null` returns `true`

If you have more than one condition you want to define, click Add Condition to add additional conditions.

Filter

The filter transformation action reduces the number of rows in your data set to only those that match filter criteria.

Filters on a column of type string are case sensitive and support the following comparison operators:

Comparison Operators	Description	Example
=	is equal to	`1=2` returns `false`
!=	is not equal to	`1!=2` returns `true`
Contains	contains	`"booker12" contains b` returns `true`
Is null	Empty cell	`"Jane Doe" is null` returns `false`
Is not null	Not empty cell	`"Jane Doe" is not null` returns `true`

Filters on a column of type datetime, integer, or double additionally support the following comparison operators:

Comparison Operators	Description	Example
<	Less than	`1<2` returns `true`
>	Greater than	`1>2` returns `false`
<=	Less than or equal	`1<=2` returns `true`
>=	Greater than or equal	`1>=2` returns `false`

These operators allow you to filter ranges of values in your data set.

mp-filter-action

Add data

Additional information from other data sets can enrich your process mining analysis. This might be data from a different system like your CRM system or internal tables from your business unit. With the add data transformation action, you can add supplemental data sets to your transformation project.

To add the add data transformation action to a building block:

Click Add and select the Add Data transformation action. A preview of your current transformation displays.
Select an existing data set from the Select Table drop-down.
Select a key column from the current data set that shares values with a column in the new table.
Click Preview to verify the transformation result.
Click Save to add the new data to the data set in the building block.

Note: The add data transformation action only adds rows where values match between the new and source data key columns.

After adding data, you can use any new columns as events or attributes, and you can continue to transform the new combined data set.

mp-add-data

Rename column

Use the rename column transformation action to change the name of an existing column. For example, you might want to change the name of a column to be more descriptive or contain more context about the data it contains.

Case and event attributes are often good columns to rename since they are available as filters in Process Mining after transforming and loading. Rename these columns to make attribute filters more understandable for other users in Process Mining.

To rename a column:

In your building block, click ADD > Rename Columns.
In Select columns to rename, select the column you want to rename.
In New column name, enter the new column name you want to use instead.
Click Preview to verify the name change.
Click Save.

Date conversion

Use the date conversion transformation action to standardize dates within your transformations and convert strings into UTC time stamps. In addition to case IDs and events, time stamps are a fundamental component of the data used for process mining.

Note: The date conversion transformation action converts a string into the UTC time zone based on the original time zone of the data set. The result of this conversion is UTC, not your local time zone.

When you add a date conversion transformation action, you must specify the following:

The columns that contain the time stamp you want to use in your event log.
The current datetime format as a format string. Appian automatically suggests the correct format for the most common datetime formats. For example, the MM/dd/yy hh:mm string matches data like this: 11/23/1955 12:00.
The local time zone that applies to the data set.

datetime format strings use the following case-sensitive characters:

Character	Description	Example
y	year	`1996`
M	month	`Jul` or `07`
d	day	`10`
a	half day	`PM`
h	hour (0-23)	`0`
m	min	`30`

Click Preview to verify the transformation result, before clicking Save.

mp-date-conversion

Time stamps and time zones

Mining Prep saves, operates, and previews all date and time information using the Coordinated Universal Time (UTC) time zone. When you perform date conversions, you must specify the format of the time stamp and the local time zone of the data set.

When you preview and save the date conversion, Mining Prep converts the value from your selected time zone to UTC.

For example, let's say your data set contains a column with time stamps that look like this: 2016/01/04 12:09:44.

You would specify the format string as yyyy/MM/dd hh:mm:ss for this data. Assuming the event occurred in McLean, VA, and your IT system uses this local time zone when creating the log, you would select US/Eastern as the respective time zone.

After you save the date conversion, time stamp will now display as 04.01.2016 05:09:44 UTC. This represents 2016/01/04 12:09:44 EST.

Match pattern

The match pattern transformation action replaces or extracts data from a column based on regular expression (regex) patterns. Matched data is extracted or replaced into the Output column. The Regular Expression field supports PCRE2 syntax.

Leave Replace with empty to extract matched data into the output.
Enter text into Replace with to replace matched data with the text you provided and leave the unmatched text as is.

To match a pattern:

In your building block, click ADD > Match Pattern.
In Select a column to match, select a column that contains the input data.

Note: The input can be any data type, but the output will always be a string.
In New column name, enter the name of the new column that will output the results.
In Regex Pattern, enter a regex pattern that matches the data you want to extract or replace from the input.
In Replace with, optionally enter a string to replace text matched by the pattern.
Click Preview to verify the expected output.
Click Save.

Example use cases

Let's look at a couple use cases that involve a column containing email addresses.

First, imagine you'd like to extract all of the email domains from the email addresses column into a new column. This would be useful to use as a company name case attribute.

Select column to match: Email address
Regex pattern: (?<=@)[^.]+(?=\.)
Output column: Company

If the email address column contained karen.anderson@appian.com, the output column would display appian. Here's how the preview looks for this pattern match:

Match Pattern Transformation Action

Next, imagine you'd instead like to replace the email domains but leave the rest of the email address.

Select column to match: Email address
Regex pattern: (?<=@)[^.]+(?=\.)
Output column: New email address
Replace with: appiancorp

If the email address column contained amy.lee@appian.com, the output column would display amy.lee@appiancorp.com.

Identify data properties

To create a complete event log that process mining can analyze, it must contain the following items:

Case ID
Events
Case Attributes (optional)

It is common to need to transform your data set before you can define these properties.

mp-data-properties

Case ID

The case ID is the unique identifier of a business case, which usually consists of a combination of letters and numbers. A business case represents a real business transaction in an IT system. A case often consists of multiple events. Each event can have unique time stamps and case attributes.

In process mining, a business case runs through a particular process variant; therefore, it is the image of exactly one process cycle. The case ID identifies each case individually.

To identify the case ID property from a building block:

Click the Case ID button in the Identify section.
Select the column that contains the case ID in the Case ID drop-down field.
Click Save.

Events

The digital representation of when an activity occurred. An instance of an activity. All events that are assigned to a business transaction form a unique case. Similarly, a business process consists of several activities. An event has an associated case ID, time stamp, and other contextual data. The time stamps correspond to the start or end time of the event.

Not all events have start and end time stamps, but you need at least a start time stamp to successfully define an event. Additionally, process mining can only calculate the duration of events if a start and end time stamp exist.

Note: You can only identify events if Mining Prep recognizes a column of type datetime. If you don't have datetime columns in your data set, Mining Prep will automatically prompt you to select a column from your data set to perform a date conversion.

Mining Prep supports two event identification types:

Single Events: One column contains all time stamps for one specific activity. In this instance, the column header usually indicates the specific logged activity name.
Multiple Events: Multiple columns contain event names and time stamps for several activities. One column contains the logged activity names, and the other column(s) contain time stamps.

To identify single events:

Click the Events button in the Identify section on your building block.
Click the Add Event button.
Select the Single Event option.
Configure the event options:
- Event Name: Define the name of the event as it should appear in process mining. The name can differ from the original column name.
- Start or Start & End: Choose whether the time stamps have a start column or both start and end columns.
- Start: Choose the column that contains the time stamp for the start of the event.
- End: Optionally, choose the column that contains the time stamp for the end of the event. Only displays if Start & End is selected.
- Event Attributes: Optionally, choose characteristics that are specific to a certain activity. For example, "amount" or "responsible department".
Click Save.

To identify multiple events:

Click the Events button in the Identify section on your building block.
Click the Add Event button.
Select the Multiple Events option.
Configure the event options:
- Name of Event Group: Define the name of the group of events. This is only used in Mining Prep and doesn't display in Process Mining.
- Event Name Column: Choose the column that contains multiple event names.
- Start or Start & End: Choose whether the time stamps have a start column or both start and end columns.
- Start: Choose the column that contains the time stamp for the start of the event.
- End: Optionally, choose the column that contains the time stamp for the end of the event. Only displays if Start & End is selected.
- Event Attributes: Optionally, choose characteristics that are specific to a certain activity. For example, "amount" or "responsible department".
Click Save.

Case Attributes

Case attributes are an optional property of an event log and describe characteristics of a case. Case attributes help you make observations such as how costly a deviation or variant is in your process. Each case can have one or more attributes that describe it in more detail. For example, customer or vehicle types can be case attributes. Attribute characteristics can vary from case to case.

You can select the following attribute types:

Attribute	Description	Valid Types
Numerical	Numeric characteristics like “total payment amount.”	`integer`, `double`
Impact	Numeric characteristics that are useful when summed together. In the Process Mining module, the Variants and Deviations pages display the summed values of impact attributes beneath the Impact column.	`integer`, `double`
Categorical	Descriptive characteristics like “responsible department.”	`integer`, `double`, `string`

Numerical and categorical attributes that you define in Mining Prep are available as filters in the Process Mining module.

Note: Make sure you use valid data types with your attributes to prevent transformation errors later on. Appian displays a warning if you select a column with an invalid data type. Note that if your data uses a comma as a decimal separator, it is assumed to be a string. This may not be accurate depending on your specific data set.

Data types

Each column in your data set is associated with a data type. The data type indicates what kind of data the column contains. This is important to keep in mind, because you can only merge columns of the same data type. To help you keep track each column's data type, Mining Prep displays a data type icon in the column headers of your data previews.

The following table describes the possible data types and associated data type icon:

Data Type	Icon	Example
string		`angela.lewis`
integer		`20`
decimal		`20.50`
datetime		`20.02.2022 12:10:13 UTC`

Execute the transformation project

When your transformation project is complete, you are ready to run the transformation and load the event log into Process Mining.

To transform and load your project into Process Mining:

Go to the Transformation Projects tab in Mining Prep.
Select the checkbox next to your transformation project.
Click Transform & Load.
Select the environment in which to load the event log. The API key is automatically populated for you.
Click Ok.

Once you've manually transformed and loaded a transformation project, you can schedule these actions to happen automatically.

Access Process Mining

After a successful transform and load in Mining Prep, a Success message displays with a link to your Process Mining environment. This link will navigate you to the discovered model that's based on the newly created event log.

Sucess message

To access your Process Mining environment:

Successfully transform and load a transformation project.
In the Success message, click the here hyperlink.

Alternatively, you can access Process Mining at any time from the navigation menu.

Note: To access any links between Mining Prep and Process Mining, your admin must configure the miningBaseUrl option during installation.

Schedule updates

You can define a schedule for your transformation project to load updated data to Process Mining for continuous analytics, monitoring, and notifications.

Mining Prep automatically extracts, transforms, and loads the updated data from your source system to Process Mining on a reoccurring basis.

See Schedule for more information.

Feedback

Was this page helpful?