Configuring Backup and Restoration

This topic outlines the backup process for Appian documents and data.

Reasons for taking backups

The main reasons for backing up Appian installations and related data are data archival, auditing, and protection against catastrophic events. How often you back up depends on the use case.

For data archival and auditing purposes, backing up weekly or even monthly may be sufficient. Using backups to preserve data when a catastrophic event occurs requires more frequent backups; the right frequency ultimately depends on how much data you are willing to lose when restoring. If you back up nightly, the most data you might lose is one day's worth. This concept is referred to as the Recovery Point Objective (RPO).
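
As a rough illustration of checking a backup against an RPO, the following sketch tests whether a backup file is younger than the allowed window. The function name backup_within_rpo, the example path, and the 24-hour window are assumptions made for illustration; they are not part of Appian.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the backup at $1 was modified
# within the last $2 hours (the RPO window). Returns 2 if $1 is missing.
backup_within_rpo() {
  local backup_path="$1" rpo_hours="$2"
  [ -e "$backup_path" ] || return 2
  # find prints the path only when its modification time is inside the window.
  [ -n "$(find "$backup_path" -maxdepth 0 -mmin "-$((rpo_hours * 60))")" ]
}

# Example (the file path is an assumption):
# backup_within_rpo /data/aebackups/latest.tar 24 || echo "Backup is older than the RPO"
```

A check like this could run from a monitoring job so that a silently failing backup schedule is noticed before a restoration is needed.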

While backups can achieve an RPO on the order of hours, they are not the preferred method for achieving very low RPOs (minutes). Instead, use the High Availability configuration for disaster recovery with low RPO requirements.

Data components

There are two primary data components that should be backed up:

  • Appian application data (documents, archived processes, and other files)
  • External data

Each of these components is backed up separately, yet all backups must occur simultaneously so that data can be restored in a consistent state. When scheduling a script to back up the Appian engines, back up external databases at the same time. This ensures that the data snapshot used during recovery is synchronized and consistent. Otherwise, Appian data and external data may be out of sync after restoration.
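
One way to start both backups together is a small wrapper that launches them in parallel and fails if either one fails. This is only a sketch: run_together is a hypothetical name, and the two commands passed to it stand in for whatever engine backup script and RDBMS backup tool your site uses. Starting the jobs at the same moment narrows the window for inconsistency but does not by itself guarantee a point-in-time snapshot.

```shell
#!/bin/bash
# Hypothetical wrapper: run two backup commands in parallel.
# $1 and $2 are the names of commands or shell functions to run.
run_together() {
  local pid1 pid2 rc=0
  "$1" & pid1=$!
  "$2" & pid2=$!
  # Wait for both jobs; report failure if either one failed.
  wait "$pid1" || rc=1
  wait "$pid2" || rc=1
  return "$rc"
}

# Example (both names are placeholders):
# run_together backup_appian_data backup_external_db
```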

Appian application data

Application data comprises files stored on the server that are referenced by Appian engines, including documents stored in the document management system, archived processes, search indices, the keystore file, data in the data server, and other data files. The folders containing this application data should be backed up at the same time. The locations of application data components are listed in the following table.

When running in a system with multiple application servers, some of these directories are shared between servers. In those cases, the data only needs to be backed up once from one of the servers. For those directories that are not shared between servers, the data needs to be backed up from each of the servers. See High Availability and Distributed Installations for a list of directories that are shared when configuring a multiple server setup.

Component Name | Folder Location | Notes
Application Server | <APPIAN_HOME>/_admin/accdocs1/ |
Application Server | <APPIAN_HOME>/_admin/accdocs2/ |
Application Server | <APPIAN_HOME>/_admin/accdocs3/ |
Application Server | <APPIAN_HOME>/_admin/mini/ |
Application Server | <APPIAN_HOME>/_admin/models/ |
Application Server | <APPIAN_HOME>/_admin/process_notes/ |
Application Server | <APPIAN_HOME>/_admin/shared/ |
Data Server | <APPIAN_HOME>/data-server/data/hs/ | Exclude for live data backup
Data Server | <APPIAN_HOME>/data-server/data/ss/ | Exclude for live data backup
Search Server | <APPIAN_HOME>/search-server/data/ |
Application Server | <APPIAN_HOME>/server/archived-process/ |
Channels Engine | <APPIAN_HOME>/server/channels/gw1/ |
Content and Collaboration Statistics Engines | <APPIAN_HOME>/server/collaboration/gw1/ |
Forums Engine | <APPIAN_HOME>/server/forums/gw1/ |
Application Server | <APPIAN_HOME>/server/msg/ |
Notifications and Notifications Email Engines | <APPIAN_HOME>/server/notifications/gw1/ |
Personalization Engine | <APPIAN_HOME>/server/personalization/gw1/ |
Portal Engine | <APPIAN_HOME>/server/portal/gw1/ |
Process-design Engine | <APPIAN_HOME>/server/process/design/gw1 |
Process-analytics Engine (0000) | <APPIAN_HOME>/server/process/analytics/0000/gw1/ |
Process-analytics Engine (0001) | <APPIAN_HOME>/server/process/analytics/0001/gw1/ |
Process-analytics Engine (0002) | <APPIAN_HOME>/server/process/analytics/0002/gw1/ |
Process-execution Engine (00) | <APPIAN_HOME>/server/process/exec/00/gw1/ |
Process-execution Engine (01) | <APPIAN_HOME>/server/process/exec/01/gw1/ |
Process-execution Engine (02) | <APPIAN_HOME>/server/process/exec/02/gw1/ |
Internal Messaging Service | <APPIAN_HOME>/services/data/ | Exclude for live data backup

If you have more than the default three shards of Process Execution and Process Analytics, the gw1/ directories for those shards must be backed up as well.
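
Because every engine in the table keeps its data in a gw1 directory, a backup script can discover those directories instead of hard-coding them, which also picks up any additional shards. The sketch below assumes only the directory layout shown in the table; the function name list_gateway_dirs is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list every engine gw1 directory under <APPIAN_HOME>/server,
# including those of additional Process Execution and Process Analytics shards.
list_gateway_dirs() {
  local appian_home="$1"
  find "$appian_home/server" -type d -name 'gw1' | sort
}

# Example (the installation path is an assumption):
# list_gateway_dirs /opt/appian
```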

After performing an upgrade or restoration, it is particularly important to ensure that your keystore and Appian data source data are present and in the correct state. If there is a problem with either, the system will not start up and will log an ERROR message to the application server log.

External data sources

All external data sources used with Appian must be backed up separately and simultaneously. Use your preferred backup mechanism for the specific RDBMS used in the Appian environment.

Types of backups

The following sections deal with two different types of backups:

  • Full System Backups - all of the data listed in the sections above, including the Appian software libraries and other executable artifacts that are provided by the installer.
  • Live Data Backups - just the directories listed in the Appian Application Data section above, the RDBMS data, and, optionally, the log files.

A full system backup is appropriate when upgrading Appian or applying a patch or hotfix. Full system backups can also be run periodically (perhaps monthly) to capture a snapshot of the state of the server(s) in case a catastrophe requires a full system restoration. This document covers only the Appian components that need to be backed up; for the full system backup scenario, also consider backing up the entire server(s) at a drive and operating system level.

A live data backup is appropriate for frequently capturing the state of the application data so that a previous state can be restored rapidly after a failure. While not sufficient for use cases with a low RPO and a low Recovery Time Objective (RTO), it can be used for a hot/cold restoration scenario where an RPO on the order of hours is acceptable.

Backup procedures

This section outlines the steps to back up all data of an Appian installation at an application level. It is not meant to replace or eliminate the need to back up the server at a drive and operating system level. In a production environment, application data backup procedures must be accompanied by some form of tape or similar backup.

Full system backup procedures

Before a product upgrade or patch, take the following steps to back up the Appian application:

  1. Stop all Appian services.
  2. Ensure all Appian processes are stopped using the Task Manager for a Windows environment, the ps command for a Linux environment, or your preferred system monitoring tool.
  3. Copy the entire Appian installation directory to a backup location.
  4. If your application data is stored on a SAN or other high-availability drive, ensure that all file system content on this drive is copied to the backup location.
  5. Take a snapshot or data dump of the Appian data source, your business data sources, and any external RDBMS(s).
  6. Proceed with the product upgrade.
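
Step 3 above can be sketched as a timestamped copy of the installation directory. This is a minimal illustration, not an Appian-provided script: the function name copy_installation and the backup root are assumptions, and it must only be run after steps 1 and 2 have confirmed that all Appian services are stopped.

```shell
#!/bin/bash
# Hypothetical helper: copy the whole installation directory ($1) into a
# new timestamped folder under the backup root ($2) and print that folder.
copy_installation() {
  local appian_home="$1" backup_root="$2" stamp target
  stamp="$(date +%Y%m%d-%H%M%S)"
  target="$backup_root/appian-full-$stamp"
  mkdir -p "$backup_root" || return 1
  # -a preserves permissions, ownership (where allowed), timestamps, and links.
  cp -a "$appian_home" "$target" || return 1
  printf '%s\n' "$target"
}

# Example (both paths are assumptions):
# copy_installation /opt/appian /data/aebackups
```

Keep the backup root outside the installation directory so the copy does not recurse into itself.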

Live data backups

A live data backup can be performed on a running system in order to replicate the state of the data to an external storage device for restoration later should some catastrophic failure occur on the primary system. The frequency of the live backup determines the Recovery Point Objective (RPO).

In order to perform a live backup, you must use external tools that guarantee that the Appian software experiences no disruption to disk access during the backup procedure and that the data is backed up as a snapshot, at a single point in time.

For example, in Appian Cloud we perform live backups by temporarily freezing the filesystem with the Linux command xfs_freeze, taking a data volume snapshot, and then immediately unfreezing the filesystem. The snapshot is then replicated to a safe storage location. Use a similar technique that is appropriate for your environment.
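
The freeze, snapshot, unfreeze sequence can be sketched as follows. xfs_freeze -f and xfs_freeze -u are the real freeze and unfreeze options; everything else is an assumption for illustration, including the function name snapshot_with_freeze, the FREEZE_BIN override used for dry runs, and the snapshot command, which depends entirely on your storage platform. This is not the script used in Appian Cloud.

```shell
#!/bin/bash
# FREEZE_BIN can be overridden for dry runs; it defaults to the real xfs_freeze.
FREEZE_BIN="${FREEZE_BIN:-xfs_freeze}"

# Usage: snapshot_with_freeze <mount_point> <snapshot command and arguments...>
snapshot_with_freeze() {
  local mount_point="$1" status=0
  shift
  "$FREEZE_BIN" -f "$mount_point" || return 1   # suspend writes to the filesystem
  "$@" || status=$?                             # caller-supplied snapshot step
  "$FREEZE_BIN" -u "$mount_point" || return 1   # resume writes even if the snapshot failed
  return "$status"
}

# Example (mount point and snapshot command are placeholders):
# snapshot_with_freeze /appian-data my_volume_snapshot_command
```

The unfreeze call runs whether or not the snapshot step succeeds, so a failed snapshot does not leave the filesystem frozen.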

Before implementing in production, you must test this procedure during a load test in a testing or staging environment to ensure that your technique does not cause interruption to filesystem access.

You cannot perform a live data backup of Kafka data, but Kafka logs will still need to be properly initialized during restoration procedures. Follow the steps in Start Appian to initialize Kafka logs after a live data backup.

Alternative: data-only backups

For normal maintenance of an Appian application, regularly scheduled backups of the system data are strongly recommended. The frequency of the backup varies depending on the usage of the system. Daily to weekly backup intervals are recommended, depending on the resource requirements of your data and your recovery scenarios.

To perform a scheduled backup of Appian data, follow the instructions below.

All scheduled backups should be configured to run during off-peak hours.

  1. Depending on your environment, save the following code as a .sh or .bat script in the <APPIAN_HOME>/_admin/_scripts/tools/datamaintenance/ folder.

    • Linux (.sh)

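          #!/bin/bash
          # Run from the script's own directory so the relative path below also resolves under cron.
          cd "$(dirname "$0")" || exit 1
          source ../../../../server/_scripts/exports.sh
          INSTALLDIR="$AE_HOME/_admin/_scripts/tools/datamaintenance/antScripts"
          # Set ae.dest.location to the directory that should receive the backup.
          ant -f "$INSTALLDIR/aecopyfiles.xml" -Dae.source.location="$AE_HOME" -Dae.dest.location="/data/aebackups" -Dae.datadelete="n" -Dae.copycontents="y" -Dkdb.num="1" -Dkdb.invertselect="y" -Dbatchmode="y"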
      
    • Windows (.bat)

          echo off
          call ../../../../server/_scripts/exports.bat
          set INSTALLDIR=%AE_HOME%/_admin/_scripts/tools/datamaintenance/antScripts
          ant -f %INSTALLDIR%/aecopyfiles.xml -Dae.source.location=%AE_HOME% -Dae.dest.location="c:/ae_backups" -Dae.datadelete="n" -Dae.copycontents="y" -Dkdb.num="1" -Dkdb.invertselect="y" -Dbatchmode="y"
          echo on
      

      You can change the default backup location to a location of your choosing.

  2. Schedule the script to run as a cron job or Windows Scheduled Task.
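
For the cron case, the entry might look like the line produced by the sketch below, which runs the script daily at an off-peak hour and appends its output to a log. The function name cron_entry, the script name, and the log path are all assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print a crontab line that runs a script every day
# at the given hour (minute 0) and appends its output to a log file.
cron_entry() {
  local hour="$1" script="$2" log="$3"
  printf '0 %d * * * %s >> %s 2>&1\n' "$hour" "$script" "$log"
}

# Example: append a 2 AM entry to the current user's crontab (paths are placeholders):
# ( crontab -l 2>/dev/null; cron_entry 2 /opt/appian/_admin/_scripts/tools/datamaintenance/backup.sh /var/log/appian-backup.log ) | crontab -
```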

These scripts use Apache Ant. No single batch file is included with the application because the parameters used for the backup can differ from site to site. Some examples are provided in the /examples subfolder.

Limitations of live data backups

Both approaches for live data backups described above may cause issues when performed on directories used by Kafka. For this reason, we recommend that you do not perform a live backup of services that rely on Kafka data.

When you do not perform a live backup of services that rely on Kafka data:

  • You will be unable to recover transactions since the last checkpoint.
  • Any Saved User Filters will be lost.
  • If using data sync in Appian Records, you will need to resync your data as part of the Restore Procedures.

If these are acceptable trade-offs, the following directories should be excluded from the backup and verified to be empty when recovering from a live data backup:

  • <APPIAN_HOME>/services/data/
  • <APPIAN_HOME>/data-server/data/

If you need to perform a live backup of services that rely on Kafka data, we recommend following the approach described in Full System Backup Procedures.
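
The exclusion list above can be expressed with tar exclude patterns, as in this sketch. It copies the installation tree while leaving the two excluded directories present but empty in the destination. The function name copy_without_kafka_data is hypothetical, and a plain file copy like this only illustrates the exclusions; it does not replace the snapshot tooling that a live backup requires.

```shell
#!/bin/bash
# Hypothetical helper: copy $1 (<APPIAN_HOME>) to $2, skipping the contents of
# services/data/ and data-server/data/ so both end up empty in the copy.
copy_without_kafka_data() {
  local appian_home="$1" dest="$2"
  mkdir -p "$dest" || return 1
  tar -C "$appian_home" \
      --exclude='./services/data/*' \
      --exclude='./data-server/data/*' \
      -cf - . | tar -C "$dest" -xf -
}

# Example (both paths are assumptions):
# copy_without_kafka_data /opt/appian /data/aebackups/live
```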

Restore procedures

This section outlines the steps to be followed to restore all data of an Appian installation at an application level. It assumes restoration to an existing system with Appian installed.

1. Stop Appian

Stop all Appian services. Back up the current Appian application data by copying the contents of the <APPIAN_HOME> directory into a temporary backup location.

2. Restore Kafka logs and ZooKeeper data

Delete the contents of the <APPIAN_HOME>/services/data/ directory. Copy the /services/data/ directories from the backup location into the empty <APPIAN_HOME>/services/data/ folder.

If you're restoring from a live data backup, make sure the <APPIAN_HOME>/services/data/ directory is empty. You can continue with the restore procedures, but you will not be able to recover certain data, as described in Limitations of live data backups.

3. Restore Appian engine data

Delete any remaining .kdb files from the <APPIAN_HOME>/server/<ENGINE_NAME>/ directories to avoid conflicts. Copy your KDB files from the backup location (the highest numbered .kdb file for each engine on production) into the corresponding folder in the Appian installation directory. Refer to the Appian Engine Data table earlier in this document for the relevant directories. Ensure that the new .kdb file is the highest numbered .kdb in the target folder. If it is not, rename it to be the highest numbered .kdb file.
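
Picking the highest numbered .kdb file in a directory can be sketched as below. The sketch assumes that each file name ends in a number immediately before the .kdb extension (for example name12.kdb); that naming pattern and the function name highest_kdb are assumptions, so check them against the files in your own engine directories.

```shell
#!/bin/bash
# Hypothetical helper: print the .kdb file in directory $1 whose name ends
# in the largest number. Returns non-zero if no numbered .kdb file exists.
highest_kdb() {
  local dir="$1" f base num best="" best_num=-1
  for f in "$dir"/*.kdb; do
    [ -e "$f" ] || continue
    base="$(basename "$f" .kdb)"
    num="${base##*[!0-9]}"            # keep only the trailing digits
    [ -n "$num" ] || continue
    if [ "$((10#$num))" -gt "$best_num" ]; then   # 10# avoids octal parsing
      best_num=$((10#$num))
      best="$f"
    fi
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}
```

The comparison is numeric rather than alphabetical, so a file numbered 10 ranks above one numbered 9.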

4. Restore Appian application data

Delete the contents of the Appian application data directories to avoid file naming/id conflicts. Refer to the Appian Application Data table earlier in this document for file locations. Copy the contents of the application data folders from the backup location into the corresponding folder in the Appian installation directory.

When restoring to an environment with multiple application servers, restore the files for directories that are shared between servers to the single shared directory. Restore the files for directories that are not shared between servers to the local directories on each server.

5. Restore external data

Restore the latest backup or snapshot of any external data sources so that the data is synchronized.

6. Start Appian

Start all Appian services.

When restoring data from a live data backup, you will need to start the Data Server and run these additional commands from <APPIAN_HOME>/data-server/bin:

  1. ./stop.bat (.sh)
  2. ./start.bat (.sh) --recover-no-tx-log
  3. ./start.bat (.sh)
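
On Linux, the three commands above could be wrapped as in this sketch. It assumes each script returns once its action has completed; the function name recover_data_server and the example installation path are hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: run the Data Server recovery sequence from
# <APPIAN_HOME>/data-server/bin, stopping at the first command that fails.
recover_data_server() {
  local bin_dir="$1/data-server/bin"
  (
    cd "$bin_dir" || exit 1
    ./stop.sh &&
    ./start.sh --recover-no-tx-log &&
    ./start.sh
  )
}

# Example (the installation path is an assumption):
# recover_data_server /opt/appian
```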

7. Restore synced records data

For record types with data sync enabled, you must sync your data to ensure your data is correctly restored. You can do this by manually syncing each record type, or using an import customization file.

To trigger a sync in your import customization file, add recordType.<UUID of a Record Type>.forceSync=true to the file. You can use this property multiple times to trigger a sync on different record types.
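
When many record types need a sync, the property lines can be generated rather than typed by hand. The sketch below prints one forceSync line per UUID; the function name force_sync_lines, the output file name, and the UUID placeholders are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print one forceSync property line per record type UUID.
force_sync_lines() {
  local uuid
  for uuid in "$@"; do
    printf 'recordType.%s.forceSync=true\n' "$uuid"
  done
}

# Example (file name and UUIDs are placeholders):
# force_sync_lines "<UUID 1>" "<UUID 2>" >> import-customization.properties
```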

Built: Wed, Aug 17, 2022 (01:05:05 PM)