High Availability and Distributed Installations

When and Why to Have a Distributed Installation

There are three principal reasons to run a distributed installation of Appian: high availability, scaling and load balancing, and segregation.

High Availability

A highly available installation of Appian must be robust to potential hardware failures. This is only possible if every service that comprises Appian has more than one instance, running on different servers, so that the unexpected loss of one server does not take out all instances of any service. Servers in a high-availability installation may be spread across separate data centers as long as there is low (less than 10ms) network latency between the data centers.

Scaling and Load Balancing

High-load sites may have more demand for a service than a single instance of it can provide. In this situation, adding instances of one or more services can increase the capacity of the installation.

Segregation

Some customers need only one instance of each service but want the services to run on separate servers. The reason may be capacity (a single server is not large enough to host all services) or security (for example, a desire to host data-persistence services in a different network zone).

Windows

Running more than one instance of the Appian engines, Kafka, or Zookeeper is not a supported configuration in Windows environments. This means that high availability and load balancing of the Appian engines are not possible in Windows environments.

Separating services onto different servers is supported on Windows, as is adding instances of other components, such as the application server.

How to Configure a Distributed Installation

Preparation

Planning the Topology

The first step in setting up a multiple server configuration is mapping out which servers will run the various architectural components of the Appian software. The distribution of the architectural components across one or more servers on a network is referred to by the documentation and the product as the "topology."

The various software components capable of being run on separate servers are as follows:

  • Appian Java EE application (suite.ear)
  • Search Server
  • Appian Engines
  • Kafka
  • Zookeeper
  • RDBMS

These components can be configured for data redundancy and high availability. Review the Setting Up for Disaster Recovery page for details before planning a high availability topology.

Some considerations when planning the topology:

  • Components that require establishing consensus between the different instances (search server, Kafka, and Zookeeper) require three instances in order to have a system that is robust to a failure of one of the instances. At most three Kafka and Zookeeper instances can be configured for a site and Appian recommends not configuring more than three search servers.
  • Components that do not establish consensus (the application server and the Appian engines) only require at least two instances in order to be robust to a failure of one instance.
  • Due to the way file names and file paths are calculated for documents stored in Appian, the application server and engine servers must be on servers using the same type of operating system. Do not mix Windows and Linux.
  • The different components of Appian (the Java EE application (suite.ear), the search server, the Appian engines, the RDBMS, and so on) can be configured to run on the same physical machine or on separate machines. Similarly, each component can be clustered independently. For example, an environment may have two instances of application servers running suite.ear and three instances of the search server.
  • Clustering the Appian engines, Kafka, or Zookeeper is only supported on Linux environments, not on Windows environments.
  • The Search Server, the Appian engines, and Kafka are data persistence and reporting components and therefore require the same considerations given to the RDBMS when considering data redundancy.
  • Only one instance of any given Appian engine may run on a given server.
  • Configuring multiple Kafka and Zookeeper instances can provide additional resiliency for those services in the event of a hardware or network failure, but the additional resiliency for the system as a whole will only be achieved if there are also multiple copies of all of the Appian engines. If there is only one instance of a certain type of engine, the risk from adding additional components to the system (and therefore adding additional opportunities for failure) outweighs the benefit from adding resiliency to the Kafka and Zookeeper layers. Appian recommends only running multiple instances of Kafka and Zookeeper if you also have multiple instances of all Appian engines.

Once you have read about configuring the system for high availability, determine the number of servers you will use and which components of the architecture will run on each of the servers.
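
The instance-count guidance above can be checked with a little arithmetic. The following is a minimal sketch, assuming a simple majority-quorum model for the consensus-based components (an assumption used for illustration; the text only states that three instances are needed to survive one failure). The function name and the sample plan are hypothetical.

```python
def tolerated_failures(instances: int, needs_consensus: bool) -> int:
    """Return how many instance failures a component can survive."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    if needs_consensus:
        # Assumption: a strict majority of instances must stay available.
        quorum = instances // 2 + 1
        return instances - quorum
    # Without consensus, a single surviving instance is enough.
    return instances - 1


# Hypothetical plan: component -> (instance count, requires consensus)
plan = {
    "search server": (3, True),
    "Kafka": (3, True),
    "Zookeeper": (3, True),
    "application server": (2, False),
    "Appian engines": (2, False),
}

for name, (count, consensus) in plan.items():
    survives = tolerated_failures(count, consensus)
    print(f"{name}: {count} instance(s), survives {survives} failure(s)")
```

Under this model, two instances of a consensus-based component tolerate no failures, which is why three is the smallest useful count for those components, while two instances suffice for the application server and the engines.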

Network Configuration

Distributed installations require a static IP address for each server. Before configuring your distributed installation, assign a static IP address to every machine you plan to use to host Appian.

You must also verify that each machine can communicate with the others in the network over the ports that Appian uses. As a security best practice, configure firewall settings so that each port is only open to the machines that need access. For example, Kafka uses port 9092, and the other services that need to communicate with Kafka are the engines, the data server, and other Kafka instances. Port 9092 should therefore only be open to machines that host an instance of the engines, the data server, or Kafka. For a non-distributed installation, where all Appian services are hosted on one server, only the local host should have access to the ports.
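
One way to verify connectivity is a plain TCP connection test run from each machine that needs access. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and a successful TCP connection only shows that the port is open from the machine running the check, not that the service behind it is healthy.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def find_unreachable(hosts: list[str], port: int) -> list[str]:
    """Return the hosts on which the given port could not be reached."""
    return [host for host in hosts if not port_is_reachable(host, port)]
```

For example, on a machine that hosts engines, `find_unreachable(kafka_hosts, 9092)` should return an empty list, where `kafka_hosts` is your own list of Kafka host names. Running the same check from a machine that should not have access is a quick way to confirm that the firewall rule is in effect.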

Install Appian on Each Machine

Install a full version of Appian on every machine that will host any Appian component. Regardless of whether the machine is intended to run just Appian Engines, just the main suite.ear Java EE application, just a Search Server node, or some combination thereof, the full installation should exist on each server in the environment in order to eliminate the possibility of misconfiguration due to missing components. An Appian installation is not required on the machine running the RDBMS.

Each installation of Appian must be of the same version and hotfix level.

Configuration

Configuration File Consistency

When running across multiple servers, it is especially important to make sure that they are configured identically. All configuration files, such as appian-topology.xml, custom.properties, and others, must be the same on all servers.
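
Comparing checksums is one way to confirm that configuration files match across servers. The following is a minimal sketch, assuming you compute a digest for each file on each server and gather the results in one place (how the digests are collected is left open). The function names and server names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_mismatches(digests_by_server: dict[str, dict[str, str]]) -> list[str]:
    """Return the file names whose digest differs, or is missing, on any server."""
    all_files: set[str] = set()
    for digests in digests_by_server.values():
        all_files.update(digests)

    mismatched = []
    for name in sorted(all_files):
        # A file absent on a server shows up as None and counts as a mismatch.
        seen = {digests.get(name) for digests in digests_by_server.values()}
        if len(seen) > 1:
            mismatched.append(name)
    return mismatched
```

Given digests gathered from, say, `server-a` and `server-b`, a non-empty result from `find_mismatches` names the files to reconcile before starting the environment.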

Topology XML File

The way to specify which components of Appian run on which hosts is with the appian-topology.xml file, located in <APPIAN_HOME>/ear/suite.ear/conf. Example configurations can be found in appian-topology.xml.example, which is located in the same directory.

When specifying hostnames in the appian-topology.xml file for a distributed installation, you must not use "localhost" as that will resolve differently on the different machines in the cluster. Hostnames specified in appian-topology.xml must exactly match the host value that is marked with _h in the output from _admin/_scripts/licinfo.sh (.bat).

An appian-topology.xml file that is empty, contains only XML comments, or contains invalid XML will result in the engines using the default topology.
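
The two pitfalls described above (using "localhost" and supplying a file that cannot be used) can be caught before deployment. The following is a minimal sketch that does not assume any particular schema: it flags every attribute whose value is "localhost" and reports a file that has no parseable root element. The function name is hypothetical, and the check does not validate the file against the actual topology format.

```python
import xml.etree.ElementTree as ET


def lint_topology(xml_text: str) -> list[str]:
    """Return warnings for a candidate appian-topology.xml document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Empty, comment-only, and malformed documents all fail to parse.
        return [f"not usable as XML ({exc}); the engines would use the default topology"]

    warnings = []
    for element in root.iter():
        for attr, value in element.attrib.items():
            if value.strip().lower() == "localhost":
                warnings.append(
                    f'<{element.tag}> has {attr}="{value}"; use the real host name instead'
                )
    return warnings
```

Reading the file with `Path(...).read_text()` and passing the text to `lint_topology` on one server is enough, since the file must be identical on all servers. Host names still need to be compared by hand against the `_h` value reported by the licinfo script.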

Engine Security Token

In a distributed installation, you must copy the appian.sec file to all machines in the environment. This file is required to enable authorized connections between the engines and the specified application servers. It is located in <APPIAN_HOME>/ear/suite.ear/conf.

Refer to Appian Engine Connection Restrictions for more information.

Service Manager Password

In a distributed installation, you must copy the service_manager.conf file, located in /services/conf, to all machines in the environment. This file is required to enable authorized connections to the service manager and the engines across machines.

The service_manager.conf file is created when running the password script.

Scheduling Checkpoints

When moving to a high-availability configuration, you should also remove any custom checkpoint scheduling configurations. High-availability installations should use the default values for these settings because engines do not become unavailable while checkpointing in a high-availability installation.

Shared Files

The following directories must be shared across all servers that run that component. All servers that run the given component need both read and write access to these directories.

Component Name                                  Folder Name
Application Server                              APPIAN_HOME/_admin/accdocs1/
Application Server                              APPIAN_HOME/_admin/accdocs2/
Application Server                              APPIAN_HOME/_admin/accdocs3/
Application Server                              APPIAN_HOME/server/archived-process/
Application Server                              APPIAN_HOME/_admin/search/
Application Server                              APPIAN_HOME/server/msg/
Application Server                              APPIAN_HOME/_admin/mini/
Application Server                              APPIAN_HOME/_admin/models/
Application Server                              APPIAN_HOME/_admin/process_notes/
Application Server                              APPIAN_HOME/_admin/shared/
Channels Engine                                 APPIAN_HOME/server/channels/gw1/
Content and Collaboration Statistics Engines    APPIAN_HOME/server/collaboration/gw1/
Forums Engine                                   APPIAN_HOME/server/forums/gw1/
Notifications and Notifications Email Engines   APPIAN_HOME/server/notifications/gw1/
Personalization Engine                          APPIAN_HOME/server/personalization/gw1/
Portal Engine                                   APPIAN_HOME/server/portal/gw1/
Process Analytics 00 Engine                     APPIAN_HOME/server/process/analytics/0000/gw1/
Process Analytics 01 Engine                     APPIAN_HOME/server/process/analytics/0001/gw1/
Process Analytics 02 Engine                     APPIAN_HOME/server/process/analytics/0002/gw1/
Process Design Engine                           APPIAN_HOME/server/process/design/gw1/
Process Execution 00 Engine                     APPIAN_HOME/server/process/exec/00/gw1/
Process Execution 01 Engine                     APPIAN_HOME/server/process/exec/01/gw1/
Process Execution 02 Engine                     APPIAN_HOME/server/process/exec/02/gw1/

If you have more than the default three shards of Process Execution and Process Analytics, the gw1 directories for those shards must be shared across servers as well.

The recommended approach for sharing directories between servers is:

  1. Set up a central network attached storage server
  2. Create a directory structure on the storage server that mirrors the directories listed in the table above
  3. Replace the above directories on each server with links to the corresponding directory on the network attached storage server
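
Step 3 can be scripted. The following is a minimal sketch, assuming the storage server is already mounted locally and that symbolic links are an acceptable kind of link in your environment (on Windows, creating them may require elevated privileges). The function names and mount point are hypothetical. The sketch deliberately refuses to replace a directory that still contains files, so existing data must be moved to the storage server first.

```python
from pathlib import Path


def plan_links(appian_home: Path, nas_root: Path, relative_dirs: list[str]):
    """Pair each local shared directory with its mirror on the storage server."""
    return [(appian_home / rel, nas_root / rel) for rel in relative_dirs]


def apply_link(local: Path, target: Path) -> None:
    """Replace an empty local directory with a symbolic link to target."""
    if local.is_symlink():
        return  # Already linked; nothing to do.
    if local.exists():
        # rmdir() raises OSError if the directory is not empty,
        # so data is never deleted by this step.
        local.rmdir()
    local.parent.mkdir(parents=True, exist_ok=True)
    local.symlink_to(target, target_is_directory=True)
```

A typical run would call `plan_links` with a list of relative paths taken from the table (for example `"server/msg"` or `"server/process/exec/00/gw1"`), review the resulting pairs, and then call `apply_link` on each pair while Appian is stopped.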

Both Kafka and Zookeeper are sensitive to latency with regard to CPU, memory, and disk contention. For high-load sites and any site that has multiple Kafka or Zookeeper instances, Appian recommends having enough CPUs on the machines that host these services such that they each have at least one CPU reserved for their use. For example, if you have the default 15 engines, Kafka, Zookeeper, and service manager all on a single server on a heavily-loaded system, that server should have at least 18 CPUs. Appian also recommends keeping the data directories for these two components (services/data/kafka-logs and services/data/zookeeper) on local disks rather than mounting them onto network drives. This recommendation is consistent with industry best practices for these services.
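
The CPU figure in the example above follows from counting one CPU per process. The following is a minimal sketch of that arithmetic, assuming one reserved CPU per engine and per supporting service as in the example; the function name is hypothetical, and real sizing depends on load.

```python
DEFAULT_ENGINE_COUNT = 15  # The default number of engines cited in the text.


def minimum_cpus(
    engine_count: int = DEFAULT_ENGINE_COUNT,
    other_services: tuple[str, ...] = ("Kafka", "Zookeeper", "service manager"),
) -> int:
    """Return the CPU count that gives every listed process its own CPU."""
    return engine_count + len(other_services)


print(minimum_cpus())                 # 18, matching the example in the text
print(minimum_cpus(engine_count=17))  # 20, e.g. with two additional engines
```

If you add Process Execution or Process Analytics shards, increase `engine_count` by one for each additional engine hosted on the server.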

Shared Logs

In addition to the directories above, which must be shared across servers to have a functioning system, many administrators choose to share application logs between servers for ease of access. To do so, link the /logs directory on each local machine to a /shared-logs/<local machine name> directory on a network attached storage server, then add a link from APPIAN_HOME/shared-logs to the shared-logs directory on the network storage device.

Appian Health Check's data collection step will look for a directory named "shared-logs" directly inside the APPIAN_HOME directory and will collect logs inside any subdirectories found there.

APPIAN_HOME/shared-logs/<machine A>
APPIAN_HOME/shared-logs/<machine B>
APPIAN_HOME/shared-logs/<machine C>

With this shared logging configured, the data collection step of Health Check only needs to be run on a single server rather than run once on each server.

How to Run a Distributed Installation

Starting

The procedure for starting a distributed installation of Appian is the same as for a non-distributed installation, except that you must start all instances of a given component, across all servers, before moving on to the next component. First make sure the RDBMS is running, then start all of the engines, then start all instances of the search server, then start the application servers.

If the Appian engines are running on different servers than Kafka & Zookeeper, either can be started first. The engines will wait for Kafka & Zookeeper before they become available.

Stopping

The procedure for stopping a distributed installation of Appian is the same as for a non-distributed installation, except that you must stop all instances of a given component, across all servers, before moving on to the next component. First shut down all application servers, then shut down all instances of the search server, then shut down all of the engines (using the --cluster option of the stop script).
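
The start and stop orders described above can be written down as an explicit plan. The following is a minimal sketch that only computes the sequence of (component, host) steps; it does not run any Appian scripts. The component labels, host names, and function name are hypothetical. Kafka and Zookeeper are left out because the text allows them to start before or after the engines.

```python
# Component order taken from the starting and stopping procedures.
START_ORDER = ["rdbms", "engines", "search-server", "application-server"]
STOP_ORDER = ["application-server", "search-server", "engines"]


def build_plan(order: list[str], hosts_by_component: dict[str, list[str]]):
    """List (component, host) steps, finishing each component on all hosts first."""
    steps = []
    for component in order:
        for host in hosts_by_component.get(component, []):
            steps.append((component, host))
    return steps


# Hypothetical environment layout.
hosts = {
    "rdbms": ["db1"],
    "engines": ["eng1", "eng2"],
    "search-server": ["search1", "search2", "search3"],
    "application-server": ["app1", "app2"],
}

for component, host in build_plan(START_ORDER, hosts):
    print(f"start or verify {component} on {host}")
```

Calling `build_plan(STOP_ORDER, hosts)` gives the shutdown sequence. For the RDBMS, the step means confirming that the database is running rather than starting it.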
