High Availability

Overview

This page details how to configure high availability for installations of Appian on Kubernetes.

What is high availability?

Highly available installations of Appian are robust to certain classes of hardware and infrastructure failure. For an installation of Appian on Kubernetes to be considered highly available, each of its components must run more than one replica, spread across different failure domains, such as nodes and zones, such that the unexpected loss of one pod, node, or zone does not take out all replicas of the component. Replicas in a highly available installation may be spread across failure domains as long as there is low (less than 10ms) network latency between replicas.

How to set up high availability

Create a ReadWriteMany persistent volume claim

Just as server-based, highly available installations of Appian require shared file storage, highly available installations of Appian on Kubernetes require ReadWriteMany storage.

Create a ReadWriteMany persistent volume claim in the same namespace as your Appian custom resource. For a list of in-tree volume plugins that support ReadWriteMany, see Access Modes. For a list of out-of-tree CSI volume plugins that do so, see Drivers (search for "Read/Write Multiple Pod").

Note: This persistent volume claim must be separate from the ReadWriteMany persistent volume claim created when setting up Health Check.
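
As a minimal sketch of such a claim: the name, namespace, storage class, and size below are all assumptions chosen for illustration, not values required by Appian. Substitute a storage class backed by a volume plugin that supports ReadWriteMany in your cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Hypothetical name; this is the value you would later reference in haExistingClaim
  name: appian-ha
  # Hypothetical namespace; must be the namespace of your Appian custom resource
  namespace: appian
spec:
  accessModes:
    - ReadWriteMany
  # Assumption: a storage class in your cluster that supports ReadWriteMany
  storageClassName: nfs-client
  resources:
    requests:
      # Illustrative size only; size it for your own installation
      storage: 10Gi
```

With a claim like this in place, the haExistingClaim fields in the Appian custom resource would be set to the claim's name (appian-ha in this sketch).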

Configure your Appian custom resource

For most use cases, configuring high availability for your installation of Appian on Kubernetes is as easy as setting the replicas, podDisruptionBudget, and haExistingClaim fields in your Appian custom resource:

apiVersion: crd.k8s.appian.com/v1beta1
kind: Appian
metadata:
  name: appian
spec:
  zookeeper:
    # https://docs.appian.com/suite/help/latest/High_Availability_and_Distributed_Installations.html#exactly-three-(3)-instances-of-search-server,-data-service,-and-the-internal-messaging-service
    replicas: 3
    podDisruptionBudget:
      minAvailable: 2

  kafka:
    # https://docs.appian.com/suite/help/latest/High_Availability_and_Distributed_Installations.html#exactly-three-(3)-instances-of-search-server,-data-service,-and-the-internal-messaging-service
    replicas: 3
    podDisruptionBudget:
      minAvailable: 2

  searchServer:
    # https://docs.appian.com/suite/help/latest/High_Availability_and_Distributed_Installations.html#exactly-three-(3)-instances-of-search-server,-data-service,-and-the-internal-messaging-service
    replicas: 3
    podDisruptionBudget:
      minAvailable: 2

  dataServer:
    # https://docs.appian.com/suite/help/latest/High_Availability_and_Distributed_Installations.html#exactly-three-(3)-instances-of-search-server,-data-service,-and-the-internal-messaging-service
    replicas: 3
    podDisruptionBudget:
      minAvailable: 2

  serviceManager:
    # Or greater to meet increased demand
    # https://docs.appian.com/suite/help/latest/High_Availability_and_Distributed_Installations.html#at-least-two-(2)-instances-of-the-application-server-and-appian-engines
    # https://docs.appian.com/suite/help/latest/Scaling_Appian.html#add-engine-replicas
    replicas: 3
    podDisruptionBudget:
      # Or greater, a percentage, or maxUnavailable (as long as at least 1 replica is always available)
      # https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
      minAvailable: 1

    # USER ACTION REQUIRED - Update to match the name of your ReadWriteMany persistent volume claim
    haExistingClaim: ""

  webapp:
    # Or greater to meet increased demand
    # https://docs.appian.com/suite/help/latest/High_Availability_and_Distributed_Installations.html#at-least-two-(2)-instances-of-the-application-server-and-appian-engines
    # https://docs.appian.com/suite/help/latest/Scaling_Appian.html#add-application-servers
    replicas: 3
    podDisruptionBudget:
      # Or greater, a percentage, or maxUnavailable (as long as at least 1 replica is always available)
      # https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget
      minAvailable: 1

    # USER ACTION REQUIRED - Update to match the name of your ReadWriteMany persistent volume claim
    haExistingClaim: ""

For this configuration, the Appian operator will set the replicas field of each component's stateful set to 3 and configure its affinity field with default inter-pod anti-affinity terms such that the component's replica pods run on different nodes and, if possible, in different zones. It will also create pod disruption budgets for each component to protect against certain types of voluntary disruptions.
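
For illustration, a pod disruption budget equivalent to the Zookeeper setting above might look roughly like the following standard Kubernetes object. The object name and label selector are hypothetical; the operator creates these objects for you, and the names and labels it actually uses may differ.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  # Hypothetical name; the operator chooses its own
  name: appian-zookeeper
spec:
  # Mirrors .spec.zookeeper.podDisruptionBudget.minAvailable from the custom resource
  minAvailable: 2
  selector:
    matchLabels:
      # Hypothetical label; the operator's actual pod labels may differ
      example.com/component: zookeeper
```

With 3 replicas and minAvailable: 2, a voluntary disruption such as a node drain can evict at most one of the component's pods at a time.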

Note: If you define your own pod anti-affinity for a component, you must include the terms that would otherwise be defaulted by the operator. For more information, see Default inter-pod anti-affinity for Appian custom resources.
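
To show the general shape of such terms, here is a sketch of a Kubernetes affinity block that requires replicas to run on different nodes and prefers that they run in different zones. It is not the operator's actual default: the label selector is hypothetical, and the authoritative terms are the ones listed in Default inter-pod anti-affinity for Appian custom resources.

```yaml
affinity:
  podAntiAffinity:
    # Hard requirement: no two matching pods on the same node
    requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:
          matchLabels:
            # Hypothetical label; use the labels from the operator's defaults
            example.com/component: zookeeper
    # Soft preference: spread matching pods across zones when possible
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: topology.kubernetes.io/zone
          labelSelector:
            matchLabels:
              example.com/component: zookeeper
```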

Topology changes

The Appian operator's validating admission webhooks disallow most live, topology-related changes to Appian custom resources. The only allowed, live, topology-related changes are:

Changing the number of real-time stores for each Data Server replica (.spec.dataServer.topology.rtsCount). Updating this field requires restarting both the Data Server and Webapp pods.
Changing the number of Webapp replicas (.spec.webapp.replicas). Updating this field is disallowed both before Appian has started and when .spec.webapp.haExistingClaim hasn't already been set.
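
As a sketch of the two live changes above, the relevant fields sit in the Appian custom resource as follows. The values and the claim name are illustrative assumptions only; the field paths are the ones named in the list.

```yaml
apiVersion: crd.k8s.appian.com/v1beta1
kind: Appian
metadata:
  name: appian
spec:
  dataServer:
    topology:
      # Illustrative value; changing it requires restarting Data Server and Webapp pods
      rtsCount: 2
  webapp:
    # Illustrative value; only allowed once Appian has started and haExistingClaim is set
    replicas: 4
    # Hypothetical claim name; must already have been set before scaling
    haExistingClaim: appian-ha
```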

Other topology-related changes are only allowed when deleting and recreating Appian custom resources:

Increasing the number of shards of the process analytics and process execution engines (.spec.serviceManager.topology.analyticsExecShardCount).
Changing the number of Service Manager / engine replicas (.spec.serviceManager.replicas / .spec.serviceManager.engineOverrides.<ENGINE>.replicas). Updating this field is disallowed when .spec.serviceManager.haExistingClaim wasn't previously set.
Changing the number of Webapp replicas (.spec.webapp.replicas). Updating this field is disallowed when .spec.webapp.haExistingClaim wasn't previously set.
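
For orientation, the fields in this second list sit in the custom resource as sketched below. The values and the claim name are illustrative assumptions; per-engine overrides under .spec.serviceManager.engineOverrides are omitted.

```yaml
apiVersion: crd.k8s.appian.com/v1beta1
kind: Appian
metadata:
  name: appian
spec:
  serviceManager:
    topology:
      # Illustrative value; shards of the process analytics and process execution engines
      analyticsExecShardCount: 6
    # Illustrative value; only changeable if haExistingClaim was previously set
    replicas: 3
    # Hypothetical claim name
    haExistingClaim: appian-ha
  webapp:
    # Illustrative value; only changeable if haExistingClaim was previously set
    replicas: 3
    haExistingClaim: appian-ha
```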

Note: All other topology-related changes are only allowed when deleting and recreating Appian custom resources, and they require manual data manipulation and actions as documented in Converting a standalone environment to HA and Converting an HA environment to standalone. Documentation on performing these operations for installations of Appian on Kubernetes will be made available at a later time.
