Data Server

The data server is a unique database service designed from the ground up for Appian. The architecture of the data server guarantees ACID-compliant transactions while also offering predictable performance for analytical workloads at scale. Data for record types with sync enabled is stored in the data server, along with other application data and metadata, including user-saved filters for a record list.

Architecture

The distributed architecture of the data server creates a fault-tolerant database service, significantly improving the reliability of reads and writes. The data server consists of the following:

  • An appender component for performing background operations
  • A historical store component for writing data
  • Multiple real-time store components for querying data
  • A watchdog component that monitors the health of all components and recovers a component in the event of an isolated failure
  • A data client that runs in the application server and provides an interface for Appian to make requests to the data server

Appender

The appender component performs all background operations in the data server. This includes periodically appending data from memory to the historical store database, generating a new snapshot database to be used by the real-time store components, and performing garbage collection. This component consists of a gateway that schedules and initiates each background operation and an engine that executes each background operation.

Data Client

The data client runs in the application server. It provides a simple interface for the rest of Appian to communicate with the data server.

Historical Store

The historical store (or hs) component executes write requests to the data server. This component consists of a gateway that listens for write requests from the application server and an engine that ensures a given write request is valid. After a write request has been validated, the historical store commits the transaction by forwarding the request to the Internal Messaging Service, which serves as a transaction log for the data server. The effects of a new transaction are distributed from the Internal Messaging Service to each component of the data server, and are periodically appended to the immutable kdb+ database that underlies the historical store. If the component fails before data has been appended to the aforementioned kdb+ database, the historical store achieves its durability guarantees by replaying transactions upon startup from the Internal Messaging Service transaction log.
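The write path described above (validate, commit to a transaction log, periodically append, replay on startup) can be illustrated with a small toy model. This is only a sketch of the general log-and-replay pattern, not the data server's implementation: the class names, the dictionary-based storage, and the offset bookkeeping are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TransactionLog:
    """Hypothetical stand-in for the transaction log role of the Internal Messaging Service."""
    entries: list = field(default_factory=list)

    def append(self, txn: dict) -> int:
        self.entries.append(txn)
        return len(self.entries) - 1  # offset of the committed transaction


@dataclass
class ToyHistoricalStore:
    log: TransactionLog
    persisted: dict = field(default_factory=dict)  # effects already appended to the database
    persisted_offset: int = -1                      # last log offset reflected in `persisted`
    memory: dict = field(default_factory=dict)      # effects not yet appended

    def write(self, txn: dict) -> int:
        # Validate first: an invalid request never reaches the log.
        if "key" not in txn or "value" not in txn:
            raise ValueError("invalid write request")
        offset = self.log.append(txn)  # the write counts as committed once it is in the log
        self.memory[txn["key"]] = txn["value"]
        return offset

    def append_to_database(self) -> None:
        # Periodic background step: fold in-memory effects into the persisted store.
        self.persisted.update(self.memory)
        self.persisted_offset = len(self.log.entries) - 1
        self.memory.clear()

    def recover(self) -> None:
        # On startup, replay every logged transaction that was not yet appended.
        self.memory = {}
        for txn in self.log.entries[self.persisted_offset + 1:]:
            self.memory[txn["key"]] = txn["value"]

    def read(self, key):
        return self.memory.get(key, self.persisted.get(key))
```

In this model, a write made after the last append survives a crash because recover() rebuilds the in-memory state from the log entries that follow persisted_offset.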

The data server depends on a running instance of the Internal Messaging Service for write transactions to be committed. If the Internal Messaging Service is unavailable, writes to the data server will fail.

Real-time Store

Each real-time store (or rts) component processes and executes query requests to the data server. When query requests are made from the application server, they are load-balanced across the real-time stores. Each real-time store component consists of a gateway that listens for query requests from the application server and an engine that serves the query request and provides the requested data. All real-time stores share an underlying kdb+ database, called the snapshot database, that is optimized for query performance. This database is generated periodically by the appender component.
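As an illustration of load-balancing queries across real-time stores, the sketch below alternates between two stores in round-robin order. The text above does not say which balancing strategy the data server uses, so round-robin is an assumption made for the example, and the class and store names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy balancer that hands each query to the next store in turn (assumed strategy)."""

    def __init__(self, stores):
        if not stores:
            raise ValueError("at least one real-time store is required")
        self._stores = cycle(stores)

    def route(self, query):
        # Return the store chosen for this query along with the query itself.
        return next(self._stores), query


balancer = RoundRobinBalancer(["rts-0", "rts-1"])
targets = [balancer.route(f"query-{i}")[0] for i in range(4)]
```

With two stores, consecutive queries alternate between them, so either store can serve any query because both read from the same snapshot database.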

The recommended topology for the data server is to run exactly two real-time store components. Currently, running more than two real-time store components is not supported. Self-managed customers who have the data server configured to run with only one real-time store component should update their topology upon upgrade.

Watchdog

The watchdog component runs a Java process that monitors the health of each data server component. In the event of an isolated failure of any other component, the watchdog will attempt to heal the failed component. If the watchdog component itself fails, it will self-heal.
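The monitor-and-heal behavior can be sketched as a single supervision pass. The actual watchdog is a Java process whose internals are not described here, so this Python function, its health-check callables, and its restart callback are purely hypothetical illustrations of the pattern.

```python
from typing import Callable, Dict, List


def heal_failed_components(
    health_checks: Dict[str, Callable[[], bool]],
    restart: Callable[[str], None],
) -> List[str]:
    """Run one supervision pass and return the names of components that were restarted."""
    healed = []
    for name, is_healthy in health_checks.items():
        if not is_healthy():
            restart(name)  # attempt to recover only the component that failed
            healed.append(name)
    return healed
```

A real supervisor would call a pass like this on a schedule and would also need some way to recover itself, which this sketch does not model.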

Configuration

Licensing

A valid license (k4.lic) is required to run the data server. See Requesting and Installing a License for information on obtaining and installing a k4.lic license.

Security

Requests to the data server are secured with a security token that's unique to every customer environment. For Appian Cloud customers, this token is generated during the site deployment. For self-managed customers, this token is generated by the configure script. If the token has not been set properly, the data server will not start, which will result in the application server not starting. See Data Server Connection Restrictions for more information.

Topology

The data server topology is specified in the appian-topology.xml file. See Configuring the Data Server for more information.

To start the data server, the appian-topology.xml file must be identical in both the <APPIAN_HOME>/conf/ directory and the <APPIAN_HOME>/data-server/conf/ directory. If the two files are not in sync, the data server will fail to start.
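Before starting the data server, it can be useful to confirm that the two topology files really are identical. The following is a minimal sketch using Python's standard library; the function name is hypothetical, and the two paths are taken from the text above with the Appian home directory passed in as an argument.

```python
import filecmp
from pathlib import Path


def topology_files_match(appian_home: str) -> bool:
    """Return True if both copies of appian-topology.xml have identical contents."""
    home = Path(appian_home)
    main_copy = home / "conf" / "appian-topology.xml"
    data_server_copy = home / "data-server" / "conf" / "appian-topology.xml"
    for path in (main_copy, data_server_copy):
        if not path.is_file():
            raise FileNotFoundError(f"missing topology file: {path}")
    # shallow=False compares file contents rather than only the os.stat() signatures.
    return filecmp.cmp(main_copy, data_server_copy, shallow=False)
```

A byte-for-byte comparison is stricter than an XML-aware one: it reports a mismatch even for whitespace-only differences, which errs on the safe side for a pre-start check.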

File System

The <APPIAN_HOME>/ae/data-server/ directory stores the data server binaries, scripts, configuration details, and data.

The data files are located in the <APPIAN_HOME>/ae/data-server/data/ directory. The hs directory contains the historical store database files and the ss directory contains the snapshot database files. Because access to the data server is latency-sensitive, the data should be hosted on a local disk rather than on a shared or external drive such as network-attached storage (NAS). This also applies to High Availability (HA) topologies, since each data server node stores its own copy of the data.

For disaster recovery purposes, the <APPIAN_HOME>/data-server/data/ directory and the Kafka logs should be backed up regularly.

See Internal Data for a comprehensive overview of where Appian persists data on the file system.

Logging

Each component of the data server writes its logs to the <APPIAN_HOME>/logs/data-server/ directory:

  • Appender: appender-engine.log and appender-gateway.log
  • Data Client: client.log
  • Historical Store: hs-engine.log and hs-gateway.log
  • Real-time Store: rts-engine-*.log and rts-gateway-*.log
  • Watchdog: watchdog.log

The log files contain important information about startup and shutdown, process execution, configuration, and errors. In the event of a system issue, these files should be shared with Appian Support. Note that the real-time store logs are numbered per component: rts-engine-0.log, rts-engine-1.log, and so on.

The <APPIAN_HOME>/logs/data-server/ directory will always be free of any customer business data, and can be safely exported without any risk of exposing sensitive data.
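When gathering these logs to share with Appian Support, a short script can collect them by name. The sketch below matches only the file names listed above; the function name is hypothetical, and the log directory is passed in as an argument rather than assumed.

```python
from pathlib import Path

# File names and patterns as listed for each data server component.
LOG_PATTERNS = [
    "appender-engine.log", "appender-gateway.log",
    "client.log",
    "hs-engine.log", "hs-gateway.log",
    "rts-engine-*.log", "rts-gateway-*.log",
    "watchdog.log",
]


def find_data_server_logs(log_dir: str) -> list[Path]:
    """Return the component log files present in log_dir, sorted by path."""
    directory = Path(log_dir)
    found = set()
    for pattern in LOG_PATTERNS:
        found.update(directory.glob(pattern))
    return sorted(found)
```

The wildcard patterns pick up one engine log and one gateway log per real-time store, so the same script works regardless of how many real-time stores are configured.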

The data server also logs other data, including performance metrics and traces. See Logging for a more comprehensive overview of Appian logs.

Recovery and Monitoring

Watchdog continuously monitors each component of the data server and restores functionality of each component in the event of an isolated failure.

To validate that the data server is running correctly, execute the <APPIAN_HOME>/data-server/bin/health.sh script (health.bat on Windows). The following information is displayed after executing the health script.

For the data server cluster:

  • node_count: Number of nodes in the cluster
  • healthy: true if the data server is functioning normally, false otherwise

For each node in the data server cluster:

  • hostname: Host name of the node
  • ip: IP address of the node
  • healthy: true if the data server is functioning normally on this node, false otherwise

Sizing Guidance

Disk Space

After it is started for the first time in your environment, the data server may take up to 40MB of disk space by default. If a site is not syncing record data, additional disk space usage from the data server will be negligible.

If a site is syncing record data, the data server will consume disk space proportional to the total amount of data synced from all sources. As a rough estimate, the data server is expected to consume up to 10 times the total amount of raw data. This occurs as a result of various optimizations, including building indices and creating read-only replicas to improve query performance. The exact disk space consumption will vary depending on the data types of record fields. For instance, syncing records with many large strings will consume significantly more disk space.
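The rough estimate above can be turned into a quick upper-bound calculation. This sketch assumes the 40MB initial footprint is added on top of the "up to 10 times" expansion of raw synced data; that combination, along with the function and constant names, is an assumption made for illustration and not an official sizing formula.

```python
BASELINE_MB = 40           # initial footprint after first start, per the guidance above
MAX_EXPANSION_FACTOR = 10  # "up to 10 times" the raw synced data


def estimate_max_disk_mb(raw_synced_mb: float) -> float:
    """Rough upper bound on data server disk usage, in MB."""
    if raw_synced_mb < 0:
        raise ValueError("raw_synced_mb must be non-negative")
    return BASELINE_MB + MAX_EXPANSION_FACTOR * raw_synced_mb
```

For example, 2048MB of raw synced data gives 40 + 10 × 2048 = 20520MB as a planning ceiling. Actual usage depends on the field data types, as noted above.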

To perform its internal optimizations on the data, the data server requires sufficient free disk space as overhead. If there is not enough disk space available for these optimizations, query performance will degrade. Once more space is provisioned or sufficient space is cleared, the data server will resume its background operations without any other intervention.

Memory

Combined, the watchdog, hs-gateway, rts-gateway-0, rts-gateway-1, and appender-gateway processes require approximately 200MB of memory to run. If a site is not syncing record data, additional memory usage from the data server will be negligible.

If a site is syncing record data, memory usage will spike significantly during the sync. Spikes also occur while complex queries and background operations are running. Under normal workloads, these spikes should not exceed 1GB; in a worst-case scenario, they could reach 4GB. For example, storing a synced record with excessively large text columns might result in a higher memory spike.
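The memory figures above can be combined into a simple planning budget. This sketch assumes that a spike adds to the roughly 200MB baseline and that 1GB equals 1024MB; both are assumptions for illustration, and the function and constant names are hypothetical.

```python
GATEWAY_BASELINE_MB = 200    # watchdog and gateway processes combined
NORMAL_SPIKE_MB = 1024       # spikes under normal workloads (1GB)
WORST_CASE_SPIKE_MB = 4096   # worst-case spikes (4GB)


def memory_budget_mb(syncing_records: bool, worst_case: bool = False) -> int:
    """Rough memory to reserve for the data server, in MB."""
    if not syncing_records:
        return GATEWAY_BASELINE_MB
    spike = WORST_CASE_SPIKE_MB if worst_case else NORMAL_SPIKE_MB
    return GATEWAY_BASELINE_MB + spike
```

Under these assumptions, a site that syncs record data would plan for about 1224MB under normal workloads and about 4296MB in the worst case.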

Starting and Stopping

To start or stop the data server, please refer to Starting and Stopping Appian.

Note that on Windows, a data server process started by a user with the script stops when that user logs out. To avoid this, install the data server as a Windows service and start and stop it from the Windows service management console. For instructions, see Installing data server as a Windows Service.

Troubleshooting

If the data server is unreachable when the application server starts up, or if the application server is started before the data server is running, the application server will not start.

If the data server cannot start and the watchdog.log indicates an issue with the security token, refer to the Data Server Connection Restrictions page for troubleshooting.

If the data server stops running while the application server is running, you will not be able to access, create, update, or delete user-saved filters. Additionally, any synced records will be temporarily inaccessible. See the documentation on monitoring and troubleshooting synced records for more information.

Built: Fri, Oct 22, 2021 (11:11:24 AM)
