Dispatcher Paragon

This document describes how to reconfigure an etcd node or restore a whole etcd cluster that is used by Terminal Server.

Follow this guide only if you were referred to it from another chapter of the documentation.

Checking a Terminal Server etcd Cluster's Health

Ensure that a majority (more than half) of the etcd cluster members are up and running. If half or fewer are online, quorum cannot be established, and the health check will always report that the cluster is unhealthy.
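The majority rule above can be expressed as a short calculation. This is only an illustrative sketch; the function name has_quorum is hypothetical and is not part of Dispatcher Paragon or etcd.

```python
def has_quorum(total_members: int, online_members: int) -> bool:
    """Return True when more than half of the cluster members are online."""
    if total_members <= 0:
        raise ValueError("total_members must be positive")
    # A majority is strictly more than half: floor(n / 2) + 1 members.
    required = total_members // 2 + 1
    return online_members >= required


if __name__ == "__main__":
    # Three-node example environment: losing one node keeps quorum,
    # losing two nodes does not.
    for online in (3, 2, 1):
        print(online, "online ->", has_quorum(3, online))
```

With three members, one node can be offline without losing quorum. With two or four members, losing half of them already loses it.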
Connect to a server where Dispatcher Paragon Management Service is installed
1. Start Command line (CMD) and navigate to the "<install_dir>\SPOC\terminalserver\etcd\" folder
2. Check the Terminal Server etcd cluster's health
  1. Run this command:
    etcdctl.exe --endpoint http://10.0.5.217:2377 cluster-health
    Replace 10.0.5.217 with the actual IP address of a server where Spooler Controller is still functional or where the configuration will not be changed.
  2. The output will contain a list of Terminal Server etcd cluster members, and the last line will report the Terminal Server etcd cluster's health – it can be:
    1. cluster is healthy
    2. cluster is unhealthy
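If the health check is run from a script, the verdict can be read from the last non-empty line of the output, as described in the steps above. The following is a minimal sketch that assumes the output has already been captured as text; cluster_verdict is a hypothetical helper name.

```python
def cluster_verdict(output: str) -> str:
    """Return the final verdict line of a cluster-health output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty cluster-health output")
    last = lines[-1]
    # Only the two verdicts listed in this guide are accepted.
    if last not in ("cluster is healthy", "cluster is unhealthy"):
        raise ValueError(f"unexpected final line: {last!r}")
    return last
```

Raising an error on any other final line is a deliberate choice here, so that a failed or truncated command is not mistaken for a healthy cluster.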

When the Terminal Server etcd Cluster Is Healthy

If the etcd quorum is not lost, then you can remove the affected node from the etcd cluster configuration and add a new or reconfigured node.

If you do not mind reinstalling all Embedded Terminals that are managed by the affected Spooler Controller Group, you can instead use the much simpler procedure described in "When the Terminal Server etcd Cluster Is Unhealthy".

Example environment:

Management Service is installed on IP address 10.0.13.148
First Site Server is installed on IP address 10.0.5.217 – etcd member ID 5df1a03e6509526c
Second Site Server is installed on IP address 10.0.5.218 – etcd member ID 4698d36b2a32ca93
Third Site Server is installed on IP address 10.0.5.219 – etcd member ID 54237a9912a7236 (this node will be reinstalled and recovered)

Example Result of a Terminal Server etcd Cluster Health Check

In this example, the Third Site Server was reinstalled.

failed to check the health of member 54237a9912a7236 on http://10.0.5.219:2377: Get http://10.0.5.219:2377/health: dial tcp 10.0.5.219:2377: connectex: No connection could be made because the target machine actively refused it.
member 54237a9912a7236 is unreachable: [http://10.0.5.219:2377] are all unreachable
member 4698d36b2a32ca93 is healthy: got healthy result from http://10.0.5.218:2377
member 5df1a03e6509526c is healthy: got healthy result from http://10.0.5.217:2377
cluster is healthy
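To pick out the affected member ID from output like the example above, the "member <ID> is ..." lines can be matched with a regular expression. This sketch assumes the line format shown in this guide; member_states is a hypothetical helper name.

```python
import re
from typing import Dict

# Matches lines such as "member 4698d36b2a32ca93 is healthy: ..." and
# "member 54237a9912a7236 is unreachable: ...".
MEMBER_LINE = re.compile(
    r"^[ \t]*member ([0-9a-f]+) is (healthy|unreachable)\b", re.MULTILINE
)


def member_states(output: str) -> Dict[str, str]:
    """Map each etcd member ID to its reported state."""
    return {m.group(1): m.group(2) for m in MEMBER_LINE.finditer(output)}
```

Member IDs whose state is "unreachable" are the candidates for removal. The "failed to check the health of member" lines are intentionally not matched, because the same ID is repeated on the following "member ... is unreachable" line.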

Stop the Affected Node and Delete Its Data

Stop the Dispatcher Paragon Terminal Server service on the affected node
Delete the folder TS-XX.XX.XX.XX in "<install_dir>\SPOC\terminalserver\etcd\" on the affected node
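The folder to delete can be derived from the installation directory and the node's IP address. The sketch below assumes that TS-XX.XX.XX.XX stands for "TS-" followed by the node's IP address, which is inferred from the -data-dir example later in this guide; etcd_data_dir is a hypothetical helper name.

```python
import ntpath


def etcd_data_dir(install_dir: str, node_ip: str) -> str:
    """Build the Windows path of the Terminal Server etcd data folder."""
    # ntpath joins with backslashes regardless of the OS running the script.
    return ntpath.join(
        install_dir, "SPOC", "terminalserver", "etcd", f"TS-{node_ip}"
    )


if __name__ == "__main__":
    print(etcd_data_dir(r"c:\DispatcherParagon", "10.0.5.219"))
```

Check the actual folder name on the server before deleting anything, because the naming is an assumption in this sketch.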

Disable Authentication of the etcd Cluster

This step is only required if authentication is enabled on the etcd cluster. For more information, see Hardening etcd communication security.

Replace the placeholder username and password in this command with valid credentials for the etcd cluster and run it:

etcdctl.exe --endpoint http://10.0.5.217:2377 -u username:password auth disable

Alternatively, instead of disabling and later re-enabling authentication, you can supply the credentials to every command below by adding the -u username:password argument.
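When the commands in this guide are issued from a script, the optional -u argument can be added in one place. This is a minimal sketch that only assembles the argument list; etcdctl_args is a hypothetical helper name, and the port 2377 is taken from the examples in this guide.

```python
from typing import List, Optional


def etcdctl_args(
    endpoint_ip: str, command: List[str], credentials: Optional[str] = None
) -> List[str]:
    """Assemble an etcdctl.exe argument list, optionally with -u credentials."""
    args = ["etcdctl.exe", "--endpoint", f"http://{endpoint_ip}:2377"]
    if credentials:
        # credentials is expected in the form "username:password".
        args += ["-u", credentials]
    return args + list(command)


if __name__ == "__main__":
    print(" ".join(etcdctl_args("10.0.5.217", ["cluster-health"])))
```

The resulting list can be passed to subprocess.run from the etcd folder, so that no password has to be embedded in a shell string.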

Remove the Affected Node from the etcd Cluster

The affected node is the one for which the health check reports "failed to check the health of member".

Remove the affected node.
1. Run this command:
  etcdctl.exe --endpoint http://10.0.5.217:2377 member remove 54237a9912a7236
  Replace 10.0.5.217 with the actual IP address of a server where Spooler Controller is still functional or where the configuration was not changed.
  Replace 54237a9912a7236 with the actual etcd member ID of the reinstalled server.
2. The result should look like this:
  Removed member 54237a9912a7236 from cluster
Verify the cluster health again.
1. Run this command:
  etcdctl.exe --endpoint http://10.0.5.217:2377 cluster-health
  Replace 10.0.5.217 with the actual IP address of a server where Spooler Controller is still functional or where the configuration was not changed.
2. The result should look like this:
  member 4698d36b2a32ca93 is healthy: got healthy result from http://10.0.5.218:2377
  member 5df1a03e6509526c is healthy: got healthy result from http://10.0.5.217:2377
  cluster is healthy

Add the Affected Node to the etcd Cluster Again

Add the affected node again.
1. Run this command:
  etcdctl.exe --endpoint http://10.0.5.217:2377 member add TS-10.0.5.219 http://10.0.5.219:2378
  Replace 10.0.5.217 with the actual IP address of a server where Spooler Controller is still functional or where the configuration was not changed.
  Replace 10.0.5.219 with the actual IP address of the affected server.
2. The result should look like this:
  Added member named TS-10.0.5.219 with ID 188abf215116e622 to cluster
  
  ETCD_NAME="TS-10.0.5.219"
  ETCD_INITIAL_CLUSTER="TS-10.0.5.219=http://10.0.5.219:2378,TS-10.0.5.218=http://10.0.5.218:2378,TS-10.0.5.217=http://10.0.5.217:2378"
  ETCD_INITIAL_CLUSTER_STATE="existing"
Verify the cluster health again.
1. Run this command:
  etcdctl.exe --endpoint http://10.0.5.217:2377 cluster-health
  Replace 10.0.5.217 with the actual IP address of a server where Spooler Controller is still functional or where the configuration was not changed.
2. The result should look like this:
  member 188abf215116e622 is unreachable: no available published client urls
  member 4698d36b2a32ca93 is healthy: got healthy result from http://10.0.5.218:2377
  member 5df1a03e6509526c is healthy: got healthy result from http://10.0.5.217:2377
  cluster is healthy
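The ETCD_* lines printed by the member add command above are needed again when etcd is started on the affected node. The sketch below extracts them from the captured output, assuming the NAME="value" format shown in the example; parse_member_add is a hypothetical helper name.

```python
import re
from typing import Dict

# Matches lines such as ETCD_NAME="TS-10.0.5.219".
ETCD_VAR = re.compile(r'^[ \t]*(ETCD_[A-Z_]+)="([^"]*)"[ \t]*$', re.MULTILINE)


def parse_member_add(output: str) -> Dict[str, str]:
    """Return the ETCD_* variables printed by 'member add' as a dict."""
    return dict(ETCD_VAR.findall(output))
```

The value stored under ETCD_INITIAL_CLUSTER is the one to pass to -initial-cluster when etcd is started manually on the affected node.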

Connect to the Affected Node

Start Command line (CMD) and move to the "<install_dir>\SPOC\terminalserver\etcd\" folder.
Do not use PowerShell!
Run etcd manually to create the proper etcd configuration:
This is needed only once after the changes.
etcd64.exe -name TS-10.0.5.219 -data-dir "c:\DispatcherParagon\SPOC\terminalserver\etcd\TS-10.0.5.219" -initial-advertise-peer-urls http://10.0.5.219:2378 -listen-peer-urls http://10.0.5.219:2378 -listen-client-urls http://10.0.5.219:2377,http://127.0.0.1:2377 -advertise-client-urls http://10.0.5.219:2377 -initial-cluster-token safeq-cluster -initial-cluster TS-10.0.5.219=http://10.0.5.219:2378,TS-10.0.5.218=http://10.0.5.218:2378,TS-10.0.5.217=http://10.0.5.217:2378 -initial-cluster-state existing
Replace 10.0.5.219 with the actual IP address of the affected server.
Replace the -initial-cluster value with the ETCD_INITIAL_CLUSTER value that was shown in the output of the member add command when you added the affected node back to the cluster.
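Because the command above repeats the node's IP address many times, it is easy to miss one occurrence when editing it by hand. The following sketch assembles the same arguments from three inputs. It assumes the ports (2377 for clients, 2378 for peers) and the safeq-cluster token used in the example; etcd_start_args is a hypothetical helper name.

```python
from typing import List


def etcd_start_args(node_ip: str, data_dir: str, initial_cluster: str) -> List[str]:
    """Assemble the one-time etcd64.exe arguments for a re-added node."""
    peer_url = f"http://{node_ip}:2378"    # peer port from the example
    client_url = f"http://{node_ip}:2377"  # client port from the example
    return [
        "etcd64.exe",
        "-name", f"TS-{node_ip}",
        "-data-dir", data_dir,
        "-initial-advertise-peer-urls", peer_url,
        "-listen-peer-urls", peer_url,
        "-listen-client-urls", f"{client_url},http://127.0.0.1:2377",
        "-advertise-client-urls", client_url,
        "-initial-cluster-token", "safeq-cluster",
        # Value of ETCD_INITIAL_CLUSTER printed by "member add".
        "-initial-cluster", initial_cluster,
        "-initial-cluster-state", "existing",
    ]
```

Compare the assembled arguments with the command shown in this guide before running them on a production server.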
The command will not exit; it will keep running and print various messages. Wait until a message reports that the affected node was published, and then continue with the next step:
<datetime> I | etcdserver: published {Name:TS-10.0.5.219 ClientURLs:[http://10.0.5.219:2377]} to cluster fb81dcd206a7a785
Start the Dispatcher Paragon Terminal Server service.
- At this point, the command previously launched in CMD will exit.
Verify that "Offline storage refreshed" appears in the Terminal Server log after Terminal Server starts (it might take up to ten minutes before this record appears).
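If the etcd output from the manual start is captured to a file or pipe, the "published" message mentioned above can be detected per line. This sketch assumes the message format shown in this guide; published_name is a hypothetical helper name.

```python
import re
from typing import Optional

# Matches "etcdserver: published {Name:TS-10.0.5.219 ClientURLs:[...]} ...".
PUBLISHED = re.compile(r"etcdserver: published \{Name:(\S+) ")


def published_name(log_line: str) -> Optional[str]:
    """Return the published member name, or None for any other log line."""
    match = PUBLISHED.search(log_line)
    return match.group(1) if match else None
```

A script can read the output line by line and continue with the service start once published_name returns the expected TS-<IP> name.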

Connect to a Server Where Dispatcher Paragon Management Service Is Installed and Verify the etcd Cluster's Health Again

Start Command line (CMD) and navigate to the "<install_dir>\SPOC\terminalserver\etcd\" folder.
1. Run this command:
  etcdctl.exe --endpoint http://10.0.5.217:2377 cluster-health
  Replace 10.0.5.217 with the actual IP address of a server where Spooler Controller is still functional or where the configuration was not changed.
2. The result should look like this:
  member 188abf215116e622 is healthy: got healthy result from http://10.0.5.219:2377
  member 4698d36b2a32ca93 is healthy: got healthy result from http://10.0.5.218:2377
  member 5df1a03e6509526c is healthy: got healthy result from http://10.0.5.217:2377
  cluster is healthy
3. The node is now reconfigured.

Enable Authentication of the etcd Cluster

This step is only required if authentication was previously enabled on the etcd cluster and you disabled it earlier in this procedure.

Run these commands, replacing the placeholder credentials with valid credentials for the etcd cluster:

etcdctl.exe --endpoint http://10.0.5.217:2377 auth enable

etcdctl.exe --endpoint http://10.0.5.217:2377 -u root:password role remove guest

When the Terminal Server etcd Cluster Is Unhealthy

Unfortunately, you cannot add or remove nodes if the Terminal Server etcd quorum was lost. You can only recreate the Terminal Server etcd cluster.

All data stored inside the Terminal Server etcd cluster will be lost, so you will need to reinstall all affected Dispatcher Paragon Embedded Terminals after cluster recreation.

Stop the Dispatcher Paragon Terminal Server service on all nodes in the affected Spooler Controller Group.
1. Back up the folder TS-XX.XX.XX.XX in "<install_dir>\SPOC\terminalserver\etcd\" on all nodes (if present).
2. Delete the folder TS-XX.XX.XX.XX in "<install_dir>\SPOC\terminalserver\etcd\" on all nodes (if present).
Start the Dispatcher Paragon Terminal Server service on all nodes.
1. Verify that "Offline storage refreshed" can be seen in the Terminal Server log after the start of Terminal Server (it might take up to 10 minutes before this record appears)
Verify that the etcd cluster is healthy using etcdctl.exe.
Reinstall all Dispatcher Paragon Embedded Terminals that are managed by the affected Spooler Controller Group.
If etcd authentication is enabled (property enableEtcdApiAuth set to Enabled in the management interface Settings), then follow the Hardening etcd communication security guide to Enable authentication of the etcd API.
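The backup and delete steps of the procedure above can be combined into one helper that is run on each node after the Terminal Server service has been stopped. This is only a sketch under the assumption that a plain folder copy is an acceptable backup; backup_and_delete and the backup location are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Optional


def backup_and_delete(data_dir: Path, backup_root: Path) -> Optional[Path]:
    """Copy the etcd data folder to backup_root, then delete the original.

    Returns the backup path, or None if the data folder is not present.
    """
    if not data_dir.is_dir():
        return None  # matches the "(if present)" condition in the steps
    backup_root.mkdir(parents=True, exist_ok=True)
    target = backup_root / data_dir.name
    # copytree raises FileExistsError if a backup with this name exists,
    # so an earlier backup is never overwritten silently.
    shutil.copytree(data_dir, target)
    shutil.rmtree(data_dir)
    return target
```

The original folder is deleted only after the copy has completed, so a failed backup leaves the data in place.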