Special Features

Synchronizing Supvisors instances

The overall design of Supvisors is to add a Supvisors plugin into every Supervisor instance, and to make them share the events generated by Supervisor with each other.

To that end, a communication protocol needs to be put in place between all Supvisors instances. Given the objectives of Supvisors, a polling mechanism doesn’t fit. All Supervisor events have to be processed, so an event-driven protocol is the natural choice.

Communication protocols

Two internal communication protocols are used in Supvisors.

XML-RPC publication

The main protocol implemented in Supvisors is based on the XML-RPC protocol provided by Supervisor. It is used to share the local events with the other Supvisors instances.

The XML-RPC protocol was originally discarded because it easily led to deadlocks when requests involved multiple Supervisor instances. A first implementation was therefore based on a PyZmq PUB-SUB pattern. It was later replaced by a custom implementation to limit the mandatory dependencies and to have better control over the underlying threads and sockets. In both cases, the events were sent over a TCP socket and posted sequentially to the local Supervisor using a supervisor.sendRemoteCommEvent XML-RPC.

Finally, with a proper understanding of the limitations brought by the XML-RPC implementation and its non-thread-safe nature, the Supvisors design has been simplified so that the local events and requests are processed in threads dedicated to each Supervisor proxy.

The TICK events are sent to all Supvisors instances discovered or declared in the supvisors_list option of the [supvisors] section in the Supervisor configuration file, with the exception of ISOLATED instances. As soon as the Supvisors instance is CHECKED, all other events are shared.

UDP Multicast

The second protocol implemented in Supvisors is based on UDP Multicast. It relies on the following options in the [supvisors] section in the Supervisor configuration file:

multicast_group ;

multicast_interface ;

multicast_ttl.

With this protocol, the Supvisors instances could be unknown at start-up and will be discovered on-the-fly. The UDP Multicast group is used to exchange ticks. Upon reception of a tick coming from an unknown Supvisors instance, the local Supvisors instance adds the remote Supvisors instance into its internal model and opens the TCP connections with it.
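As an illustration of the discovery principle described above, here is a minimal sketch based on the Python standard library only. It is not the Supvisors implementation: the function names, the way the group, port, interface and TTL are passed, and the `known_instances` set are all assumptions made for the example.

```python
import socket
import struct


def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the group / interface pair expected by IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_tick_listener(group: str, port: int,
                       interface: str = "0.0.0.0", ttl: int = 1) -> socket.socket:
    """Open a UDP socket that has joined the multicast group (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group, interface))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def on_tick(known_instances: set, identifier: str) -> bool:
    """Register the sender of a tick; return True if it was unknown until now."""
    if identifier in known_instances:
        return False
    known_instances.add(identifier)
    return True
```

In this sketch, a `True` result from `on_tick` is the point where the local instance would add the remote instance to its internal model and open the TCP connections with it.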

Note

Although it has been considered at some point, the idea of having Supvisors working only in UDP Multicast, without the TCP Publish / Subscribe, has been discarded. Supvisors cannot afford to lose events or to receive them in an inappropriate sequence.

Principles of Synchronization

The INITIALIZATION state of Supvisors is used as a synchronization phase so that all Supvisors instances are mutually aware of each other.

The following options defined in the rpcinterface extension point of the Supervisor configuration file are particularly used for synchronizing multiple instances of Supervisor:

supvisors_list ;

synchro_options ;

synchro_timeout ;

core_identifiers ;

auto_fence.

Common part

Once started, all Supvisors instances publish the events received from Supervisor, especially the TICK events that are triggered every 5 seconds.

At the beginning, all Supvisors instances are declared in an UNKNOWN state. When the first TICK event is received from a remote Supvisors instance, a hand-shake is performed between the 2 Supvisors instances. The local Supvisors instance:

sets the remote Supvisors instance state to CHECKING ;

performs a supvisors.get_instance_info(local_identifier) XML-RPC to the remote Supvisors instance in order to know how the local Supvisors instance is perceived by the remote Supvisors instance.

At this stage, there are 2 possibilities:

the local Supvisors instance is seen as ISOLATED by the remote instance:

the remote Supvisors instance status is then reciprocally set to ISOLATED ;

the local Supvisors instance is NOT seen as ISOLATED by the remote instance:

a supervisor.getAllProcessInfo() XML-RPC is requested to the remote instance ;

the processes information is loaded into the internal data structure ;

the remote Supvisors instance status is set to CHECKED, then RUNNING.
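The hand-shake above can be sketched as a small function. This is only an illustration under stated assumptions: the two callables stand in for the supvisors.get_instance_info and supervisor.getAllProcessInfo XML-RPCs addressed to the remote instance, and the `statename` key used to read the perceived state is an assumption about the structure of the returned payload.

```python
from enum import Enum


class InstanceState(Enum):
    UNKNOWN = "UNKNOWN"
    CHECKING = "CHECKING"
    CHECKED = "CHECKED"
    RUNNING = "RUNNING"
    ISOLATED = "ISOLATED"


def handshake(local_identifier, get_instance_info, get_all_process_info):
    """Return the states taken by the remote instance and the loaded process info."""
    states = [InstanceState.CHECKING]
    # ask the remote instance how it perceives the local instance
    perceived = get_instance_info(local_identifier)
    if perceived["statename"] == "ISOLATED":  # assumed key name
        states.append(InstanceState.ISOLATED)
        return states, []
    # not isolated: load the remote processes information
    processes = get_all_process_info()
    states += [InstanceState.CHECKED, InstanceState.RUNNING]
    return states, processes
```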

What happens next will depend on the conditions selected in the synchro_options option.

Whatever the number of available Supvisors instances, Supvisors elects a Master among the active Supvisors instances and enters the DISTRIBUTION state to automatically start the applications.

By default, the Supvisors Master instance is the Supvisors instance having the smallest deduced name among all the active Supvisors instances, unless the attribute core_identifiers is used. In the latter case, candidates are taken from this list in priority.
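The default election rule can be illustrated with the short sketch below. The function name is hypothetical, and choosing the smallest name among the active core instances is an assumption: the text only states that candidates are taken from core_identifiers in priority.

```python
def elect_master(active_identifiers, core_identifiers=()):
    """Pick a Master among the active instances, favoring the core ones."""
    active = set(active_identifiers)
    if not active:
        return None
    # candidates are taken from the core identifiers in priority
    core_candidates = [ident for ident in core_identifiers if ident in active]
    return min(core_candidates) if core_candidates else min(active)
```

For example, with the active instances `console`, `srv1` and `srv2`, the sketch elects `console` by default, and `srv1` if `srv1` and `srv2` are the core identifiers.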

Important

About late Supvisors instances

When a Supvisors instance is started while the others are already in OPERATION, the local Supvisors instance gets, during the hand-shake, the Master identified by the remote Supvisors instance. This confirms that the local Supvisors instance is a late starter, so it adopts this Master too and skips the synchronization phase.

`STRICT` option

When the STRICT option is selected, the synchronization is complete when all Supvisors instances declared in the supvisors_list option are marked as RUNNING. This excludes any Supvisors instance that has been added to Supvisors in discovery mode.

This option prevails over the LIST and USER options if combined with them.

`LIST` option

When the LIST option is selected, the synchronization is complete when all known Supvisors instances are marked as RUNNING. This includes the Supvisors instances declared in the supvisors_list option AND the Supvisors instances that have been added to Supvisors in discovery mode.

This option prevails over the USER option if combined with it.

`TIMEOUT` option

It may happen that some declared Supvisors instances do not publish (very late starting, no starting at all, system down, network down, etc).

When the TIMEOUT option is selected, each Supvisors instance waits for synchro_timeout seconds to give a chance to all other instances to publish. When this delay is exceeded, all the Supvisors instances that are not identified as RUNNING or ISOLATED are set to:

SILENT if Auto-Fencing is not activated ;

ISOLATED if Auto-Fencing is activated.

This option prevails over all other synchro_options options if combined with them.
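The effect of the timeout on the instance states can be sketched as follows. The function name and the representation of the states as a dictionary of strings are assumptions made for the example.

```python
def apply_synchro_timeout(states: dict, auto_fence: bool) -> dict:
    """Return the instance states once synchro_timeout seconds have elapsed."""
    fallback = "ISOLATED" if auto_fence else "SILENT"
    return {identifier: state if state in ("RUNNING", "ISOLATED") else fallback
            for identifier, state in states.items()}
```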

`CORE` option

Another possibility is when it is predictable that some Supvisors instances may be started later. For example, the pool of nodes may include servers that will always be started from the very beginning and consoles that may be started only on demand.

In this case, it would be a pity to always wait for synchro_timeout seconds. That’s why the core_identifiers attribute has been introduced so that the synchronization phase is considered completed when a subset of the Supvisors instances declared in supvisors_list are RUNNING.

This option prevails over LIST and USER options if combined with them.
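The completion conditions of the STRICT, LIST and CORE options can be compared with the sketch below. It evaluates each option in isolation and does not model how combined options prevail over each other. The function names and the `states` dictionary, assumed to hold every known instance whether declared or discovered, are hypothetical.

```python
def all_running(states: dict, identifiers) -> bool:
    """True if every given instance is marked as RUNNING."""
    return all(states.get(identifier) == "RUNNING" for identifier in identifiers)


def sync_complete(option: str, states: dict, supvisors_list, core_identifiers=()) -> bool:
    """Evaluate the completion condition of a single synchro option."""
    if option == "STRICT":
        # only the declared instances are considered
        return all_running(states, supvisors_list)
    if option == "LIST":
        # declared AND discovered instances are considered
        return all_running(states, states)
    if option == "CORE":
        return bool(core_identifiers) and all_running(states, core_identifiers)
    raise ValueError(f"unsupported option: {option}")
```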

`USER` option

This option is useful in a context where Supvisors is running in a system made up of many nodes that may be started on a random basis and where core Supvisors instances cannot be easily identified.

When the USER option is selected, it allows the user to put an end to the synchronization phase when the set of running Supvisors instances is suitable to the user.

This action can be performed through the Supvisors end_sync XML-RPC (via code, supervisorctl or the Supvisors Web UI). This XML-RPC has an optional parameter that allows the user to select the Supvisors Master instance. If not set, the default election mechanism applies.

Auto-Fencing

Auto-fencing is applied when the auto_fence option of the rpcinterface extension point is set. It takes place when one of the Supvisors instances is seen as inactive (crash, system power down, network failure) by the other Supvisors instances.

In this case, the running Supvisors instances disconnect the corresponding URL from their subscription socket. The Supvisors instance is marked as ISOLATED and, in accordance with the program rules defined, Supvisors may restart elsewhere the processes that were running in that Supvisors instance.

If the incriminated Supvisors instance is restarted, the isolation doesn’t prevent the new Supvisors instance from receiving events from the other instances that have isolated it. Indeed, filtering the subscribers on the Publish side has not been considered so far.

That’s why the hand-shake is performed in Synchronizing Supvisors instances. Each newly arrived Supvisors instance asks the others whether it has been previously isolated before taking the incoming events into account.

In the case of a network failure, the same mechanism is of course applied on the other side. This is the premise of a split-brain syndrome, as it leads to 2 separate and identical sets of applications.

If the network failure is fixed, both sets of Supvisors instances are still running but do not communicate with each other.

Attention

Supvisors does NOT isolate the nodes at the Operating System level, so when the incriminated nodes become active again, it is still possible to perform network requests between all nodes, even though the Supvisors instances no longer communicate.

Similarly, it is outside the scope of Supvisors to isolate the communication at application level. It is the user’s responsibility to isolate their applications.

Extra Arguments

Supervisor users have requested the possibility to add extra arguments to the command line of a program without having to update and reload the program configuration in Supervisor.

#1023 - Pass arguments to program when starting a job?

Indeed, the applicative context is evolving at runtime and it may be quite useful to give some information to the new process (options, path, URL of a server, URL of a display, etc), especially when dealing with distributed applications.

Supvisors introduces new XML-RPCs that are capable of taking into account extra arguments that are passed to the command line before the process is started:

supvisors.start_args: start a process in the local Supvisors instance ;

supvisors.start_process: start a process using a starting strategy.

Note

The extra arguments of the program are shared by all Supvisors instances. Once used, they are published through a Supvisors internal event and are stored directly into the Supervisor internal configuration of the programs.

In other words, consider 2 Supvisors instances A and B, and a process that is started in Supvisors instance A with extra arguments and configured to restart on node crash (refer to Running Failure strategy). If the Supvisors instance A crashes (or simply becomes unreachable), the process will be restarted in the Supvisors instance B with the same extra arguments.

Attention

A limitation however: the extra arguments are reset each time a new Supvisors instance connects to the other ones, either because it has started later or because it has been disconnected for a while due to a network issue.

Starting strategy

Supvisors provides a means to start a process without telling explicitly where it has to be started, and in accordance with the rules defined for this program.

Choosing a Supvisors instance

The following rules are applicable whatever the chosen strategy:

the process must not be already in a running state in a broad sense, i.e. RUNNING, STARTING or BACKOFF ;

the process must be known to the Supervisor of the targeted Supvisors instance ;

the related program must be enabled in the targeted Supvisors instance ;

the targeted Supvisors instance must be RUNNING ;

the targeted Supvisors instance must be allowed in the identifiers rule of the process ;

the load of the targeted node where multiple Supvisors instances may be running must not exceed 100% when adding the expected_loading of the program to be started.

The load of a Supvisors instance is defined as the sum of the expected_loading of each process running in this Supvisors instance.

The load of a node is defined as the sum of the loads of the Supvisors instances that are running on this node.
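The two load definitions and the 100% rule can be expressed with the following sketch. The function names and the data layout (a list of expected_loading values per Supvisors instance) are assumptions made for the example.

```python
def instance_load(expected_loadings) -> int:
    """Sum of the expected_loading of each process running in the instance."""
    return sum(expected_loadings)


def node_load(instances_on_node: dict) -> int:
    """Sum of the loads of the Supvisors instances running on the node."""
    return sum(instance_load(loadings) for loadings in instances_on_node.values())


def can_host(instances_on_node: dict, expected_loading: int) -> bool:
    """True if starting the program keeps the node load within 100%."""
    return node_load(instances_on_node) + expected_loading <= 100
```

For a node hosting two instances loaded at 40% and 50%, a program with an expected_loading of 10 can still be started, whereas a program with an expected_loading of 11 cannot.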

When applying the CONFIG strategy, Supvisors chooses the first Supvisors instance available in the supvisors_list.

TODO: discovery

When applying the LESS_LOADED strategy, Supvisors chooses the Supvisors instance in the supvisors_list having the lowest load. The aim is to distribute the process load among the available Supvisors instances.

When applying the MOST_LOADED strategy, Supvisors chooses the Supvisors instance in the supvisors_list having the greatest load. The aim is to maximize the loading of a Supvisors instance before starting to load another Supvisors instance. This strategy is more interesting when the resources are limited.

When applying the LESS_LOADED_NODE strategy, Supvisors chooses the Supvisors instance in the supvisors_list having the lowest load on the node having the lowest load.

When applying the MOST_LOADED_NODE strategy, Supvisors chooses the Supvisors instance in the supvisors_list having the greatest load on the node having the greatest load.

When applying the LOCAL strategy, Supvisors chooses the local Supvisors instance. A typical use case is to start an HCI application on a given console, while other applications / services may be distributed over other nodes.

Attention

A consequence of choosing the LOCAL strategy as the default starting_strategy in the rpcinterface extension point is that all programs will be started on the Supvisors Master instance.

Note

When a single Supvisors instance is running on each node, LESS_LOADED_NODE and MOST_LOADED_NODE are strictly equivalent to LESS_LOADED and MOST_LOADED.
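The strategies described above can be compared with a small selection function. This is a sketch, not the Supvisors code: it assumes that `candidates` is already filtered by the common rules and ordered as in supvisors_list, that `loads` gives the load of every instance and `nodes` the node hosting it, and that ties are resolved in favor of the first candidate. All names are hypothetical.

```python
from collections import defaultdict


def choose_instance(strategy, candidates, loads, nodes, local_identifier=None):
    """Return the identifier of the chosen Supvisors instance, or None."""
    if not candidates:
        return None
    if strategy == "CONFIG":
        return candidates[0]
    if strategy == "LOCAL":
        return local_identifier if local_identifier in candidates else None
    if strategy == "LESS_LOADED":
        return min(candidates, key=loads.__getitem__)
    if strategy == "MOST_LOADED":
        return max(candidates, key=loads.__getitem__)
    if strategy in ("LESS_LOADED_NODE", "MOST_LOADED_NODE"):
        pick = min if strategy == "LESS_LOADED_NODE" else max
        # the load of a node is the sum of the loads of its instances
        node_loads = defaultdict(int)
        for identifier, load in loads.items():
            node_loads[nodes[identifier]] += load
        # only the nodes hosting a candidate are eligible
        eligible = list(dict.fromkeys(nodes[ident] for ident in candidates))
        node = pick(eligible, key=node_loads.__getitem__)
        return pick([ident for ident in candidates if nodes[ident] == node],
                    key=loads.__getitem__)
    raise ValueError(f"unknown strategy: {strategy}")
```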

Starting a process

The internal Starter of Supvisors applies the following logic to start a process:

if the process is stopped:
    choose a Supvisors instance for the process in accordance with the rules defined in the previous section
    perform a supvisors.start_args(namespec) XML-RPC to the chosen Supvisors instance

This single job is considered completed when:

a RUNNING event is received and the wait_exit rule is not set for this process ;

an EXITED event is received with an expected exit code and the wait_exit rule is set for this process ;

an error is encountered (FATAL event, EXITED event with an unexpected exit code) ;

no STARTING event has been received 2 ticks after the XML-RPC ;

no RUNNING event has been received X+2 ticks after the XML-RPC, X corresponding to the number of ticks needed to cover the startsecs seconds of the program definition in the Supvisors instance where the process has been requested to start.

This principle is used for starting a single process using a supvisors.start_process XML-RPC.
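The two tick-based deadlines of a starting job can be computed as in the sketch below. It assumes the 5-second TICK period mentioned earlier and assumes that the number of ticks needed to cover startsecs is rounded up. The names are hypothetical.

```python
import math

TICK_PERIOD = 5  # seconds between two TICK events


def ticks_to_cover(seconds: float) -> int:
    """Number of ticks needed to cover the given duration (rounded up)."""
    return math.ceil(seconds / TICK_PERIOD)


def starting_deadline() -> int:
    """Ticks to wait for a STARTING event after the XML-RPC."""
    return 2


def running_deadline(startsecs: float) -> int:
    """Ticks to wait for a RUNNING event after the XML-RPC (X+2)."""
    return ticks_to_cover(startsecs) + 2
```

Under these assumptions, a program configured with startsecs set to 12 would be given 5 ticks, i.e. about 25 seconds, to reach the RUNNING state.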

Attention

About using the wait_exit rule

If the process is expected to exit and does not exit, it will block the Starter until Supvisors is restarted.

Starting an application

The application start sequence is re-evaluated every time a new Supvisors instance becomes active in Supvisors. Indeed, as explained above, the internal data structure is updated with the programs configured in the new Supervisor instance and this may have an impact on the application start sequence.

The start sequence corresponds to a dictionary where:

the keys correspond to the list of start_sequence values defined in the program rules of the application ;

the value associated to a key contains the list of programs having this key as start_sequence.

Hint

The logic applied here is an answer to the following Supervisor unresolved issues:

#122 - supervisord Starts All Processes at the Same Time

#456 - Add the ability to set different “restart policies” on process workers

Important

Only the Managed applications can have a start sequence, i.e. only those that are declared in the Supvisors’ Rules File.

The programs having a start_sequence lower than or equal to 0 are not considered in the start sequence, as they are not meant to be automatically started.

The internal Starter of Supvisors applies the following principle to start an application:

while application start sequence is not empty:
    pop the process list having the lowest (strictly positive) start_sequence
    for each process in process list:
        apply Starting a process
    wait for the jobs to complete

This principle is used for starting a single application using a supvisors.start_application XML-RPC.
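The start sequence dictionary and the loop above can be sketched in Python as follows. The function names are hypothetical, and the `start_process` and `wait_for_jobs` callables stand in for the Starter internals, which are not detailed here.

```python
from collections import defaultdict


def build_start_sequence(program_rules: dict) -> dict:
    """Group the program names by strictly positive start_sequence value."""
    sequence = defaultdict(list)
    for name, start_sequence in program_rules.items():
        if start_sequence > 0:  # programs at 0 or below are not started automatically
            sequence[start_sequence].append(name)
    return dict(sequence)


def start_application(sequence: dict, start_process, wait_for_jobs) -> None:
    """Start the programs group by group, in ascending start_sequence order."""
    remaining = dict(sequence)
    while remaining:
        lowest = min(remaining)
        for name in remaining.pop(lowest):
            start_process(name)
        # the next group is only considered once the current jobs are completed
        wait_for_jobs()
```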

Starting all applications

When entering the DISTRIBUTION state, all Supvisors instances evaluate the global start sequence using the start_sequence rule configured for the applications and processes.

The global start sequence corresponds to a dictionary where:

the keys correspond to the list of start_sequence values defined in the application rules ;

the value associated to a key is the list of application start sequences whose applications have this key as start_sequence.

The Supvisors Master instance starts the applications using the global start sequence. The following pseudo-code explains the logic used:

while global start sequence is not empty:
    pop the application list having the lowest (strictly positive) start_sequence
    for each application in application list:
        apply Starting an application
    wait for the jobs to complete

Note

The applications having a start_sequence lower than or equal to 0 are not considered, as they are not meant to be automatically started.

Important

When leaving the DISTRIBUTION state, it may happen that some applications are not started properly due to missing relevant Supvisors instances.

When a Supvisors instance is started later and is authorized in the Supvisors ensemble, Supvisors transitions back to the DISTRIBUTION state and tries to repair such applications. The applications are not restarted. Only the stopped processes are considered.

Should the new Supvisors instance arrive during a DISTRIBUTION or CONCILIATION phase, the transition to the DISTRIBUTION state is deferred until the current distribution or conciliation jobs are completed. It has been chosen NOT to transition back to the INITIALIZATION state to avoid a new synchronization phase.

Starting Failure strategy

When an application is starting, it may happen that some of its programs cannot be started for various reasons:

the program command line is wrong ;

third parties are missing ;

none of the Supvisors instances defined in the identifiers of the program rules are started ;

the applicable Supvisors instances are already too heavily loaded ;

etc.

Supvisors uses the starting_failure_strategy option of the rules file to determine the behavior to apply when a required process cannot be started. Programs having the required option set to False are not considered, as their absence is minor by definition.

Possible values are:

ABORT: Abort the application starting ;

STOP: Stop the application ;

CONTINUE: Skip the failure and continue the application starting.

Running Failure strategy

The autorestart option of Supervisor may be used to restart automatically a process that has crashed or has exited unexpectedly (or not). However, when the node itself crashes or becomes unreachable, the other Supervisor instances cannot do anything about that.

Supvisors uses the running_failure_strategy option of the rules file to warm restart a process that was running on a node that has crashed, in accordance with the default starting_strategy set in the rpcinterface extension point and with the supvisors_list program rules set in the Supvisors’ Rules File.

This option can also be used to stop or restart the whole application after a process crash. Indeed, it may happen that some applications cannot survive if one of their processes is just restarted.

Possible values are:

CONTINUE: Skip the failure and the application keeps running ;

RESTART_PROCESS: Restart the lost process on another Supvisors instance ;

STOP_APPLICATION: Stop the application ;

RESTART_APPLICATION: Restart the application ;

SHUTDOWN: Shutdown Supvisors (i.e. all Supvisors instances) ;

RESTART: Restart Supvisors (i.e. all Supvisors instances).

Important

The RESTART_PROCESS strategy is NOT intended to replace the Supervisor autorestart for the local Supvisors instance. Given a program definition where autorestart is set to false in the Supervisor configuration and where the running_failure_strategy option is set to RESTART_PROCESS in the Supvisors rules file, if the process crashes, Supvisors will NOT restart the process.

Note

Given that this option is set on the program rules, program strategies within an application may be incompatible in the event of multiple failures. That’s why priorities have been set on this strategy. STOP_APPLICATION supersedes RESTART_APPLICATION, which itself supersedes RESTART_PROCESS and finally CONTINUE. So if a program with the RESTART_APPLICATION option fails at the same time as a program of the same application with the STOP_APPLICATION option, only the STOP_APPLICATION will be applied.

When the RESTART_PROCESS strategy is evaluated, if the application is fully stopped (supposedly because of the failure), Supvisors will promote the RESTART_PROCESS into RESTART_APPLICATION. The idea is to benefit from a full start sequence at application level rather than uncorrelated program restarts in the event of multiple failures within the same application.
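The priority and promotion rules of this note can be sketched as follows. The function name is hypothetical. SHUTDOWN and RESTART are deliberately left out because their rank is not stated here, so the sketch raises a ValueError for them.

```python
# strategies ordered from the lowest to the highest priority
PRIORITY = ["CONTINUE", "RESTART_PROCESS", "RESTART_APPLICATION", "STOP_APPLICATION"]


def resolve_running_failure(strategies, application_stopped: bool = False) -> str:
    """Return the single strategy to apply for simultaneous failures in an application."""
    # a RESTART_PROCESS is promoted when the application is fully stopped
    promoted = ["RESTART_APPLICATION"
                if strategy == "RESTART_PROCESS" and application_stopped
                else strategy
                for strategy in strategies]
    return max(promoted, key=PRIORITY.index)
```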

Hint

The STOP_APPLICATION strategy provides an answer to the following Supervisor request:

#874 - Bring down one process when other process gets killed in a group

Hint

The SHUTDOWN strategy provides an answer to the following Supervisor request:

#712 - shutdown supervisord once one of the programs is killed

Stopping strategy

Supvisors provides a means to stop a process without telling explicitly where it is running.

Stopping a process

The internal Stopper of Supvisors applies the following logic to stop a process:

if the process is running:
    perform a supervisor.stopProcess(namespec) XML-RPC to the Supervisor instances where the process is running

This single job is considered completed when:

a STOPPED event is received for this process ;

an error is encountered (FATAL event, EXITED event whatever the exit code) ;

no STOPPING event has been received 2 ticks after the XML-RPC ;

no STOPPED event has been received X+2 ticks after the XML-RPC, X corresponding to the number of ticks needed to cover the stopwaitsecs seconds of the program definition in the Supvisors instance where the process has been requested to stop.

This principle is used for stopping a single process using a supvisors.stop_process XML-RPC.

Stopping an application

The application stop sequence is defined at the same time as the application start sequence. It corresponds to a dictionary where:

the keys correspond to the list of stop_sequence values defined in the program rules of the application ;

the value associated to a key is the list of programs having this key as stop_sequence.

Note

The Unmanaged applications do have a stop sequence. All their programs have the default stop_sequence set to 0.

Hint

The logic applied here is an answer to the following Supervisor unresolved issue:

#520 - allow a program to wait for another to stop before being stopped?

Hint

All the programs sharing the same stop_sequence are stopped simultaneously, which solves some of the requests described in the following Supervisor unresolved issue:

#723 - Restart waits for all processes to stop before starting any

The internal Stopper of Supvisors applies the following algorithm to stop an application:

while application stop sequence is not empty:
    pop the process list having the greatest stop_sequence
    for each process in process list:
        apply Stopping a process
    wait for the jobs to complete

This principle is used for stopping a single application using a supvisors.stop_application XML-RPC.
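A possible Python rendering of the stop algorithm is given below. Unlike the start sequence, the groups are processed from the greatest stop_sequence down, and the value 0 is kept. The function name and the two callables are assumptions standing in for the Stopper internals.

```python
def stop_application(program_rules: dict, stop_process, wait_for_jobs) -> None:
    """Stop the programs group by group, in descending stop_sequence order."""
    sequence = {}
    for name, stop_sequence in program_rules.items():
        sequence.setdefault(stop_sequence, []).append(name)
    while sequence:
        greatest = max(sequence)
        # all the programs sharing the same stop_sequence are stopped together
        for name in sequence.pop(greatest):
            stop_process(name)
        wait_for_jobs()
```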

Stopping all applications

The applications are stopped when Supvisors is requested to restart or shut down.

When entering the DISTRIBUTION state, each Supvisors instance evaluates also the global stop sequence using the stop_sequence rule configured for the applications and processes.

The global stop sequence corresponds to a dictionary where:

the keys correspond to the list of stop_sequence values defined in the application rules ;

the value associated to a key is the list of application stop sequences whose applications have this key as stop_sequence.

Upon reception of the supvisors.restart or supvisors.shutdown, the Supvisors instance uses the global stop sequence to stop all the running applications in the defined order. The following pseudo-code explains the logic used:

while global stop sequence is not empty:
    pop the application list having the greatest stop_sequence
    for each application in application list:
        apply Stopping an application
    wait for the jobs to complete

Conciliation

Supvisors is designed so that there should be only one instance of the same process running on a set of nodes, although all of them may have the capability to start it.

Nevertheless, it is still likely to happen in a few cases:

using a request to Supervisor itself (through Web UI, supervisorctl, XML-RPC) ;

upon a network failure.

Attention

In the event of a network failure (let’s say a network cable is unplugged), if the auto_fence option is not set, a Supvisors instance running on the isolated node will be set to SILENT instead of ISOLATED and its URL will not be disconnected from the subscriber socket.

Depending on the rules set, this situation may lead Supvisors to warm restart the processes that were running in the lost Supvisors instance onto other Supvisors instances.

When the network failure is fixed, Supvisors will likely have to deal with a bunch of duplicated applications and processes.

When such a conflict is detected, Supvisors enters the CONCILIATION state. Depending on the conciliation_strategy option set in the rpcinterface extension point, it applies a strategy to get rid of all duplicates:

SENICIDE

When applying the SENICIDE strategy, Supvisors keeps the youngest process, i.e. the process that has been started the most recently, and stops all the others.

INFANTICIDE

When applying the INFANTICIDE strategy, Supvisors keeps the oldest process and stops all the others.

USER

That’s the easy one. When applying the USER strategy, Supvisors just waits for a third party to solve the conflicts using Web UI, supervisorctl, XML-RPC, process signals, or any other solution.

STOP

When applying the STOP strategy, Supvisors stops all conflicting processes, which may lead the corresponding applications to a degraded state.

RESTART

When applying the RESTART strategy, Supvisors stops all conflicting processes and restarts a new one.

RUNNING_FAILURE

When applying the RUNNING_FAILURE strategy, Supvisors stops all conflicting processes and deals with the conflict as it would deal with a running failure, depending on the strategy defined for the process. So, after the conflicting processes are all stopped, Supvisors may restart the process, stop the application, restart the application or do nothing at all.

Supvisors leaves the CONCILIATION state when all conflicts are conciliated.
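To make the difference between the strategies concrete, the sketch below computes which duplicates are kept and which are stopped. It assumes that each conflicting process is described by the identifier of its Supvisors instance and its start time. What happens after the stop (RESTART, RUNNING_FAILURE) is not modeled, and all names are hypothetical.

```python
def conciliate(strategy: str, start_times: dict):
    """Return (kept, stopped) lists of instance identifiers for one conflict."""
    everyone = sorted(start_times)
    if strategy == "USER":
        # nothing is done: a third party is expected to solve the conflict
        return everyone, []
    if strategy in ("STOP", "RESTART", "RUNNING_FAILURE"):
        return [], everyone
    if strategy == "SENICIDE":
        kept = max(start_times, key=start_times.get)  # most recently started
    elif strategy == "INFANTICIDE":
        kept = min(start_times, key=start_times.get)  # oldest process
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [kept], [identifier for identifier in everyone if identifier != kept]
```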
