Troubleshooting
This section deals with common problems that may occur when using Supvisors for the first time.
It is assumed that Supervisor is operational without the Supvisors plugin.
Supvisors plugin cannot be resolved
[bash] > supervisord -n
Error: supvisors.plugin:make_supvisors_rpcinterface cannot be resolved within [rpcinterface:supvisors]
For help, use /usr/local/bin/supervisord -h
This error happens at an early stage of the Supervisor startup, when the plugin factory is called.
Just in case, make sure that supvisors.plugin:make_supvisors_rpcinterface has been copied correctly.
Otherwise, this is the symptom of an improper Supvisors installation.
Important
Supvisors requires a Python version greater than 3.6, and the Supvisors package must be importable from the Python interpreter used by Supervisor’s supervisord command.
If in doubt, check the Python version and start the interpreter in a terminal to test the import of Supvisors:
[bash] > which supervisord
/usr/local/bin/supervisord
[bash] > head -1 /usr/local/bin/supervisord
#!/usr/bin/python
[bash] > /usr/bin/python --version
Python 3.9.6
[bash] > /usr/bin/python
Python 3.9.6 (default, Nov 9 2021, 13:31:27)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import supvisors
>>>
If an ImportError is raised, the possible causes include the following:
Wrong pip program
Issue: Supvisors may have been installed with a pip command corresponding to another Python version.
Solution: Install Supvisors using the pip command whose version corresponds to the Python version used by Supervisor.
[bash] > /usr/bin/python --version
Python 3.9.6
[bash] > /usr/bin/pip --version
pip 20.2.4 from /usr/lib/python3.9/site-packages/pip (python 3.9)
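The manual checks above can also be scripted. The following is a minimal sketch, not part of Supvisors: the helper names read_shebang and can_import are hypothetical. It reads the interpreter command from the first line of the supervisord script, then asks that interpreter whether it can import supvisors.

```python
import shutil
import subprocess
from typing import List, Optional


def read_shebang(script_path: str) -> Optional[List[str]]:
    """Return the interpreter command found in the script's first line, or None."""
    with open(script_path, 'rb') as script:
        first_line = script.readline().decode('utf-8', errors='replace').strip()
    if not first_line.startswith('#!'):
        return None
    # keep the whole command, so that '/usr/bin/env python3' also works
    command = first_line[2:].split()
    return command or None


def can_import(interpreter: List[str], module: str) -> bool:
    """Return True if the given interpreter command can import the module."""
    result = subprocess.run(interpreter + ['-c', f'import {module}'],
                            capture_output=True)
    return result.returncode == 0


if __name__ == '__main__':
    supervisord_path = shutil.which('supervisord')
    if supervisord_path is None:
        print('supervisord not found in PATH')
    else:
        interpreter = read_shebang(supervisord_path)
        if interpreter is None:
            print(f'no shebang found in {supervisord_path}')
        else:
            print(interpreter, can_import(interpreter, 'supvisors'))
```

If the last line prints False, running the same interpreter with the options -m pip --version shows which pip belongs to it.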
Local Supvisors not in PYTHONPATH
Issue: When Supvisors is not installed in the Python packages but used from a local
directory, the PYTHONPATH environment variable may not include the Supvisors location.
Solution: Set the Supvisors location in the PYTHONPATH environment variable before starting Supervisor.
[bash] > ls -d ~/python/my_packages/supvisors/__init__.py
/home/user/python/my_packages/supvisors/__init__.py
[bash] > export PYTHONPATH=/home/user/python/my_packages:$PYTHONPATH
[bash] > supervisord
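To confirm that PYTHONPATH really exposes the package, the lookup can be restricted to the PYTHONPATH entries only. This is a small sketch based on the standard importlib machinery; the helper name locate_package is hypothetical.

```python
import os
from importlib.machinery import PathFinder
from typing import List, Optional


def locate_package(name: str, search_paths: List[str]) -> Optional[str]:
    """Return the origin of the package if it is found in the given directories only."""
    spec = PathFinder.find_spec(name, search_paths)
    return spec.origin if spec else None


if __name__ == '__main__':
    # directories declared in PYTHONPATH (empty entries are skipped)
    entries = [path for path in os.environ.get('PYTHONPATH', '').split(os.pathsep) if path]
    # prints the path to supvisors/__init__.py, or None if PYTHONPATH does not expose it
    print(locate_package('supvisors', entries))
```

The script must be run in the same shell, and with the same interpreter, as the one used to start supervisord, otherwise the environment checked is not the relevant one.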
Incorrect UNIX permissions
Issue: The user cannot read the installed Supvisors files (installed via pip or pointed to by PYTHONPATH).
Solution: Update the UNIX permissions of the Supvisors package so that its files can be read by any user.
[user bash] > ls -l /usr/local/lib/python3.9/site-packages/supvisors/__init__.py
-rw-------. 1 root root 56 Feb 28 2022 /usr/local/lib/python3.9/site-packages/supvisors/__init__.py
[user bash] > su -
Password:
[root bash] > chmod -R a+r /usr/local/lib/python3.9/site-packages/supvisors
[root bash] > exit
exit
[user bash] > ls -l /usr/local/lib/python3.9/site-packages/supvisors/__init__.py
-rw-r--r--. 1 root root 56 Feb 28 2022 /usr/local/lib/python3.9/site-packages/supvisors/__init__.py
[bash] > supervisord
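On a large package tree, checking files one by one with ls is tedious. The sketch below, with the hypothetical helper find_unreadable_by_others, walks a package directory and lists the entries that other users cannot read. As an addition to the read check shown above, it also reports directories lacking the execute bit, because a directory cannot be traversed without it.

```python
import os
import stat
from typing import List


def find_unreadable_by_others(package_dir: str) -> List[str]:
    """List the files and directories lacking the 'other' permission bits.

    Files need the read bit; directories need both the read and execute bits
    so that they can be listed and traversed.
    """
    problems = []
    dir_bits = stat.S_IROTH | stat.S_IXOTH
    for root, _dirs, files in os.walk(package_dir):
        if os.stat(root).st_mode & dir_bits != dir_bits:
            problems.append(root)
        for name in files:
            path = os.path.join(root, name)
            if not os.stat(path).st_mode & stat.S_IROTH:
                problems.append(path)
    return problems
```

The function should be run by the owner of the files or by root, since os.walk silently skips the directories that the current user cannot list.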
Could not make supvisors rpc interface
At this stage, there must be some log traces available.
If the startup of Supervisor ends with the following lines, there must be an issue with the Supvisors configuration,
and more particularly with the option supvisors_list.
[bash] > supervisord -n
[...]
2022-11-17 17:47:15,101 INFO RPC interface 'supervisor' initialized
[...]
Error: Could not make supvisors rpc interface
For help, use /usr/local/bin/supervisord -h
There are 4 main causes for this error.
No inet_http_server
Issue: Supervisor is configured without any inet_http_server.
Solution: Configure Supervisor with an inet_http_server.
The aim of Supvisors is to deal with applications distributed over several hosts, so it cannot work with a Supervisor
configured with only a unix_http_server.
Based on the following Supvisors configuration including only a unix_http_server:
[unix_http_server]
file=/tmp/supervisor.sock
[rpcinterface:supvisors]
supervisor.rpcinterface_factory = supvisors.plugin:make_supvisors_rpcinterface
If Supervisor is started from the local host, the following log traces will be displayed:
[bash] > supervisord -n
[...]
2022-11-18 15:21:20,166 INFO RPC interface 'supervisor' initialized
2022-11-18 15:21:20,184;WARN;Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/supervisor-4.2.4-py3.9.egg/supervisor/http.py", line 821, in make_http_servers
inst = factory(supervisord, **d)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/plugin.py", line 128, in make_supvisors_rpcinterface
supervisord.supvisors = Supvisors(supervisord, **config)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/initializer.py", line 94, in __init__
self.supervisor_data = SupervisorData(self, supervisor)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/supervisordata.py", line 94, in __init__
raise ValueError(f'Supervisor MUST be configured using inet_http_server: {supervisord.options.configfile}')
ValueError: Supervisor MUST be configured using inet_http_server: etc/supervisord.conf
Error: Could not make supvisors rpc interface
For help, use /usr/local/bin/supervisord -h
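This condition can be checked before starting supervisord. The following is a rough sketch using the standard configparser module; the function name has_inet_http_server is hypothetical, and the parsing only approximates what Supervisor does (in particular, files referenced from an include section are not followed).

```python
from configparser import RawConfigParser


def has_inet_http_server(config_text: str) -> bool:
    """Return True if the configuration declares a port in [inet_http_server]."""
    # RawConfigParser performs no interpolation, so values such as
    # %(ENV_HOME)s in other options do not raise errors here
    parser = RawConfigParser(strict=False, inline_comment_prefixes=(';',))
    parser.read_string(config_text)
    return parser.has_option('inet_http_server', 'port')
```

Applied to the configuration shown above, which only declares a unix_http_server, the function returns False.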
Incorrect Host name or IP address
Issue: The option supvisors_list includes a host name or an IP address that is unknown to the network
configuration of the local host.
Solution: Fix the host name / IP address, update the network configuration, or remove the entry.
Based on the following Supvisors configuration including an unknown host name:
[rpcinterface:supvisors]
supervisor.rpcinterface_factory = supvisors.plugin:make_supvisors_rpcinterface
supvisors_list = unknown_host,rocky51,rocky52
If Supervisor is started from the hosts rocky51 or rocky52, the following log traces will be displayed:
[bash] > supervisord -n
[...]
2022-11-17 17:47:15,120;ERRO;get_node_names: unknown host unknown_host
2022-11-17 18:43:52,834;CRIT;Wrong Supvisors configuration (supvisors_list)
2022-11-17 18:42:24,352;WARN;Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/supervisor-4.2.4-py3.9.egg/supervisor/http.py", line 821, in make_http_servers
inst = factory(supervisord, **d)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/plugin.py", line 128, in make_supvisors_rpcinterface
supervisord.supvisors = Supvisors(supervisord, **config)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/initializer.py", line 98, in __init__
self.supvisors_mapper.configure(self.options.supvisors_list, self.options.core_identifiers)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/supvisorsmapper.py", line 236, in configure
raise ValueError(message)
ValueError: could not define a Supvisors identification from "unknown_host"
Error: Could not make supvisors rpc interface
For help, use /usr/local/bin/supervisord -h
If the host name or IP address looks valid to the user, here are a few explanations about how Supvisors
identifies the local Supervisor instance among the supvisors_list elements:
Supvisors extracts the host_name from the <identifier>host_name:http_port:internal_port element and stores the host name and aliases returned by the socket.gethostbyaddr function.
Supvisors considers that the local Supervisor instance is the element whose list of host name and aliases contains the local fully-qualified domain name, as returned by the socket.getfqdn function.
In the example below, the values rocky51.cliche.bzh, rocky51 and 192.168.1.65 are valid host_name
elements to be used in supvisors_list.
>>> from socket import gethostbyaddr, getfqdn
>>> gethostbyaddr('rocky51.cliche.bzh')
('rocky51.cliche.bzh', ['rocky51'], ['192.168.1.65'])
>>> gethostbyaddr('rocky51')
('rocky51.cliche.bzh', ['rocky51'], ['192.168.1.65'])
>>> gethostbyaddr('192.168.1.65')
('rocky51.cliche.bzh', ['rocky51'], ['192.168.1.65'])
>>> getfqdn()
'rocky51.cliche.bzh'
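The matching rule described above can be reproduced outside Supvisors to see which supvisors_list entries would be taken for the local host. This is only an illustration of the rule, not the Supvisors implementation; the function name local_candidates is hypothetical, and the resolver functions are passed as parameters so that the logic can be exercised without a network.

```python
import socket
from typing import Callable, List, Tuple

Resolver = Callable[[str], Tuple[str, List[str], List[str]]]


def local_candidates(host_names: List[str],
                     resolver: Resolver = socket.gethostbyaddr,
                     fqdn: Callable[[], str] = socket.getfqdn) -> List[str]:
    """Return the host names whose name or aliases match the local FQDN."""
    local_name = fqdn()
    candidates = []
    for host_name in host_names:
        try:
            name, aliases, _addresses = resolver(host_name)
        except OSError:
            # the host is unknown to the network configuration: skip it here
            continue
        if local_name in [name] + aliases:
            candidates.append(host_name)
    return candidates


if __name__ == '__main__':
    # host names taken from the supvisors_list example above
    print(local_candidates(['unknown_host', 'rocky51', 'rocky52']))
```

An empty result means that no entry designates the local host, and a result with several entries means that the local host is designated more than once.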
Could not find the local Supvisors
Issue: The option supvisors_list does not include any host name or IP address corresponding to the local host.
Solution: Either add the local host to the list, or avoid starting Supervisor from the local host using this configuration.
Based on the following Supvisors configuration including 2 host names:
[rpcinterface:supvisors]
supervisor.rpcinterface_factory = supvisors.plugin:make_supvisors_rpcinterface
supvisors_list = rocky52,rocky53
If Supervisor is started from a host that is not present in this list, the following traces will be displayed:
[bash] > supervisord -n
[...]
2022-11-17 18:30:33,863;INFO;SupvisorsMapper.configure: identifiers=['rocky52', 'rocky53']
2022-11-17 18:30:33,863;ERRO;SupvisorsMapper.find_local_identifier: could not find local the local Supvisors in supvisors_list
2022-11-17 18:44:45,571;CRIT;Wrong Supvisors configuration (supvisors_list)
2022-11-17 18:44:45,572;WARN;Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/supervisor-4.2.4-py3.9.egg/supervisor/http.py", line 821, in make_http_servers
inst = factory(supervisord, **d)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/plugin.py", line 128, in make_supvisors_rpcinterface
supervisord.supvisors = Supvisors(supervisord, **config)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/initializer.py", line 98, in __init__
self.supvisors_mapper.configure(self.options.supvisors_list, self.options.core_identifiers)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/supvisorsmapper.py", line 240, in configure
self.find_local_identifier()
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/supvisorsmapper.py", line 269, in find_local_identifier
raise ValueError(message)
ValueError: could not find the local Supvisors in supvisors_list
Error: Could not make supvisors rpc interface
For help, use /usr/local/bin/supervisord -h
Multiple candidates for the local Supvisors
Issue: This happens when multiple Supvisors instances have to be started on the same host. In that case, the option
supvisors_list includes at least 2 host names or IP addresses referring to the same host and that have not been
qualified using a Supervisor identification.
Solution: Use the Supervisor identification option and apply it to the supvisors_list.
Based on the following Supvisors configuration including a host name rocky51 and its IP address
192.168.1.70:
[rpcinterface:supvisors]
supervisor.rpcinterface_factory = supvisors.plugin:make_supvisors_rpcinterface
supvisors_list = rocky51,rocky52,192.168.1.70:30000
If Supervisor is started from the host rocky51, the following traces will be displayed:
[bash] > supervisord -n
[...]
2022-11-18 10:42:25,931;INFO;SupvisorsMapper.configure: identifiers=['rocky51', 'rocky52', '192.168.1.70:30000']
2022-11-18 10:42:25,931;ERRO;SupvisorsMapper.find_local_identifier: multiple candidates for the local Supvisors: ['rocky51', '192.168.1.70:30000']
2022-11-18 10:42:25,931;CRIT;Wrong Supvisors configuration (supvisors_list)
2022-11-18 10:42:25,940;WARN;Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/supervisor-4.2.4-py3.9.egg/supervisor/http.py", line 821, in make_http_servers
inst = factory(supervisord, **d)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/plugin.py", line 128, in make_supvisors_rpcinterface
supervisord.supvisors = Supvisors(supervisord, **config)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/initializer.py", line 98, in __init__
self.supvisors_mapper.configure(self.options.supvisors_list, self.options.core_identifiers)
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/supvisorsmapper.py", line 240, in configure
self.find_local_identifier()
File "/usr/local/lib/python3.9/site-packages/supvisors-0.15-py3.9.egg/supvisors/supvisorsmapper.py", line 269, in find_local_identifier
raise ValueError(message)
ValueError: multiple candidates for the local Supvisors: ['rocky51', '192.168.1.70:30000']
Error: Could not make supvisors rpc interface
For help, use /usr/local/bin/supervisord -h
At the moment, a solution in Supvisors is to qualify the entry in supvisors_list by adding its Supervisor
identifier. This is also the name that will be used for the Web UI.
[rpcinterface:supvisors]
supervisor.rpcinterface_factory = supvisors.plugin:make_supvisors_rpcinterface
supvisors_list = <supv-01>rocky51,rocky52,<supv-03>192.168.1.70:30000
Then Supervisor shall be started by passing this identification to the supervisord program.
[bash] > supervisord -ni supv-01
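To review which entries of a supvisors_list carry an identifier, the elements can be split into their parts. The sketch below assumes the <identifier>host_name:http_port:internal_port form with optional identifier and ports; the regular expression and the names Element and parse_element are hypothetical and are not taken from the Supvisors code.

```python
import re
from typing import NamedTuple, Optional

# assumed element form: optional <identifier>, host name, optional ports
ELEMENT_PATTERN = re.compile(
    r'^(?:<(?P<identifier>[^<>]+)>)?'
    r'(?P<host_name>[^:<>]+)'
    r'(?::(?P<http_port>\d*))?'
    r'(?::(?P<internal_port>\d*))?$')


class Element(NamedTuple):
    identifier: Optional[str]
    host_name: str
    http_port: Optional[int]
    internal_port: Optional[int]


def parse_element(text: str) -> Element:
    """Split one supvisors_list element into its parts."""
    match = ELEMENT_PATTERN.match(text.strip())
    if not match:
        raise ValueError(f'unexpected supvisors_list element: {text!r}')

    def to_port(value: Optional[str]) -> Optional[int]:
        return int(value) if value else None

    return Element(match['identifier'], match['host_name'],
                   to_port(match['http_port']), to_port(match['internal_port']))


if __name__ == '__main__':
    for item in '<supv-01>rocky51,rocky52,<supv-03>192.168.1.70:30000'.split(','):
        print(parse_element(item))
```

Entries whose identifier is None and that refer to a host listed more than once are the ones that need to be qualified.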
Remote host SILENT
A remote Supvisors instance may be declared SILENT, although supervisord is running on the remote host.
Firewall rules
There is likely an issue with the firewall of the hosts. By default, a firewall is configured to block almost everything. The Supervisor HTTP ports have to be explicitly allowed in the firewall configuration.
Issue: Without the Supvisors plugin, accessing the remote Supervisor web page using its URL is rejected.
Solution: Use HTTP ports that are allowed by the firewall or ask the UNIX administrator to enable the HTTP ports used by the Supervisor configuration.
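A quick way to tell a blocked port from a missing server is to attempt a plain TCP connection to the Supervisor HTTP port. The following is a minimal sketch with a hypothetical helper named probe_port. The interpretation of the results is a rule of thumb, not a certainty: a timeout or a "no route to host" error often points to a firewall, whereas a refused connection usually means that nothing is listening on the port.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connection and report 'open', 'refused', 'timeout' or the error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return 'open'
    except ConnectionRefusedError:
        return 'refused'
    except socket.timeout:
        return 'timeout'
    except OSError as exc:
        # e.g. unknown host name, or 'No route to host' from a rejecting firewall
        return f'error: {exc}'
```

For instance, probe_port('rocky52', 60000) would test a Supervisor HTTP server expected on port 60000 of the host rocky52; both values are examples to be replaced with the actual configuration.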
Inconsistent Supvisors configuration
Issue: Accessing the remote Supvisors web page using its URL is accepted. Various error messages may be received.
Solution: Make sure that the supvisors_list is consistent for all Supvisors instances, as described in the
rpcinterface extension point section.
When using a simple Supervisor / Supvisors configuration as follows:
[inet_http_server]
port=:60000
[rpcinterface:supvisors]
supervisor.rpcinterface_factory = supvisors.plugin:make_supvisors_rpcinterface
supvisors_list = rocky51,rocky52,rocky53
internal_port = 60001
It is assumed that supervisord will be started on the 3 hosts with the same configuration, i.e. with a Supervisor server available on port 60000 and with Supvisors internal publisher available on port 60001.
If the Supervisor configuration on rocky52 is different and declares an inet_http_server on port 60100,
the XML-RPC requests from rocky51 and rocky53 towards rocky52 will fail.
A variety of errors may be experienced, depending on how wrong the configuration is.
[bash] > supervisord -n
[...]
2022-11-18 18:16:20,428;ERRO;Context.on_tick_event: got tick from unknown Supvisors=rocky52
[...]
[ERROR] failed to check Supvisors=rocky52
[...]
Discovery mode not working
When Supvisors is in discovery mode, it uses a UDP multicast group to periodically share the identification of every Supvisors instance. The relevant configuration options in the Supervisor configuration file are:
multicast_group,
multicast_interface,
multicast_ttl.
When Supvisors is running with a multicast group set, the following command should show the multicast address chosen.
The example below is based on the multicast address 239.0.0.1 and the multicast interface eth0:
[bash] > netstat -g
IPv6/IPv4 Group Memberships
Interface RefCnt Group
--------------- ------ ---------------------
[...]
eth0 2 239.0.0.1
[...]
There are quite a number of reasons that may cause this function to not work, at OS level and in the Supvisors configuration.
The consequence is always the same: the remote Supvisors instance is not detected, although supervisord is running on the remote host.
The main difficulty is that Supvisors will provide no log trace to help, due to the connectionless nature of UDP. UDP sockets are opened and eventually bound, but nothing happens on them.
First of all, it is absolutely mandatory that multicast is enabled in the system nodes and the hardware in-between.
The aim of this section is clearly not to be a tutorial about configuring multicast in a system, which is not really in my area of expertise anyway. The aim is to help the Supvisors user and/or their favorite system administrator, by giving a few common hints.
A basic multicast exchange using a tool like iperf can be done in order to rule out Supvisors as the cause.
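When iperf is not available, a basic multicast exchange can also be sketched with the Python standard library. The helpers below (membership_request, receive_once, send_once) are hypothetical names, and the UDP port is to be chosen by the user; the group would typically be the one set in multicast_group.

```python
import socket
import struct
from typing import Tuple


def membership_request(group: str, interface_ip: str = '0.0.0.0') -> bytes:
    """Build the ip_mreq structure expected by the IP_ADD_MEMBERSHIP option."""
    return struct.pack('4s4s', socket.inet_aton(group), socket.inet_aton(interface_ip))


def receive_once(group: str, port: int, interface_ip: str = '0.0.0.0',
                 timeout: float = 10.0) -> Tuple[bytes, Tuple[str, int]]:
    """Join the group and wait for one datagram (raises socket.timeout if none comes)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(('', port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group, interface_ip))
        sock.settimeout(timeout)
        return sock.recvfrom(1024)


def send_once(group: str, port: int, payload: bytes = b'ping', ttl: int = 1) -> None:
    """Send one datagram to the group; a TTL of 1 stays on the local network segment."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.sendto(payload, (group, port))
```

The idea is to call receive_once on one host, then send_once with the same group and port on another host. If the receiver times out, the problem lies in the system or network configuration rather than in Supvisors.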
MULTICAST not activated on the network device
Issue: Multicast is not activated on the network device.
Here are two UNIX commands that provide feedback on the multicast status of the eth0 network device.
[bash] > ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
[...]
[bash] > ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
[...]
Solution: If MULTICAST is not displayed for the considered network interface, the following
command should do the trick.
[bash] > ip link set multicast on dev eth0
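The flags value 4163 printed by ifconfig above is a bit field in which MULTICAST corresponds to the Linux IFF_MULTICAST bit (0x1000). As a small sketch, assuming a Linux host that exposes the interface flags under /sys/class/net, the same check can be made from Python; the function names are hypothetical.

```python
IFF_MULTICAST = 0x1000  # Linux interface flag, as defined in <linux/if.h>


def has_multicast(flags: int) -> bool:
    """Return True if the MULTICAST bit is set in the interface flags."""
    return bool(flags & IFF_MULTICAST)


def read_interface_flags(device: str) -> int:
    """Read the interface flags from sysfs (Linux only); the file holds a hex value."""
    with open(f'/sys/class/net/{device}/flags') as flags_file:
        return int(flags_file.read().strip(), 16)
```

With the value from the ifconfig output, has_multicast(4163) is True, since 4163 equals 0x1043, i.e. UP, BROADCAST, RUNNING and MULTICAST.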
IGMP not enabled in the firewall
Issue: The IGMP protocol is not enabled in the firewall daemon (default configuration).
Solution: The following commands permanently enable IGMP for a zone:
[bash] > firewall-cmd --permanent --zone=zone-name --add-protocol=igmp
[bash] > firewall-cmd --reload
Multicast denied by an ACL
Issue: Multicast is working between co-located Supvisors instances, but not with Supvisors instances located on remote hosts, and separated only by a network switch.
Solution: Depending on the network configuration and hardware, multicast may also be denied by an ACL on a switch. This is quite specific to the firmware involved, so refer to the switch documentation.
Multicast not enabled in a router
Issue: Multicast is working between co-located Supvisors instances, but not with Supvisors instances located on remote hosts, and separated by a router.
Solution: Multicast may be disabled in the router. Again, this is quite specific to the firmware involved, so refer to the router documentation.
As an example, multicast can be enabled in a Cisco router with the following command:
ip multicast-routing
TTL too low
Issue: Same issue as above, although multicast is enabled in the router.
The Time-To-Live (TTL) of a multicast message is decremented every time it passes through a router, in accordance with the router threshold. When the TTL is lower than the router threshold, the message is discarded.
Solution: The TTL value in the multicast_ttl option should be set in accordance with the configuration
of the routers between any of the Supvisors instances. The router threshold could also be decreased.
Incompatible candidates
Issue: Multicast is confirmed working (outside the scope of Supvisors) and the following log trace occurs:
[bash] > supervisord -n
[...]
2024-07-05 18:24:45,204;WARN;SupvisorsMapper.check_candidate: the Supvisors instance known as <test>rocky51:60000 is incompatible with the candidate <test>rocky52:60000
[...]
This can happen in discovery mode when at least 2 Supvisors instances have the same nick identifier. The nick identifier can be optionally set:
either in the Supervisor configuration file:
[supervisord]
...
identifier=test
or when starting the supervisord daemon:
[bash] > supervisord -ni test [...]
When this option is set, and thus different from the default value “supervisor”, it MUST be unique per instance.
Solution: There are a few alternatives:
leave the identifier option unset and let Supvisors build a default nick_identifier;
ensure that supervisord is started using the -i option and with a different parameter for every Supvisors instance;
use a different Supervisor configuration file per Supvisors instance.
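Before deploying, the identifiers planned for each host can be checked for uniqueness. This is a minimal sketch with a hypothetical helper name; following the rule stated above, the default value "supervisor" is ignored because it does not have to be unique.

```python
from collections import Counter
from typing import Dict, List


def duplicated_identifiers(identifiers_by_host: Dict[str, str],
                           default: str = 'supervisor') -> Dict[str, List[str]]:
    """Map every non-default identifier used more than once to the hosts using it."""
    counts = Counter(identifiers_by_host.values())
    return {identifier: sorted(host for host, value in identifiers_by_host.items()
                               if value == identifier)
            for identifier, count in counts.items()
            if count > 1 and identifier != default}


if __name__ == '__main__':
    # situation of the log trace above: both hosts use the identifier 'test'
    print(duplicated_identifiers({'rocky51': 'test', 'rocky52': 'test'}))
```

Any identifier reported by the function has to be changed on all but one of the listed hosts.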