|
|
# The concept of Hyperion
|
|
|
|
|
|
Hyperion is designed to start complex systems (such as distributed systems) with minimal overhead while significantly improving the transparency of both the startup process and the current system status.
|
|
|
|
|
|
The fundamental approach is strongly influenced by [vdemo](https://code.cor-lab.org/projects/vdemo) and [TMuLE](https://github.com/marc-hanheide/TMuLE): A terminal multiplexer - tmux in this case - is used to start components defined by a system configuration in named sessions and windows.
|
|
|
|
|
|
## Architecture
|
|
|
|
|
|
The model I came up with for efficiently managing a system is structured as follows:
|
|
|
|
|
|
- For each configuration a tmux session is created. It contains a main window that is meant to run commands not tied to specific components.
|
|
|
- Each component is started in a window within the session. On window creation, stdout and stderr are redirected to a log file with `tee`.
|
|
|
- A component has two 'check layers': the first checks whether the window exists and whether any non-tee child processes are running; the optional second runs a check command specified in the component configuration.
|
|
|
- Stopping a component means sending SIGINT to the window and killing the window afterwards.
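The per-component lifecycle above could be sketched with tmux commands roughly as follows. This is an illustrative sketch, not Hyperion's actual implementation; the session name and helper functions are assumptions:

```python
import shlex
import subprocess

SESSION = "hyperion-demo"  # illustrative session name


def tee_command(command, log_file):
    """Wrap a component command so stdout and stderr are piped to a log via tee."""
    return "{} 2>&1 | tee {}".format(command, shlex.quote(log_file))


def start_component(name, command, log_file):
    """Create a window for the component inside the session and run its command."""
    subprocess.run(["tmux", "new-window", "-t", SESSION, "-n", name,
                    "bash", "-c", tee_command(command, log_file)], check=True)


def window_exists(name):
    """First check layer: does the component's window still exist?"""
    result = subprocess.run(
        ["tmux", "list-windows", "-t", SESSION, "-F", "#{window_name}"],
        capture_output=True, text=True)
    return name in result.stdout.splitlines()


def stop_component(name):
    """Send SIGINT (Ctrl-C) to the window, then kill it."""
    target = "{}:{}".format(SESSION, name)
    subprocess.run(["tmux", "send-keys", "-t", target, "C-c"])
    subprocess.run(["tmux", "kill-window", "-t", target])
```

The second check layer would simply run the configured check command (e.g. via `subprocess.run`) and inspect its exit status.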
|
|
|
|
|
|
Spreading this model across multiple hosts complicates things a bit:
|
|
|
- Each host needs to run a tmux session with its own local components
|
|
|
- The system needs centralized management (control center)
|
|
|
- Connections between the control center and slave machines have to be set up
|
|
|
- Network traffic becomes relevant
|
|
|
|
|
|
### Master / Slave / Remote User Interface
|
|
|
|
|
|
The **control center** (also called the master) is the core of the whole application: it can start, check and stop local components; start and stop slaves; forward start, check and stop commands to a slave; and send status information to connected user interfaces.
|
|
|
Upon startup it parses the configuration, resolves hosts, creates a custom ssh configuration (using ControlMaster connections to speed up connection setup and lower bandwidth usage) and starts slave instances on all involved hosts. The startup process also includes setting up a socket server that handles connections to slaves and a separate socket server that handles user interfaces.
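A generated ssh configuration with ControlMaster multiplexing might look like the following fragment. Host names, paths and the key file are illustrative, not what Hyperion actually writes:

```
Host slave-host
    HostName slave-host.example.org
    User demo
    IdentityFile ~/.ssh/hyperion_key
    ControlMaster auto
    ControlPath ~/.ssh/ctl-%r@%h:%p
    ControlPersist 10m
```

With `ControlMaster auto`, the first connection to a host creates a master socket and subsequent connections (including port forwards) reuse it instead of performing a new handshake.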
|
|
|
|
|
|
A **slave** is a lightweight version of the control center that is limited to handling local components. On startup it needs a host and port to connect to; if the connection fails, the slave terminates. The socket connection is used to forward local events (check, crash, etc.) to the master and to receive instructions from the master.
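The connect-or-terminate behavior on slave startup could look roughly like this; the function name, timeout and error handling are assumptions for illustration:

```python
import socket
import sys


def connect_to_master(host, port, timeout=5.0):
    """Open the socket connection to the master; terminate if it is unreachable."""
    try:
        return socket.create_connection((host, port), timeout=timeout)
    except OSError as err:
        print("could not reach master at {}:{}: {}".format(host, port, err),
              file=sys.stderr)
        sys.exit(1)
```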
|
|
|
|
|
|
A **remote user interface** connects to a master and subscribes to its events in order to forward them to a graphical user interface for visual representation. It also uses the socket channel to send user input commands to the master.
|
|
|
|
|
|
An example flow of communication:
|
|
|
1. A control center (server), a remote slave (slave) and a user interface are running on separate hosts
|
|
|
2. The user clicks start for 'example component' -> the event is sent to the server.
|
|
|
3. The server interprets the command, resolves the dependencies of 'example component' and sends a start command for the dependency 'example dep' of 'example component' to the slave.
|
|
|
4. The slave runs the start command, checks whether it was successful and forwards the check event to the server.
|
|
|
5. The server receives the check event for 'example dep', forwards it to all connected remote user interfaces and, on a successful check, starts 'example component' and runs its check.
|
|
|
6. ...
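The events in this flow could be represented as simple serialized messages sent over the forwarded sockets. The sketch below is hypothetical; the field names and JSON encoding are assumptions, not Hyperion's actual wire format:

```python
import json

# Hypothetical message shapes; the component id follows the
# COMPONENT_NAME@COMPONENT_HOST scheme used in the configuration.
start_request = {"type": "start", "comp_id": "example component@ui-host"}
check_event = {"type": "check", "comp_id": "example dep@slave-host", "status": "ok"}

# Each event is serialized once and written to the socket connection.
payload = json.dumps(check_event).encode()
```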
|
|
|
|
|
|
## Network Specifics
|
|
|
|
|
|
- ssh connections are established only via ssh keys
|
|
|
- Socket clients use an ssh ControlMaster connection to forward ports
|
|
|
- The control center (master) binds its socket servers to the loopback interface, forcing clients to use ssh port forwarding (this ensures encrypted communication and authenticated users only)
|
|
|
|
|
|
## Configuration
|
|
|
|
|
|
For a guide on how to write configurations visit [[Configuring with YAML|YAML-File-Structures]].
|
|
|
|
|
|
The configuration is mandatory for the server, slave, standalone and validation modes.
|
|
|
On the server the configuration is parsed after sourcing the custom environment file, if one was provided. Parser preprocessing consists of resolving hostnames given as environment variables and appending to each component definition a component id of the form `COMPONENT_NAME@COMPONENT_HOST`. While parsing, circular or missing dependencies are detected and cause an exit with an error status.
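The circular-dependency detection during preprocessing can be sketched as a depth-first search over the component graph. This is a generic sketch of the technique, not Hyperion's actual parser code:

```python
def find_cycle(deps):
    """Return a list describing a circular dependency, or None if acyclic.

    deps maps each component id to the list of component ids it depends on.
    Raises KeyError for a dependency without a component definition.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {comp: WHITE for comp in deps}
    stack = []

    def visit(comp):
        color[comp] = GRAY
        stack.append(comp)
        for dep in deps[comp]:
            if dep not in deps:
                raise KeyError("missing dependency: " + dep)
            if color[dep] == GRAY:  # back edge -> circular dependency
                return stack[stack.index(dep):] + [dep]
            if color[dep] == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[comp] = BLACK
        return None

    for comp in deps:
        if color[comp] == WHITE:
            cycle = visit(comp)
            if cycle:
                return cycle
    return None
```

On a cycle the parser would report the offending chain and exit with an error status, as described above.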
|
|
|
On successful preprocessing the host dumps the configuration as a single file into a tmp directory, from where it is copied via sftp to all connected slave machines (into a slave-specific tmp directory).
|
|
|
|
|
|
A slave will not preprocess a configuration, because it assumes this was already done by the master server. ***Note:*** Because of this behavior, hostnames must be declared consistently across all involved hosts for the program to run as intended.
|
|
|
|
|
|
Validation mode preprocesses the configuration like the server mode does, but without the dumping and copying part.
|
|
|
More precisely, the server mode uses the functionality provided by the validation mode. The only difference between the two is that if visual validation is selected, the process always terminates after the dependency graph has been generated, even if missing or circular dependencies are encountered.
|
|
|
|
|
|
## TODO
|
|
|
- Monitoring
|
|
|
- Logging structure
|
|