Supervision is a special system attached to each container that lets you define a strategy for handling incoming faults and problems. This article describes how the supervision system is implemented in Kraken Framework and shows how it can be used.
Supervision & Monitoring
Supervision describes a dependency relationship between containers. The supervisor, which is the entry point of the supervision system, oversees its subordinates and must therefore respond to their failures. When a subordinate detects a failure, i.e. throws an exception, it searches its own strategies for a matching one and then applies it. All problems are solved using a set of solvers. Depending on the nature of the supervised work and the nature of the failure, the supervision system chooses one of the following two options:
- Solve the problem using the local supervisor
- Escalate the problem by delegating it to the remote supervisor
It is important to always view a container as part of a supervision hierarchy. Each container is a subordinate of its parent's supervision system. The only exception to this rule is the root container, which has no remote supervision system. To ensure that the root container works properly at all times, you can assign a system supervisor to it. You can read more about this in project deployment.
Local supervision is the first layer of the supervision system. Each container has its own local supervision mechanisms, which should be used to solve simple problems that do not require any reaction from its parent.
Remote supervision is the second layer of the supervision system. It is placed in the domain of the container's parent. The request to the remote supervisor is made automatically if the local system cannot solve an unhandled exception.
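The two layers described above can be modelled in a few lines. The following is a minimal sketch in Python rather than Kraken's own PHP API; the class `SupervisedContainer`, the exception `Unsolved`, and the handler signatures are all hypothetical names chosen for illustration. It assumes that a failure which neither layer can solve simply stays unsolved, which the article does not specify.

```python
class Unsolved(Exception):
    """Raised in this sketch when neither supervision layer can solve a failure."""


class SupervisedContainer:
    """Toy container (hypothetical): local rules first, then the parent's remote rules."""

    def __init__(self, name, parent=None, local=None, remote=None):
        self.name = name
        self.parent = parent
        # Both rule sets map an exception type to a handler taking (container, error).
        self.local = local or {}    # first layer: problems solved without the parent
        self.remote = remote or {}  # second layer: applied to failures of children

    def supervise(self, error):
        """Handle a failure raised inside this container."""
        handler = self.local.get(type(error))
        if handler is not None:
            return handler(self, error)
        if self.parent is None:
            # The root container has no remote supervisor to escalate to.
            raise Unsolved(f"{self.name}: {error!r}")
        handler = self.parent.remote.get(type(error))
        if handler is None:
            raise Unsolved(f"{self.name}: {error!r}")
        return handler(self, error)


root = SupervisedContainer(
    "root",
    remote={ConnectionError: lambda child, err: f"root restarts {child.name}"},
)
worker = SupervisedContainer(
    "worker",
    parent=root,
    local={ValueError: lambda container, err: f"{container.name} solved locally"},
)

print(worker.supervise(ValueError("bad input")))       # worker solved locally
print(worker.supervise(ConnectionError("link down")))  # root restarts worker
```

The worker never needs to know how its parent reacts; it only decides whether a failure is within its own competence.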
Translating The Failures
Each supervisor is configured with a function, called a strategy, that translates all possible failure causes into solvers. Solvers do not take the failed actor's identity as an input, which might seem inflexible, but at this point it is vital to understand that supervision is about forming a recursive fault handling structure. If you try to do too much at one level, it becomes hard to reason about.
Applying The Strategy
The supervision strategy is a set of possible failure identifiers, each accompanied by an array of solvers designed to solve that failure. Using a strategy is similar to a try-catch block in the sense that the first valid record is executed and the search then terminates. Therefore, it is extremely important to understand that the order of solvers matters.
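The first-match behaviour can be illustrated with a small lookup function. This is a sketch in Python, not Kraken code: `find_solvers` and the solver labels `"log"`, `"continue"`, and `"escalate"` are hypothetical, and matching by exception class is an assumption made for the example.

```python
def find_solvers(strategy, error):
    """Return the solver list of the first record whose failure type matches."""
    for failure_type, solvers in strategy:
        if isinstance(error, failure_type):
            return solvers  # first valid record wins; the search stops here
    return None  # nothing matched in this strategy


# Records are ordered from most specific to most general failure.
strategy = [
    (FileNotFoundError, ["log", "continue"]),
    (OSError, ["log", "escalate"]),
    (Exception, ["escalate"]),
]

print(find_solvers(strategy, FileNotFoundError("missing.txt")))  # ['log', 'continue']
print(find_solvers(strategy, PermissionError("denied")))         # ['log', 'escalate']

# Reversing the records puts the general one first, so it shadows the rest.
print(find_solvers(list(reversed(strategy)), FileNotFoundError("missing.txt")))  # ['escalate']
```

As with catch clauses, a general record placed before a specific one makes the specific one unreachable.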
To apply the strategy, modify the supervision.remote configuration options of any container, or directly modify an existing supervisor via the Supervisor.Remote services. Details about supervisors are presented in the supervision API article.
Triggering The Supervisor
The supervisor is triggered automatically by all unhandled errors and exceptions. It can also be triggered manually by calling the fail method of Kraken\Runtime\RuntimeContainerInterface, which references the current container. When the trigger happens, your application changes state from started to failed, switches its workflow, and ceases all operations not related to solving. In short, it switches your container to maintenance mode. To mark the problem as solved, one of the solvers has to call the succeed method, which switches the failed state back to the started one and resumes unfinished callbacks. It is very important to know that solvers do not call this method automatically, so you have to remember to do it yourself. It is good practice to keep the ContainerContinue solver at the end of each solving chain.
The succeed method automatically sets your container flow back to the default state and halts the remaining solvers, as the problem has been marked as solved. That is why it is important to keep succeeding as the last solving directive.
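The effect of calling succeed too early can be seen in a toy state machine. The sketch below is written in Python and does not use Kraken's API: `ToyContainer` and the three solver functions are hypothetical, and `resume` only stands in for the role that a chain-ending solver such as ContainerContinue plays.

```python
class ToyContainer:
    """Hypothetical container with only the two states mentioned in the text."""

    def __init__(self):
        self.state = "started"
        self.log = []

    def fail(self, error, solvers):
        """Enter the failed state and run solvers until one marks the problem solved."""
        self.state = "failed"
        for solver in solvers:
            solver(self, error)
            if self.state == "started":
                break  # succeed() was called, so the remaining solvers are skipped

    def succeed(self):
        self.state = "started"


def log_error(container, error):
    container.log.append(f"logged: {error}")


def resume(container, error):
    container.succeed()  # marks the problem as solved and ends the chain


def cleanup(container, error):
    container.log.append("cleanup")


early = ToyContainer()
early.fail(RuntimeError("boom"), [log_error, resume, cleanup])
print(early.state, early.log)  # started ['logged: boom']

forgotten = ToyContainer()
forgotten.fail(RuntimeError("boom"), [log_error, cleanup])
print(forgotten.state, forgotten.log)  # failed ['logged: boom', 'cleanup']
```

In the first chain `cleanup` never runs because `resume` comes before it; in the second chain no solver succeeds, so the container stays in the failed state.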
The local supervisor can delegate the solving process to the remote supervisor, using one of its solvers, after the container has entered the failed state. To do this, your solver should call its parent's cmd:error command, passing the proper message param, or, more simply, use one of the cmd solvers that handle escalating.
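Conceptually, escalation is just a command sent to the parent. The following Python sketch models that idea only; `ToyParent`, its `command` method, and the `escalate` function are hypothetical, and using the error text as the value of the message param is an assumption.

```python
class ToyParent:
    """Hypothetical parent container that records the commands it receives."""

    def __init__(self):
        self.inbox = []

    def command(self, name, params):
        self.inbox.append((name, params))


def escalate(parent, error):
    """Solver-like function: forward the failure to the parent as a cmd:error command."""
    parent.command("cmd:error", {"message": str(error)})


parent = ToyParent()
escalate(parent, RuntimeError("disk full"))
print(parent.inbox)  # [('cmd:error', {'message': 'disk full'})]
```

From this point on the parent's remote supervisor, not the child, decides which solvers to run.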
A solver is a piece of code that the supervisor executes as part of a strategy to solve a problem that has occurred. Solvers can be compared to a special case of commands. Kraken provides your application with a set of solvers to choose from. All of the possible default solvers are defined inside
There are three families of default solvers:
- cmd solvers provide your application with a set of helpers for working with errors. They allow logging, escalating, and other operations not related to containers.
- container solvers work on the container that threw the exception. They should be used in local supervision and allow basic container-related operations.
- runtime solvers are designed to work on remote containers. They should be used in remote supervision and allow the full set of container-related operations, including restarting a child.