Tools for the administration of servers, networks, applications and similar components have been available from the very beginning to handle traditional administration tasks. In a technical environment, these are divided into monitoring/debugging tools and administration/optimisation applications.
Monitoring tools observe the services and functions of the respective servers, or aspects of them, and provide the administrator with information about the status and performance of services and processes. For administration and optimisation tools, the information flow is usually reversed: the administrator decides which actions to take, based among other things on observations derived from monitoring, and then applies those actions to the services using the available tools. Today, a significant part of system administration work consists of exactly this cycle of reviewing the results of the monitoring system and then using administration or optimisation tools. Because of the sustained trend toward increasingly distributed applications, this process is much more complex in practice than it appears in theory. Each additional component increases the number of possible “adjustments” that can be made to run the services optimally in terms of availability and performance.
To master this complexity, it is necessary to clarify the dependencies between machines, applications, resources and services. This makes it possible to identify the correct points for intervention, and to estimate the likely consequences of changes (see the M-A-R-S model diagram).
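As a rough illustration of how such a dependency model could be queried, the following minimal sketch stores dependencies as a simple mapping and computes which elements are affected by a change to one of them. All node names (`machine:mail01`, `resource:log-partition` and so on) and the function `affected_by` are hypothetical and chosen only for this example; they are not part of the M-A-R-S model itself.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges: each key depends on the listed nodes.
# The prefixes mirror the four kinds of element named in the text.
DEPENDS_ON = {
    "service:mail": ["application:mta"],
    "application:mta": ["resource:log-partition", "machine:mail01"],
    "resource:log-partition": ["machine:mail01"],
    "service:archive": ["application:archiver"],
    "application:archiver": ["machine:archive01"],
}


def affected_by(node, depends_on=DEPENDS_ON):
    """Return every node that directly or transitively depends on `node`."""
    # Invert the edges so we can walk from a node to its dependents.
    dependents = defaultdict(set)
    for source, targets in depends_on.items():
        for target in targets:
            dependents[target].add(source)

    # Breadth-first search over the inverted edges.
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


if __name__ == "__main__":
    # Which elements could be affected by a change to the log partition?
    print(sorted(affected_by("resource:log-partition")))
```

A real model would carry far more information per edge (kind of dependency, criticality and so on), but even this simple traversal shows how the likely consequences of an intervention can be estimated before it is made.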
If such a dependency model is adequately defined, it is possible to significantly optimise the tasks involved in IT operations using a new class of applications (referred to here as “Administration Agents”).
As a rule, today’s tools work in one direction only: either the tool informs the administrator about the need for intervention, or the administrator initiates appropriate actions in the target system via another tool.
Administration agents have the advantage of possessing a model of dependencies, just as the human administrator does. For example, they can use monitoring information to determine which intervention options are appropriate and which areas are potentially affected. Such a preliminary selection saves the administrator a good part of the day-to-day work and makes a faster response possible; the time saved can be put to good use in other areas.
The M-A-R-S model, developed by arago, describes the dependencies of hardware, applications, resources and services.
In addition, such a ruleset enables the definition of standard actions that make manual intervention in acute situations completely unnecessary. A small example follows:
A mail server receives emails from applications, saves them in interim storage and dispatches them to the Internet. This process produces a large number of log files that document the server’s processing. Due to the size of the incoming files and the necessity of archiving them, they are automatically transferred to an archive server during the night but remain on the server itself for research purposes in case of user questions.
Should the available space in the server’s log partition reach a critical value, a monitoring system should ideally be in place to inform the administrator.
The administrator then checks which of the logs have been successfully transferred to the archive server and removes them from the mail server using an appropriate tool. In our example, two “monitoring” events (available space and transferred logs) lead to one “administrative” event. Because the situation follows a consistent pattern, the same action will be necessary every time the level of available space reaches a critical value.
The described chain of actions and reactions can be fully automated by an administration agent. This agent effectively bridges the monitoring and administration tools, so it can access the complete monitoring information (here: the level of available space and the transferred log files), all administrative options (here: the removal of files) and a model that describes the dependencies between servers, services etc. The administration agent “knows” the requirement to retain as many log files as possible on the server, and it knows the archiving conditions for these files on the archive server (and the status of the transfer). It is therefore in a position to correlate the two monitoring events intelligently and to automatically delete just enough already-transferred log files to bring the available space on the server back to the required level.
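The correlation described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `LogFile` record, the function `plan_cleanup`, the two thresholds and the choice to delete the oldest archived files first are all hypothetical and do not describe an actual agent implementation. The function only plans the deletion; it does not touch the file system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogFile:
    name: str
    size: int        # size in bytes
    mtime: float     # last-modified timestamp
    archived: bool   # True once the transfer to the archive server is confirmed


def plan_cleanup(logs, free_bytes, critical_bytes, target_bytes):
    """Return (files_to_delete, escalate).

    Correlates the two monitoring events from the example: the available
    space and the transfer status of each log file. Only archived files are
    candidates, and only as many as needed are selected, so that as many
    logs as possible remain on the server.
    """
    if free_bytes >= critical_bytes:
        return [], False  # no critical event, nothing to do

    to_delete = []
    # Assumption: the oldest archived logs are the least valuable to keep.
    archived = sorted((log for log in logs if log.archived),
                      key=lambda log: log.mtime)
    for log in archived:
        if free_bytes >= target_bytes:
            break
        to_delete.append(log)
        free_bytes += log.size

    # Escalate to the administrator if the target cannot be reached.
    return to_delete, free_bytes < target_bytes


if __name__ == "__main__":
    logs = [
        LogFile("mail-01.log", 100, 1.0, True),
        LogFile("mail-02.log", 100, 2.0, True),
        LogFile("mail-03.log", 100, 3.0, False),  # not yet archived
    ]
    files, escalate = plan_cleanup(logs, free_bytes=50,
                                   critical_bytes=100, target_bytes=120)
    print([f.name for f in files], escalate)
```

Separating the decision from the actual deletion keeps the rule easy to test and makes the escalation case explicit: if too few archived files are available to reach the target, the plan signals that the administrator needs to become involved.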
The administrator becomes involved only when something in this chain of actions does not function as defined, for example when an error occurs in the archiving or the deletion process fails.
As it is well known that nothing in IT maintenance is as constant as change itself, the system administration team gains valuable time by applying the described approach. This time can in turn be invested in optimising the dependency model and the set of rules.
Under ideal conditions, this would lead to a continuous improvement in IT services without demanding a great effort of the administration team.