As you have hopefully read, automation is not magic, not even black magic. It is the execution of actions based on conditions. Since this does not sound all that difficult, what do we need to integrate this concept into everyday IT maintenance? Simple: we need some sort of machine, an automation engine, that sits on all the IT components of our environment and executes actions whenever conditions we have programmed into it become true.
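To make "actions based on conditions" a bit more tangible, here is a minimal sketch in Python. Everything in it is made up for illustration: the `Rule` class, the `run_engine` function, the metric names and the thresholds are assumptions, not part of any real product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of monitoring data: metric name -> current value.
Metrics = Dict[str, float]


@dataclass
class Rule:
    """One condition/action pair (illustrative, not a real product API)."""
    name: str
    condition: Callable[[Metrics], bool]
    action: Callable[[Metrics], str]


def run_engine(rules: List[Rule], metrics: Metrics) -> List[str]:
    """Check every rule against the metrics and run the actions that match."""
    results = []
    for rule in rules:
        if rule.condition(metrics):
            results.append(rule.action(metrics))
    return results


# Two made-up rules with made-up thresholds.
rules = [
    Rule(
        "disk_cleanup",
        lambda m: m.get("disk_used_pct", 0) > 90,
        lambda m: "clean temp files",
    ),
    Rule(
        "restart_web",
        lambda m: m.get("http_errors_per_min", 0) > 50,
        lambda m: "restart web server",
    ),
]

sample = {"disk_used_pct": 95.0, "http_errors_per_min": 3.0}
executed = run_engine(rules, sample)
print(executed)
```

Note that this loop checks every rule against every snapshot of data, which is fine for two rules and hopeless for a whole IT environment.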
“Simple” may be a pretty misleading term, though. The concept of this machine is simple, but the automation engine has to monitor all the data available in our IT environment in order to match conditions, and it also has to find the right action to execute. The technical problems that must be solved to make this machine work are numerous. Let me take a quick look at a few of the most pressing ones:
- Mass of data to be processed
As you may remember, we are looking at all the system management, KPI and quality data we can get our hands on for all the IT around us. So there is a lot of data, and we have to deal with all of it.
- Mass of conditions
Besides all that data, there are a lot of conditions that have to be evaluated against the available data series. The automation engine is a very elaborate version of a rule engine, because it deals with a highly interconnected logic tree (the IT model) and many conditions on a large data space. So typical shortcuts for rule evaluation, such as decision matrices, do not work here.
- Unknown rules
If we wanted to put everything that needs to be executed automatically into an explicit rule, building the rule system would take a lifetime and the “mass of conditions” problem would become ever more severe. Building implicit rules is too complicated for the user. So the automation engine has to adopt a behavior of encircling the problem. This is a divide-and-conquer approach, as opposed to asking a user to enter every circumstance and every reaction, because that kind of “brain dump” simply has not been invented yet. I know this is very abstract, and I am sure I will soon find a little more time to elaborate on how an automation engine has to find the proper actions to solve a real-life problem.
By the way, most computer systems and approaches in system management software take a simple approach to tackle these challenges. Techniques like root cause analysis or autonomic systems try to move down the dependency tree and find the problem somewhere down there. Why is this approach practical? Well, it quickly narrows down the number of possible data sources and actions, and that way a computer system can actually work out the simple problem that remains. And why do these approaches fall short? Well, they simply don’t work with complex problems that show symptoms in some remote location of the IT environment, or with problems that are caused by multiple sources. Most problems in modern IT systems are of the latter kind, and therefore the common top-down approaches do execute quite a few actions, but solve not nearly as much as an automation engine should. Or would you expect your best system administrator to simply go down the logical tree of connected systems while trying to find out why your e-commerce application is not working? No, not really, because good administrators encircle the cause of a problem: through their experience they exclude large parts of the IT environment as possible causes and then concentrate only on the “relevant” remainder.
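One loose way to picture the difference between walking down the tree and encircling is the toy sketch below. It is only an illustration under my own assumptions: the `DEPENDS_ON` model, the component names and the functions `top_down` and `encircle` are all hypothetical, and the "experience" of an administrator is reduced to a plain set of components that have been ruled out.

```python
from typing import Dict, List, Set

# Hypothetical dependency model: each component lists what it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "shop": ["web", "payment"],
    "web": ["app"],
    "app": ["db", "cache"],
    "payment": ["network"],
    "db": ["storage"],
    "cache": [],
    "network": [],
    "storage": [],
}


def reachable(start: str) -> Set[str]:
    """Collect every component that `start` depends on, directly or not."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for dep in DEPENDS_ON.get(node, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def top_down(start: str, symptomatic: Set[str]) -> str:
    """Follow dependencies that show symptoms; return the deepest one reached."""
    node = start
    while True:
        broken = [d for d in DEPENDS_ON.get(node, []) if d in symptomatic]
        if not broken:
            return node
        node = broken[0]


def encircle(start: str, ruled_out: Set[str]) -> Set[str]:
    """Start from everything involved and exclude what has been ruled out."""
    return reachable(start) - ruled_out


# Only the shop itself shows a symptom, so the walk down the tree gets stuck.
stuck_at = top_down("shop", {"shop"})

# Excluding what "experience" rules out leaves a small set of candidates.
candidates = encircle("shop", {"web", "app", "cache", "payment", "network"})
print(stuck_at, sorted(candidates))
```

In this toy case the top-down walk never leaves the shop, because nothing directly below it looks broken, while the exclusion step leaves only the database and its storage to look at.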

