Can Automation be Trusted - Or How to Build Trust on Laziness
Automation, Social Impact of Automation
June 19th, 2008, 9:57 am
Well, what a very basic question… Should we be discussing automation engines when we should not trust them to take action automatically? Surely not, and obviously we are discussing automation engines.
So why do I hear so much about the lack of trust in automated actions? It may be a stunning change in the field of system administration that some entity takes automatic action where, up to now, a system administrator would have typed in a couple of commands. And change always induces fear and prejudice. Questions like “Do you really trust the engine to restart this business-critical service?” are not uncommon. Well, why should the machine not do that? After all, the only action a system administrator would have taken is to restart the whole machine instead of just the service.
This simple everyday example shows the real problem: trust.
We seem to have a problem when faced with the necessity to trust a machine or some lower level of reactive “intelligence”. Maybe this is just due to the many science fiction books we have read on robots and machines gone mad. In the end we are the ones who gave the engine the rule set by which it acts.
Actually, we trust automation every time we step into a lift. Much more than that, we rely on hard-wired automation when we breathe or when our heart beats. I think none of us would be too happy about the idea of having to think about and consciously carry out every breath and heartbeat. Automated actions in IT administration are not much different - and just as you can hold your breath, automated actions can be overridden at any time.
This sounds very logical, doesn't it? But logic is not the drink for “unsinkable rubber ducks” (the term “true believer” is nowadays too closely connected to politics - and besides, much less enjoyable). So a good argument usually does not help much. To get on with automation, either management uses force or we try to employ man's oldest habit - laziness (though we could get entangled in a discussion about whether greed or laziness was around first). And do not get me wrong, great things like the wheel were invented because of laziness. Along the way, we build trust in automation in a non-intrusive way - i.e. everyone involved can discover for himself that automation helps and is not evil. So this is how it is done:
- Set up the automation engine in full.
- Disable all automated commands and redirect them to a trouble ticket or service management tool.
- Have administrators use this tool and hence make them see what the engine would have done.
- After a while, people will start to copy and paste the commands from the trouble ticket or service management tool into the various command lines.
- This is the time to enable automatic command execution. The connection to the service management or trouble ticket system stays as it is, so the commands executed are not in any way “black boxed”.
- There will not be mistrust and all the discussions, bad feelings and politics attached to it.
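The steps above can be sketched in a few lines of Python. This is only an illustration under assumptions, not how any particular automation engine works: the names `AutomationEngine`, `Ticket` and `auto_execute` are made up for this example, the ticket system is a plain in-memory list, and the command string is just a placeholder. The point it shows is that the ticket trail is written in both phases, and only a single switch decides whether the engine also runs the command.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Ticket:
    """One entry in a stand-in ticket system (hypothetical, in-memory)."""
    command: str
    executed: bool


@dataclass
class AutomationEngine:
    # runner is whatever actually executes a command; it is injected so
    # that this sketch has no real side effects.
    runner: Callable[[str], None]
    auto_execute: bool = False  # start with automated execution disabled
    tickets: List[Ticket] = field(default_factory=list)

    def handle(self, command: str) -> Ticket:
        """Record the proposed command; run it only if execution is enabled."""
        if self.auto_execute:
            self.runner(command)
        ticket = Ticket(command=command, executed=self.auto_execute)
        self.tickets.append(ticket)  # the ticket trail is kept in both modes
        return ticket


executed_commands: List[str] = []
engine = AutomationEngine(runner=executed_commands.append)

# Suggest-only phase: administrators see the command in the ticket and
# decide for themselves whether to type (or paste) it.
first = engine.handle("service httpd restart")

# Later, once pasting has become routine, flip the switch.
engine.auto_execute = True
second = engine.handle("service httpd restart")

print(first.executed, second.executed, executed_commands)
```

Because the ticket is written whether or not the command runs, switching on execution changes nothing about what the administrators can see.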
4 Responses to “Can Automation be Trusted - Or How to Build Trust on Laziness”
June 19th, 2008 at 10:43 am
Chris,
sorry to come back to the healthcare sector, but I can’t deny the fact that I worked half of my life with clinical information systems, providing information to humans (indeed, even god-like doctors have a human core) who decide every day about life and death based on the facts they see and have in their minds and in their computers.
In the healthcare industry we have similar discussions about whether it will some day be possible for a computer to decide if a certain examination makes sense or not.
My point of view is that a computer, programmed in the right way, has much more computing power and could use much more information for a ‘right’ decision than a human could ever have.
The big question is at what point the computer is allowed to decide on its own, and who is responsible for Dr. Computer’s actions.
This is a very emotional point, though not many people know about and/or fear the singularity.
June 19th, 2008 at 12:51 pm
“Well why should the machine not do that? After all the only action a system administrator would have taken is to restart the whole machine instead of just the service?”
The action the sys admin would have taken can be many different things and may depend on a number of factors. Why did the service stop in the first place? Does he need to collect data to investigate the root cause of the failure later? Is it the right time of the day/week/month/year? Are there other services hosted on the same server? Was there a failover/redundancy server? May restarting it cause other problems?
Of course, automation can have all this information as well and make the right decision on whether or not to restart the service, at least in theory. The problem is that this type of information is often tacit and not captured in computer-consumable form.
IMHO, straightforward actions can and should be automated, but we still seem to be more comfortable having human intelligence in the process, in case not all information is captured and hard logic is not suitable for the case. So the lack of “trust” is often not about the computer software itself but about whether we manage to implement sophisticated automation AND provide all the decision-making data to allow the automation to make the right choice for us.
Restarting a failed service is the most common example given for automation, but as mentioned at the beginning of this comment, even that simple task is not always simple.
A typical example of automation going nuts is opening a ticket for critical alarms from monitoring systems. Most of the time this is a great notification, but when something goes wrong and 10,000 alarms are created instead of the usual 100, automation proceeds to take down the ticketing system, etc.
If the
June 27th, 2008 at 3:54 pm
Hi Berkay,
it’s a pity that your post was not complete, because you were mentioning the big point: most so-called automation systems perform as you described. This is a key differentiator of our Automation Engine, which is fed by a model of the IT infrastructure and is aware of the dependencies between Machines, Applications, Resources and Services. Depending on the depth of the model, our engine can travel through that model to encircle problems and execute actions defined by the rule set. If required, manual intervention can also be defined within the rules, so in really critical situations the admin can be asked. A big plus for automation is that every single step is documented, so improvements to the rule base are straightforward - this is not guaranteed with a human admin performing under heavy load.
I’m convinced that, if one is really honest, some or many admins, or even doctors, are overwhelmed by the huge number of factors, and their decisions often leave room for improvement.
Regards
Roland
Disclaimer: Some years ago I was responsible for a team administering about 250 servers at 70 hospital sites, so I claim to know what I’m talking about…
July 15th, 2008 at 8:24 pm
Hi Roland,
I didn’t even realize that my post was cut off. I guess I talked (wrote) too much.
What you describe does sound quite interesting and useful. As with many things, the terminology is polluted and everyone understands something else when we say “automation”. As you point out, you can automate intelligently, but most people see automations that are not so sophisticated, which may be the cause of the skepticism you see. Almost every product suite claims “automation”, many offering nothing more than the ability to write a script, etc.
How to differentiate what’s good from what’s bad or insufficient? Examples certainly help. I think one solid use case would go a long way. Looking forward to reading more about your thoughts and solutions!