Jul 29
Reading and writing about IT automation, I keep on learning about the subject. Lately I found that there are so many flavors of automation around the operating processes of IT that misunderstanding seems inevitable. So I will try here to describe the different kinds of automation one can use in maintaining a high-quality IT environment.

- Incident-, Problem-, Capacity- and Availability Management
Automation engines specialized in analyzing and handling events that occur in an IT environment and that may lead to, or themselves represent, malfunctions, loss of quality and the like. Both reactive automation (an automated reaction to an incoming event) and proactive automation (automated actions taken to prevent events from occurring) are targets of these engines. Automation engines that handle this “fault operating” are either embedded into the ITIL processes (see blog entry on extending ITIL with automation), like our automation engine (aAe), or embedded into system components or management systems with a narrow scope, e.g. redundancy activation.
- Change Management
Automation engines specialized in performing changes that modify or extend an IT environment automatically. Some of these engines insert an abstraction layer above the tasks that need to be performed (like adding users, restarting a component and the like), so that an administrator can perform tasks on many machines or on different platforms simply by interacting with the automation engine. An example of this kind of engine is the Puppet framework, with its very structured approach to abstraction. Other engines focus on scaling an IT environment by dynamically adding resources or by automatically installing or modifying a system, as the Tivoli Provisioning Manager or VMware VirtualCenter do.
I really do hope (not just to save you some consulting fees) that this helps to avoid misunderstandings when you are talking to others about automation. Even better, maybe I could point out some additional techniques you can look at to make life easier.
Jul 28
On Sunday I had a great chance to view virtualization at its best. I attended an Airbus A340 simulator training at Lufthansa Flight Training via ProFlight. As my eyesight will always prohibit me from entering a real cockpit, flying in a simulator is about the closest I will ever get to flying an airplane. And I have to say the simulator is like the real thing and I love it!
The session started out with a short briefing on basic aerodynamics, the controls of the Airbus A340 and an overview of basic flying procedures. The instructor was a retired jet pilot who is now working as a pilot trainer. He was very experienced, calm and professional, and got us up to speed very quickly. So we could enter the A340 simulator after an hour of pre-flight briefing.
Photos: A340 cockpit, copilot's side; A340 cockpit, full view; Chris at the controls; approach to SFO.
It is really hard to tell the difference between the simulator and the real thing. Sitting in the cockpit you can feel the movement, hear the noise and let yourself be drawn into the world of flying. We started out flying in New York and took a look at Coney Island at night. After NY I chose San Francisco International Airport as the next spot for flight training. Taking an A340 at a totally illegal altitude above the Golden Gate is really stunning.
I had the chance to practice four approaches to SFO and got it right the first time (just look at the analysis and be humble!). The only real problem I had was not to hit any other aircraft while taxiing to the terminal. Obviously I feel much better with my head up above the clouds. And I will definitely be back again.
Jul 24
Some time ago I published an article on the future of IT operation after we are through with all the ITIL implementations (still) taking place. Assuming that all the nice failure handling, proactive failure avoidance and communication processes like Incident, Problem, Capacity and Availability Management are in place, implementing automation is the logical way to move ahead. Compared to implementing ITIL, automation actually changes the things that are done and the way they are done. As you may have guessed, this statement alone was fertile ground for interesting and heated debate.
Generally the article concluded that implementing the ITIL processes concentrates on the interfaces between IT experts, clients, business requirements and the like, whereas automation concentrates on the way IT operation is actually “produced” (in an industrial meaning of the word). Even though these two may be viewed separately, the article shows how heavily an automation environment depends on monitoring and IT component data. An ITIL environment puts forth a valid definition of both data sources for a complete IT environment and is therefore a good foundation to start implementing automation.

An IT operations environment with implemented ITIL processes also has common interfaces to the acting staff members. This makes it very easy to “inject” a new entity - like an automation engine - into the whole system. In such an approach the automation engine wraps itself around the data sources of CMDB and monitoring systems. All communication that would today be directed towards human recipients is handled by the automation engine first. Only if the automation engine is not able to complete the task are the IT experts involved.
This short description reveals how well an ITIL implementation prepares an IT organization for implementing automation. It also shows how automation is made completely transparent to the business using the IT - as the automation engine acts like any human entity taking part in the ITIL processes.
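The engine-first routing described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `Event`, `AutomationEngine`, `dispatch` and `restart_on_hang` are hypothetical and do not come from any real product. The engine tries its handlers first, and only events that no handler can deal with end up in a queue for the human experts.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Event:
    component: str  # name of the affected component, e.g. as listed in a CMDB
    symptom: str    # what the monitoring system reported


# A handler returns a description of what it did, or None if it cannot help.
Handler = Callable[[Event], Optional[str]]


@dataclass
class AutomationEngine:
    handlers: List[Handler] = field(default_factory=list)
    escalated: List[Event] = field(default_factory=list)  # queue for human experts

    def dispatch(self, event: Event) -> str:
        # The engine sees every event first, like any other ITIL participant.
        for handler in self.handlers:
            result = handler(event)
            if result is not None:
                return f"automated: {result}"
        # No handler could complete the task, so the IT experts get involved.
        self.escalated.append(event)
        return "escalated to IT experts"


def restart_on_hang(event: Event) -> Optional[str]:
    # Hypothetical rule: a hung process is fixed by a restart.
    if event.symptom == "process hung":
        return f"restarted {event.component}"
    return None


engine = AutomationEngine(handlers=[restart_on_hang])
print(engine.dispatch(Event("webserver01", "process hung")))
print(engine.dispatch(Event("db02", "disk controller failure")))
```

From the outside, the caller only sees that the event was taken care of, which is the sense in which the engine stays transparent to the business.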
The article itself gives a short overview of the “operational” ITIL processes and how their implementation builds the foundation for automation. If you are interested you may read the whole text here.
Jun 10
We are all moved by compliance issues. Mainly storage vendors, consultants and auditors are having a feast. For most corporations introducing the new rules is quite a drain on resources. Besides this, changes in the working processes are the main cause for discomfort in the workforce and management of the entities affected by the rules.
Automation actually solves one big problem compliance poses for IT operation. However it may also make an old one reappear.
So let us take a look at the good news first. One demand often posed by auditors and clearly stated in all new compliance rule sets is that all actions, and the reasoning behind taking them, should be well documented and archived. In a normal working environment this usually means getting on everybody's case and forcing them to type explanations of what they did into some documentation system, after the system has behaved like Big Brother and logged the technical parts of what was done. This can become tedious and does not have much positive effect on day-to-day business. So most explanations in these systems look like ‘fixed the ABC problem’ and the reasoning part is lost forever. This is where an automation engine really helps. An automation engine will document each action it takes, archive the data and the rules that have caused the action to be taken, and reveal the planned next steps and all related actions and problems. So there is one big relief for everybody working on or auditing IT operations. Great, isn’t it?
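To make the documentation point concrete, here is a minimal sketch of what such an audit entry could contain. The field names, the `record_action` helper and the in-memory `archive` list are assumptions made for illustration; a real engine would write to a proper, tamper-evident archive.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List

# Stand-in for a real archive (assumption: a plain in-memory list).
archive: List[str] = []


def record_action(rule_id: str, rule_author: str, condition: str,
                  action: str, target: str, next_steps: List[str]) -> Dict:
    """Store what was done, why it was done, and what is planned next."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "rule_id": rule_id,          # the rule that caused the action
        "rule_author": rule_author,  # who wrote that rule
        "condition": condition,      # the reasoning: why the rule fired
        "action": action,            # the technical part of what was done
        "target": target,            # where it was done
        "next_steps": next_steps,    # planned follow-up actions
    }
    archive.append(json.dumps(entry, sort_keys=True))
    return entry


entry = record_action(
    rule_id="restart-hung-service",
    rule_author="dba_alice",
    condition="service not responding for 60s",
    action="restart service",
    target="db02",
    next_steps=["verify service health", "close incident"],
)
print(archive[-1])
```

Unlike a hand-typed ‘fixed the ABC problem’, the condition and the rule travel with the action, so the reasoning is not lost.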
The second topic is the way roles and rights are managed under compliance rule sets. In the dark ages, there was a super user (many administrators are still worshippers of this creed). According to the new rules an administrator may have the rights to perform manipulations only on exactly the entities he is attached to. A database administrator, for example, should only be able to talk to his database, and if he needs some different system settings because his database requires more semaphores, he will have to create a change request to the OS administrators. At least that is how it works in theory, or whenever administrators want to slow each other down dramatically. I think the intention of the new rules is clear and unarguable: one human should only be able to influence the area he is directly dedicated to. Everything else can produce unpredictable risks and should thus be avoided. All fine and good, and most corporations (at least the larger ones) have implemented ‘the admin silo view’ by using simple mechanisms like ‘sudo’ or more complicated rights management systems. Once an automation engine is inserted into this environment, however, any administrator who can create a reusable rule could cause commands to be executed outside the rule author’s area of competence.
Well, one could argue that this is exactly what we want: to reuse the expert knowledge of someone who solved a problem in different environments. Auditors would probably say ‘no, this is exactly what we do not want’... A big dilemma?
I do not really think so. And I do think that we really want the knowledge to be distributed and here is why:
- The ones who are writing rules are experts. Like the expert we call in when we really cannot find the cause of or remedy for a problem.
- The guy who wrote the rule will always be identifiable from the engine's point of view, and that was the original intent of the compliance rules (make sure we know what was done by whom and where).
- One could restrict rule attachment by group signatures and the like (an additional parameter in the IT model) to create peace and quiet, but should one really dismiss the power of implicit rules if every action and its originator is well documented? (Maybe someone really into the field of compliance could answer this question for me?)
So all in all automation may cause some auditors or process consultants some headaches, but heck - this is what they are paid for, isn't it? On the other hand an automation engine produces well-formed documentation and reasoning for the auditors, which is something that any kind of silo restriction on the human workforce cannot guarantee.
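The group-signature idea from the last point could look roughly like the sketch below. Everything in it is hypothetical: `Rule`, `Component`, `may_attach` and the `enforce_groups` switch are illustrative names, and the group model is deliberately simplistic.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Rule:
    rule_id: str
    author: str                    # always known, so the originator stays identifiable
    author_groups: FrozenSet[str]  # groups the author belongs to


@dataclass(frozen=True)
class Component:
    name: str
    owner_group: str  # the "additional parameter in the IT model"


def may_attach(rule: Rule, component: Component, enforce_groups: bool = True) -> bool:
    """Decide whether a rule may act on a component."""
    if not enforce_groups:
        # Implicit rules: expert knowledge is reused everywhere,
        # relying on the audit trail instead of silo restrictions.
        return True
    return component.owner_group in rule.author_groups


semaphore_rule = Rule("raise-semaphores", "dba_alice", frozenset({"dba"}))
database = Component("db02", owner_group="dba")
os_host = Component("host17", owner_group="os_admin")

print(may_attach(semaphore_rule, database))
print(may_attach(semaphore_rule, os_host))
print(may_attach(semaphore_rule, os_host, enforce_groups=False))
```

With enforcement on, the database administrator's rule stays inside the silo. With it off, the rule reaches the OS host, and the author field is what keeps the action attributable.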
May 21
Considering that only 30-40% of the energy consumed by a data center is used by the actual computational equipment, and that another 30-40% of the energy consumed by the IT equipment is converted to heat, only 21-24% of the energy eaten up in a data center is actually converted into computing power. Looking at these numbers from the other side, for each Euro spent on “computational energy” up to 4.76 Euro are spent in total, which leaves up to about 3.76 Euro of “overhead”. This ratio can be improved by optimizing air conditioning, getting rid of heat hot spots or generally using energy-efficient and modern equipment. Still it seems unlikely that this will help to get anywhere below 3 Euros of “overhead” for each Euro spent on “computational energy”. These numbers are taken from the keynote presentation given by Steven Sams at IBM PULSE 2008 (also check out the “Raised Floor Blog”, where Steven Sams is one of the authors).
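The arithmetic behind these figures can be checked with a short calculation. The sketch assumes that the low values are paired with each other and the high values with each other, which is how the 21-24% range appears to be derived; the function names are made up for this example.

```python
def useful_fraction(it_share: float, heat_share: float) -> float:
    """Share of total data center energy that ends up as computing power.

    it_share:   fraction of total energy that reaches the IT equipment
    heat_share: fraction of the IT equipment's energy that is lost as heat
    """
    return it_share * (1.0 - heat_share)


def total_per_useful_euro(fraction: float) -> float:
    """Euros spent in total for each Euro that becomes computing power."""
    return 1.0 / fraction


for it_share, heat_share in [(0.30, 0.30), (0.40, 0.40)]:
    fraction = useful_fraction(it_share, heat_share)
    total = total_per_useful_euro(fraction)
    overhead = total - 1.0  # everything that is not the useful Euro itself
    print(f"useful: {fraction:.0%}  total per useful Euro: {total:.2f}  "
          f"overhead: {overhead:.2f}")
```

Read this way, the 4.76 figure is the total spent per useful Euro in the 21% case (1 / 0.21), and the overhead is that total minus the useful Euro itself.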
On the other hand this means that every unit of energy saved on computational equipment reduces total energy consumption by more than four units. So improving the facility is a good start, but reducing the energy actually needed by computational equipment is the real prize. The way to reduce the energy needed is a direct result of capacity management. Generally speaking this means – in the best case – turning off as many components as possible – or, if that is not possible, at least cutting their energy usage by slowing the CPUs or putting virtual instances into suspend mode until their service is really required. Does this sound easy? Well, it does, but how does it work? Virtualization certainly is the key technology, but what good would virtual machines be if their resources could not be allocated automatically depending on their actual use or – if you want to be cautious – on their predicted load and therefore their predicted usage? A specialized set of rules is put behind process and operational automation to perform the scale-down and scale-up of the virtual machine resources. This automation can even decide to turn off hardware that is currently not needed, or at least to slow the CPUs of hardware that cannot be turned off but is in little use.
Modern or “very green” systems come with special agents that detect energy consumption and usage and derive possible actions from them. But how about all the legacy applications – the applications that are running on more than 95% of all the components using up energy in our data centers today? An automation engine that actually acts like an operator (someone who could manually cut down on power use) could examine the equipment in the IT landscape it is acting upon and execute general rules to reduce energy usage. By combining both technologies – the more effective combination of modern hardware and specialized software for new applications and a general automation engine for all the legacy applications – the power of virtualized components can actually be converted to green power. This is not just a fabulous business case, but it also is a good thing for the environment and hence for all of us.
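A scale-down rule set of the kind described here might be sketched as follows. The thresholds, the `Resource` fields and the action names are assumptions chosen for illustration, not values from any real capacity management tool.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    predicted_load: float  # expected utilization between 0.0 and 1.0
    is_virtual: bool       # virtual instance or physical hardware
    can_power_off: bool    # some hardware must never be switched off


def energy_action(res: Resource,
                  idle_threshold: float = 0.05,
                  low_threshold: float = 0.30) -> str:
    """Pick the cheapest state that still matches the predicted load."""
    if res.predicted_load <= idle_threshold:
        if res.is_virtual:
            return "suspend"       # park the instance until its service is required
        if res.can_power_off:
            return "power_off"     # hardware that is currently not needed
        return "throttle_cpu"      # cannot be turned off, so at least slow it down
    if res.predicted_load <= low_threshold and not res.is_virtual:
        return "throttle_cpu"      # in little use: slow the CPUs
    return "keep_running"


landscape = [
    Resource("vm-report", 0.01, is_virtual=True, can_power_off=True),
    Resource("blade-07", 0.02, is_virtual=False, can_power_off=True),
    Resource("san-head", 0.03, is_virtual=False, can_power_off=False),
    Resource("blade-02", 0.20, is_virtual=False, can_power_off=True),
    Resource("blade-01", 0.85, is_virtual=False, can_power_off=True),
]
for res in landscape:
    print(res.name, "->", energy_action(res))
```

Using the predicted load rather than the momentary load is the cautious variant mentioned in the text; the scale-up path would simply reverse these actions when the prediction rises.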
May 21
Yesterday I had the chance to get a feeling for one of the hottest topics in IT infrastructure. A panel session at IBM PULSE 2008 was dedicated to the topic of Cloud Computing (even though IBM marketing people don't seem to like the term and have come up with quite a few innovative words – words no one uses, so let us stick with the cloud). The panel was buzzing with intelligence; unfortunately we as the audience could not really match up. So we listened to a pretty much directed discussion on how cloud computing would replace today's approach to hardware and infrastructure in general. Well, I do agree: no one needs dedicated servers when resources can be allocated dynamically and come preconfigured and interconnected. Kristin Hansen stripped the key features of a cloud down to simplicity (users do not care how their resources are set up, they just use them), mobility (obviously use is possible from anywhere, and even a large computing cluster could be controlled from a phone-like device) and elasticity (you only set up or pay for what you really need). Sounds fine to everyone, and Google and Amazon have definitely shown the world that this concept works in a closed shop environment. According to Dave Lindquist, IBM is working on a methodology and technology to make most applications “cloudable”. The most interesting remark I heard during the discussion was that “Cloud Computing is the combination of technology (virtualization and automation) and discipline (a stringent way of breaking down the offered services into small blocks in order to recombine them quickly and automatically upon the user's request as well as defining standards or service catalogues to be offered)”. I guess the discipline part will put forth a great deal of discussion between process consultants and methodology consultants, and in the end there will certainly be a couple of good ways to set things up.
Just as certainly there will be the need to standardize these processes and methodologies in the end, so clouds are not proprietary but keep mobile even between cloud providers.
Naturally I am more interested in the technology part that is needed behind cloud computing. Technology, in this case, does not refer to the cloud management servers and agents themselves, but to the technology surrounding them. The first technology that comes to mind is virtualization, as without this core there will be no cloud, at least no cloud that can integrate legacy applications rather than working in a very tightly closed universe like Google does. There are quite a few good approaches to virtualization – commercial as well as open source – and the approach taken should really depend on the needs of the applications to be run on a specific part of the cloud. It probably even makes sense to combine the available virtualization technologies within one cloud. It might make sense to use containers built into the operating system or complete hardware virtualization, depending on the kind of application to be run, and therefore a cloud manager will have to deal with all kinds of virtualization technology.
More in my focus is the service management side of cloud computing, and I strongly believe that automated operating is a key component of a good cloud infrastructure. The cloud infrastructure and management components will definitely take care of automatic provisioning and resource management, but as soon as legacy applications – that do not really know that they are running on a beautifully scalable environment – are involved, manual administration of these applications would mean chasing an ever changing rabbit across a chameleon planet – an image most amusing to bystanders, but funny neither to administrators nor to the ones paying them. So in my opinion an automation engine could be fed IT model data and monitoring feeds directly from the cloud manager and could thus deal with the ever changing environment and keep the application automation rules up to date with the cloud components currently in use. This automation engine cannot use a drill-down approach, because the infrastructure might not even support drill-downs and can change ever so often. The automation engine, which ensures a good foundation for quality service and professional service management, will have to use a more human “circle in” or divide-and-conquer approach.
Does this sound familiar? By the way, check out the articles on the “Blue Cloud”; technical pioneers at work (other bloggers also think about the blue cloud)... Also interesting is the cooperation between Google and IBM on producing cloud standards.
May 19
One should think that Florida in spring is pure sunshine, but actually it looks quite rainy these days. So Cy and I have the time and concentration to follow the IBM PULSE 2008 Conference, which we are attending this year. Just for a short introduction, I have been listening to the opening keynote moderated by Al Zollar (really good show). The best speech in the opening block was given by Steve Mills, who introduced the challenges the IT service industry is facing today. He talked about the introduction of “smart devices” or intelligence into most devices that make up our assets (not just IT but all infrastructure) and the possibilities these new interfaces offer. He also talked about the growing amount of data processed and stored today. This was (of course) followed by a presentation of the challenges the energy hunger of IT components makes us all face in terms of environmental, social and economic consequences. All this was wrapped up in the observation that, with IT becoming ever more part of everyday life and everyday business, the expectations towards quality of service of IT systems are not just rising but are at roughly 100% - he made a very impressive connection to telecommunication (the dial tone is there when you pick up the phone) and the power grid (power is simply available when you need it). In order for IT to meet these expectations, Mr. Mills proposed that IT professionals are facing their final challenge in the industrialization of IT services.
Does that sound familiar? Yes, it does (at least if you have read quite a bit of what we have been writing here)… Mr. Mills went about industrialization on a rather abstract level, but among his key points were process automation and the automation of manual labor. The latter is exactly where the automation technology I am focused on comes into play. Especially when all the smart devices keep on generating more and more monitoring data, it is not just laborious to handle all the events that administrative staff are supposed to pay attention to; at some level it simply becomes impossible. That is when we need an “engine” or an “industrial robot” to take care of processing the events and taking the right action.
I have seen great approaches for visualizing events and monitoring data here, as well as very interesting techniques and technologies to get in control of automated processes (I should point out that these concepts were obviously inspired by practices from the production industry), but I have not seen much on tools that actually perform the work. I think we will have some very interesting conversations with colleagues from all branches of the IT industry on our ideas concerning actually automating the operating of IT systems. At least my first impression at a “meet the experts” event after the keynote is that IBM PULSE 2008 is the right place for an open-minded discussion on how far we can possibly take this approach of automating IT operating and what technology is needed or even available to do that. I am very excited about that, especially since the idea of automation - an idea techies have been discussing for quite some time - seems to have penetrated strategic management, and people like Mr. Mills show how far someone with a vision can take the concept of automation.

May 01
As we have learned in a previous post of my valued colleague Chris, automation is a very easy thing -
It is the execution of actions based on conditions.
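Taken literally, that definition fits into a handful of lines. The sketch below is only an illustration: the state keys, the two sample rules and the `run_rules` helper are invented for this example.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Condition = Callable[[State], bool]
Action = Callable[[State], str]
Rule = Tuple[Condition, Action]


def run_rules(state: State, rules: List[Rule]) -> List[str]:
    """Execute every action whose condition holds for the given state."""
    return [action(state) for condition, action in rules if condition(state)]


rules: List[Rule] = [
    (lambda s: s["disk_used_pct"] > 90, lambda s: "clean up temp files"),
    (lambda s: s["service_up"] == 0, lambda s: "restart service"),
]

print(run_rules({"disk_used_pct": 95, "service_up": 1}, rules))
print(run_rules({"disk_used_pct": 40, "service_up": 0}, rules))
```

Here the actions only return a description of what they would do; in a real engine they would run the cleanup or the restart script.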
From the technical standpoint this simple explanation sums it up. But as someone working on product management and marketing issues, I can tell you that there are loads of other aspects to our featured subject, automation.
Today it’s possible to automate almost everything, from doors to production plants and - to switch over to the IT world - from simple applications to system run books and even whole datacenters. What they all have in common is the goal to make life easier and to spare the “user” unnecessary, repetitive tasks like opening doors, pressing knobs, typing commands into a shell window or even executing restart scripts.
The potential of automation to become the “next big thing” seems to be huge. So is the number of vendors offering products. The wheels are turning and the M&A guys have already started earning money. HP bought Opsware in summer last year for $1.6 billion, BMC Software at about the same time acquired RealOps, the so-called “Run Book Automation pioneer”, for $52.5 million, and in March this year it acquired BladeLogic for an impressive price of more than $800 million. The latest news is CA signing an OEM agreement with Opalis, which leaves plenty of room for rumors. Happy merging.
The answer to the question of why companies are spending these huge amounts of money on automation technology comes from Bob Beauchamp, president and CEO of BMC, who said
“Organizations around the world will spend more than $140 billion this year running data centers, Automation is the only way IT can bring this spending under control and still meet the requirements of their businesses.”
Just a last word to think about: How much are companies spending for all the other stuff outside of datacenters?