Intelligent Control Loops
Self-management is a key component of autonomic computing. Conceptually similar to network management, this requires the use of control loops that do the following:
- Collect information from the system components.
- Collect information from the actors.
- Analyze the information.
- Make decisions based on the analysis.
- Adjust the system components as necessary.
- Inform the actors concerning changes.
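The steps above can be sketched as a small Python class. This is only an illustration under stated assumptions: the Component type, the 0.8 load threshold, and the "add_capacity" action are hypothetical names invented for the example, and collecting input from actors is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical managed component with one tunable setting."""
    name: str
    load: float          # observed utilization, 0.0 to 1.0
    capacity: int = 1    # the knob the loop is allowed to adjust


@dataclass
class ControlLoop:
    components: list
    max_load: float = 0.8                        # assumed policy threshold
    notices: list = field(default_factory=list)  # messages sent to actors

    def collect(self):
        # Collect information from the system components.
        return {c.name: c.load for c in self.components}

    def analyze(self, readings):
        # Analyze: find components whose load exceeds the threshold.
        return [name for name, load in readings.items() if load > self.max_load]

    def decide(self, overloaded):
        # Decide: pick an action for each overloaded component.
        return {name: "add_capacity" for name in overloaded}

    def adjust(self, actions):
        # Adjust: apply the chosen actions; load spreads over the new capacity.
        for c in self.components:
            if actions.get(c.name) == "add_capacity":
                c.capacity += 1
                c.load = c.load * (c.capacity - 1) / c.capacity

    def inform(self, actions):
        # Inform the actors concerning changes.
        for name, action in actions.items():
            self.notices.append(f"{name}: {action}")

    def run_once(self):
        readings = self.collect()
        actions = self.decide(self.analyze(readings))
        self.adjust(actions)
        self.inform(actions)
        return actions


loop = ControlLoop([Component("router-a", 0.95), Component("router-b", 0.40)])
actions = loop.run_once()
```

In a real system, run_once would be called repeatedly, on a timer or in response to notifications.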
Let's take an example from telecom network problem management. The following series of steps illustrates the business process flow that results from a network error, such as one or more link failures.
- Network fault
- Fault resolution
- Create ticket
- Root cause analysis
- Initiate workflow
- Repair fault
- Close ticket
The first stage in the process might result in an update to a GUI in a network monitoring center. Alternatively, a support technician might be paged. One way or another, someone gets contacted.
The next step is an effort to resolve the problem. The fault might be intermittent in nature, or it might have been caused by an errant digger. Assuming the latter, a trouble ticket is created. At this stage, an effort is made to determine the root cause of the problem. Bear in mind that an optical fiber cut may result in a huge number of alarms from the network, so it can take time to get to the bottom of the deluge of management data. We're assuming that a fiber cut has occurred, so it becomes necessary to initiate and track a specific workflow. This will typically require personnel to go to the fault site and carry out repairs. Once this is done, the trouble ticket can be closed.
The business process described above can be mapped into a control loop in broad terms as follows:
- Network fault -> Collection
- Root cause analysis -> Analysis
- Initiate workflow -> Make decisions
- Repair fault -> Adjust the system components
- Inform actors -> Communicate updates to the users (e.g., via email, pager, or text message)
The important point is that most business processes can be resolved into control loops similar to the above. As mentioned previously, this allows for automation.
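As a rough sketch, the mapping can be written down as a lookup table in Python. The Stage enum and the split_steps helper are hypothetical names used only for illustration. Process steps that the mapping does not cover (the ticketing steps, for instance) are reported separately rather than forced into a stage.

```python
from enum import Enum


class Stage(Enum):
    COLLECT = "Collection"
    ANALYZE = "Analysis"
    DECIDE = "Make decisions"
    ADJUST = "Adjust the system components"
    INFORM = "Communicate updates to the users"


# The mapping as given in the text.
PROCESS_TO_STAGE = {
    "Network fault": Stage.COLLECT,
    "Root cause analysis": Stage.ANALYZE,
    "Initiate workflow": Stage.DECIDE,
    "Repair fault": Stage.ADJUST,
    "Inform actors": Stage.INFORM,
}

# The business process steps from the telecom example.
FAULT_PROCESS = [
    "Network fault",
    "Fault resolution",
    "Create ticket",
    "Root cause analysis",
    "Initiate workflow",
    "Repair fault",
    "Close ticket",
]


def split_steps(steps):
    """Return (mapped, unmapped): mapped is a list of (step, Stage) pairs."""
    mapped = [(s, PROCESS_TO_STAGE[s]) for s in steps if s in PROCESS_TO_STAGE]
    unmapped = [s for s in steps if s not in PROCESS_TO_STAGE]
    return mapped, unmapped


mapped, unmapped = split_steps(FAULT_PROCESS)
```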
Other Control Loops
We're all used to control loops in our everyday lives: heating/cooling systems, any timer-based electronic device, fuel injection systems, etc. There is always a danger associated with closed-loop systems: instability. This is a well-known problem in automatic control theory, in which a controlled element (such as a heating system) starts to oscillate uncontrollably. This problem presents an important challenge to the concept of autonomic computing.
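A toy simulation in Python can make the instability concrete. It assumes a deliberately simplified discrete-time proportional controller, not a model of any real heating system: at each step the controller corrects the value by gain times the current error. The error is then multiplied by (1 - gain) each step, so it shrinks when the gain is between 0 and 2 and grows with alternating sign when the gain exceeds 2. The function name and the numbers are illustrative assumptions.

```python
def simulate(gain, target=20.0, start=15.0, steps=8):
    """Discrete proportional control: each step corrects by gain * error."""
    value = start
    history = [value]
    for _ in range(steps):
        error = target - value
        value += gain * error   # the controller's correction
        history.append(value)
    return history


stable = simulate(gain=0.5)     # error halves every step
unstable = simulate(gain=2.5)   # overshoots further on every step
```

With the lower gain the value settles near the target. With the higher gain it swings above and below the target by a growing amount, which is the oscillation described above.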
Autonomic Computing Components
Two major autonomic components are involved in the control loops:
- Autonomic managers
- Managed elements
Figure 1 illustrates these components. Two entities are common to both the autonomic manager and the managed elements: sensors and effectors. Sensors are used to collect data concerning the state of a given element in either of two ways: polling (or explicit "gets") or on a notification basis. Effectors provide a means of modifying the state or configuration of an element. Taken together, sensors and effectors provide a manageability interface.
Figure 1. Autonomic components: Manager and managed element
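A manageability interface of this kind might look like the following Python sketch. The ManagedElement class and its get, subscribe, and set methods are hypothetical names chosen for the example and are not taken from any IBM API. The sensor side supports both polling and notification, and the effector side changes the element's state.

```python
from typing import Callable, Dict, List

Listener = Callable[[str, str, float], None]


class ManagedElement:
    """Hypothetical managed element exposing a sensor and an effector."""

    def __init__(self, name: str, state: Dict[str, float]):
        self.name = name
        self._state = dict(state)
        self._listeners: List[Listener] = []

    # Sensor, style 1: polling (an explicit "get").
    def get(self, key: str) -> float:
        return self._state[key]

    # Sensor, style 2: notification; listeners are called on each change.
    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    # Effector: modify the state or configuration of the element.
    def set(self, key: str, value: float) -> None:
        self._state[key] = value
        for listener in self._listeners:
            listener(self.name, key, value)


events = []
link = ManagedElement("link-1", {"bandwidth_mbps": 100.0})
link.subscribe(lambda name, key, value: events.append((name, key, value)))
link.set("bandwidth_mbps", 200.0)      # effector call triggers a notification
polled = link.get("bandwidth_mbps")    # polling reads the new value
```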
The autonomic manager implements the control loop and consists of four parts, all of which share a common knowledge base. In the order in which the loop runs, the four parts are:
- Monitor: Collects, aggregates, filters, and reports data from a managed element.
- Analyze: Correlates collected data.
- Plan: Specifies the actions needed to achieve specific goals in line with business policies.
- Execute: Allows for changes to be made to a managed element in conjunction with a plan.
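The four parts and their shared knowledge base can be sketched in Python as follows. This is a minimal illustration, not IBM's implementation: the AutonomicManager class, the latency metric, the policy threshold, and the "add_worker" action are all assumptions made up for the example. The knowledge base is a plain dictionary that every part reads from or writes to.

```python
class AutonomicManager:
    """Hypothetical manager: four parts sharing one knowledge dictionary."""

    def __init__(self, read_sensor, apply_effector, knowledge):
        self.read_sensor = read_sensor        # callable returning a metrics dict
        self.apply_effector = apply_effector  # callable taking one action dict
        self.knowledge = knowledge            # shared by all four parts

    def monitor(self):
        # Collect a sample from the managed element and record it.
        sample = self.read_sensor()
        self.knowledge.setdefault("samples", []).append(sample)
        return sample

    def analyze(self):
        # Correlate recent samples; report whether the policy is violated.
        recent = self.knowledge["samples"][-3:]
        avg = sum(s["latency_ms"] for s in recent) / len(recent)
        self.knowledge["avg_latency_ms"] = avg
        return avg > self.knowledge["policy"]["max_latency_ms"]

    def plan(self, violated):
        # Specify the actions needed to get back within policy.
        return [{"action": "add_worker", "count": 1}] if violated else []

    def execute(self, plan):
        # Apply each planned action through the effector and log it.
        for step in plan:
            self.apply_effector(step)
        self.knowledge.setdefault("executed", []).extend(plan)

    def run_once(self):
        self.monitor()
        self.execute(self.plan(self.analyze()))


applied = []
manager = AutonomicManager(
    read_sensor=lambda: {"latency_ms": 250.0},
    apply_effector=applied.append,
    knowledge={"policy": {"max_latency_ms": 200.0}},
)
manager.run_once()
```

Keeping the policy in the knowledge base rather than in the code means a business rule can change without touching the loop itself.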
IBM maintains an architectural blueprint for autonomic computing in its Thomas J. Watson Research Center in Hawthorne, New York.