Overview of Risk Management Process Part 2 of 2

Step 2 - Analyzing and Prioritizing Risks

Risk analysis builds on the risk information generated in the identification step, converting it into decision-making information. In the analyzing step, three more elements are added to the risk's entry on the master risks list: the risk's probability, impact, and exposure. These elements allow operations staff to rank risks, which in turn allows them to direct the most energy into managing the list of top risks.

Risk Probability

Risk probability is a measure of the likelihood that the consequences described in the risk statement will actually occur and is expressed as a numerical value. Risk probability must be greater than zero, or the risk does not pose a threat. Likewise, the probability must be less than 100 percent, or the risk is a certainty; in other words, it is a known problem.

The following table demonstrates an example of a three-value division for probabilities.

Click here for table showing Risk Probability Ranges
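
As a rough illustration of the rules above, the following Python sketch rejects probabilities of 0 or 100 percent and maps the rest to a three-value score. The function names and the cutoff values (0.33 and 0.67) are assumptions made for this example; they are not taken from the Risk Probability Ranges table.

```python
def validate_probability(p: float) -> float:
    """Return p unchanged if it is strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        # 0 means no threat; 1 means a known problem rather than a risk.
        raise ValueError("probability must be strictly between 0 and 1")
    return p


def probability_score(p: float, low_cutoff: float = 0.33,
                      high_cutoff: float = 0.67) -> int:
    """Map a probability to 1 (low), 2 (medium), or 3 (high).

    The cutoffs are illustrative assumptions, not values from the source.
    """
    validate_probability(p)
    if p < low_cutoff:
        return 1
    if p < high_cutoff:
        return 2
    return 3
```

An organization would replace the cutoffs with the ranges it has agreed on for its own master risks list.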

Risk Impact

Risk impact is an estimate of the severity of adverse effects, the magnitude of a loss, or the potential opportunity cost should a risk be realized. Risk impact should be a direct measure of the risk consequence as defined in the risk statement. It can either be measured in financial terms or with a subjective measurement scale. If all risk impacts can be expressed in financial terms, use of financial value to quantify the magnitude of loss or opportunity cost has the advantage of being familiar to business sponsors. The financial impact might be long-term costs in operations and support, loss of market share, short-term costs in additional work, or opportunity cost.

The best way to estimate losses is with a numeric scale: the larger the number, the greater the impact to the business. As long as all risks within a master risks list use the same units of measurement, simple prioritization techniques will work. It is helpful to create translation tables to convert specific units such as time or money into values that can be compared to the subjective units used elsewhere in the analysis, as illustrated in the following table. This particular table is a logarithmic transformation in which the score is roughly equal to log10(loss in dollars) - 1.

High values indicate serious loss. Medium values show partial loss or reduced effectiveness. Low values indicate small or trivial losses. The scoring system for estimating monetary loss should reflect the organization's values and policies. A $10,000 monetary loss that is tolerable for one organization may be unacceptable for another.

Click here for Example of a Translation Table
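
The logarithmic translation described above can be sketched in a few lines of Python. The function name is hypothetical, and rounding to the nearest whole score and clamping to a minimum of 1 are assumptions added for this example; the text only says the score is roughly log10 of the loss minus 1.

```python
import math


def impact_score_from_loss(loss_dollars: float) -> int:
    """Translate a monetary loss into an impact score of about log10(loss) - 1."""
    if loss_dollars <= 0:
        raise ValueError("loss must be a positive amount")
    raw = math.log10(loss_dollars) - 1
    # Assumption: round to the nearest whole score and never go below 1.
    return max(1, round(raw))
```

Under these assumptions a $100 loss scores 1, a $10,000 loss scores 3, and a $1,000,000 loss scores 5, so each tenfold increase in loss adds one point.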

When monetary losses cannot be easily calculated, it may be possible to develop alternative scoring scales for impact that capture the appropriate services affected. The following table illustrates a simple example.

Click here for Example Alternative Scoring Scale

Risk Exposure

Risk exposure measures the overall threat of the risk, combining the likelihood of actual loss (probability) with the magnitude of the potential loss (impact) into a single numeric value. In the simplest form of quantitative risk analysis, risk exposure is calculated by multiplying risk probability by impact.

Exposure = Probability x Impact

Sometimes a high-probability risk has low impact and can be safely ignored; sometimes a high-impact risk has low probability and can be safely ignored. The risks that have high probability and high impact are the ones most worth managing, and they are the ones that produce the highest exposure values.
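
The exposure formula and the resulting ranking can be illustrated with a minimal Python sketch. The Risk class, its field names, and the sample risks in the usage note are all hypothetical; the only part taken from the text is Exposure = Probability x Impact.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # greater than 0 and less than 1
    impact: float       # same unit of measurement for every risk in the list

    @property
    def exposure(self) -> float:
        # Exposure = Probability x Impact
        return self.probability * self.impact


def rank_by_exposure(risks: list[Risk]) -> list[Risk]:
    """Return the risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)
```

For example, ranking Risk("disk failure", 0.6, 8), Risk("power outage", 0.1, 9), and Risk("typo in report", 0.9, 1) places the disk failure first, because it is the only one with both a high probability and a high impact.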

When scores are used to quantify probability and impact, it is sometimes convenient to create a matrix that considers the possible combinations of scores and then assigns them to low-risk, medium-risk, and high-risk categories. With a three-value score for probability and impact, where 1 is low and 3 is high, the possible results may be expressed as a table in which each cell is a possible value for risk exposure. In this arrangement, it is easy to classify risks as low, medium, or high depending on their position within the table.

The following table is an example showing probability and impact.

The advantage of this tabular format is that it is easy to understand through its use of colors (red for the high-risk zone in the upper-right corner, green for low risk in the lower-left corner, and yellow for medium risk along the diagonal). It also uses a well-defined terminology: "High risk" is easier to comprehend than "high exposure."
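
A matrix of this kind can be approximated in code. The Python sketch below multiplies two scores from 1 to 3 and assigns the product to a zone. The cutoffs (up to 2 is low, 3 or 4 is medium, 6 or 9 is high) are an assumption chosen so that the medium zone falls along the diagonal as described; the original table may draw its boundaries differently.

```python
def classify_exposure(probability_score: int, impact_score: int) -> str:
    """Classify a 1-3 probability score and a 1-3 impact score as low, medium, or high."""
    for score in (probability_score, impact_score):
        if score not in (1, 2, 3):
            raise ValueError("scores must be 1, 2, or 3")
    exposure = probability_score * impact_score  # possible values: 1, 2, 3, 4, 6, 9
    # Assumed zone boundaries for illustration only.
    if exposure <= 2:
        return "low"
    if exposure <= 4:
        return "medium"
    return "high"
```

With these boundaries, a risk scored 1 for probability and 3 for impact lands in the medium zone, while any risk scored at least 2 on one axis and 3 on the other lands in the high zone.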

Risk analysis provides a prioritized risk list to guide IT operations in risk planning activities. Within the MOF Risk Management Discipline, this is called the master risks list (described previously in Risk Lists). Detailed risk information, including condition, context, root cause, and the metrics used for prioritization (probability, impact, exposure), is often recorded for each risk in the risk statement form.

Best Practices

These best practices will be beneficial during the risk analysis and prioritization step of the risk management process.

Risk Factor Charts

A risk factor chart helps the group quickly determine the exposure it faces for all general categories of risk. One line of such a chart might look like the row in the following table.

Click here for Table of Example Risk Factor Chart

Settle Differences of Opinion

It is unlikely that all IT operations staff will agree on risk ranking because staff members with different experiences or viewpoints will rate probability and impact differently. To maintain objectivity in the discussion and to limit arguments, be sure to decide as a group how to resolve these differences before starting this step. Options include a majority-rule vote, picking the worst-case estimate, or siding with the person who has the longest experience dealing with the situation in which the risk event actually occurs.
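
Two of these options are simple enough to sketch in Python. The function names are hypothetical. The vote shown here picks the most common estimate, and breaking a tied vote in favor of the higher score is an assumption added for this example, since the text does not say how ties should be handled.

```python
from collections import Counter


def majority_vote(estimates: list[int]) -> int:
    """Return the most common estimate among the staff members."""
    counts = Counter(estimates)
    top = max(counts.values())
    # Assumption: on a tied vote, take the higher (more cautious) score.
    return max(score for score, n in counts.items() if n == top)


def worst_case(estimates: list[int]) -> int:
    """Return the highest estimate any staff member gave."""
    return max(estimates)
```

If three staff members rate a risk's impact as 2, 2, and 3, the vote yields 2 and the worst-case rule yields 3. Whichever rule is chosen, the point of the practice is to agree on it before the scoring begins.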

Measure Financial Impact

It is often helpful to roughly estimate impact in financial terms and record this in addition to the impact's numeric estimate. If several risks have the same exposure value, then the financial estimate can help determine which one is most important. Also, the financial data helps in the planning step to ensure that the cost of preventing a risk is lower than the cost of incurring the consequences.

It might seem that the financial estimate is preferable and could be used in place of a numeric value. In practice, however, financial impact values tend to be a much more labor-intensive way to produce the same top risks list.

If you decide to use a monetary scale for impact, use it for all risks. If a particular risk's impact uses a numeric scale and another's impact uses a monetary scale, then the two cannot be compared to each other, so there is no way to rank one over the other.
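
The two uses of the financial estimate described in this section, breaking ties between equal exposure values and checking that prevention costs less than the consequences, might be sketched as follows. The dictionary keys and function names are hypothetical and chosen only for this example.

```python
def prioritize(risks: list[dict]) -> list[dict]:
    """Order risks by exposure, using the financial estimate to break ties."""
    return sorted(
        risks,
        key=lambda r: (r["exposure"], r["financial_estimate"]),
        reverse=True,
    )


def prevention_is_worthwhile(prevention_cost: float,
                             financial_estimate: float) -> bool:
    """True when preventing the risk costs less than incurring its consequences."""
    return prevention_cost < financial_estimate
```

Note that every risk passed to prioritize must use the same impact scale when its exposure is computed; the financial estimate is recorded in addition to that score, not in place of it.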

Perform a Business Impact Analysis

You should perform a business impact analysis, for example by using a questionnaire in which the users of the service estimate the importance and impact of service outages. This can help IT understand the service's perceived value, and this might be a factor to consider when ranking risks.

Record the Impact's Classification

Some IT groups find it useful to categorize the nature of the impact, such as security, capital expenditure, legal, labor, and so on.

Step 3 - Planning and Scheduling Risk Actions

Planning and scheduling risk actions is the third step in the risk management process. The planning activities carried out by IT operations translate the prioritized risks list into action plans. Planning involves developing detailed strategies and actions for each of the top risks, prioritizing risk actions, and creating an integrated risk management plan. Scheduling involves the integration of the tasks required to implement the risk action plans into day-to-day operations activities by assigning them to individuals or roles and actively tracking their status.

Planning Activities

When developing plans for reducing risk exposure:

Focus on high-exposure risks.
Address the condition to reduce the probability.
Look for root causes as opposed to symptoms.
Address the consequences to minimize the impact.
Determine the root cause, then look for similar situations in other areas that may arise from the same cause.
Be aware of dependencies and interactions among risks.

During risk action planning, IT operations should consider these six points when formulating risk action plans:

Research

Much of the risk that is present in IT operations is related to the uncertainties surrounding incomplete information. Risks that are related to lack of knowledge may often be resolved or managed most effectively by learning more before proceeding.

Accept

Some risks are such that it is simply not feasible to intervene with effective preventative or corrective measures; IT elects to simply accept the risk in order to realize the opportunity. Acceptance is not a "do-nothing" strategy, and the plan should include development of a documented rationale for accepting the risk but not developing mitigation or contingency plans.

It is prudent to continue monitoring such risks through the IT life cycle in the event that changes occur in probability, impact, or the ability to perform preventative or contingency measures related to this risk. For example, a data center may need to temporarily house servers in a basement room that is at risk of flooding. There may be no alternative location available given the heat and power requirements. Mitigation or risk transfer would be too expensive and cause too much disruption. In such a case and given the fact that flooding has never occurred before, it may be justifiable to accept the risk and monitor the situation.

Avoid

Risk avoidance prevents IT from taking actions that increase exposure too much to justify the benefit. An example is upgrading a rarely used application on all 50,000 desktops of an enterprise. In most cases, the benefit does not justify the exposure, so IT avoids the risk by not upgrading the application.

Transfer

Whereas the avoidance strategy eliminates a risk, the transference strategy often leaves the risk intact but shifts responsibility for it elsewhere. Examples where risk is transferred include:

Insurance.
Using external consultants with greater expertise.
Purchasing a solution instead of building it.
Outsourcing services.

Risk transfer does not mean risk elimination. In general, a risk transfer strategy will generate risks that still require proactive management, but reduce the level of risk to an acceptable level. For example, a company with an e-commerce site might outsource credit verification to another company. The risks still exist, but they become the outsource partner's responsibility. However, if the outsource partner is better able to perform credit verification, then transferring the risks can also reduce them.

Mitigation

While the goal of risk avoidance is to evade activities or situations having unacceptable risk, risk mitigation planning involves performing actions and activities ahead of time to either prevent a risk from occurring altogether or to reduce the impact or consequences of its occurring. For example, using redundant network connections to the Internet reduces the probability of losing access by eliminating the single point of failure.

It is vitally important to assign an owner to every mitigation plan, and it is helpful to define the plan's milestones in order to track its progress and its success metrics.

Not every risk has a reasonable and cost-effective mitigation strategy. In cases where a mitigation strategy is not available, it is essential to consider effective contingency planning instead.

Contingency

Risk contingency planning involves creating one or more fallback plans that can be activated in case efforts to prevent the adverse event fail. Contingency plans are necessary for all risks, including those that have mitigation plans. They address what to do if the risk occurs and focus on the consequence and how to minimize its impact. Often IT can establish triggers for the contingency plan based on the type of risk or the type of impact that will be encountered.

Triggers are indicators that tell IT a condition is about to occur, or has occurred, and therefore it is time to put the contingency plan into effect. Ideally, the trigger becomes true before the consequences occur. It may help to think of triggers as warning lights that light up while there is still time to avoid danger. For example, if the condition is that the server runs out of hard disk space, the trigger might be that the server's disk has reached 80 percent of its capacity and is showing an upward trend.
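
The disk space trigger in this example could be checked with a short Python function. The name disk_trigger_fired is hypothetical, and treating "an upward trend" as "the latest sample is higher than the oldest sample in the window" is a simplifying assumption; a real monitoring tool would likely use a more careful trend estimate.

```python
def disk_trigger_fired(usage_samples: list[float],
                       threshold: float = 0.80) -> bool:
    """Check the trigger against recent disk usage readings.

    usage_samples holds the fraction of capacity in use, oldest reading first.
    """
    if len(usage_samples) < 2:
        return False  # not enough data to judge a trend
    latest = usage_samples[-1]
    # Assumption: usage is trending upward if it is higher now than at the
    # start of the sampling window.
    trending_up = latest > usage_samples[0]
    return latest >= threshold and trending_up
```

Readings of 70, 75, and 82 percent fire the trigger, while readings of 85, 83, and 81 percent do not, because usage is above the threshold but falling.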

In some cases, the triggers may be date-driven. For example, if the condition is that a newly ordered server might not arrive in time to support the launch of a mission-critical application, a trigger might be set for the latest date on which the server could safely arrive. If the server does not arrive in time and the trigger becomes true, one contingency plan might be to make use of an existing server from a less-critical service.
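
A date-driven trigger like the one for the late server can be expressed as a simple comparison. In this Python sketch the function name, the parameter names, and the dates in the usage note are all hypothetical.

```python
from datetime import date


def date_trigger_fired(latest_safe_arrival: date, today: date,
                       arrived: bool) -> bool:
    """True when the safe arrival date has passed and the server is still missing."""
    return (not arrived) and today > latest_safe_arrival
```

For instance, with a latest safe arrival date of June 1, the trigger is false on June 1 and becomes true on June 2 if the server has still not arrived, which is the signal to activate the contingency plan.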

Best Practice

This best practice will be beneficial during the risk action planning step.

Prioritize

A mitigation plan might have several actions, and the sequence might affect the mitigation's success at reducing, avoiding, or transferring the risk, so it is important to prioritize the steps in this plan.

A contingency plan essentially describes how to shift away from normal operations when a condition occurs. Especially if the consequences disrupt many services, it may be valuable to bring some services back online first. Agree beforehand on the order in which to restore service, and decide how long each part can be offline.


Dr. NICHOLAS WATSON

PhD; M.Ed; B.Ed