Showing posts with label emc. Show all posts

Sunday, August 12, 2007

BUSINESS CONTINUITY

One step beyond Disaster Recovery

I recently advised a medium-sized commercial bank in the Philippines about a stalled project to create a business continuity solution.

Financial institutions in the Philippines do not face the same data integrity and safety requirements that their counterparts do here in the U.S. Still, management knew that they had to improve their IT capabilities. Their primary data center is located in their head office, and its vulnerability surfaced at every coup attempt.

They learned about me from another client. Click here for that story.

The bank was trying to install an EMC Asynchronous SRDF solution.

I briefly worked for EMC U.S.A. as a systems engineer. I’m familiar with the product line and the subject of disaster recovery & business continuity in general.

Disaster Recovery (DR) aptly describes the process of recovering from a disaster.

DR can be illustrated with the knowledge that all hard drives crash. It’s not a question of “if,” but a question of “when.” When the drives of a “production box” crash, business grinds to a halt unless and until the data can be restored and the server restarted. The process of restoring the data and restarting the server is disaster recovery.
A “production box” is tech-speak for a computer server that’s serving a live network.
Operations can grind to a halt for any number of reasons. Fire, a software crash, human error, network failure, and a power blackout are common culprits.

DR planning begins by defining the maximum acceptable values of two factors. The first is called the Recovery Time Objective (RTO) and the second is the Recovery Point Objective (RPO).
RTO is the maximum amount of time you can take to recover your lost or damaged data and become operational again. Can your business tolerate being down for several days or several hours? Whether it’s days or hours, this figure is your RTO.
RPO, on the other hand, is the amount of data accumulated over time that you can tolerate losing. Can your business afford to lose a day’s worth of data? If so, then your data must be backed up on a daily basis. A retail operation, like a supermarket, that logs hundreds or thousands of transactions a day may require several backups made during the course of the day.
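The relationship between these two objectives and a backup schedule can be sketched in a few lines of Python. This is only an illustration: the function names and the four-hour supermarket RPO are assumptions made up for the example, not figures from any particular client.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a failure strikes just before the next backup,
    so the data lost equals one full backup interval."""
    return backup_interval <= rpo


def meets_rto(estimated_restore_time: timedelta, rto: timedelta) -> bool:
    """The time needed to restore and restart must fit within the RTO."""
    return estimated_restore_time <= rto


# A business that can afford to lose a day's worth of data is covered by daily backups.
print(meets_rpo(timedelta(days=1), rpo=timedelta(days=1)))

# A hypothetical supermarket with a four-hour RPO is not covered by daily backups.
print(meets_rpo(timedelta(days=1), rpo=timedelta(hours=4)))

# A six-hour restore fits within a hypothetical eight-hour RTO.
print(meets_rto(timedelta(hours=6), rto=timedelta(hours=8)))
```

The point of the sketch is that the RPO dictates how often you back up, while the RTO dictates how fast your restore process has to be.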
“Business Continuity” (BC) extends the scope of preparation, plans, and resources past DR. Those two factors, RTO and RPO, figure into this as well.
BC’s goal is to ensure the business will be able to continue operating through crises and disasters. Accomplishing that requires going beyond the processes and equipment for restoring data and replacing equipment. Indeed, BC refers to making plans and preparing resources that, among other things, will prevent the loss of data. It refers to advance preparation in order to cope with the unexpected.

A good BC plan has:
  1. identified the most likely disaster scenarios and their impact on the business;
  2. determined the “mission-critical,” important, and less-important processes, systems, and services of the company;
  3. established its priorities for supporting the mission-critical components;
  4. developed and implemented the most redundant and fault-tolerant system possible within its budget;
  5. several alternate strategies;
  6. taught and regularly practiced the plan with its people; and
  7. the continuing support of senior management.
“Mission-critical” is tech-speak for the most important processes, systems, and services that a business must have in order to fulfill its mission. What is a mission? For a hospital, it could be the 24/7 availability of patient information.

“Redundant” is tech-speak for a backup that can temporarily take the place of a failed primary system.

“Fault-tolerant” is tech-speak for the characteristic of being able to withstand glitches.

Certain industries and companies require uninterrupted IT services. For them, BC is mandatory. The airline industry and financial institutions are examples. The financial sector, in fact, has to follow stringent guidelines for protecting and maintaining the security of its data. These companies must have minimal downtime. How minimal?
A calendar year has 8,760 hours. To give you an idea of the pressure to perform, consider that a 99.9% uptime is “only” equivalent to 8,751 hours.
Imagine the trouble a bank would face if its nine non-operational hours occurred on the 15th. Employees would not receive their pay.
Even a 99.99% uptime keeps you operational for only about 8,759 hours of the year. That’s still almost an hour short of the goal!
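The arithmetic behind these uptime figures is easy to check. Here is a minimal Python sketch, assuming a non-leap year of 8,760 hours; the function name is made up for the example, and the 99.999% level is added only for comparison.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap calendar year


def downtime_hours(availability_pct: float) -> float:
    """Hours per year a system may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


for pct in (99.9, 99.99, 99.999):
    down = downtime_hours(pct)
    print(f"{pct}% uptime allows {down:.2f} hours "
          f"({down * 60:.0f} minutes) of downtime per year")
```

At 99.9% the allowance works out to 8.76 hours a year, and at 99.99% it shrinks to roughly 53 minutes.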
When the availability or integrity of data is compromised for any reason, businesses risk losing revenue and market share, experiencing decreased productivity, damaging their reputation, eroding their customers’ loyalty, and, in certain industries, being penalized for failing to comply with mandated regulations.

I enjoy BC planning because it's an activity that can incorporate numerous improvements for little or no additional cost. It's a rare opportunity to deliver a lot of added value beyond the client's initial expectations.

There are several ways to go with DR and BC. You can create it in-house or outsource some or all of its aspects.

I'll cover both, but the next entry will focus on the offerings of two established players in the field of storage, DR, and BC. These are the two I’m familiar with, EMC and NetApp.



Tuesday, July 17, 2007

BUSINESS CONTINUITY PLANNING

Planning takes four steps

It took a while, but business continuity planning (BCP) has finally become visible on the radar screen of managers and owners of smaller businesses (under $100 million in sales). It’s about time too. The state of the world today is far more volatile than it was a mere eight years ago. 9/11 did change everything.

Every organization should plan for its continued existence in the event of a major disruption. How will it continue to operate if its operation—and existence—is disrupted by any number of natural or man-made disasters?

The practice of Business Continuity Planning (BCP) has evolved into a recognized field. Job titles that carry or imply this area now exist. Practitioners can join any number of reputable associations that promote this field. Several recognized certifications can now be earned as well.

I had the good fortune of working as a Sales Systems Engineer for the world’s largest enterprise storage vendor just before the dot com crash. I’m referring to EMC, the 800-pound gorilla of the enterprise storage space. At that time, the basic rationale behind EMC’s fabulously expensive SRDF (Symmetrix Remote Data Facility) was real-time replication for disaster recovery (DR). Under the proper guidance, it can be a short leap from DR to BCP. And that is where SRDF is now positioned—as the lynchpin of the data side of business continuity planning.

The mission of a Systems Engineer who works in Sales is to support his sales reps by designing the storage and DR solutions for customers and prospects alike. The task of dealing with the technical aspect of any proposal or project falls to him. This frequently involves making technical presentations for prospects and serving as the single point of contact for existing customers that are contemplating system upgrades.

Disaster recovery (DR) is a subset of the BC solution. Many fine definitions of the term abound so rather than reinvent the wheel, I will quote some of the better ones. Disaster recovery is:

  • the process, policies and procedures of restoring operations that are critical to the resumption of business [Wikipedia].
  • the ability of an organization to respond to a disaster or an interruption in services by implementing a disaster recovery plan to stabilize and restore the organization’s critical functions. [Disaster Recovery Journal].

Wikipedia goes on to say that…

  • a disaster recovery plan (DRP) should include plans for coping with the unexpected or sudden loss of communications and/or key personnel, although these are not covered in this article, the focus of which is data protection. Disaster recovery planning is part of a larger process known as business continuity planning (BCP).

Disaster Recovery Journal continues as well…

  • The management approved document that defines the resources, actions, tasks and data required to manage the technology recovery effort. Usually refers to the technology recovery effort. This is a component of the Business Continuity Management Program.

The two sources share a common thread: both refer to business continuity planning and place disaster recovery within its larger scope.

I will continue this in a subsequent post. For now, let me break down the steps that BCP entails. The process follows these four steps in a logical sequence.

Identification

Identify the risks and hazards that confront your business. These can be natural hazards, e.g., flooding and earthquakes, or man-made risks, e.g., a power outage, theft, fire, or an attack against your computer network. Obviously you have to draw the line at some point, since some risks are impractical to anticipate regardless of their severity. For example, two key members of an SAP implementation project I participated in met with a fatal accident. That incident delayed a major portion of the project until replacement personnel were hired.

Assessment

It is possible to quantitatively and qualitatively determine the likelihood, magnitude, and duration of the identified risks. Assessing risks this way allows you to prioritize them. When risks are categorized this way, you can budget your resources more rationally.
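One simple way to do the quantitative part is to score each risk by its expected annual loss and sort by that score. The Python sketch below assumes that approach; the class, the field names, and every figure in it are hypothetical placeholders, not data from a real assessment.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: float     # estimated chance of occurring in a given year (0 to 1)
    magnitude: float      # estimated cost per day of disruption
    duration_days: float  # estimated length of the disruption

    @property
    def expected_loss(self) -> float:
        """Expected annual loss: likelihood x daily cost x duration."""
        return self.likelihood * self.magnitude * self.duration_days


# Illustrative numbers only.
risks = [
    Risk("Power outage", likelihood=0.5, magnitude=20_000, duration_days=1),
    Risk("Flooding", likelihood=0.1, magnitude=50_000, duration_days=5),
    Risk("Network attack", likelihood=0.2, magnitude=30_000, duration_days=2),
]

# Highest expected loss first, so the budget goes to the biggest exposure.
ranked = sorted(risks, key=lambda r: r.expected_loss, reverse=True)

for risk in ranked:
    print(f"{risk.name}: expected annual loss {risk.expected_loss:,.0f}")
```

A score like this only covers what can be estimated in numbers. The qualitative side, such as damage to reputation, still calls for judgment.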

Plan Development

You now have the information to create the plans and procedures for preparing your organization to respond to and recover from interruptions. This is a high-level step and, as the saying goes, the devil is in the details. This is where senior management, which should have initiated this project to begin with, should return and visibly support the BCP team. The team will need time to discuss the risks and possible solutions extensively with functional heads. Without that support, the team will find it difficult to get the attention of the functional heads, much less their wholehearted cooperation.

Exercise

In this final step you must exercise the plan. This is the only way to learn what works and what does not. Needless to say, this is another step that senior management must support. Exercising the plan is a continuing activity. In fact, this entire process is performed iteratively. Exercising the BC plans will refine those plans and, more importantly, teach the employees how to respond if and when the real event happens.




Saturday, July 14, 2007

PROCUREMENT & CONTRACT MANAGEMENT

A best practice that should be used for selecting a business continuity solution.




IT managers have a plethora of solutions to choose from depending upon technical factors, business objectives, and, of course, available budget.


Providing continuous business operations is daunting. This is one of those tasks that have become more difficult because the number and kind of choices have multiplied. When I was growing up, it wasn’t difficult to brush my teeth. I would just pick up my toothbrush and go. When my kids were growing up, it was difficult because they each had at least four toothbrushes to pick from.

“Which color tonight?” “Elmo or Cookie Monster?” Similarly, “IBM or EMC?”


There’s a configuration available for every budget and level of protection. Interoperability, or the product’s capability of working with other manufacturers’ products, is less of a concern today since practically all products, hardware or software, are interoperable. On the one hand, that’s good, but on the other, it further complicates the decision tree.

“We’re an IBM shop; should we stay with IBM?” “What about our branch offices? Will a NetApp-Symantec solution suffice?”

When I’m grocery shopping, I don’t need assistance in deciding which coffee to buy. I have my regular brand and if a competing brand is on sale I know enough to decide whether to try the one on sale or not. That’s not the situation here. The “correct” solution is a decision that my organization will have to live with for years to come.

This is where I use the formal contract management process. Making sense of the buying complexity led me to learn and use it. Procurement and contract management is a subject in itself. I'll cover that in a future article.

I use procurement and contract management as a best practice. I use it since this is not a cursory purchase. It’s not wise to simply request any three or four vendors to make their presentations and submit proposals.

Procurement and contract management has three major phases. I call the first the Pre-award Phase (yes, I know it’s imaginative). As the buyer, I have to:

1. Plan the procurement.

2. Plan the solicitations.

3. Request the solicitations.


PROCUREMENT PLANNING

Procurement planning, the first activity, determines what to procure and when. It includes the people aspect. Do I plan to hire or train existing employees?


SOLICITATION PLANNING

Solicitation planning, the second activity, fills in the details. I develop a Statement of Work (SOW) that contains my requirements. This is more difficult than it appears. I have to understand my own requirements, and understanding them demands purposeful and far-ranging analysis. Then I need to communicate those requirements using specific deliverables in the SOW. This purchase is inevitably going to be delivered through a project process. It’s very unlike a server equipment purchase. The solution will be delivered through a project because it requires the vendor to coordinate closely with the client to plan and configure the solution before the cutover.

REQUESTING SOLICITATIONS

The third activity is requesting solicitations through a Request for Proposal (RFP). This is where my thoroughness in the first two activities, procurement planning and solicitation planning, pays off. I’ll be creating an RFP that clearly communicates my needs. I must provide an accurate, well-defined RFP in order to receive useful bids. By useful, I refer to bids that meet most of my requirements. Let’s face it. Before I settle on a vendor, I’ll sit down with the finalists to review, clarify, and negotiate the final contract. If I wrote a poor-quality RFP, I’ll receive poor-quality bids. And I’ll have more work, and consume more time, when I sit down with the finalists. I’ll need to redefine my requirements and then ask the vendors that are still interested to submit bids again.

