Keeping systems alive - even during disasters (page 1 of 4)

Thursday, October 21, 2004 at 10:24

The previous article in this disaster recovery series discussed replication as a crucial building block for keeping data safe 24x7, minimising the risk of data loss even if a disaster occurs.

Why Availability Matters
Replication is the obvious next step to ensuring higher service levels after a solid backup plan has been deployed. But if backup is the building block of a disaster recovery strategy and replication is the next step with real-time data protection, where does system availability fit in?

"If the service centre stops functioning, we cannot serve our customers - which means that no newspapers would be produced. In other words, if we had a shutdown, it would paralyze 30 percent of the Norwegian press."
Gunstein Løken, Operations and Development Manager, Orkla Media Service Senter IT


Backup and replication both deal with minimising data loss, but neither can keep systems running during a disaster. For maximum system availability we need to look at other technologies. If backup is the only technology protecting a system, the only way to get back to business is to restore from disk or tape, which can take anywhere from a couple of hours to days or even weeks. An example of how crucial it is to keep systems alive comes from a few years back, when a large online brokerage company suffered four system outages within two months, resulting in a 22% dip in its stock price as customers lost faith in the company.

Minimise downtime


To minimise downtime as well as data loss, a combination of clustering and replication must be used. Clustering is simply the process of moving an application from a failed system, such as one hit by a disaster, to a working system, whether in the same data centre or in another location. This process can take anywhere from seconds to minutes.
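The idea of moving an application from a failed system to a working one can be sketched in a few lines of Python. This is only an illustration, not how any particular clustering product works: the `Node` class, its `healthy` flag (standing in for a real heartbeat check) and the `fail_over` function are all hypothetical names invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical cluster member; `healthy` stands in for a real heartbeat check."""
    name: str
    healthy: bool = True
    running: set = field(default_factory=set)


def fail_over(app: str, nodes: list[Node]) -> Node:
    """Stop `app` on every unhealthy node and start it on the first healthy one."""
    for node in nodes:
        if not node.healthy:
            node.running.discard(app)
    for node in nodes:
        if node.healthy:
            node.running.add(app)
            return node
    raise RuntimeError(f"no healthy node available for {app}")


primary = Node("primary", running={"database"})
standby = Node("standby")

primary.healthy = False  # simulate a disaster on the primary system
active = fail_over("database", [primary, standby])
print(active.name)  # standby
```

The standby could just as well represent a system in another data centre; the sketch assumes only that some healthy system is available to take over.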

What is a Cluster?
Before clustering emerged as a viable technology for keeping systems alive, users simply connected to a system, and if that system went down they could do nothing until it was fixed.

This also meant that if you were the administrator of an IT environment, you were held responsible for getting the system back up and running as quickly as possible, and everyone would hound you until it was fixed.

Figure 2. In the above environment, if the server went down the entire IT environment would be crippled and unavailable.

Though clustering had been available for many years on mainframes, it wasn't until the 1990s that it became widespread on open systems such as Windows, Unix and Linux. With clustering, IT administrators could secure access to systems and minimise downtime by having extra systems available in case of failure.

Figure 3. By having another system available to take over in case of system outages, downtime can be minimised. If one system fails, another takes over.

How does clustering work?
Clustering is by no means magic; it simply automates the process of rebuilding a system and starting an application on a standby system. Without clustering, rebuilding a system can take a very long time: first the operating system has to be installed, then the applications, then patches have to be downloaded and applied, then the system has to be configured, and so on.
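To make the difference concrete, the rebuild steps listed above can be written down as a simple checklist. The sketch below is illustrative only: it assumes (this is not stated in the article) that a standby system already has every rebuild step completed, and `REBUILD_STEPS` and `remaining_steps` are hypothetical names.

```python
from __future__ import annotations

# Rebuild steps as listed in the text; "start application" is the final step
# that has to happen in either case.
REBUILD_STEPS = [
    "install operating system",
    "install applications",
    "download and apply patches",
    "configure system",
]


def remaining_steps(completed: set[str]) -> list[str]:
    """Return the work still needed before the application is running again."""
    todo = [step for step in REBUILD_STEPS if step not in completed]
    return todo + ["start application"]


# A replacement system with nothing prepared must go through every step.
bare_system = remaining_steps(set())

# Assumption: a clustered standby was prepared in advance.
standby_system = remaining_steps(set(REBUILD_STEPS))

print(len(bare_system))   # 5
print(standby_system)     # ['start application']
```

Under that assumption, the only work left at failure time is starting the application, which is the part clustering automates.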