Planning for Disaster Recovery on LAMP Systems
by Robert Jones
I make my living building custom databases with web interfaces for
biotechnology companies. These are MySQL and Perl CGI applications, running
under Linux, and every one of them is different. Disaster recovery planning for
these applications has consisted of routine tape backups of all the software
and data, a bunch of
ReadMe files, and having me around to put the
pieces back together if something breaks—and things do break... power
supplies, disk drives, RAID controllers, you name it. Recovery means we fix or
replace the hardware, reinstall Linux, restore the apps from tape, and then
stitch everything back together. Some recoveries have been easy. Others have
involved pacing back and forth and swearing for hours on end while figuring out
how the heck I had all this working in the first place. Not pretty, but that's
just what you do, right?
That's fine in some situations, but in larger companies, especially those with a formal "Corporate IT" group, this approach just doesn't cut it. There is a clash of cultures here that many of us have to face as our startup companies reach a certain size. All of a sudden we find ourselves spending way too much time in meetings, drawing up formal specifications and policies for this, that, and the other. Disaster recovery planning is one of the first of these efforts that most of us have had to deal with. Don't get me wrong, disaster recovery is a critical issue. It's just that it can be a very painful process for those of us who come from an informal development background.
I went through this last year when the CIO at one of my clients brought in outside consultants to formulate their disaster recovery plan. Their first step was to ask what the name of my executable was and where the installation script was located... OK... bit of a problem there. I have 54 Perl CGI scripts in one application alone, nine applications, and no installation scripts for any of them. Eventually, I understood what they really wanted to know, as opposed to what they asked in the questionnaire that they gave me to fill out. I viewed the process as an opportunity, rather than a hassle, and reviewed how I had my applications set up from their perspective. I had to make some changes to the software but, more importantly, I came up with an approach that I now use with all my projects to design in disaster recovery right from the start. I know I'm not the only one dealing with this issue so here are some ideas that you might want to build into your apps.
The (Configuration) Problem
The problem with our sort of database applications is that they weave themselves into the Linux system configuration. We add definition blocks to the configuration files for Apache, MySQL, Samba, et al. We create system-wide environment variables in user shells and insert symbolic links into the filesystem. Every time we rebuild a system we have to make the configuration changes anew. The potential for error is large, even assuming we remember all the steps.
On top of that, we need to deal with software dependencies. Perl modules are a godsend, but most of the sophisticated modules make extensive use of other modules. When we use these, we inherit a hierarchy of dependencies that can make installation and recovery even more challenging and error prone. We don't want to give up the things that make our mode of development so productive, but we do need to understand and manage these dependencies. Our goals in designing for disaster recovery should be to keep things simple, to understand where dependencies exist, and to limit those where appropriate.
Separate the Application from the System
I place all my application software and data on its own partition,
preferably on a separate disk. You want to be able to restore the application
onto any suitable Linux system without regard for how that system is
configured. Don't use
/usr/local. You don't want your software anywhere near anything
that the system or any other package might install. My preference is to create
a partition called
/proj on its own disk with each application in
its own directory under there.
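To make the layout concrete, here is a minimal sketch. The application names are invented for illustration, and a scratch directory stands in for the real mount point so the commands can run anywhere; in production, /proj would be the mount point of its own partition, recorded in /etc/fstab.

```shell
# Sketch of the /proj layout: one directory per application.
# PROJ points at a scratch directory here so the sketch is self-contained;
# in production it would be a dedicated partition mounted at /proj.
PROJ="${PROJ:-/tmp/proj_demo}"
mkdir -p "$PROJ/expression_db" "$PROJ/sequence_tracker"

# In production, /etc/fstab would carry a line along these lines:
#   /dev/sdb1   /proj   ext3   defaults   1 2

ls "$PROJ"
```

The point of the layout is that nothing under the directory depends on where the rest of the system put its files.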
With this layout I can take any suitable hardware and perform a standard install of whatever version of Linux is current. This step is really simple; anyone can do it, and it lets them verify general system operation before starting to restore any of my applications. That is exactly what the disaster recovery people want to hear.
Only then do I create my partition and restore my software and data from
tape. It is totally separate from any of the system software. This way the
request for whoever looks after your backups is simply "Give me the latest copy
of /proj", as opposed to "Give me this from
/usr/local/bin, and this from
/etc" and so
on. Again, the disaster recovery people really like that. Complexity means
room for error, simplicity means success.
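That single-partition request can be a single archive. Here is a sketch of the round trip; a scratch directory stands in for /proj so the commands are self-contained, and all the paths and file names are examples:

```shell
# Back up and restore an entire application partition in one step.
# A scratch directory stands in for /proj so the sketch runs anywhere.
SRC=/tmp/proj_demo_src
mkdir -p "$SRC/app_one"
echo "important data" > "$SRC/app_one/records.txt"

# back up the whole tree as one archive
tar czf /tmp/proj_backup.tar.gz -C "$(dirname "$SRC")" "$(basename "$SRC")"

# recovery on a fresh system: extract the archive in one step
mkdir -p /tmp/proj_restore
tar xzf /tmp/proj_backup.tar.gz -C /tmp/proj_restore

cat /tmp/proj_restore/proj_demo_src/app_one/records.txt   # prints "important data"
```

One archive, one restore command: that is the level of simplicity a recovery plan can describe unambiguously.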
Use Perl Modules Wisely
The CPAN collection of Perl modules has incredible value, saving us from
reinventing many wheels. But with every one that we use we add to the
dependencies we need to manage. The more dependencies, the less robust the
application. If a given module is not part of the standard distribution then
ask yourself if you really need it. If the only reason to include
Date.pm, for example, is to use a simple date conversion function,
then think about writing your own version. I know that goes against the grain
but in some cases, it can have a big impact on the complexity of your
application.
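As a sketch of what such a home-grown replacement might look like, here is a small conversion routine that uses nothing beyond core Perl; the date format it handles is just an example:

```perl
#!/usr/bin/perl
# Sketch: a home-grown date conversion using only core Perl,
# avoiding a CPAN dependency for one simple task.
use strict;
use warnings;

# Convert "YYYY-MM-DD" to "DD Mon YYYY" (e.g. "2004-06-15" -> "15 Jun 2004").
sub format_date {
    my ($iso) = @_;
    my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
    my ($y, $m, $d) = $iso =~ /^(\d{4})-(\d{2})-(\d{2})$/
        or die "bad date: $iso\n";
    return sprintf "%d %s %d", $d, $months[$m - 1], $y;
}

print format_date("2004-06-15"), "\n";   # prints "15 Jun 2004"
```

Fifteen lines you fully control, versus a module tree you have to archive and rebuild during a recovery.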
Archive Local Copies of Helper Applications
If you use third-party software within your application, make sure to
archive copies of the distribution kits for each package. For instance, I use
ImageMagick for image manipulation and
gnuplot to generate scatter
plots in a gene expression application. In a recovery situation I don't want to
hunt around the Net looking for the right tar files when I should be getting
the database back up. Archive tar or RPM files for each package in your
application directory, list them in the appropriate ReadMe file,
and describe what they do and where they are used. You can then include that
list directly in the recovery plan.
Archive Local Copies of Perl Modules
The same advice goes for Perl modules, but here things can get a bit messy.
The problem is that many of the really useful modules
require other modules in order to function. So, you need to archive the
whole tree of dependent modules. Figuring out what modules you need and whether
they are part of the standard Perl distribution is not a simple task. The
Module::CoreList module can be
useful in this regard.
Most of us deal with this by using the CPAN module to download and install
our modules. When it works it is the best thing since sliced bread, but having
it fail halfway through an install is not uncommon. Debugging the problem
requires you to sift through the reams of output it generates and even then the
fix is often not apparent. The most common advice found on the web is to run
the program again and see if it works the second time around. Sorry, but that
just isn't a good answer. In that case you are stuck with fetching the
distribution kits for each module and building them by hand. See the
.cpan/build/ directory where CPAN.pm stores and builds the modules it
downloads.
Ideally, I would walk through the installation of everything I need for a project, fixing problems as I go. Then I would flush all of that out of the system and reinstall everything from scratch using the knowledge gained the first time around. In reality, I don't have time for that. So the best advice I can give is to record every step you take and then edit that log to produce the preferred set of steps that you will follow next time around. Be warned that this does not sit well with disaster recovery professionals. Explaining this in the plan will test your creative writing skills.
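One low-tech way to capture that record is a small wrapper that timestamps each command before running it; the function name, log path, and commands below are inventions for this sketch:

```shell
# Record every installation step to a log that can later be edited
# into the preferred recovery procedure. Names and paths are examples.
LOG=/tmp/install_log_demo.txt
run_logged() {
    echo "$(date '+%F %T') $*" >> "$LOG"
    "$@"
}

run_logged mkdir -p /tmp/demo_module_build
run_logged true   # stand-in for: perl Makefile.PL && make && make test && make install

tail -n 2 "$LOG"
```

The edited log, with the false starts removed, becomes the step-by-step procedure that goes into the recovery plan.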
Edit the System Configuration Files
Note: My company is called Craic Computing and so you will see the word
craic dotted throughout the following examples. It simply serves
to distinguish my modifications from any system code.
The next step is to modify the system configuration files to suit your
application. One major target is likely the Apache httpd.conf
file. This is where you set up virtual hosts, link directories to web trees,
take care of URL rewriting, and set any special options. The default file is
already a beast, with around 1,500 lines, so we would prefer to keep all our
application-specific definitions in one place, preferably at the very end of
the file. The Apache Include directive is the answer to our concerns. We can
place all the application-specific definitions into a separate file and have
that text included verbatim through the use of this reference.
Better still, I can create a separate configuration file for each
application, place all of these in the same directory, and refer to that
directory with a single
Include directive. (In this example,
craic_config is that directory.)
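As a sketch, the arrangement might look like the following; the alias, paths, and file names are assumptions for illustration, not taken from a real installation:

```apache
# In httpd.conf, at the very end -- one line pulls in every file
# from the application configuration directory:
Include conf/craic_config/

# Contents of one per-application file, e.g. craic_config/expression_db.conf:
Alias /expression "/proj/expression_db/htdocs"
<Directory "/proj/expression_db/htdocs">
    Options +ExecCGI
    AddHandler cgi-script .cgi
</Directory>
```

Adding or removing an application then means adding or removing one file, with no edits to httpd.conf itself.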
In a similar fashion, we can define system-wide environment variables in
application-specific files and include them in /etc/profile by
sourcing a separate shell script file, or files, using this block of code:
CRAIC_DIR=/proj/linux_config/profile
for j in $CRAIC_DIR/*.sh; do
    if [ -r $j ]; then
        . $j
    fi
done
unset j
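One of those per-application profile scripts might look like this; the variable name, directory, and application are made up for the sketch:

```shell
# Example per-application profile script, e.g.
# /proj/linux_config/profile/expression_db.sh (names are invented).
EXPRESSION_DB_HOME=/proj/expression_db
PATH="$PATH:$EXPRESSION_DB_HOME/bin"
export EXPRESSION_DB_HOME PATH
```

Each application owns its own file, so its environment travels with the /proj partition instead of living in the system's shell configuration.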
We can even use the same mechanism to set up Samba shares by including an
external file in the main smb.conf file:
include = /proj/linux_config/samba/smb.conf
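The external file then carries only the application shares; here is a sketch with an invented share name, path, and user:

```ini
; Contents of /proj/linux_config/samba/smb.conf -- application shares only.
; Share name, path, and user are examples.
[expression_db]
   path = /proj/expression_db/data
   writable = yes
   valid users = rjones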
By limiting the changes made to system files to these simple statements we maintain our separation of system and application as much as possible.