ONLamp.com
oreilly.comSafari Books Online.Conferences.

advertisement


Advanced MySQL Replication Techniques
Pages: 1, 2, 3, 4, 5, 6

Measuring circular replication performance

That usage is transparent, but how good is the performance? To answer this question, I had to make some measurements. Rather than clobbering this article unnecessarily, I refer to a recent entry in my blog dealing with this specific issue (Measuring replication speed, with its companion measuring replication speed source code). This tool shows that replicating 1,000 bytes along the whole replication array takes 0.001150 second. To get a reasonable comparison, consider that measuring on the same machine (that is, without replication at all) takes 0.000015 second, while moving the same data between two nodes takes only 0.000065 second.



I would also like to show some more tangible experience, something that makes sense to the average programmer. The method I used for measuring aims at getting a precise figure. However, replication speed can be so fast that it can accommodate even the most demanding programmer needs. Thus, consider an extreme test. Insert three records containing more than 1,000 bytes in one node, and immediately afterward fetch those records from that node's master (Figure 3). Those records will need to navigate through the whole array of nodes before being available.

Testing replication speed
Figure 3. Testing replication speed

The testing code is a Perl script.

#!/usr/bin/perl

use strict;
use warnings;

use English qw( -no_match_vars );
use DBI;
use Time::HiRes qw/ usleep /;

my @configuration_files = ();
my $max_config_index    = 3;
my $current_config      = 0;

for ( 'A' .. 'D' ) {
    push @configuration_files,  "$ENV{HOME}/circular_replica/my.node$_.cnf";
}

sub get_connection {
    my ($config_files) = @_;
    my $config_file = $config_files->[$current_config];
    $current_config++;
    if ($current_config > $max_config_index) {
        $current_config = 0;
    }

    my $dbh;
    eval {

        $dbh=DBI->connect("dbi:mysql:test"
            . ";mysql_read_default_file=$config_file",
                undef,
                undef,
                {RaiseError => 1})
         or die "Can't connect: $DBI::errstr\n";
    };
    if ( $EVAL_ERROR ) {
        print STDERR $EVAL_ERROR;
        return;
    }
    return $dbh;
}

my $dbh = get_connection(\@configuration_files);
$dbh->do(qq{truncate x});
my $bigtext = 'a' x 1000;

for my $loop (1 .. 10)
{
    my $dbh1 = get_connection(\@configuration_files);
    my $dbh2 = get_connection(\@configuration_files);
    my ($server1) = $dbh1->selectrow_array(qq{select \@\@server_id});
    my ($server2) = $dbh2->selectrow_array(qq{select \@\@server_id});
    for (1 .. 3) {
        $dbh2->do( qq{insert into x values (null, concat("server ", ? ), ?)  } , undef , $server1, $bigtext );
    }
    usleep(1);
    my $count = $dbh1->selectrow_array(qq{ select count(*) from x });
    print "inserted a record from server $server2\n ",
          "retrieved $count records from $server1\n";
}

After removing all records from the test table, for which there is now a third column (MEDIUMTEXT), the code loops through the nodes, getting at each step a node ($dbh2) and its master ($dbh1). Immediately after inserting three records in the forward node, it calls the master to fetch a count of records, after a stop of one microsecond (usleep). A sample session follows:

inserted a record from server 30
 retrieved 3 records from 20
inserted a record from server 10
 retrieved 6 records from 40
inserted a record from server 30
 retrieved 9 records from 20
inserted a record from server 10
 retrieved 12 records from 40
inserted a record from server 30
 retrieved 15 records from 20
inserted a record from server 10
 retrieved 18 records from 40
inserted a record from server 30
 retrieved 21 records from 20
inserted a record from server 10
 retrieved 24 records from 40
inserted a record from server 30
 retrieved 27 records from 20
inserted a record from server 10
 retrieved 30 records from 40

With this simple demonstration, I believe I have shown that circular replication arrays in MySQL 5 are a viable alternative to expensive clusters. However, circular replication does not scale well. When the number of nodes grows to more than 10 nodes, the speed of replication may not be up to the expectations of normal business operations.

Keep this in mind, though. If a small number of servers are enough for your business, an array of replication nodes could be just what you need.

Pages: 1, 2, 3, 4, 5, 6

Next Pagearrow





Sponsored by: