If you have followed the instructions,
and your replication setup is not working, first elliminate the user error
factor by checking the following:
Is the master logging to the binary
log? Check with SHOW MASTER STATUS. If it is, Position will be non-zero. If
not, verify that you have given the master log-bin option and have set
server-id.
Is the slave running? Check with SHOW
SLAVE STATUS. The answer is found in Slave_running column. If not, verify slave
options and check the error log for messages.
If the slave is running, did it
establish connection with the master? Do SHOW PROCESSLIST, find the thread with
system user value in User column and none in the Host column, and check the
State column. If it says connecting to master, verify the privileges for the
replication user on the master, master host name, your DNS setup, whether the
master is actually running, whether it is reachable from the slave, and if all
that seems ok, read the error logs.
If the slave was running, but then
stopped, look at SHOW SLAVE STATUS output andcheck the error logs. It usually
happens when some query that succeeded on the master fails on the slave. This
should never happen if you have taken a proper snapshot of the master, and
never modify the data on the slave outside of the slave thread. If it does, it
is a bug, read below on how to report it.
If a query on that succeeded on the
master refuses to run on the slave, and a full database resync ( the proper
thing to do ) does not seem feasible, try the following:
First see if there is some stray
record in the way. Understand how it got there, then delete it and run SLAVE
START If the above does not work or does not apply, try to understand if it
would be safe to make the update manually ( if needed) and then ignore the next
query from the master.
If you have decided you can skip the
next query, do SET SQL_SLAVE_SKIP_COUNTER=1; SLAVE START; to skip a query that
does not use auto_increment, last_insert_id or timestamp, or SET
SQL_SLAVE_SKIP_COUNTER=2; SLAVE START; otherwise If you are sure the slave
started out perfectly in sync with the master, and no one has updated the
tables involved outside of slave thread, report the bug, so you will not have
to do the above tricks again.
Make sure you are not running into an
old bug by upgrading to the most recent version.
If all else fails, read the error
logs. If they are big, grep -i slave /path/to/your-log.err on the slave. There
is no generic pattern to search for on the master, as the only errors it logs
are general system errors - if it can, it will send the error to the slave when
things go wrong.
When you have determined that there is
no user error involved, and replication still either does not work at all or is
unstable, it is time to start working on a bug report. We need to get as much
info as possible from you to be able to track down the bug. Please do spend
some time and effort preparing a good bug report. Ideally, we would like to
have a test case in the format found in mysql-test/t/rpl* directory of the
source tree. If you submit a test case like that, you can expect a patch within
a day or two in most cases, although, of course, you mileage may vary depending
on a number of factors.
Second best option is a just program
with easily configurable connection arguments for the master and the slave that
will demonstrate the problem on our systems. You can write one in Perl or in C,
depending on which language you know better.
If you have one of the above ways to
demonstrate the bug, use mysqlbug to prepare a bug report and send it to
bugs@lists.mysql.com. If you have a phantom - a problem that does occur but you
cannot duplicate "at will":
Verify that there is no user error
involved. For example, if you update the slave outside of the slave thread, the
data will be out of sync, and you can have unique key violations on updates, in
which case the slave thread will stop and wait for you to clean up the tables
manually to bring them in sync.
Run slave with log-slave-updates and
log-bin - this will keep a log of all updates on the slave.
Save all evidence before reseting the
replication. If we have no or only sketchy information, it would take us a
while to track down the problem. The evidence you should collect is:
All binary logs on the master
All binary log on the slave
The output of SHOW MASTER STATUS on
the master at the time you have discovered the problem
The output of SHOW SLAVE STATUS on the
master at the time you have discovered the problem
Error logs on the master and on the
slave
Use mysqlbinlog to examine the binary
logs. The following should be helpful to find the trouble query, for example:
mysqlbinlog -j pos_from_slave_status
/path/to/log_from_slave_status | head
Once you have collected the evidence
on the phantom problem, try hard to isolate it into a separate test case first.
Then report the problem to bugs@lists.mysql.com with as much info as possible.
Getting Maximum Performance from MySQL
Optimization is a complicated task
because it ultimately requires understanding of the whole system. While it may
be possible to do some local optimizations with small knowledge of your
system/application, the more optimal you want your system to become the more
you will have to know about it.
So this chapter will try to explain
and give some examples of different ways to optimize MySQL. But remember that
there are always some (increasingly harder) additional ways to make the system
even faster.
No comments:
Post a Comment