通过Heartbert2 让Mysql Replication 具有-数据库-IT落伍者

前言

MasterSlave 的数据库机构解决了很多问题特别是read/write 比较高的应用结构如图

写操作全部在Master 结点执行并由Slave 数据库结点定时(默认s)读取Master 的binlog

将众多的用户读请求分散到更多的数据库节点从而减轻了单点的压力

它的缺点是

Slave 实时性的保障对于实时性很高的场合可能需要做一些处理

高可用性问题Master 就是那个致命点(SPOF:Single point of failure)

本文主要讨论的是如何解决第个缺点

解决方案如下图

使用两个MySQL 主库mastermaster数据存在共享设备上用heartbeat 进行监控当master 发生故障时将资源切换到master

故障发生后无需对slave 进行修改slave 自动切到master(断电切换需要手工同步slave)

网络设置

磁盘设置

mysql 安装

到官方网址下载最新版本mysql找到对应的包在haha 分别安装

# cd /usr/local/

# tar xzvf mysqllinuxx_glibctargz

# mv mysqllinuxx_glibc mysql

# groupadd mysql

# useradd g mysql mysql

#passwd mysql

修改/etc/mycf数据文件路径设置在共享磁盘修改相关参数这个参数在haha上必须一样

datadir=/u/data #数据文件路径

serverid = #数据库ID

logbin=/u/data/master # binlog 路径

下面步骤先在ha 上执行然后在ha 执行

先在ha 把磁盘mount

[root@ha mysql]# mount /dev/sdb /u

[root@ha u]# mkdir data

[root@ha u]# chown R mysqlmysql data/

安装mysql

[root@ha u]# cd /usr/local/mysql

# /scripts/mysql_install_db user=mysql

# cp supportfiles/mysqlserver /etc/rcd/initd/mysqld

# chmod +x /etc/rcd/initd/mysqld

# chkconfig add mysqld

# /etc/rcd/initd/mysqld start

在ha 把共享磁盘umount在ha 上mount把上面的数据删除后执行步骤完成后ha 也umount

在slave 上安装mysql数据放在slave 本地过程略

heartbeat 安装

官方网址分别在haha 安装确认下列包安装

[root@ha ~]# rpm ivh libnetx_rpm

[root@ha ~]# rpm ivh heartbeatpilsx_rpm

[root@ha ~]# rpm ivh heartbeatstonithx_rpm

[root@ha ~]# rpm ivh perlTimeDateelnoarchrpm

[root@ha ~]# rpm ivh heartbeatx_rpm

[root@ha ~]# rpm ivh heartbeatdevelx_rpm

开始编辑配置文件(haha 都执行)

[root@ha local]# cp /usr/share/doc/packages/heartbeat/hacf /etc/had/

[root@ha local]# cp /usr/share/doc/packages/heartbeat/authkeys /etc/had/

编辑/etc/had/authkeys使用的是第种认证方式(crc)接着把文件的权限改为

cat /etc/had/authkeys

显示

auth

crc

更改文件权限

chmod /etc/had/authkeys

[root@ha ~]# cat /etc/had/hacf

debugfile /var/log/hadebug

logfile /var/log/halog

logfacility local

keepalive

deadtime

warntime

udpport

crm yes

node ha ha

bcast eth

auto_failback off

apiauth cibmon uid=hacluster

respawn hacluster /usr/lib/heartbeat/cibmon –d

配置资源共享IP共享磁盘MYSQL 个服务组成组资源

[root@ha ~]# cat /etc/had/haresources

ha Filesystem::/dev/sdb::/u::ext mysqld

启动的时候从左到右依次运行脚本关闭的时候从右到左依次关闭

这个文件原名为haresources 在x 上使用不过为了区别使用此名称

将资源文件转换成cibxmlx 里编译好后自带有转换脚本

[root@ha ~]# cd /var/lib/heartbeat/crm/

[root@ha crm]# rm rf cibxml*

[root@ha crm]#

/usr/lib/heartbeat/haresourcescibpy stout c /etc/had/hacf /etc/had/haresources

[root@ha crm]# cat cibxml|grep mysql

即每秒检测资源运行情况如果发现资源不在则尝试启动资源如果s 后还未启动

成功则资源切换向另节点可根据业务进行修改

启动heartbeat在ha 和ha 都启动

[root@ha ~]# /etc/initd/heartbeat start

查看资源情况

============

Last updated: Tue Feb ::

Current DC: ha (cdebcbbbbeef)

Nodes configured

Resources configured

============

Node: ha (cdebcbbbbeef): online

Node: ha (abdfbeacfcfc): online

Resource Group: group_

IPaddr____ (ocf::heartbeat:IPaddr): Started ha

Filesystem_ (ocf::heartbeat:Filesystem): Started ha

mysqld_ (lsb:mysqld): Started ha

将heartbeat 设置成开机自动重启

[root@ha ~]# chkconfig add heartbeat

[root@ha ~]# chkconfig level heartbeat on

[root@ha ~]# chkconfig list heartbeat

heartbeat :off :off :on :on :on :on :off

mysql slave 配置

在master 赋予slave 权限(任一节点操作)

[root@ha ~]# /usr/local/mysql/bin/mysql u root p

Enter password:

Welcome to the MySQL monitor Commands end with ; or \g

Your MySQL connection id is

Server version: log MySQL Community Server (GPL)

Type help; or \h for help Type \c to clear the buffer

mysql> GRANT RELOAD REPLICATION SLAVE ON ** TO slave@

IDENTIFIED BY nslave;

mysql> show master status\G;

*************************** row ***************************

File: master

Position:

Binlog_Do_DB:

Binlog_Ignore_DB:

row in set ( sec)

slave 同步

因为没有数据所以省略数据同步的步骤在slalve 执行下列命令

mysql> CHANGE MASTER TO MASTER_HOST=

MASTER_PORT =

MASTER_USER=slave

MASTER_PASSWORD=nslave

MASTER_LOG_FILE=master

MASTER_LOG_POS = ;

Query OK rows affected ( sec)

mysql> start slave;

Query OK rows affected ( sec)

mysql> show slave status\G;

*************************** row ***************************

Slave_IO_State:Waiting for master to send event

Master_Host:

Master_User: slave

Master_Port:

Connect_Retry:

Master_Log_File: master

Read_Master_Log_Pos:

Relay_Log_File: mysqlrelaybin

Relay_Log_Pos:

Relay_Master_Log_File: master

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

row in set ( sec)

失败测试

网络切换测试

将ha 的网线拔掉在ha 看资源的切换情况

Node: ha (cdebcbbbbeef): online

Node: ha (abdfbeacfcfc): OFFLINE

Resource Group: group_

IPaddr____ (ocf::heartbeat:IPaddr): Started ha

Filesystem_ (ocf::heartbeat:Filesystem): Started ha

mysqld_ (lsb:mysqld): Started ha

没有问题把ha 网线插上资源又回切到ha

Node: ha (cdebcbbbbeef): online

Node: ha (abdfbeacfcfc): online

Resource Group: group_

IPaddr____ (ocf::heartbeat:IPaddr): Started ha

Filesystem_ (ocf::heartbeat:Filesystem): Started ha

mysqld_ (lsb:mysqld): Started ha

宕机切换

将ha 强制关机在开机这和的情况差不多能正常切换

服务切换

将ha 的mysql 服务强制中断修改/etc/mycf

datadir=/u/data #实际这个路径不存在

[root@ha ~]# ps ef|grep mysql

root : ? :: /bin/sh /bin/mysqld_safe

datadir=/u/data pidfile=/u/data/hapid

mysql : ? :: /usr/local/mysql/bin/mysqld

basedir=/usr/local/mysql datadir=/u/data user=mysql logerror=/u/data/haerr

pidfile=/u/data/hapid socket=/tmp/mysqlsock port=

root : ? :: /bin/sh /etc/initd/mysqld start

root : pts/ :: grep mysql

[root@ha ~]# kill

[root@ha ~]# ps ef|grep mysql

root : pts/ :: grep mysql

在ha 上查看资源情况已经切换过来

Node: ha (cdebcbbbbeef): online

Node: ha (abdfbeacfcfc): online

Resource Group: group_

IPaddr____ (ocf::heartbeat:IPaddr): Started ha

Filesystem_ (ocf::heartbeat:Filesystem): Started ha

mysqld_ (lsb:mysqld): Started ha

Failed actions:

mysqld__monitor_ (node=ha call= rc=): complete

mysqld__start_ (node=ha call= rc=): complete

slave 同步测试

将ha 的heartbeat 关闭

[root@ha data]# /etc/initd/heartbeat stop

在ha 看资源情况

Node: ha (cdebcbbbbeef): online

Node: ha (abdfbeacfcfc): OFFLINE

Resource Group: group_

IPaddr____ (ocf::heartbeat:IPaddr): Started ha

Filesystem_ (ocf::heartbeat:Filesystem): Started ha

mysqld_ (lsb:mysqld): Started ha

资源全部切换到ha我们执行一些操作看是否同步

[root@ha ~]# /usr/local/mysql/bin/mysql u root p D test

Enter password:

Welcome to the MySQL monitor Commands end with ; or \g

Your MySQL connection id is

Server version: log MySQL Community Server (GPL)

Type help; or \h for help Type \c to clear the buffer

root@test >create table t(t int);

root@test >insert into t values();

root@test >select * from t;

| t |

| |

row in set ( sec)

在slave 查看是否同步

[root@mysql ~]# /root/cron/lgsh

Welcome to the MySQL monitor Commands end with ; or \g

Your MySQL connection id is

Server version: log MySQL Community Server (GPL)

Type help; or \h for help Type \c to clear the buffer

mysql> use test

Database changed

mysql> select * from t;

| t |

| |

row in set ( sec)

在将资源切回到ha在进行操作看slave 的情况

[root@ha ~]# /etc/initd/heartbeat start

Node: ha (cdebcbbbbeef): online

Node: ha (abdfbeacfcfc): online

Resource Group: group_

IPaddr____ (ocf::heartbeat:IPaddr): Started ha

Filesystem_ (ocf::heartbeat:Filesystem): Started ha

mysqld_ (lsb:mysqld): Started ha

资源已经切回到ha对其进行操作

[root@ha ~]# /usr/local/mysql/bin/mysql u root p D test

Enter password:

Welcome to the MySQL monitor Commands end with ; or \g

Your MySQL connection id is

Server version: log MySQL Community Server (GPL)

Type help; or \h for help Type \c to clear the buffer

mysql> insert into t values();

mysql> select * from t;

| t |

| |

rows in set ( sec)

在slave 查看是否同步

mysql> select * from t;

| t |

| |

rows in set ( sec)

mysql> show slave status\G;

*************************** row ***************************

Slave_IO_State:Waiting for master to send event

Master_Host:

Master_User: slave

Master_Port:

Connect_Retry:

Master_Log_File: master

Read_Master_Log_Pos:

Relay_Log_File: mysqlrelaybin

Relay_Log_Pos:

Relay_Master_Log_File: master

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Master 切换后slave 无需任何人工介入自动同步

模拟写入切换测试

写个简单的循环

[root@mysql cron]# cat testsh

for ((num=;num<;num=num+))

echo

/usr/local/mysql/bin/mysql u sg psg h D test e

insert into t values($num);

if (( $? )); then

echo $num:no

else

echo $num:ok

done

在写入过程中手工切换手工reboot 等操作和的测试结果一样在写入过程中强制ha 断电需要手工处理slave 同步将ha 断电后资源切到ha 后在slave 会报错

mysql> show slave status\G;

*************************** row ***************************

Slave_IO_State:

Master_Host:

Master_User: slave

Master_Port:

Connect_Retry:

Master_Log_File: master

Read_Master_Log_Pos:

Relay_Log_File: mysqlrelaybin

Relay_Log_Pos:

Relay_Master_Log_File: master

Slave_IO_Running: No

Slave_SQL_Running: Yes

查看slave 报错日志

:: [Note] Slave I/O thread: Failed reading log event reconnecting to retry log

master at postion

:: [ERROR] Error reading packet from server: Client requested master to start

replication from impossible position ( server_errno=)

:: [ERROR] Got fatal error : Client requested master to start replication from

impossible position from master when reading data from binary log

:: [Note] Slave I/O thread exiting read up to log master position

在master 中找不到position

查看master 的binlog

[root@ha data]# /usr/local/mysql/bin/mysqlbinlog master|tail

/*!*/;

# at

# :: server id end_log_pos Query thread_id=

exec_time= error_code=

SET TIMESTAMP=/*!*/;

insert into t values()

/*!*/;

DELIMITER ;

# End of log file

ROLLBACK /* added by mysqlbinlog */;

/*! SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;

在master 上master 最后值是而在slave 却是Slave 的Log_Pos比master 大应该是在断电的时候binlog 有部分没有及时写入磁盘但从库已经读入我们需要在slave 执行change maseter 操作MASTER_LOG_FILE 在原来的基础上加Log_Pos 为

mysql> stop slave;

Query OK rows affected ( sec)

mysql> CHANGE MASTER TO MASTER_HOST=MASTER_PORT =

MASTER_USER

=slaveMASTER_PASSWORD=nslaveMASTER_LOG_FILE=masterMAST

ER_LOG_POS = ;

Query OK rows affected ( sec)

mysql> start slave;

Query OK rows affected ( sec)

mysql> show slave status\G;

*************************** row ***************************

Slave_IO_State:Waiting for master to send event

Master_Host:

Master_User: slave

Master_Port:

Connect_Retry:

Master_Log_File: master

Read_Master_Log_Pos:

Relay_Log_File: mysqlrelaybin

Relay_Log_Pos:

Relay_Master_Log_File: master

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

在这次断电的切换过程中将会有减去条binlog 操作在slave 存在而在masert 没有在后面的复制中可能会产生错误不过这些错误影响不大如果对数据的同步要求很严格可以在主库添加参数sync_binlog=这样最多将只会有条数据错误不过这样将影响mysql 写性能

heartbeat crm 常用命令

查看资源状态

#crm_mon –i

查看节点资源

#crm_resource L

查看资源在那个节点上运行

# crm_resource W r mysqld_

启动/停止资源

#crm_resource r mysqld_ p target_role v started

#crm_resource r mysqld _p target_role v stopped

将资源组从当前节点转移到另个节点

#crm_resource M r group_

将资源组转移到指定节点

#crm_resource M r group_ H ha

允许资源组回到正常的节点

#crm_resource U r group_

将资源从CRM 中删除

#crm_resource D r mysqld_ t primitive

将资源组从CRM 中删除

#crm_resource D –r group_ t group

将资源从CRM 中禁用

#crm_resource p is_managed r mysqld_ t primitive v off

将资源从新从CRM 中启用

#crm_resource p is_managed r mysqld _t primitive v on

重启资源

#crm_resource C H ha r mysqld_

检查所有节点上未在CRM 中的资源

#crm_resource P

检查指定节点上未在CRM 中的资源

#crm_resource P H ha

检查所有节点上未在CRM 中的资源

#crm_resource P

检查指定节点上未在CRM 中的资源

#crm_resource P H ha

八讨论

对写要求不是很高的应用我觉得可以考虑用NFS 来代替共享存储设备结构图如下

将数据文件放在NFS 上这样几台廉价的PC 机就能实现相对成本降低

参数sync_binlog=对写的性能有多少影响个人始终觉得会带来很大的写性能问题在数据完整和性能之间做个平衡有得有失吧

此方案相对网上流传的MasterMaster Replication 方案个人觉得实施起来相对简单维护也相对简单缺点是需要一个共享设备备机在处于空闲状态在master 断电的时候MMM 方案也应该存在主机的内存binlog 丢失的问题当然这些我并没有测试过只是自己的推测