Monthly Archive: July 2017

HA heartbeat (High-Availability Architecture) Configuration

July 31, 2017 07:04:18

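HA (high availability), also known as dual-machine hot standby, is used for business-critical services. How it works: heartbeat has two core parts, heartbeat monitoring and resource takeover. Heartbeat monitoring can run over network links or a serial line, and it supports redundant links. The two nodes send each other messages reporting their current state. If a node receives no message from its peer within a specified time, it considers the peer failed and starts the resource-takeover module to take over the resources or services that were running on the peer. Common open-source high-availability tools include heartbeat and keepalived.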
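The failure-detection rule described above can be sketched in a few lines of shell. This is only an illustration of the timeout idea, not heartbeat's actual implementation; the function name peer_state and its arguments are hypothetical.

```shell
# Classify a peer from the time since its last heartbeat (hypothetical helper).
# Usage: peer_state <seconds_since_last_heartbeat> <deadtime_seconds>
peer_state() {
    local elapsed="$1" deadtime="$2"
    if [ "$elapsed" -gt "$deadtime" ]; then
        echo dead     # nothing heard within the window: takeover would start
    else
        echo alive
    fi
}

peer_state 4 30    # prints: alive
peer_state 31 30   # prints: dead
```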

Environment:
OS: CentOS release 6.8 (64-bit)
Server A: hostname web1, eth0 address 192.168.122.20
Server B: hostname web2, eth0 address 192.168.122.22
Virtual IP (VIP): 192.168.122.100

Preparation:
1. Set the hostname (on both nodes)

[root@web1 ~]# vim /etc/sysconfig/network

2. Add hosts entries (on both nodes)

[root@web1 ~]# vim /etc/hosts

# Add the following lines:
192.168.122.20 web1
192.168.122.22 web2

3. Stop iptables and disable SELinux (on both nodes)

[root@web1 ~]# service iptables stop

[root@web1 ~]# setenforce 0

[root@web1 ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

4. Software required on both nodes: libnet, heartbeat, nginx
# Install the EPEL repository first, or use the Aliyun EPEL mirror: http://mirrors.aliyun.com/help/epel

[root@web1 ~]# yum install epel-release -y
[root@web1 ~]# yum install -y libnet heartbeat nginx

Configure heartbeat: copy the sample configuration files

[root@web1 ~]# cd /usr/share/doc/heartbeat-3.0.4/
[root@web1 heartbeat-3.0.4]# cp ha.cf haresources authkeys /etc/ha.d/
[root@web1 heartbeat-3.0.4]# cd /etc/ha.d/

5. Edit authkeys: uncomment the auth lines and choose md5 as the authentication method

[root@web1 ha.d]# vim authkeys

auth 3
3 md5 Hello!

# Then restrict the file's permissions
[root@web1 ha.d]# chmod 600 authkeys

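6. Edit haresources

[root@web1 ha.d]# vim haresources

Add the following line:
web1 192.168.122.100/24/eth0 nginx
Note: web1 is the hostname of the master node, 192.168.122.100 is the VIP, /24 is the netmask length, eth0 is the device the VIP is bound to, and nginx is the service heartbeat manages, which is also the core service the two machines provide.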
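To make the field layout of a haresources line concrete, here is a small sketch that splits one line into its parts. It assumes the three-part IP/mask/device address form and a single line passed as an argument; parse_haresources is a hypothetical helper, not part of heartbeat.

```shell
# Split one haresources line into node, IP, netmask length, device and services.
# Usage: parse_haresources "<node> <ip>[/<mask>[/<device>]] <service>..."
parse_haresources() {
    printf '%s\n' "$1" | awk '{
        n = split($2, a, "/")                      # address field: ip/mask/device
        svc = ""
        for (i = 3; i <= NF; i++)                  # everything after it is a service
            svc = svc (i > 3 ? " " : "") $i
        printf "node=%s ip=%s mask=%s dev=%s services=%s\n",
            $1, a[1], (n >= 2 ? a[2] : "-"), (n >= 3 ? a[3] : "-"), svc
    }'
}

parse_haresources "web1 192.168.122.100/24/eth0 nginx"
# prints: node=web1 ip=192.168.122.100 mask=24 dev=eth0 services=nginx
```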
7. Edit ha.cf

[root@web1 ha.d]# vim ha.cf

Change it to the following:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 60
udpport 694
# IP of the peer (standby) node
ucast eth0 192.168.122.22
auto_failback on
# node names must match the output of uname -n on each machine
node web1
node web2
# gateway IP
ping 192.168.122.1
respawn hacluster /usr/lib64/heartbeat/ipfail
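Configuration notes:
debugfile /var/log/ha-debug // File where heartbeat writes its debug output.
logfile /var/log/ha-log // heartbeat's log file.
keepalive 2 // Interval between heartbeats; the default unit is seconds.
deadtime 30 // If no heartbeat arrives from the peer within this interval, the peer is considered dead.
warntime 10 // If no heartbeat arrives from the peer within this interval, a warning is written to the log.
initdead 60 // After a boot or reboot the network needs some time before it works properly; this option covers that startup delay. Set it to at least twice deadtime.
udpport 694 // UDP port used for heartbeat communication; 694 is the default.
ucast eth0 192.168.122.22 // Interface used for unicast heartbeats and the IP of the peer machine.
auto_failback on // The two heartbeat hosts are a master and a slave. Normally the master holds the resources and runs all services; when it fails, the resources pass to the slave, which then runs the services. With this option on, the master automatically takes the resources back once it recovers; otherwise the slave keeps them.
respawn hacluster /usr/lib64/heartbeat/ipfail // Names a process that is started and stopped together with heartbeat; it is monitored and restarted if it fails. The most common choice is ipfail, which detects and handles network failures and relies on the ping node declared with the ping directive. On a 64-bit system, note the lib64 path.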
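The relationships between the timing directives are easy to get wrong, so a quick sanity check can help. The sketch below encodes the rule stated above (initdead at least twice deadtime) plus two ordering rules that are my own assumption (keepalive shorter than warntime, warntime shorter than deadtime). check_timing is a hypothetical helper and does not read ha.cf itself.

```shell
# Sanity-check heartbeat timing values, all in seconds.
# Usage: check_timing <keepalive> <warntime> <deadtime> <initdead>
check_timing() {
    local keepalive="$1" warntime="$2" deadtime="$3" initdead="$4"
    if [ "$keepalive" -ge "$warntime" ]; then      # assumed ordering rule
        echo "keepalive should be shorter than warntime"; return 1
    fi
    if [ "$warntime" -ge "$deadtime" ]; then       # assumed ordering rule
        echo "warntime should be shorter than deadtime"; return 1
    fi
    if [ "$initdead" -lt $((deadtime * 2)) ]; then # rule from the notes above
        echo "initdead should be at least twice deadtime"; return 1
    fi
    echo ok
}

check_timing 2 10 30 60   # prints: ok
```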
8. Copy the three configuration files from the master to the slave

[root@web1 ha.d]# scp authkeys ha.cf haresources web2:/etc/ha.d

# If the scp command is missing, install openssh-clients with yum

9. Edit ha.cf on the slave

[root@web2 ~]# vim /etc/ha.d/ha.cf

Only one line needs to change:
ucast eth0 192.168.122.22 becomes ucast eth0 192.168.122.20 (the master's IP)
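If you prefer not to edit the file by hand, the same one-line change can be done with sed. This is a sketch under the assumption that ha.cf contains a single ucast line for the given interface; set_ucast_peer is a hypothetical helper that reads the file on stdin and writes the edited copy to stdout.

```shell
# Rewrite the ucast line for one interface so that it points at a new peer IP.
# Usage: set_ucast_peer <interface> <peer_ip> < ha.cf > ha.cf.new
set_ucast_peer() {
    sed "s/^ucast[[:space:]]\{1,\}${1}[[:space:]].*/ucast ${1} ${2}/"
}

printf 'keepalive 2\nucast eth0 192.168.122.22\n' | set_ucast_peer eth0 192.168.122.20
# prints:
# keepalive 2
# ucast eth0 192.168.122.20
```

All other lines pass through unchanged, so the output can replace the copied ha.cf on the slave after you have reviewed it.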
10. Start the heartbeat service
Once configuration is complete, start heartbeat on the master first, then on the slave.

[root@web1 ~]# service heartbeat start

Starting High-Availability services: INFO: Resource is stopped
Done.
11. Check the result

[root@web1 ha.d]# ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:90:ee:8d brd ff:ff:ff:ff:ff:ff
inet 192.168.122.20/24 brd 192.168.122.255 scope global eth0
inet 192.168.122.120/24 brd 192.168.122.255 scope global secondary eth0
The floating IP is now bound to the master. (The captured output in this post shows 192.168.122.120 as the floating IP; on your system it is whatever VIP you put in haresources.)

 
[root@web1 ha.d]# ps aux |grep nginx
root 15062 0.0 0.4 108936 2172 ? Ss 03:09 0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx 15064 0.0 0.6 109360 3204 ? S 03:09 0:00 nginx: worker process

 
[root@web1 ha.d]# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 17545/nginx
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 2082/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1345/master
tcp 0 0 :::80 :::* LISTEN 17545/nginx
tcp 0 0 :::22 :::* LISTEN 2082/sshd
tcp 0 0 ::1:25 :::* LISTEN 1345/master

Start heartbeat on the slave

 
[root@web2 ~]# service heartbeat start
[root@web2 ~]# ip a

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:e3:92:b7 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.22/24 brd 192.168.122.255 scope global eth0
inet6 fe80::5054:ff:fee3:92b7/64 scope link
valid_lft forever preferred_lft forever
At this point the slave does not hold the floating IP.
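Checking by eye works, but the same check is easy to script. The sketch below reads ip a output on stdin and reports whether a given address is present; has_vip is a hypothetical helper, and the sample lines are taken from the output shown in this post.

```shell
# Succeed if the given IPv4 address appears in `ip a` output read from stdin.
# Usage: ip a | has_vip <vip>
has_vip() {
    grep -qF "inet ${1}/"    # fixed-string match, e.g. "inet 192.168.122.120/"
}

printf '%s\n' \
    'inet 192.168.122.20/24 brd 192.168.122.255 scope global eth0' \
    'inet 192.168.122.120/24 brd 192.168.122.255 scope global secondary eth0' |
    has_vip 192.168.122.120 && echo "floating IP is on this node"
# prints: floating IP is on this node
```

On a live node you would run ip a | has_vip followed by your VIP.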

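12. Failover tests
(1) Block ping on the master and see whether the floating IP moves to the slave

[root@web1 ~]# iptables -I INPUT -p icmp -j DROP
[root@web1 ~]# /etc/init.d/heartbeat stop # stop the heartbeat service on the master
[root@web2 ha.d]# tailf /var/log/ha-debug # watch ha-debug on the slave (its hostname appears as web3 in the captured log)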

Jul 26 05:26:06 web3 ipfail: [6977]: debug: Got asked for num_ping.
Jul 26 05:26:06 web3 ipfail: [6977]: debug: Found ping node 192.168.122.1!
Jul 26 05:26:07 web3 ipfail: [6977]: info: Ping node count is balanced.
Jul 26 05:26:07 web3 ipfail: [6977]: debug: Abort message sent.
Jul 26 05:26:08 web3 heartbeat: [6949]: info: local resource transition completed.
Jul 26 05:26:08 web3 heartbeat: [6949]: info: Initial resource acquisition complete (T_RESOURCES(us))
Jul 26 05:26:08 web3 heartbeat: [7002]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys web3] to acquire.
Jul 26 05:26:09 web3 heartbeat: [6949]: info: remote resource transition completed.
Jul 26 05:26:09 web3 ipfail: [6977]: debug: Other side is unstable.
Jul 26 05:26:09 web3 ipfail: [6977]: debug: Other side is now stable.
Jul 26 05:46:27 web3 ipfail: [6977]: debug: Got asked for num_ping.
Jul 26 05:46:27 web3 ipfail: [6977]: debug: Found ping node 192.168.122.1!
Jul 26 05:46:28 web3 ipfail: [6977]: info: Telling other node that we have more visible ping nodes.
Jul 26 05:46:28 web3 ipfail: [6977]: debug: Sending you_are_dead.
Jul 26 05:46:28 web3 ipfail: [6977]: debug: Message [you_are_dead] sent.
Jul 26 05:46:34 web3 heartbeat: [6949]: info: web1 wants to go standby [all]
Jul 26 05:46:34 web3 ipfail: [6977]: debug: Other side is unstable.
Jul 26 05:46:35 web3 heartbeat: [6949]: info: standby: acquire [all] resources from web1
Jul 26 05:46:35 web3 heartbeat: [7117]: info: acquire all HA resources (standby).
ResourceManager(default)[7130]: 2017/07/26_05:46:35 info: Acquiring resource group: web1 192.168.122.120/24/eth0 nginx
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.122.120)[7158]: 2017/07/26_05:46:35 INFO: Resource is stopped
ResourceManager(default)[7130]: 2017/07/26_05:46:35 info: Running /etc/ha.d/resource.d/IPaddr 192.168.122.120/24/eth0 start
IPaddr(IPaddr_192.168.122.120)[7285]: 2017/07/26_05:46:35 INFO: Adding inet address 192.168.122.120/24 with broadcast address 192.168.122.255 to device eth0
IPaddr(IPaddr_192.168.122.120)[7285]: 2017/07/26_05:46:35 INFO: Bringing device eth0 up
IPaddr(IPaddr_192.168.122.120)[7285]: 2017/07/26_05:46:35 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.122.120 eth0 192.168.122.120 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.122.120)[7259]: 2017/07/26_05:46:35 INFO: Success
INFO: Success
ResourceManager(default)[7130]: 2017/07/26_05:46:35 info: Running /etc/init.d/nginx start
Starting nginx: [ OK ]
Jul 26 05:46:35 web3 heartbeat: [7117]: info: all HA resource acquisition completed (standby).
Jul 26 05:46:35 web3 heartbeat: [6949]: info: Standby resource acquisition done [all].
Jul 26 05:46:35 web3 heartbeat: [6949]: info: remote resource transition completed.
Jul 26 05:46:35 web3 ipfail: [6977]: debug: Other side is now stable.
Jul 26 05:46:35 web3 ipfail: [6977]: debug: Other side is now stable.
ARPING 192.168.122.120 from 192.168.122.120 eth0
Sent 5 probes (5 broadcast(s))
Received 0 response(s)

Check that the slave now holds the floating IP and that the nginx process has started

 
[root@web2 ~]# ip a

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:e3:92:b7 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.22/24 brd 192.168.122.255 scope global eth0
inet 192.168.122.120/24 brd 192.168.122.255 scope global secondary eth0

[root@web3 ~]# netstat -lntp
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 8050/nginx
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1685/master

[root@mysql-proxy ~]# curl 192.168.122.120 # run from a client test machine
Web3

(2) Split-brain test
Take down the eth1 interface on both the master and the slave

[root@web1 ~]# ifdown eth1

MySQL 5.6 Master-Slave Architecture

July 31, 2017 07:00:07

MySQL Replication (Master-Slave Architecture) Configuration

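MySQL replication, also called AB replication or master-slave replication, is mainly used for real-time backup of MySQL data or for read/write splitting. The master is the primary server and the slave is the secondary server. Initially both hold the same data; when data on the master changes, the slave applies the same change, which keeps the two in sync and serves as a backup. Before configuring replication, prepare two MySQL servers.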

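Environment:

OS: CentOS release 6.8 (64-bit)

Master IP: 192.168.120.10 (master)

Slave IP: 192.168.120.11 (slave)

MySQL version: 5.6.35

Preparation:

Download the MySQL binary package into /usr/local/src

Set the hostname

Before setting up replication, make sure the master and slave databases hold the same data.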

[root@master ~]# vim /etc/sysconfig/network
[root@master ~]# vim /etc/hosts
[root@master ~]# cd /usr/local/src

Configure the MySQL service

Detailed steps

# Download MySQL

[root@master ~]# wget https://mirrors.tuna.tsinghua.edu.cn/mysql/downloads/MySQL-5.6/mysql-5.6.35-linux-glibc2.5-x86_64.tar.gz

# Extract

[root@master ~]# tar -zxvf mysql-5.6.35-linux-glibc2.5-x86_64.tar.gz

# Move to /usr/local/mysql

[root@master ~]# mv mysql-5.6.35-linux-glibc2.5-x86_64 /usr/local/mysql

# Create the mysql user

[root@master ~]# useradd -s /sbin/nologin mysql -M

# Create the data directory and initialize the database

[root@master ~]# cd /usr/local/mysql
[root@master ~]# mkdir -p /data/mysql ; chown -R mysql.mysql /data/mysql
[root@master ~]# ./scripts/mysql_install_db --user=mysql --datadir=/data/mysql/

--user sets the user that owns the database files; --datadir sets the directory where the data is stored.

# Copy the configuration file

[root@master ~]# cp support-files/my-default.cnf /etc/my.cnf

# Copy the init script and make it executable

[root@master ~]# cp support-files/mysql.server /etc/init.d/mysql
[root@master ~]# chmod 755 !$

# Edit the init script and set basedir and datadir to the paths used above

[root@master ~]# vim /etc/init.d/mysql
basedir=/usr/local/mysql
datadir=/data/mysql

# Register the init script as a system service and enable it at boot

[root@master ~]# chkconfig --add mysql
[root@master ~]# chkconfig mysql on

# Start MySQL

[root@master ~]# service mysql start
Starting MySQL. SUCCESS!

# Add MySQL to the PATH

[root@master ~]# vim /etc/profile.d/mysql.sh
export PATH=$PATH:/usr/local/mysql/bin

# Apply the change

[root@master ~]# source !$

# Check that the port is listening

[root@master ~]# netstat -lntp |grep 3306
tcp 0 0 :::3306 :::* LISTEN 2057/mysqld

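Configure replication

1. Set up the master

Edit the main MySQL configuration file:

[root@master ~]# vim /usr/local/mysql/my.cnf

Check that the [mysqld] section contains the following lines, and add them if it does not:

server-id=1
log-bin=mysql-bin

Besides these two required lines, two optional parameters are available:

binlog-do-db=databasename1
binlog-do-db=databasename2
binlog-ignore-db=databasename1
binlog-ignore-db=databasename2

binlog-do-db names a database whose changes should be written to the binary log and therefore replicated. To list several databases, repeat the option once per database; MySQL treats a comma-separated value as the name of a single database.

binlog-ignore-db names a database to leave out, with the same one-option-per-database rule. In practice you only need one of these two parameters.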
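Because MySQL reads a comma-separated binlog-do-db value as one database name, each database needs its own option line. The sketch below generates those lines from a list of names; binlog_do_db_lines is a hypothetical helper, and the database names are the placeholders used in this post.

```shell
# Print one binlog-do-db line per database name given as an argument.
# Usage: binlog_do_db_lines <db>...
binlog_do_db_lines() {
    local db
    for db in "$@"; do
        printf 'binlog-do-db=%s\n' "$db"
    done
}

binlog_do_db_lines databasename1 databasename2
# prints:
# binlog-do-db=databasename1
# binlog-do-db=databasename2
```

The output is meant to be pasted into the [mysqld] section of my.cnf.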

# Set the MySQL root password

[root@master ~]# mysqladmin -uroot password '123456'
[root@master ~]# mysql -uroot -p123456
mysql> grant replication slave on *.* to 'slave'@'192.168.120.11' identified by 'hello123';
mysql> flush privileges; -- reload the privilege tables

Here slave is the account the slave server uses to read data from the master, its password is hello123, and 192.168.120.11 is the slave's IP.

[root@master ~]# service mysql restart # restart the service
mysql> flush tables with read lock; -- lock the databases; no data can be changed from now on
mysql> show master status; -- note these values, they are needed on the slave shortly
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000006 | 474952 | | |
+------------------+----------+--------------+------------------+
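The File and Position values have to be carried over to the slave by hand, which is an easy place for typos. As a sketch, the function below reads the tabular show master status output on stdin and prints a matching change master to statement. change_master_sql is a hypothetical helper; the host, user and password in the example are the ones used in this post, and port 3306 is hard-coded as an assumption.

```shell
# Build a CHANGE MASTER TO statement from tabular `show master status` output.
# Usage: change_master_sql <master_host> <repl_user> <repl_password> < table.txt
change_master_sql() {
    awk -F'|' -v host="$1" -v user="$2" -v pass="$3" '
        NF >= 3 {
            file = $2; pos = $3
            gsub(/[[:space:]]/, "", file)
            gsub(/[[:space:]]/, "", pos)
            if (pos ~ /^[0-9]+$/) {   # skips the header row and the border rows
                printf "change master to master_host=\047%s\047, master_port=3306,\n", host
                printf "master_user=\047%s\047, master_password=\047%s\047,\n", user, pass
                printf "master_log_file=\047%s\047, master_log_pos=%s;\n", file, pos
            }
        }'
}

change_master_sql 192.168.120.10 slave hello123 <<'EOF'
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000006 | 474952 | | |
+------------------+----------+--------------+------------------+
EOF
# last line printed: master_log_file='mysql-bin.000006', master_log_pos=474952;
```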

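2. Set up the slave

First edit my.cnf on the slave:

[root@slave ~]# vim /etc/my.cnf

Find the line "server-id = 1" and change it to "server-id = 2". Any value works as long as it differs from the master's; an identical id causes an error, and leaving the id unset does not work either, because a slave with the default id of 0 will not connect to a master.
On the slave you can optionally add the following lines as well, mirroring the two optional parameters on the master. Repeat the option once per database rather than using a comma-separated list:

replicate-do-db=databasename1
replicate-do-db=databasename2

replicate-ignore-db=databasename1
replicate-ignore-db=databasename2

After the change, restart the slave:

[root@slave ~]# service mysql restart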
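A quick way to confirm that the two servers really use different ids is to extract the value from each my.cnf and compare. This is a sketch that assumes the id is written as server-id=N or server_id=N on its own line; server_id_of is a hypothetical helper that reads a config file on stdin.

```shell
# Print the last server-id value found in a my.cnf read from stdin.
# Usage: server_id_of < /etc/my.cnf
server_id_of() {
    sed -n 's/^[[:space:]]*server[-_]id[[:space:]]*=[[:space:]]*\([0-9]\{1,\}\).*/\1/p' | tail -n 1
}

master_id=$(printf '[mysqld]\nserver-id=1\nlog-bin=mysql-bin\n' | server_id_of)
slave_id=$(printf '[mysqld]\nserver-id = 2\n' | server_id_of)
if [ "$master_id" != "$slave_id" ]; then
    echo "server ids differ: $master_id vs $slave_id"
fi
# prints: server ids differ: 1 vs 2
```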

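Copy the db1 database from the master to the slave. Because the two are different machines, the dump file has to be copied across the network:

[root@master ~]# mysqldump -uroot -p123456 db1 > db1.sql

# Create the db1 database

[root@slave ~]# mysql -uroot -p123456 -e "create database db1"

# Import the db1 database

[root@slave ~]# mysql -uroot -p123456 db1 < db1.sql

The create database command uses the -e option, which passes a MySQL statement on the shell command line and makes it easy to put MySQL operations into scripts. Its form is -e "command". Once the data has been copied over, configure replication on the slave: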

[root@slave ~]# mysql -uroot -p123456
mysql> stop slave;
mysql> change master to master_host='192.168.120.10', master_port=3306,
master_user='slave', master_password='hello123',
master_log_file='mysql-bin.000006', master_log_pos=474952;
mysql> start slave;

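master_log_file and master_log_pos are the values returned by show master status earlier. After this step, release the read lock on the master:

[root@master ~]# mysql -uroot -p123456 -e "unlock tables"

Note that the lock belongs to the session that ran flush tables with read lock and is dropped when that session ends; if that session is still open, run unlock tables; inside it instead.

Then check the slave's status:

mysql> show slave status\G

Confirm that both of the following are Yes:

Slave_IO_Running: Yes
Slave_SQL_Running: Yes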
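The same check can be scripted for monitoring. The sketch below reads show slave status\G output on stdin and succeeds only when both threads report Yes; slave_ok is a hypothetical helper, and the sample input is trimmed to the two relevant lines.

```shell
# Succeed only if both replication threads report Yes.
# Usage: mysql -uroot -p... -e 'show slave status\G' | slave_ok
slave_ok() {
    awk -F': *' '
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }'
}

printf '%s\n' '  Slave_IO_Running: Yes' '  Slave_SQL_Running: Yes' |
    slave_ok && echo "replication threads are running"
# prints: replication threads are running
```

The trailing $ in the patterns keeps similarly named fields such as Slave_SQL_Running_State from being picked up.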

Test replication

Run the following on the master:

[root@master ~]# mysql -uroot -p123456 -e "use db1; select count(*) from db"
+----------+
| count(*) |
+----------+
|        2 |
+----------+

[root@master ~]# mysql -uroot -p123456 -e "use db1;
truncate table db"
[root@master ~]# mysql -uroot -p123456 -e "use db1;
select count(*) from db"

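+----------+
| count(*) |
+----------+
|        0 |
+----------+

That emptied the db1.db table. Now look at the same table on the slave:

[root@slave ~]# mysql -uroot -pyourpassword -e "use db1; select count(*) from db"
+----------+
| count(*) |
+----------+
|        0 |
+----------+

The table on the slave has been emptied too. If that is not convincing enough, go on and drop the db table: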

[root@master ~]# mysql -uroot -p123456 -e "use db1; drop table db"
[root@slave ~]# mysql -uroot -pyourpassword -e "use db1; select count(*) from db"
ERROR 1146 (42S02) at line 1: Table 'db1.db' doesn't exist

风哥博客
