Pacemaker cluster with two network interfaces on each node

Problem description

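I am trying to build a cluster of 2 nodes, each with 2 network interfaces. The idea is that the cluster should fail over to the other node whenever either of the active node's 2 interfaces goes down (or, naturally, both of them). The problem is that the cluster only fails over when eth1 on the active node goes down. If eth0 on the active node goes down, the cluster never fails over.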

This is the network configuration of the nodes:

node1:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
      inet 192.168.0.3  netmask 255.255.255.248  broadcast 192.168.0.7
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
      inet 172.26.34.2  netmask 255.255.255.248  broadcast 172.26.34.7

node2:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
      inet 192.168.0.4  netmask 255.255.255.248  broadcast 192.168.0.7
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
      inet 172.26.34.3  netmask 255.255.255.248  broadcast 172.26.34.7

These are the commands I used to create the cluster between the nodes and assign the resources:

pcs cluster auth node1 node2 -u hacluster -p 1234 --debug --force 
pcs cluster setup --name HAFirewall node1 node2 --force
pcs cluster start --all
pcs resource create VirtualIP_eth0 ocf:heartbeat:IPaddr2 ip=192.168.0.1 cidr_netmask=29 nic=eth0 op monitor interval=30s --group InterfacesHA
pcs resource create VirtualIP_eth1 ocf:heartbeat:IPaddr2 ip=172.26.34.1 cidr_netmask=29 nic=eth1 op monitor interval=30s --group InterfacesHA
pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore
pcs resource enable InterfacesHA

This is the content of the corosync.conf file:

totem {
    version: 2
    secauth: off
    cluster_name: HAFirewall
    transport: udpu
}

nodelist {
    node {
        ring0_addr: node1
        nodeid: 1
    }

    node {
        ring0_addr: node2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
}

This is the output of the pcs status command:

Cluster name: HAFirewall
Stack: corosync
Current DC: node1 (version 1.1.16-94ff4df) - partition WITHOUT quorum
Last updated: Tue Oct 27 19:01:35 2020
Last change: Tue Oct 27 18:22:27 2020 by hacluster via crmd on node2

2 nodes configured
2 resources configured

Online: [ node1 ]
OFFLINE: [ node2 ]

Full list of resources:

 Resource Group: InterfacesHA
     VirtualIP_eth0 (ocf::heartbeat:IPaddr2):   Started node1
     VirtualIP_eth1 (ocf::heartbeat:IPaddr2):   Started node1

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

This is the output of the crm configure show command:

node 1: node1
node 2: node2
primitive VirtualIP_eth0 IPaddr2 \
    params ip=192.168.0.1 cidr_netmask=29 \
    op start interval=0s timeout=20s \
    op stop interval=0s timeout=20s \
    op monitor interval=30s
primitive VirtualIP_eth1 IPaddr2 \
    params ip=172.26.34.1 cidr_netmask=29 \
    op start interval=0s timeout=20s \
    op stop interval=0s timeout=20s \
    op monitor interval=30s
group InterfacesHA VirtualIP_eth0 VirtualIP_eth1
location cli-prefer-InterfacesHA InterfacesHA role=Started inf: node1
property cib-bootstrap-options: \
    stonith-enabled=false \
    no-quorum-policy=ignore \
    have-watchdog=false \
    dc-version=1.1.16-94ff4df \
    cluster-infrastructure=corosync \
    cluster-name=HAFirewall

These are the interfaces on node1 when it is active and the virtual IPs are up:

eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
 link/ether ac:1f:6b:90:a5:58 brd ff:ff:ff:ff:ff:ff
 inet 192.168.0.3/29 brd 192.168.0.7 scope global eth0
    valid_lft forever preferred_lft forever
 inet 192.168.0.1/29 brd 192.168.0.7 scope global secondary eth0
    valid_lft forever preferred_lft forever
 inet6 fe80::ae1f:6bff:fe90:a558/64 scope link 
    valid_lft forever preferred_lft forever
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
 link/ether ac:1f:6b:90:a5:59 brd ff:ff:ff:ff:ff:ff
 inet 172.26.34.2/29 brd 172.26.34.7 scope global eth1
    valid_lft forever preferred_lft forever
 inet 172.26.34.1/29 brd 172.26.34.7 scope global secondary eth1
    valid_lft forever preferred_lft forever
 inet6 fe80::ae1f:6bff:fe90:a559/64 scope link 
    valid_lft forever preferred_lft forever

Any idea why the cluster fails over correctly when the eth1 interface goes down but not when the eth0 interface goes down?

Regards and thanks.

Tags: high-availability, pacemaker, corosync

Solution

I believe you need to specify both interfaces in corosync.conf:

  interface {
    ringnumber: 0
    bindnetaddr: 192.168.0.4
    ...

  interface {
    ringnumber: 1
    bindnetaddr: 172.26.34.3
    ...
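
Since the posted configuration uses transport: udpu with a nodelist, one way this could look is sketched below. It is only a draft under several assumptions that are not in the original post: that the cluster runs corosync 2.x (where rrp_mode applies), that a ring1_addr entry per node is used instead of the interface blocks above, that literal IP addresses replace the node1/node2 host names, and that 192.168.0.x is chosen as ring 0. The script only writes the draft to a scratch file for review; it does not touch /etc/corosync/corosync.conf.

```shell
#!/bin/sh
# Draft a two-ring corosync.conf in a scratch file (assumes corosync 2.x).
# Nothing here modifies the live cluster configuration.
conf="$(mktemp)"

cat > "$conf" <<'EOF'
totem {
    version: 2
    secauth: off
    cluster_name: HAFirewall
    transport: udpu
    # Assumption: redundant ring protocol in passive mode
    rrp_mode: passive
}

nodelist {
    node {
        # Assumption: ring 0 on the eth0 network, ring 1 on the eth1 network
        ring0_addr: 192.168.0.3
        ring1_addr: 172.26.34.2
        nodeid: 1
    }

    node {
        ring0_addr: 192.168.0.4
        ring1_addr: 172.26.34.3
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
}
EOF

# Show the lines that differ from the single-ring configuration
grep -nE 'ring[01]_addr|rrp_mode' "$conf"
echo "Draft written to $conf"
```

If a configuration like this is deployed on both nodes and corosync is restarted, running corosync-cfgtool -s on each node should report the status of both rings.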
