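high-availability - Pacemaker cluster with two network interfaces on each node
Problem description
I am trying to create a cluster between 2 nodes, each of which has 2 network interfaces. The idea is that the cluster should fail over to the other node when either of the active node's two interfaces goes down (or, logically, both). The problem is that the cluster only fails over when interface eth1 of the active node goes down. If interface eth0 of the active node goes down, the cluster never fails over.
This is the network configuration of the nodes: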
node1:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.3 netmask 255.255.255.248 broadcast 192.168.0.7
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.26.34.2 netmask 255.255.255.248 broadcast 172.26.34.7
node2:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.4 netmask 255.255.255.248 broadcast 192.168.0.7
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.26.34.3 netmask 255.255.255.248 broadcast 172.26.34.7
These are the commands I used to create the cluster between the nodes and assign the resources:
pcs cluster auth node1 node2 -u hacluster -p 1234 --debug --force
pcs cluster setup --name HAFirewall node1 node2 --force
pcs cluster start --all
pcs resource create VirtualIP_eth0 ocf:heartbeat:IPaddr2 ip=192.168.0.1 cidr_netmask=29 nic=eth0 op monitor interval=30s --group InterfacesHA
pcs resource create VirtualIP_eth1 ocf:heartbeat:IPaddr2 ip=172.26.34.1 cidr_netmask=29 nic=eth1 op monitor interval=30s --group InterfacesHA
pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore
pcs resource enable InterfacesHA
This is the configuration in the corosync.conf file:
totem {
    version: 2
    secauth: off
    cluster_name: HAFirewall
    transport: udpu
}
nodelist {
    node {
        ring0_addr: node1
        nodeid: 1
    }
    node {
        ring0_addr: node2
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
    two_node: 1
}
logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
}
This is the output of the pcs status command:
Cluster name: HAFirewall
Stack: corosync
Current DC: node1 (version 1.1.16-94ff4df) - partition WITHOUT quorum
Last updated: Tue Oct 27 19:01:35 2020
Last change: Tue Oct 27 18:22:27 2020 by hacluster via crmd on node2
2 nodes configured
2 resources configured
Online: [ node1 ]
OFFLINE: [ node2 ]
Full list of resources:
Resource Group: InterfacesHA
VirtualIP_eth0 (ocf::heartbeat:IPaddr2): Started node1
VirtualIP_eth1 (ocf::heartbeat:IPaddr2): Started node1
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
This is the output of the crm configure show command:
node 1: node1
node 2: node2
primitive VirtualIP_eth0 IPaddr2 \
    params ip=192.168.0.1 cidr_netmask=29 \
    op start interval=0s timeout=20s \
    op stop interval=0s timeout=20s \
    op monitor interval=30s
primitive VirtualIP_eth1 IPaddr2 \
    params ip=172.26.34.1 cidr_netmask=29 \
    op start interval=0s timeout=20s \
    op stop interval=0s timeout=20s \
    op monitor interval=30s
group InterfacesHA VirtualIP_eth0 VirtualIP_eth1
location cli-prefer-InterfacesHA InterfacesHA role=Started inf: node1
property cib-bootstrap-options: \
    stonith-enabled=false \
    no-quorum-policy=ignore \
    have-watchdog=false \
    dc-version=1.1.16-94ff4df \
    cluster-infrastructure=corosync \
    cluster-name=HAFirewall
These are the interfaces on node1 while it is the active node and the virtual IPs are up:
eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether ac:1f:6b:90:a5:58 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.3/29 brd 192.168.0.7 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.0.1/29 brd 192.168.0.7 scope global secondary eth0
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:fe90:a558/64 scope link
valid_lft forever preferred_lft forever
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether ac:1f:6b:90:a5:59 brd ff:ff:ff:ff:ff:ff
inet 172.26.34.2/29 brd 172.26.34.7 scope global eth1
valid_lft forever preferred_lft forever
inet 172.26.34.1/29 brd 172.26.34.7 scope global secondary eth1
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:fe90:a559/64 scope link
valid_lft forever preferred_lft forever
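Any idea why the cluster works correctly when the eth1 interface goes down but not when the eth0 interface goes down?
Regards and thanks.
Solution
I believe you need to specify both interfaces in corosync.conf: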
interface {
ringnumber: 0
bindnetaddr: 192.168.0.4
...
interface {
ringnumber: 1
bindnetaddr: 172.26.34.3
...
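Because the posted corosync.conf uses transport: udpu with a nodelist, the same idea can also be expressed there by giving each node a second ring address. The fragment below is only a sketch, not a tested configuration. It assumes Corosync 2.x (which fits the Pacemaker 1.1.16 shown in pcs status), and it assumes that the hostnames node1 and node2 resolve to the 172.26.34.x addresses on eth1, which would be consistent with failover happening only when eth1 goes down. If they resolve to the 192.168.0.x addresses instead, the ring1_addr values would need to be the eth1 addresses. The rrp_mode value of passive is also a choice made for illustration.

```
totem {
    version: 2
    secauth: off
    cluster_name: HAFirewall
    transport: udpu
    # Added: enable the redundant ring protocol so that both rings are used
    rrp_mode: passive
}
nodelist {
    node {
        # Unchanged; assumed to resolve to 172.26.34.2 (eth1)
        ring0_addr: node1
        # Added: node1's address on the other network (eth0)
        ring1_addr: 192.168.0.3
        nodeid: 1
    }
    node {
        # Unchanged; assumed to resolve to 172.26.34.3 (eth1)
        ring0_addr: node2
        # Added: node2's address on the other network (eth0)
        ring1_addr: 192.168.0.4
        nodeid: 2
    }
}
```

The file has to be identical on both nodes, and corosync has to be restarted for the change to take effect. Afterwards, running corosync-cfgtool -s on each node should list the status of both rings. Note that a second ring makes cluster communication survive the loss of one link; whether the InterfacesHA group then moves still depends on the resource monitors and constraints.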