Artemis Replication in Kubernetes

Problem Description

I'm deploying a simple master/slave replication configuration in a Kubernetes environment, using static connectors. When I delete the master pod, the slave successfully takes over its duties, but when the master pod comes back, the slave pod does not shut down as live, so I end up with two live servers. When this happens I also notice they form an internal bridge. I ran the exact same configuration locally, outside of Kubernetes, and the slave shut down successfully and reverted to being a slave when the master came back. Any ideas why this is happening? I'm using Artemis version 2.6.4.

Master broker.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration xmlns="urn:activemq" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">
  <jms xmlns="urn:activemq:jms">
    <queue name="jms.queue.acmadapter_to_acm_design">
      <durable>true</durable>
    </queue>
  </jms>
  <core xmlns="urn:activemq:core" xsi:schemaLocation="urn:activemq:core ">
    <acceptors>
      <acceptor name="netty-acceptor">tcp://0.0.0.0:61618</acceptor>
    </acceptors>
    <connectors>
      <connector name="netty-connector-master">tcp://artemis-service-0.artemis-service.falconx.svc.cluster.local:61618</connector>
      <connector name="netty-connector-backup">tcp://artemis-service2-0.artemis-service.falconx.svc.cluster.local:61618</connector>
    </connectors>
    <ha-policy>
      <replication>
        <master>
           <!--we need this for auto failback-->
           <check-for-live-server>true</check-for-live-server>
        </master>
      </replication>
    </ha-policy>
    <cluster-connections>
      <cluster-connection name="my-cluster">
          <connector-ref>netty-connector-master</connector-ref>
          <static-connectors>
            <connector-ref>netty-connector-master</connector-ref>
            <connector-ref>netty-connector-backup</connector-ref>
          </static-connectors>
      </cluster-connection>
    </cluster-connections>
  </core>
</configuration>

Slave broker.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration xmlns="urn:activemq" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">
  <core xmlns="urn:activemq:core" xsi:schemaLocation="urn:activemq:core ">
    <acceptors>
      <acceptor name="netty-acceptor">tcp://0.0.0.0:61618</acceptor>
    </acceptors>
    <connectors>
      <connector name="netty-connector-backup">tcp://artemis-service2-0.artemis-service.test.svc.cluster.local:61618</connector>
      <connector name="netty-connector-master">tcp://artemis-service-0.artemis-service.test.svc.cluster.local:61618</connector>
    </connectors>
    <ha-policy>
      <replication>
          <slave>
            <allow-failback>true</allow-failback>
            <!-- not needed but tells the backup not to restart after failback as there will be > 0 backups saved -->
            <max-saved-replicated-journals-size>0</max-saved-replicated-journals-size>
          </slave>
      </replication>
    </ha-policy>
    <cluster-connections>
      <cluster-connection name="my-cluster">
          <connector-ref>netty-connector-backup</connector-ref>
          <static-connectors>
            <connector-ref>netty-connector-master</connector-ref>
            <connector-ref>netty-connector-backup</connector-ref>
          </static-connectors>
      </cluster-connection>
    </cluster-connections>
  </core>
</configuration>

Tags: kubernetes, activemq-artemis

Solution


The master's journal is most likely being lost when it is stopped. The journal (specifically the server.lock file) holds the node's unique identifier (which is shared with the replicating slave). If the journal is lost when the pod is deleted, then when the pod comes back it cannot pair with its slave, which would explain the behavior you're observing. Make sure the journal is kept on a persistent volume claim.
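One way to keep the journal across pod restarts is to run the broker as a StatefulSet with a `volumeClaimTemplate` and mount the claim at the broker's data directory. A minimal sketch, assuming the image keeps its journal under `/var/lib/artemis/data` (the names, image tag, and path here are illustrative, not taken from the question):

```yaml
# Hypothetical StatefulSet fragment: persist the Artemis data directory
# (journal, bindings, paging, large messages) on a PVC so server.lock and
# the node ID survive pod deletion. Names and paths are examples only.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: artemis-service
spec:
  serviceName: artemis-service
  replicas: 1
  selector:
    matchLabels:
      app: artemis-master
  template:
    metadata:
      labels:
        app: artemis-master
    spec:
      containers:
        - name: artemis
          image: apache/activemq-artemis:2.6.4   # example image reference
          volumeMounts:
            - name: artemis-data
              mountPath: /var/lib/artemis/data   # broker data dir (example path)
  volumeClaimTemplates:
    - metadata:
        name: artemis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

With a claim template like this, deleting the pod leaves the PVC (and thus the journal and node ID) in place, and the recreated pod remounts the same volume.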

Also, it's worth noting that a single master/slave pair is not recommended due to the risk of split brain. In general, you should have three master/slave pairs so that a proper quorum can be established.
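If you do move to multiple master/slave pairs, each backup needs to know which live server to replicate from; Artemis expresses this with `<group-name>` inside the replication policy. A minimal sketch of one pair's settings (the group name `pair-1` is an example, not from the question):

```xml
<!-- Master of pair 1: only backups declaring the same group-name
     will replicate from this live server. -->
<ha-policy>
  <replication>
    <master>
      <group-name>pair-1</group-name>
      <check-for-live-server>true</check-for-live-server>
    </master>
  </replication>
</ha-policy>

<!-- Matching slave of pair 1 -->
<ha-policy>
  <replication>
    <slave>
      <group-name>pair-1</group-name>
      <allow-failback>true</allow-failback>
    </slave>
  </replication>
</ha-policy>
```

Repeat this pattern with distinct group names (e.g. `pair-2`, `pair-3`) for each additional pair in the cluster.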
