首页 > 解决方案 > ActiveMQ Artemis:javax.jms.ConnectionFactory 在故障转移的情况下不接收集群拓扑

问题描述

在我的 Windows 机器上,我设置了一个由 2 个代理(版本 2.19.0)组成的本地测试集群:1 个主机,1 个从机。ha-policy 是复制和集群通信通过 JGroups。

大师broker.xml

<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

      <name>broker1</name>

      <persistence-enabled>true</persistence-enabled>
      
      <connectors>
         <!-- Connector used to be announced through cluster connections and notifications -->
         <connector name="netty-connector">tcp://127.0.0.2:61616</connector>
      </connectors> 


      <graceful-shutdown-enabled>true</graceful-shutdown-enabled>


      <acceptors>
         <!-- Acceptor for every supported protocol -->
         <acceptor name="artemis">tcp://127.0.0.2:61616</acceptor>

         <!-- STOMP Acceptor. -->
         <acceptor name="stomp">tcp://127.0.0.2:61613?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=STOMP;useEpoll=true;anycastPrefix=/queue/;multicastPrefix=/topic/</acceptor>

      </acceptors>


      <cluster-user>user</cluster-user>
      <cluster-password>pw</cluster-password>
      
      <ha-policy>
         <replication>
            <master>
               <check-for-live-server>true</check-for-live-server>
            </master>
         </replication>   
      </ha-policy>

      <broadcast-groups>
         <broadcast-group name="bg-group1">
           <jgroups-file>test-jgroups-jdbc_ping.xml</jgroups-file>
            <jgroups-channel>active_broadcast_channel</jgroups-channel>
            <connector-ref>netty-connector</connector-ref>
         </broadcast-group>
      </broadcast-groups>

      <discovery-groups>
         <discovery-group name="dg-group1">
            <jgroups-file>test-jgroups-jdbc_ping.xml</jgroups-file>
            <jgroups-channel>active_broadcast_channel</jgroups-channel>
            <refresh-timeout>10000</refresh-timeout>
         </discovery-group>
      </discovery-groups>

      <cluster-connections>
         <cluster-connection name="my-cluster">
            <connector-ref>netty-connector</connector-ref>
            <use-duplicate-detection>true</use-duplicate-detection>
            <message-load-balancing>ON_DEMAND</message-load-balancing>
            <max-hops>1</max-hops>
            <discovery-group-ref discovery-group-name="dg-group1"/>
         </cluster-connection>
      </cluster-connections>

       [...]
   </core>
</configuration>

奴隶broker.xml

<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

      <name>broker2</name>

      <persistence-enabled>true</persistence-enabled>      
      
      <connectors>
         <!-- Connector used to be announced through cluster connections and notifications -->
         <connector name="netty-connector">tcp://127.0.0.3:61616</connector>
         <connector name="master1-netty-connector">tcp://127.0.0.2:61616</connector>
      </connectors>

      <graceful-shutdown-enabled>true</graceful-shutdown-enabled>

      <acceptors>
          <!-- Acceptor for every supported protocol -->
         <acceptor name="artemis">tcp://127.0.0.3:61616</acceptor>

         <!-- STOMP Acceptor. -->
         <acceptor name="stomp">tcp://127.0.0.3:61613?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=STOMP;useEpoll=true;anycastPrefix=/queue/;multicastPrefix=/topic/</acceptor>
      </acceptors>


      <cluster-user>user</cluster-user>
      <cluster-password>pw</cluster-password>
      
      <!-- failover config -->
      <ha-policy>
         <replication>
            <slave>
               <allow-failback>true</allow-failback>
            </slave>
         </replication>
      </ha-policy>

      <broadcast-groups>
         <broadcast-group name="bg-group1">
           <jgroups-file>test-jgroups-jdbc_ping.xml</jgroups-file>
            <jgroups-channel>active_broadcast_channel</jgroups-channel>
            <connector-ref>netty-connector</connector-ref>
         </broadcast-group>
      </broadcast-groups>

      <discovery-groups>
         <discovery-group name="dg-group1">
            <jgroups-file>test-jgroups-jdbc_ping.xml</jgroups-file>
            <jgroups-channel>active_broadcast_channel</jgroups-channel>
            <refresh-timeout>10000</refresh-timeout>
         </discovery-group>
      </discovery-groups>

      <cluster-connections>
         <cluster-connection name="my-cluster">
            <connector-ref>netty-connector</connector-ref>
            <use-duplicate-detection>true</use-duplicate-detection>
            <message-load-balancing>ON_DEMAND</message-load-balancing>
            <max-hops>1</max-hops>
            <discovery-group-ref discovery-group-name="dg-group1"/>
         </cluster-connection>
      </cluster-connections>
     [...]

   </core>
</configuration>

我省略了默认的配置行,这样就不会太长了。为了重现示例,只需替换broker.xml.

我还在运行一个 Spring Boot 应用程序(2.3.8),它拥有一个javax.jms.ConnectionFactory和一些连接到 Artemis 服务器集群的消费者/生产者。

对于初始连接,应用程序连接到主代理(一台主机)。

import javax.jms.ConnectionFactory;
import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

[...]

    @Bean
    public ConnectionFactory filterConnectionFactory(ArtemisConfig artemisConfig) {
           return new ActiveMQConnectionFactory(
                        "tcp://127.0.0.2:61616?ha=true&blockOnDurableSend=false",
                        artemisConfig.getUserName(), artemisConfig.getPassword());
}

我对故障转移的期望:

现实:现在还没有。因此,如果发生故障转移,应用程序将无法恢复连接。

我猜代理配置本身应该没问题,因为在故障转移的情况下,主服务器会将所有地址复制到从服务器。
故障回复也在工作。

我的设置有什么问题吗?有没有办法记录拓扑的接收?

test-jgroups-jdbc_ping.xml

<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="urn:org:jgroups"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <TCP recv_buf_size="${tcp.recv_buf_size:5M}"
         send_buf_size="${tcp.send_buf_size:5M}"
         max_bundle_size="64K"
         max_bundle_timeout="30"
         sock_conn_timeout="300"

         timer_type="new3"
         timer.min_threads="4"
         timer.max_threads="10"
         timer.keep_alive_time="3000"
         timer.queue_max_size="500"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="5000"
         thread_pool.queue_enabled="true"
         thread_pool.queue_max_size="10000"
         thread_pool.rejection_policy="discard"

         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="1"
         oob_thread_pool.max_threads="8"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="100"
         oob_thread_pool.rejection_policy="discard"/>
         
   <!--  <TRACE/> -->
    
    <!-- <SSL_KEY_EXCHANGE
        keystore_name="./activemq.example.keystore"
        keystore_password="activemqexample"
    /> -->

    <JDBC_PING connection_url="jdbc:postgresql://127.0.0.1:5432/test" connection_username="test" connection_password="test" connection_driver="org.postgresql.Driver" initialize_sql="CREATE TABLE IF NOT EXISTS JGROUPSPING (own_addr varchar(200),bind_addr varchar(200),created timestamp DEFAULT CURRENT_TIMESTAMP,cluster_name varchar(200),ping_data BYTEA,constraint PK_JGROUPSPING PRIMARY KEY (own_addr, cluster_name))"/>
    
    <MERGE3  min_interval="10000"
             max_interval="30000"/>
    <FD_SOCK/>
    <FD timeout="3000" max_tries="3" />
    <VERIFY_SUSPECT timeout="1500"  />
    <BARRIER />
    <pbcast.NAKACK2 use_mcast_xmit="false"
                    discard_delivered_msgs="true"/>
    <UNICAST3 />
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                   max_bytes="4M"/>
    <pbcast.GMS print_local_addr="true" join_timeout="2000"
                view_bundling="true"/>
    <MFC max_credits="2M"
         min_threshold="0.4"/>
    <FRAG2 frag_size="60K"  />
    <!--RSVP resend_interval="2000" timeout="10000"/-->
    <pbcast.STATE_TRANSFER/>
</config>

主日志:

2021-11-10 16:58:53,209 INFO  [org.apache.activemq.artemis.integration.bootstrap] AMQ101000: Starting ActiveMQ Artemis Server
2021-11-10 16:58:53,330 INFO  [org.apache.activemq.artemis.core.server] AMQ221000: live Message Broker is starting with configuration Broker Configuration (clustered=true,journalDirectory=data/journal,bindingsDirectory=data/bindings,largeMessagesDirectory=data/large-messages,pagingDirectory=data/paging)
2021-11-10 16:59:15,465 INFO  [org.apache.activemq.artemis.core.server] AMQ221013: Using NIO Journal
2021-11-10 16:59:15,528 INFO  [org.apache.activemq.artemis.core.server] AMQ221057: Global Max Size is being adjusted to 1/2 of the JVM max size (-Xmx). being defined as 536.870.912
2021-11-10 16:59:20,988 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-server]. Adding protocol support for: CORE
2021-11-10 16:59:20,990 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-amqp-protocol]. Adding protocol support for: AMQP
2021-11-10 16:59:20,990 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-hornetq-protocol]. Adding protocol support for: HORNETQ
2021-11-10 16:59:20,991 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-mqtt-protocol]. Adding protocol support for: MQTT
2021-11-10 16:59:20,992 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-openwire-protocol]. Adding protocol support for: OPENWIRE
2021-11-10 16:59:20,992 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-stomp-protocol]. Adding protocol support for: STOMP
2021-11-10 16:59:22,317 INFO  [org.apache.activemq.artemis.core.server] AMQ221080: Deploying address DLQ supporting [ANYCAST]
2021-11-10 16:59:22,328 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying ANYCAST queue DLQ on address DLQ
2021-11-10 16:59:22,393 INFO  [org.apache.activemq.artemis.core.server] AMQ221080: Deploying address ExpiryQueue supporting [ANYCAST]
2021-11-10 16:59:22,394 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying ANYCAST queue ExpiryQueue on address ExpiryQueue
2021-11-10 16:59:24,953 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started NIO Acceptor at 127.0.0.2:61616 for protocols [CORE,MQTT,AMQP,HORNETQ,STOMP,OPENWIRE]
2021-11-10 16:59:24,972 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started NIO Acceptor at 127.0.0.2:61613 for protocols [STOMP]
2021-11-10 16:59:24,973 INFO  [org.apache.activemq.artemis.core.server] AMQ221007: Server is now live
2021-11-10 16:59:24,973 INFO  [org.apache.activemq.artemis.core.server] AMQ221001: Apache ActiveMQ Artemis Message Broker version 2.19.0 [broker1, nodeID=1a4624d9-423f-11ec-b875-40167e37963a] 
2021-11-10 16:59:25,776 INFO  [org.apache.activemq.hawtio.branding.PluginContextListener] Initialized activemq-branding plugin
2021-11-10 16:59:26,092 INFO  [org.apache.activemq.hawtio.plugin.PluginContextListener] Initialized artemis-plugin plugin
2021-11-10 16:59:26,267 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\master1\data\journal\activemq-data-2.amq (size=10.485.760) to replica.
2021-11-10 16:59:27,597 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\master1\data\bindings\activemq-bindings-3.bindings (size=1.048.576) to replica.
2021-11-10 16:59:27,628 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\master1\data\bindings\activemq-bindings-2.bindings (size=1.048.576) to replica.
2021-11-10 16:59:28,338 INFO  [io.hawt.HawtioContextListener] Initialising hawtio services
2021-11-10 16:59:28,355 INFO  [io.hawt.system.ConfigManager] Configuration will be discovered via system properties
2021-11-10 16:59:28,358 INFO  [io.hawt.jmx.JmxTreeWatcher] Welcome to Hawtio 2.14.0
2021-11-10 16:59:28,365 INFO  [io.hawt.web.auth.AuthenticationConfiguration] Starting hawtio authentication filter, JAAS realm: "activemq" authorized role(s): "amq" role principal classes: "org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"
2021-11-10 16:59:28,376 INFO  [io.hawt.web.auth.LoginRedirectFilter] Hawtio loginRedirectFilter is using 1800 sec. HttpSession timeout
2021-11-10 16:59:28,394 INFO  [io.hawt.web.proxy.ProxyServlet] Proxy servlet is disabled
2021-11-10 16:59:28,403 INFO  [io.hawt.web.servlets.JolokiaConfiguredAgentServlet] Jolokia overridden property: [key=policyLocation, value=file:/D:/apache-artemis-2.19.0-bin/apache-artemis-2.19.0/bin/master1/etc/\jolokia-access.xml]
2021-11-10 16:59:29,401 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://127.0.0.2:8261
2021-11-10 16:59:29,401 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API available at http://127.0.0.2:8261/console/jolokia
2021-11-10 16:59:29,403 INFO  [org.apache.activemq.artemis] AMQ241004: Artemis Console available at http://127.0.0.2:8261/console

从日志:

2021-11-10 16:58:55,465 INFO  [org.apache.activemq.artemis.integration.bootstrap] AMQ101000: Starting ActiveMQ Artemis Server
2021-11-10 16:58:55,597 INFO  [org.apache.activemq.artemis.core.server] AMQ221000: backup Message Broker is starting with configuration Broker Configuration (clustered=true,journalDirectory=data/journal,bindingsDirectory=data/bindings,largeMessagesDirectory=data/large-messages,pagingDirectory=data/paging)
2021-11-10 16:58:55,625 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\journal to D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\journal\oldreplica.1
2021-11-10 16:58:55,631 INFO  [org.apache.activemq.artemis.core.server] AMQ221055: There were too many old replicated folders upon startup, removing D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\paging\oldreplica.22
2021-11-10 16:58:55,721 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\paging to D:\apache-artemis-2.19.0-bin\apache-artemis-2.19.0\bin\backup1\data\paging\oldreplica.24
2021-11-10 16:58:55,847 INFO  [org.apache.activemq.artemis.core.server] AMQ221013: Using NIO Journal
2021-11-10 16:58:55,955 INFO  [org.apache.activemq.artemis.core.server] AMQ221057: Global Max Size is being adjusted to 1/2 of the JVM max size (-Xmx). being defined as 536.870.912
2021-11-10 16:58:56,419 INFO  [org.apache.activemq.hawtio.branding.PluginContextListener] Initialized activemq-branding plugin
2021-11-10 16:58:56,683 INFO  [org.apache.activemq.hawtio.plugin.PluginContextListener] Initialized artemis-plugin plugin
2021-11-10 16:58:58,188 INFO  [io.hawt.HawtioContextListener] Initialising hawtio services
2021-11-10 16:58:58,210 INFO  [io.hawt.system.ConfigManager] Configuration will be discovered via system properties
2021-11-10 16:58:58,214 INFO  [io.hawt.jmx.JmxTreeWatcher] Welcome to Hawtio 2.14.0
2021-11-10 16:58:58,223 INFO  [io.hawt.web.auth.AuthenticationConfiguration] Starting hawtio authentication filter, JAAS realm: "activemq" authorized role(s): "amq" role principal classes: "org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"
2021-11-10 16:58:58,234 INFO  [io.hawt.web.auth.LoginRedirectFilter] Hawtio loginRedirectFilter is using 1800 sec. HttpSession timeout
2021-11-10 16:58:58,258 INFO  [io.hawt.web.proxy.ProxyServlet] Proxy servlet is disabled
2021-11-10 16:58:58,273 INFO  [io.hawt.web.servlets.JolokiaConfiguredAgentServlet] Jolokia overridden property: [key=policyLocation, value=file:/D:/apache-artemis-2.19.0-bin/apache-artemis-2.19.0/bin/backup1/etc/\jolokia-access.xml]
2021-11-10 16:58:59,913 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://127.0.0.4:8261
2021-11-10 16:58:59,913 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API available at http://127.0.0.4:8261/console/jolokia
2021-11-10 16:58:59,915 INFO  [org.apache.activemq.artemis] AMQ241004: Artemis Console available at http://127.0.0.4:8261/console
2021-11-10 16:59:06,946 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-server]. Adding protocol support for: CORE
2021-11-10 16:59:06,947 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-amqp-protocol]. Adding protocol support for: AMQP
2021-11-10 16:59:06,948 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-hornetq-protocol]. Adding protocol support for: HORNETQ
2021-11-10 16:59:06,948 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-mqtt-protocol]. Adding protocol support for: MQTT
2021-11-10 16:59:06,949 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-openwire-protocol]. Adding protocol support for: OPENWIRE
2021-11-10 16:59:06,949 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-stomp-protocol]. Adding protocol support for: STOMP
2021-11-10 16:59:20,433 WARNING [org.jgroups.protocols.TCP] JGRP000032: D114336-63392: no physical address for 53866421-fef5-9b4b-fdcb-ed1196697da8, dropping message
2021-11-10 16:59:25,161 INFO  [org.apache.activemq.artemis.core.server] AMQ221109: Apache ActiveMQ Artemis Backup Server version 2.19.0 [null] started, waiting live to fail before it gets active
2021-11-10 16:59:28,533 INFO  [org.apache.activemq.artemis.core.server] AMQ221024: Backup server ActiveMQServerImpl::name=broker2 is synchronized with live server, nodeID=1a4624d9-423f-11ec-b875-40167e37963a.
2021-11-10 16:59:28,679 INFO  [org.apache.activemq.artemis.core.server] AMQ221031: backup announced

需要明确的是,我正在使用 JGroups,因为这应该稍后在 k8s 环境中使用,我想对其进行测试。我首先尝试了标准的 UDP 配置,但同样,我没有收到拓扑。所以我的错误一定是在别的地方。

标签: javaspring-bootactivemq-artemis

解决方案


我相信问题是你需要reconnectAttempts>0在你的 URL 中(例如reconnectAttempts=25)。reconnectAttempts的默认值为0

此外,如果您在连接时没有收到拓扑更新,您会看到如下ActiveMQConnectionTimedOutException消息:

Timed out waiting to receive cluster topology.

我还建议您在 URL 中指定备份的主机和端口以及主服务器的主机和端口。这样,如果应用程序启动时主服务器关闭,它仍然能够连接到备份服务器。由于您只有主服务器的主机和端口,因此只要主服务器关闭并且应用程序启动,或者即使应用程序只是重新创建ActiveMQConnectionFactory它,它也会失败。这是一个简单的例子:

@Bean
public ConnectionFactory filterConnectionFactory(ArtemisConfig artemisConfig) {
    return new ActiveMQConnectionFactory(
                    "(tcp://127.0.0.2:61616,tcp://127.0.0.3:61616)?ha=true&reconnectAttempts=25&blockOnDurableSend=false",
                    artemisConfig.getUserName(), artemisConfig.getPassword());
}

推荐阅读