Session 0x0 for server null when starting Atlas

Problem description

I just installed Atlas on HDP 2.6.3, and the Atlas server fails at startup with the following error:

/var/log/atlas/application.log

2019-12-17 23:41:30,446 INFO  - [main-SendThread(1:2181):] ~ Opening socket connection to server 1/0.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (ClientCnxn:1019)
2019-12-17 23:41:30,447 ERROR - [main-SendThread(1:2181):] ~ Unable to open socket to 1/0.0.0.1:2181 (ClientCnxnSocketNIO:289)
2019-12-17 23:41:30,447 WARN  - [main-SendThread(1:2181):] ~ Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn:1146)
java.net.SocketException: Invalid argument
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Net.java:454)
    at sun.nio.ch.Net.connect(Net.java:446)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
    at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
    at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
    at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1011)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1047)
2019-12-17 23:41:30,548 WARN  - [main:] ~ Possibly transient ZooKeeper, quorum=1:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase-unsecure/hbaseid (RecoverableZooKeeper:272)
2019-12-17 23:41:31,548 INFO  - [main-SendThread(1:2181):] ~ Opening socket connection to server 1/0.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (ClientCnxn:1019)
2019-12-17 23:41:31,549 ERROR - [main-SendThread(1:2181):] ~ Unable to open socket to 1/0.0.0.1:2181 (ClientCnxnSocketNIO:289)
2019-12-17 23:41:31,549 WARN  - [main-SendThread(1:2181):] ~ Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn:1146)
java.net.SocketException: Invalid argument
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Net.java:454)
    at sun.nio.ch.Net.connect(Net.java:446)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
    at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
    at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
    at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1011)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1047)
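
The address `1/0.0.0.1:2181` in the log above suggests that the client was handed the literal string `1` as the ZooKeeper host. A bare integer is a legal, if obscure, way to write an IPv4 address, so it is read as the 32-bit value 1, that is 0.0.0.1, which cannot be connected to. The following is a minimal Python sketch of that interpretation; the function name is hypothetical, and the assumption that the Java client parses the host the same way is based only on what the log shows.

```python
import ipaddress


def numeric_host_to_ipv4(host: str) -> str:
    """Interpret an all-digit host string as a 32-bit IPv4 address."""
    if not host.isdigit():
        raise ValueError(f"{host!r} is not a bare integer host")
    # IPv4Address(int) treats the integer as the raw 32-bit address value.
    return str(ipaddress.IPv4Address(int(host)))


print(numeric_host_to_ipv4("1"))  # prints 0.0.0.1
```

Under this reading, the "Invalid argument" socket error is a symptom of the address, not of ZooKeeper being down.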

My ZooKeeper, Kafka, and Solr are running fine. Here is a troubleshooting command I tried, with its output:

# netstat -plant | grep 2181
tcp        0      0 127.0.0.1:38388         127.0.0.1:2181          ESTABLISHED 67615/java
tcp6       0      0 :::2181                 :::*                    LISTEN      59604/java
tcp6       0      0 127.0.0.1:39600         127.0.0.1:2181          ESTABLISHED 60859/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:42112         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:42304         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:38380         127.0.0.1:2181          ESTABLISHED 9159/java
tcp6       0      0 127.0.0.1:42116         127.0.0.1:2181          ESTABLISHED 61237/java
tcp6       0      0 127.0.0.1:38398         127.0.0.1:2181          ESTABLISHED 1361/java
tcp6       0      0 127.0.0.1:38400         127.0.0.1:2181          ESTABLISHED 9159/java
tcp6       0      0 127.0.0.1:38354         127.0.0.1:2181          ESTABLISHED 1051/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38804         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38388         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:38390         127.0.0.1:2181          ESTABLISHED 1051/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:42116         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38384         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38400         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38354         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38398         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:38804         127.0.0.1:2181          ESTABLISHED 59825/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38390         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:39600         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38394         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:38394         127.0.0.1:2181          ESTABLISHED 1051/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38380         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:38384         127.0.0.1:2181          ESTABLISHED 1051/java
tcp6       0      0 127.0.0.1:42112         127.0.0.1:2181          ESTABLISHED 61237/java
tcp6       0      0 127.0.0.1:38358         127.0.0.1:2181          ESTABLISHED 1361/java
tcp6       0      0 127.0.0.1:2181          127.0.0.1:38358         ESTABLISHED 59604/java
tcp6       0      0 127.0.0.1:42304         127.0.0.1:2181          ESTABLISHED 61237/java
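
The netstat output above shows a single listener on `:::2181` and many established loopback connections, which indicates that ZooKeeper itself is reachable on this host. A small Python sketch for summarizing output of this shape follows; the function name and the sample lines are illustrative, and the column layout is assumed to match `netstat -plant` as shown above.

```python
def summarize_netstat(lines, port=2181):
    """Return (listen_addresses, client_connection_count) for the given port."""
    listeners, established = [], 0
    suffix = f":{port}"
    for line in lines:
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, remote, state, pid/program
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if state == "LISTEN" and local.endswith(suffix):
            listeners.append(local)
        elif state == "ESTABLISHED" and remote.endswith(suffix):
            # Count only the client side of each connection to the port.
            established += 1
    return listeners, established


sample = [
    "tcp6       0      0 :::2181                 :::*                    LISTEN      59604/java",
    "tcp6       0      0 127.0.0.1:39600         127.0.0.1:2181          ESTABLISHED 60859/java",
    "tcp6       0      0 127.0.0.1:2181          127.0.0.1:39600         ESTABLISHED 59604/java",
]
print(summarize_netstat(sample))  # prints ([':::2181'], 1)
```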

This is a single-node HDP installation; all services run on one host.

How can I fix this so that Atlas starts?

Update 1

Earlier in the log, the connection using demo.myserver.local:2181 works fine:

2019-12-18 00:33:34,699 INFO  - [main:] ~ Client environment:java.io.tmpdir=/tmp (ZooKeeper:100)
2019-12-18 00:33:34,713 INFO  - [main:] ~ Client environment:java.compiler=<NA> (ZooKeeper:100)
2019-12-18 00:33:34,714 INFO  - [main:] ~ Client environment:os.name=Linux (ZooKeeper:100)
2019-12-18 00:33:34,725 INFO  - [main:] ~ Client environment:os.arch=amd64 (ZooKeeper:100)
2019-12-18 00:33:34,726 INFO  - [main:] ~ Client environment:os.version=3.10.0-1062.1.1.el7.x86_64 (ZooKeeper:100)
2019-12-18 00:33:34,727 INFO  - [main:] ~ Client environment:user.name=atlas (ZooKeeper:100)
2019-12-18 00:33:34,727 INFO  - [main:] ~ Client environment:user.home=/home/atlas (ZooKeeper:100)
2019-12-18 00:33:34,735 INFO  - [main:] ~ Client environment:user.dir=/home/atlas (ZooKeeper:100)
2019-12-18 00:33:34,737 INFO  - [main:] ~ Initiating client connection, connectString=demo.myserver.local:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@3fa7df1 (ZooKeeper:438)
2019-12-18 00:33:35,051 INFO  - [main-SendThread(demo.myserver.local:2181):] ~ Opening socket connection to server demo.myserver.local/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (ClientCnxn:1019)
2019-12-18 00:33:35,103 INFO  - [main-SendThread(demo.myserver.local:2181):] ~ Socket connection established, initiating session, client: /127.0.0.1:39586, server: demo.myserver.local/127.0.0.1:2181 (ClientCnxn:864)
2019-12-18 00:33:35,239 INFO  - [main-SendThread(demo.myserver.local:2181):] ~ Session establishment complete on server demo.myserver.local/127.0.0.1:2181, sessionid = 0x16f16396b0a000f, negotiated timeout = 60000 (ClientCnxn:1279)
2019-12-18 00:33:40,047 WARN  - [main:] ~ Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (NativeCodeLoader:62)

==> /var/log/atlas/gc-worker.log.0.current <==
Heap after GC invocations=1 (full 0):
 par new generation   total 552960K, used 31365K [0x0000000080000000, 0x00000000a5800000, 0x00000000a5800000)
  eden space 491520K,   0% used [0x0000000080000000, 0x0000000080000000, 0x000000009e000000)
  from space 61440K,  51% used [0x00000000a1c00000, 0x00000000a3aa1688, 0x00000000a5800000)
  to   space 61440K,   0% used [0x000000009e000000, 0x000000009e000000, 0x00000000a1c00000)
 concurrent mark-sweep generation total 1482752K, used 0K [0x00000000a5800000, 0x0000000100000000, 0x0000000100000000)
 Metaspace       used 22798K, capacity 23030K, committed 23424K, reserved 1069056K
  class space    used 2777K, capacity 2863K, committed 2944K, reserved 1048576K
}

Atlas configuration

cat /etc/atlas/conf/atlas-application.properties
# Generated by Apache Ambari. Wed Dec 18 00:33:04 2019

atlas.audit.hbase.tablename=ATLAS_ENTITY_AUDIT_EVENTS
atlas.audit.hbase.zookeeper.quorum=1
atlas.audit.zookeeper.session.timeout.ms=60000
atlas.auth.policy.file=/usr/hdp/current/atlas-server/conf/policy-store.txt
atlas.authentication.keytab=/etc/security/keytabs/atlas.service.keytab
atlas.authentication.method.file=true
atlas.authentication.method.file.filename=/usr/hdp/current/atlas-server/conf/users-credentials.properties
atlas.authentication.method.kerberos=false
atlas.authentication.method.ldap=false
atlas.authentication.method.ldap.ad.base.dn=
atlas.authentication.method.ldap.ad.bind.dn=
atlas.authentication.method.ldap.ad.bind.password=
atlas.authentication.method.ldap.ad.default.role=ROLE_USER
atlas.authentication.method.ldap.ad.domain=
atlas.authentication.method.ldap.ad.referral=ignore
atlas.authentication.method.ldap.ad.url=
atlas.authentication.method.ldap.ad.user.searchfilter=(sAMAccountName={0})
atlas.authentication.method.ldap.base.dn=
atlas.authentication.method.ldap.bind.dn=
atlas.authentication.method.ldap.bind.password=
atlas.authentication.method.ldap.default.role=ROLE_USER
atlas.authentication.method.ldap.groupRoleAttribute=cn
atlas.authentication.method.ldap.groupSearchBase=
atlas.authentication.method.ldap.groupSearchFilter=
atlas.authentication.method.ldap.referral=ignore
atlas.authentication.method.ldap.type=ldap
atlas.authentication.method.ldap.url=
atlas.authentication.method.ldap.user.searchfilter=
atlas.authentication.method.ldap.userDNpattern=uid=
atlas.authentication.principal=atlas
atlas.authorizer.impl=simple
atlas.cluster.name=myhdp
atlas.enableTLS=false
atlas.graph.index.search.backend=solr5
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=demo.myserver.local:2181/solr
atlas.graph.storage.backend=hbase
atlas.graph.storage.hbase.table=atlas_titan
atlas.graph.storage.hostname=demo.myserver.local
atlas.kafka.bootstrap.servers=demo.myserver.local:6667
atlas.kafka.enable.auto.commit=false
atlas.kafka.hook.group.id=atlas
atlas.kafka.session.timeout.ms=30000
atlas.kafka.zookeeper.connect=demo.myserver.local:2181
atlas.kafka.zookeeper.connection.timeout.ms=30000
atlas.kafka.zookeeper.session.timeout.ms=60000
atlas.kafka.zookeeper.sync.time.ms=20
atlas.lineage.schema.query.hive_table=hive_table where __guid='%s'\, columns
atlas.lineage.schema.query.Table=Table where __guid='%s'\, columns
atlas.notification.create.topics=true
atlas.notification.embedded=false
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.proxyusers=knox
atlas.rest.address=http://demo.myserver.local:21000
atlas.server.address.id1=demo.myserver.local:21000
atlas.server.bind.address=demo.myserver.local
atlas.server.ha.enabled=false
atlas.server.http.port=21000
atlas.server.https.port=21443
atlas.server.ids=id1
atlas.solr.kerberos.enable=false
atlas.ssl.exclude.protocols=TLSv1.2
atlas.sso.knox.browser.useragent=
atlas.sso.knox.enabled=false
atlas.sso.knox.providerurl=
atlas.sso.knox.publicKey=
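
In the configuration above, `atlas.audit.hbase.zookeeper.quorum` is set to `1`, which matches the `quorum=1:2181` reported in the log, while every other ZooKeeper-related property points at `demo.myserver.local`. The Python sketch below flags values of that kind. The key filter and the "all-digit host" rule are assumptions made for illustration and are not part of Atlas or Ambari; whether correcting this property alone resolves the startup failure is not confirmed here.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def suspicious_zookeeper_hosts(props):
    """Flag ZooKeeper address properties whose host part is all digits."""
    address_suffixes = ("quorum", "connect", "zookeeper-url")  # assumed filter
    flagged = {}
    for key, value in props.items():
        if "zookeeper" not in key or not key.endswith(address_suffixes):
            continue
        for entry in value.split(","):
            # Drop any chroot path ("/solr") and port (":2181") to get the host.
            host = entry.split("/", 1)[0].split(":", 1)[0].strip()
            if host.isdigit():
                flagged[key] = value
    return flagged


sample = """\
# Generated by Apache Ambari.
atlas.audit.hbase.zookeeper.quorum=1
atlas.audit.zookeeper.session.timeout.ms=60000
atlas.graph.index.search.solr.zookeeper-url=demo.myserver.local:2181/solr
atlas.kafka.zookeeper.connect=demo.myserver.local:2181
"""
print(suspicious_zookeeper_hosts(parse_properties(sample)))
# prints {'atlas.audit.hbase.zookeeper.quorum': '1'}
```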

Tags: apache-zookeeper, hortonworks-data-platform, apache-atlas
