apache-zookeeper - Session 0x0 for server null when starting Atlas
Question
I just installed Atlas on HDP 2.6.3, and the Atlas server fails on startup with the following error:
/var/log/atlas/application.log
2019-12-17 23:41:30,446 INFO - [main-SendThread(1:2181):] ~ Opening socket connection to server 1/0.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (ClientCnxn:1019)
2019-12-17 23:41:30,447 ERROR - [main-SendThread(1:2181):] ~ Unable to open socket to 1/0.0.0.1:2181 (ClientCnxnSocketNIO:289)
2019-12-17 23:41:30,447 WARN - [main-SendThread(1:2181):] ~ Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn:1146)
java.net.SocketException: Invalid argument
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1011)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1047)
2019-12-17 23:41:30,548 WARN - [main:] ~ Possibly transient ZooKeeper, quorum=1:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase-unsecure/hbaseid (RecoverableZooKeeper:272)
2019-12-17 23:41:31,548 INFO - [main-SendThread(1:2181):] ~ Opening socket connection to server 1/0.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (ClientCnxn:1019)
2019-12-17 23:41:31,549 ERROR - [main-SendThread(1:2181):] ~ Unable to open socket to 1/0.0.0.1:2181 (ClientCnxnSocketNIO:289)
2019-12-17 23:41:31,549 WARN - [main-SendThread(1:2181):] ~ Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn:1146)
java.net.SocketException: Invalid argument
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1011)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1047)
My ZooKeeper, Kafka, and Solr are running fine. Here is a troubleshooting command I tried, with its output:
# netstat -plant | grep 2181
tcp 0 0 127.0.0.1:38388 127.0.0.1:2181 ESTABLISHED 67615/java
tcp6 0 0 :::2181 :::* LISTEN 59604/java
tcp6 0 0 127.0.0.1:39600 127.0.0.1:2181 ESTABLISHED 60859/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:42112 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:42304 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:38380 127.0.0.1:2181 ESTABLISHED 9159/java
tcp6 0 0 127.0.0.1:42116 127.0.0.1:2181 ESTABLISHED 61237/java
tcp6 0 0 127.0.0.1:38398 127.0.0.1:2181 ESTABLISHED 1361/java
tcp6 0 0 127.0.0.1:38400 127.0.0.1:2181 ESTABLISHED 9159/java
tcp6 0 0 127.0.0.1:38354 127.0.0.1:2181 ESTABLISHED 1051/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38804 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38388 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:38390 127.0.0.1:2181 ESTABLISHED 1051/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:42116 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38384 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38400 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38354 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38398 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:38804 127.0.0.1:2181 ESTABLISHED 59825/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38390 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:39600 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38394 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:38394 127.0.0.1:2181 ESTABLISHED 1051/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38380 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:38384 127.0.0.1:2181 ESTABLISHED 1051/java
tcp6 0 0 127.0.0.1:42112 127.0.0.1:2181 ESTABLISHED 61237/java
tcp6 0 0 127.0.0.1:38358 127.0.0.1:2181 ESTABLISHED 1361/java
tcp6 0 0 127.0.0.1:2181 127.0.0.1:38358 ESTABLISHED 59604/java
tcp6 0 0 127.0.0.1:42304 127.0.0.1:2181 ESTABLISHED 61237/java
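This is a single-node HDP install, with all services on one host.
How can I fix this so that Atlas starts?
Update 1
Earlier in the log, the client connects using demo.myserver.local:2181 without any problem: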
2019-12-18 00:33:34,699 INFO - [main:] ~ Client environment:java.io.tmpdir=/tmp (ZooKeeper:100)
2019-12-18 00:33:34,713 INFO - [main:] ~ Client environment:java.compiler=<NA> (ZooKeeper:100)
2019-12-18 00:33:34,714 INFO - [main:] ~ Client environment:os.name=Linux (ZooKeeper:100)
2019-12-18 00:33:34,725 INFO - [main:] ~ Client environment:os.arch=amd64 (ZooKeeper:100)
2019-12-18 00:33:34,726 INFO - [main:] ~ Client environment:os.version=3.10.0-1062.1.1.el7.x86_64 (ZooKeeper:100)
2019-12-18 00:33:34,727 INFO - [main:] ~ Client environment:user.name=atlas (ZooKeeper:100)
2019-12-18 00:33:34,727 INFO - [main:] ~ Client environment:user.home=/home/atlas (ZooKeeper:100)
2019-12-18 00:33:34,735 INFO - [main:] ~ Client environment:user.dir=/home/atlas (ZooKeeper:100)
2019-12-18 00:33:34,737 INFO - [main:] ~ Initiating client connection, connectString=demo.myserver.local:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@3fa7df1 (ZooKeeper:438)
2019-12-18 00:33:35,051 INFO - [main-SendThread(demo.myserver.local:2181):] ~ Opening socket connection to server demo.myserver.local/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (ClientCnxn:1019)
2019-12-18 00:33:35,103 INFO - [main-SendThread(demo.myserver.local:2181):] ~ Socket connection established, initiating session, client: /127.0.0.1:39586, server: demo.myserver.local/127.0.0.1:2181 (ClientCnxn:864)
2019-12-18 00:33:35,239 INFO - [main-SendThread(demo.myserver.local:2181):] ~ Session establishment complete on server demo.myserver.local/127.0.0.1:2181, sessionid = 0x16f16396b0a000f, negotiated timeout = 60000 (ClientCnxn:1279)
2019-12-18 00:33:40,047 WARN - [main:] ~ Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (NativeCodeLoader:62)
==> /var/log/atlas/gc-worker.log.0.current <==
Heap after GC invocations=1 (full 0):
par new generation total 552960K, used 31365K [0x0000000080000000, 0x00000000a5800000, 0x00000000a5800000)
eden space 491520K, 0% used [0x0000000080000000, 0x0000000080000000, 0x000000009e000000)
from space 61440K, 51% used [0x00000000a1c00000, 0x00000000a3aa1688, 0x00000000a5800000)
to space 61440K, 0% used [0x000000009e000000, 0x000000009e000000, 0x00000000a1c00000)
concurrent mark-sweep generation total 1482752K, used 0K [0x00000000a5800000, 0x0000000100000000, 0x0000000100000000)
Metaspace used 22798K, capacity 23030K, committed 23424K, reserved 1069056K
class space used 2777K, capacity 2863K, committed 2944K, reserved 1048576K
}
Atlas configuration
cat /etc/atlas/conf/atlas-application.properties
# Generated by Apache Ambari. Wed Dec 18 00:33:04 2019
atlas.audit.hbase.tablename=ATLAS_ENTITY_AUDIT_EVENTS
atlas.audit.hbase.zookeeper.quorum=1
atlas.audit.zookeeper.session.timeout.ms=60000
atlas.auth.policy.file=/usr/hdp/current/atlas-server/conf/policy-store.txt
atlas.authentication.keytab=/etc/security/keytabs/atlas.service.keytab
atlas.authentication.method.file=true
atlas.authentication.method.file.filename=/usr/hdp/current/atlas-server/conf/users-credentials.properties
atlas.authentication.method.kerberos=false
atlas.authentication.method.ldap=false
atlas.authentication.method.ldap.ad.base.dn=
atlas.authentication.method.ldap.ad.bind.dn=
atlas.authentication.method.ldap.ad.bind.password=
atlas.authentication.method.ldap.ad.default.role=ROLE_USER
atlas.authentication.method.ldap.ad.domain=
atlas.authentication.method.ldap.ad.referral=ignore
atlas.authentication.method.ldap.ad.url=
atlas.authentication.method.ldap.ad.user.searchfilter=(sAMAccountName={0})
atlas.authentication.method.ldap.base.dn=
atlas.authentication.method.ldap.bind.dn=
atlas.authentication.method.ldap.bind.password=
atlas.authentication.method.ldap.default.role=ROLE_USER
atlas.authentication.method.ldap.groupRoleAttribute=cn
atlas.authentication.method.ldap.groupSearchBase=
atlas.authentication.method.ldap.groupSearchFilter=
atlas.authentication.method.ldap.referral=ignore
atlas.authentication.method.ldap.type=ldap
atlas.authentication.method.ldap.url=
atlas.authentication.method.ldap.user.searchfilter=
atlas.authentication.method.ldap.userDNpattern=uid=
atlas.authentication.principal=atlas
atlas.authorizer.impl=simple
atlas.cluster.name=myhdp
atlas.enableTLS=false
atlas.graph.index.search.backend=solr5
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=demo.myserver.local:2181/solr
atlas.graph.storage.backend=hbase
atlas.graph.storage.hbase.table=atlas_titan
atlas.graph.storage.hostname=demo.myserver.local
atlas.kafka.bootstrap.servers=demo.myserver.local:6667
atlas.kafka.enable.auto.commit=false
atlas.kafka.hook.group.id=atlas
atlas.kafka.session.timeout.ms=30000
atlas.kafka.zookeeper.connect=demo.myserver.local:2181
atlas.kafka.zookeeper.connection.timeout.ms=30000
atlas.kafka.zookeeper.session.timeout.ms=60000
atlas.kafka.zookeeper.sync.time.ms=20
atlas.lineage.schema.query.hive_table=hive_table where __guid='%s'\, columns
atlas.lineage.schema.query.Table=Table where __guid='%s'\, columns
atlas.notification.create.topics=true
atlas.notification.embedded=false
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.proxyusers=knox
atlas.rest.address=http://demo.myserver.local:21000
atlas.server.address.id1=demo.myserver.local:21000
atlas.server.bind.address=demo.myserver.local
atlas.server.ha.enabled=false
atlas.server.http.port=21000
atlas.server.https.port=21443
atlas.server.ids=id1
atlas.solr.kerberos.enable=false
atlas.ssl.exclude.protocols=TLSv1.2
atlas.sso.knox.browser.useragent=
atlas.sso.knox.enabled=false
atlas.sso.knox.providerurl=
atlas.sso.knox.publicKey=
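
The listing above contains several ZooKeeper-related keys, and the failing log lines mention `quorum=1:2181`. As a minimal sketch, the following Python script scans properties text for ZooKeeper host settings whose host part is only digits. The function names (`parse_properties`, `suspicious_zk_hosts`) and the list of key suffixes are hypothetical choices made for this illustration, not part of Atlas or Ambari.

```python
import re

# Key suffixes treated as ZooKeeper host settings. This list is an
# assumption made for illustration, not an official Atlas list.
ZK_KEY_HINTS = ("zookeeper.quorum", "zookeeper.connect", "zookeeper-url")


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def suspicious_zk_hosts(props):
    """Return (key, host) pairs whose host is only digits, e.g. '1'."""
    findings = []
    for key, value in props.items():
        if not key.endswith(ZK_KEY_HINTS):
            continue
        for entry in value.split(","):
            # Drop an optional chroot path and port: host:2181/solr -> host
            host = entry.split("/", 1)[0].split(":", 1)[0].strip()
            if re.fullmatch(r"[0-9]+", host):
                findings.append((key, host))
    return findings


# Three lines copied from the configuration shown above.
sample = """\
atlas.audit.hbase.zookeeper.quorum=1
atlas.graph.index.search.solr.zookeeper-url=demo.myserver.local:2181/solr
atlas.kafka.zookeeper.connect=demo.myserver.local:2181
"""

for key, host in suspicious_zk_hosts(parse_properties(sample)):
    print(f"{key}: host '{host}' is a bare number, not a hostname")
```

To check the real file, the `sample` string could be replaced with the contents of `/etc/atlas/conf/atlas-application.properties`.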
Solution
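
One possible reading of the log, offered as an assumption rather than a confirmed fix: the failing client connects to `1/0.0.0.1:2181`, and the configuration contains `atlas.audit.hbase.zookeeper.quorum=1`. A host string made only of digits is commonly interpreted as a 32-bit IPv4 address, which would explain why `1` shows up as `0.0.0.1`. The sketch below reproduces that interpretation with Python's standard `ipaddress` module; the helper name `bare_number_as_ipv4` is hypothetical.

```python
import ipaddress
import re


def bare_number_as_ipv4(host):
    """Interpret a decimal-only host string as a 32-bit IPv4 address.

    Returns None when the string is not purely decimal digits or does
    not fit in 32 bits.
    """
    if not re.fullmatch(r"[0-9]+", host):
        return None
    value = int(host)
    if value > 0xFFFFFFFF:
        return None
    return str(ipaddress.IPv4Address(value))


print(bare_number_as_ipv4("1"))                    # 0.0.0.1
print(bare_number_as_ipv4("demo.myserver.local"))  # None
```

If this reading is right, a likely remedy is to set `atlas.audit.hbase.zookeeper.quorum` to the ZooKeeper host (`demo.myserver.local` in this setup). Because the file header says it is generated by Apache Ambari, the change would probably need to be made through Ambari rather than by editing the file directly, followed by an Atlas restart.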