Notes on Solving Common Flink Problems

createweb 2020-12-11 10:55

1. Hardlink from files of previous local stored state might cross devices

With state.backend.local-recovery: true enabled, the job kept failing at every checkpoint with:

 Fail to create hard link from xx java.nio.file.FileSystemException xx Invalid cross-device link

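If 'taskmanager.state.local.root-dirs' is not set, local recovery uses 'io.tmp.dirs' as its storage directory. If the state backend's storage directory and this directory are not on the same physical disk, the problem appears, because a hard link cannot span two devices (file systems).

The fix is to point both directories at the same disk.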
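Before changing the configuration, it can help to confirm that the two directories really are on different devices. The following is a minimal sketch, not part of Flink: the helper name same_device and the example paths are assumptions for illustration. It compares the st_dev value that the operating system reports for each path.

```python
import os


def same_device(path_a: str, path_b: str) -> bool:
    """Return True if both paths live on the same device (file system)."""
    # st_dev identifies the device that holds the file. Hard links only
    # work between paths whose st_dev values are equal.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Example call with hypothetical directories; substitute the values of
# io.tmp.dirs (or taskmanager.state.local.root-dirs) and the state
# backend's storage directory from your own setup:
#
#     same_device("/tmp", "/data/flink/state")
```

If the call returns False, a hard link between the two directories is expected to fail with the "Invalid cross-device link" error shown above.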

References

https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#taskmanager-state-local-root-dirs

https://issues.apache.org/jira/browse/FLINK-10954

2. Hadoop environment problem when deploying Flink 1.11.1 in a new environment

java.lang.IllegalStateException: No Executor found. Please make sure to export the HADOOP_CLASSPATH environment variable or have hadoop in your classpath. For more information refer to the "Deployment & Operations" section of the official Apache Flink documentation.
    at org.apache.flink.yarn.cli.FallbackYarnSessionCli.isActive(FallbackYarnSessionCli.java:59)
    at org.apache.flink.client.cli.CliFrontend.validateAndGetActiveCommandLine(CliFrontend.java:1090)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:218)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:916)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992)
    at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992)

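Starting with Flink 1.11, the flink-shaded-hadoop-2-uber jar is no longer officially maintained. A shaded jar from an earlier release can still be used, but the community recommends providing Hadoop through the classpath instead. Run the following line; nothing else needs to change: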

export HADOOP_CLASSPATH=`hadoop classpath`
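When Flink is launched from a script rather than an interactive shell, the same export can be reproduced programmatically. The sketch below is only an illustration under stated assumptions: the helper names hadoop_classpath, env_with_hadoop and run_flink are made up for this example, and it assumes the hadoop command is on the PATH and the script is run from the Flink home directory.

```python
import os
import subprocess
from typing import List, Mapping


def hadoop_classpath() -> str:
    """Return the output of `hadoop classpath` (hadoop must be on PATH)."""
    result = subprocess.run(
        ["hadoop", "classpath"], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def env_with_hadoop(base_env: Mapping[str, str], classpath: str) -> dict:
    """Return a copy of base_env with HADOOP_CLASSPATH set to classpath."""
    env = dict(base_env)
    env["HADOOP_CLASSPATH"] = classpath
    return env


def run_flink(flink_args: List[str]) -> None:
    """Run ./bin/flink with HADOOP_CLASSPATH exported, like the shell line above."""
    env = env_with_hadoop(os.environ, hadoop_classpath())
    subprocess.run(["./bin/flink", *flink_args], env=env, check=True)
```

A call such as run_flink(["run", "path/to/job.jar"]) (the jar path is a placeholder) then behaves like running the export followed by the flink command in one shell session.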

References:

https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/hadoop.html

https://blog.csdn.net/Zsigner/article/details/110949389

3. Flink on YARN: job submission fails after adding flink-shaded-hadoop-2-uber-2.8.3-10.0.jar to the lib directory

java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.mapred.JobConf
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2306)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:94)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:106)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:102)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:450)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:310)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:277)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:833)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:803)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:676)
    at org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:161)
    at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:124)
    at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:93)
    at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:195)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.flink.yarn.YarnClusterClientFactory.getClusterDescriptor(YarnClusterClientFactory.java:77)
    at org.apache.flink.yarn.YarnClusterClientFactory.createClusterDescriptor(YarnClusterClientFactory.java:61)
    at org.apache.flink.yarn.YarnClusterClientFactory.createClusterDescriptor(YarnClusterClientFactory.java:43)
    at org.apache.flink.client.deployment.executors.AbstractJobClusterExecutor.execute(AbstractJobClusterExecutor.java:64)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1812)
    at org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:128)
    at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:76)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1699)

Solution: add export HADOOP_CLASSPATH=`hadoop classpath` at the very beginning of the bin/flink script.

4. Running ./bin/sql-client.sh embedded on Flink 1.11.1 fails

Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.CatalogFactory' in the classpath.

Place the following two jars in Flink's lib/ directory:

flink-connector-hive_2.11-1.11.1.jar and hive-exec-2.3.7.jar (adjust the Hive version to match your environment)
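A quick way to verify that both jars actually ended up in lib/ is to scan the directory for the expected file-name prefixes. This is a minimal sketch; the helper name missing_jars and the example lib path are assumptions for illustration, and matching by prefix is used so that the check does not depend on the exact Hive or Scala version.

```python
import os
from typing import List


def missing_jars(lib_dir: str, required_prefixes: List[str]) -> List[str]:
    """Return the prefixes for which lib_dir contains no matching .jar file."""
    jars = [name for name in os.listdir(lib_dir) if name.endswith(".jar")]
    return [
        prefix
        for prefix in required_prefixes
        if not any(name.startswith(prefix) for name in jars)
    ]


# Example call with a hypothetical Flink home directory:
#
#     missing_jars("/opt/flink/lib", ["flink-connector-hive", "hive-exec"])
```

An empty list means a jar was found for every prefix; any prefix that is returned still needs its jar copied into lib/.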

5. Starting the Flink SQL client with ./bin/sql-client.sh embedded fails because it cannot connect to the Hive metastore; the error is as follows

Exception in thread "main" org.apache.flink.table.client.SqlClientException: Unexpected exception. This is a bug. Please consider filing an issue.
    at org.apache.flink.table.client.SqlClient.main(SqlClient.java:213)
Caused by: org.apache.flink.table.client.gateway.SqlExecutionException: Could not create execution context.
    at org.apache.flink.table.client.gateway.local.ExecutionContext$Builder.build(ExecutionContext.java:870)
    at org.apache.flink.table.client.gateway.local.LocalExecutor.openSession(LocalExecutor.java:227)
    at org.apache.flink.table.client.SqlClient.start(SqlClient.java:108)
    at org.apache.flink.table.client.SqlClient.main(SqlClient.java:201)
Caused by: org.apache.flink.table.catalog.exceptions.CatalogException: Failed to create Hive Metastore client
    at org.apache.flink.table.catalog.hive.client.HiveShimV230.getHiveMetastoreClient(HiveShimV230.java:52)
    at org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper.createMetastoreClient(HiveMetastoreClientWrapper.java:240)
    at org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper.<init>(HiveMetastoreClientWrapper.java:71)
    at org.apache.flink.table.catalog.hive.client.HiveMetastoreClientFactory.create(HiveMetastoreClientFactory.java:35)
    at org.apache.flink.table.catalog.hive.HiveCatalog.open(HiveCatalog.java:223)
    at org.apache.flink.table.catalog.CatalogManager.registerCatalog(CatalogManager.java:191)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.registerCatalog(TableEnvironmentImpl.java:337)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.lambda$null$5(ExecutionContext.java:627)
    at java.util.HashMap.forEach(HashMap.java:1289)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.lambda$initializeCatalogs$6(ExecutionContext.java:625)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:264)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.initializeCatalogs(ExecutionContext.java:624)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.initializeTableEnvironment(ExecutionContext.java:523)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.<init>(ExecutionContext.java:183)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.<init>(ExecutionContext.java:136)
    at org.apache.flink.table.client.gateway.local.ExecutionContext$Builder.build(ExecutionContext.java:859)
    ... 3 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.table.catalog.hive.client.HiveShimV230.getHiveMetastoreClient(HiveShimV230.java:50)
    ... 18 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1709)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:89)
    ... 23 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1707)
    ... 26 more
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:480)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:247)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1707)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.table.catalog.hive.client.HiveShimV230.getHiveMetastoreClient(HiveShimV230.java:50)
    at org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper.createMetastoreClient(HiveMetastoreClientWrapper.java:240)
    at org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper.<init>(HiveMetastoreClientWrapper.java:71)
    at org.apache.flink.table.catalog.hive.client.HiveMetastoreClientFactory.create(HiveMetastoreClientFactory.java:35)
    at org.apache.flink.table.catalog.hive.HiveCatalog.open(HiveCatalog.java:223)
    at org.apache.flink.table.catalog.CatalogManager.registerCatalog(CatalogManager.java:191)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.registerCatalog(TableEnvironmentImpl.java:337)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.lambda$null$5(ExecutionContext.java:627)
    at java.util.HashMap.forEach(HashMap.java:1289)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.lambda$initializeCatalogs$6(ExecutionContext.java:625)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:264)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.initializeCatalogs(ExecutionContext.java:624)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.initializeTableEnvironment(ExecutionContext.java:523)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.<init>(ExecutionContext.java:183)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.<init>(ExecutionContext.java:136)
    at org.apache.flink.table.client.gateway.local.ExecutionContext$Builder.build(ExecutionContext.java:859)
    at org.apache.flink.table.client.gateway.local.LocalExecutor.openSession(LocalExecutor.java:227)
    at org.apache.flink.table.client.SqlClient.start(SqlClient.java:108)
    at org.apache.flink.table.client.SqlClient.main(SqlClient.java:201)
Caused by: java.net.ConnectException: 拒绝连接 (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:607)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
    ... 33 more
)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:529)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:247)
    ... 31 more

On inspection, the Hive metastore service turned out not to be running.

Run nohup hive --service metastore & to start it; the default port is 9083.
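Since the root cause in the trace is a plain "Connection refused", a simple TCP probe is enough to tell whether the metastore is listening before starting the SQL client. The following is a minimal sketch; the helper name port_open is an assumption, and the host used in the example should be replaced with the one configured in hive.metastore.uris.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


# Example call, assuming the metastore runs locally on the default port:
#
#     port_open("localhost", 9083)
```

If the probe returns False after the metastore has been started, the service may still be initializing or may be bound to a different host or port.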

 
