首页 > 解决方案 > Flink 作业总是显示“已创建”状态

问题描述

我正在尝试在 k8s 环境中运行 Flink 作业,集群看起来还可以。我可以从 UI 中看到 jobmanager 和 taskmanager 运行良好。但是当我尝试运行 Flink 作业时,UI 显示作业正在运行,但任务状态始终保持“已创建”。我发出 GET 请求以获取该作业的指标,发现任务状态为“已调度”。
我不知道集群在哪里出现问题,任何人都可以就如何处理它提供一些指标或建议。我怀疑jobmanager 可以与taskmanager 联系,但是taskmanager 从UI 看起来运行良好。我还检查了集群的资源,已经足够了。需要你的帮助,谢谢!! 在此处输入图像描述 在此处输入图像描述

我的 flink 集群(独立)在 kubernetes 上运行。一个 master pod,一个 worker pod,但它们都没有任何系统级别的日志。我不确定在哪里可以配置它。但是当我尝试对其执行 wordcount.jar 示例时,它会在下面显示错误日志:

org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
    at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:116)
    at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:78)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:185)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:179)
    at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:503)
    at org.apache.flink.runtime.scheduler.UpdateSchedulerNgOnInternalFailuresListener.notifyTaskFailure(UpdateSchedulerNgOnInternalFailuresListener.java:49)
    at org.apache.flink.runtime.executiongraph.ExecutionGraph.notifySchedulerNgAboutInternalTaskFailure(ExecutionGraph.java:1710)
    at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1287)
    at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1255)
    at org.apache.flink.runtime.executiongraph.Execution.markFailed(Execution.java:1086)
    at org.apache.flink.runtime.executiongraph.ExecutionVertex.markFailed(ExecutionVertex.java:748)
    at org.apache.flink.runtime.scheduler.DefaultExecutionVertexOperations.markFailed(DefaultExecutionVertexOperations.java:41)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskDeploymentFailure(DefaultScheduler.java:435)
    at org.apache.flink.runtime.scheduler.DefaultScheduler.lambda$assignResourceOrHandleError$6(DefaultScheduler.java:422)
    at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl.lambda$internalAllocateSlot$0(SchedulerImpl.java:168)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$SingleTaskSlot.release(SlotSharingManager.java:726)
    at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.release(SlotSharingManager.java:537)
    at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.lambda$new$0(SlotSharingManager.java:432)
    at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$forwardTo$21(FutureUtils.java:1120)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source)
    at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1036)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)
    at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
    at akka.actor.Actor.aroundReceive(Actor.scala:517)
    at akka.actor.Actor.aroundReceive$(Actor.scala:515)
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
    at akka.actor.ActorCell.invoke(ActorCell.scala:561)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
    at akka.dispatch.Mailbox.run(Mailbox.scala:225)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
    Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeWrapWithNoResourceAvailableException(DefaultScheduler.java:441)
    ... 47 more
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source)
    ... 27 more
Caused by: java.util.concurrent.TimeoutException
    ... 25 more

任务管理器错误日志:

2020-09-04 09:09:53 DEBUG a.r.t.n.NettyTransport [] - Remote connection to [/192.168.3.147:55194] was disconnected because of [id: 0xcf013b4a, /192.168.3.147:55194 :> /172.16.0.210:6122] DISCONNECTED
2020-09-04 09:09:53 DEBUG a.r.t.ProtocolStateActor [] - Association between local [tcp://flink@172.16.0.210:6122] and remote [tcp://flink@192.168.3.147:55194] was disassociated because the ProtocolStateActor failed: Unknown
2020-09-04 09:10:00 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:10:10 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:10:20 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:10:30 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:10:40 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:10:51 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:10:53 DEBUG a.r.t.n.NettyTransport [] - Remote connection to [/192.168.3.147:56740] was disconnected because of [id: 0xeab6536f, /192.168.3.147:56740 :> /172.16.0.210:6122] DISCONNECTED
2020-09-04 09:10:53 DEBUG a.r.t.ProtocolStateActor [] - Association between local [tcp://flink@172.16.0.210:6122] and remote [tcp://flink@192.168.3.147:56740] was disassociated because the ProtocolStateActor failed: Unknown
2020-09-04 09:10:56 INFO  o.a.f.r.t.TaskExecutor [] - Receive slot request c37da0fd74b14bd257a4ecce33c06d79 for job 69482630ce87466bb580bff416c284e5 from resource manager with leader id 00000000000000000000000000000000.
2020-09-04 09:10:56 DEBUG o.a.f.r.m.MemoryManager [] - Initialized MemoryManager with total memory size 37580964 and page size 32768.
2020-09-04 09:10:56 INFO  o.a.f.r.t.TaskExecutor [] - Allocated slot for c37da0fd74b14bd257a4ecce33c06d79.
2020-09-04 09:10:56 INFO  o.a.f.r.t.DefaultJobLeaderService [] - Add job 69482630ce87466bb580bff416c284e5 for job leader monitoring.
2020-09-04 09:10:56 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - New leader information for job 69482630ce87466bb580bff416c284e5. Address: akka.tcp://flink@flink:6123/user/rpc/jobmanager_12, leader id: 00000000000000000000000000000000.
2020-09-04 09:10:56 INFO  o.a.f.r.t.DefaultJobLeaderService [] - Try to register at job manager akka.tcp://flink@flink:6123/user/rpc/jobmanager_12 with leader id 00000000-0000-0000-0000-000000000000.
2020-09-04 09:10:56 DEBUG o.a.f.r.r.a.AkkaRpcService [] - Try to connect to remote RPC endpoint with address akka.tcp://flink@flink:6123/user/rpc/jobmanager_12. Returning a org.apache.flink.runtime.jobmaster.JobMasterGateway gateway.
2020-09-04 09:10:56 INFO  o.a.f.r.t.DefaultJobLeaderService [] - Resolved JobManager address, beginning registration
2020-09-04 09:10:56 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager attempt 1 (timeout=100ms)
2020-09-04 09:10:56 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager (akka.tcp://flink@flink:6123/user/rpc/jobmanager_12) attempt 1 timed out after 100 ms
2020-09-04 09:10:56 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager attempt 2 (timeout=200ms)
2020-09-04 09:10:56 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager (akka.tcp://flink@flink:6123/user/rpc/jobmanager_12) attempt 2 timed out after 200 ms
2020-09-04 09:10:56 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager attempt 3 (timeout=400ms)
2020-09-04 09:10:56 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager (akka.tcp://flink@flink:6123/user/rpc/jobmanager_12) attempt 3 timed out after 400 ms
2020-09-04 09:10:56 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager attempt 4 (timeout=800ms)
2020-09-04 09:10:57 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager (akka.tcp://flink@flink:6123/user/rpc/jobmanager_12) attempt 4 timed out after 800 ms
2020-09-04 09:10:57 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager attempt 5 (timeout=1600ms)
2020-09-04 09:10:59 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager (akka.tcp://flink@flink:6123/user/rpc/jobmanager_12) attempt 5 timed out after 1600 ms
2020-09-04 09:10:59 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager attempt 6 (timeout=3200ms)
2020-09-04 09:11:01 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:11:02 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager (akka.tcp://flink@flink:6123/user/rpc/jobmanager_12) attempt 6 timed out after 3200 ms
2020-09-04 09:11:02 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Registration at JobManager attempt 7 (timeout=6400ms)
2020-09-04 09:11:06 DEBUG o.a.f.r.t.TaskExecutor [] - Free slot with allocation id c37da0fd74b14bd257a4ecce33c06d79 because: The slot c37da0fd74b14bd257a4ecce33c06d79 has timed out.
2020-09-04 09:11:06 DEBUG o.a.f.r.t.s.TaskSlotTableImpl [] - Free slot TaskSlot(index:2, state:ALLOCATED, resource profile: ResourceProfile{cpuCores=1.0000000000000000, taskHeapMemory=12.800mb (13421771 bytes), taskOffHeapMemory=0 bytes, managedMemory=35.840mb (37580964 bytes), networkMemory=8.960mb (9395241 bytes)}, allocationId: c37da0fd74b14bd257a4ecce33c06d79, jobId: 69482630ce87466bb580bff416c284e5).
java.lang.Exception: The slot c37da0fd74b14bd257a4ecce33c06d79 has timed out.
    at org.apache.flink.runtime.taskexecutor.TaskExecutor.timeoutSlot(TaskExecutor.java:1653) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at org.apache.flink.runtime.taskexecutor.TaskExecutor.access$2800(TaskExecutor.java:173) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at org.apache.flink.runtime.taskexecutor.TaskExecutor$SlotActionsImpl.lambda$timeoutSlot$1(TaskExecutor.java:1940) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.actor.Actor.aroundReceive(Actor.scala:517) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.actor.Actor.aroundReceive$(Actor.scala:515) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.12-1.11.1.jar:1.11.1]
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.12-1.11.1.jar:1.11.1]
2020-09-04 09:11:06 INFO  o.a.f.r.t.DefaultJobLeaderService [] - Remove job 69482630ce87466bb580bff416c284e5 from job leader monitoring.
2020-09-04 09:11:06 DEBUG o.a.f.r.t.DefaultJobLeaderService [] - Retrying registration towards akka.tcp://flink@flink:6123/user/rpc/jobmanager_12 was cancelled.
2020-09-04 09:11:06 DEBUG o.a.f.r.s.TaskExecutorLocalStateStoresManager [] - Releasing local state under allocation id c37da0fd74b14bd257a4ecce33c06d79.
2020-09-04 09:11:11 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:11:20 DEBUG a.a.LocalActorRefProvider(akka://flink) [] - Resolve (deserialization) of path [temp/$y] doesn't match an active actor. It has probably been stopped, using deadLetters.
2020-09-04 09:11:21 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:11:24 DEBUG a.a.LocalActorRefProvider(akka://flink) [] - Resolve (deserialization) of path [temp/$E] doesn't match an active actor. It has probably been stopped, using deadLetters.
2020-09-04 09:11:24 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from b128e6525f090a1f2a909ea501307f1d.
2020-09-04 09:11:24 DEBUG a.a.LocalActorRefProvider(akka://flink) [] - Resolve (deserialization) of path [temp/$z] doesn't match an active actor. It has probably been stopped, using deadLetters.
2020-09-04 09:11:24 DEBUG a.a.LocalActorRefProvider(akka://flink) [] - Resolve (deserialization) of path [temp/$A] doesn't match an active actor. It has probably been stopped, using deadLetters.
2020-09-04 09:11:24 DEBUG a.a.LocalActorRefProvider(akka://flink) [] - Resolve (deserialization) of path [temp/$B] doesn't match an active actor. It has probably been stopped, using deadLetters.
2020-09-04 09:11:24 DEBUG a.a.LocalActorRefProvider(akka://flink) [] - Resolve (deserialization) of path [temp/$C] doesn't match an active actor. It has probably been stopped, using deadLetters.
2020-09-04 09:11:24 DEBUG a.a.LocalActorRefProvider(akka://flink) [] - Resolve (deserialization) of path [temp/$D] doesn't match an active actor. It has probably been stopped, using deadLetters.
2020-09-04 09:11:31 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:11:34 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from b128e6525f090a1f2a909ea501307f1d.
2020-09-04 09:11:41 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:11:44 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from b128e6525f090a1f2a909ea501307f1d.
2020-09-04 09:11:51 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:11:53 DEBUG a.r.t.n.NettyTransport [] - Remote connection to [/192.168.3.147:58278] was disconnected because of [id: 0xaaf4e1c4, /192.168.3.147:58278 :> /172.16.0.210:6122] DISCONNECTED
2020-09-04 09:11:53 DEBUG a.r.t.ProtocolStateActor [] - Association between local [tcp://flink@172.16.0.210:6122] and remote [tcp://flink@192.168.3.147:58278] was disassociated because the ProtocolStateActor failed: Unknown
2020-09-04 09:11:54 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from b128e6525f090a1f2a909ea501307f1d.
2020-09-04 09:12:01 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:12:04 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from b128e6525f090a1f2a909ea501307f1d.
2020-09-04 09:12:11 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:12:21 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:12:31 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:12:41 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:12:51 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:12:53 DEBUG a.r.t.n.NettyTransport [] - Remote connection to [/192.168.3.147:59802] was disconnected because of [id: 0x6fea1015, /192.168.3.147:59802 :> /172.16.0.210:6122] DISCONNECTED
2020-09-04 09:12:53 DEBUG a.r.t.ProtocolStateActor [] - Association between local [tcp://flink@172.16.0.210:6122] and remote [tcp://flink@192.168.3.147:59802] was disassociated because the ProtocolStateActor failed: Unknown
2020-09-04 09:13:01 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:13:11 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:13:21 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:13:31 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:13:41 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:13:51 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.
2020-09-04 09:13:53 DEBUG a.r.t.n.NettyTransport [] - Remote connection to [/192.168.3.147:33116] was disconnected because of [id: 0xc7769526, /192.168.3.147:33116 :> /172.16.0.210:6122] DISCONNECTED
2020-09-04 09:13:53 DEBUG a.r.t.ProtocolStateActor [] - Association between local [tcp://flink@172.16.0.210:6122] and remote [tcp://flink@192.168.3.147:33116] was disassociated because the ProtocolStateActor failed: Unknown
2020-09-04 09:14:01 DEBUG o.a.f.r.t.TaskExecutor [] - Received heartbeat request from d5ddacd80a913c1ae961a83cbe58a598.

标签: apache-flink

解决方案


推荐阅读