HPC Pack 2019 - Cannot connect to the head node when running Cluster Manager on the head node server

Problem description

I am using Windows Server 2019. I have enabled Active Directory and joined another machine running Windows 10 to the domain. I want to cluster the computers using HPC Pack 2019, but when I run HPC Pack 2019 I get the following error:

The connection to the management service failed. detail error: Microsoft.Hpc.RetryCountExhaustException: Retry Count of RetryManager is exhausted. ---> System.ServiceModel.EndpointNotFoundException: Could not connect to net.tcp://10m:9893/Sdm. The connection attempt lasted for a time span of 00:00:03.0055687. TCP error code 10061: No connection could be made because the target machine actively refused it 192.168.1.148:9893.  ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 192.168.1.148:9893
   at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
   at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
   at System.ServiceModel.Channels.SocketConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
   --- End of inner exception stack trace ---

Server stack trace: 
   at System.ServiceModel.Channels.SocketConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
   at System.ServiceModel.Channels.BufferedConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
   at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout)
   at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout)
   at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
   at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
   at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
   at System.ServiceModel.Channels.ServiceChannel.CallOpenOnce.System.ServiceModel.Channels.ServiceChannel.ICallOnce.Call(ServiceChannel channel, TimeSpan timeout)
   at System.ServiceModel.Channels.ServiceChannel.CallOnceManager.CallOnce(TimeSpan timeout, CallOnceManager cascade)
   at System.ServiceModel.Channels.ServiceChannel.EnsureOpened(TimeSpan timeout)
   at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
   at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
   at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]: 
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at Microsoft.SystemDefinitionModel.Store.IRemoteSdmStore.QueryDocuments(String filter)
   at Microsoft.SystemDefinitionModel.Store.SdmRemoteStore.<>c__DisplayClass45_0.<QueryDocuments>b__0()
   at Microsoft.SystemDefinitionModel.Store.SdmRetry.<>c__DisplayClass4_0`1.<InvokeWithRetry>b__0()
   at Microsoft.Hpc.RetryManager.<InvokeWithRetryAsync>d__33`1.MoveNext()
   --- End of inner exception stack trace ---
   at Microsoft.Hpc.RetryManager.<InvokeWithRetryAsync>d__33`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.SystemDefinitionModel.Store.SdmRetry.InvokeWithRetry[T](Func`1 function, CancellationToken cancellationToken)
   at Microsoft.SystemDefinitionModel.ModelDocuments.LoadAllDocuments()
   at Microsoft.SystemDefinitionModel.DefinitionSpaceView.TryResolve(String simpleName)
   at Microsoft.SystemDefinitionModel.ModelDocuments..ctor(Model model)
   at Microsoft.SystemDefinitionModel.Model.InitializeModel()
   at Microsoft.ComputeCluster.Management.ClusterManager.ConnectCore(Model model)
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Hpc.RetryManager.<>c__DisplayClass34_0.<<InvokeWithRetryAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Hpc.RetryManager.<InvokeWithRetryAsync>d__33`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Hpc.RetryManager.<InvokeWithRetryAsync>d__34.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ComputeCluster.Management.ClusterManager.<ConnectAsync>d__90.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ComputeCluster.Management.ClusterManager.<ConnectAsyncWithModel>d__89.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ComputeCluster.Management.ClusterManager.<ConnectAsync>d__87.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ComputeCluster.Admin.ConnectionManager.<ConnectManagementService>b__10_0(Task`1 t)

When I click the local node, it shows me this error:

[Image: HPC Pack 2019 error screenshot]

Can anyone help me?


Solution


Try adding the head node's name explicitly on the command line:

/scheduler:<headnodename>
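If the connection still fails, it may help to confirm whether anything is accepting TCP connections on the port named in the error. TCP error 10061 ("actively refused") usually means the machine was reachable but nothing was listening on that port, or something rejected the connection. The following is a minimal Python sketch of such a check, not part of HPC Pack; the function name `probe_port` is made up for illustration, and the address and port in the usage note are simply the ones from the error message above.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try to open a TCP connection and report what happened.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Same condition as WinSock error 10061 in the stack trace.
        return "refused"
    except socket.timeout:
        # No answer at all: the host may be down or the traffic may be dropped.
        return "timeout"
    except OSError as exc:
        # Anything else, for example a host name that does not resolve.
        return f"error: {exc}"
```

Calling `probe_port("192.168.1.148", 9893)` from the machine running Cluster Manager gives a quick hint: "refused" suggests checking that the HPC services on the head node are actually running, while "timeout" points more toward a firewall or network problem.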
