windows - HPC Pack 2019 - 无法连接到在头节点服务器上运行集群管理器的头节点
问题描述
我正在使用 Windows Server 2019,并且我激活了 Active Directory 并使另一台装有 Windows 10 的设备成为域的成员。我想在 HPC pack 2019 的帮助下对计算机进行集群,但是当我运行 HPC pack 2019 时,我收到以下错误:
The connection to the management service failed. detail error: Microsoft.Hpc.RetryCountExhaustException: Retry Count of RetryManager is exhausted. ---> System.ServiceModel.EndpointNotFoundException: Could not connect to net.tcp://10m:9893/Sdm. The connection attempt lasted for a time span of 00:00:03.0055687. TCP error code 10061: No connection could be made because the target machine actively refused it 192.168.1.148:9893. ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 192.168.1.148:9893
at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
at System.ServiceModel.Channels.SocketConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
--- End of inner exception stack trace ---
Server stack trace:
at System.ServiceModel.Channels.SocketConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
at System.ServiceModel.Channels.BufferedConnectionInitiator.Connect(Uri uri, TimeSpan timeout)
at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout)
at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.CallOpenOnce.System.ServiceModel.Channels.ServiceChannel.ICallOnce.Call(ServiceChannel channel, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.CallOnceManager.CallOnce(TimeSpan timeout, CallOnceManager cascade)
at System.ServiceModel.Channels.ServiceChannel.EnsureOpened(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at Microsoft.SystemDefinitionModel.Store.IRemoteSdmStore.QueryDocuments(String filter)
at Microsoft.SystemDefinitionModel.Store.SdmRemoteStore.<>c__DisplayClass45_0.<QueryDocuments>b__0()
at Microsoft.SystemDefinitionModel.Store.SdmRetry.<>c__DisplayClass4_0`1.<InvokeWithRetry>b__0()
at Microsoft.Hpc.RetryManager.<InvokeWithRetryAsync>d__33`1.MoveNext()
--- End of inner exception stack trace ---
at Microsoft.Hpc.RetryManager.<InvokeWithRetryAsync>d__33`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.SystemDefinitionModel.Store.SdmRetry.InvokeWithRetry[T](Func`1 function, CancellationToken cancellationToken)
at Microsoft.SystemDefinitionModel.ModelDocuments.LoadAllDocuments()
at Microsoft.SystemDefinitionModel.DefinitionSpaceView.TryResolve(String simpleName)
at Microsoft.SystemDefinitionModel.ModelDocuments..ctor(Model model)
at Microsoft.SystemDefinitionModel.Model.InitializeModel()
at Microsoft.ComputeCluster.Management.ClusterManager.ConnectCore(Model model)
at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Hpc.RetryManager.<>c__DisplayClass34_0.<<InvokeWithRetryAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Hpc.RetryManager.<InvokeWithRetryAsync>d__33`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Hpc.RetryManager.<InvokeWithRetryAsync>d__34.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ComputeCluster.Management.ClusterManager.<ConnectAsync>d__90.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ComputeCluster.Management.ClusterManager.<ConnectAsyncWithModel>d__89.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ComputeCluster.Management.ClusterManager.<ConnectAsync>d__87.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ComputeCluster.Admin.ConnectionManager.<ConnectManagementService>b__10_0(Task`1 t)
当我单击本地节点时,它向我显示此错误:
谁能帮我?
解决方案
尝试将头节点的名称显式添加到命令行:
/scheduler:<headnodename>
推荐阅读
- google-chrome - ReCaptcha 加载需要很长时间
- java - Spring DataSourceTransactionManager 不提交
- python - 从现有的 excel 文件中获取数据并写入按日期过滤的新 excel 文件
- ios - 快速更新数组 4
- php - 将字符串添加到表中现有值的更短方法
- plone - 受自定义前提条件保护的文件项
- javascript - 未显示下拉列表值的数组
- xml - 空手道 - XML - 打印子节点
- php - 移动一些数组字段
- swift - 如何从 MarketingCloudSDKInboxMessagesDataSource 中获取特定的消息数据?