SCS and MOSEK solvers keep running

Problem description

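My application has used the ECOS solver for a long time, and suddenly we started getting infeasible solutions, which result in solver errors. After looking through Stack Overflow posts and other suggestions online, I saw recommendations for the MOSEK and SCS solvers.

I tried replacing ECOS with the SCS and MOSEK solvers, but now my runs never end. A run normally finishes within 2 hours, but after the replacement it has been running for about 8 hours without finishing. Please advise.

Below are the parameters: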

'solver': {'name': 'MOSEK', 'backup_name': 'SCS', 'verbose': True, 'max_iters': 3505}
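For readers unfamiliar with this kind of configuration, a `name` / `backup_name` pair like the one above is usually consumed by a "try the primary solver, fall back to the backup" pattern. The following is only a minimal sketch of that pattern, not the application's actual code: `solve_with_fallback` and `fake_solve` are hypothetical names, and `fake_solve` is a stand-in so the example runs without any solver installed.

```python
from typing import Any, Callable, Sequence, Tuple, Type


def solve_with_fallback(
    solve: Callable[..., Any],
    solver_names: Sequence[str],
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    **solver_opts: Any,
) -> Tuple[str, Any]:
    """Try each solver in order; return (name, result) from the first success.

    `solve` is any callable that accepts `solver=<name>` plus keyword options.
    """
    last_error = None
    for name in solver_names:
        try:
            return name, solve(solver=name, **solver_opts)
        except retry_on as exc:
            last_error = exc  # remember why this solver failed, then try the next
    raise RuntimeError(f"all solvers failed: {list(solver_names)}") from last_error


# Hypothetical stand-in for a real solve call: the primary solver always fails.
def fake_solve(solver: str, **opts: Any) -> float:
    if solver == "MOSEK":
        raise ValueError("primary solver failed")
    return 1.0


params = {"name": "MOSEK", "backup_name": "SCS", "verbose": True, "max_iters": 3505}
used, value = solve_with_fallback(
    fake_solve,
    [params["name"], params["backup_name"]],
    retry_on=(ValueError,),
    verbose=params["verbose"],
    max_iters=params["max_iters"],
)
print(used, value)
```

With CVXPY, `solve` would presumably be `problem.solve` and `retry_on` would be `(cvxpy.error.SolverError,)`. Whether a given solver accepts an option such as `max_iters` depends on the solver, so forwarding the same options to both is an assumption of this sketch.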

Please help.

Error log:

Job aborted due to stage failure: Task 1934 in stage 6.0 failed 4 times, most recent failure: Lost task 1934.3 in stage 6.0 (TID 5028, ip-10-219-208-218.ec2.internal, executor 1): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/py_dependencies.zip/cat/tf/tf_model/model.py", line 262, in fit
    raise SolverError
cvxpy.error.SolverError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/expressions/constants/constant.py", line 243, in extremal_eig_near_ref
    ev = SA_eigsh(sigma)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/expressions/constants/constant.py", line 238, in SA_eigsh
    return eigsh(A, k=1, sigma=sigma, return_eigenvectors=False)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1687, in eigsh
    params.iterate()
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 571, in iterate
    self._raise_no_convergence()
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 377, in _raise_no_convergence
    raise ArpackNoConvergence(msg % (num_iter, k_ok, self.k), ev, vec)
scipy.sparse.linalg.eigen.arpack.arpack.ArpackNoConvergence: ARPACK error -1: No convergence (361 iterations, 0/1 eigenvectors converged)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/pyspark.zip/pyspark/worker.py", line 377, in main
    process()
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/pyspark.zip/pyspark/worker.py", line 372, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/pyspark.zip/pyspark/serializers.py", line 400, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/pyspark.zip/pyspark/util.py", line 113, in wrapper
    return f(*args, **kwargs)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000001/py_dependencies.zip/pyspark_scripts/spark_tf_pipeline.py", line 49, in
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/py_dependencies.zip/cat/tf/tf_model/tf_get_from_smu_records.py", line 38, in tf_get_from_smu_records
    data_points, current_date_string, params)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/py_dependencies.zip/cat/tf/tf_model/tf_get_from_smu_records.py", line 24, in tf_get_outputs_from_smu_records
    model_output, _ = fit_model(ts_wrapper, params)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/py_dependencies.zip/cat/tf/tf_model/fit_model.py", line 13, in fit_model
    machine_model.fit()
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/py_dependencies.zip/cat/tf/tf_model/machine_model.py", line 62, in fit
    self._fit()
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/py_dependencies.zip/cat/tf/tf_model/machine_model.py", line 120, in _fit
    self.model.fit()
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/py_dependencies.zip/cat/tf/tf_model/model.py", line 267, in fit
    self._fit(self.solver_params['backup_name'])
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/py_dependencies.zip/cat/tf/tf_model/model.py", line 245, in _fit
    feastol tols['feastol_inacc'])
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/problems/problem.py", line 401, in solve
    return solve_func(self, *args, **kwargs)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/problems/problem.py", line 818, in _solve
    self, data, warm_start, verbose, kwargs)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/reductions/solvers/solving_chain.py", line 341, in solve_via_data
    solver_opts, problem._solver_cache)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/reductions/solvers/conic_solvers/cvxopt_conif.py", line 162, in solve_via_data
    if self.remove_redundant_rows(data) == s.INFEASIBLE:
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/reductions/solvers/conic_solvers/cvxopt_conif.py", line 286, in remove_redundant_rows
    eig = extremal_eig_near_ref(gram, ref=TOL)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/expressions/constants/constant.py", line 247, in extremal_eig_near_ref
    ev = SA_eigsh(sigma)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/cvxpy/expressions/constants/constant.py", line 238, in SA_eigsh
    return eigsh(A, k=1, sigma=sigma, return_eigenvectors=False)
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 1687, in eigsh
    params.iterate()
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 571, in iterate
    self._raise_no_convergence()
  File "envpath/appcache/application_1618545751422_0044/container_1618545751422_0044_02_000002/miniconda/envs/project/lib/python3.6/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 377, in _raise_no_convergence
    raise ArpackNoConvergence(msg % (num_iter, k_ok, self.k), ev, vec)
scipy.sparse.linalg.eigen.arpack.arpack.ArpackNoConvergence: ARPACK error -1: No convergence (361 iterations, 0/1 eigenvectors converged)

at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:456)
at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:592)
at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:575)
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:410)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:227)
at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$3.apply(ShuffleExchangeExec.scala:283)
at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$3.apply(ShuffleExchangeExec.scala:252)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1405)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:

Tags: python, cvxpy, mosek, ecos

Solution

