apache-spark - com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT at cross_validation_metrics_summary
Problem description
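I am using the H2ODRF and H2OGridSearch models to build a random forest pipeline with random discrete grid search for hyperparameter optimization. However, when I set nfolds to any number greater than 1 and call fit(), I get an error. My code and the resulting stack trace are shown below: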
import ai.h2o.sparkling.ml.algos.{H2ODRF, H2OGridSearch}

// Random forest estimator with 4-fold cross-validation
val drf = new H2ODRF()
  .setFeaturesCols(featuresCols)
  .setLabelCol(labelCol)
  .setColumnsToCategorical(categoricalCols)
  .setSplitRatio(splitRatio)
  .setNfolds(4)

// Hyperparameter grid; the values are cast to AnyRef
val hyperParams = Map(
  "ntrees" -> Array(10, 50).map(_.asInstanceOf[AnyRef]))

val search = new H2OGridSearch()
  .setHyperParameters(hyperParams)
  .setAlgo(drf)

val model = search.fit(data) // data is a Spark DataFrame
com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was STRING at line 1 column 608096 path $.cross_validation_metrics_summary[0].data[0][0]
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:224)
at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.read(TypeAdapterRuntimeTypeWrapper.java:41)
at com.google.gson.internal.bind.ArrayTypeAdapter.read(ArrayTypeAdapter.java:72)
at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.read(TypeAdapterRuntimeTypeWrapper.java:41)
at com.google.gson.internal.bind.ArrayTypeAdapter.read(ArrayTypeAdapter.java:72)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.read(ReflectiveTypeAdapterFactory.java:129)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:220)
at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.read(TypeAdapterRuntimeTypeWrapper.java:41)
at com.google.gson.internal.bind.ArrayTypeAdapter.read(ArrayTypeAdapter.java:72)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.read(ReflectiveTypeAdapterFactory.java:129)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:220)
at com.google.gson.Gson.fromJson(Gson.java:887)
at com.google.gson.Gson.fromJson(Gson.java:852)
at com.google.gson.Gson.fromJson(Gson.java:801)
at ai.h2o.sparkling.backend.utils.RestCommunication$class.ai$h2o$sparkling$backend$utils$RestCommunication$$deserialize(RestCommunication.scala:164)
at ai.h2o.sparkling.backend.utils.RestCommunication$$anonfun$request$1.apply(RestCommunication.scala:147)
at ai.h2o.sparkling.backend.utils.RestCommunication$$anonfun$request$1.apply(RestCommunication.scala:145)
at ai.h2o.sparkling.utils.ScalaUtils$.withResource(ScalaUtils.scala:28)
at ai.h2o.sparkling.backend.utils.RestCommunication$class.request(RestCommunication.scala:145)
at ai.h2o.sparkling.ml.algos.H2OGridSearch.request(H2OGridSearch.scala:46)
at ai.h2o.sparkling.backend.utils.RestCommunication$class.query(RestCommunication.scala:54)
at ai.h2o.sparkling.ml.algos.H2OGridSearch.query(H2OGridSearch.scala:46)
at ai.h2o.sparkling.ml.algos.H2OGridSearch.getGridModels(H2OGridSearch.scala:129)
at ai.h2o.sparkling.ml.algos.H2OGridSearch.fit(H2OGridSearch.scala:163)
at ai.h2o.sparkling.ml.algos.H2OGridSearch.fit(H2OGridSearch.scala:46)
at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:153)
at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:149)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableViewLike$Transformed$class.foreach(IterableViewLike.scala:44)
at scala.collection.SeqViewLike$AbstractTransformed.foreach(SeqViewLike.scala:37)
at org.apache.spark.ml.Pipeline.fit(Pipeline.scala:149)
... 59 elided
Caused by: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was STRING at line 1 column 608096 path $.cross_validation_metrics_summary[0].data[0][0]
at com.google.gson.stream.JsonReader.beginObject(JsonReader.java:385)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:213)
... 90 more
The error appears to be caused by the cross_validation_metrics_summary field, which is only returned when nfolds is greater than 1. Is there a way to work around this?
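To illustrate the kind of mismatch the stack trace points at, here is a minimal sketch that reproduces a similar Gson failure outside of Sparkling Water. The classes Cell, Table and GridLike are hypothetical stand-ins I made up for this example, not the real schema classes, and the JSON payload is invented. The only assumption is that Gson is on the classpath.

```scala
import com.google.gson.{Gson, JsonSyntaxException}

// Hypothetical target types, NOT the real Sparkling Water / H2O schema classes.
// The innermost element of `data` is declared as an object.
class Cell { var value: String = _ }
class Table { var data: Array[Array[Cell]] = _ }
class GridLike { var cross_validation_metrics_summary: Array[Table] = _ }

object GsonMismatchDemo {
  def main(args: Array[String]): Unit = {
    // Invented payload: plain strings sit where the target type declares objects.
    val json = """{"cross_validation_metrics_summary":[{"data":[["0.91"]]}]}"""
    try {
      new Gson().fromJson(json, classOf[GridLike])
      println("parsed without error")
    } catch {
      // Gson wraps the underlying IllegalStateException in a JsonSyntaxException.
      case e: JsonSyntaxException =>
        println(s"deserialization failed: ${e.getMessage}")
    }
  }
}
```

If the real response has the same shape, the failure would come from the class that Gson deserializes into rather than from the training itself, which would be consistent with the error surfacing in H2OGridSearch.getGridModels.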
Edit: I am using the Prostate data set with Spark version 2.4.4, Scala version 2.11.12, and the following Sparkling Water version: ai.h2o:sparkling-water-package_2.11:3.30.0.4-1-2.4.
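Edit: After searching through the Sparkling Water source code, it is starting to look like the problem is in GridSchemaV99. Is there a setting or configuration I should update so that a different schema is used?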
Solution