Zeppelin cannot load a MongoDB collection using the Spark interpreter

Problem description

I am using Zeppelin 0.8.0, MongoDB 4.0, Spark 2.2.0, the MongoDB Spark connector 2.2.4, and the MongoDB Java driver 3.8.

      sc.version
      import com.mongodb.spark.MongoSpark
      import com.mongodb.spark.config.{ReadConfig, WriteConfig}
      import com.mongodb.spark.sql._
      import org.apache.spark.sql.functions._
      import org.bson.Document
      import collection.JavaConverters._
      import org.apache.zeppelin.display.angular.paragraphscope._
      import AngularElem._
      val readConfig = ReadConfig(Map("uri" -> "mongodb://127.0.0.1:27017/", 
      "database" -> "test","collection" -> "Collection_f"))
      val zipDf = spark.sparkSession.read.mongo(readConfig).toDF()

This gives:

      import com.mongodb.spark.MongoSpark
      import com.mongodb.spark.config.{ReadConfig, WriteConfig}
      import com.mongodb.spark.sql._
      import org.apache.spark.sql.functions._
      import org.bson.Document
      import collection.JavaConverters._
      import org.apache.zeppelin.display.angular.paragraphscope._
      import AngularElem._
      readConfig: com.mongodb.spark.config.ReadConfig.Self =
      ReadConfig(test,Collection_f,Some(mongodb://127.0.0.1:27017/),1000,DefaultMongoPartitioner,Map(),15,
      ReadPreferenceConfig(primary,None),
      ReadConcernConfig(None),
      AggregationConfig(None,None),false,true,250,true,
      true)
         org.apache.spark.SparkException: Job aborted due to stage failure: 
         Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 
         0.0 in stage 0.0 (TID 0, localhost, executor driver): 
         com.mongodb.MongoCommandException: Command failed with error 16820 
         (Location16820): 'Sort exceeded memory limit of 104857600 bytes, 
         but did not opt in to external sorting. Aborting operation. Pass 
         allowDiskUse:true to opt in.' on server 127.0.0.1:27017. The full 
         response is { "ok" : 0.0, "errmsg" : "Sort exceeded memory limit of 
         104857600 bytes, but did not opt in to external sorting. Aborting 
         operation. Pass allowDiskUse:true to opt in.", "code" : 16820, 
         "codeName" : "Location16820" }

I think this problem comes down to the allowDiskUse setting. Where can I set it?

Tags: mongodb, apache-spark, mongodb-query, apache-zeppelin

Solution


It was resolved by switching to version 2.2.3 of the connector.
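
For anyone wondering how to pin that version in Zeppelin, here is a minimal sketch using the dependency loader paragraph. It assumes the Scala 2.11 build of the connector (the default for a stock Spark 2.2.0), and that the artifact is resolved from Maven Central; neither detail is stated in the question, so adjust the coordinates to your setup.

      %spark.dep
      // Assumption: Scala 2.11 build of the connector; change the suffix if your Spark uses another Scala version.
      z.reset()
      // Pin the connector to 2.2.3 instead of 2.2.4.
      z.load("org.mongodb.spark:mongo-spark-connector_2.11:2.2.3")

This paragraph has to run before the Spark interpreter starts, so restart the interpreter first. If the 2.2.4 artifact was added through the interpreter settings page instead, changing the version there should have the same effect; either way, make sure only one connector version ends up on the classpath.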

