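Problem description
I wrote the code below to overwrite a collection in MongoDB. However, whenever I overwrite the collection, its indexes are deleted. Is there a way to send the indexes along with the write, or to create them when I overwrite the collection?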
from pyspark.sql import SparkSession
# Connect to CosmosDB to write on the collection
userName = dbutils.secrets.get(scope="MONGO" , key="MONGO_USER")
primaryKey = dbutils.secrets.get(scope="MONGO" , key="MONGO_PASS")
host = dbutils.secrets.get(scope="MONGO" , key="MONGO_HOST")
port = dbutils.secrets.get(scope="MONGO" , key="MONGO_PORT")
database = "ccvcmdbmongosbox"
collection = "COLL_CCVIMPACTOS"
# Structure the connection
connectionString = "mongodb://{0}:{1}@{2}:{3}/{4}.{5}?ssl=true&replicaSet=globaldb&retrywrites=false&maxIdleTimeMS=120000".format(userName, primaryKey, host, port, database, collection)
spark = SparkSession\
    .builder\
    .config('spark.mongodb.input.uri', connectionString)\
    .config('spark.mongodb.output.uri', connectionString)\
    .config('spark.jars.packages', 'org.mongodb.spark:mongo-spark-connector_2.11:2.3.1')\
    .getOrCreate()
# impactos_mongo is the DataFrame to be written (defined elsewhere)
impactos_mongo.write.format("com.mongodb.spark.sql.DefaultSource")\
    .mode("overwrite")\
    .option("uri", connectionString)\
    .option("replaceDocument", False)\
    .option("maxBatchSize", 100)\
    .option("database", database)\
    .option("collection", collection)\
    .save()
Tags: mongodb, pyspark, databricks
Solution
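With this connector version, "overwrite" mode appears to drop the target collection before writing, which would explain why the indexes disappear along with the data. One possible workaround, sketched below, is to read the index definitions before the write and recreate them afterwards. The helper names capture_index_specs and recreate_indexes are hypothetical, and the sketch assumes a pymongo Collection object (it relies only on pymongo's index_information() and create_index() methods). It has not been tested against Cosmos DB, and special index types such as text indexes may need extra handling.

```python
def capture_index_specs(collection):
    """Return [(keys, options), ...] for every index except the default _id index.

    `collection` is assumed to be a pymongo Collection (hypothetical usage).
    """
    specs = []
    for name, info in collection.index_information().items():
        if name == "_id_":
            continue  # the _id index is recreated automatically by the server
        # Some servers report directions as floats (1.0); normalise them to ints.
        keys = [
            (field, int(direction) if isinstance(direction, float) else direction)
            for field, direction in info["key"]
        ]
        # Drop fields that describe the existing index rather than configure it.
        options = {k: v for k, v in info.items() if k not in ("key", "v", "ns")}
        options["name"] = name
        specs.append((keys, options))
    return specs


def recreate_indexes(collection, specs):
    """Recreate the captured indexes and return the names that were created."""
    return [collection.create_index(keys, **options) for keys, options in specs]


# Hypothetical usage around the Spark write from the question:
#   from pymongo import MongoClient
#   coll = MongoClient(connectionString)[database][collection]
#   specs = capture_index_specs(coll)      # 1. remember the indexes
#   impactos_mongo.write ... .mode("overwrite") ... .save()   # 2. overwrite
#   recreate_indexes(coll, specs)          # 3. restore the indexes
```

An alternative that avoids dropping the collection at all would be to delete the existing documents yourself and then write with mode("append"), at the cost of a separate delete step.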