python - Importing MongoDB data into Azure ML Studio from a Python script
Question
I am running the Python script below in Azure ML (Python 2.7.11). It queries MongoDB through pyMongo and tries to return the results as a DataFrame.
I get an error like this:
"C:\pyhome\lib\site-packages\pymongo\topology.py", line 97, in select_servers
self._error_message(selector))
ServerSelectionTimeoutError: ... ('The write operation timed out',)
If you know what causes this error and what I should change, please let me know.
My source code:
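import pymongo as m
import pandas as pd

def azureml_main(dataframe1 = None, dataframe2 = None):
    # Connection string with credentials, host and port redacted
    uri = "mongodb://xxxxx:yyyyyyyyyyyyyyy@zzz.mongodb.net:xxxxx/?ssl=true&replicaSet=globaldb"
    # connect=False defers the connection until the first operation
    client = m.MongoClient(uri,connect=False)
    db = client['dbName']
    coll = db['colectionName']
    cursor = coll.find()
    # Iterating the cursor is the first network call; this is where the timeout is raised
    df = pd.DataFrame(list(cursor))
    # Azure ML Studio expects a sequence of DataFrames, hence the trailing comma
    return df,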
Error details:
Error 0085: The following error occurred during script evaluation, please view the output log for more information:
---------- Start of error message from Python interpreter ----------
Caught exception while executing function: Traceback (most recent call last):
File "C:\server\invokepy.py", line 199, in batch
odfs = mod.azureml_main(*idfs)
File "C:\temp\55a174d8dc584942908423ebc0bac110.py", line 32, in azureml_main
result = pd.DataFrame(list(cursor))
File "C:\pyhome\lib\site-packages\pymongo\cursor.py", line 977, in next
if len(self.__data) or self._refresh():
File "C:\pyhome\lib\site-packages\pymongo\cursor.py", line 902, in _refresh
self.__read_preference))
File "C:\pyhome\lib\site-packages\pymongo\cursor.py", line 813, in __send_message
**kwargs)
File "C:\pyhome\lib\site-packages\pymongo\mongo_client.py", line 728, in _send_message_with_response
server = topology.select_server(selector)
File "C:\pyhome\lib\site-packages\pymongo\topology.py", line 121, in select_server
address))
File "C:\pyhome\lib\site-packages\pymongo\topology.py", line 97, in select_servers
self._error_message(selector))
ServerSelectionTimeoutError: xxxxx-xxx.mongodb.net:xxxxx: ('The write operation timed out',)
Process returned with non-zero exit code 1
Answer
As far as I know, a limitation of the Execute Python Script module causes this issue; see the Limitations section of its documentation, quoted below.
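Limitations
Execute Python Script currently has the following limitations:
- Sandboxed execution. The Python runtime is currently sandboxed and, as a result, does not allow access to the network or to the local file system in a persistent manner. All files saved locally are isolated and deleted once the module finishes. The Python code cannot access most directories on the machine it runs on, the exception being the current directory and its subdirectories.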
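To check whether the sandbox really is what blocks the connection, a plain TCP probe can be run from inside the script before involving pymongo at all. The following is only a minimal sketch using the standard library; the helper name can_reach and the default timeout are hypothetical and do not come from the original post.

```python
import socket

def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for diagnosis only; it does not speak the MongoDB protocol.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Covers refused connections, DNS failures and timeouts alike
        return False
    sock.close()
    return True
```

Called from azureml_main with the host and port taken from the connection string, a False result would be consistent with the sandbox restriction quoted above, although a firewall or a wrong address would look the same.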
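Because of this sandbox restriction, you cannot import data online directly from Azure Cosmos DB with the pymongo driver inside the Execute Python Script module. You can, however, use the Import Data module with your Azure Cosmos DB connection and parameter information, and connect its output to an input of Execute Python Script to get the data (the original answer showed this wiring in a figure).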
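With that wiring, the script no longer opens a database connection; it only receives the imported rows as its first input. A minimal sketch of what the script might look like, assuming Import Data is connected to the first input port; dropping the "_id" column is a hypothetical cleanup step, not something the original answer prescribes:

```python
import pandas as pd

def azureml_main(dataframe1=None, dataframe2=None):
    # dataframe1 is assumed to be the dataset produced by the Import Data module
    if dataframe1 is None:
        raise ValueError("Connect the Import Data output to the first input port.")
    # Work on a copy so the input dataset is left untouched
    df = dataframe1.copy()
    # Hypothetical cleanup: drop the document id column if it was imported
    if "_id" in df.columns:
        df = df.drop("_id", axis=1)
    # Azure ML Studio expects a sequence of DataFrames, hence the trailing comma
    return df,
```

No pymongo import is needed in this version, so nothing in the script depends on network access.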
For more information about importing data online, see the Import from online data sources section of the official document Import your training data into Azure Machine Learning Studio from various data sources.