python - AttributeError: Can't get attribute 'InsertNews'
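Problem description
I am trying to write a program that scrapes website content. The script seems to run for a while, but it stops after a few iterations with this error: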
Traceback (most recent call last):
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\util.py", line 300, in _run_finalizers
    finalizer()
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 581, in _terminate_pool
    cls._help_stuff_finish(inqueue, task_handler, len(pool))
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\pool.py", line 568, in _help_stuff_finish
    inqueue._reader.recv()
  File "D:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\multiprocessing\connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
AttributeError: Can't get attribute 'InsertNews' on <module '__main__' from 'c:\\program files (x86)\\microsoft visual studio\\2019\\common7\\ide\\extensions\\microsoft\\python\\core\\debugpy\\__main__.py'>
This is the script I am trying to run:

    from boilerpy3 import extractors
    import pymongo
    import multiprocessing as mp

    def InsertNews(newsite, symbol):
        print(symbol)
        print(newsite)
        extractor = extractors.ArticleExtractor()
        try:
            content = extractor.get_content_from_url(newsite)
        except Exception:
            pass
        print(content)
        record={symbol,content}
        mydb["StocksPressRelease"].insert_one(record)

    if __name__ == "__main__":
        print("started")
        pool = mp.Pool(mp.cpu_count())
        myclient = pymongo.MongoClient("mongodb+srv://un:pwd@cluster0.subkd.azure.mongodb.net/db?retryWrites=true&w=majority&connectTimeoutMS=900000")
        mydb = myclient["db"]
        mycol = mydb["Stocks"]
        for x in mycol.find({},{"_id": 0, "symbol":1, "newsite": 1 }):
            results = pool.apply_async(InsertNews,args=(x["newsite"],x["symbol"]))
        pool.close()
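Solution
From what I have read in this post, a multiprocessing pool does not work properly with objects that are not defined in an imported module. The pool sends the function to its workers by pickling it, and pickle stores only the module and function name, so each worker has to look the name up again. In your traceback the lookup happens on a `__main__` that is debugpy's `__main__.py`, not your script, so `InsertNews` cannot be found there. You can try writing the InsertNews function in a separate module and importing it.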
File: news.py

    from boilerpy3 import extractors

    def InsertNews(newsite, symbol):
        print(symbol)
        print(newsite)
        extractor = extractors.ArticleExtractor()
        try:
            content = extractor.get_content_from_url(newsite)
        except Exception:
            # Skip this page; falling through would hit an undefined `content`.
            return
        print(content)
File: main.py

    import pymongo
    import multiprocessing as mp
    import news

    if __name__ == "__main__":
        print("started")
        pool = mp.Pool(mp.cpu_count())
        myclient = pymongo.MongoClient("mongodb+srv://un:pwd@cluster0.subkd.azure.mongodb.net/db?retryWrites=true&w=majority&connectTimeoutMS=900000")
        mydb = myclient["db"]
        mycol = mydb["Stocks"]
        for x in mycol.find({},{"_id": 0, "symbol":1, "newsite": 1 }):
            # news.InsertNews is pickled by name; workers re-import it from news.py
            results = pool.apply_async(news.InsertNews,args=(x["newsite"],x["symbol"]))
        pool.close()
        # Wait for the queued tasks; otherwise the pool is terminated at exit.
        pool.join()
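To see why moving the function into an importable module can help, here is a minimal sketch of how pickle handles functions, using only the standard library. The module name `fake_main` and the function `insert_news` are hypothetical stand-ins invented for this demo, not part of the original script. The sketch assumes the failure comes from the name lookup during unpickling, as the traceback suggests.

```python
import pickle
import sys
import types

# Build a throwaway module that stands in for the script run as __main__.
# "fake_main" and insert_news are hypothetical names used only for this demo.
fake_main = types.ModuleType("fake_main")
exec("def insert_news(url):\n    return url.upper()\n", fake_main.__dict__)
sys.modules["fake_main"] = fake_main

# Pickling a function records only its module and name, not its code.
payload = pickle.dumps(fake_main.insert_news)
name_only = b"fake_main" in payload and b"insert_news" in payload

# While the name is still reachable, unpickling returns the very same function.
same_function = pickle.loads(payload) is fake_main.insert_news

# Simulate a worker whose __main__ is a different file (for example a debugger
# launcher): the name is no longer there, so unpickling fails.
del fake_main.insert_news
try:
    pickle.loads(payload)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(name_only, same_function, error_message)
```

A function that lives in a regular module such as news.py is found by an ordinary import in every worker, so the lookup no longer depends on what `__main__` happens to be.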