python - File "<frozen importlib._bootstrap>", line 1006, in _gcd_import error
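Problem description
I made a simple scraper that should fetch and download images from an Instagram profile, but I keep getting this error message.
I tried to follow this tutorial: https://docs.scrapy.org/en/latest/topics/media-pipeline.html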
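# instagarm_spider.py
import scrapy


class InstagramSpider(scrapy.Spider):
    name = 'ig'
    start_urls = [
        'https://www.instagram.com/kyliejenner/'
    ]

    def parse(self, response):
        # Note: scrapy.Field() is meant for declaring fields on a scrapy.Item
        # class, so these are empty placeholders, not image URLs.
        image_urls = scrapy.Field()
        images = scrapy.Field()
        yield {'image_urls': image_urls,
               'images': images}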
# settings.py
BOT_NAME = 'instagramscraper'
SPIDER_MODULES = ['instagramscraper.spiders']
NEWSPIDER_MODULE = 'instagramscraper.spiders'
ITEM_PIPELINES = {'scrapy.pipelines.images.ImagesPipeline': 1}
IMAGES_STORE = 'C:\Users\jliv3\PycharmProjects\Instagram\instagramscraper\instagramscraper\spiders\test'
This is the full error message I receive.
  File "C:\Users\jliv3\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\jliv3\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\jliv3\PycharmProjects\Instagram\venv\Scripts\scrapy.exe\__main__.py", line 9, in <module>
  File "c:\users\jliv3\pycharmprojects\instagram\venv\lib\site-packages\scrapy\cmdline.py", line 114, in execute
    settings = get_project_settings()
  File "c:\users\jliv3\pycharmprojects\instagram\venv\lib\site-packages\scrapy\utils\project.py", line 68, in get_project_settings
    settings.setmodule(settings_module_path, priority='project')
  File "c:\users\jliv3\pycharmprojects\instagram\venv\lib\site-packages\scrapy\settings\__init__.py", line 294, in setmodule
    module = import_module(module)
  File "C:\Users\jliv3\AppData\Local\Programs\Python\Python37\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 724, in exec_module
  File "<frozen importlib._bootstrap_external>", line 860, in get_code
  File "<frozen importlib._bootstrap_external>", line 791, in source_to_code
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "C:\Users\jliv3\PycharmProjects\Instagram\instagramscraper\instagramscraper\settings.py", line 70
    IMAGES_STORE = 'C:\Users\jliv3\PycharmProjects\Instagram\instagramscraper\instagramscraper\spiders\test'
                  ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
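Solution
You can try setting your IMAGES_STORE like this:
IMAGES_STORE = r'C:\Users\jliv3\PycharmProjects\Instagram\instagramscraper\instagramscraper\spiders\test'
The r prefix makes this a raw string, so backslashes are not interpreted as escape sequences and '\U' is kept as two literal characters, which is what you need here. Without the prefix, Python reads '\U' as the start of an 8-digit Unicode escape (\UXXXXXXXX), and '\Users' cannot complete it, hence the "truncated \UXXXXXXXX escape" error.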
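As a rough illustration of why the raw string helps, the sketch below compares three common ways to spell a Windows path and reproduces the error with compile(). The shortened path C:\Users\test and the helper name literal_error are made up for this example; they are not part of the original project.

```python
from pathlib import PureWindowsPath

# Hypothetical, shortened path used only for illustration.
raw = r'C:\Users\test'        # raw string: backslashes are kept literally
escaped = 'C:\\Users\\test'   # every backslash escaped by hand
forward = 'C:/Users/test'     # forward slashes, which Windows also accepts

# The first two spellings are the same string; the third names the same path.
assert raw == escaped
assert PureWindowsPath(forward) == PureWindowsPath(raw)


def literal_error(source):
    """Compile one line of source; return the SyntaxError message, or None."""
    try:
        compile(source, '<settings>', 'exec')
    except SyntaxError as exc:
        return exc.msg
    return None


# Source text as it would appear in settings.py (hypothetical path).
bad_source = r"IMAGES_STORE = 'C:\Users\test'"     # no r prefix: \U starts an escape
good_source = r"IMAGES_STORE = r'C:\Users\test'"   # raw string literal

print(literal_error(bad_source))    # the 'unicodeescape' error message
print(literal_error(good_source))   # None: the line compiles cleanly
```

Any of the three spellings should work for IMAGES_STORE. The raw string is the smallest change to the original line, and it also protects the trailing '\test', where '\t' would otherwise be read as a tab character.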