Scraping data with Instaloader for a specific hashtag and time range

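Problem description

I need help using instaloader to scrape Instagram posts tagged #slowfashion from a specific time period.

I want to scrape both visual and textual data from the posts (specifically the posted images, their captions, and the comments).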

Tags: web-scraping, time, hashtag, period, instaloader

Solution


from datetime import datetime
from itertools import dropwhile, takewhile

import instaloader

# Choose what is saved for each post; comments are enabled because the question asks for them
L = instaloader.Instaloader(
    download_pictures=True,
    download_videos=False,
    download_comments=True,
    save_metadata=True,
)

# Login
username = input("Enter your username: ")
L.interactive_login(username=username)

# User query (a leading "#" is stripped, since Hashtag.from_name takes the bare name)
search = input("Enter Hashtag: ").strip().lstrip("#")
limit = int(input("How many posts to download: "))

# Iterator over the posts tagged with the hashtag
hashtags = instaloader.Hashtag.from_name(L.context, search).get_posts()

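# Download period. Posts are iterated newest first, so SINCE is the newer bound
# and UNTIL the older one: posts with UNTIL < date <= SINCE are kept.
SINCE = datetime(2021, 5, 1)
UNTIL = datetime(2021, 3, 1)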

no_of_downloads = 0
for post in takewhile(lambda p: p.date > UNTIL, dropwhile(lambda p: p.date > SINCE, hashtags)):
    if no_of_downloads == limit:
        break
    print(post.date)
    # Save the post into a directory named after the hashtag
    L.download_post(post, "#" + search)
    no_of_downloads += 1
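
The dropwhile/takewhile filter above stops at the first post that is older than UNTIL, which is only safe if the feed is strictly newest first. Hashtag feeds may not always be, so a single stray old post can end the loop early. The sketch below shows the difference on made-up data, with no network access. FakePost, strict_window, tolerant_window and the patience parameter are hypothetical names introduced here for illustration; they are not part of Instaloader.

```python
from collections import namedtuple
from datetime import datetime
from itertools import dropwhile, takewhile

# Hypothetical stand-in for instaloader.Post: only the `date` attribute is modelled.
FakePost = namedtuple("FakePost", ["shortcode", "date"])

SINCE = datetime(2021, 5, 1)  # newer bound: posts after this are skipped
UNTIL = datetime(2021, 3, 1)  # older bound: posts at or before this are too old


def strict_window(posts):
    """Same filter as the loop above; assumes strictly newest-first order."""
    return list(takewhile(lambda p: p.date > UNTIL,
                          dropwhile(lambda p: p.date > SINCE, posts)))


def tolerant_window(posts, patience=50):
    """Stop only after `patience` consecutive posts older than the window."""
    selected, too_old_in_a_row = [], 0
    for post in posts:
        if post.date > SINCE:
            continue  # newer than the window, keep scanning
        if post.date <= UNTIL:
            too_old_in_a_row += 1
            if too_old_in_a_row >= patience:
                break
            continue
        selected.append(post)
        too_old_in_a_row = 0  # an in-window post resets the counter
    return selected


feed = [
    FakePost("a", datetime(2021, 6, 2)),   # too new
    FakePost("b", datetime(2021, 4, 20)),  # in window
    FakePost("c", datetime(2021, 1, 5)),   # stray old post, out of order
    FakePost("d", datetime(2021, 3, 15)),  # in window, but after the stray post
    FakePost("e", datetime(2021, 2, 1)),   # too old
]

print([p.shortcode for p in strict_window(feed)])                # ['b']
print([p.shortcode for p in tolerant_window(feed, patience=2)])  # ['b', 'd']
```

With real data, the same loop body could call L.download_post(post, "#" + search) instead of appending to a list. The patience value is a guess that would need tuning per hashtag: too small and the scan may stop early, too large and it keeps requesting posts well past the window.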
