Keep getting AttributeError when using PRAW to scrape a specific search term in a subreddit

Problem description

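I'm a complete beginner. My goal is to scrape Reddit posts and comments from the subreddit r/Coronavirus using the search term "smoker". I keep getting "AttributeError: 'MoreComments' object has no attribute 'body'", which points to the line commentsDict["Body"].append(topLevelComments.body). The other lines that read from topLevelComments (.author, .score and .body) also keep crashing it. When I comment out every line that appends something from topLevelComments, I get ValueError("arrays must all be same length") instead. I'm going crazy because this code ran fine two days ago. Please help; the code is below. I have commented out the lines that cause trouble, but I'm also not sure how to deal with the second error. One step at a time, though: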

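import datetime

import pandas as pd

# `reddit` is assumed to be an authenticated praw.Reddit instance created earlier.
commentsDict = {"Post": [], "Post Score": [], "No of Comments": [], "Post Date": [],
                "Body": [], "Score": [], "Date": [], "Author": [], "id": [], "p_auth": [], "Post body": []}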

for submission in reddit.subreddit("Coronavirus").search("smoker"):
    submission.comment_sort = 'new'
    topLevelComments = list(submission.comments)
    for topLevelComments in submission.comments:
        commentsDict["Post"].append(submission.title)#title of post with comment
        commentsDict["Post Score"].append(submission.score)
        commentsDict["Post body"].append(submission.selftext)
        commentsDict["id"].append(submission.id)
        commentsDict["p_auth"].append(submission.author)
        commentsDict["No of Comments"].append(submission.num_comments)
        date = submission.created_utc
        timestamp = datetime.datetime.fromtimestamp(date)
        commentsDict["Post Date"].append(timestamp.strftime('%Y-%m-%d %H:%M:%S'))  # %d = day of month
        # Commented out: these lines raise AttributeError on MoreComments objects
        #commentsDict["Body"].append(topLevelComments.body)
        #commentsDict["Score"].append(topLevelComments.score)
        #date = topLevelComments.created
        timestamp = datetime.datetime.fromtimestamp(date)
        commentsDict["Date"].append(timestamp.strftime('%Y-%m-%d %H:%M:%S'))
        #commentsDict["Author"].append(topLevelComments.author)

commentsDF = pd.DataFrame(commentsDict)

commentsDF.to_csv('smoker_covid.csv', index=True) 

Tags: attributes, praw

Solution
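
Both errors probably share one cause. On larger threads, PRAW puts MoreComments placeholders in submission.comments next to the real Comment objects, and a placeholder has no body, score or author. That would also explain why the code ran two days ago: the search results may simply not have included a thread big enough to contain a placeholder. The ValueError appears once the three appends are commented out, because "Body", "Score" and "Author" stay empty while the other lists keep growing, and pandas cannot build a DataFrame from lists of different lengths.

A minimal sketch of one way around both problems, assuming the same fields as in the question. The helper names format_utc and collect_rows are made up for this example. The only PRAW call used is submission.comments.replace_more(limit=0), which drops the placeholders without fetching the hidden comments. Timestamps are formatted in UTC here, which differs from the local-time conversion in the question.

```python
from datetime import datetime, timezone


def format_utc(ts):
    """Format a Unix timestamp as 'YYYY-MM-DD HH:MM:SS' in UTC."""
    # %d is the day of the month. The question used %D, which is a
    # platform-dependent shorthand for a whole date and garbles the output.
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def collect_rows(submissions):
    """Return one dict per top-level comment (hypothetical helper)."""
    rows = []
    for submission in submissions:
        # Remove MoreComments placeholders from the comment tree.
        # limit=0 removes them without loading the comments they stand for.
        submission.comments.replace_more(limit=0)
        for comment in submission.comments:
            # Defensive check in case a placeholder is still present.
            if not hasattr(comment, "body"):
                continue
            # One complete record per comment, so every column ends up
            # with the same number of entries.
            rows.append({
                "Post": submission.title,
                "Post Score": submission.score,
                "No of Comments": submission.num_comments,
                "Post Date": format_utc(submission.created_utc),
                "Body": comment.body,
                "Score": comment.score,
                "Date": format_utc(comment.created_utc),
                "Author": str(comment.author),    # "None" for deleted accounts
                "id": submission.id,
                "p_auth": str(submission.author),
                "Post body": submission.selftext,
            })
    return rows
```

With the reddit object from the question, this could be used as rows = collect_rows(reddit.subreddit("Coronavirus").search("smoker")), followed by pd.DataFrame(rows).to_csv('smoker_covid.csv', index=True). Building a list of per-comment dicts instead of a dict of lists means a skipped comment cannot leave the columns with different lengths.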

