How do I extract tweets matching specific COVID-related keywords from a user timeline in Python?

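Problem description

I want to retrieve at least 1000 tweets from {user}'s timeline, replies included. Of those 1000 tweets, at least 100 must relate to Covid-19 keywords such as ["covid19", "wuhan", "mask", "lockdown", "quarantine", "sars-cov-2"].

I wrote a function to retrieve the tweets: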

def get_tweets_by_user(self, screen_name):
        '''
        Use user_timeline api to fetch POI related tweets, some postprocessing may be required.
        :return: List
        '''

        result = []

        tweets = api.user_timeline(screen_name=screen_name, 
                           # 200 is the maximum allowed count
                           count=200,
                           include_rts = True,
                           # Necessary to keep full_text 
                           # otherwise only the first 140 characters are returned
                           tweet_mode = 'extended'
                           )
        
        for tw in tweets:
            result.append(tw)

        return result

Now, how do I retrieve 100 tweets related to the covid-19 keywords from the user's timeline?

Tags: python, twitter, tweepy

Solution


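Sign up for Twitter developer API access. You will need a consumer key and secret plus an access token and secret. Tell them you are a student when you apply.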

import json
import twitter  # python-twitter; install this library to work with the Twitter dev API

# Replace the placeholders with the credentials from your developer account.
consumer_key = "your key"
consumer_secret = "your key"
access_token = "your key"
access_token_secret = "your key"

api = twitter.Api(consumer_key=consumer_key,
                  consumer_secret=consumer_secret,
                  access_token_key=access_token,
                  access_token_secret=access_token_secret)
FILTER = ["covid-19 string here"] # PUT YOUR COVID 19 STRING HERE
LANGUAGES = ['en']
store_file = "outputfileforcovidtweets.txt"
_location = ["put coordinates here"]
def main():
    # Append each streamed tweet to the output file as one JSON object per line.
    with open(store_file, 'a') as z:
        for line in api.GetStreamFilter(track=FILTER, languages=LANGUAGES, locations=_location):
            z.write(json.dumps(line))
            z.write('\n')

if __name__ == "__main__":
    main()

This will collect live tweets that match the filter into your output file. :)
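
The stream filter only captures tweets posted while it is running, so it does not cover a user's existing timeline. For the case in the question, one possible approach is to page backwards through `user_timeline` with `max_id` and keep the tweets whose text contains a keyword. The sketch below assumes `api` is an authenticated `tweepy.API` object like the one used in the question; `matches_keywords` and `collect_tweets` are hypothetical helper names, and the plain case-insensitive substring match is a simplification rather than the only way to decide relevance.

```python
def matches_keywords(text, keywords):
    """Return True if any keyword occurs in text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(kw.lower() in lowered for kw in keywords)


def collect_tweets(api, screen_name, keywords,
                   min_total=1000, min_matching=100, max_pages=16):
    """Page backwards through a timeline until both targets are met.

    `api` is assumed to be an authenticated tweepy.API instance.
    Returns (all_tweets, matching_tweets). Either list may fall short
    of its target if the timeline runs out first.
    """
    all_tweets, matching = [], []
    max_id = None
    # 16 pages of 200 tweets is an assumed cap, chosen to stay within
    # the depth the timeline endpoint is generally understood to expose.
    for _ in range(max_pages):
        kwargs = dict(screen_name=screen_name, count=200,
                      include_rts=True, tweet_mode="extended")
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = api.user_timeline(**kwargs)
        if not page:
            break  # no older tweets available
        for tw in page:
            all_tweets.append(tw)
            if matches_keywords(tw.full_text, keywords):
                matching.append(tw)
        # Ask for tweets strictly older than the last one seen.
        max_id = page[-1].id - 1
        if len(all_tweets) >= min_total and len(matching) >= min_matching:
            break
    return all_tweets, matching
```

A call might look like `all_tweets, covid_tweets = collect_tweets(api, screen_name, ["covid19", "wuhan", "mask", "lockdown", "quarantine", "sars-cov-2"])`. If the user has fewer than 100 matching tweets in the reachable part of the timeline, the function simply returns what it found, so the caller should check the lengths.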

