How to get the full text field value from the Twitter API using TwythonStreamer



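I am trying to get the full tweet with the code below. I understand that you are supposed to set the tweet_mode parameter to "extended", but since I am not making a standard query here, I don't know where it fits. For the text field I always get partial text truncated with "…" followed by a URL. With this configuration, how would you get the full tweet: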

from twython import Twython, TwythonStreamer
import json
import pandas as pd
import csv


def process_tweet(tweet):
    d = {}
    d['hashtags'] = [hashtag['text'] for hashtag in tweet['entities']['hashtags']]
    d['text'] = tweet['text']
    d['user'] = tweet['user']['screen_name']
    d['user_loc'] = tweet['user']['location']
    return d


# Create a class that inherits TwythonStreamer
class MyStreamer(TwythonStreamer):
    # Received data
    def on_success(self, data):
        # Only collect tweets in English
        if data['lang'] == 'en':
            tweet_data = process_tweet(data)
            self.save_to_csv(tweet_data)

    # Problem with the API
    def on_error(self, status_code, data):
        print(status_code, data)
        self.disconnect()

    # Save each tweet to csv file
    def save_to_csv(self, tweet):
        with open(r'tweets_about_california.csv', 'a') as file:
            writer = csv.writer(file)
            writer.writerow(list(tweet.values()))


# Instantiate from our streaming class
stream = MyStreamer(creds['CONSUMER_KEY'], creds['CONSUMER_SECRET'],
                    creds['ACCESS_TOKEN'], creds['ACCESS_SECRET'])
# Start the stream
stream.statuses.filter(track='california', tweet_mode='extended')

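The tweet_mode=extended parameter has no effect on the v1.1 streaming API, because every Tweet is delivered in both the extended and the default (140-character) format.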

If the Tweet object has truncated: true, the payload will contain an additional element, extended_tweet. This is where the full_text value is stored.
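As a minimal sketch of the rule above, the lookup can be wrapped in a small helper. The name get_full_text and the two sample payloads are made up for illustration (they are not real API output); only the truncated, extended_tweet, full_text and text keys come from the v1.1 payload described here.

```python
def get_full_text(tweet):
    """Return the complete text of a v1.1 streaming Tweet payload (a dict)."""
    # Truncated Tweets carry the complete text under extended_tweet.full_text.
    if tweet.get('truncated') and 'extended_tweet' in tweet:
        return tweet['extended_tweet']['full_text']
    # Otherwise the plain text field already holds the whole Tweet.
    return tweet['text']


# Hand-written sample payloads (hypothetical) that exercise both branches.
short_tweet = {'truncated': False, 'text': 'Hello from California'}
long_tweet = {
    'truncated': True,
    'text': 'A long tweet about Calif… https://t.co/xyz',
    'extended_tweet': {
        'full_text': 'A long tweet about California that runs past the 140-character limit.'
    },
}

print(get_full_text(short_tweet))
print(get_full_text(long_tweet))
```

In the question's code, this would presumably mean replacing d['text'] = tweet['text'] in process_tweet with d['text'] = get_full_text(tweet); the tweet_mode argument to statuses.filter can then be dropped.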

Note that this answer applies only to the v1.1 Twitter API. In v2, the streaming API returns the full Tweet text by default (Twython does not currently support v2).
