Python encoding problem when searching tweets



I wrote the following code to fetch tweets, reading the search keywords from a 'utf-8' encoded file:

import codecs
import json
import time

# Read the search keywords (one per line) as unicode strings
f = codecs.open("keywords", encoding='utf-8')
kws = f.readlines()
f.close()
print kws
for kw in kws:
    timeline_endpoint = 'https://api.twitter.com/1.1/search/tweets.json?q=' + kw + '&count=100&lang=fr'
    print timeline_endpoint
    # This call raises the UnicodeEncodeError shown below
    response, data = client.request(timeline_endpoint)
    tweets = json.loads(data)
    for tweet in tweets['statuses']:
        # tweet is already a dict, so it is serialized directly
        my_on_data(json.dumps(tweet))
    time.sleep(3)

But I get the following error:

response, data = client.request(timeline_endpoint)
File "build/bdist.linux-x86_64/egg/oauth2/__init__.py", line 676, in request
File "build/bdist.linux-x86_64/egg/oauth2/__init__.py", line 440, in to_url
File "/usr/lib/python2.7/urllib.py", line 1357, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
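The traceback suggests the cause: the keywords are read as unicode strings, so the URL is unicode too, and urlencode then calls str() on a value containing u'\xe9', which Python 2 tries to encode as ASCII. One way to avoid this is to encode the keyword as UTF-8 and percent-encode it before building the URL. The sketch below assumes a hypothetical helper name, build_search_url, and has not been checked against oauth2 itself:

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote_plus          # Python 2
except ImportError:
    from urllib.parse import quote_plus    # Python 3


def build_search_url(keyword, count=100, lang='fr'):
    """Build a search URL whose query is pure ASCII (hypothetical helper)."""
    # Drop the newline left by readlines(), encode to UTF-8 bytes,
    # then percent-encode so no non-ASCII character reaches urlencode.
    encoded = quote_plus(keyword.strip().encode('utf-8'))
    return ('https://api.twitter.com/1.1/search/tweets.json'
            '?q=' + encoded + '&count=' + str(count) + '&lang=' + lang)
```

With this helper, build_search_url(u'\u00e9t\u00e9\n') gives a URL whose query part is q=%C3%A9t%C3%A9, which could then be passed to client.request in place of the hand-built string.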

OK, here is a solution that uses a different search method (tweepy):

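import json
import time
import tweepy

auth = tweepy.OAuthHandler("k1", "k2")
auth.set_access_token("k3", "k4")
api = tweepy.API(auth)
max_tweets = 10
for kw in kws:
    # kws holds unicode strings; drop the newline left by readlines() and pass UTF-8 bytes
    query = kw.strip().encode('utf-8')
    # Note: api.search was renamed api.search_tweets in tweepy 4.0
    searched_tweets = list(tweepy.Cursor(api.search, q=query).items(max_tweets))
    for tweet in searched_tweets:
        my_on_data(json.dumps(tweet._json))
    time.sleep(3)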
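One more detail: readlines() keeps the trailing newline on every keyword, so it ends up in the search query unless it is stripped. A minimal sketch for loading the keywords file, assuming one keyword per line and a hypothetical helper name load_keywords:

```python
import io


def load_keywords(path):
    """Return the non-empty lines of a UTF-8 file as stripped unicode strings."""
    # io.open decodes the file the same way on Python 2 and Python 3
    with io.open(path, encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]
```

Calling kws = load_keywords("keywords") would then replace the codecs.open / readlines() / close() sequence from the question.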
