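Below is my code for reading data from the Twitter stream. When I run it from the terminal, it returns neither data nor an error. When I kill the process, I get the following traceback: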
File "Soundcloud.py", line 59, in <module>
twitter_stream.filter(track=['soundcloud.com'])
File "/Library/Python/2.7/site-packages/tweepy/streaming.py", line 430, in filter
self._start(async)
File "/Library/Python/2.7/site-packages/tweepy/streaming.py", line 346, in _start
self._run()
File "/Library/Python/2.7/site-packages/tweepy/streaming.py", line 255, in _run
self._read_loop(resp)
File "/Library/Python/2.7/site-packages/tweepy/streaming.py", line 298, in _read_loop
line = buf.read_line().strip()
File "/Library/Python/2.7/site-packages/tweepy/streaming.py", line 171, in read_line
self._buffer += self._stream.read(self._chunk_size)
File "/Library/Python/2.7/site-packages/requests/packages/urllib3/response.py", line 243, in read
data = self._fp.read(amt)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 588, in read
return self._read_chunked(amt)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 630, in _read_chunked
line = self.fp.readline(_MAXLINE + 1)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 480, in readline
data = self._sock.recv(self._rbufsize)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 734, in recv
return self.read(buflen)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.py", line 621, in read
v = self._sslobj.read(len or 1024)
Here is the code:
import tweepy
from tweepy import Stream
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
import json
import time
import pymysql
import sys
import extraction
import requests
import codecs
import urllib2
import urllib
from urllib import urlopen
from BeautifulSoup import BeautifulSoup
#twitter Authentication-keys not entered
consumer_key = ''
consumer_secret = ''
access_token = ''
access_secret = ''
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
global conn
conn=pymysql.connect(db='twitter_users', user='root' , host='127.0.0.1' , port=3307)
global cursor
cursor=conn.cursor()
class MyListener(StreamListener):
    def on_status(self, status):
        tweet_json=json.loads(status)
        print(tweet_json)
        for i in tweet_json:
            user_handle=i['user']['screen_name']
            user_followers=i['user']['followers_count']
            user_statuses=i['user']['statuses_count']
            user_location=i['user']['location']
            user_geo=i['geo']
            tweet_place=i['place']
            tweet_device=i['source']
            tweet_id=i['id_str']
            url=i['entities']['urls'][0]['expanded_url']
            print(url)
        return True
    def on_error(self, status):
        print(status)
        return True
    def on_timeout(self):
        print("Received timeout. Sleeping for 20 secs")
        time.sleep(20)
        return True
twitter_stream = Stream(auth, MyListener())
twitter_stream.filter(track=['soundcloud.com'])
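There are a few things wrong with this code.
- on_status is not passed a JSON object; it receives a tweepy Status object, so json.loads(status) is not needed. Access the fields as attributes, e.g. user_handle=status.user.screen_name
- You don't need the for loop inside on_status. The method is called once for each tweet as it arrives, so the iteration already happens for you.
Fix these errors and see whether anything changes.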