Searching for specific keywords with the Twitter API and Spark



I am trying the code below, with "#" replaced by "#Apple" in the filter.

val ssc = new StreamingContext("local[*]", "PopularHashtags", Seconds(1))
val tweets = TwitterUtils.createStream(ssc, None)
val statuses = tweets.map(status => status.getText())
val tweetwords = statuses.flatMap(tweetText => tweetText.split(" "))
val hashtags = tweetwords.filter(word => word.startsWith("#"))
val hashtagKeyValues = hashtags.map(hashtag => (hashtag, 1))
val hashtagCounts = hashtagKeyValues.reduceByKeyAndWindow( (x,y) => x + y, (x,y) => x - y, Seconds(1000), Seconds(1))
val sortedResults = hashtagCounts.transform(rdd => rdd.sortBy(x => x._2, false))
sortedResults.print

But I am not getting any results.

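Does this kind of streaming limit the number of tweets, or the region the tweets are fetched from? I also tried searching for #OPPO, since it was trending on my Twitter account, but I still got no results.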

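import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

val ssc = new StreamingContext("local[*]", "PopularHashtags", Seconds(1))
// The keywords you want to look for are passed to createStream as a sequence of filters
val seq: Seq[String] = Seq("#Rajasthan", "#Apple")
val tweets = TwitterUtils.createStream(ssc, None, seq)
// Extract the text of each status, split it into words, and keep the words containing '#'
val statuses = tweets.map(status => status.getText())
val tweetwords = statuses.flatMap(tweetText => tweetText.split(" "))
val hashtags = tweetwords.filter(word => word.contains("#"))
val hashtagKeyValues = hashtags.map(hashtag => (hashtag, 1))
// Count over a sliding window; this inverse-function form requires checkpointing to be enabled on ssc
val hashtagCounts = hashtagKeyValues.reduceByKeyAndWindow((x, y) => x + y, (x, y) => x - y, Seconds(1000), Seconds(1))
// Sort by count, descending, and print the top entries of each batch
val sortedResults = hashtagCounts.transform(rdd => rdd.sortBy(x => x._2, false))
sortedResults.print()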
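
To check the word-splitting and counting logic without a live Twitter stream, the per-batch steps can be tried on an ordinary collection. The following is only a rough sketch: the object name HashtagSketch, its two helper functions, and the sample tweets are all made up for illustration and are not part of Spark or of the code above. It uses startsWith("#") as in the question; with contains("#"), as in the answer, the token "foo#bar" would also be counted.

```scala
// Plain-collection sketch of the hashtag logic (no Spark or Twitter connection needed).
// HashtagSketch and its helpers are hypothetical names used only for this illustration.
object HashtagSketch {
  // Split each tweet on whitespace and keep the tokens that begin with '#'.
  def extractHashtags(tweets: Seq[String]): Seq[String] =
    tweets.flatMap(_.split("\\s+")).filter(_.startsWith("#"))

  // Count each hashtag, then sort by descending count (ties broken alphabetically).
  def countHashtags(tags: Seq[String]): Seq[(String, Int)] =
    tags.groupBy(identity)
      .map { case (tag, occurrences) => (tag, occurrences.size) }
      .toSeq
      .sortBy { case (tag, count) => (-count, tag) }

  def main(args: Array[String]): Unit = {
    // Made-up sample tweets standing in for one batch of the stream.
    val sample = Seq(
      "New #Apple event today",
      "Loving my #Apple watch and #OPPO phone",
      "mail me at foo#bar"
    )
    countHashtags(extractHashtags(sample)).foreach(println)
  }
}
```

If the counts look right here but the streaming job still prints nothing, the problem is more likely in the stream setup (credentials, filters, or starting the context) than in the transformations.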
