I want to plot a word cloud from one column of a pandas DataFrame.
This is my code:
all_words=''.join( [tweet for tweet in tweet_table['tokens'] ] )
word_Cloud=WordCloud(width=500, height=300, random_state=21, max_font_size=119).generate(all_words)
plt.imshow(word_Cloud, interpolation='bilinear')
The column I want to plot, tweet_table['tokens'], looks like this:
0 [da, trumpanzee, follower, blm, balance, wp, g...
1 [counting, blacklivesmatter, received, trainin...
2 [okay, like, little, kids, pretty, smart, know...
3 [thank, oscopelabs, got, mounted, loud, amp, p...
4 [bpi, proud, supported, hoops, 4l, f, e, see, ...
...
44713 [tomorrow, buy, charity, compilation, undergro...
44714 [needs, erected, state, capitol, think, darkfa...
44715 [clay, county, sheriffs, motto, screw, amp, in...
44716 [films, eleven, films, bravo, bad, ass, video,...
44717 [everybody, give, listen, blm]
The code above gives me the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-227-4066d6d1a153> in <module>
2 # REMOVE STOP WORDS
3
----> 4 all_words=''.join( [tweet for tweet in tweet_table['tokens'] ] )
TypeError: sequence item 0: expected str instance, list found
How can I fix this error? The column tweet_table['tokens'] is tokenized and has had all stopwords removed.
Thanks a lot.
PS: When I use similar code on the column tweet_table['clean_text'], it works fine.
The column tweet_table['clean_text'] looks like this:
0 You have a da trumpanzee follower in ...
1 Over 279 and counting If BlackLivesMatte...
2 Okay but like little kids are pretty smart and...
3 Thank you oscopelabs got it mounted loud amp...
4 BPI is proud to have supported Hoops4L Y F E ...
...
44713 TOMORROW you can buy the charity compilation...
44714 That needs to be erected at the State Capi...
44715 Clay County Sheriffs Motto To Screw amp ...
44716 Films Eleven Films bravo Bad ass vid...
44717 everybody should give this a listen ...
I just fixed it with:
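# Flatten the token lists into one space-separated string.
# str(tweet_table['tokens']) would only yield the truncated printed form of the Series.
allwords=' '.join(word for tokens in tweet_table['tokens'] for word in tokens)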
word_Cloud=WordCloud(width=500, height=300, random_state=21,
max_font_size=119).generate(allwords)
plt.imshow(word_Cloud, interpolation='bilinear')
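Here tweet_table['tokens'] does not contain any stop words. Otherwise, we create a list of stop words and pass it to WordCloud, as in the code below: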
from wordcloud import WordCloud,STOPWORDS
stopwords_newlist = ["https", "co"] + list(STOPWORDS)
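# Flatten the token lists into one space-separated string (see the note on str() above).
allwords=' '.join(word for tokens in tweet_table['tokens'] for word in tokens)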
word_Cloud=WordCloud(width=500, height=300, random_state=21, stopwords=stopwords_newlist,
max_font_size=119).generate(allwords)
plt.imshow(word_Cloud, interpolation='bilinear')
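For context, the original TypeError comes from str.join receiving lists rather than strings: each row of the tokens column is itself a list. Below is a minimal sketch of flattening such a column into a single string while dropping extra stop words. It uses a small made-up list of token lists in place of tweet_table['tokens'] (a pandas Series of lists can be iterated the same way), and the names tokens_to_text, sample_tokens and extra_stopwords are hypothetical, chosen only for illustration.

```python
from collections import Counter


def tokens_to_text(token_lists, stopwords=frozenset()):
    """Join an iterable of token lists into one space-separated string,
    skipping any token whose lowercase form is in `stopwords`."""
    return " ".join(
        word
        for tokens in token_lists
        for word in tokens
        if word.lower() not in stopwords
    )


# Hypothetical sample standing in for tweet_table['tokens']
sample_tokens = [
    ["everybody", "give", "listen", "blm", "https", "co"],
    ["films", "eleven", "films", "bravo", "blm"],
]

# Hypothetical extra stop words, mirroring the list used above
extra_stopwords = {"https", "co"}

all_words = tokens_to_text(sample_tokens, extra_stopwords)
print(all_words)

# Optional: word counts, useful for checking what the cloud will emphasize
frequencies = Counter(all_words.split())
print(frequencies.most_common(3))
```

The resulting all_words string can be passed to WordCloud(...).generate(all_words) in the same way as allwords above. The counts are only a sanity check here; the snippet deliberately avoids importing wordcloud so that it runs on its own.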