How do I combine all the tokenized words in a column into a sentence?
tokenized_word = ['really','smart','people']
into sentence = really smart people
Python has a standard join operation:
sentence = ' '.join(tokenized_word)
If you want to put it into a Pandas column, you can do this:
df['cl_name'] = sentence
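If the tokens already live in a DataFrame column (one list per row), the same join can be applied row by row. Here is a minimal sketch, assuming the column is named tokenized_word. The sample rows and the sentence column name are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: each row holds a list of tokens.
df = pd.DataFrame({
    "tokenized_word": [
        ["really", "smart", "people"],
        ["hello", "world"],
    ]
})

# Join each row's token list into one space-separated string.
df["sentence"] = df["tokenized_word"].apply(" ".join)

print(df["sentence"].tolist())
# ['really smart people', 'hello world']
```

For list-valued rows, df["tokenized_word"].str.join(" ") should give the same result.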
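import string

def remove_punctuation(txt):
    # txt is a list of tokens: drop punctuation tokens (e.g. ',' or '!'), then join the rest with spaces
    txt_nopunt = " ".join([c for c in txt if c not in string.punctuation])
    return txt_nopunt

data['tokenized_word'] = data['tokenized_word'].apply(remove_punctuation)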