I have a text file formatted as follows:
1 1089874 108992 PCCW's chief operating officer. Current Chief Operating Officer Mike.
1 3019446 3019327 The world's two largest. late summer sales frenzy caused more of an industry backlash than expected.
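To be clear, each line consists of a label (1), then a tab, then id1 (1089874), a space, id2 (1089925), a space, text1, then a tab, then text2.
I want to read the text file and extract label, text1, and text2 into separate lists in Python. How can I do this? Thanks.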
Assuming each line is held in a variable named line, just do the following:
cols = line.split() # Splits by any white space
label = cols[0]
text1 = cols[1]
text2 = ' '.join(cols[2:])
Or, re-reading your requirements, I think you actually want:
cols = line.split('\t')  # Splits on tab characters only
label = cols[0]
text1 = ' '.join(cols[1].split()[2:])  # Drops id1 and id2, keeps the rest
text2 = cols[2]
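
To get the three separate lists the question asks for, the tab-based split can be applied to every line of the file. The following is only a minimal sketch: the function name parse_lines and the file name data.txt are made up for illustration, and it assumes every well-formed line has exactly two tabs, with id1 and id2 at the start of the middle field. Lines that do not match are skipped.

```python
def parse_lines(lines):
    """Collect label, text1 and text2 from tab-separated lines into three lists.

    Assumed line layout: label <TAB> id1 id2 text1 <TAB> text2
    """
    labels, text1s, text2s = [], [], []
    for line in lines:
        line = line.rstrip('\n')  # drop the trailing newline, keep tabs intact
        if not line:
            continue  # ignore blank lines
        cols = line.split('\t')
        if len(cols) != 3:
            continue  # skip lines that do not have exactly two tabs
        label, middle, text2 = cols
        # middle holds "id1 id2 text1"; split at most twice so text1 stays whole
        parts = middle.split(None, 2)
        text1 = parts[2] if len(parts) == 3 else ''
        labels.append(label)
        text1s.append(text1)
        text2s.append(text2)
    return labels, text1s, text2s


# Hypothetical usage with a file called data.txt:
# with open('data.txt', encoding='utf-8') as f:
#     labels, text1s, text2s = parse_lines(f)
```

Because the function accepts any iterable of strings, it can be tried on a small list of sample lines before pointing it at the real file.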