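How to combine elements of tuples or lists accordingly in Python

I have several tuples like the ones below. I want to combine all the words into one sentence.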

('1.txt','sentence 1.1','city')
('1.txt','sentence 1.1','apple')
('1.txt','sentence 1.1','ok')
('1.txt','sentence 1.2','go')
('1.txt','sentence 1.2','home')
('1.txt','sentence 1.2','city')
('2.txt','sentence 2.1','sign')
('2.txt','sentence 2.1','tree')
('2.txt','sentence 2.1','cat')
('2.txt','sentence 2.2','good')
('2.txt','sentence 2.2','image')

How can I combine the words by sentence, for example:

('1.txt','sentence 1.1','city apple ok')
('1.txt','sentence 1.2','go home city')
('2.txt','sentence 2.1','sign tree cat')
('2.txt','sentence 2.2','good image')

Or in this form, as a list or dictionary:

['1.txt','sentence 1.1',['city','apple','ok']]
['1.txt','sentence 1.2',['go','home','city']]
['2.txt','sentence 2.1',['sign', 'tree', 'cat']]
['2.txt','sentence 2.2',['good', 'image']]

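If I want to convert it to a dictionary, how should I do that?

Judging by the input data, the words appear to be keyed by the combination of the first and second items of each tuple (indices 0 and 1).

You can build a dictionary that maps that combination to its words, then do some post-processing to reshape the data into the structure you want.

This is a procedural O(n) approach.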

import collections
# input_data is the list of (file_name, sentence_id, word) tuples from the question
sentences = collections.defaultdict(list)
for file_name, sentence_id, word in input_data:
    sentences[(file_name, sentence_id)].append(word)
# sentences now looks like {('1.txt', 'sentence 1.1'): ['city', 'apple', 'ok'], ...}
for key, val in sentences.items():
    print(list(key) + [val])
# ['1.txt', 'sentence 1.1', ['city', 'apple', 'ok']]
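
The question's first desired output joins each word list into a single string. A minimal sketch of that post-processing step, assuming the input is a list named input_data (a name taken from the snippet above, not from the question) and Python 3.7+ so that dictionaries keep insertion order:

```python
from collections import defaultdict

# Sample rows copied from the question; the name input_data is an assumption.
input_data = [
    ('1.txt', 'sentence 1.1', 'city'),
    ('1.txt', 'sentence 1.1', 'apple'),
    ('1.txt', 'sentence 1.1', 'ok'),
    ('1.txt', 'sentence 1.2', 'go'),
    ('1.txt', 'sentence 1.2', 'home'),
    ('1.txt', 'sentence 1.2', 'city'),
    ('2.txt', 'sentence 2.1', 'sign'),
    ('2.txt', 'sentence 2.1', 'tree'),
    ('2.txt', 'sentence 2.1', 'cat'),
    ('2.txt', 'sentence 2.2', 'good'),
    ('2.txt', 'sentence 2.2', 'image'),
]

# Group the words under their (file_name, sentence_id) pair.
sentences = defaultdict(list)
for file_name, sentence_id, word in input_data:
    sentences[(file_name, sentence_id)].append(word)

# Join each word list into one space-separated string.
combined = [
    (file_name, sentence_id, ' '.join(words))
    for (file_name, sentence_id), words in sentences.items()
]

for row in combined:
    print(row)
```

Because the grouping uses a dictionary, the input does not need to be sorted first.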

You can also use groupby with the first two elements of each tuple as the key, assuming your list of tuples is already sorted by those two elements:

from itertools import groupby
[[k[0], k[1], [i[2] for i in g]] for k, g in groupby(lst, key=lambda x: x[:2])]
# [['1.txt', 'sentence 1.1', ['city', 'apple', 'ok']],
# ['1.txt', 'sentence 1.2', ['go', 'home', 'city']],
# ['2.txt', 'sentence 2.1', ['sign', 'tree', 'cat']],
# ['2.txt', 'sentence 2.2', ['good', 'image']]]
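
groupby only merges adjacent items, so unsorted input would produce split groups. A small sketch of one way to handle that, assuming it is acceptable to sort by the first two elements first; the shuffled sample list here is made up for illustration:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical, deliberately unsorted subset of the question's data.
lst = [
    ('2.txt', 'sentence 2.1', 'sign'),
    ('1.txt', 'sentence 1.1', 'city'),
    ('1.txt', 'sentence 1.2', 'go'),
    ('1.txt', 'sentence 1.1', 'apple'),
    ('2.txt', 'sentence 2.1', 'tree'),
    ('1.txt', 'sentence 1.1', 'ok'),
]

# Sort and group on the same key: (file name, sentence id).
key = itemgetter(0, 1)

# sorted() is stable, so words keep their original relative order within a group.
grouped = [
    [k[0], k[1], [item[2] for item in g]]
    for k, g in groupby(sorted(lst, key=key), key=key)
]

print(grouped)
```

The sort makes this O(n log n) rather than O(n), which is the trade-off against the dictionary approach.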

You can try this:

l = []
l.append(('1.txt', 'sentence 1.1', 'city'))
l.append(('1.txt', 'sentence 1.1', 'apple'))
l.append(('1.txt', 'sentence 1.1', 'ok'))
l.append(('1.txt', 'sentence 1.2', 'go'))
l.append(('1.txt', 'sentence 1.2', 'home'))
l.append(('1.txt', 'sentence 1.2', 'city'))
l.append(('2.txt', 'sentence 2.1', 'sign'))
l.append(('2.txt', 'sentence 2.1', 'tree'))
l.append(('2.txt', 'sentence 2.1', 'cat'))
l.append(('2.txt', 'sentence 2.2', 'good'))
l.append(('2.txt', 'sentence 2.2', 'image'))
d = {}
for i in l:
    # Use the (file, sentence) pair as the key; a tuple avoids splitting a joined string later
    myKey = (i[0], i[1])
    if myKey in d:
        d[myKey].append(i[2])
    else:
        # Start the list with the current word so the first word of each sentence is not lost
        d[myKey] = [i[2]]
ans = []
for k in d:
    ans.append([k[0], k[1], d[k]])
print(sorted(ans))
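
For the dictionary form the question asks about, one possible layout is a nested dictionary that maps file name to sentence id to word list. This is only a sketch: the nesting is an assumed layout, and the names rows and nested are illustrative rather than taken from the question:

```python
# Sample rows copied from the question; the names rows and nested are illustrative.
rows = [
    ('1.txt', 'sentence 1.1', 'city'),
    ('1.txt', 'sentence 1.1', 'apple'),
    ('1.txt', 'sentence 1.1', 'ok'),
    ('1.txt', 'sentence 1.2', 'go'),
    ('1.txt', 'sentence 1.2', 'home'),
    ('1.txt', 'sentence 1.2', 'city'),
    ('2.txt', 'sentence 2.1', 'sign'),
    ('2.txt', 'sentence 2.1', 'tree'),
    ('2.txt', 'sentence 2.1', 'cat'),
    ('2.txt', 'sentence 2.2', 'good'),
    ('2.txt', 'sentence 2.2', 'image'),
]

nested = {}
for file_name, sentence_id, word in rows:
    # setdefault creates the inner dict / list the first time a key is seen.
    nested.setdefault(file_name, {}).setdefault(sentence_id, []).append(word)

print(nested['1.txt']['sentence 1.1'])
```

With this layout, all sentences of a file can be looked up with nested['1.txt'], and a single sentence with nested['1.txt']['sentence 1.2'].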
