How to combine elements of tuples or lists accordingly in Python



I have several tuples like the ones below. I want to combine all the words that belong to the same sentence.

('1.txt','sentence 1.1','city')
('1.txt','sentence 1.1','apple')
('1.txt','sentence 1.1','ok')
('1.txt','sentence 1.2','go')
('1.txt','sentence 1.2','home')
('1.txt','sentence 1.2','city')
('2.txt','sentence 2.1','sign')
('2.txt','sentence 2.1','tree')
('2.txt','sentence 2.1','cat')
('2.txt','sentence 2.2','good')
('2.txt','sentence 2.2','image')

How can I combine the words by sentence, for example:

('1.txt','sentence 1.1','city apple ok')
('1.txt','sentence 1.2','go home city')
('2.txt','sentence 2.1','sign tree cat')
('2.txt','sentence 2.2','good image')

Or like this, as a list or a dictionary:

['1.txt','sentence 1.1',['city','apple','ok']]
['1.txt','sentence 1.2',['go','home','city']]
['2.txt','sentence 2.1',['sign', 'tree', 'cat']]
['2.txt','sentence 2.2',['good', 'image']]

And if I want to convert the result to a dictionary, how do I do that?

Judging by the input data, the words appear to be keyed on the combination of the first and second items of each tuple (indices 0 and 1).

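You can build a dictionary that maps this combination of items to the words, then do some post-processing to reshape the data into the structure you want.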

Here is a procedural O(n) approach.

import collections

# input_data is the list of (file_name, sentence_id, word) tuples from the question
sentences = collections.defaultdict(list)
for file_name, sentence_id, word in input_data:
    sentences[(file_name, sentence_id)].append(word)
# sentences now looks like {('1.txt', 'sentence 1.1'): ['city', 'apple', 'ok'], ...}
for key, val in sentences.items():
    print(list(key) + [val])
# ['1.txt', 'sentence 1.1', ['city', 'apple', 'ok']]
# ...
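
The same grouping can also give the other two shapes the question asks for: tuples with the words joined into one string, and a dictionary. A minimal sketch, assuming the tuples are held in a list called input_data as above; the names joined and nested are made up for illustration, and the output order relies on dictionaries preserving insertion order (Python 3.7+):

```python
import collections

input_data = [
    ('1.txt', 'sentence 1.1', 'city'),
    ('1.txt', 'sentence 1.1', 'apple'),
    ('1.txt', 'sentence 1.1', 'ok'),
    ('1.txt', 'sentence 1.2', 'go'),
    ('1.txt', 'sentence 1.2', 'home'),
    ('1.txt', 'sentence 1.2', 'city'),
    ('2.txt', 'sentence 2.1', 'sign'),
    ('2.txt', 'sentence 2.1', 'tree'),
    ('2.txt', 'sentence 2.1', 'cat'),
    ('2.txt', 'sentence 2.2', 'good'),
    ('2.txt', 'sentence 2.2', 'image'),
]

# Group the words under their (file_name, sentence_id) pair
sentences = collections.defaultdict(list)
for file_name, sentence_id, word in input_data:
    sentences[(file_name, sentence_id)].append(word)

# Tuples with the words joined into one space-separated string
joined = [(f, s, ' '.join(words)) for (f, s), words in sentences.items()]

# Nested dictionary (hypothetical layout): file name -> sentence id -> words
nested = {}
for (f, s), words in sentences.items():
    nested.setdefault(f, {})[s] = words

print(joined)
print(nested)
```

If a flat dictionary is enough, dict(sentences) keeps the (file_name, sentence_id) tuples as keys.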

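You can also use groupby with the first two elements of each tuple as the key, assuming your list of tuples is already sorted by those first two elements: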

from itertools import groupby
[[k[0], k[1], [i[2] for i in g]] for k, g in groupby(lst, key = lambda x: x[:2])]
#[['1.txt', 'sentence 1.1', ['city', 'apple', 'ok']],
# ['1.txt', 'sentence 1.2', ['go', 'home', 'city']],
# ['2.txt', 'sentence 2.1', ['sign', 'tree', 'cat']],
# ['2.txt', 'sentence 2.2', ['good', 'image']]]
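
groupby only merges adjacent items with equal keys, so unsorted input would produce split groups. A sketch of one way around that, sorting on the first two elements before grouping. The shuffled sample list and the helper name first_two are made up for illustration. Because sorted is stable, the words keep their original relative order within each sentence:

```python
from itertools import groupby

# Illustrative input: a shuffled subset of the question's tuples
lst = [
    ('2.txt', 'sentence 2.1', 'sign'),
    ('1.txt', 'sentence 1.1', 'city'),
    ('2.txt', 'sentence 2.1', 'tree'),
    ('1.txt', 'sentence 1.1', 'apple'),
    ('1.txt', 'sentence 1.1', 'ok'),
]

def first_two(t):
    """Grouping key: (file name, sentence id)."""
    return t[:2]

# Sort by the key so equal keys become adjacent, then group
grouped = [[k[0], k[1], [t[2] for t in g]]
           for k, g in groupby(sorted(lst, key=first_two), key=first_two)]
print(grouped)
```

Sorting makes this O(n log n) rather than O(n), which only matters for large inputs.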

You can try this:

l = []
l.append(('1.txt', 'sentence 1.1', 'city'))
l.append(('1.txt', 'sentence 1.1', 'apple'))
l.append(('1.txt', 'sentence 1.1', 'ok'))
l.append(('1.txt', 'sentence 1.2', 'go'))
l.append(('1.txt', 'sentence 1.2', 'home'))
l.append(('1.txt', 'sentence 1.2', 'city'))
l.append(('2.txt', 'sentence 2.1', 'sign'))
l.append(('2.txt', 'sentence 2.1', 'tree'))
l.append(('2.txt', 'sentence 2.1', 'cat'))
l.append(('2.txt', 'sentence 2.2', 'good'))
l.append(('2.txt', 'sentence 2.2', 'image'))
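d = {}
for i in l:
    # Key on the file name and sentence id, joined by a space
    myKey = i[0] + " " + i[1]
    if myKey in d:
        d[myKey].append(i[2])
    else:
        # Start the list with the first word (an empty list would drop it)
        d[myKey] = [i[2]]
ans = []
for k in d:
    # Split off the file name only; the sentence id itself contains a space
    v = k.split(" ", 1)
    ans.append([v[0], v[1], d[k]])
print(sorted(ans))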
