Extracting data from tuples with a loop in Python



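I have the following list of tuples, holding the molecule numbers (MolNum) and the corresponding distances from a reference point. The molecules are sorted by distance in ascending order. I can already extract MolNum and the distances as two separate lists. What I want, though, is to get the elements of g that satisfy the condition 10 < distance < 100, giving gg. How can I do this?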

g = [(MolNum(378), 2.4613922385709617e-14),
 (MolNum(373), 40.6680008439399),
 (MolNum(353), 72.49296570091882),
 (MolNum(354), 83.18203548933252),
 (MolNum(359), 88.23588863972836),
 (MolNum(372), 97.47433492265824),
 (MolNum(369), 104.59206739018573),
 (MolNum(370), 114.66573137439451),
 (MolNum(361), 122.33788252133775),
 (MolNum(376), 137.2686523522959),
 (MolNum(360), 141.72521396936926),
 (MolNum(371), 145.96842598002533),
 (MolNum(352), 149.8990795114449),
 (MolNum(366), 164.55606071030496),
 (MolNum(358), 180.72531479536423),
 (MolNum(375), 182.21612213617874),
 (MolNum(364), 185.78028496680486),
 (MolNum(363), 192.02220222384793),
 (MolNum(368), 194.0298647708072),
 (MolNum(365), 194.57037736733918),
 (MolNum(356), 201.91526815811372),
 (MolNum(362), 217.8580017023349),
 (MolNum(357), 234.3818585062885),
 (MolNum(374), 241.33751568809993),
 (MolNum(367), 249.36129229747306),
 (MolNum(355), 253.59625354913504)]

After applying the condition:

gg = [(MolNum(373), 40.6680008439399),
 (MolNum(353), 72.49296570091882),
 (MolNum(354), 83.18203548933252),
 (MolNum(359), 88.23588863972836),
 (MolNum(372), 97.47433492265824)] 

You can do this with a list comprehension that unpacks each tuple:

gg = [(mol_num, distance) for mol_num, distance in g if 10 < distance < 100]
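
To try the comprehension end to end, here is a minimal sketch. The `MolNum` class below is a hypothetical stand-in (the asker's real class is not shown in the question), and the data is a shortened copy of the question's list. It also splits the result back into two parallel lists with `zip`, since the question mentions wanting them separately.

```python
class MolNum:
    """Hypothetical stand-in for the asker's MolNum type (an assumption)."""

    def __init__(self, n):
        self.n = n

    def __repr__(self):
        return f"MolNum({self.n})"


# A shortened copy of the question's data, already sorted by distance.
g = [
    (MolNum(378), 2.4613922385709617e-14),
    (MolNum(373), 40.6680008439399),
    (MolNum(353), 72.49296570091882),
    (MolNum(372), 97.47433492265824),
    (MolNum(369), 104.59206739018573),
]

# Keep only the tuples whose distance lies strictly between 10 and 100.
gg = [(mol_num, distance) for mol_num, distance in g if 10 < distance < 100]

# Split the filtered pairs back into two parallel lists.
# Guard against an empty result, where zip(*gg) would yield nothing to unpack.
if gg:
    mol_nums, distances = (list(column) for column in zip(*gg))
else:
    mol_nums, distances = [], []

print(gg)
```

Both bounds are strict here, so a distance of exactly 10 or 100 would be excluded.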

You can use the built-in filter function for this. Pass the condition as a lambda expression in the first argument and the list to be filtered in the second:

gg = list(filter(lambda x: 10 < x[1] < 100,g))

In Python 2.7 you don't need the list(...) wrapper, because filter already returns a list.


In Python 3.x, filter() returns an iterator that yields the elements for which the condition returns True.

In Python 2.7, filter() returns a list of the elements for which the condition returns True.
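
One practical consequence in Python 3 is that the iterator can only be consumed once. The sketch below illustrates this with a few made-up `(id, distance)` pairs standing in for the question's data; the values are illustrative, not taken from the original.

```python
# Small (id, distance) pairs standing in for the question's data (illustrative values).
pairs = [(378, 0.0), (373, 40.7), (353, 72.5), (369, 104.6)]

# In Python 3, filter() does no work yet; it returns a lazy iterator.
lazy = filter(lambda x: 10 < x[1] < 100, pairs)

first_pass = list(lazy)   # consumes the iterator
second_pass = list(lazy)  # the iterator is exhausted, so nothing is left

print(first_pass)
print(second_pass)
```

Wrapping the call in `list(...)` right away, as in the answer, avoids this surprise when the result is needed more than once.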


Example/demo:

>>> class MolNum:
...     def __init__(self, n):
...             self.n = n
...
>>> g = [(MolNum(378), 2.4613922385709617e-14),
...  (MolNum(373), 40.6680008439399),
...  (MolNum(353), 72.49296570091882),
...  (MolNum(354), 83.18203548933252),
...  (MolNum(359), 88.23588863972836),
...  (MolNum(372), 97.47433492265824),
...  (MolNum(369), 104.59206739018573),
...  (MolNum(370), 114.66573137439451),
...  (MolNum(361), 122.33788252133775),
...  (MolNum(376), 137.2686523522959),
...  (MolNum(360), 141.72521396936926),
...  (MolNum(371), 145.96842598002533),
...  (MolNum(352), 149.8990795114449),
...  (MolNum(366), 164.55606071030496),
...  (MolNum(358), 180.72531479536423),
...  (MolNum(375), 182.21612213617874),
...  (MolNum(364), 185.78028496680486),
...  (MolNum(363), 192.02220222384793),
...  (MolNum(368), 194.0298647708072),
...  (MolNum(365), 194.57037736733918),
...  (MolNum(356), 201.91526815811372),
...  (MolNum(362), 217.8580017023349),
...  (MolNum(357), 234.3818585062885),
...  (MolNum(374), 241.33751568809993),
...  (MolNum(367), 249.36129229747306),
...  (MolNum(355), 253.59625354913504)]
>>> filter(lambda x: 10 < x[1] < 100,g)
<filter object at 0x02302E70>
>>> gg = list(filter(lambda x: 10 < x[1] < 100,g))
>>> len(gg)
5

You can try it like this:

gg = [item for item in g if 10<item[1]<100]

Alternatively, consider @Anand S Kumar's answer using filter(), which is a more Pythonic way.
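
Both approaches select the same elements in the same order, so the choice is mostly a matter of style. A quick check, using illustrative `(id, distance)` pairs and a hypothetical helper named `in_range`:

```python
# Illustrative (id, distance) pairs; not the original data.
pairs = [(378, 0.0), (373, 40.7), (353, 72.5), (369, 104.6)]


def in_range(item):
    """Return True when the distance (second element) is strictly between 10 and 100."""
    return 10 < item[1] < 100


# Same predicate, two spellings.
via_comprehension = [item for item in pairs if in_range(item)]
via_filter = list(filter(in_range, pairs))

print(via_comprehension == via_filter)
```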

Hope that helps.

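You may want to look at Pandas, a very widely used package for analysing this kind of tabular data. Note that between(10, 100) includes both endpoints by default, unlike the strict 10 < distance < 100 in the question: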

import pandas as pd
g = pd.DataFrame(g)
gg = g[g[1].between(10,100)] 
gg
Out[239]: 
     0          1
1  373  40.668001
2  353  72.492966
3  354  83.182035
4  359  88.235889
5  372  97.474335
