How do I fill a dictionary of indices in a for loop?



I have a transposed DataFrame tr:

idx      7128        8719        14051       14636
JDUTC_0  2451957.36  2452149.36  2457243.98  2452531.89
JDUTC_1  2451957.37  2452149.36  2457243.99  2452531.90
JDUTC_2  2451957.37  2452149.36  2457244.00  2452531.91

With proper naming conventions, I would change your code as follows. First, a reproducible setup:

import numpy as np
import pandas as pd
import sys
if sys.version_info[0] < 3:
    from StringIO import StringIO
else:
    from io import StringIO
s = StringIO("""idx 7128    8719    14051   14636
JDUTC_0 2451957.36  2452149.36  2457243.98  2452531.89
JDUTC_1 2451957.37  2452149.36  2457243.99  2452531.90
JDUTC_2 2451957.37  2452149.36  2457244.00  2452531.91
JDUTC_3 NaN 2452149.36  NaN NaN
JDUTC_4 NaN 2452149.36  NaN NaN
JDUTC_5 NaN 2452149.36  NaN NaN
JDUTC_6 1.23    2452149.37  NaN NaN
JDUTC_7 NaN NaN NaN NaN
JDUTC_8 NaN NaN NaN NaN
JDUTC_9 NaN NaN NaN NaN""")
# The columns above are separated by whitespace, so split on runs of whitespace.
tr = pd.read_csv(s, sep=r"\s+", index_col=0)

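(People should provide minimal working code, but they often forget to include things like the imports and the code that builds the DataFrame.)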

The original loop then becomes:


a = {}
b = []
for name, values in tr.items():
    b.clear()  # this is problematic, as you know
    for ind, val in enumerate(values):
        if np.isnan(val):
            b.append(ind)
            continue
        else:
            pass
    a[name] = b

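The continue and pass statements are unnecessary: they only say "carry on with the loop", which happens anyway. In Python you are also not required to supply an else branch: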

for name, values in tr.items():
    b.clear()  # still problematic at this stage
    for ind, val in enumerate(values):
        if np.isnan(val):
            b.append(ind)
    a[name] = b

This kind of data collection with a for loop is better written as a list comprehension:

a = {}
for name, values in tr.items():
    b = [ind for ind, val in enumerate(values) if np.isnan(val)]
    a[name] = b
# now the result is already correct, because b is a new list on every pass

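Finally, once you are comfortable with list comprehensions, you can build the dictionary itself with a dictionary comprehension. The whole thing becomes a one-liner that is still easy to read: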

a = {name: [i for i, x in enumerate(vals) if np.isnan(x)] for name, vals in tr.items()}

You can inspect the result:

a
# which returns:
{'7128': [3, 4, 5, 7, 8, 9],
'8719': [7, 8, 9],
'14051': [3, 4, 5, 6, 7, 8, 9],
'14636': [3, 4, 5, 6, 7, 8, 9]}

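List comprehensions are a step in the direction of functional programming (FP). FP avoids mutation (such as the b.append() and b.clear() methods) precisely because, as your case demonstrates, mutation makes it easy to introduce bugs. This also helps explain why FP, although it may seem unfriendly to the brain at first glance, is actually a more brain-friendly way to program.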

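A list comprehension is the Python form of "map". If you use an "if" inside the list comprehension, that is the Python equivalent of "filter". For people used to FP, this is as natural as breathing.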
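
To make the map/filter correspondence concrete, here is a minimal sketch. The values list is made up for illustration and is not taken from tr; math.isnan stands in for np.isnan so the snippet needs only the standard library.

```python
import math

# Made-up sample data for illustration only.
values = [1.5, float("nan"), 2.5, float("nan")]

# List comprehension: the "if" clause filters, the leading expression maps.
nan_positions = [i for i, x in enumerate(values) if math.isnan(x)]

# The same computation spelled out with the built-in filter and map.
nan_pairs = filter(lambda pair: math.isnan(pair[1]), enumerate(values))
nan_positions_fp = list(map(lambda pair: pair[0], nan_pairs))

print(nan_positions)     # [1, 3]
print(nan_positions_fp)  # [1, 3]
```

Both versions build a new list instead of mutating an existing one, which is why the shared-list problem cannot occur with them.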

The problem is that you assign the same list to every key.

a = {}
b = []  # <--- a single list 'b' is created here
for _, contents in tr.items():
    b.clear()
    for ind, val in enumerate(contents):
        if np.isnan(val):
            b.append(ind)
            continue
        else:
            pass
    print(_)
    print(b)
    a[_] = b  # <--- the same list is assigned to every key
print(a)

See my comments in the code above.

b.clear()

This line only empties the existing list in place; it does not create a new one.
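
The effect can be reproduced without pandas. The following is a small sketch with made-up keys and values (shared, result, "x" and "y" are illustrative names, not from the question):

```python
shared = []
result = {}
for key, items in [("x", [1, 2]), ("y", [3])]:
    shared.clear()        # empties the one existing list in place
    shared.extend(items)
    result[key] = shared  # stores a reference to that list, not a copy

print(result)                      # {'x': [3], 'y': [3]}
print(result["x"] is result["y"])  # True
```

Every key ends up pointing at the same list object, so all of them show whatever the last pass left in it.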

To make the code behave as intended, create a new list inside the loop.

a = {}
for _, contents in tr.items():
    b = []  # <--- a new list is created on every pass
    for ind, val in enumerate(contents):
        if np.isnan(val):
            b.append(ind)
            continue
        else:
            pass
    print(_)
    print(b)
    a[_] = b  # <--- now a fresh list 'b' is assigned to a[_]
print(a)
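
If you would rather keep one working list that is cleared on every pass, another option is to store a copy of it. This is only a sketch: the columns dictionary is hypothetical stand-in data for tr.items(), and math.isnan replaces np.isnan to keep the snippet self-contained.

```python
import math

nan = float("nan")
# Hypothetical stand-in for the columns of tr.
columns = {"c1": [1.0, nan, nan], "c2": [nan, 2.0, 3.0]}

a = {}
b = []
for name, contents in columns.items():
    b.clear()
    for ind, val in enumerate(contents):
        if math.isnan(val):
            b.append(ind)
    a[name] = b.copy()  # snapshot; later b.clear() calls no longer affect it

print(a)  # {'c1': [1, 2], 'c2': [0]}
```

Creating the list inside the loop is usually the simpler choice; the copy variant mainly shows that the fix is about giving each key its own list object.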
