Why does the second for loop not execute correctly?



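I am trying to write two for loops that return scores for different inputs and create a new field holding the new score. The first loop works fine, but the second loop never returns the correct score.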

import pandas as pd
d = {'a':['foo','bar'], 'b':[1,3]}
df = pd.DataFrame(d)
score1 = df.loc[df['a'] == 'foo']
score2 = df.loc[df['a'] == 'bar']
for i in score1['b']:
    if i < 3:
        score1['c'] = 0
    elif i <= 3 and i < 4:
        score1['c'] = 1
    elif i >= 4 and i < 5:
        score1['c'] = 2
    elif i >= 5 and i < 8:
        score1['c'] = 3
    elif i == 8:
        score1['c'] = 4
for j in score2['b']:
    if j < 2:
        score2['c'] = 0
    elif j <= 2 and i < 4:
        score2['c'] = 1
    elif j >= 4 and i < 6:
        score2['c'] = 2
    elif j >= 6 and i < 8:
        score2['c'] = 3
    elif j == 8:
        score2['c'] = 4

print(score1)
print(score2)

When I run the script, it returns the following:

print(score1)
     a  b  c
0  foo  1  0
print(score2)
     a  b
1  bar  3

Why doesn't score2 get the new field "c" or a score?

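Avoid using for loops to conditionally update DataFrame columns; they are not Python lists. Use the vectorized methods of Pandas and NumPy, such as numpy.select, which scales to millions of rows! Keep in mind that these data science tools compute very differently from general-purpose Python: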

import numpy

# LIST OF BOOLEAN CONDITIONS
conds = [
    score1['b'].lt(3),                            # EQUIVALENT TO < 3
    score1['b'].between(3, 4, inclusive="left"),  # EQUIVALENT TO >= 3 and < 4
    score1['b'].between(4, 5, inclusive="left"),  # EQUIVALENT TO >= 4 and < 5
    score1['b'].between(5, 8, inclusive="left"),  # EQUIVALENT TO >= 5 and < 8
    score1['b'].eq(8)                             # EQUIVALENT TO == 8
]
# LIST OF VALUES
vals = [0, 1, 2, 3, 4]
# VECTORIZED ASSIGNMENT
score1['c'] = numpy.select(conds, vals, default=numpy.nan)
# LIST OF BOOLEAN CONDITIONS
conds = [
    score2['b'].lt(2),
    score2['b'].between(2, 4, inclusive="left"),
    score2['b'].between(4, 6, inclusive="left"),
    score2['b'].between(6, 8, inclusive="left"),
    score2['b'].eq(8)
]
# LIST OF VALUES
vals = [0, 1, 2, 3, 4]
# VECTORIZED ASSIGNMENT
score2['c'] = numpy.select(conds, vals, default=numpy.nan)
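
For reference, here is a self-contained sketch of the same idea that can be run on its own. The helper name `score_column`, the `edges` tuples, and the `.copy()` calls are assumptions added for illustration, not part of the answer above. Copying the filtered frames is one way to avoid pandas' SettingWithCopyWarning when adding a column to a slice, and `inclusive="left"` assumes pandas 1.3 or newer.

```python
import numpy as np
import pandas as pd


def score_column(values, edges):
    """Map each value to a score 0..4 using left-inclusive bins.

    edges = (e0, e1, e2, e3) means:
    < e0 -> 0, [e0, e1) -> 1, [e1, e2) -> 2, [e2, e3) -> 3, == e3 -> 4.
    Anything that matches no condition becomes NaN.
    """
    e0, e1, e2, e3 = edges
    conds = [
        values.lt(e0),
        values.between(e0, e1, inclusive="left"),
        values.between(e1, e2, inclusive="left"),
        values.between(e2, e3, inclusive="left"),
        values.eq(e3),
    ]
    return np.select(conds, [0, 1, 2, 3, 4], default=np.nan)


df = pd.DataFrame({'a': ['foo', 'bar'], 'b': [1, 3]})

# .copy() gives each subset its own data, so adding 'c' is unambiguous
score1 = df.loc[df['a'] == 'foo'].copy()
score2 = df.loc[df['a'] == 'bar'].copy()

score1['c'] = score_column(score1['b'], (3, 4, 5, 8))
score2['c'] = score_column(score2['b'], (2, 4, 6, 8))

print(score1)
print(score2)
```

Because the default is NaN, the new column comes back as floats; a row that matches none of the conditions shows up as NaN instead of silently leaving the column missing.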

In the first iteration of the second for loop, j is 3, so none of your conditions are satisfied.

for j in score2['b']:
    if j < 3:
        score2['c'] = 0
    # compare j in both halves of each condition, not the leftover i from the first loop
    elif j >= 3 and j < 5:
        score2['c'] = 1
    elif j >= 5 and j < 7:
        score2['c'] = 2
    elif j >= 7 and j < 9:
        score2['c'] = 3
    elif j == 9:
        score2['c'] = 4
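
To see why no column appears, the conditions of the second loop exactly as written in the question can be wrapped in a small function and tried with j = 3. This is only an illustrative sketch: `first_matching_score` is a hypothetical helper, and `i` stands for the value left over from the first loop (1 with the question's data).

```python
def first_matching_score(j, i):
    """Return the score the question's second loop would assign, or None.

    The conditions are copied from the question, including the stray `i`.
    """
    if j < 2:
        return 0
    elif j <= 2 and i < 4:
        return 1
    elif j >= 4 and i < 6:
        return 2
    elif j >= 6 and i < 8:
        return 3
    elif j == 8:
        return 4
    return None  # no branch matched, so score2['c'] would never be assigned


result = first_matching_score(j=3, i=1)
print(result)  # None
```

Since the loop has no else branch, a value that falls through every condition leaves the DataFrame untouched, which is why score2 ends up without a "c" column.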
