Adding rows to a data frame that are products of values in the data frame (Python)



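I have a data frame with 10 rows and 4 columns. The rows represent job categories 1-10, and every job falls into exactly one category. The table gives the probability that people in the database had job category 1-10 as their first job, as their second job, and so on: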

import pandas as pd

prob_all_dict = {'prob_1': {1.0: 0.03409090909090909,
2.0: 0.022727272727272728,
3.0: 0.045454545454545456,
4.0: 0.5340909090909091,
5.0: 0.06818181818181818,
6.0: 0.011363636363636364,
7.0: 0.13636363636363635,
8.0: 0.06818181818181818,
9.0: 0.045454545454545456,
10.0: 0.03409090909090909},
'prob_2': {1.0: 0.045454545454545456,
2.0: 0.011363636363636364,
3.0: 0.03409090909090909,
4.0: 0.4659090909090909,
5.0: 0.11363636363636363,
6.0: 0.045454545454545456,
7.0: 0.1590909090909091,
8.0: 0.045454545454545456,
9.0: 0.03409090909090909,
10.0: 0.045454545454545456},
'prob_3': {1.0: 0.1111111111111111,
2.0: 0,
3.0: 0.06349206349206349,
4.0: 0.3968253968253968,
5.0: 0.07936507936507936,
6.0: 0,
7.0: 0.19047619047619047,
8.0: 0.1111111111111111,
9.0: 0,
10.0: 0.047619047619047616},
'prob_4': {1.0: 0,
2.0: 0,
3.0: 0.043478260869565216,
4.0: 0.391304347826087,
5.0: 0.13043478260869565,
6.0: 0,
7.0: 0.08695652173913043,
8.0: 0.2608695652173913,
9.0: 0,
10.0: 0.08695652173913043}}
prob_all = pd.DataFrame.from_dict(prob_all_dict)

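From the "prob_all" data frame, the "out" data frame is created by multiplying some cells with others. Its first row holds the probabilities for the first job. The following rows hold the conditional probabilities for the second job, depending on which job category people had in their first job, e.g. the probability of having job category 2 given that the first job was in category 3, and so on.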

out=[prob_all['prob_1']]+[prob_all['prob_2']*prob_all['prob_1'].iloc[x] for x in range(0,10)]
out=pd.concat(out,axis=1)
out=(out.join(pd.concat([prob_all['prob_3']*out.iloc[x,1] for x in range(0,10)],axis=1))
.join(pd.concat([prob_all['prob_3']*out.iloc[x,2] for x in range(0,10)],axis=1),rsuffix='x')
.join(pd.concat([prob_all['prob_3']*out.iloc[x,3] for x in range(0,10)],axis=1),rsuffix='x')
.join(pd.concat([prob_all['prob_3']*out.iloc[x,4] for x in range(0,10)],axis=1),rsuffix='x')
.join(pd.concat([prob_all['prob_3']*out.iloc[x,5] for x in range(0,10)],axis=1),rsuffix='x')
.join(pd.concat([prob_all['prob_3']*out.iloc[x,6] for x in range(0,10)],axis=1),rsuffix='x')
.join(pd.concat([prob_all['prob_3']*out.iloc[x,7] for x in range(0,10)],axis=1),rsuffix='x')
.join(pd.concat([prob_all['prob_3']*out.iloc[x,8] for x in range(0,10)],axis=1),rsuffix='x')
.join(pd.concat([prob_all['prob_3']*out.iloc[x,9] for x in range(0,10)],axis=1),rsuffix='x')
.join(pd.concat([prob_all['prob_3']*out.iloc[x,10] for x in range(0,10)],axis=1),rsuffix='x')
).values
out=pd.DataFrame(out).T

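In a third step, I would like to get the probabilities of job categories 1-10 depending on which categories a person had in their first, second, and third job. I did this manually in the third block of code, but I would like to do it "automatically" for all 1000 combinations: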

out.iloc[11,0]*prob_all['prob_4'][1]
out.iloc[11,0]*prob_all['prob_4'][2]
out.iloc[11,0]*prob_all['prob_4'][3]
out.iloc[11,0]*prob_all['prob_4'][4]
out.iloc[11,0]*prob_all['prob_4'][5]
out.iloc[11,0]*prob_all['prob_4'][6]
out.iloc[11,0]*prob_all['prob_4'][7]
out.iloc[11,0]*prob_all['prob_4'][8]
out.iloc[11,0]*prob_all['prob_4'][9]
out.iloc[11,0]*prob_all['prob_4'][10]
out.iloc[11,1]*prob_all['prob_4'][1]
out.iloc[11,1]*prob_all['prob_4'][2]
out.iloc[11,1]*prob_all['prob_4'][3]
out.iloc[11,1]*prob_all['prob_4'][4]
out.iloc[11,1]*prob_all['prob_4'][5]
out.iloc[11,1]*prob_all['prob_4'][6]
out.iloc[11,1]*prob_all['prob_4'][7]
out.iloc[11,1]*prob_all['prob_4'][8]
out.iloc[11,1]*prob_all['prob_4'][9]
out.iloc[11,1]*prob_all['prob_4'][10]
out.iloc[11,2]*prob_all['prob_4'][1]
out.iloc[11,2]*prob_all['prob_4'][2]
out.iloc[11,2]*prob_all['prob_4'][3]
out.iloc[11,2]*prob_all['prob_4'][4]
out.iloc[11,2]*prob_all['prob_4'][5]
out.iloc[11,2]*prob_all['prob_4'][6]
out.iloc[11,2]*prob_all['prob_4'][7]
out.iloc[11,2]*prob_all['prob_4'][8]
out.iloc[11,2]*prob_all['prob_4'][9]
out.iloc[11,2]*prob_all['prob_4'][10]
out.iloc[11,3]*prob_all['prob_4'][1]
out.iloc[11,3]*prob_all['prob_4'][2]
out.iloc[11,3]*prob_all['prob_4'][3]
out.iloc[11,3]*prob_all['prob_4'][4]
out.iloc[11,3]*prob_all['prob_4'][5]
out.iloc[11,3]*prob_all['prob_4'][6]
out.iloc[11,3]*prob_all['prob_4'][7]
out.iloc[11,3]*prob_all['prob_4'][8]
out.iloc[11,3]*prob_all['prob_4'][9]
out.iloc[11,3]*prob_all['prob_4'][10]
out.iloc[11,4]*prob_all['prob_4'][1]
out.iloc[11,4]*prob_all['prob_4'][2]
out.iloc[11,4]*prob_all['prob_4'][3]
out.iloc[11,4]*prob_all['prob_4'][4]
out.iloc[11,4]*prob_all['prob_4'][5]
out.iloc[11,4]*prob_all['prob_4'][6]
out.iloc[11,4]*prob_all['prob_4'][7]
out.iloc[11,4]*prob_all['prob_4'][8]
out.iloc[11,4]*prob_all['prob_4'][9]
out.iloc[11,4]*prob_all['prob_4'][10]
out.iloc[11,5]*prob_all['prob_4'][1]
out.iloc[11,5]*prob_all['prob_4'][2]
out.iloc[11,5]*prob_all['prob_4'][3]
out.iloc[11,5]*prob_all['prob_4'][4]
out.iloc[11,5]*prob_all['prob_4'][5]
out.iloc[11,5]*prob_all['prob_4'][6]
out.iloc[11,5]*prob_all['prob_4'][7]
out.iloc[11,5]*prob_all['prob_4'][8]
out.iloc[11,5]*prob_all['prob_4'][9]
out.iloc[11,5]*prob_all['prob_4'][10]
out.iloc[11,6]*prob_all['prob_4'][1]
out.iloc[11,6]*prob_all['prob_4'][2]
out.iloc[11,6]*prob_all['prob_4'][3]
out.iloc[11,6]*prob_all['prob_4'][4]
out.iloc[11,6]*prob_all['prob_4'][5]
out.iloc[11,6]*prob_all['prob_4'][6]
out.iloc[11,6]*prob_all['prob_4'][7]
out.iloc[11,6]*prob_all['prob_4'][8]
out.iloc[11,6]*prob_all['prob_4'][9]
out.iloc[11,6]*prob_all['prob_4'][10]
out.iloc[11,7]*prob_all['prob_4'][1]
out.iloc[11,7]*prob_all['prob_4'][2]
out.iloc[11,7]*prob_all['prob_4'][3]
out.iloc[11,7]*prob_all['prob_4'][4]
out.iloc[11,7]*prob_all['prob_4'][5]
out.iloc[11,7]*prob_all['prob_4'][6]
out.iloc[11,7]*prob_all['prob_4'][7]
out.iloc[11,7]*prob_all['prob_4'][8]
out.iloc[11,7]*prob_all['prob_4'][9]
out.iloc[11,7]*prob_all['prob_4'][10]
out.iloc[11,8]*prob_all['prob_4'][1]
out.iloc[11,8]*prob_all['prob_4'][2]
out.iloc[11,8]*prob_all['prob_4'][3]
out.iloc[11,8]*prob_all['prob_4'][4]
out.iloc[11,8]*prob_all['prob_4'][5]
out.iloc[11,8]*prob_all['prob_4'][6]
out.iloc[11,8]*prob_all['prob_4'][7]
out.iloc[11,8]*prob_all['prob_4'][8]
out.iloc[11,8]*prob_all['prob_4'][9]
out.iloc[11,8]*prob_all['prob_4'][10]
out.iloc[11,9]*prob_all['prob_4'][1]
out.iloc[11,9]*prob_all['prob_4'][2]
out.iloc[11,9]*prob_all['prob_4'][3]
out.iloc[11,9]*prob_all['prob_4'][4]
out.iloc[11,9]*prob_all['prob_4'][5]
out.iloc[11,9]*prob_all['prob_4'][6]
out.iloc[11,9]*prob_all['prob_4'][7]
out.iloc[11,9]*prob_all['prob_4'][8]
out.iloc[11,9]*prob_all['prob_4'][9]
out.iloc[11,9]*prob_all['prob_4'][10]
out.iloc[12,0]*prob_all['prob_4'][1]
out.iloc[12,0]*prob_all['prob_4'][2]
out.iloc[12,0]*prob_all['prob_4'][3]
out.iloc[12,0]*prob_all['prob_4'][4]
out.iloc[12,0]*prob_all['prob_4'][5]
out.iloc[12,0]*prob_all['prob_4'][6]
out.iloc[12,0]*prob_all['prob_4'][7]
out.iloc[12,0]*prob_all['prob_4'][8]
out.iloc[12,0]*prob_all['prob_4'][9]
out.iloc[12,0]*prob_all['prob_4'][10]
out.iloc[12,1]*prob_all['prob_4'][1]
out.iloc[12,1]*prob_all['prob_4'][2]
out.iloc[12,1]*prob_all['prob_4'][3]
out.iloc[12,1]*prob_all['prob_4'][4]
out.iloc[12,1]*prob_all['prob_4'][5]
out.iloc[12,1]*prob_all['prob_4'][6]
out.iloc[12,1]*prob_all['prob_4'][7]
out.iloc[12,1]*prob_all['prob_4'][8]
out.iloc[12,1]*prob_all['prob_4'][9]
out.iloc[12,1]*prob_all['prob_4'][10]

Can anyone help me with this? I have been stuck on this problem for a long time. Many thanks.

You can use a loop to step through the indices and append each new row to the data frame:

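# Rows 11 and 12 are the two rows written out by hand in the question;
# widen the range (e.g. range(11, 111)) to cover every third-job row.
for i in range(11, 13):
    for j in range(10):
        out.loc[len(out)] = [out.iloc[i, j] * prob_all['prob_4'][k] for k in range(1, 11)]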
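If appending row by row turns out to be slow, the same rows can probably be built in one step with an outer product. The following is only a sketch: the helper name `append_next_job_rows` and the small `toy` frame are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd


def append_next_job_rows(out, next_probs, first_row, last_row):
    """Append one row per cell of out.iloc[first_row:last_row].

    Each new row is that cell multiplied by the vector next_probs.
    Cells are visited row by row, left to right, like the nested loop.
    """
    cells = out.iloc[first_row:last_row].to_numpy().ravel()
    new_rows = np.outer(cells, np.asarray(next_probs, dtype=float))
    new_frame = pd.DataFrame(new_rows, columns=out.columns)
    return pd.concat([out, new_frame], ignore_index=True)


# Illustrative toy data (hypothetical, 2 categories instead of 10).
toy = pd.DataFrame([[0.5, 0.5], [0.1, 0.4], [0.2, 0.3]])
extended = append_next_job_rows(toy, [0.25, 0.75], first_row=1, last_row=3)
print(extended.shape)  # (7, 2): 3 original rows plus 4 new ones
```

With the frames from the question, the call would be `append_next_job_rows(out, prob_all['prob_4'], 11, 111)`, which should append 1000 rows in the same order as the loop.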

The computation of out can also be simplified:

# Replaces the ten chained .join(...) calls; run it right after out = pd.concat(out, axis=1).
for i in range(1, 11):
    out = out.join(pd.concat([prob_all['prob_3'] * out.iloc[x, i] for x in range(0, 10)], axis=1), rsuffix='x')
out = pd.DataFrame(out.values).T
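Going one step further, all levels could be generated by looping over the columns of the probability table, so that no step has to be written out by hand. This is a sketch under the assumption that every level is simply the previous level's cells multiplied by the next column. `chain_products` and `toy_probs` are hypothetical names and data used only to show the idea.

```python
import numpy as np
import pandas as pd


def chain_products(prob_all):
    """Stack the product rows for every column of prob_all.

    Level 0 is the first column as a single row. Each later level has one
    row per cell of the previous level: that cell times the next column.
    """
    first = prob_all.iloc[:, 0].to_numpy(dtype=float)
    levels = [first[np.newaxis, :]]
    for col in prob_all.columns[1:]:
        probs = prob_all[col].to_numpy(dtype=float)
        levels.append(np.outer(levels[-1].ravel(), probs))
    return pd.DataFrame(np.vstack(levels))


# Illustrative toy table (hypothetical, 2 categories and 3 job positions).
toy_probs = pd.DataFrame({
    'prob_1': [0.5, 0.5],
    'prob_2': [0.25, 0.75],
    'prob_3': [0.1, 0.9],
})
toy_out = chain_products(toy_probs)
print(toy_out.shape)  # (7, 2): 1 + 2 + 4 rows
```

Applied to the question's `prob_all`, this should give 1 + 10 + 100 + 1000 = 1111 rows, with the first 111 rows in the same order as `out` and the last 1000 rows matching what the loop would append.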
