Using a cell's value to indicate the column name in pandas



Input

      DBN Grade   3   4   5
0  01M015     3  30  44  15
1  01M015     4  30  44  15
2  01M015     5  30  44  15

Desired output

      DBN Grade   3   4   5 Enrollment
0  01M015     3  30  44  15         30
1  01M015     4  30  44  15         44
2  01M015     5  30  44  15         15

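How would you create the Enrollment column?

Note that the column to look up for each record depends on that row's value in df['Grade'].

I tried variations of df[df['Grade']] to reach the column df['3'], but without success.

Is there a simple way to do this?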

import pandas as pd
import numpy as np
data = {'DBN': ['01M015', '01M015', '01M015'],
        'Grade': ['3', '4', '5'],
        '3': ['30', '30', '30'],
        '4': ['44', '44', '44'],
        '5': ['15', '15', '15']}
df = pd.DataFrame(data)
# This line below doesn't work: raises ValueError: Length of values does not match length of index
df['Enrollment'] = [df[c] if (df.loc[i,'Grade'] == c) else None for i in df.index for c in df.columns]

Set the index, then use lookup:

df.set_index('Grade').lookup(df['Grade'], df['Grade'])

array(['30', '44', '15'], dtype=object)
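DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0, so the call above only runs on older versions. A minimal sketch of the same label-based lookup with NumPy indexing, assuming every Grade value matches a column label (the name col_pos is only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'DBN': ['01M015', '01M015', '01M015'],
    'Grade': ['3', '4', '5'],
    '3': ['30', '30', '30'],
    '4': ['44', '44', '44'],
    '5': ['15', '15', '15'],
})

# Column position of each row's Grade label (-1 would mean "no such column").
col_pos = df.columns.get_indexer(df['Grade'])
assert (col_pos >= 0).all(), "every Grade value must match a column label"

# Pick one cell per row: row i, column col_pos[i].
df['Enrollment'] = df.to_numpy()[np.arange(len(df)), col_pos]
print(df)
```

Unlike the set_index version, this does not require the Grade values to be unique, because rows are addressed by position rather than by label.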

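If your data is numeric (in the sample data everything is a string), you may run into problems and need to cast the values so that the lookup succeeds.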
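To illustrate the casting issue, here is a small sketch on a hypothetical numeric variant of the sample data (df_num and grade_labels are made-up names): Grade holds integers while the column labels are still strings, so the grades are cast to str before being used as labels.

```python
import pandas as pd

# Hypothetical numeric variant: integer grades, string column labels.
df_num = pd.DataFrame({
    'DBN': ['01M015', '01M015', '01M015'],
    'Grade': [3, 4, 5],
    '3': [30, 30, 30],
    '4': [44, 44, 44],
    '5': [15, 15, 15],
})

# The integer 3 is not the label '3', so a direct lookup would fail.
print(3 in df_num.columns)    # False
print('3' in df_num.columns)  # True

# Cast the grades to strings so they match the column labels.
grade_labels = df_num['Grade'].astype(str)

# Read one cell per row using the cast label.
df_num['Enrollment'] = [
    df_num.at[i, label] for i, label in zip(df_num.index, grade_labels)
]
print(df_num)
```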

import pandas as pd
data = {'DBN': ['01M015', '01M015', '01M015'],
        'Grade': ['3', '4', '5'],
        '3': ['30', '30', '30'],
        '4': ['44', '44', '44'],
        '5': ['15', '15', '15']}
df = pd.DataFrame(data)
enrollmentList = []
# For each row, read the cell in the column named by that row's Grade
for index, row in df.iterrows():
    enrollmentList.append(row[row["Grade"]])

df['Enrollment'] = enrollmentList
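The same row-by-row idea can also be written without the explicit list, using DataFrame.apply with axis=1. This is only a sketch of an equivalent form; like the loop, it works one row at a time and may be slow on large frames.

```python
import pandas as pd

df = pd.DataFrame({
    'DBN': ['01M015', '01M015', '01M015'],
    'Grade': ['3', '4', '5'],
    '3': ['30', '30', '30'],
    '4': ['44', '44', '44'],
    '5': ['15', '15', '15'],
})

# For each row, use its Grade value as the column label to read from.
df['Enrollment'] = df.apply(lambda row: row[row['Grade']], axis=1)
print(df)
```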
