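I have a dictionary of student information in the format below. I can't change the format, because it is the output of another program that I'm trying to use.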
student_info_dict = {
    "Student_1_Name": "Alice",
    "Student_1_Age": 23,
    "Student_1_Phone_Number": 1111,
    "Student_1_before_after": (120, 109),
    "Student_2_Name": "Bob",
    "Student_2_Age": 56,
    "Student_2_Phone_Number": 1234,
    "Student_2_before_after": (115, 107),
    "Student_3_Name": "Casie",
    "Student_3_Age": 47,
    "Student_3_Phone_Number": 4567,
    "Student_3_before_after": (180, 140),
    "Student_4_Name": "Donna",
    "Student_4_Age": 33,
    "Student_4_Phone_Number": 6789,
    "Student_4_before_after": (150, 138),
}
The number in each key increases by 1 for each successive student. How can I convert this dictionary into a DataFrame like the one below?
    Name  Age  Phone_Number  Before_and_After
0  Alice   23          1111         (120,109)
1    Bob   56          1234         (115,107)
3  Casie   47          4567         (180,140)
4  Donna   33          6789         (150,138)
Use:
import pandas as pd

# create a Series from the dictionary
s = pd.Series(student_info_dict)
# split each key at the first two underscores into a 3-level MultiIndex
s.index = s.index.str.split('_', n=2, expand=True)
# drop the first level ("Student") and reshape to a DataFrame
df = s.droplevel(0).unstack()
print(df)
  Age   Name Phone_Number before_after
1  23  Alice         1111   (120, 109)
2  56    Bob         1234   (115, 107)
3  47  Casie         4567   (180, 140)
4  33  Donna         6789   (150, 138)
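The result above has the columns in alphabetical order and the student numbers as the index, while the question asks for Name, Age, Phone_Number, Before_and_After with a default index. A minimal sketch of that last step, assuming a shortened two-student sample of the dictionary and assuming the `before_after` column should be renamed to `Before_and_After` (the name is taken from the question's desired output):

```python
import pandas as pd

# Shortened sample of the dictionary from the question (two students only).
student_info_dict = {
    "Student_1_Name": "Alice",
    "Student_1_Age": 23,
    "Student_1_Phone_Number": 1111,
    "Student_1_before_after": (120, 109),
    "Student_2_Name": "Bob",
    "Student_2_Age": 56,
    "Student_2_Phone_Number": 1234,
    "Student_2_before_after": (115, 107),
}

s = pd.Series(student_info_dict)
s.index = s.index.str.split("_", n=2, expand=True)
df = s.droplevel(0).unstack()

# Put the columns in the requested order, rename the last one,
# and replace the student-number index with a default 0-based index.
df = (
    df[["Name", "Age", "Phone_Number", "before_after"]]
    .rename(columns={"before_after": "Before_and_After"})
    .reset_index(drop=True)
)
print(df)
```

Selecting columns with an explicit list also makes the code fail loudly with a `KeyError` if one of the expected fields is missing from the input.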
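You can feed a simple dictionary comprehension to the Series constructor, using the student number and the field name as a tuple key, and then unstack the result into a DataFrame: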
df = (pd.Series({tuple(k.split('_', 2)[1:]): v
                 for k, v in student_info_dict.items()})
        .unstack(1)
      )
Output:
  Age   Name Phone_Number before_after
1  23  Alice         1111   (120, 109)
2  56    Bob         1234   (115, 107)
3  47  Casie         4567   (180, 140)
4  33  Donna         6789   (150, 138)
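One thing to watch with this approach: the student numbers end up in the index as strings, so with ten or more students the rows would sort as '1', '10', '2', and so on. A possible variation, sketched here with a hypothetical helper `to_frame` and a made-up three-student sample, casts the number to `int` while building the keys:

```python
import pandas as pd

def to_frame(info):
    """Build a DataFrame with one row per student, indexed by integer student number."""
    records = {}
    for key, value in info.items():
        # "Student_10_Name" -> ["Student", "10", "Name"]
        _, number, field = key.split("_", 2)
        records[(int(number), field)] = value
    # Tuple keys become a two-level MultiIndex; unstack moves the field names to columns.
    return pd.Series(records).unstack(1)

# Made-up sample with a two-digit student number to show the numeric ordering.
sample = {
    "Student_1_Name": "Alice",
    "Student_1_Age": 23,
    "Student_10_Name": "Jay",
    "Student_10_Age": 41,
    "Student_2_Name": "Bob",
    "Student_2_Age": 56,
}

df = to_frame(sample)
print(df)
```

With integer keys the rows come out in numeric order (1, 2, 10) rather than string order.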
Use the following code:
import pandas as pd

# collect the values (not the keys) for each field
name = []
age = []
pn = []
baa = []
for key, value in student_info_dict.items():
    if 'Name' in key:
        name.append(value)
    elif 'Age' in key:
        age.append(value)
    elif 'Phone' in key:
        pn.append(value)
    elif 'before' in key:
        baa.append(value)

# build the DataFrame from a dict of equal-length lists
df = pd.DataFrame({'Name': name, 'Age': age, 'Phone_Number': pn, 'before_after': baa})