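I have a DataFrame that I want to reshape. I want to merge the columns into a single column, so that values sharing the same id appear as separate rows.
So from this: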
id children1 children2 children3
1 No Children NaN NaN
2 12-16 years 17+ years Nan
3 No Children Nan Nan
4 5-7 years 8-11 years 12-16 years
To this:
id children
1 No Children
2 12-16 years
2 17+ years
3 No Children
4 5-7 years
4 8-11 years
4 12-16 years
Is there a simple way to do this?
Data:
{'id': [1, 2, 3, 4],
'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
'children2': [nan, '17+ years', nan, '8-11 years'],
'children3': [nan, nan, nan, '12-16 years']}
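To reproduce the frame, the dictionary above can be passed straight to pd.DataFrame. This is a minimal sketch; it assumes the bare nan in the dictionary stands for numpy.nan, which the original post does not state explicitly.

```python
import numpy as np
import pandas as pd

nan = np.nan  # assumption: the bare `nan` in the posted dict is numpy's NaN

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
    'children2': [nan, '17+ years', nan, '8-11 years'],
    'children3': [nan, nan, nan, '12-16 years'],
})
print(df)
```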
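I think you are looking for the stack method, which moves the columns into the index. To get the exact shape you want, call set_index on "id" first, so that "id" is kept as the index when stacking. After that, droplevel(1) discards the original column names, rename gives the values the name "children", and reset_index turns "id" back into a regular column.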
out = df.set_index('id').stack().droplevel(1).rename('children').reset_index()
Output:
id children
0 1 No Children
1 2 12-16 years
2 2 17+ years
3 3 No Children
4 4 5-7 years
5 4 8-11 years
6 4 12-16 years
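For reference, here is a self-contained sketch of the same chain, split into steps. The explicit dropna() is an addition of mine rather than part of the one-liner: older stack implementations drop NaN on their own, while newer pandas versions may keep it, so dropping explicitly should give the same rows either way. The data is assumed to hold real NaN values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
    'children2': [np.nan, '17+ years', np.nan, '8-11 years'],
    'children3': [np.nan, np.nan, np.nan, '12-16 years'],
})

# Series indexed by (id, original column name), one entry per cell
stacked = df.set_index('id').stack()

out = (stacked.dropna()            # remove missing entries if stack kept them
              .droplevel(1)        # discard the children1/2/3 level
              .rename('children')  # name the values column
              .reset_index())      # turn id back into a regular column
print(out)
```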
Another option is to collect each row into a list and explode it:
import numpy as np

new = (df.set_index('id').agg(list, axis=1)  # put each row's values (except id) into a list
       .explode()                # give every list element its own row
       .replace('Nan', np.nan)   # turn the string 'Nan' into a real NaN
       .dropna()                 # drop all NaNs
       .to_frame('Children')     # name the resulting column 'Children'
)
Result:
Children
id
1 No Children
2 12-16 years
2 17+ years
3 No Children
4 5-7 years
4 8-11 years
4 12-16 years
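A self-contained sketch of the list-and-explode approach, with the intermediate step shown separately. It assumes the missing values are real NaN, as in the posted dictionary, so the replace('Nan', ...) step is left out. The final reset_index() is an optional addition of mine for getting id back as a column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
    'children2': [np.nan, '17+ years', np.nan, '8-11 years'],
    'children3': [np.nan, np.nan, np.nan, '12-16 years'],
})

# One list of three values per id, e.g. id 2 -> ['12-16 years', '17+ years', nan]
lists = df.set_index('id').agg(list, axis=1)

new = (lists.explode()             # one row per list element, id repeated
            .dropna()              # discard the missing entries
            .to_frame('Children')) # single column named 'Children', indexed by id

flat = new.reset_index()           # optional: id as a regular column
print(flat)
```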