Merge columns that share the same index into a single column (pandas)



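I have a DataFrame that I want to reshape. I would like to merge the children columns into one, so that values sharing the same id appear as separate rows.

So from this: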

id  children1       children2     children3
1   No Children    NaN           NaN
2   12-16 years    17+ years     Nan
3   No Children    Nan           Nan
4   5-7 years      8-11 years    12-16 years

To this:

id  children       
1   No Children    
2   12-16 years
2   17+ years     
3   No Children   
4   5-7 years
4   8-11 years
4   12-16 years

Is there a simple way to do this?

Data:

{'id': [1, 2, 3, 4],
'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
'children2': [nan, '17+ years', nan, '8-11 years'],
'children3': [nan, nan, nan, '12-16 years']}

I think you are looking for stack, which moves the columns into the index. To get the exact shape, call set_index on "id" first, so that "id" stays in the index while stacking.

out = df.set_index('id').stack().droplevel(1).rename('children').reset_index()

Output:

   id     children
0   1  No Children
1   2  12-16 years
2   2    17+ years
3   3  No Children
4   4    5-7 years
5   4   8-11 years
6   4  12-16 years
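
To make the chain above easier to follow, here is a minimal step-by-step sketch built on the sample data from the question. The intermediate names (indexed, stacked, flat) are illustrative only. The explicit dropna() is a precaution, on the assumption that some pandas versions keep NaN values when stacking.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
    'children2': [np.nan, '17+ years', np.nan, '8-11 years'],
    'children3': [np.nan, np.nan, np.nan, '12-16 years'],
})

# 1. Move 'id' into the index so it is not stacked with the values
indexed = df.set_index('id')

# 2. Stack the columns: the result is a Series indexed by (id, original column name)
stacked = indexed.stack()

# 3. Drop missing values explicitly (a no-op if stack already removed them)
stacked = stacked.dropna()

# 4. Drop the level holding the original column names, keeping only 'id'
flat = stacked.droplevel(1)

# 5. Name the values and turn 'id' back into a regular column
out = flat.rename('children').reset_index()

print(out)
```

Each id keeps its values in the original column order (children1, children2, children3), because stack walks the frame row by row.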
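
Another option is to collect each row's values into a list and then explode it:

import numpy as np

new = (df.set_index('id').agg(list, axis=1)  # put each row's values (except id) into a list
.explode()                  # give every list element its own row
.replace('Nan', np.nan)     # turn literal 'Nan' strings into real NaN
.dropna()                   # drop all NaN values
.to_frame('Children')       # convert the Series to a DataFrame with a 'Children' column
)

Result: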

       Children
id
1   No Children
2   12-16 years
2     17+ years
3   No Children
4     5-7 years
4    8-11 years
4   12-16 years
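
For reference, a self-contained sketch of the same list-and-explode idea on the sample data. It substitutes apply(list, axis=1) for the agg call, which is a swap made here and not the answer's exact code. The replace step is left out because the sample data already holds real NaN values.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
    'children2': [np.nan, '17+ years', np.nan, '8-11 years'],
    'children3': [np.nan, np.nan, np.nan, '12-16 years'],
})

new = (
    df.set_index('id')
      .apply(list, axis=1)    # one list per row, holding that row's values
      .explode()              # one row per list element, 'id' repeated in the index
      .dropna()               # remove the missing values
      .to_frame('Children')   # Series -> DataFrame with a 'Children' column
)

print(new)
```

Unlike the stack version, this leaves id in the index; append .reset_index() if a regular column is preferred.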
