Convert a long DataFrame into a single-row DataFrame by enumerating the column names



I have a df that looks like this:

import numpy as np
import pandas as pd

data = [{'len_overlap': 2, 'prox': 1.0, 'freq_sum_w': 0.03962264150943396},
        {'len_overlap': 22, 'prox': np.nan, 'freq_sum_w': 0.0311111962264150943396}]
df = pd.DataFrame(data)

Try unstack(), to_frame() and the transpose (T) attribute:

out=df.unstack().to_frame().T

Finally, flatten the resulting MultiIndex columns:

out.columns=out.columns.map(lambda x:'_'.join(map(str,x)))

Output of out:

   len_overlap_0  len_overlap_1  prox_0  prox_1  freq_sum_w_0  freq_sum_w_1
0            2.0           22.0     1.0     NaN      0.039623      0.031111
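
To see why joining the tuples works, it can help to look at the intermediate Series: unstack() on a DataFrame with a plain index returns a Series whose MultiIndex pairs each column name with each row label. A minimal sketch, rebuilding the example frame (the freq_sum_w values are rounded here for brevity, which is an assumption, and the names index_tuples and flat_columns are only illustrative):

```python
import numpy as np
import pandas as pd

# Example frame from the question (freq_sum_w rounded for brevity).
df = pd.DataFrame([
    {'len_overlap': 2, 'prox': 1.0, 'freq_sum_w': 0.039623},
    {'len_overlap': 22, 'prox': np.nan, 'freq_sum_w': 0.031111},
])

s = df.unstack()
# Each index entry is a (column name, row label) tuple, grouped by column.
index_tuples = list(s.index)

out = s.to_frame().T
# Row labels are integers, so cast each part to str before joining.
out.columns = ['_'.join(map(str, t)) for t in index_tuples]
flat_columns = list(out.columns)
```

Because the row labels are integers, a plain '_'.join on the tuples would raise a TypeError, which is why the answer maps str over each tuple first.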

A one-liner, but more complex:

>>> (df.unstack()
...    .sort_index(level=1)   # sort the values together with their labels
...    .to_frame()
...    .pipe(lambda d: d.set_index(d.index.map(lambda t: '_'.join(map(str, t)))))
...    .transpose())
   freq_sum_w_0  len_overlap_0  prox_0  freq_sum_w_1  len_overlap_1  prox_1
0      0.039623            2.0     1.0      0.031111           22.0     NaN

IMHO, the more "pandas way" is to keep a MultiIndex:

>>> df.stack().to_frame().transpose()
            0                            1
  len_overlap prox freq_sum_w len_overlap freq_sum_w
0         2.0  1.0   0.039623        22.0   0.031111

Or better, keep the long format (like pd.melt):

>>> df.stack()
0  len_overlap     2.000000
   prox            1.000000
   freq_sum_w      0.039623
1  len_overlap    22.000000
   freq_sum_w      0.031111
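
For comparison, a similar long format can be produced with pd.melt itself. This is only a sketch, not part of the answer above: the column names row, variable and value are illustrative choices, and the freq_sum_w values are rounded. Unlike the stack() output shown, melt keeps the NaN entry, so it is dropped explicitly here.

```python
import numpy as np
import pandas as pd

# Example frame from the question (freq_sum_w rounded for brevity).
df = pd.DataFrame([
    {'len_overlap': 2, 'prox': 1.0, 'freq_sum_w': 0.039623},
    {'len_overlap': 22, 'prox': np.nan, 'freq_sum_w': 0.031111},
])

long_df = (
    df.rename_axis('row')            # name the index so it becomes a 'row' column
      .reset_index()
      .melt(id_vars='row', var_name='variable', value_name='value')
      .dropna(subset=['value'])      # mirror the stack() output, which omits the NaN
      .sort_values('row', kind='stable')  # group by row, keep column order within a row
      .reset_index(drop=True)
)
```

The result has one record per (row, variable) pair, which is usually easier to filter or group than a single very wide row.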

Try,

df_out = df.unstack()
df_out = df_out.sort_index(level=1)
df_out.index = [f'{i}_{j}' for i, j in df_out.index]
df_out.to_frame().T

Output:

   freq_sum_w_0  len_overlap_0  prox_0  freq_sum_w_1  len_overlap_1  prox_1
0      0.039623            2.0     1.0      0.031111           22.0     NaN
