python数据框架与复合索引融化



我有一个数据框架:

sample_2

您可以先将str.split作为列,然后将df.melt作为列。

df[['letter','index']] = df['index'].str.split('/', expand=True)
res = (df.melt(id_vars=['letter','index'],var_name='sample')
.sort_values(by='index')
.rename(columns={ 'index' : 'number'})
.reset_index(drop=True)
)
print(res)
letter number    sample  value
0      A      1  sample_1     23
1      A      1  sample_2     33
2      A      2  sample_1     24
3      A      2  sample_2     34
4      A      3  sample_1     25
5      A      3  sample_2     35

using melt:

df = df.melt(value_vars=['sample_1','sample_2'], id_vars='index',var_name='sample')
df[['letter','number']] = df['index'].str.split('/',expand=True)

输出:

>>>
index    sample  value letter number
0   A/1  sample_1     23      A      1
1   A/2  sample_1     24      A      2
2   A/3  sample_1     25      A      3
3   A/1  sample_2     33      A      1
4   A/2  sample_2     34      A      2
5   A/3  sample_2     35      A      3

您可以尝试pd.wide_to_long然后拆分和重命名列

df_ = pd.wide_to_long(df, ["sample_"], i="index", j="sample").reset_index()
df_[['index', 'number']] = df_['index'].str.split('/', expand=True)
df_ = df_.rename(columns={'index': 'letter', 'sample_': 'value'})
df_ = df_.assign(sample='sample_' + df_['sample'].astype(str), index=df_.index+1)
print(df_)
letter    sample  value number  index
0      A  sample_1     23      1      1
1      A  sample_1     24      2      2
2      A  sample_1     25      3      3
3      A  sample_2     33      1      4
4      A  sample_2     34      2      5
5      A  sample_2     35      3      6

对于更嵌套的版本,可以使用

df[['index', 'number']] = df['index'].str.split('/', expand=True)
df = (pd.wide_to_long(df, ["sample_"], i=["index", 'number'], j="sample").reset_index()
.rename(columns={'index': 'letter', 'sample_': 'value'})
.pipe(lambda df: df.assign(sample='sample_' + df['sample'].astype(str), index=df.index+1))
)

最新更新