Remove duplicated row values, but create new columns

I have a table similar to the one below:

Miles  Side  Direction  Param1   Width  Height  Date
0.5    Left  Right      5        0.6    0.8     2023-01-04
0.5    Right Right      5        0.5    0.9     2023-01-04
1      Left  Left       4        0.3    0.3     2023-01-04
1      Right Left       4        0.5    0.5     2023-01-04

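As the table shows, Miles, Direction, Param1 and Date contain repeated values, while Side, Width and Height differ from row to row. What I want to do is remove the duplicated values and create new columns for the values that differ. The table should end up looking like this: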

Miles   Direction   Param1  Side1   Width1  Height1 Side2   Width2  Height2 Date
0.5     Right       5       Left    0.6     0.8     Right   0.5     0.9  2023-01-04
1       Left        4       Left    0.3     0.3     Right   0.5     0.5  2023-01-04

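I have tried the following:

The pivot function, but it does not seem to work when several parameters are duplicated.
pivot_table: this looks like it should work, but I think I am missing something.

This is what I tried: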

df = pd.pivot_table(df, values=['Side','Width','Height'], index=['Miles, Direction','Param1','Date'], columns=None)

But I think something is missing here, because the data comes out completely wrong. Any help would be much appreciated, thanks!
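A minimal sketch of what likely goes wrong in the attempt above, assuming the sample data from the question: with no `columns` argument there is nothing to spread the values across, and the default `aggfunc` of `pivot_table` is the mean, so each group of duplicated keys collapses to a single averaged row. `Side` is left out of this sketch because averaging is not meaningful for text, and the names `keys` and `averaged` are only illustrative.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Miles": [0.5, 0.5, 1, 1],
    "Side": ["Left", "Right", "Left", "Right"],
    "Direction": ["Right", "Right", "Left", "Left"],
    "Param1": [5, 5, 4, 4],
    "Width": [0.6, 0.5, 0.3, 0.5],
    "Height": [0.8, 0.9, 0.3, 0.5],
    "Date": ["2023-01-04"] * 4,
})

keys = ["Miles", "Direction", "Param1", "Date"]

# No `columns` argument and the default aggfunc (mean): one row per key
# group, with Width and Height averaged instead of spread into new columns.
averaged = pd.pivot_table(df, values=["Width", "Height"], index=keys).reset_index()
print(averaged)
```

What seems to be missing is a column that numbers the rows inside each group (first, second, ...) and can be passed as `columns`.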

One option, using pandas.pivot_table:

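import pandas as pd

dup_cols = ['Miles', 'Direction', 'Param1', 'Date']
var_cols = ['Side', 'Width', 'Height']

out = (
    pd.pivot_table(
        # number each row within its group of duplicated keys: 1, 2, ...
        df.assign(idx=df.groupby(dup_cols).cumcount() + 1),
        index=dup_cols,
        values=var_cols,
        columns='idx',
        fill_value='',
        aggfunc='first')  # each (group, idx) cell holds exactly one value
    # flatten the (column, idx) MultiIndex into names such as 'Side1'
    .pipe(lambda d: d.set_axis([f'{col}{num}' for col, num in d.columns], axis=1))
    .reset_index()
)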

# Output:

print(out)

   Miles Direction  Param1        Date  Height1  Height2 Side1  Side2  Width1  Width2
0    0.5     Right       5  2023-01-04      0.8      0.9  Left  Right     0.6     0.5
1    1.0      Left       4  2023-01-04      0.3      0.5  Left  Right     0.3     0.5

Try:

df['tmp'] = df.groupby(['Miles', 'Direction', 'Param1', 'Date']).cumcount() + 1
df = df.set_index(['Miles', 'Direction', 'Param1', 'Date', 'tmp'])
df = df.unstack('tmp')
df.columns = [f'{a}{b}' for a, b in df.columns]
df = df.reset_index()
print(df)

Prints:

   Miles Direction  Param1        Date Side1  Side2  Width1  Width2  Height1  Height2
0    0.5     Right       5  2023-01-04  Left  Right     0.6     0.5      0.8      0.9
1    1.0      Left       4  2023-01-04  Left  Right     0.3     0.5      0.3      0.5
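Both outputs above group the new columns by name (Side1, Side2, ...), whereas the question asks for them grouped by occurrence (Side1, Width1, Height1, Side2, ...) with Date last. A minimal self-contained sketch of one way to get that order, assuming the sample data from the question; the helper names `wide`, `ordered` and `n` are only illustrative:

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Miles": [0.5, 0.5, 1, 1],
    "Side": ["Left", "Right", "Left", "Right"],
    "Direction": ["Right", "Right", "Left", "Left"],
    "Param1": [5, 5, 4, 4],
    "Width": [0.6, 0.5, 0.3, 0.5],
    "Height": [0.8, 0.9, 0.3, 0.5],
    "Date": ["2023-01-04"] * 4,
})

dup_cols = ["Miles", "Direction", "Param1", "Date"]
var_cols = ["Side", "Width", "Height"]

# Number each row within its group of duplicated keys: 1, 2, ...
idx = df.groupby(dup_cols).cumcount() + 1
n = int(idx.max())

# Move the per-group counter into the column index.
wide = df.assign(idx=idx).set_index(dup_cols + ["idx"])[var_cols].unstack("idx")

# Reorder so that all columns of occurrence 1 come first, then occurrence 2.
ordered = [(col, i) for i in range(1, n + 1) for col in var_cols]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(ordered))

# Flatten ('Side', 1) -> 'Side1' and bring the key columns back.
wide.columns = [f"{col}{i}" for col, i in wide.columns]
out = wide.reset_index()

# Put Date at the end, as in the layout requested in the question.
out = out[[c for c in out.columns if c != "Date"] + ["Date"]]
print(out)
```

If some groups have fewer rows than others, the missing cells would come out as NaN after `unstack`.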
