How to melt data into a custom shape in Python pandas



I have data in the following format stored in a pandas DataFrame:

ID          Block
MGKfdkldr   Product 1
MGKfdkldr   Product 2
MGKfdkldr   Product 3
GLOsdasd    Product 2
GLOsdasd    Product 3
NewNew      Product 1
OldOld      Product 4
OldOld      Product 8

Here is the code for the sample DataFrame:

import pandas as pd

df1 = pd.DataFrame({
    'ID': ['MGKfdkldr', 'MGKfdkldr', 'MGKfdkldr', 'GLOsdasd', 'GLOsdasd', 'NewNew', 'OldOld', 'OldOld'],
    'Block': ['Product 1', 'Product 2', 'Product 3', 'Product 2', 'Product 3', 'Product 1', 'Product 4', 'Product 8'],
})

I am looking for the following data format (expected output):

ID          Block-1     Block-2     Block-3
MGKfdkldr   Product 1   Product 2   Product 3
GLOsdasd    Product 2   Product 3   
NewNew      Product 1       
OldOld      Product 4   Product 8   

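I tried to reshape it with pd.melt, but that only moves the column headers into the data, which is not quite what I am after. Is there another way to get the expected output?

Can anyone help me, please?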

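The function you are looking for is pivot, not melt. You also need to provide a "counter" column, which simply numbers the repeated "ID" values, so that each (ID, counter) pair is unique and everything lines up correctly.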

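# number each row within its ID group: 1, 2, 3, ...
df1["Block_id"] = df1.groupby("ID").cumcount() + 1
# keyword arguments are required here: pandas 2.0+ no longer accepts positional arguments to pivot
new_df = (df1.pivot(index="ID", columns="Block_id", values="Block")  # reshapes our data
             .add_prefix("Block-")                # adds "Block-" to our column names
             .rename_axis(columns=None)           # fixes funky column index name
             .reset_index())                      # inserts "ID" as a regular column instead of an Index
print(new_df)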
          ID    Block-1    Block-2    Block-3
0   GLOsdasd  Product 2  Product 3        NaN
1  MGKfdkldr  Product 1  Product 2  Product 3
2     NewNew  Product 1        NaN        NaN
3     OldOld  Product 4  Product 8        NaN
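
Note that the output above is sorted by ID, because pivot sorts the index, while the expected output in the question lists the IDs in their order of first appearance. Here is a minimal sketch of one way to restore that order, assuming that is what is wanted. The names block_id and ordered are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "ID": ["MGKfdkldr", "MGKfdkldr", "MGKfdkldr", "GLOsdasd",
           "GLOsdasd", "NewNew", "OldOld", "OldOld"],
    "Block": ["Product 1", "Product 2", "Product 3", "Product 2",
              "Product 3", "Product 1", "Product 4", "Product 8"],
})

# Per-ID counter (1, 2, 3, ...), computed without modifying df1
block_id = df1.groupby("ID").cumcount() + 1

# IDs in order of first appearance (hypothetical helper index)
first_seen = pd.Index(df1["ID"].unique(), name="ID")

ordered = (
    df1.assign(Block_id=block_id)
       .pivot(index="ID", columns="Block_id", values="Block")
       .reindex(first_seen)            # undo the sorting done by pivot
       .add_prefix("Block-")
       .rename_axis(columns=None)
       .reset_index()
)
print(ordered)
```

Using assign instead of writing df1["Block_id"] keeps the original frame untouched, which may or may not matter in your case.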

If you want actual blanks (e.g. empty strings "") instead of NaN, you can use new_df.fillna("").
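
One thing that is easy to miss: fillna returns a new DataFrame rather than changing the existing one, so the result has to be assigned. Here is a small sketch on a hand-built frame that stands in for the pivot result; the name filled is only illustrative.

```python
import pandas as pd

# Small wide frame with one missing cell, standing in for the pivoted result
new_df = pd.DataFrame({
    "ID": ["GLOsdasd", "NewNew"],
    "Block-1": ["Product 2", "Product 1"],
    "Block-2": ["Product 3", None],
})

# fillna does not modify new_df in place; keep the returned frame
filled = new_df.fillna("")
print(filled)
```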
