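I want to stack or combine three columns without losing the fourth column that relates to them. I want to merge the three columns and create an additional column that states which original column each value came from.
I want to go from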
You could try this, but there must be a better solution!
I used this data (I did not copy the exact values from your example, out of plain laziness):
import pandas as pd
df = pd.DataFrame({"Index": [1, 2, 3, 4, 5], "apples": [1, 2, 3, 4, 5], "bananas": [6, 7, 8, 9, 10], "strawberries": [11, 12, 13, 14, 15], "colors": ["blue", "green", "red", "yellow", "purple"]})
df
Index  apples  bananas  strawberries  colors
    1       1        6            11    blue
    2       2        7            12   green
    3       3        8            13     red
    4       4        9            14  yellow
    5       5       10            15  purple
I did the following:
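fruits = ["apples", "bananas", "strawberries"]
frames = []
for fruit in fruits:
    # copy the slice so adding a column does not raise SettingWithCopyWarning
    temp_df = df[[fruit, "colors"]].copy()
    # record which original column these values came from
    temp_df["fruits"] = fruit
    temp_df.columns = ["fruit values", "color", "fruits"]
    frames.append(temp_df)
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the pieces instead
new_df = pd.concat(frames)
# a stable sort keeps the fruits in their original order within each color
new_df = new_df.sort_values("color", kind="stable")
new_df = new_df.reset_index(drop=True)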
This results in:
new_df
    fruit values   color        fruits
0              1    blue        apples
1              6    blue       bananas
2             11    blue  strawberries
3              2   green        apples
4              7   green       bananas
5             12   green  strawberries
6              5  purple        apples
7             10  purple       bananas
8             15  purple  strawberries
9              3     red        apples
10             8     red       bananas
11            13     red  strawberries
12             4  yellow        apples
13             9  yellow       bananas
14            14  yellow  strawberries
You can try:
import pandas as pd
# create original dataframe
df = pd.DataFrame()
df['apples'] = [4.63, 24.3, 5.24, 5.255, 9.4]
df['bananas'] = [6.57, 7.366, 2.3, 4.9, 7.3]
df['strawberries'] = [26.2, 5.39, 8.5, 9.2, 3.4]
df['color'] = ['Blue', 'Green', 'Red', 'Yellow', 'Purple']
# unpivot dataframe
df2 = pd.melt(df,
              id_vars='color',
              value_vars=list(df.columns[:-1]),  # list of fruits
              var_name='fruit',
              value_name='fruit values')
df2
This results in:
     color         fruit  fruit values
0     Blue        apples         4.630
1    Green        apples        24.300
2      Red        apples         5.240
3   Yellow        apples         5.255
4   Purple        apples         9.400
5     Blue       bananas         6.570
6    Green       bananas         7.366
7      Red       bananas         2.300
8   Yellow       bananas         4.900
9   Purple       bananas         7.300
10    Blue  strawberries        26.200
11   Green  strawberries         5.390
12     Red  strawberries         8.500
13  Yellow  strawberries         9.200
14  Purple  strawberries         3.400
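If the frame carries extra columns, or you want the rows grouped by color as in the loop-based answer above, something along these lines might work. This is only a sketch: it assumes the data from that answer (with its extra "Index" column and a column named "colors"), and the names `fruits` and `long_df` are made up for illustration.

```python
import pandas as pd

# Data from the loop-based answer above, including its extra "Index" column
df = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5],
    "apples": [1, 2, 3, 4, 5],
    "bananas": [6, 7, 8, 9, 10],
    "strawberries": [11, 12, 13, 14, 15],
    "colors": ["blue", "green", "red", "yellow", "purple"],
})

# Name the columns to unpivot explicitly (illustrative list)
fruits = ["apples", "bananas", "strawberries"]

long_df = (
    df.melt(id_vars="colors",        # column to keep alongside each value
            value_vars=fruits,       # columns to stack into one
            var_name="fruits",       # new column holding the original column name
            value_name="fruit values")
      .rename(columns={"colors": "color"})
      .sort_values(["color", "fruits"])  # group the rows by color, then by fruit
      .reset_index(drop=True)
)
print(long_df)
```

Because `value_vars` is given explicitly, `melt` keeps only the `id_vars` and `value_vars` columns, so "Index" does not end up in the result.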