Create columns from another column that is a list of items

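Suppose I have a DataFrame whose column A holds a list of strings of the form "Type:Value". Type can take 5 different values, and Value can be anything. What I want is to create 5 new columns (each named after its Type), where the value in each column is the list of items that have that Type. So if I have (only one row, for simplicity):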

df = pd.DataFrame({"A": [["Type1:Value1", "Type2:Value2", "Type1:Value3"]]})

then the result should be:

df = pd.DataFrame({"Type1": [["Value1", "Value3"]], "Type2": [["Value2"]]})

It goes without saying, but there is probably a better way to do this.

import pandas as pd

df = pd.DataFrame({"A": [["Type1:Value1", "Type2:Value2", "Type1:Value3"]]})
buffer_dict = {}  # maps each type name to the list of its values
for index, row in df.iterrows():
    for str_value in row['A']:
        str_list = str_value.split(':')
        key = str_list[0]  # these are just for readability
        value = str_list[1]
        buffer_dict.setdefault(key, []).append(value)  # start an empty list for a new key, then append
# note: values from every row end up pooled in the same lists
# wrap each list in another list so it becomes a single cell instead of a column of values
result = pd.DataFrame({k: [v] for k, v in buffer_dict.items()})
print(result)

Result:

              Type1     Type2
0  [Value1, Value3]  [Value2]

Edit: I missed the part about there being only 5 types. My solution assumes the types are not known in advance and works for any number of types.
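The example above has a single row. For a frame with several rows, the same grouping idea can be applied row by row. The following is only a sketch under that assumption: the helper name `split_by_type` and the second sample row are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict

import pandas as pd


def split_by_type(items):
    """Group the 'Type:Value' strings of one row into {type: [values]}."""
    grouped = defaultdict(list)
    for item in items:
        # Split on the first colon only, so a value may itself contain ':'.
        key, value = item.split(":", 1)
        grouped[key].append(value)
    return dict(grouped)


df = pd.DataFrame({
    "A": [
        ["Type1:Value1", "Type2:Value2", "Type1:Value3"],
        ["Type2:Value4"],  # hypothetical second row
    ]
})

# One dict per row; pandas lines the keys up into columns.
result = pd.DataFrame([split_by_type(items) for items in df["A"]], index=df.index)
print(result)
```

A row that has no item of a given type ends up with NaN in that column, which may or may not be what you want.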

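Here is one solution. It could also be done in a loop, but since the number of columns is small, the code is written out by hand rather than automated.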

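import pandas as pd

df = pd.DataFrame({"A": [["Type1:Value1", "Type2:Value2", "Type1:Value3"]]})
# spread the three list items into helper columns (assumes exactly three items per row)
df[['x', 'y', 'z']] = pd.DataFrame(df['A'].tolist(), index=df.index)

# keep only the part after the colon
x = df.x.str.split(':').str[1]
y = df.y.str.split(':').str[1]
z = df.z.str.split(':').str[1]
df['type1'] = [[a, b] for a, b in zip(x, z)]  # items 1 and 3 are Type1
df['type2'] = [[a] for a in y]  # item 2 is Type2

print(df.drop(['A', 'x', 'y', 'z'], axis='columns'))

              type1     type2
0  [Value1, Value3]  [Value2]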
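The loop variant mentioned above might look roughly like this. It is a sketch rather than the answerer's code, and it assumes the five type names are known up front; the names in `TYPES` are placeholders to be replaced with the real ones.

```python
import pandas as pd

# Placeholder type names (an assumption); replace with the five real ones.
TYPES = ["Type1", "Type2", "Type3", "Type4", "Type5"]

df = pd.DataFrame({"A": [["Type1:Value1", "Type2:Value2", "Type1:Value3"]]})

result = pd.DataFrame(index=df.index)
for type_name in TYPES:
    prefix = type_name + ":"
    # For every row, keep the values of the items that start with this prefix.
    result[type_name] = [
        [item[len(prefix):] for item in items if item.startswith(prefix)]
        for items in df["A"]
    ]
print(result)
```

Because every type gets its own pass, a type that does not occur in a row yields an empty list there, and all five columns are always present.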
