Add a column containing the keys and values from a dictionary



I have this pandas DataFrame:

import pandas as pd

df = pd.DataFrame([{'col1': ['plane', 'chair']}, {'col1': ['computer', 'beach', 'book', 'language']}, {'col1': ['rice', 'bus', 'street']}])

And I have this dictionary:

categories = {
    'transport': ['car', 'truck', 'plane'],
    'reading': ['book', 'library'],
    'food': ['rice', 'milk', 'tea']
}

I want the final output to look like this:

index col1  col2
0: ['plane', 'chair'], transport-plane
1: ['computer', 'beach', 'book', 'language'], reading-book
2: ['rice', 'bus', 'street'], food-rice

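I want col2 to contain both the key and the matching value from the dictionary.

So far I have only managed to add the dictionary key, not the key together with the value.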

Try:

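import numpy as np

# Reverse lookup: map each value back to its category key.
tmp = {vv: k for k, v in categories.items() for vv in v}

# One row per list element, so each word can be looked up on its own.
x = df.explode("col1")
x["col2"] = x["col1"].apply(
    lambda x: "{}-{}".format(tmp[x], x) if x in tmp else np.nan
)

# Rebuild the lists and join the non-missing "key-value" labels per original row.
x = x.groupby(level=0).agg(
    col1=("col1", list), col2=("col2", lambda x: ", ".join(x[x.notna()]))
)
print(x)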

Prints:

                                col1             col2
0                     [plane, chair]  transport-plane
1  [computer, beach, book, language]     reading-book
2                [rice, bus, street]        food-rice
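
The reverse lookup is the core of the approach above. As a minimal sketch of the same idea without `explode` and `groupby`, the lookup can be applied to each list directly. The names `lookup` and `label_row` are hypothetical, introduced only for this illustration:

```python
import pandas as pd

df = pd.DataFrame([
    {'col1': ['plane', 'chair']},
    {'col1': ['computer', 'beach', 'book', 'language']},
    {'col1': ['rice', 'bus', 'street']},
])
categories = {
    'transport': ['car', 'truck', 'plane'],
    'reading': ['book', 'library'],
    'food': ['rice', 'milk', 'tea'],
}

# Invert the dictionary so that each value points back to its key.
lookup = {value: key for key, values in categories.items() for value in values}

def label_row(words):
    """Hypothetical helper: join a "key-value" label for every word found in lookup."""
    return ", ".join(f"{lookup[w]}-{w}" for w in words if w in lookup)

df["col2"] = df["col1"].apply(label_row)
print(df)
```

Rows with no matching word get an empty string here, and a value listed under several keys would keep only the last key, since a plain dictionary holds one entry per value.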

How about:

import pandas as pd

df = pd.DataFrame([{'col1': ['plane', 'chair']}, {'col1': ['computer', 'beach', 'book', 'language']}, {'col1': ['rice', 'bus', 'street']}])
categories = {
    'transport': ['car', 'truck', 'plane'],
    'reading': ['book', 'library'],
    'food': ['rice', 'milk', 'tea']
}

def match_pairs(categories, df):
    # Note: row i is only checked against the i-th dictionary key, so this
    # relies on the rows being in the same order as the keys and on every
    # row having a match; otherwise col2 ends up shorter than df.
    col2 = []
    index = 0
    for v in categories:
        print(f'{df["col1"][index]} at index {index}')
        for entry in df['col1'][index]:
            print(f"Finding [{entry}] in {categories[v]}...")
            if entry in categories[v]:
                # Record "key-value" for the first match and stop searching this row.
                col2.append(v + '-' + entry)
                break
        index += 1
    print(col2)
    df['col2'] = col2
    return df

print(match_pairs(categories, df))
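
The loop above checks the i-th row only against the i-th dictionary key, which works for the sample data because the rows happen to be in the same order as the keys. A hedged sketch of a variant that checks every category for each row follows. The function name `match_pairs_any_order` and the sample frame `shuffled` are hypothetical, and using `None` for rows with no match is an assumption, not something the question specifies:

```python
import pandas as pd

categories = {
    'transport': ['car', 'truck', 'plane'],
    'reading': ['book', 'library'],
    'food': ['rice', 'milk', 'tea'],
}

def match_pairs_any_order(categories, df):
    """Hypothetical variant: test every category against every row."""
    col2 = []
    for words in df["col1"]:
        label = None  # assumption: rows without any match get None
        for key, values in categories.items():
            # First word of this row that belongs to the current category, if any.
            hit = next((w for w in words if w in values), None)
            if hit is not None:
                label = f"{key}-{hit}"
                break
        col2.append(label)
    # assign returns a new DataFrame and leaves the input untouched.
    return df.assign(col2=col2)

# Rows in a different order than the dictionary keys, plus one row with no match.
shuffled = pd.DataFrame([
    {'col1': ['rice', 'bus']},
    {'col1': ['chair']},
    {'col1': ['plane', 'chair']},
])
result = match_pairs_any_order(categories, shuffled)
print(result)
```

Like the original loop, this keeps only the first match per row; collecting all matches would mean appending to a list instead of breaking out of the loop.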
