Split a list of strings into column names and set true or false depending on whether each string is present

I have this dataframe:

id            class                     text
1       ["oil","water"]                text1
2           ["oil"]                    text2
3     ["sun","water","earth"]          text3

I get the list of all possible classes with this code:

import ast
# `class` is a Python keyword, so the column must be accessed with brackets
df['class'].map(ast.literal_eval).explode().value_counts()

oil
water
sun
earth

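I want to create a new dataframe with every class as a column name, set to 1 if that class appears in the row's class column and 0 otherwise: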

id            class                 text       oil    water   sun   earth
1       ["oil","water"]           text1       1       1      0       0
2           ["oil"]               text2       1       0      0       0
3     ["sun","water","earth"]     text3       0       1      1       1

I tried:

f = df.explode('class').pivot(columns='class', index='text', values='text').notnull().astype(int)

But the column names are not split correctly.
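One possible explanation, assuming the class column holds strings such as '["oil","water"]' rather than real lists (the use of ast.literal_eval above suggests this): explode() leaves plain strings untouched, so each whole string ends up as a single column name. A minimal sketch of that behaviour, with the dataframe rebuilt by hand as a hypothetical stand-in for the real data:

```python
import ast

import pandas as pd

# Hypothetical reconstruction of the question's data: "class" holds strings, not lists.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "class": ['["oil","water"]', '["oil"]', '["sun","water","earth"]'],
    "text": ["text1", "text2", "text3"],
})

# explode() only splits list-like values; plain strings pass through unchanged.
unsplit = df.explode("class")

# Parsing each string into a real list first lets explode() emit one row per class.
parsed = df.assign(**{"class": df["class"].map(ast.literal_eval)})
split = parsed.explode("class")
```

Here unsplit still has one row per id, while split has one row per (id, class) pair, which is the shape a pivot needs.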

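# Assumes df['class'] already holds lists (parse with ast.literal_eval first if it holds strings)
(df.merge(df.assign(value=1).explode('class')
          .pivot_table(values='value', index='id', columns='class', aggfunc='sum', fill_value=0)
          .reset_index()))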
id                class   text  earth  oil  sun  water
0   1         [oil, water]  text1      0    1    0      1
1   2                [oil]  text2      0    1    0      0
2   3  [sun, water, earth]  text3      1    0    1      1
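An alternative sketch, not part of the answer above, that starts from string values and builds the indicator columns with pd.crosstab. The dataframe is rebuilt by hand here, and the names cls, exploded, indicators and result are only illustrative:

```python
import ast

import pandas as pd

# Hypothetical reconstruction of the question's data, with "class" stored as strings.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "class": ['["oil","water"]', '["oil"]', '["sun","water","earth"]'],
    "text": ["text1", "text2", "text3"],
})

# Parse the strings into lists, then produce one row per (id, class) pair.
classes = df["class"].map(ast.literal_eval)
exploded = df[["id"]].assign(cls=classes).explode("cls")

# Count occurrences per id and class; clip so repeated classes still yield 1.
indicators = pd.crosstab(exploded["id"], exploded["cls"]).clip(upper=1)

# Attach the 0/1 columns to the original rows.
result = df.merge(indicators.reset_index(), on="id")
print(result)
```

The clip step only matters if a class can appear more than once in the same row; without it the cells would hold counts instead of 0/1 flags.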
