I have a name column in a DataFrame that holds several names.
DataFrame:
import pandas as pd

df = pd.DataFrame({'name': ['Brailey, Mr. William Theodore Ronald',
                            'Roger Marie Bricoux',
                            'Mr. Roderick Robert Crispin',
                            'Cunningham',
                            'Mr. Alfred Fleming']})
Output:
                                   name
0  Brailey, Mr. William Theodore Ronald
1                   Roger Marie Bricoux
2           Mr. Roderick Robert Crispin
3                            Cunningham
4                    Mr. Alfred Fleming
I wrote a row-classification function: when I pass it a row (a name), it should return the matching class.
mus = ['Brailey, Mr. William Theodore Ronald', 'Roger Marie Bricoux', 'John Frederick Preston Clarke']

def classify_role(row):
    if row.loc['name'] in mus:
        return 'musician'
Calling the function:
is_brailey = df['name'].str.startswith('Brailey')
print(classify_role(df[is_brailey].iloc[0]))
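This should print 'musician', but the output shows a different class. I think I wrote something wrong in classify_role().
It must be this line: if row.loc['name'] in mus:
Summary: if I pass a person's name to startswith(), the function should return musician.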
Edit: If you want to test whether a value exists in a list, you can create a dictionary of categories and test membership with Series.isin:
mus = ['Brailey, Mr. William Theodore Ronald', 'Roger Marie Bricoux',
       'John Frederick Preston Clarke']
cat1 = ['Mr. Alfred Fleming', 'Cunningham']
d = {'musician': mus, 'category': cat1}

# For each category, label the rows whose name is in that category's list
for k, v in d.items():
    df.loc[df['name'].isin(v), 'type'] = k
print(df)

                                   name      type
0  Brailey, Mr. William Theodore Ronald  musician
1                   Roger Marie Bricoux  musician
2           Mr. Roderick Robert Crispin       NaN
3                            Cunningham  category
4                    Mr. Alfred Fleming  category
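As a variation on the loop above, the dictionary can be inverted into a flat name-to-category lookup and applied in one pass with Series.map. This is only a sketch: the `lookup` name is made up for illustration, and it assumes each name belongs to at most one category.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Brailey, Mr. William Theodore Ronald',
                            'Roger Marie Bricoux',
                            'Mr. Roderick Robert Crispin',
                            'Cunningham',
                            'Mr. Alfred Fleming']})

d = {'musician': ['Brailey, Mr. William Theodore Ronald', 'Roger Marie Bricoux',
                  'John Frederick Preston Clarke'],
     'category': ['Mr. Alfred Fleming', 'Cunningham']}

# Invert {category: [names]} into {name: category} (hypothetical helper dict).
lookup = {name: category for category, names in d.items() for name in names}

# Series.map leaves a missing value for names that are not in the lookup.
df['type'] = df['name'].map(lookup)
print(df)
```

With many categories this avoids one isin pass per category, at the cost of building the inverted dictionary first.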
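Your solution should be changed as follows. Series.apply passes each value of the column (a string) to the function, not a row, so the membership test is applied to the value directly: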
mus = ['Brailey, Mr. William Theodore Ronald', 'Roger Marie Bricoux',
       'John Frederick Preston Clarke']

def classify_role(name):
    # name is a single string from the column, not a row
    if name in mus:
        return 'musician'

df['type'] = df['name'].apply(classify_role)
print(df)

                                   name      type
0  Brailey, Mr. William Theodore Ronald  musician
1                   Roger Marie Bricoux  musician
2           Mr. Roderick Robert Crispin      None
3                            Cunningham      None
4                    Mr. Alfred Fleming      None
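If the function should keep receiving a whole row, as in the question, DataFrame.apply with axis=1 is one option. A minimal sketch, assuming the column is called 'name' as in the question's DataFrame; the explicit `return None` is added here only to make the no-match case visible.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Brailey, Mr. William Theodore Ronald',
                            'Roger Marie Bricoux',
                            'Mr. Roderick Robert Crispin',
                            'Cunningham',
                            'Mr. Alfred Fleming']})

mus = ['Brailey, Mr. William Theodore Ronald', 'Roger Marie Bricoux',
       'John Frederick Preston Clarke']

def classify_role(row):
    # row is a Series holding one DataFrame row, so row.loc['name'] is a string.
    if row.loc['name'] in mus:
        return 'musician'
    return None  # no matching class

# axis=1 calls the function once per row.
df['type'] = df.apply(classify_role, axis=1)

# Calling it on a single selected row, as in the question:
is_brailey = df['name'].str.startswith('Brailey')
result = classify_role(df[is_brailey].iloc[0])
print(result)
```

Row-wise apply is usually slower than the vectorized isin approach, so it is mainly useful when the classification needs more than one column.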
You can pass a tuple of values to Series.str.startswith, and the solution extends to more categories by matching through a dictionary:
d = {'musician': ['Brailey, Mr. William Theodore Ronald'],
     'cat1': ['Roger Marie Bricoux', 'Cunningham']}

# str.startswith accepts a tuple of prefixes, so each list is converted first
for k, v in d.items():
    df.loc[df['name'].str.startswith(tuple(v)), 'type'] = k
print(df)

                                   name      type
0  Brailey, Mr. William Theodore Ronald  musician
1                   Roger Marie Bricoux      cat1
2           Mr. Roderick Robert Crispin       NaN
3                            Cunningham      cat1
4                    Mr. Alfred Fleming       NaN
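When unmatched rows should get an explicit label instead of NaN, the same prefix tests can be combined with numpy.select. This is a sketch under assumptions: the 'unknown' default label is hypothetical and not part of the original answer, and when a name matches several categories the first one listed in the dictionary wins.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Brailey, Mr. William Theodore Ronald',
                            'Roger Marie Bricoux',
                            'Mr. Roderick Robert Crispin',
                            'Cunningham',
                            'Mr. Alfred Fleming']})

d = {'musician': ['Brailey, Mr. William Theodore Ronald'],
     'cat1': ['Roger Marie Bricoux', 'Cunningham']}

# One boolean mask per category; str.startswith accepts a tuple of prefixes.
conditions = [df['name'].str.startswith(tuple(v)).to_numpy(dtype=bool)
              for v in d.values()]
choices = list(d.keys())

# Rows that match no category receive the assumed 'unknown' label.
df['type'] = np.select(conditions, choices, default='unknown')
print(df)
```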