Compare a list against a column in a DataFrame and, on a match, append to a new column



I have a column in a DataFrame that contains strings ending with a location code. For example: Growers SeGrowersSecret 14AG CHEM

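locations = ["AG CHEM", "AG SEED", "BH CHEM", "BH FARM", 'BH GREEN', 'CT CHEM', 'Bighorn Farm', 'Courthouse Farm']

df["Location Code"] = ""
loc = []

for i in df["str"]:
    stlen = len(i)
    match = ""  # stays empty when nothing matches, so loc keeps one entry per row
    for x in locations:
        loclen = len(x)
        # compare only the tail of the string that is as long as the location
        start, stop = stlen - loclen, stlen
        if i[start:stop] == x:  # assumed condition: the original `if` was left blank
            match = x
            break
    loc.append(match)

df["Location Code"] = loc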

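The locations list contains every possible location. I want to compare the list against that part of the string and store the matching location in a separate column of the DataFrame. I also tried str.endswith(), but that didn't work either.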

Any help is greatly appreciated!

Use this code:

def f(x):
    # return the first location that appears anywhere in the string
    for i in locations:
        if x.find(i) > -1:
            return i
    return None  # no location found

df['location'] = df['str'].apply(f)
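Since the question says the location code always sits at the end of the string, a stricter variant could test the suffix instead of searching the whole string. The following is a minimal sketch under that assumption; the helper name `match_suffix` and the second and third sample rows are made up for illustration, and only the first row comes from the question.

```python
import pandas as pd

locations = ["AG CHEM", "AG SEED", "BH CHEM", "BH FARM", "BH GREEN",
             "CT CHEM", "Bighorn Farm", "Courthouse Farm"]

# Hypothetical sample data; only the first row appears in the question.
df = pd.DataFrame({"str": ["Growers SeGrowersSecret 14AG CHEM",
                           "Some Product 20Bighorn Farm",
                           "No location here"]})

def match_suffix(text):
    """Return the location code that `text` ends with, or None if there is none."""
    for code in locations:
        if text.endswith(code):
            return code
    return None

df["location"] = df["str"].apply(match_suffix)
print(df)
```

Rows without a recognised suffix end up with a missing value rather than raising an error, which keeps the new column the same length as the DataFrame.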

Given:

                                 col
0  Growers SeGrowersSecret 14AG CHEM

Doing:

locations = ["AG CHEM", "AG SEED", "BH CHEM", "BH FARM", 'BH GREEN', 'CT CHEM', 'Bighorn Farm', 'Courthouse Farm']
regex = '(' + '|'.join(locations) + ')'
df['locations'] = df.col.str.extract(regex)
print(df)

Output:

                                 col locations
0  Growers SeGrowersSecret 14AG CHEM   AG CHEM
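If a location name could also appear earlier in the string, or ever contains regex metacharacters, the pattern may need to be escaped and anchored to the end of the string. Here is a hedged sketch of that variation; the second sample row is hypothetical, and `expand=False` is used so that `str.extract` returns a Series.

```python
import re
import pandas as pd

locations = ["AG CHEM", "AG SEED", "BH CHEM", "BH FARM", "BH GREEN",
             "CT CHEM", "Bighorn Farm", "Courthouse Farm"]

# Hypothetical sample data; only the first row appears in the question.
df = pd.DataFrame({"col": ["Growers SeGrowersSecret 14AG CHEM",
                           "AG CHEM additive 5BH FARM"]})

# Escape each code, then require the match at the end of the string
# (allowing trailing whitespace).
pattern = "(" + "|".join(re.escape(loc) for loc in locations) + r")\s*$"

df["locations"] = df["col"].str.extract(pattern, expand=False)
print(df)
```

In the second row the unanchored pattern would pick up the leading "AG CHEM", whereas the anchored one returns the trailing "BH FARM".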
