I have a column in a DataFrame that contains a string ending in a location code. For example, Growers SeGrowersSecret 14AG CHEM
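locations = ["AG CHEM", "AG SEED", "BH CHEM", "BH FARM", 'BH GREEN', 'CT CHEM', 'Bighorn Farm', 'Courthouse Farm']
df["Location Code"] = ""
loc = []
for i in df["str"]:
    stlen = len(i)
    for x in locations:
        loclen = len(x)
        start, stop = stlen - loclen, stlen
        # compare the tail of the string with the location code
        if i[start:stop] == x:
            loc.append(x)
            break
    else:
        # no code matched; keeps loc the same length as the column
        loc.append(None)
df["Location Code"] = loc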
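The locations list contains every possible location. I want to compare the list against that part of the string and have a separate column in the DataFrame for the location. I also tried str.endswith(), but that didn't work either.
Thanks for any help!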
Use this code:
def f(x):
    # Return the first location code found in the string (None if nothing matches).
    for i in locations:
        if x.find(i) > -1:
            return i

df['location'] = df['str'].apply(f)
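Note that `find` matches a code anywhere in the string, while the question says the code always sits at the end. A variant based on `str.endswith` might be a closer fit. The following is only a minimal sketch: `find_location` is a hypothetical helper name, the column name `str` is taken from the question, and the second and third sample rows are made up for illustration.

```python
import pandas as pd

locations = ["AG CHEM", "AG SEED", "BH CHEM", "BH FARM", "BH GREEN",
             "CT CHEM", "Bighorn Farm", "Courthouse Farm"]

def find_location(text):
    """Return the location code that `text` ends with, or None if there is none."""
    for code in locations:
        if text.endswith(code):
            return code
    return None

# Sample data: the first row is from the question, the others are invented.
df = pd.DataFrame({"str": ["Growers SeGrowersSecret 14AG CHEM",
                           "Some Product 2BH FARM",
                           "No code here"]})
df["location"] = df["str"].apply(find_location)
```

Rows that do not end in any known code get a missing value rather than raising an error, so the new column always has the same length as the DataFrame.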
Given:
                                 col
0  Growers SeGrowersSecret 14AG CHEM
Doing:
locations = ["AG CHEM", "AG SEED", "BH CHEM", "BH FARM", 'BH GREEN', 'CT CHEM', 'Bighorn Farm', 'Courthouse Farm']
regex = '(' + '|'.join(locations) + ')'
df['locations'] = df.col.str.extract(regex)
print(df)
Output:
                                 col locations
0  Growers SeGrowersSecret 14AG CHEM   AG CHEM
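The pattern above matches a code anywhere in the string and would misbehave if a code ever contained a regex metacharacter. If the code is always the last thing in the string, one possible refinement is to escape each code and anchor the pattern with `$`. This is a sketch under those assumptions; the second and third sample rows are invented for illustration.

```python
import re
import pandas as pd

locations = ["AG CHEM", "AG SEED", "BH CHEM", "BH FARM", "BH GREEN",
             "CT CHEM", "Bighorn Farm", "Courthouse Farm"]

# Escape every code, try longer codes first, and anchor the match to the end.
ordered = sorted(locations, key=len, reverse=True)
pattern = "(" + "|".join(re.escape(code) for code in ordered) + ")$"

# Sample data: the first row is from the answer, the others are invented.
df = pd.DataFrame({"col": ["Growers SeGrowersSecret 14AG CHEM",
                           "AG CHEM notes for Bighorn Farm",
                           "no code"]})

# expand=False returns a Series, which assigns cleanly to a single column.
df["locations"] = df["col"].str.extract(pattern, expand=False)
```

With the anchor, the second row yields Bighorn Farm (the trailing code) instead of the AG CHEM that appears earlier in the string, and the third row is left as a missing value.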