I have the following string and dataframe:
s = """Econsult :
IHD/ DM/ HTN
Suggest
Tablet augmentin duo 625mg 1-0-1 for 3 days after breakfast and after dinner
Capsule providac 1-0-1 for 3 days after breakfast and after dinner
Tablet pantodac 40mg 1-0-0 for 3 days before breakfast
Monday
Decide based on reports
Thanks"""
df =
| col1          | synonnym1                | synonym2          |
|---------------|--------------------------|-------------------|
| Diabetes      | DM                       | Diabetes Mellitus |
| Heart failure | Congestive heart failure | NaN               |
| Hypertension  | HTN                      | HBP               |
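As you can see, the string contains the word "DM", and the second column of the dataframe also contains "DM". I want to loop over every element of the dataframe, check whether it occurs in the string, and return the dataframe row it belongs to. In this example DM matches, so I expect (Diabetes, DM, Diabetes Mellitus) to be returned. Similarly, HBP might match on another occasion, in which case (Hypertension, HTN, HBP) should be returned.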
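I tried
data['res'] = data.apply( lambda col:col.str.contains('|'.join(str('a')))).any(axis=1)
result = data[data['res']==True]
but this matches the input string against the dataframe. I want each element of the dataframe to be matched against the input string, and the matching rows returned.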
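Use pd.DataFrame.applymap to test every cell against the string:
# The isinstance check skips NaN cells, which are floats and would make `x in s` raise a TypeError
new_df = df[df.applymap(lambda x: isinstance(x, str) and x in s).any(axis=1)]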
print(new_df)
Output:
col1 synonnym1 synonym2
0 Diabetes DM Diabetes Mellitus
2 Hypertension HTN HBP
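
For anyone who wants to reproduce this, here is a self-contained sketch of the same idea. A few things in it are my own assumptions rather than part of the question: the helper name `cell_in_text` and the variables `mask`, `matched` and `rows` are made up for illustration, the first-column value is spelled "Heart failure", and the match is restricted to whole words (a plain `x in s` test would also find "DM" inside a longer token). It maps over each column with `Series.map` instead of calling `applymap`, since `applymap` is deprecated in recent pandas releases in favor of `DataFrame.map`.

```python
import re

import numpy as np
import pandas as pd

s = """Econsult :
IHD/ DM/ HTN
Suggest
Tablet augmentin duo 625mg 1-0-1 for 3 days after breakfast and after dinner
Capsule providac 1-0-1 for 3 days after breakfast and after dinner
Tablet pantodac 40mg 1-0-0 for 3 days before breakfast
Monday
Decide based on reports
Thanks"""

# Column names are copied from the question, including the "synonnym1" spelling.
df = pd.DataFrame({
    "col1": ["Diabetes", "Heart failure", "Hypertension"],
    "synonnym1": ["DM", "Congestive heart failure", "HTN"],
    "synonym2": ["Diabetes Mellitus", np.nan, "HBP"],
})


def cell_in_text(cell, text):
    """Return True if `cell` is a string that occurs in `text` as a whole word.

    Hypothetical helper: non-string cells such as NaN never match.
    """
    if not isinstance(cell, str):
        return False
    # Lookarounds instead of \b so terms that start or end with punctuation still work.
    pattern = r"(?<!\w)" + re.escape(cell) + r"(?!\w)"
    return re.search(pattern, text) is not None


# Test every cell column by column, then keep rows where any cell matched.
mask = df.apply(lambda col: col.map(lambda cell: cell_in_text(cell, s))).any(axis=1)
matched = df[mask]

# Plain tuples, in the (col1, synonnym1, synonym2) form the question asks for.
rows = list(matched.itertuples(index=False, name=None))
print(rows)
```

The match is case-sensitive; if "dm" should also count, passing `flags=re.IGNORECASE` to `re.search` would be one way to relax it.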