How to match every row and column of a DataFrame against a string - Python



I have the following string and DataFrame:

s = """Econsult :
IHD/ DM/ HTN
Suggest
Tablet augmentin duo 625mg 1-0-1 for 3 days after breakfast and after dinner
Capsule providac 1-0-1 for 3 days after breakfast and after dinner
Tablet pantodac 40mg 1-0-0 for 3 days before breakfast
Monday
Decide based on reports
Thanks"""

df =

| col1         | synonnym1                | synonym2          |
|--------------|--------------------------|-------------------|
| Diabetes     | DM                       | Diabetes Mellitus |
| Hear failure | Congestive heart failure | NaN               |
| Hypertension | HTN                      | HBP               |
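For anyone who wants to reproduce the problem, the inputs above could be built roughly as follows. This is only a sketch: the table is not given as code in the question, so the column names and cell spellings are copied as they appear there (including `synonnym1` and `Hear failure`), and the missing cell is assumed to be a real `NaN` rather than the string "NaN".

```python
import numpy as np
import pandas as pd

# The free-text note from the question.
s = """Econsult :
IHD/ DM/ HTN
Suggest
Tablet augmentin duo 625mg 1-0-1 for 3 days after breakfast and after dinner
Capsule providac 1-0-1 for 3 days after breakfast and after dinner
Tablet pantodac 40mg 1-0-0 for 3 days before breakfast
Monday
Decide based on reports
Thanks"""

# The lookup table; names and spellings are kept exactly as in the question.
# Assumption: the empty cell is a float NaN, not the text "NaN".
df = pd.DataFrame(
    {
        "col1": ["Diabetes", "Hear failure", "Hypertension"],
        "synonnym1": ["DM", "Congestive heart failure", "HTN"],
        "synonym2": ["Diabetes Mellitus", np.nan, "HBP"],
    }
)
```

With a real `NaN` in the frame, any per-cell test has to cope with non-string values, because `float("nan") in s` raises a `TypeError`.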

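As you can see, the string contains the word "DM", and the second column of the DataFrame also contains "DM". I want to loop over every element of the DataFrame, check whether it occurs in the string, and return the matching DataFrame row. In this example DM matches, so I expect (Diabetes, DM, Diabetes Mellitus) to be returned. Similarly, HBP might match on another occasion, in which case (Hypertension, HTN, HBP) should be returned.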

I tried

data['res'] = data.apply( lambda col:col.str.contains('|'.join(str('a')))).any(axis=1)
result = data[data['res']==True]

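but this matches the input string against the DataFrame. I want each element of the DataFrame to be matched against the input string, and the matching rows returned.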

Use pd.DataFrame.applymap (deprecated since pandas 2.1 in favor of DataFrame.map):

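# Keep rows where any string cell occurs in s; non-string cells such as NaN are skipped
new_df = df[df.applymap(lambda x: isinstance(x, str) and x in s).any(axis=1)]
print(new_df)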

Output:

           col1 synonnym1           synonym2
0      Diabetes        DM  Diabetes Mellitus
2  Hypertension       HTN                HBP
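One caveat with a plain `x in s` test is that it matches substrings, so a short abbreviation such as "DM" would also be found inside an unrelated word. If that matters, a whole-term variant might look like the sketch below. The helper name `rows_mentioned_in` and the small demo frame and text are hypothetical and are not part of the original answer; the matching stays case-sensitive, as in the answer above.

```python
import re

import pandas as pd


def rows_mentioned_in(df, text):
    """Return the rows of df in which any string cell occurs in text as a whole term."""

    def cell_matches(cell):
        # Skip NaN and other non-string cells, as well as empty strings.
        if not isinstance(cell, str) or not cell.strip():
            return False
        # Require that the term is not glued to other word characters on either side.
        pattern = r"(?<!\w)" + re.escape(cell.strip()) + r"(?!\w)"
        return re.search(pattern, text) is not None

    # Test every cell column by column, then keep rows with at least one hit.
    mask = df.apply(lambda col: col.map(cell_matches)).any(axis=1)
    return df[mask]


# Hypothetical demo data: "DM" appears only inside the word "ADMIT".
demo_df = pd.DataFrame(
    {
        "col1": ["Diabetes", "Hypertension"],
        "synonnym1": ["DM", "HTN"],
        "synonym2": ["Diabetes Mellitus", "HBP"],
    }
)
demo_text = "ADMIT for observation, known HTN"

result = rows_mentioned_in(demo_df, demo_text)
print(result)
```

Here only the Hypertension row is kept, whereas the substring test would also keep the Diabetes row because "DM" occurs inside "ADMIT". Whether that stricter behavior is wanted depends on how the notes are written.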
