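I have two DataFrames:
main_df, with about 800 rows, holding a description column and a partly empty category column
keyword_df, with about 50 rows, mapping each keyword to a category
Because the keyword frame has only 50 rows, it is practical to iterate over those rows and update the main frame accordingly. For each keyword, set the category on rows whose description starts with that keyword and whose category is still missing: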
import numpy as np
import pandas as pd

# Small stand-ins for the real frames
main_df = pd.DataFrame({'description': ['ABCD', 'XYZ', 'ABC', 'QWE'],
                        'category': ['ONE', 'THREE', np.nan, np.nan]})
keyword_df = pd.DataFrame({'keyword': ['AB'],
                           'category': ['FIVE']})

for key in keyword_df.itertuples(index=False):
    # Rows whose description starts with the keyword and whose category is still missing
    mask = (main_df['description'].str.startswith(key.keyword)
            & main_df['category'].isnull())
    main_df.loc[mask, 'category'] = key.category

main_df
  description category
0        ABCD      ONE
1         XYZ    THREE
2         ABC     FIVE
3         QWE      NaN
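
If the update needs to run more than once, the loop above could be wrapped in a small helper. This is only a sketch: the function name fill_category_by_prefix and the second keyword row ('Q' mapped to 'SIX') are made up for illustration and are not part of the original data. It works on a copy so the input frame is left untouched, and because only rows with a missing category are candidates, the first matching keyword wins.

```python
import numpy as np
import pandas as pd


def fill_category_by_prefix(main_df, keyword_df):
    """Return a copy of main_df with missing categories filled by keyword prefix.

    Hypothetical helper: assumes main_df has 'description' and 'category'
    columns and keyword_df has 'keyword' and 'category' columns.
    """
    result = main_df.copy()
    for key in keyword_df.itertuples(index=False):
        # Only rows whose category is still missing are candidates,
        # so an earlier keyword is never overwritten by a later one.
        mask = (result['description'].str.startswith(key.keyword)
                & result['category'].isnull())
        result.loc[mask, 'category'] = key.category
    return result


main_df = pd.DataFrame({'description': ['ABCD', 'XYZ', 'ABC', 'QWE'],
                        'category': ['ONE', 'THREE', np.nan, np.nan]})
# The second row ('Q' -> 'SIX') is an invented example keyword.
keyword_df = pd.DataFrame({'keyword': ['AB', 'Q'],
                           'category': ['FIVE', 'SIX']})

filled = fill_category_by_prefix(main_df, keyword_df)
print(filled)
```

Rows that already have a category (such as 'ABCD') are left alone even when their description matches a keyword, which mirrors the isnull() check in the original loop.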