Select rows of a pandas DataFrame by prefix and value column



I have two DataFrames:

main_df, about 800 rows


Since the keyword frame has only 50 rows, you can iterate over those rows and update the main frame accordingly:

import numpy as np
import pandas as pd

# Sample data: rows 2 and 3 have no category yet
main_df = pd.DataFrame({'description': ['ABCD', 'XYZ', 'ABC', 'QWE'],
                        'category': ['ONE', 'THREE', np.nan, np.nan]})
keyword_df = pd.DataFrame({'keyword': ['AB'],
                           'category': ['FIVE']})

# For each (keyword, category) pair, fill the rows whose description
# starts with the keyword and whose category is still missing
for key in keyword_df.itertuples(index=False):
    mask = (main_df['description'].str.startswith(key.keyword)
            & main_df['category'].isnull())
    main_df.loc[mask, 'category'] = key.category

main_df
  description category
0        ABCD      ONE
1         XYZ    THREE
2         ABC     FIVE
3         QWE      NaN
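
If the loop ever becomes a bottleneck, one possible loop-free variant is sketched below. It is not part of the answer above: it swaps the per-keyword loop for a single anchored regular expression built from the keywords, and it assumes the keywords are plain strings without missing values. The names pattern, prefix, lookup and suggested are illustrative only. When keywords overlap (for example 'A' and 'AB'), the keyword listed first in keyword_df should win, which matches the loop's behaviour.

```python
import re

import numpy as np
import pandas as pd

main_df = pd.DataFrame({'description': ['ABCD', 'XYZ', 'ABC', 'QWE'],
                        'category': ['ONE', 'THREE', np.nan, np.nan]})
keyword_df = pd.DataFrame({'keyword': ['AB'],
                           'category': ['FIVE']})

# Build one anchored pattern such as ^(AB|CD); re.escape guards against
# keywords that contain regex metacharacters.
pattern = '^(' + '|'.join(re.escape(k) for k in keyword_df['keyword']) + ')'

# Extract the matching prefix for every row (NaN where no keyword matches).
prefix = main_df['description'].str.extract(pattern, expand=False)

# Translate each prefix to its category through a keyword -> category lookup.
# drop_duplicates keeps the first entry so the lookup index is unique.
lookup = keyword_df.drop_duplicates('keyword').set_index('keyword')['category']
suggested = prefix.map(lookup)

# Fill only the rows whose category is still missing.
main_df['category'] = main_df['category'].fillna(suggested)
```

For the sample data this should leave main_df in the same state as the loop version: row 2 receives FIVE and row 3 stays NaN.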
