How can I filter a pandas DataFrame using a pattern?



Is there a way to filter the rows of a pandas DataFrame using a wildcard pattern?

Example of the initial state of the data:

import pandas as pd

df = pd.DataFrame([
['noun','nominative','singular','m','',''],
['noun','nominative','singular','f','',''],
['noun','nominative','singular','n','',''],
['noun','accusative','singular','n','',''],
['noun','accusative','singular','n','',''],
['noun','accusative','singular','n','',''],
['verb','','singular','','present','1per'],
['verb','','singular','','present','2per'],
['verb','','singular','','present','3per'],
['verb','','plural','','present','1per'],
['verb','','plural','','present','2per'],
['verb','','plural','','present','3per'],
],columns=['pos', 'case', 'number', 'gender', 'tense', 'person'])
mask = pd.Series(['noun','nominative','singular','*','',''])

Desired final state of the data:

['noun','nominative','singular','m','',''],
['noun','nominative','singular','f','',''],
['noun','nominative','singular','n','',''],

You can simply leave the wildcard columns out of the comparison:

pattern = ['noun', 'nominative', 'singular', '', '']
cols_to_match = ['pos', 'case', 'number', 'tense', 'person']
mask = (df[cols_to_match] == pattern).all(axis=1)
df_filtered = df[mask]
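
If the pattern arrives with the wildcard still in it, as in the question, the column selection can be derived from the pattern instead of being written out by hand. The following is a minimal sketch of that idea, not a built-in pandas feature: the helper name `filter_by_pattern` and its `wildcard` parameter are made up for illustration, and it assumes the pattern has exactly one entry per column and that column names are unique.

```python
import pandas as pd


def filter_by_pattern(df, pattern, wildcard="*"):
    """Return the rows of df that equal pattern in every non-wildcard column.

    Hypothetical helper: pattern must contain one entry per column of df.
    """
    if len(pattern) != len(df.columns):
        raise ValueError("pattern must have one entry per column")

    # Map each non-wildcard column name to the value it must equal.
    expected = pd.Series(
        {col: val for col, val in zip(df.columns, pattern) if val != wildcard}
    )
    if expected.empty:
        # Every position is a wildcard, so every row matches.
        return df.copy()

    # Compare only the constrained columns; a row matches if all of them agree.
    mask = df[list(expected.index)].eq(expected).all(axis=1)
    return df[mask]


# Small sample in the same layout as the question's data.
df = pd.DataFrame([
    ['noun', 'nominative', 'singular', 'm', '', ''],
    ['noun', 'nominative', 'singular', 'f', '', ''],
    ['noun', 'nominative', 'singular', 'n', '', ''],
    ['noun', 'accusative', 'singular', 'n', '', ''],
    ['verb', '', 'singular', '', 'present', '1per'],
], columns=['pos', 'case', 'number', 'gender', 'tense', 'person'])

result = filter_by_pattern(df, ['noun', 'nominative', 'singular', '*', '', ''])
print(result)
```

With this sample, only the three nominative singular nouns are kept, because `gender` is the one column excluded from the comparison.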
