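Suppose I have a DataFrame of celebrities with their age, ethnicity, height, industry, and so on.
I want to write a function that filters the DataFrame systematically, so that several filters can be applied at once.
For example: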
def filter_data(df, filter_col, filter_val, filter_amount):
    if filter_amount == 1:
        df = df[df[filter_col[0]] == filter_val[0]]
    if filter_amount == 2:
        df = df[(df[filter_col[0]] == filter_val[0]) & (df[filter_col[1]] == filter_val[1])]
    # etc., one hand-written branch per filter_amount
    return df
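Here filter_col is a list of the columns to filter on, filter_val is the list of corresponding values, and filter_amount is an integer.
I want it to be systematic, so that for any number of filters it filters the dataset according to the values in the lists, without my having to hand-code each case.
Any help is appreciated.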
Since the filters are combined with AND (&), it makes sense to apply them one after another:
import pandas as pd

def filter_data(df, filter_col, filter_val, filter_amount):
    out = df.copy()
    for i in range(filter_amount):
        out = out[out[filter_col[i]] == filter_val[i]]
    return out

def main():
    x = pd.DataFrame({"Age": [12, 44, 23], "Ethnicity": ["White", "Black", "White"], "Height": [180, 182, 168]})
    #    Age Ethnicity  Height
    # 0   12     White     180
    # 1   44     Black     182
    # 2   23     White     168
    y = filter_data(x, ["Ethnicity", "Height"], ["White", 180], 1)
    #    Age Ethnicity  Height
    # 0   12     White     180
    # 2   23     White     168
    z = filter_data(x, ["Ethnicity", "Height"], ["White", 180], 2)
    #    Age Ethnicity  Height
    # 0   12     White     180
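A variant of the same idea builds a single boolean mask and indexes the DataFrame once. Zipping the two lists also removes the need for filter_amount. This is only a sketch: filter_data_mask is a hypothetical name, not part of the answer above.

```python
import pandas as pd


def filter_data_mask(df, filter_col, filter_val):
    """Keep rows where every column in filter_col equals its paired value.

    Hypothetical helper; filter_col and filter_val are assumed to be
    lists of equal length.
    """
    # Start with an all-True mask aligned to df's index.
    mask = pd.Series(True, index=df.index)
    for col, val in zip(filter_col, filter_val):
        # AND each equality test into the running mask.
        mask = mask & (df[col] == val)
    return df[mask]


x = pd.DataFrame({
    "Age": [12, 44, 23],
    "Ethnicity": ["White", "Black", "White"],
    "Height": [180, 182, 168],
})
result = filter_data_mask(x, ["Ethnicity", "Height"], ["White", 180])
print(result)
```

With empty lists the mask stays all True, so the function returns every row unchanged.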
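# Assumes df is the DataFrame and col is the name of the column to filter on.
filter_vals = [1, 2, 3]
filter_amount = 3
# Builds a list with one filtered DataFrame per value in filter_vals.
filtered_df = [df[df[col] == val] for val in filter_vals]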
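The list comprehension yields one DataFrame per value. If a single DataFrame with rows matching any of the values is wanted instead, Series.isin may be a closer fit. The sketch below assumes made-up data: the column name "Group" and the values are hypothetical stand-ins for the unspecified df and col.

```python
import pandas as pd

# Hypothetical example data; "Group" stands in for the unspecified column `col`.
df = pd.DataFrame({"Group": [1, 2, 3, 4, 2], "Score": [10, 20, 30, 40, 50]})
col = "Group"
filter_vals = [1, 2]

# One filtered DataFrame per value, keyed by the value it matches.
per_value = {val: df[df[col] == val] for val in filter_vals}

# A single DataFrame with rows whose `col` matches any of the values.
any_match = df[df[col].isin(filter_vals)]
print(any_match)
```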