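I have a CSV file listing several people. I want to build a function that filters on whichever arguments are given, or returns the whole dataframe when no arguments are passed.
Given this as the CSV: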
FirstName LastName City
Matt Fred Austin
Jim Jack NYC
Larry Bob Houston
Matt Spencer NYC
If I call my function find, this is what I expect to see depending on the arguments I pass:
find(first="Matt", last="Fred")
Output: Matt Fred Austin
find()
Output: Full Dataframe
find(last="Spencer")
Output: Matt Spencer NYC
find(city="NYC")
Output: All people living in NYC in dataframe
Here is what I tried:
def find(first=None, last=None, city=None):
    file = pd.read_csv(list)
    searched = file.loc[(file["FirstName"] == first) & (file["LastName"] == last) & (file["City"] == city)]
    return searched
If I pass only the first name, it returns an empty result.
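The empty result most likely comes from the arguments that are left at None: each of them is still compared against its column, and no row has None as a last name or city. A small sketch of that effect, assuming the sample data is rebuilt by hand as a dataframe (the variable names here are illustrative, not from the original code):

```python
import pandas as pd

# Sample data rebuilt by hand from the CSV above (assumption: no file on disk).
df = pd.DataFrame({
    "FirstName": ["Matt", "Jim", "Larry", "Matt"],
    "LastName": ["Fred", "Jack", "Bob", "Spencer"],
    "City": ["Austin", "NYC", "Houston", "NYC"],
})

first, last = "Matt", None  # only the first name was passed

first_mask = df["FirstName"] == first
last_mask = df["LastName"] == last  # compares every last name with None

print(first_mask.tolist())               # [True, False, False, True]
print(last_mask.tolist())                # [False, False, False, False]
print(len(df[first_mask & last_mask]))   # 0
```

Because the masks are combined with &, a single all-False mask is enough to drop every row.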
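You could do it like this:
import numpy as np

def find(**kwargs):
    # Every keyword must be an existing column name of the global dataframe df.
    assert np.isin(list(kwargs.keys()), df.columns).all()
    # Keep the rows where each named column equals the requested value.
    return df.loc[df[list(kwargs.keys())].eq(list(kwargs.values())).all(axis=1)]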
search = find(FirstName="Matt", LastName="Fred")
print(search)
#   FirstName LastName    City
# 0      Matt     Fred  Austin
find(LastName="Spencer")
#   FirstName LastName City
# 3      Matt  Spencer  NYC
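If you want to use "first", "last" and "city" as the argument names:
def find(**kwargs):
    # Rename the columns on a copy so the short names can be used for lookup.
    df_index = df.rename(columns={"FirstName": "first",
                                  "LastName": "last",
                                  "City": "city"})
    assert np.isin(list(kwargs.keys()), df_index.columns).all()
    # Build the mask on the renamed copy, but return rows of the original df.
    return df.loc[df_index[list(kwargs.keys())]
                  .eq(list(kwargs.values())).all(axis=1)]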
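If you would rather keep the explicit signature from the question, one possible variant is to skip every argument that was left at None and combine the remaining comparisons into a single mask. This is only a sketch: passing df as a parameter, the COLUMN_FOR mapping and the hand-built sample dataframe are assumptions made for illustration.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (assumption: no file on disk).
df = pd.DataFrame({
    "FirstName": ["Matt", "Jim", "Larry", "Matt"],
    "LastName": ["Fred", "Jack", "Bob", "Spencer"],
    "City": ["Austin", "NYC", "Houston", "NYC"],
})

# Hypothetical mapping from argument names to column names.
COLUMN_FOR = {"first": "FirstName", "last": "LastName", "city": "City"}


def find(df, first=None, last=None, city=None):
    """Return the rows matching every argument that is not None."""
    given = {"first": first, "last": last, "city": city}
    # Start with a mask that keeps every row.
    mask = pd.Series(True, index=df.index)
    for name, value in given.items():
        if value is not None:  # unset arguments do not filter anything
            mask = mask & (df[COLUMN_FOR[name]] == value)
    return df[mask]


print(find(df, first="Matt", last="Fred"))
print(find(df, city="NYC"))
print(find(df))  # no arguments: the whole dataframe
```

One limitation of this design is that None can no longer be used as a value to search for, since it doubles as the "not set" marker.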
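Another way to filter by columns:
import os
import pandas as pd

csv_path = os.path.abspath('test.csv')
# The sample data is whitespace-separated, so split on runs of whitespace.
df = pd.read_table(csv_path, sep=r'\s+')

def find_by_attrs(df, **attrs):
    # Reject keywords that are not column names.
    if set(attrs) - set(df.columns):
        raise KeyError('Improper column name(s)')
    return df[df[list(attrs)].eq(list(attrs.values())).all(axis=1)]

print(find_by_attrs(df, City="NYC"))
Output: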
  FirstName LastName City
1       Jim     Jack  NYC
3      Matt  Spencer  NYC
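To try this without a test.csv on disk, the same idea can be run against the sample data held in a string. The sketch below assumes the data is whitespace-separated as shown in the question; the RAW constant and the early return for the no-argument case are additions for illustration, not part of the answer above.

```python
import io

import pandas as pd

# The sample data from the question, kept in memory instead of in test.csv.
RAW = """FirstName LastName City
Matt Fred Austin
Jim Jack NYC
Larry Bob Houston
Matt Spencer NYC
"""

# sep=r"\s+" splits each line on runs of whitespace.
df = pd.read_csv(io.StringIO(RAW), sep=r"\s+")


def find_by_attrs(df, **attrs):
    """Filter df on column=value pairs given as keyword arguments."""
    unknown = set(attrs) - set(df.columns)
    if unknown:
        raise KeyError(f"Improper column name(s): {sorted(unknown)}")
    if not attrs:
        return df  # nothing to filter on: return everything
    columns = list(attrs)
    values = list(attrs.values())
    # Compare each selected column with its value, keep rows where all match.
    return df[df[columns].eq(values).all(axis=1)]


nyc = find_by_attrs(df, City="NYC")
print(nyc)
```

Calling find_by_attrs(df) with no keywords returns the full dataframe, and a misspelled column such as find_by_attrs(df, Town="NYC") raises KeyError.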