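I am trying to use loc to select the subset of rows in a DataFrame that satisfy some condition, but I want to get that condition from user input and then pass it into the loc statement to create the subset.
I have tried many approaches, but I don't think loc accepts a condition given as a string in this form. Is there a way around this?
See my attempts below: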
col_one = input('Please enter the condition you would like to set. E.g. State == "New York": ')
user_input_test.append(col_one)
one_condition_input = self.df.loc[self.df[user_input_test],:]
# I also tried to use slice but no luck:
col_one = input('Please enter the condition you would like to set. E.g. State == "New York": ')
period = slice(col_one)
self.one_condition_input = self.df.loc[period,:]
# And I tried to use format, taking two user inputs, one with the column name and one with the condition, but again no luck:
col_one = input('Please enter the column you would like to set. E.g. State: ')
col_two = input('Please enter the condition you would like to set. E.g. == "New York": ')
one_condition_input = self.df.loc[self.df["{}".format(col_one)]"{}".format(col_two),:]
I would like to be able to take the whole condition as user input and paste it in like this:
col_one = input('Please enter the condition you would like to set. E.g. State == "New York": ')
self.one_condition_input = self.df.loc[df.col_one,:]
But obviously col_one is not an attribute of df here, so it doesn't work.
Try pandas.DataFrame.query(), which accepts an expression as a string. You can ask the user to enter an expression and then pass it to that method.
expr = input()
df.query(expr, inplace = True)
Pandas query documentation
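To make the suggestion above concrete, here is a minimal sketch of query() applied to a condition string. The sample DataFrame and the hard-coded expr are assumptions for illustration; in the asker's program the string would come from input().

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    "State": ["New York", "Texas", "New York", "Ohio"],
    "Sales": [100, 200, 300, 400],
})

# Stand-in for: expr = input("Please enter the condition: ")
expr = 'State == "New York"'

# Without inplace=True, query() returns a new DataFrame and leaves df untouched.
subset = df.query(expr)
print(subset)
```

Because the string is evaluated as an expression, it is probably worth validating it before passing along input from users you do not trust.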
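DataFrame.loc
Attribute: access a group of rows and columns by label(s) or a boolean array.
DataFrame.iloc
Attribute: purely integer-location based indexing for selection by position.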
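The difference between the two accessors can be sketched as follows. The sample DataFrame, its index labels and the variable names are assumptions made up for this illustration.

```python
import pandas as pd

# Hypothetical sample data with explicit row labels.
df = pd.DataFrame(
    {"State": ["New York", "Texas", "Ohio"], "Sales": [100, 200, 300]},
    index=["a", "b", "c"],
)

mask = df["State"] == "New York"  # boolean Series aligned with df's index
by_mask = df.loc[mask, :]         # loc with a boolean array: rows where mask is True
by_label = df.loc["b", "Sales"]   # loc with labels: row "b", column "Sales"
by_position = df.iloc[0, 1]       # iloc with positions: first row, second column
```

Note that loc is given a boolean Series built from a comparison, not the text of the comparison, which is why passing the raw input string to loc fails.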
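In fact, these accessors take labels or a boolean array rather than a condition written as a text string, so I suggest you keep the user input but split it into its values and build the condition from those. Instead of:
user_input_test.append(col_one)
one_condition_input = df.loc[df[user_input_test],:]
use: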
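import re

cond = re.findall(r'\w+', col_one)  # split the input into words; the operator is dropped
col = cond[0]  # the first word is the column name
col_element = " ".join(cond[1:])  # the remaining words form the value to match
one_condition_input = df.loc[df[col] == col_element, :]  # equality test only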
.
.
.
>>> user_input = "State == New York"  # User input value
>>> cond = re.findall(r'\w+', user_input)  # Separate the words
>>> cond
['State', 'New', 'York']
>>> col, col_element = cond[0], " ".join(cond[1:])
>>> # This is equivalent to df.loc[df["State"] == "New York", :]
>>> one_condition_input = df.loc[df[col] == col_element, :]  # Rows whose "State" column equals "New York".
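The word-splitting approach above discards the operator, so it can only test equality. One possible extension, sketched here under assumptions, keeps the operator and maps it to a function from the standard operator module. The helper name filter_by_condition, the OPS table and the sample data are hypothetical and not part of either answer.

```python
import operator
import re

import pandas as pd

# Assumed whitelist of comparison operators for this sketch.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def filter_by_condition(df, user_input):
    """Hypothetical helper: filter df with a 'column operator value' string."""
    match = re.fullmatch(r"\s*(\w+)\s*(==|!=|>=|<=|>|<)\s*(.+?)\s*", user_input)
    if match is None:
        raise ValueError(f"Cannot parse condition: {user_input!r}")
    col, op, raw = match.groups()
    value = raw.strip("\"'")  # drop optional surrounding quotes
    # Compare as a number when the column holds numeric data.
    if pd.api.types.is_numeric_dtype(df[col]):
        value = float(value)
    mask = OPS[op](df[col], value)  # boolean Series, one entry per row
    return df.loc[mask, :]


# Hypothetical sample data.
df = pd.DataFrame({
    "State": ["New York", "Texas", "New York", "Ohio"],
    "Sales": [100, 200, 300, 400],
})

ny_rows = filter_by_condition(df, 'State == "New York"')
big_sales = filter_by_condition(df, "Sales >= 300")
```

Unlike query(), this only ever runs one of the listed comparisons, at the cost of supporting a single simple condition per input.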