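I have several pandas DataFrames with names like timeslices1_df. I want to extract certain columns from one of these DataFrames, and which DataFrame and column to use depends on user input.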
def user_input_dataframe():
    timeslices_number=int(input("timeslice_number:"))
    process_number=int(input("process_number:"))
    core_number=input("core_number:")
    #timeslices_4__profilerdataprocess_45__c0_us_ example column_name
    dataframe_name="timeslices_"+str(timeslices_number)+"_df"
    column_name="timeslices_"+str(timeslices_number) +'__'+ "profilerdataprocess_"+str(process_number)+'__'+str(core_number)+'_'+"us"
    #print(column_name)
    list_of_datasets = [timeslices0_df,timeslices1_df,timeslices2_df ,timeslices3_df,timeslices4_df,timeslices5_df,
                        timeslices6_df,timeslices7_df,timeslices8_df]
    for index, dataset in enumerate(list_of_datasets):
        if dataframe_name in dataset:
            X_df=pd.DataFrame()
            X_df.append(dataframe_name)
    X1 = [col for col in X_df.columns if column_name in col]
    X2=pd.DataFrame()
    X2=X_df[X1]
    X2['date'] = pd.date_range(start='1/1/2020', periods=len(X1), freq='D')
    X2=X2.set_index('date')
    return X2
I can't do it this way because the user input is a string, and I get an error. Is there another way to retrieve a DataFrame based on user input?
Sample Input:
timeslice_number:4
process_number:45
core_number:c0
Expected Output: new DataFrame with a single selected column
Actual Output: empty DataFrame
It looks like you want to access variables programmatically by name. The built-in functions locals() and globals() make that possible.
variable1 = 1
variable2 = 2
variable3 = 3
i = 2
print(locals().get(f'variable{i}'))
# prints 2
def get_variable(i):
    return globals().get(f'variable{i}')
print(get_variable(3))
# prints 3
However, wouldn't it be cleaner to keep the DataFrames in a list or a dict? Something like:
timeslice_dfs = [
    timeslices1_df,
    timeslices2_df,
    # etc.
]
dfs = {
    'timeslice1': timeslices1_df,
    # etc.
}
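To make the dict idea concrete, here is a minimal sketch of looking a DataFrame up by the number the user types. The frames below are small made-up stand-ins for your real timeslicesN_df objects, and get_timeslice_df is just a hypothetical helper name.

```python
import pandas as pd

# Hypothetical stand-ins for the real timeslicesN_df frames, keyed by number.
timeslice_dfs = {
    n: pd.DataFrame({f"timeslices_{n}__profilerdataprocess_45__c0_us_": [n, n + 1]})
    for n in range(3)
}

def get_timeslice_df(number):
    """Return the frame registered under `number`, or None if there is none."""
    return timeslice_dfs.get(number)

selected = get_timeslice_df(1)  # the frame stored under key 1
missing = get_timeslice_df(7)   # no such key, so this is None
```

With this layout the integer from input() is used directly as the key, so there is no need to build a variable name as a string.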
Can you try this?
import pandas as pd

def user_input_dataframe():
    timeslices_number=int(input("timeslice_number:"))
    process_number=int(input("process_number:"))
    core_number=input("core_number:")
    #timeslices_4__profilerdataprocess_45__c0_us_ example column_name
    column_name="timeslices_"+str(timeslices_number) +'__'+ "profilerdataprocess_"+str(process_number)+'__'+str(core_number)+'_'+"us"
    list_of_datasets = [timeslices0_df,timeslices1_df,timeslices2_df ,timeslices3_df,timeslices4_df,timeslices5_df,
                        timeslices6_df,timeslices7_df,timeslices8_df]
    # Use the number as a list index; reject values outside the list
    # (a negative number would otherwise silently index from the end).
    if not 0 <= timeslices_number < len(list_of_datasets):
        return None
    dataset = list_of_datasets[timeslices_number]
    # Copy the selection so that adding a column does not touch the original frame.
    X1 = dataset[[col for col in dataset.columns if column_name in col]].copy()
    # One date per row of the selected frame.
    X1['date'] = pd.date_range(start='1/1/2020', periods=len(X1), freq='D')
    X1=X1.set_index('date')
    return X1
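If you want to try the selection logic without typing at a prompt, here is a rough sketch of the same idea with the input() calls replaced by parameters. The function name select_timeslice_columns and the demo frames are made up for illustration; swap in your real list of DataFrames.

```python
import pandas as pd

def select_timeslice_columns(datasets, timeslices_number, process_number, core_number):
    """Pick the frame at index `timeslices_number` and keep the matching columns."""
    if not 0 <= timeslices_number < len(datasets):
        return None
    column_name = (
        f"timeslices_{timeslices_number}__profilerdataprocess_{process_number}"
        f"__{core_number}_us"
    )
    dataset = datasets[timeslices_number]
    matching = [col for col in dataset.columns if column_name in col]
    result = dataset[matching].copy()
    # Replace the index with one date per row, starting on 2020-01-01.
    result.index = pd.date_range(
        start="2020-01-01", periods=len(result), freq="D", name="date"
    )
    return result

# Hypothetical demo frames that follow the column naming scheme from the question.
demo_datasets = [
    pd.DataFrame({
        f"timeslices_{n}__profilerdataprocess_45__c0_us_": [1.0, 2.0, 3.0],
        f"timeslices_{n}__profilerdataprocess_46__c1_us_": [4.0, 5.0, 6.0],
    })
    for n in range(2)
]

result = select_timeslice_columns(demo_datasets, 1, 45, "c0")
out_of_range = select_timeslice_columns(demo_datasets, 5, 45, "c0")
```

Keeping input() out of the function means the caller can read the three values however it likes and then pass them in.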