Finding the largest values across multiple DataFrames in Python



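I have 1000+ .txt files containing stock dates and prices, which I have loaded into a dictionary (with the file name, i.e. the stock ticker, as the key and each file's data as a DataFrame). I use .rolling to compute a moving average and then find the percentage difference between the moving average and the price, so the percentage difference is its own column in each DataFrame. The code for all of this looks like this: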

import os
import pandas as pd

filelist = os.listdir(r'Insert File Path')
filepath = r'Insert File Path'

dic1 = {}
for file in filelist:
    # The separator is presumably a tab ('\t'); a bare 't' would split on the letter t
    df = pd.read_csv(os.path.join(filepath, file), sep='\t')
    dic1[file] = df
for value in dic1.values():
    value.rename(columns={value.columns[0]: 'Dates', value.columns[1]: 'Prices'}, inplace=True)
for value in dic1.values():
    value['ma'] = value['Prices'].rolling(window=50).mean()
for value in dic1.values():
    value['diff'] = value['Prices'] - value['ma']
for value in dic1.values():
    value['pctdiff'] = value['diff'] / value['Prices']

My question is: how can I find the top 5 largest (and also smallest, since the values can be negative) values across the pctdiff columns?

I tried:

for df in dic1.values():
    for num in df['pctdiff'].max():
        print(num.max())

But I got the following error: 'float' object is not iterable
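For context, the error comes from `df['pctdiff'].max()` returning a single number rather than a sequence, so the inner `for` has nothing to loop over. Below is a minimal sketch with made-up values (the Series is an assumption, not data from the question); `nlargest` and `nsmallest` return a Series and can be iterated instead.

```python
import pandas as pd

# Made-up values standing in for one file's pctdiff column
pctdiff = pd.Series([0.02, -0.15, 0.07, None, 0.11])

# max() collapses the column to one scalar (NaN is skipped by default)
col_max = pctdiff.max()

error_message = None
try:
    for num in col_max:  # same pattern as the attempt above
        print(num)
except TypeError as exc:
    error_message = str(exc)

# nlargest/nsmallest keep the result as a Series, which is iterable
top2 = pctdiff.nlargest(2)
bottom2 = pctdiff.nsmallest(2)
```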

Is this what you mean?

list_result = []
for key, value in dic1.items():
    value.rename(columns={value.columns[0]: 'Dates', value.columns[1]: 'Prices'}, inplace=True)
    value['ma'] = value['Prices'].rolling(window=50).mean()
    value['diff'] = value['Prices'] - value['ma']
    value['pctdiff'] = value['diff'] / value['Prices']
    # Record each file's single largest pctdiff together with its key
    list_result.append([key, value['pctdiff'].max()])
# Sort ascending by that per-file maximum
list_result.sort(key=lambda x: x[1])
highest_list = list_result[-5:]
# Note: these are the 5 smallest per-file maxima, not the 5 most negative values
smallest_list = list_result[:5]
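If the goal is the five largest and five smallest individual values across all files, rather than one value per file, one possible approach is to stack every pctdiff column into a single Series and call `nlargest`/`nsmallest` on it. This is only a sketch: the small `dic1` below is made-up stand-in data, and the level names `file` and `row` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dic1: file name -> DataFrame with a 'pctdiff' column.
# The leading NaN mimics rows where the rolling mean is not yet defined.
dic1 = {
    "AAA.txt": pd.DataFrame({"pctdiff": [np.nan, 0.10, -0.30, 0.05]}),
    "BBB.txt": pd.DataFrame({"pctdiff": [np.nan, 0.40, -0.02, 0.20]}),
    "CCC.txt": pd.DataFrame({"pctdiff": [np.nan, -0.50, 0.01, 0.15]}),
}

# Stack all pctdiff columns into one Series indexed by (file, row)
all_pct = pd.concat(
    {name: df["pctdiff"] for name, df in dic1.items()},
    names=["file", "row"],
).dropna()

top5 = all_pct.nlargest(5)      # five largest values, with file and row labels
bottom5 = all_pct.nsmallest(5)  # five smallest (most negative) values

print(top5)
print(bottom5)
```

Because the index keeps the file name, each extreme value can be traced back to the stock it came from.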

Just to make the code cleaner, add all the columns in a single for loop instead of four:

import os
import pandas as pd

filelist = os.listdir(r'Insert File Path')
filepath = r'Insert File Path'
dic1 = {}
for file in filelist:
    # The separator is presumably a tab ('\t'); a bare 't' would split on the letter t
    df = pd.read_csv(os.path.join(filepath, file), sep='\t')
    dic1[file] = df
for value in dic1.values():
    value.rename(columns={value.columns[0]: 'Dates', value.columns[1]: 'Prices'}, inplace=True)
    value['ma'] = value['Prices'].rolling(window=50).mean()
    value['diff'] = value['Prices'] - value['ma']
    value['pctdiff'] = value['diff'] / value['Prices']

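Then use the approach from @Edchum's answer to sort pctdiff by absolute value (converting the object to a pandas Series first if it is something else). Something like this, if you want to store it sorted: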

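...
for value in dic1.values():
    ...
    pctdiff = value['diff'] / value['Prices']
    # Reorder the Series so that it is sorted by absolute value
    pctdiff = pctdiff.reindex(pctdiff.abs().sort_values().index)
    # Caution: column assignment aligns on the index, so the column itself
    # ends up in the frame's original row order, not the sorted order
    value['pctdiff'] = pctdiff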
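A small sketch of the absolute-value sort on made-up numbers may help here. It assumes a single hypothetical frame `df` in place of one entry of `dic1`, and it shows that keeping the sorted order means reordering the whole frame rather than assigning the sorted Series back to a column. The `key` argument of `sort_values` requires pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical frame standing in for one entry of dic1
df = pd.DataFrame({
    "Prices": [10.0, 20.0, 40.0, 50.0],
    "diff": [1.0, -8.0, 2.0, -10.0],
})
df["pctdiff"] = df["diff"] / df["Prices"]

# Row labels ordered by ascending |pctdiff|
order = df["pctdiff"].abs().sort_values().index
sorted_pct = df["pctdiff"].reindex(order)

# Assigning back aligns on the index, so the frame's row order does not change
df["pctdiff"] = sorted_pct

# To actually keep the ordering, reorder the whole frame
df_sorted = df.reindex(order)

# Equivalent alternative using sort_values with a key (pandas >= 1.1)
df_sorted_alt = df.sort_values("pctdiff", key=lambda s: s.abs())

print(df_sorted)
```

Taking `df_sorted.tail(5)` would then give the five rows with the largest absolute pctdiff for that stock.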
