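I am trying to call a specific column in a dataframe from a list of dataframes created from a directory of CSV files
I have a dictionary of dataframes created from multiple CSV files: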
df_dict = {x: pd.read_csv('{}'.format(x)) for x in files}
where files is the list of paths to each CSV.
This code works well and produces the output I want (see below).
import pandas_datareader.data as web
import pandas as pd
import datetime as dt
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
from io import StringIO
import time
import csv
from operator import itemgetter
import glob
import os
#File counter
path, dirs, filess = next(os.walk('C:/Users/eagles/Desktop/Financials'))
file_count = len(filess)
#File location & dataframe creation
files = glob.glob('C:/Users/eagles/Desktop/Financials/*.csv')
df_dict = {x: pd.read_csv('{}'.format(x)) for x in files}
#Print each dataframe from directory
for x in range(file_count):
    print(df_dict[files[x]])
{'C:/Users/eagles/Desktop/Financials\DELL.csv': Date Symbols Adj Close ... Low Open Volume
0 2019-08-20 DELL 48.470001 ... 48.189999 48.720001 1842700
1 2019-08-21 DELL 48.980000 ... 48.549999 48.669998 1389500
2 2019-08-22 DELL 49.040001 ... 48.419998 49.430000 1619800
3 2019-08-23 DELL 45.810001 ... 45.730000 47.750000 3670700
4 2019-08-26 DELL 46.410000 ... 45.950001 46.000000 2040900
.. ... ... ... ... ... ... ...
83 2019-12-17 DELL 49.930000 ... 49.560001 49.720001 2366600
84 2019-12-18 DELL 50.000000 ... 49.619999 50.000000 2302800
85 2019-12-19 DELL 49.889999 ... 49.779999 50.060001 1518000
86 2019-12-20 DELL 49.619999 ... 49.509998 50.299999 2704500
[87 rows x 8 columns], 'C:/Users/eagles/Desktop/Financials\EBS.csv': Date Symbols Adj Close ... Low Open Volume
0 2019-08-20 EBS 42.900002 ... 42.880001 44.020000 312300
1 2019-08-21 EBS 42.099998 ... 41.400002 43.509998 372000
2 2019-08-22 EBS 41.599998 ... 41.310001 42.380001 365900
3 2019-08-23 EBS 40.820000 ... 40.590000 41.680000 347800
.. ... ... ... ... ... ... ...
83 2019-12-17 EBS 52.680000 ... 51.169998 51.610001 265800
84 2019-12-18 EBS 52.340000 ... 51.320000 52.540001 374300
85 2019-12-19 EBS 53.919998 ... 51.689999 52.430000 250600
86 2019-12-20 EBS 54.419998 ... 53.590000 53.900002 817200
[87 rows x 8 columns], 'C:/Users/eagles/Desktop/Financials\GRPN.csv': Date Symbols Adj Close Close High Low Open Volume
0 2019-08-20 GRPN 2.51 2.51 2.54 2.41 2.43 7692500
1 2019-08-21 GRPN 2.51 2.51 2.54 2.48 2.54 5141800
2 2019-08-22 GRPN 2.47 2.47 2.67 2.46 2.49 9225700
3 2019-08-23 GRPN 2.40 2.40 2.47 2.37 2.45 8404700
.. ... ... ... ... ... ... ... ...
83 2019-12-17 GRPN 2.39 2.39 2.57 2.35 2.54 18253900
84 2019-12-18 GRPN 2.39 2.39 2.44 2.38 2.42 4645600
85 2019-12-19 GRPN 2.28 2.28 2.40 2.24 2.39 11894500
86 2019-12-20 GRPN 2.23 2.23 2.29 2.21 2.28 11354400
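I can also call a specific dataframe using df_dict[files[0]].
What I cannot do is call a specific dataframe and a specific column.
The output I would like to see contains only the 'Close' column of each dataframe: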
#Dell
df_dict[i]['Close']
Date Close
0 2019-08-20 48.470001
1 2019-08-21 48.980000
2 2019-08-22 49.040001
3 2019-08-23 45.810001
4 2019-08-26 46.410000
#EBS
df_dict[i+1]['Close']
Date Close
0 2019-08-20 48.470001
1 2019-08-21 48.980000
2 2019-08-22 49.040001
3 2019-08-23 45.810001
4 2019-08-26 46.410000
#GRPN
df_dict[i+2]['Close']
Date Close
0 2019-08-20 48.470001
1 2019-08-21 48.980000
2 2019-08-22 49.040001
3 2019-08-23 45.810001
4 2019-08-26 46.410000
Does anyone have a suggestion for how I can achieve this?
I can't simulate similar data, but you can try the following code:
for x in range(file_count):
    print(df_dict[files[x]][['Date','Close']])
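Note that the keys of df_dict are the file paths, not integers, so something like df_dict[i]['Close'] raises a KeyError. You either index through the files list as above, or iterate over the dictionary directly. Here is a minimal sketch of the second approach. Since I don't have your CSVs, the fake_files dictionary and its values are made up for illustration and are read through StringIO instead of from disk; close_dict is a name I introduced, not something from your code.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for the CSV files on disk (paths and values are invented).
fake_files = {
    "Financials/DELL.csv": (
        "Date,Symbols,Close,Volume\n"
        "2019-08-20,DELL,48.47,1842700\n"
        "2019-08-21,DELL,48.98,1389500\n"
    ),
    "Financials/EBS.csv": (
        "Date,Symbols,Close,Volume\n"
        "2019-08-20,EBS,42.90,312300\n"
        "2019-08-21,EBS,42.10,372000\n"
    ),
}

# Same shape as df_dict in the question: file path -> DataFrame.
df_dict = {path: pd.read_csv(StringIO(text)) for path, text in fake_files.items()}

# Keep only the Date and Close columns of each dataframe.
# A list of column names inside the brackets returns a DataFrame, not a Series.
close_dict = {path: df[["Date", "Close"]] for path, df in df_dict.items()}

# Iterate over the dictionary itself, so no separate file counter is needed.
for path, close_df in close_dict.items():
    print(path)
    print(close_df)
```

With your real data you would keep your own df_dict and only add the close_dict line and the loop. Iterating over df_dict.items() also avoids the separate file_count from os.walk, which could differ from len(files) if the folder holds anything other than CSV files.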