Python/Pandas - Calling a specific column from a dictionary of dataframes



>I am trying to call a specific column of a dataframe from a list of dataframes created from a directory of csv files

I have a dictionary of dataframes created from multiple csv files:

df_dict = {x: pd.read_csv('{}'.format(x)) for x in files}

where files is the list of paths to the CSV files.

This code works fine and produces the output I want (shown below).

import pandas_datareader.data as web
import pandas as pd
import datetime as dt
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
from io import StringIO
import time
import csv
from operator import itemgetter
import glob
import os
#File counter
path, dirs, filess = next(os.walk('C:/Users/eagles/Desktop/Financials'))
file_count = len(filess)
#File location & dataframe creation
files = glob.glob('C:/Users/eagles/Desktop/Financials/*.csv')
df_dict = {x: pd.read_csv('{}'.format(x)) for x in files}
#Print each dataframe from directory
for x in range(file_count):
    print(df_dict[files[x]])
{'C:/Users/eagles/Desktop/Financials\DELL.csv':           Date Symbols  Adj Close    ...           Low       Open    Volume
0   2019-08-20    DELL  48.470001    ...     48.189999  48.720001   1842700
1   2019-08-21    DELL  48.980000    ...     48.549999  48.669998   1389500
2   2019-08-22    DELL  49.040001    ...     48.419998  49.430000   1619800
3   2019-08-23    DELL  45.810001    ...     45.730000  47.750000   3670700
4   2019-08-26    DELL  46.410000    ...     45.950001  46.000000   2040900
..         ...     ...        ...    ...           ...        ...       ...
83  2019-12-17    DELL  49.930000    ...     49.560001  49.720001   2366600
84  2019-12-18    DELL  50.000000    ...     49.619999  50.000000   2302800
85  2019-12-19    DELL  49.889999    ...     49.779999  50.060001   1518000
86  2019-12-20    DELL  49.619999    ...     49.509998  50.299999   2704500
[87 rows x 8 columns], 'C:/Users/eagles/Desktop/Financials\EBS.csv':           Date Symbols  Adj Close   ...           Low       Open   Volume
0   2019-08-20     EBS  42.900002   ...     42.880001  44.020000   312300
1   2019-08-21     EBS  42.099998   ...     41.400002  43.509998   372000
2   2019-08-22     EBS  41.599998   ...     41.310001  42.380001   365900
3   2019-08-23     EBS  40.820000   ...     40.590000  41.680000   347800
..         ...     ...        ...   ...           ...        ...      ...
83  2019-12-17     EBS  52.680000   ...     51.169998  51.610001   265800
84  2019-12-18     EBS  52.340000   ...     51.320000  52.540001   374300
85  2019-12-19     EBS  53.919998   ...     51.689999  52.430000   250600
86  2019-12-20     EBS  54.419998   ...     53.590000  53.900002   817200
[87 rows x 8 columns], 'C:/Users/eagles/Desktop/Financials\GRPN.csv':           Date Symbols  Adj Close  Close  High   Low  Open    Volume
0   2019-08-20    GRPN       2.51   2.51  2.54  2.41  2.43   7692500
1   2019-08-21    GRPN       2.51   2.51  2.54  2.48  2.54   5141800
2   2019-08-22    GRPN       2.47   2.47  2.67  2.46  2.49   9225700
3   2019-08-23    GRPN       2.40   2.40  2.47  2.37  2.45   8404700
..         ...     ...        ...    ...   ...   ...   ...       ...
83  2019-12-17    GRPN       2.39   2.39  2.57  2.35  2.54  18253900
84  2019-12-18    GRPN       2.39   2.39  2.44  2.38  2.42   4645600
85  2019-12-19    GRPN       2.28   2.28  2.40  2.24  2.39  11894500
86  2019-12-20    GRPN       2.23   2.23  2.29  2.21  2.28  11354400

I can also call a specific dataframe using df_dict[files[0]]

What I cannot do is call a specific dataframe and a specific column at the same time.

The output I would like to see contains only the 'Close' column of each dataframe:

#Dell
df_dict[i]['Close']
Date          Close
0   2019-08-20  48.470001 
1   2019-08-21  48.980000 
2   2019-08-22  49.040001
3   2019-08-23  45.810001 
4   2019-08-26  46.410000
#EBS
df_dict[i+1]['Close']
Date          Close
0   2019-08-20 48.470001 
1   2019-08-21 48.980000 
2   2019-08-22 49.040001
3   2019-08-23 45.810001 
4   2019-08-26 46.410000
#GRPN
df_dict[i+2]['Close']
Date          Close
0   2019-08-20  48.470001 
1   2019-08-21  48.980000 
2   2019-08-22  49.040001
3   2019-08-23  45.810001 
4   2019-08-26  46.410000

Does anyone have a suggestion for how I can achieve this?

I could not reproduce similar data, but you can try the following code:

for x in range(file_count):
    print(df_dict[files[x]][['Date', 'Close']])
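
As a rough illustration of the same idea, the sketch below iterates over the dictionary directly instead of indexing by position. It is not tested against the asker's files: the two in-memory CSV strings, their shortened paths, and the names sample_csvs, close_by_ticker and dell_close are made up for the example. The point it tries to show is that the keys of df_dict are path strings, so a lookup like df_dict[i] with an integer i would raise a KeyError, while chaining the path key and the column name works.

```python
import io
import os

import pandas as pd

# Hypothetical stand-ins for the CSV files on disk. In the question,
# df_dict is built with pd.read_csv over the paths returned by glob.glob.
sample_csvs = {
    "Financials/DELL.csv": "Date,Symbols,Close\n2019-08-20,DELL,48.47\n2019-08-21,DELL,48.98\n",
    "Financials/EBS.csv": "Date,Symbols,Close\n2019-08-20,EBS,42.90\n2019-08-21,EBS,42.10\n",
}
df_dict = {path: pd.read_csv(io.StringIO(text)) for path, text in sample_csvs.items()}

# Keys are file paths (strings), so iterate over items() rather than
# indexing with integers.
close_by_ticker = {}
for path, df in df_dict.items():
    # Derive a short label such as "DELL" from the file name.
    ticker = os.path.splitext(os.path.basename(path))[0]
    # Double brackets select a list of columns and return a DataFrame.
    close_by_ticker[ticker] = df[["Date", "Close"]]
    print(ticker)
    print(close_by_ticker[ticker])

# One dataframe and one column can also be picked by chaining the lookups;
# single brackets return a Series.
dell_close = df_dict["Financials/DELL.csv"]["Close"]
```

Iterating over df_dict.items() also avoids relying on file_count, which counts every file in the folder and could exceed the number of CSV paths in files if the folder holds anything else.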
