I currently have a DataFrame like this (Ticker and Date are the index):
Ticker Date Open Low High Close Adj. Close Shares Outstanding
A 2007-01-25 34.99 34.05 35.48 34.30 NaN
2007-01-26 34.30 33.46 34.60 34.41 NaN
....
2007-06-27 34.30 34.00 34.40 34.09 NaN
2007-06-28 33.98 33.68 34.08 33.97 NaN
2007-07-01 34.08 33.63 34.32 34.01 NaN
B 2007-01-12 34.99 34.05 35.48 34.30 NaN
...
2007-08-27 34.30 33.46 34.60 34.41 NaN
2007-08-28 34.30 34.00 34.40 34.09 NaN
2007-09-01 33.98 33.68 34.08 33.97 NaN
2007-09-02 34.08 33.63 34.32 34.01 NaN
C 2007-03-28 34.99 34.05 35.48 34.30 NaN
2007-03-29 34.30 33.46 34.60 34.41 NaN
2007-04-01 34.30 34.00 34.40 34.09 NaN
2007-04-02 33.98 33.68 34.08 33.97 NaN
2007-04-03 34.08 33.63 34.32 34.01 NaN
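For each ticker there are daily stock prices spanning several years. However, I only want the quarter-end price for each company, collected into a single DataFrame of quarter-end prices for all companies. I would also like to split the result into a separate DataFrame per quarter, each covering all companies.
For example: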
Ticker Date Open Low High Close Adj. Close Shares Outstanding
A 2007-06-30 34.99 34.05 35.48 34.30 NaN
B 2007-06-30 34.30 33.46 34.60 34.41 NaN
....
c 2007-06-30 34.30 34.00 34.40 34.09 NaN
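One problem is that the stock market is closed on weekends, so the last trading day of a quarter for a given ticker may not be the calendar quarter end. For example, Q2 ends on 6/30, but the last trading day of that quarter might be 6/29. Is there a way to find the last available day within each quarter for a given date range?
You just need a simple GroupBy: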
quarter = pd.PeriodIndex(df['Date'], freq='Q', name='Quarter')
result = df.groupby(['Ticker', quarter]).last()
To get the data for a specific quarter:
result.loc[('A', '2019Q1')]
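The snippet above reads `df['Date']`, which assumes Date is a regular column. In the question, Ticker and Date are the index, so a minimal sketch of the same idea is to move them into columns first. The sample values below are made up for illustration, and `flat` is a hypothetical name. The rows are sorted by date because `last()` returns the final row in each group as stored, not the latest date.

```python
import pandas as pd

# Hypothetical sample data, indexed by (Ticker, Date) as in the question
df = pd.DataFrame(
    {
        "Ticker": ["A", "A", "A", "B", "B"],
        "Date": pd.to_datetime(
            ["2007-06-28", "2007-06-29", "2007-07-02", "2007-06-29", "2007-09-28"]
        ),
        "Close": [33.97, 34.10, 34.01, 34.41, 34.09],
    }
).set_index(["Ticker", "Date"])

# Move the index levels into columns and make sure rows are in date order
flat = df.reset_index().sort_values(["Ticker", "Date"])

# Label every row with its calendar quarter
quarter = pd.PeriodIndex(flat["Date"], freq="Q", name="Quarter")

# Last available row per ticker and quarter; Date stays as a column,
# so the actual last trading day is preserved in the result
result = flat.groupby(["Ticker", quarter]).last()
print(result)
```

Because Date is kept as a column, each row of `result` shows which trading day was actually used for that quarter, for example 2007-06-29 rather than 2007-06-30.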
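I would first use groupby and last to collect the last row of each ticker in each quarter into a temporary DataFrame, and then extract an individual DataFrame for each quarter from it: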
tmp = df.groupby(['Ticker', pd.PeriodIndex(df['Date'], freq='Q', name='Quarter')]
     ).last().reset_index(level=0)
individual_df = {str(i): group.reset_index(drop=True) for i, group in tmp.groupby('Quarter')}
For your sample data, it gives:
>>> pprint.pprint(individual_df)
{'2007Q1': Ticker Date Open Low High Close Adj. Close Shares Outstanding
0 A 2007-01-26 34.30 33.46 34.60 34.41 NaN
1 B 2007-01-12 34.99 34.05 35.48 34.30 NaN
2 C 2007-03-29 34.30 33.46 34.60 34.41 NaN,
'2007Q2': Ticker Date Open Low High Close Adj. Close Shares Outstanding
0 A 2007-06-28 33.98 33.68 34.08 33.97 NaN
1 C 2007-04-03 34.08 33.63 34.32 34.01 NaN,
'2007Q3': Ticker Date Open Low High Close Adj. Close Shares Outstanding
0 A 2007-07-01 34.08 33.63 34.32 34.01 NaN
1 B 2007-09-02 34.08 33.63 34.32 34.01 NaN}
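For reference, here is a self-contained sketch of the per-quarter split that can be run end to end. The sample data is invented, and `per_quarter` is a hypothetical name. It assumes Ticker and Date are ordinary columns, and it groups on the scalar key `'Quarter'` so that each dictionary key is a plain string such as `'2007Q2'`.

```python
import pandas as pd

# Hypothetical sample data with Ticker and Date as ordinary columns
flat = pd.DataFrame(
    {
        "Ticker": ["A", "A", "A", "B", "B"],
        "Date": pd.to_datetime(
            ["2007-06-28", "2007-06-29", "2007-07-02", "2007-06-29", "2007-09-28"]
        ),
        "Close": [33.97, 34.10, 34.01, 34.41, 34.09],
    }
).sort_values(["Ticker", "Date"])

quarter = pd.PeriodIndex(flat["Date"], freq="Q", name="Quarter")

# One row per (Ticker, Quarter), then flatten both levels back into columns
tmp = flat.groupby(["Ticker", quarter]).last().reset_index()

# One DataFrame per quarter, keyed by the quarter's string form
per_quarter = {
    str(q): group.drop(columns="Quarter").reset_index(drop=True)
    for q, group in tmp.groupby("Quarter")
}

for name, frame in per_quarter.items():
    print(name)
    print(frame)
```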