How to use a for loop to get stock prices at 1-minute intervals for the past 2 years



Python newbie here. I have 7 assets. I found the daily adjusted close prices for the past two years, but I need minute-by-minute data. This is what I have so far:

import pandas as pd
import yfinance as yf
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
def dl_data(i):
    years = 2
    end = dt.datetime.today()
    start = end - dt.timedelta(365*years)
    tickers = ["SBUX", "MCD", "CMG", "WEN", "DPZ", "YUM", "DENN"]
    return(yf.download(tickers, start, end)['Adj Close'])
data3 = yf.download(tickers, period='7d', interval='1m')['Adj Close']
for i in range(1,504):
    data3 = data3.append(dl_data(i))

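Python only allows the 1m interval within a 7-day period; otherwise I get an error message. So I decided to use a loop to append to the original dataset. However, when I run data3.head() after the loop finishes, the earliest date it goes back to is November 8, 2021. My understanding is that for i in range(1,504) executes the loop for the past 504 days, right? If not, how else can I do this?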

yfinance does not let you download 1-minute historical data that is more than 30 days old, and it only lets you download 7 days of it per request.

This is the error you get if you try:

1 Failed download:
- EURUSD=X: 1m data not available for startTime=1654387200 and endTime=1654905600. The requested range must be within the last 30 days.
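The two limits can be written down as a small check before any request is sent. This is only a sketch: the constants come from the limits stated in this answer, not from yfinance itself, and `fits_1m_limits` is a hypothetical helper name.

```python
import datetime as dt

# Assumed limits, taken from the answer text rather than from the library
MAX_LOOKBACK_DAYS = 30  # 1m data must start within the last 30 days
MAX_SPAN_DAYS = 7       # at most 7 days of 1m data per request


def fits_1m_limits(start, end, today):
    """Return True if the date range [start, end) respects both limits."""
    within_lookback = (today - start).days <= MAX_LOOKBACK_DAYS
    within_span = 0 < (end - start).days <= MAX_SPAN_DAYS
    return within_lookback and within_span


today = dt.date(2022, 7, 5)
# A 6-day window that starts 29 days back passes the check
ok_window = fits_1m_limits(dt.date(2022, 6, 6), dt.date(2022, 6, 12), today)
# The two-year range from the question does not
two_years = fits_1m_limits(today - dt.timedelta(365 * 2), today, today)
print(ok_window, two_years)
```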

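Because of yfinance's restrictions I can't fully answer your question, but I can show you how to get up to 30 days of 1m data. This is how I would do it if I were forced to use yfinance. Because of these restrictions, I usually use something else, such as alpaca or binance.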
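As for the loop in the question: `dl_data(i)` never uses `i`, so every iteration asks for the same two-year daily range, and `range(1, 504)` runs 503 times rather than 504. `head()` then shows the first rows of `data3`, which come from the initial 7-day 1m download, and that would explain why the earliest date is so recent. The sketch below mirrors only the date arithmetic of `dl_data` without any download (`window_for` is a hypothetical stand-in) to show that all iterations produce one and the same window.

```python
import datetime as dt


def window_for(i, today, years=2):
    """Mirror the date logic of dl_data from the question; note i is unused."""
    end = today
    start = end - dt.timedelta(365 * years)
    return start, end


today = dt.date(2022, 7, 5)
# Collect the distinct (start, end) pairs produced across the whole loop
distinct_windows = {window_for(i, today) for i in range(1, 504)}
iterations = len(range(1, 504))
print(iterations, len(distinct_windows))
```

Separately, `DataFrame.append` was removed in pandas 2.0, so collecting the frames in a list and calling `pd.concat` once is the approach that still works.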

First, make a DataFrame containing the date ranges

import pandas as pd
TODAY = pd.to_datetime("today").date()
START = (TODAY - pd.DateOffset(days=29)).date()
# Reference: https://stackoverflow.com/a/48131963/16051077
d1 = pd.date_range(start=START, end=TODAY, freq="7D")
d2 = d1.shift(6, freq="d")
# fix the end date (make sure the last end_date does not go past TODAY)
d2 = list(d2)[:-1] + [min(d2[-1], pd.Timestamp(TODAY))]
dates = pd.DataFrame(
    data=dict(start_date=d1, end_date=d2), columns=("start_date", "end_date")
)

Output:

    start_date  end_date
0   2022-06-06  2022-06-12
1   2022-06-13  2022-06-19
2   2022-06-20  2022-06-26
3   2022-06-27  2022-07-03
4   2022-07-04  2022-07-05

Use the dates DataFrame in a for loop

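import yfinance as yf
tickers = ["TSLA", "MSFT", "AMZN"]
df_list = []
# download each window (at most 7 days) separately, then stitch them together
for i in dates.index:
    start = dates.at[i, "start_date"]
    end = dates.at[i, "end_date"]
    # depending on the yfinance version, auto_adjust=False may be needed for "Adj Close" to exist
    df = yf.download(tickers, start=start, end=end, interval="1m")["Adj Close"]
    df_list.append(df)
history = pd.concat(df_list)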

Output:

Note: because of weekends and market holidays, the data does not include every date

[*********************100%***********************]  3 of 3 completed
[*********************100%***********************]  3 of 3 completed
[*********************100%***********************]  3 of 3 completed
[*********************100%***********************]  3 of 3 completed
[*********************100%***********************]  3 of 3 completed
                            AMZN        MSFT        TSLA
2022-06-06 09:30:00-04:00   125.574501  273.179993  731.722900
2022-06-06 09:31:00-04:00   125.190002  273.500000  730.260010
2022-06-06 09:32:00-04:00   124.559998  273.190002  727.300110
2022-06-06 09:33:00-04:00   124.167503  273.519989  726.155029
2022-06-06 09:34:00-04:00   124.719902  273.220001  723.989990
... ... ... ...
2022-07-01 15:56:00-04:00   109.489998  259.029999  680.890015
2022-07-01 15:57:00-04:00   109.389999  259.079987  680.869995
2022-07-01 15:58:00-04:00   109.474998  259.369995  680.710022
2022-07-01 15:59:00-04:00   109.550003  259.539001  681.890015
2022-07-01 16:00:00-04:00   109.559998  259.579987  681.789978
7409 rows × 3 columns
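One detail worth checking: yfinance documents the `end` argument of `download` as exclusive, so a window whose end_date is six days after its start_date covers only six calendar days, and the day between two consecutive windows is never requested. In the output above this is probably why the last row is from 2022-07-01 even though the last window ends on 2022-07-05. A possible alternative, sketched here with the standard library only, is to make the windows contiguous so that each end is the next start. `contiguous_windows` is a hypothetical helper, and whether a full 7-day span is accepted per request is an assumption; if it is rejected, lower `window_days`.

```python
import datetime as dt


def contiguous_windows(today, lookback_days=29, window_days=7):
    """Return (start, end) date pairs covering [today - lookback_days, today + 1 day).

    Each end is exclusive and equals the next window's start, so no day is skipped.
    """
    start = today - dt.timedelta(days=lookback_days)
    stop = today + dt.timedelta(days=1)  # exclusive end, so today itself is included
    windows = []
    while start < stop:
        end = min(start + dt.timedelta(days=window_days), stop)
        windows.append((start, end))
        start = end
    return windows


windows = contiguous_windows(dt.date(2022, 7, 5))
for start, end in windows:
    print(start, end)
```

Each pair could then be passed as `start=` and `end=` to `yf.download(..., interval="1m")` in the same kind of loop as above.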

Full code:

import pandas as pd
import yfinance as yf
TODAY = pd.to_datetime("today").date()
START = (TODAY - pd.DateOffset(days=29)).date()
# Reference: https://stackoverflow.com/a/48131963/16051077
d1 = pd.date_range(start=START, end=TODAY, freq="7D")
d2 = d1.shift(6, freq="d")
# fix the end date (make sure the last end_date does not go past TODAY)
d2 = list(d2)[:-1] + [min(d2[-1], pd.Timestamp(TODAY))]
dates = pd.DataFrame(
    data=dict(start_date=d1, end_date=d2), columns=("start_date", "end_date")
)
tickers = ["TSLA", "MSFT", "AMZN"]
df_list = []
# download each window (at most 7 days) separately, then stitch them together
for i in dates.index:
    start = dates.at[i, "start_date"]
    end = dates.at[i, "end_date"]
    # depending on the yfinance version, auto_adjust=False may be needed for "Adj Close" to exist
    df = yf.download(tickers, start=start, end=end, interval="1m")["Adj Close"]
    df_list.append(df)
history = pd.concat(df_list)