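The stock quotes I get from yfinance don't start on the first day of each month, so I need to extract the Date and Close of the first available quote of each month from the CSV.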
Date,Open,High,Low,Close,Adj Close,Volume
2021-01-04 00:00:00-05:00,88.0,88.12449645996094,85.35700225830078,86.30650329589844,86.30650329589844,37324000
2021-01-05 00:00:00-05:00,86.25450134277344,87.34149932861328,85.84500122070312,87.00250244140625,87.00250244140625,20360000
[...]
2021-02-01 00:00:00-05:00,92.22949981689453,95.7770004272461,92.22949981689453,94.65350341796875,94.65350341796875,40252000
2021-02-02 00:00:00-05:00,95.65650177001953,97.46849822998047,95.31849670410156,95.95600128173828,95.95600128173828,66332000
2021-02-03 00:00:00-05:00,103.28050231933594,105.33100128173828,100.67749786376953,102.94400024414062,102.94400024414062,97882000
2021-02-04 00:00:00-05:00,103.03099822998047,103.46499633789062,101.75499725341797,102.68150329589844,102.68150329589844,48596000
[...]
2021-04-29 00:00:00-04:00,119.46150207519531,120.20800018310547,118.69249725341797,119.63800048828125,119.63800048828125,41234000
2021-04-30 00:00:00-04:00,118.4010009765625,119.09249877929688,117.3280029296875,117.67500305175781,117.67500305175781,44856000
2021-05-03 00:00:00-04:00,118.24549865722656,119.07749938964844,116.7750015258789,117.15399932861328,117.15399932861328,28242000
2021-05-04 00:00:00-04:00,115.67749786376953,116.24949645996094,112.83399963378906,115.34149932861328,115.34149932861328,44818000
So far I have only managed to get the days whose number starts with 0, but that is still not what I want:
import csv
with open("quotes.csv", "r") as file:
    data = csv.reader(file)
    for line in data:
        if len(line[0]) > 4 and int(line[0][8]) < 1:
            print([line[0], line[4]])
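Here is an example of what I want to get from the sample CSV above (I skipped a few months just for convenience; I still need one day for every month):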
2021-01-04 00:00:00-05:00, 86.30650329589844
2021-02-01 00:00:00-05:00, 94.65350341796875
2021-05-03 00:00:00-04:00, 117.15399932861328
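I suggest converting the first value into a Python datetime object. You can then easily test the date however you need. For example, the following approach detects any change of month: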
from datetime import datetime
import csv

with open("quotes.csv", "r", newline="") as file:
    data = csv.reader(file)
    header = next(data)  # skip the header row
    cur_month = None
    for line in data:
        try:
            date = datetime.strptime(line[0][:19], "%Y-%m-%d %H:%M:%S")
            # compare (year, month) so the same month in a different year still counts as a change
            if (date.year, date.month) != cur_month:
                cur_month = (date.year, date.month)
                print(line[0], line[4])
        except (ValueError, IndexError):
            pass  # skip over badly formatted dates or empty rows
For the given example, it would print:
2021-01-04 00:00:00-05:00 86.30650329589844
2021-02-01 00:00:00-05:00 94.65350341796875
2021-04-29 00:00:00-04:00 119.63800048828125
2021-05-03 00:00:00-04:00 117.15399932861328
(The 2021-04-29 row shows up only because many rows are missing from the sample.)
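If you would rather collect the results than print them, the same idea can be wrapped in a small function. This is only a rough sketch, assuming the rows are in chronological order and the header matches the sample above; `first_quote_per_month` and `sample` are names made up for illustration, and the function takes any iterable of text lines so it can be tried without a file on disk.

```python
import csv
import io
from datetime import datetime


def first_quote_per_month(lines):
    """Return (Date, Close) for the first row seen in each (year, month).

    Hypothetical helper: assumes rows are sorted by date and that the
    header contains "Date" and "Close" columns, as in the sample CSV.
    """
    reader = csv.DictReader(lines)
    seen = set()      # (year, month) pairs already emitted
    result = []
    for row in reader:
        raw_date = row.get("Date") or ""
        try:
            # Only the calendar date matters here, so parse the first 10 chars.
            day = datetime.strptime(raw_date[:10], "%Y-%m-%d")
        except ValueError:
            continue  # skip placeholders such as "[...]" or malformed rows
        key = (day.year, day.month)
        if key not in seen:
            seen.add(key)
            result.append((raw_date, row["Close"]))
    return result


# A few rows copied from the sample in the question.
sample = """Date,Open,High,Low,Close,Adj Close,Volume
2021-01-04 00:00:00-05:00,88.0,88.12449645996094,85.35700225830078,86.30650329589844,86.30650329589844,37324000
2021-01-05 00:00:00-05:00,86.25450134277344,87.34149932861328,85.84500122070312,87.00250244140625,87.00250244140625,20360000
[...]
2021-02-01 00:00:00-05:00,92.22949981689453,95.7770004272461,92.22949981689453,94.65350341796875,94.65350341796875,40252000
2021-02-02 00:00:00-05:00,95.65650177001953,97.46849822998047,95.31849670410156,95.95600128173828,95.95600128173828,66332000
"""

for date, close in first_quote_per_month(io.StringIO(sample)):
    print(date, close)
```

For a real file you could pass the open file object instead, e.g. `first_quote_per_month(open("quotes.csv", newline=""))`. Keeping a set of seen months rather than a single "current month" variable means each month is reported at most once, though the row returned is only the earliest one if the file really is sorted by date.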