When using the pandas_datareader library to fetch data for all S&P 500 companies, I can't seem to get past Berkshire Hathaway.
Here is the code:
import bs4 as bs
import pickle
import datetime as dt
import os
import pandas as pd
import pandas_datareader.data as web
import requests

def save_sp500_tickers():
    resp = requests.get('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    soup = bs.BeautifulSoup(resp.text, "1xml")
    table = soup.find('table', {'class': 'wikitable sortable'})
    tickers = []
    for row in table.findAll('tr')[1:]:
        ticker = row.findAll('td')[0].text
        mapping = str.maketrans('.', '-')
        ticker = ticker.translate(mapping)
        tickers.append(ticker)
    with open("sp500tickers.pickle", "wb") as f:
        pickle.dump(tickers, f)
    print(tickers)
    return tickers

#save_sp500_tickers()

def get_data_from_yahoo(reload_sp500 = False):
    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')
    start = dt.datetime(2000, 1, 1)
    end = dt.datetime(2016, 12, 31)
    for ticker in tickers:
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            df = web.DataReader(ticker, 'yahoo', start, end)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have {}'.format(ticker))

get_data_from_yahoo()
Yes, I'm following the pythonprogramming.net course. Any help would be greatly appreciated! Thanks!
The Wikipedia page uses "." as the separator character in ticker symbols, whereas Yahoo uses "-". Replacing the character is enough:
df = web.DataReader(ticker.replace('.','-'), 'yahoo', start, end)
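Edit: Your code is fine; I tested it and it works. When I ran it, it fetched "BRK-B", whereas yours ended up as "BRK.B", which apparently doesn't exist. I don't know why you have "1xml" there when it should be "lxml".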
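
If you'd rather keep the symbol cleanup in one place, here is a minimal sketch of the same replacement as a helper. The name to_yahoo_symbol is made up for illustration, and the strip() call is my own assumption (scraped table cells sometimes carry a trailing newline); it is not something your code or the tutorial does.

```python
def to_yahoo_symbol(raw_ticker: str) -> str:
    """Convert a ticker scraped from Wikipedia into Yahoo's spelling.

    Hypothetical helper, not part of pandas_datareader or the tutorial.
    """
    # Assumption: scraped cells may have surrounding whitespace or a newline.
    cleaned = raw_ticker.strip()
    # Wikipedia writes share classes with a dot (BRK.B), Yahoo with a dash (BRK-B).
    return cleaned.replace('.', '-')


# Sample inputs chosen for illustration only.
scraped = ['MMM', 'BRK.B\n', ' BF.B']
yahoo_symbols = [to_yahoo_symbol(t) for t in scraped]
print(yahoo_symbols)
```

You could then call web.DataReader(to_yahoo_symbol(ticker), 'yahoo', start, end) inside the loop, and use the same cleaned symbol for the CSV filename so the "Already have" check stays consistent.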