无法在 Python 中获取有关从纳斯达克检索数据的帮助



我计划使用纳斯达克的数据进行一些金融研究和学习。

我想从纳斯达克检索数据,以便标头具有以下内容:

股票代码
公司名称
最后销售
市值
首次公开募股 年份
行业 行业

最后更新

我使用Python代码通过以下方式获取"公司列表和股票代码名称":

import pandas as pd
import json
PACKAGE_NAME = 'nasdaq-listings'
PACKAGE_TITLE = 'Nasdaq Listings'
nasdaq_listing = 'ftp://ftp.nasdaqtrader.com/symboldirectory/nasdaqlisted.txt'# Nasdaq only

def process():
nasdaq = pd.read_csv(nasdaq_listing,sep='|')
nasdaq = _clean_data(nasdaq)
# Create a few other data sets
nasdaq_symbols = nasdaq[['Symbol','Company Name']] # Nasdaq  w/ 2 columns
# (dataframe, filename) datasets we will put in schema & create csv
datasets = [(nasdaq,'nasdaq-listed'), (nasdaq_symbols,'nasdaq-listed-symbols')]
for df, filename in datasets:
df.to_csv('data/' + filename + '.csv', index=False)
with open("datapackage.json", "w") as outfile:
json.dump(_create_datapackage(datasets), outfile, indent=4, sort_keys=True)

def _clean_data(df):
# TODO: do I want to save the file creation time (last row)
df = df.copy()
# Remove test listings
df = df[df['Test Issue'] == 'N']
# Create New Column w/ Just Company Name
df['Company Name'] = df['Security Name'].apply(lambda x: x.split('-')[0]) #nasdaq file uses - to separate stock type
#df['Company Name'] = TODO, remove stock type for otherlisted file (no separator)
# Move Company Name to 2nd Col
cols = list(df.columns)
cols.insert(1, cols.pop(-1))
df = df.loc[:, cols]
return df

def _create_file_schema(df, filename):
fields = []
for name, dtype in zip(df.columns,df.dtypes):
if str(dtype) == 'object' or str(dtype) == 'boolean': # does datapackage.json use boolean type?
dtype = 'string'
else:
dtype = 'number'
fields.append({'name':name, 'description':'', 'type':dtype})
return {
'name': filename,
'path': 'data/' + filename + '.csv',
'format':'csv',
'mediatype': 'text/csv',
'schema':{'fields':fields}
}

def _create_datapackage(datasets):
resources = []
for df, filename in datasets:
resources.append(_create_file_schema(df,filename))
return {
'name': PACKAGE_NAME,
'title': PACKAGE_TITLE,
'license': '',
'resources': resources,
}

process()

现在,对于这些符号中的每一个,我想获取其他数据(如上所示(。

有没有办法我可以做到这一点?

你看过熊猫数据阅读器吗?您也许可以从那里获取其他数据。它有多个数据源,例如谷歌、雅虎财经、

http://pandas-datareader.readthedocs.io/en/latest/remote_data.html#remote-data-google

最新更新