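Basically, I have this code that works for me. Its purpose is to download, from an API, the whole series of how many times each stock ticker has been mentioned on the wallstreetbets subreddit.
This is the code: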
import requests

tickers = open("ticker_list.txt","r")
for ticker in tickers:
    ticker = ticker.strip()
    url = "https://XXX SENSIBLE INFO/historical/wallstreetbets/"+ticker
    headers = {'XXX (SENSIBLE INFO'}
    r = requests.get(url, headers=headers)
    print(r.content)
The .txt file is a plain list of about 8000 stock tickers.
Here are the first lines of the output, just as an example:
b'[{"Date": "2018-08-10", "Ticker": "AA", "Mentions": 1}, {"Date": "2018-08-28", "Ticker": "AA", "Mentions": 1}, {"Date": "2018-09-07", "Ticker": "AA", "Mentions": 1}, etc...
b'[{"Date": "2020-12-07", "Ticker": "AACQ", "Mentions": 1}, {"Date": "2020-12-08", "Ticker": "AACQ", "Mentions": 1}, {"Date": "2020-12-22", "Ticker": "AACQ", "Mentions": 1},... etc...
b'[{"Date": "2018-08-08", "Ticker": "AAL", "Mentions": 1}, {"Date": "2018-08-20", "Ticker": "AAL", "Mentions": 1}, {"Date": "2018-09-11", "Ticker": "AAL", "Mentions": 1}, .... etc
What I want to do now is store all of the data in a CSV file, so that the resulting table looks like this:
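The data returned from the site is in JSON format, so r.json() can be used to convert it into a Python data structure. Next, two things will help you. First, a Counter can be used to keep track of all the Mentions in the JSON data, and a defaultdict can be used to build an entry for each date that holds the counts for each ticker. A set, all_tickers, can be used to keep track of every ticker seen in the data, and is then used to form the header of the output CSV file.
For example: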
from collections import defaultdict, Counter
from datetime import datetime
import requests
import csv

dates = defaultdict(Counter)   # date -> Counter of {ticker: mentions}
all_tickers = set()            # every ticker seen, used for the CSV header

with open("ticker_list.txt") as tickers:
    for ticker in tickers:
        ticker = ticker.strip()
        url = f"https://XXX SENSIBLE INFO/historical/wallstreetbets/{ticker}"
        headers = {'XXX (SENSIBLE INFO'}
        r = requests.get(url, headers=headers)

        for row in r.json():
            all_tickers.add(row['Ticker'])
            date = datetime.strptime(row['Date'], '%Y-%m-%d')   # convert to datetime format
            dates[date][row['Ticker']] += row['Mentions']

# Write the CSV only after every ticker has been downloaded
with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=['Date', *sorted(all_tickers)])
    csv_output.writeheader()

    for date, values in sorted(dates.items(), key=lambda x: x[0]):
        row = {'Date' : date.strftime('%d/%m/%Y')}   # Create an output date format of day/month/year
        row.update(values)
        csv_output.writerow(row)
This will produce the output you need.
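To try the grouping step without calling the API, here is a minimal sketch that runs on a few rows copied from the sample output in the question. The function names pivot_mentions and write_csv are hypothetical helpers introduced only for this illustration, and filling missing cells with 0 through restval=0 is an assumption about the desired table; leave restval out if blank cells are preferred.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import datetime


def pivot_mentions(rows):
    """Group {"Date", "Ticker", "Mentions"} rows into {date: Counter({ticker: mentions})}."""
    dates = defaultdict(Counter)
    for row in rows:
        date = datetime.strptime(row["Date"], "%Y-%m-%d")
        dates[date][row["Ticker"]] += row["Mentions"]
    return dates


def write_csv(dates, f_output):
    """Write one line per date and one column per ticker."""
    all_tickers = sorted({ticker for counts in dates.values() for ticker in counts})
    # restval=0 is an assumption: tickers with no mentions on a date get 0
    writer = csv.DictWriter(f_output, fieldnames=["Date", *all_tickers], restval=0)
    writer.writeheader()
    for date in sorted(dates):
        writer.writerow({"Date": date.strftime("%d/%m/%Y"), **dates[date]})


# A few rows taken from the sample output shown in the question
sample = [
    {"Date": "2018-08-10", "Ticker": "AA", "Mentions": 1},
    {"Date": "2018-08-28", "Ticker": "AA", "Mentions": 1},
    {"Date": "2018-08-08", "Ticker": "AAL", "Mentions": 1},
    {"Date": "2018-08-20", "Ticker": "AAL", "Mentions": 1},
    {"Date": "2020-12-07", "Ticker": "AACQ", "Mentions": 1},
]

# Write into an in-memory buffer instead of output.csv so the result can be inspected
buffer = io.StringIO()
write_csv(pivot_mentions(sample), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The header comes out as Date,AA,AACQ,AAL and each following line holds one date in day/month/year format, sorted chronologically. Passing a file opened with open('output.csv', 'w', newline='') instead of the buffer would write the same table to disk.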