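I'm trying to use ExtractorApi from the sec-api module to extract specific sections from 10-Q reports. The module works for 10-K filings, but fails on some sections of 10-Q filings. For example, if I want to extract Item 3 from a 10-Q, the following code works fine: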
from sec_api import ExtractorApi
extractorApi = ExtractorApi("YOUR API KEY")  # Replace this with your own API key
# 10-Q filing
filing_url = "https://www.sec.gov/Archives/edgar/data/789019/000156459021002316/msft-10q_20201231.htm"
# get the standardized and cleaned text of section
section_text = extractorApi.get_section(filing_url, "3", "text")
print(section_text)
But when I try to extract Item 1A (Risk Factors), the following code returns "undefined":
from sec_api import ExtractorApi
extractorApi = ExtractorApi("YOUR API KEY")  # Replace this with your own API key
# 10-Q filing
filing_url = "https://www.sec.gov/Archives/edgar/data/789019/000156459021002316/msft-10q_20201231.htm"
# get the standardized and cleaned text of section
section_text = extractorApi.get_section(filing_url, "21A", "text") #Using 21A from the documentation of sec-api
print(section_text)
Is there a workaround for extracting these sections from 10-Q filings?
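Yes, the Extractor API also supports extracting sections from 10-Q filings.
To extract Item 1A (Risk Factors), try using part2item1a as the item parameter instead of 21A.
The correct code is as follows: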
from sec_api import ExtractorApi
extractorApi = ExtractorApi("YOUR API KEY")  # Replace this with your own API key
# 10-Q filing
filing_url = "https://www.sec.gov/Archives/edgar/data/789019/000156459021002316/msft-10q_20201231.htm"
# get the standardized and cleaned text of section
section_text = extractorApi.get_section(filing_url, "part2item1a", "text") # Using part2item1a from the documentation of sec-api
print(section_text)
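The code part2item1a suggests a part-plus-item naming pattern for 10-Q sections. Here is a minimal sketch that builds such codes, assuming the pattern holds for other items. Only part2item1a is confirmed above, and tenq_item_code is a hypothetical helper, not part of sec-api:

```python
def tenq_item_code(part, item):
    """Hypothetical helper: build a 10-Q item code such as 'part2item1a'.

    Assumes every 10-Q section follows the 'part<N>item<X>' pattern seen
    in 'part2item1a'; check the sec-api documentation for the full list.
    """
    if part not in (1, 2):
        raise ValueError("a 10-Q has only Part I and Part II")
    # Normalize the item label: "1A" -> "1a", 3 -> "3"
    return f"part{part}item{str(item).strip().lower()}"


if __name__ == "__main__":
    print(tenq_item_code(2, "1A"))
```

The resulting string can then be passed as the second argument to extractorApi.get_section in place of the hard-coded "part2item1a".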
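Note that you can also call the sec-api.io REST API directly. This keeps external dependencies to a minimum and makes it easier to integrate with other REST-based tooling.
Here is an example: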
import requests
filing_url = "https://www.sec.gov/Archives/edgar/data/789019/000156459021002316/msft-10q_20201231.htm"
extractor_api_url = "https://api.sec-api.io/extractor"
params = {
    "url": filing_url,
    "item": "part2item1a",  # section to extract
    "type": "text",         # return format, as in the get_section calls above
    "token": "YOUR API KEY",
}
response = requests.get(extractor_api_url, params=params)
part2item1a = response.text
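If the goal is to avoid third-party packages entirely, the same request can be made with the standard library alone. This is a sketch, not an official client. The endpoint and parameter names are taken from the example above, build_extractor_url and fetch_section are hypothetical helper names, and the response is assumed to be UTF-8 text:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as given in the example above
EXTRACTOR_API_URL = "https://api.sec-api.io/extractor"


def build_extractor_url(filing_url, item, token, return_type="text"):
    """Hypothetical helper: assemble the GET URL for the Extractor endpoint."""
    params = {
        "url": filing_url,
        "item": item,
        "type": return_type,
        "token": token,
    }
    # urlencode percent-encodes the filing URL so it survives as a query value
    return EXTRACTOR_API_URL + "?" + urlencode(params)


def fetch_section(filing_url, item, token, return_type="text", timeout=30):
    """Hypothetical helper: download one section (urlopen raises on HTTP errors)."""
    request_url = build_extractor_url(filing_url, item, token, return_type)
    with urlopen(request_url, timeout=timeout) as response:
        # Assumption: the API returns UTF-8 encoded text
        return response.read().decode("utf-8")
```

Calling fetch_section(filing_url, "part2item1a", "YOUR API KEY") should then return the same text as the requests version, provided the assumptions above hold.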