Extracting data points from an interactive chart using Python?



Is it possible to extract the data points from the chart at this link?

https://ycharts.com/companies/AAPL/market_cap

The chart is located at //*[@id="dataChartCanvass1"]

I mean the chart itself, not the table below it.

I tried viewing the page source, but the only data points I can see there are the ones in the table.

Can this be done with Python and requests? Where should I start?

You can simulate the Ajax call the page makes to fetch the chart's data points, for example:

import json

import pandas as pd
import requests

# JSON endpoint the page queries for its chart data
api_url = "https://ycharts.com/charts/fund_data.json"

params = {
    "securities": "id:AAPL,include:true,,",  # <-- ticker here
    "calcs": "id:market_cap,include:true,,",
    "correlations": "",
    "format": "real",
    "recessions": "false",
    "zoom": "5",
    "startDate": "",
    "endDate": "",
    "chartView": "",
    "splitType": "single",
    "scaleType": "linear",
    "note": "",
    "title": "",
    "source": "false",
    "units": "false",
    "quoteLegend": "true",
    "partner": "",
    "quotes": "",
    "legendOnChart": "true",
    "securitylistSecurityId": "",
    "displayTicker": "false",
    "ychartsLogo": "",
    "useEstimates": "false",
    "maxPoints": "918",
}

response = requests.get(api_url, params=params)
response.raise_for_status()  # fail early on an HTTP error instead of on .json()
data = response.json()

# uncomment to see all data:
# print(json.dumps(data, indent=4))

# raw_data holds [timestamp, value] pairs for the first series
df = pd.DataFrame(
    data["chart_data"][0][0]["raw_data"], columns=["date", "value"]
)
# timestamps are in milliseconds since the Unix epoch
df["date"] = pd.to_datetime(df["date"], unit="ms")
df["value"] = df["value"].astype(int)
print(df)

Prints:

    date          value
0   2016-08-29   575593
1   2016-09-06   580335
2   2016-09-09   555710
3   2016-09-16   619239
4   2016-09-23   607331
5   2016-09-30   603253
6   2016-10-07   608643
7   2016-10-14   627239
8   2016-10-21   621747
9   2016-10-28   606390
10  2016-11-04   580368
11  2016-11-11   578182
...and so on.
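
If you would rather check the parsing step without hitting the site (or without pandas), here is a minimal sketch that does the same conversion with the standard library only. The function name parse_raw_points and the sample payload are made up for illustration. The payload shape is an assumption taken from the indexing used above (chart_data[0][0]["raw_data"] as a list of [millisecond timestamp, value] pairs), so verify it against a real response before relying on it.

```python
from datetime import datetime, timezone


def parse_raw_points(payload):
    """Turn the first series of a chart payload into (date, value) rows."""
    # Assumed shape, taken from the indexing used in the answer above:
    # payload["chart_data"][0][0]["raw_data"] is a list of
    # [ms_timestamp, value] pairs.
    raw = payload["chart_data"][0][0]["raw_data"]
    rows = []
    for ms, value in raw:
        # Timestamps are treated as milliseconds since the Unix epoch (UTC).
        day = datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date()
        # int() truncates toward zero, like astype(int) in the pandas version.
        rows.append((day, int(value)))
    return rows


# Synthetic payload for illustration only; not real ycharts output.
sample = {
    "chart_data": [
        [{"raw_data": [[1472428800000, 575593.7], [1473120000000, 580335.2]]}]
    ]
}

for day, value in parse_raw_points(sample):
    print(day, value)
```

With a real response you would pass the decoded JSON dict in place of sample. Keeping the parsing in its own function also makes it easy to rerun against a response saved to disk.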
