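Update: I found something that might help, but I'm still having some trouble working out how to implement it. If I map get_data like this, I'm not sure how to assign the result of each call to its own variable.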
parameters = [
    [service, profile_id, '30daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop'],
    [service, profile_id, '60daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop'],
    ...
    [service, profile_id, '90daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==mobile']
]
with ThreadPoolExecutor(max_workers=4) as executor:
    executor.map(get_data, parameters)
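One possible way to keep track of which result belongs to which call is to key the argument lists by name and zip the names with the mapped results, since `executor.map` yields results in input order. The sketch below is only an illustration: `fetch_all`, the `named_parameters` dict, and the stand-in `get_data` are hypothetical and are not part of the Google client. Note that `executor.map(get_data, parameters)` hands each list to `get_data` as a single argument, so the sketch unpacks it explicitly.

```python
from concurrent.futures import ThreadPoolExecutor


def get_data(service, profile_id, days, dimensions, segment):
    # Stand-in for the real API call so the sketch runs offline (assumption).
    return {"days": days, "dimensions": dimensions, "segment": segment}


def fetch_all(named_parameters, max_workers=4):
    """Run get_data once per entry and return {name: result}."""
    names = list(named_parameters)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Each value is a full argument list, so unpack it for get_data.
        results = executor.map(lambda args: get_data(*args),
                               named_parameters.values())
        # map() yields results in the same order as its inputs.
        return dict(zip(names, results))


service, profile_id = None, "12345"  # placeholders for illustration only
desktop = 'sessions::condition::ga:deviceCategory==desktop'
named_parameters = {
    "data_1a": [service, profile_id, '30daysAgo', 'ga:browser', desktop],
    "data_1b": [service, profile_id, '60daysAgo', 'ga:browser', desktop],
}
data = fetch_all(named_parameters)
```

With the real client, keep in mind that `httplib2.Http` objects are not thread-safe, so sharing one `service` across threads may need extra care, such as giving each thread its own `Http` instance.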
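I'm writing a Python application (using the Google Analytics API) that lets users get reports on the top 10 desktop browsers, desktop browsers by version, mobile browsers, and mobile operating systems used to access a given site over the past 30, 60, and 90 days. So far everything seems to work fine.
However, performance is all over the place. There are 12 API requests in total: 3 for each of the 4 data sets. Sometimes the application takes about 10 seconds to run, and sometimes it takes over a minute. It seems to depend entirely on how the API responds. So my question is: is there a way to combine these requests, or to arrange them so that they can run concurrently?
I tried looking for a way to consolidate the requests so that I would only need one request per data set, returning the 30-, 60-, and 90-day information, but I couldn't find anything. As for making the requests concurrent, I'm just not sure how to go about it. The closest thing I could find is this question/answer, but I don't quite understand the answer about batching.
The relevant code is below: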
def get_data(service, profile_id, days, dimensions, segment):
    return service.data().ga().get(
        ids='ga:' + profile_id,
        start_date=days,
        end_date='today',
        metrics='ga:sessions',
        dimensions=dimensions,
        sort='-ga:sessions',
        segment=segment,
        max_results=10).execute()

def get_results(service, profile_id):
    global glob_startdate
    global glob_months
    # get top 10 desktop browsers
    print("Getting top 10 desktop browsers...")
    data_1a = get_data(service, profile_id, '30daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
    data_1b = get_data(service, profile_id, '60daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
    data_1c = get_data(service, profile_id, '90daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
    data1 = [data_1a, data_1b, data_1c]
    # get top 10 desktop browser versions
    print("Getting top 10 desktop browser versions...")
    data_2a = get_data(service, profile_id, '30daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==desktop')
    data_2b = get_data(service, profile_id, '60daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==desktop')
    data_2c = get_data(service, profile_id, '90daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==desktop')
    data2 = [data_2a, data_2b, data_2c]
    # get top 10 mobile OS's
    print("Getting top 10 mobile OS's...")
    data_3a = get_data(service, profile_id, '30daysAgo', 'ga:operatingSystem,ga:operatingSystemVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data_3b = get_data(service, profile_id, '60daysAgo', 'ga:operatingSystem,ga:operatingSystemVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data_3c = get_data(service, profile_id, '90daysAgo', 'ga:operatingSystem,ga:operatingSystemVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data3 = [data_3a, data_3b, data_3c]
    # get top 10 mobile browsers
    print("Getting top 10 mobile browsers...")
    data_4a = get_data(service, profile_id, '30daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data_4b = get_data(service, profile_id, '60daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data_4c = get_data(service, profile_id, '90daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data4 = [data_4a, data_4b, data_4c]
Thanks!
Due to the API's quotas and limits, you can batch at most 10 requests at a time.
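Since the question involves 12 requests, they would not all fit into one batch under that limit. A minimal sketch of one way to split a list of requests into groups of at most 10 follows; the helper name `chunked` is hypothetical, the integers merely stand in for request objects, and the size of 10 is taken from the limit stated above.

```python
def chunked(items, size=10):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Integers stand in for the 12 request objects (illustration only).
requests = list(range(12))
batches = list(chunked(requests))
```

Each resulting group could then be added to its own batch object and executed separately.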
from apiclient.http import BatchHttpRequest
import httplib2

def call_back(request_id, response, exception):
    """Do something with the response of each call"""
    pass

def get_request(service, profile_id, days, dimensions, segment):
    """Note I removed the execute() from the end of this method."""
    return service.data().ga().get(
        ids='ga:' + profile_id,
        start_date=days,
        end_date='today',
        metrics='ga:sessions',
        dimensions=dimensions,
        sort='-ga:sessions',
        segment=segment,
        max_results=10)

# Create a batch Http Request object.
# call_back is a module-level function here, so it is passed without self.
batch = BatchHttpRequest(callback=call_back)
# Construct your queries.
# get top 10 desktop browsers
print("Getting top 10 desktop browsers...")
request_1a = get_request(service, profile_id, '30daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
request_1b = get_request(service, profile_id, '60daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
request_1c = get_request(service, profile_id, '90daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
for request in [request_1a, request_1b, request_1c]:
    batch.add(request)
batch.execute(http=httplib2.Http())
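The `call_back` above is left as a stub. One way it might be filled in, as a sketch only, is to store each response in a dict keyed by request ID, so that results can be matched back to the query that produced them. `make_collector` and the IDs `'desktop_30'` and `'desktop_60'` are hypothetical names; the callback is invoked by hand here so the example runs without network access or credentials.

```python
def make_collector():
    """Return (results, errors, call_back) sharing two dicts keyed by request ID."""
    results, errors = {}, {}

    def call_back(request_id, response, exception):
        # The batch invokes this once per request with its ID.
        if exception is not None:
            errors[request_id] = exception
        else:
            results[request_id] = response

    return results, errors, call_back


results, errors, call_back = make_collector()

# Simulated invocations; the payloads are made-up placeholders.
call_back('desktop_30', {'rows': [['Chrome', '42']]}, None)
call_back('desktop_60', None, RuntimeError('simulated failure'))
```

With the real library, the same callback would be passed as `BatchHttpRequest(callback=call_back)`, and each request could be given a readable ID through `batch.add(request, request_id='desktop_30')`.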