How do I export all sheets in a spreadsheet as CSV files using the Drive API with a service account in Python?



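I have established a working service connection to the Drive API, and I am building export URLs so that each sheet in a spreadsheet can be downloaded as a CSV file by sending requests with Google's AuthorizedSession class. For some reason, only some of the CSV files come back correctly; the others contain broken HTML. When I send a single request, the sheet always comes back correctly, but when I loop over the sheets and start sending requests, things begin to break. I have concluded that the problem is in how the credentials are passed this way, but I am not sure whether I am using AuthorizedSession correctly. Can anyone help me figure this out?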

from googleapiclient.discovery import build
from google.oauth2 import service_account
from google.auth.transport.requests import AuthorizedSession
import re
import shutil
import urllib.parse

CLIENT_SECRET_FILE = "client_secret.json"
API_NAME = "sheets"
API_VERSION = "v4"
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
SPREADSHEET_ID = "Spreadsheet ID goes here"
print(CLIENT_SECRET_FILE, API_NAME, API_VERSION, SCOPES, sep="-")
cred = service_account.Credentials.from_service_account_file(
    CLIENT_SECRET_FILE, scopes=SCOPES
)
try:
    service = build(API_NAME, API_VERSION, credentials=cred)
    print(API_NAME, "service created successfully")
    result = service.spreadsheets().get(spreadsheetId=SPREADSHEET_ID).execute()
    export_url = re.sub("/edit$", "/export", result["spreadsheetUrl"])
    authed_session = AuthorizedSession(cred)
    for sheet in result["sheets"]:
        sheet_name = sheet["properties"]["title"]
        params = {"format": "csv", "gid": sheet["properties"]["sheetId"]}
        query_params = urllib.parse.urlencode(params)
        url = export_url + "?" + query_params
        response = authed_session.get(url)
        file_path = "./Downloads/" + sheet_name + ".csv"
        with open(file_path, "wb") as csv_file:
            csv_file.write(response.content)
        print("Downloaded sheet: " + sheet_name)
    print("Downloads complete")
except Exception as e:
    print("Unable to connect")
    print(e)

This code will give you a Sheets service object:

"""Hello sheets."""
from googleapiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials

SCOPES = ['https://www.googleapis.com/auth/drive.readonly']
KEY_FILE_LOCATION = '<REPLACE_WITH_JSON_FILE>'
VIEW_ID = '<REPLACE_WITH_VIEW_ID>'


def initialize_sheet():
    """Initializes a Sheets service object.

    Returns:
        An authorized Sheets service object.
    """
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        KEY_FILE_LOCATION, SCOPES)
    # Build the service object ('sheets' is the API name for Sheets v4).
    sheet = build('sheets', 'v4', credentials=credentials)
    return sheet

If you use the same Sheets service object built by this method, the loop should not have any problems.
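One way to read that suggestion is to skip the export URL and pull each sheet's cell values through the Sheets service itself, then write them out with the csv module. The following is only a sketch under that assumption: export_sheets_to_csv and quote_sheet_title are hypothetical helper names, not part of any Google library, and service is assumed to be the object returned by build('sheets', 'v4', ...).

```python
import csv
import os


def quote_sheet_title(title):
    """Wrap a sheet title in single quotes so it is safe to use as an A1 range."""
    # A1 notation escapes a single quote inside a quoted title by doubling it.
    return "'" + title.replace("'", "''") + "'"


def export_sheets_to_csv(service, spreadsheet_id, out_dir="."):
    """Write every sheet of a spreadsheet to <out_dir>/<title>.csv.

    Hypothetical helper: `service` is assumed to be a Sheets v4 service object.
    Returns the list of file paths that were written.
    """
    meta = service.spreadsheets().get(spreadsheetId=spreadsheet_id).execute()
    written = []
    for sheet in meta["sheets"]:
        title = sheet["properties"]["title"]
        result = (
            service.spreadsheets()
            .values()
            .get(spreadsheetId=spreadsheet_id, range=quote_sheet_title(title))
            .execute()
        )
        # "values" is missing from the response when the sheet is empty.
        rows = result.get("values", [])
        path = os.path.join(out_dir, title + ".csv")
        with open(path, "w", newline="", encoding="utf-8") as csv_file:
            csv.writer(csv_file).writerows(rows)
        written.append(path)
    return written
```

Note that values().get omits trailing empty cells, so rows may have different lengths, and the output may not be byte-for-byte identical to what the export URL produces.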

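I think the part of your script that uses authed_session = AuthorizedSession(cred) and response = authed_session.get(url) is correct. In your situation, I suspect that many requests are being sent within a short time, and that this may be the cause of your issue. So, as a simple modification, how about the following change?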

From:

for sheet in result["sheets"]:
    sheet_name = sheet["properties"]["title"]
    params = {"format": "csv", "gid": sheet["properties"]["sheetId"]}
    query_params = urllib.parse.urlencode(params)
    url = export_url + "?" + query_params
    response = authed_session.get(url)
    file_path = "./Downloads/" + sheet_name + ".csv"
    with open(file_path, "wb") as csv_file:
        csv_file.write(response.content)
    print("Downloaded sheet: " + sheet_name)

To:

for sheet in result["sheets"]:
    sheet_name = sheet["properties"]["title"]
    params = {"format": "csv", "gid": sheet["properties"]["sheetId"]}
    query_params = urllib.parse.urlencode(params)
    url = export_url + "?" + query_params
    response = authed_session.get(url)
    file_path = "./Downloads/" + sheet_name + ".csv"
    with open(file_path, "wb") as csv_file:
        csv_file.write(response.content)
    print("Downloaded sheet: " + sheet_name)
    time.sleep(3)  # <--- Added. Please adjust the value of 3 for your actual situation.
  • In this case, please add import time at the top of the script.
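If a fixed pause turns out to be either too slow or not enough, a variation on the same idea is to check each response and retry with a growing delay only when it does not look like CSV. This is just a sketch: get_csv_with_retry is a hypothetical helper, and it assumes that a successful export comes back with status 200 and a Content-Type containing text/csv, while the broken responses are HTML.

```python
import time


def get_csv_with_retry(get, url, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call get(url) until the response looks like CSV, backing off between tries.

    Hypothetical helper: `get` is any callable returning a requests-style
    response, for example authed_session.get.
    """
    for attempt in range(max_attempts):
        response = get(url)
        content_type = response.headers.get("Content-Type", "")
        # Assumption: a good export is a 200 response served as text/csv.
        if response.status_code == 200 and "text/csv" in content_type:
            return response
        # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"No CSV returned from {url} after {max_attempts} attempts")
```

Inside the loop, response = authed_session.get(url) would then become response = get_csv_with_retry(authed_session.get, url), and the time.sleep(3) line could be dropped or shortened.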
