Read an Excel file from Google Drive without downloading the file



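I want to read a sheet from an Excel file on Google Drive without downloading the file to my local machine. I searched the Google Drive API documentation but could not find a solution. I tried the code below; any suggestions would be appreciated: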

'''
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
drive = GoogleDrive(gauth)
file_id = 'abc'
file_name = 'abc.xlsx'  
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile(file_name)

class TestCase:
    def __init__(self, file_name, sheet):
        self.file_name = file_name
        self.sheet = sheet
        testcase = pd.read_excel(file_name, usecols=None, sheet_name=sheet)
        print(testcase)

class TestCaseSteps:
    def __init__(self, file_name, sheet):
        self.file_name = file_name
        self.sheet = sheet
        testcase = pd.read_excel(file_name, usecols=None, sheet_name=sheet)
        print(testcase)

testcases = TestCase(file_name, 'A')
steps = TestCaseSteps(file_name, 'B')
'''

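I believe your goal and situation are as follows.

  • You want to use pd.read_excel to read an XLSX downloaded from Google Drive.
  • You want to do this without saving the downloaded XLSX data as a file.
  • Your gauth = GoogleAuth() can be used to download a Google Spreadsheet in XLSX format.

In this case, I would like to propose the following flow.

  1. Download the Google Spreadsheet in XLSX format.
    • Here, the requests library is used to call the export endpoint directly, which returns the Spreadsheet as XLSX.
    • The access token is retrieved from gauth = GoogleAuth().
  2. Read the downloaded XLSX data with pd.read_excel.
    • Here, BytesIO is used so that the data can be read from memory.

With this flow, the Spreadsheet is downloaded as XLSX data and read directly, without being saved as a file. Reflected in a script, the flow looks like this.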

Sample script:

Before running the script, please set the Spreadsheet ID.

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
import requests
from io import BytesIO
spreadsheetId = "###"  # <--- Please set the Spreadsheet ID.
# 1. Download the Google Spreadsheet as XLSX format.
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
url = "https://www.googleapis.com/drive/v3/files/" + spreadsheetId + "/export?mimeType=application%2Fvnd.openxmlformats-officedocument.spreadsheetml.sheet"
res = requests.get(url, headers={"Authorization": "Bearer " + gauth.attr['credentials'].access_token})
# 2. The downloaded XLSX data is read with `pd.read_excel`.
sheet = "Sheet1"
values = pd.read_excel(BytesIO(res.content), usecols=None, sheet_name=sheet)
print(values)
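
The export URL in the script above is assembled by string concatenation with a hand-encoded MIME type. As a minimal sketch, the same URL and header can be built with the standard library so the encoding is handled for you. The names build_export_url and auth_header are hypothetical helpers introduced here for illustration; they are not part of PyDrive or requests.

```python
from urllib.parse import urlencode

# MIME type of an XLSX workbook, as used in the export URL above.
XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_export_url(spreadsheet_id: str, mime_type: str = XLSX_MIME) -> str:
    """Return the Drive v3 export URL for a Google Spreadsheet (hypothetical helper)."""
    # urlencode percent-encodes the "/" in the MIME type as %2F.
    query = urlencode({"mimeType": mime_type})
    return f"https://www.googleapis.com/drive/v3/files/{spreadsheet_id}/export?{query}"


def auth_header(access_token: str) -> dict:
    """Return the bearer-token header sent with the download request (hypothetical helper)."""
    return {"Authorization": "Bearer " + access_token}


if __name__ == "__main__":
    # "SPREADSHEET_ID" is a placeholder, not a real ID.
    print(build_export_url("SPREADSHEET_ID"))
```

With these helpers, the request in the script would become requests.get(build_export_url(spreadsheetId), headers=auth_header(gauth.attr['credentials'].access_token)).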

References:

  • Download Google Workspace documents
  • pandas.read_excel

Added:

The sample script below assumes that an XLSX file has been uploaded to Google Drive as-is, and downloads that XLSX file.

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
import requests
from io import BytesIO
file_id = "###"  # <--- Please set the file ID of XLSX file.
# 1. Download the XLSX data.
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?alt=media"
res = requests.get(url, headers={"Authorization": "Bearer " + gauth.attr['credentials'].access_token})
# 2. The downloaded XLSX data is read with `pd.read_excel`.
sheet = "Sheet1"
values = pd.read_excel(BytesIO(res.content), usecols=None, sheet_name=sheet)
print(values)
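
If the request fails (for example, an expired token or a wrong file ID), res.content will hold an error message rather than a workbook, and pd.read_excel will raise a less obvious error. The following is a rough sketch of a sanity check that could run before parsing. It relies on the fact that an XLSX file is a ZIP package containing a "[Content_Types].xml" part; looks_like_xlsx is a hypothetical helper name, not a pandas or PyDrive function.

```python
import zipfile
from io import BytesIO


def looks_like_xlsx(content: bytes) -> bool:
    """Return True if the bytes are a ZIP package with a [Content_Types].xml part."""
    buffer = BytesIO(content)
    # An error response (JSON or HTML) is not a ZIP archive.
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as archive:
        return "[Content_Types].xml" in archive.namelist()


if __name__ == "__main__":
    # Build a tiny stand-in package in memory to exercise the check.
    fake = BytesIO()
    with zipfile.ZipFile(fake, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
    print(looks_like_xlsx(fake.getvalue()))
    print(looks_like_xlsx(b'{"error": "not found"}'))
```

In the script above, this could be used as a guard: check looks_like_xlsx(res.content) and inspect res.status_code and res.text when it returns False, before calling pd.read_excel.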

Get the link to the Excel file, then replace "/edit?usp=share_link" with "/export?format=xlsx".


import pandas as pd
url = "https://docs.google.com/spreadsheets/d/1py9cdXWCorKrRz7fUp0doPaOTJ2tiexB/edit?usp=share_link&ouid=107751642562197088720&rtpof=true&sd=true"
url_for_pandas = url.replace("/edit?usp=share_link", "/export?format=xlsx")
df = pd.read_excel(url_for_pandas)
print(df)
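
The string replacement above only works when the link contains exactly "/edit?usp=share_link". As a hedged alternative, the file ID can be extracted with a regular expression and the export URL rebuilt from it. This sketch assumes that the export endpoint does not need the remaining query parameters (ouid, rtpof, sd) and that the file is readable without signing in; to_xlsx_export_url is a hypothetical helper name.

```python
import re


def to_xlsx_export_url(share_url: str) -> str:
    """Turn a docs.google.com spreadsheet link into an XLSX export URL (hypothetical helper)."""
    # The file ID is the path segment that follows "/spreadsheets/d/".
    match = re.search(r"/spreadsheets/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError("No spreadsheet ID found in: " + share_url)
    return f"https://docs.google.com/spreadsheets/d/{match.group(1)}/export?format=xlsx"


if __name__ == "__main__":
    url = "https://docs.google.com/spreadsheets/d/1py9cdXWCorKrRz7fUp0doPaOTJ2tiexB/edit?usp=share_link&ouid=107751642562197088720&rtpof=true&sd=true"
    print(to_xlsx_export_url(url))
```

The returned URL can then be passed to pd.read_excel in the same way as url_for_pandas.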
