How to get the URL of a file on Google Drive from its file ID using Python

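In the code below, I obtain the file ID of a CSV file on Google Drive. Now I would like to load the file's contents directly into a pandas DataFrame, instead of downloading the CSV file and then extracting the data (as the code does).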

import io
import os.path
import pandas as pd
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload

# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']

# Login to Google Drive
def login():
    creds = None
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        print("Login to your Google Drive account which holds/shares the file database")
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                './src/credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    # Return service
    service = build('drive', 'v3', credentials=creds)

    return service

# Download files from Google Drive
def downloadFile(file_name):
    # Authenticate
    service = login()
    # Search file by name
    response = service.files().list(q=f"name='{file_name}'", spaces='drive', fields='nextPageToken, files(id, name)').execute()
    # Keep the ID of the last match; stays None when nothing matches
    file_id = None
    for file in response.get('files', []):
        file_id = file.get('id')
    # Download file if it exists
    if file_id:
        request = service.files().get_media(fileId=file_id)
        fh = io.FileIO(f"./data/{file_name}.csv", "wb")
        downloader = MediaIoBaseDownload(fh, request)
        print(f"Downloading {file_name}.csv")
        # Fetch the content chunk by chunk; nothing is written without this loop
        done = False
        while not done:
            status, done = downloader.next_chunk()
    else:
        print(f"\033[1;31m Warning: Can't download >> {file_name} << because it is missing!!!\033[0;0m")
    return

downloadFile("NameOfFile")

Is there a way to do this? Thank you very much for your help.

From "The problem is to be able to do that I need the file's URL but I'm not able to retrieve it.", I think that your file might be a Google Spreadsheet. When the file is a Google Spreadsheet, webContentLink is not included in the retrieved metadata.
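A minimal sketch of how the two cases could be told apart, assuming the files().list call also requests mimeType and webContentLink in fields. The question's call requests only id and name, so file.get("webContentLink") would come back as None for any file. The names is_google_sheet and LIST_FIELDS are hypothetical and not part of the Drive client library.

```python
# MIME type that the Drive API reports for native Google Spreadsheets.
GOOGLE_SHEET_MIME = "application/vnd.google-apps.spreadsheet"

# Assumed fields string: requests mimeType and webContentLink
# in addition to the id and name used in the question.
LIST_FIELDS = "nextPageToken, files(id, name, mimeType, webContentLink)"


def is_google_sheet(file_metadata):
    """Return True when the metadata dict describes a Google Spreadsheet.

    Hypothetical helper: such files carry no webContentLink and are
    fetched with files().export_media rather than files().get_media.
    """
    return file_metadata.get("mimeType") == GOOGLE_SHEET_MIME


# Sample dicts shaped like the entries of response.get('files', []).
sheet = {"id": "abc", "name": "report", "mimeType": GOOGLE_SHEET_MIME}
csv_file = {"id": "def", "name": "report.csv", "mimeType": "text/csv",
            "webContentLink": "https://example.com/placeholder"}

print(is_google_sheet(sheet))
print(is_google_sheet(csv_file))
```

Checking mimeType states the intent more directly than testing for a missing webContentLink, but either check leads to the same branch in the script.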

If my understanding of your situation is correct, how about the following modification?

Modified script:

From:

file_id = file.get('id')
# !!! Here, I would like to get the URL of the file and download it to a pandas data frame !!!
file_url = file.get("webContentLink")

To:

file_id = file.get('id')
file_url = file.get("webContentLink")
if not file_url:
    request = service.files().export_media(fileId=file_id, mimeType='text/csv')
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%" % int(status.progress() * 100))
    fh.seek(0)
    df = pd.read_csv(fh)
    print(df)

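In this modification, the Google Spreadsheet is exported as CSV data using the Drive API, and the exported data is put into a DataFrame.
For this modification, please add import io and from googleapiclient.http import MediaIoBaseDownload.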

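Note:

In this case, the Google Spreadsheet is exported as CSV data using the Drive API. So please include the scope https://www.googleapis.com/auth/drive.readonly or https://www.googleapis.com/auth/drive. When your scope is only https://www.googleapis.com/auth/drive.metadata.readonly, an error occurs. Please be careful about this.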
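As a rough illustration of the note, the configured scope list could be checked before any request is sent. This is only a sketch: can_export_content is a hypothetical helper, and it tests for nothing beyond the two scopes named in the note. If SCOPES is changed, the saved token.json has to be deleted so that the authorization flow runs again, as the comment in the question's script says.

```python
# The two scopes named in the note as sufficient for exporting file content.
CONTENT_SCOPES = {
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/drive.readonly",
}


def can_export_content(scopes):
    """Hypothetical helper: True if at least one scope from the note is present."""
    return any(scope in CONTENT_SCOPES for scope in scopes)


# The scope list used in the question's script.
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
# A metadata-only scope list, which the note says leads to an error.
METADATA_ONLY = ["https://www.googleapis.com/auth/drive.metadata.readonly"]

print(can_export_content(SCOPES))
print(can_export_content(METADATA_ONLY))
```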

Reference:

Files: export

Added:

When the file is CSV data, please modify the script as follows.

file_id = file.get('id')
request = service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%" % int(status.progress() * 100))
fh.seek(0)
df = pd.read_csv(fh)
print(df)
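The fh.seek(0) call in both snippets matters because the downloader writes into the buffer and leaves the position at the end. The sketch below reproduces that with the standard library only: the bytes are made-up sample data, and csv.reader stands in for pd.read_csv so that the example runs without pandas or a Drive connection.

```python
import csv
import io

# Stand-in for the download: write made-up CSV bytes into an in-memory buffer,
# the same way MediaIoBaseDownload writes each chunk into fh.
fh = io.BytesIO()
fh.write(b"name,value\nalpha,1\nbeta,2\n")

# The position is now at the end of the buffer, so reading returns nothing.
at_end = fh.read()

# Rewind to the start; this is the point where the answer calls pd.read_csv(fh).
fh.seek(0)
rows = list(csv.reader(io.TextIOWrapper(fh, encoding="utf-8", newline="")))

print(at_end)
print(rows)
```

Because the buffer lives in memory, no file is written to ./data, which is what the question asks for.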
