如何将SharePoint在线阅读(Office365)Excel Files添加到Python中,特别是带有工作或学校



这个问题与下面的链接非常相似。如何在Python中使用工作或学校帐户阅读SharePoint在线(Office365)Excel文件?

本质上,我想将excel文件从sharepoint中导入熊猫以进行进一步分析。

问题是当我运行下面的代码时,我会收到以下错误。

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'rn<!DOCT'

我的代码:

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File 
url = 'https://companyname.sharepoint.com/SitePages/Home.aspx'
username = 'fakeaccount@company.com'
password = 'password!'
relative_url = '/Shared%20Documents/Folder%20Number1/Folder%20Number2/Folder3/Folder%20Number%Four/Target_Excel_File_v4.xlsx?d=w8f97c2341898_random_numbers_and_letters_a065c12cbcsf=1&e=KXoU4s'

ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
  ctx = ClientContext(url, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
  #this gives me a KeyError: 'Title'
  #print("Web title: {0}".format(web.properties['Title']))
  print('Authentication Successful')
else:
  print(ctx_auth.get_last_error())

import io
import pandas as pd
response = File.open_binary(ctx, relative_url)
#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start
#read file into pandas dataframe
df = pd.read_excel(bytes_file_obj)
print(df)

对于那些在这个问题上像我一样结束的人,我发现必须将完整的URL路径通往File,而不仅仅是路径:

#import all the libraries
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File 
import io
import pandas as pd
#target url taken from sharepoint and credentials
url = 'https://company.sharepoint.com/Shared%20Documents/Folder%20Number1/Folder%20Number2/Folder3/Folder%20Number4/Target_Excel_File_v4.xlsx?cid=_Random_letters_and_numbers-21dbf74c'
username = 'Dumby_account@company.com'
password = 'Password!'
ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
  ctx = ClientContext(url, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
  print("Authentication successful")
response = File.open_binary(ctx, url)
#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start
#read excel file and each sheet into pandas dataframe 
df = pd.read_excel(bytes_file_obj, sheetname = None)

也许值得注意的是,官方存储库有许多有关SharePoint,Drive和Teams的共同操作的示例。

安装的一个注释:

PIP安装Office365-Rest-Python-Client

还有一个Office365软件包,但上面的包似乎是正确的OneNenter链接说明

以下可与客户端ID和秘密代码(LIB:Office365)

一起使用
# Credential to connect to your SP Site
SITE_URL ='https://XXXXXX.sharepoint.com/sites/yoursitename'
CLIENT_ID = 'xxxxxxxx-xxx-xxxx-xxxxxxxxxxxxxxxxx'
CLIENT_SECRET= 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
# Establish the connection
context = ClientContext(SITE_URL).with_credentials(ClientCredential(CLIENT_ID, CLIENT_SECRET))
response = File.open_binary(context, '/'.join(['/sites/yoursitename/Shared Documents/Work/OnlyFolderName', 
                                           'yourfilename.xlsx']))

df = pd.read_excel(io.BytesIO(response.content))

最新更新