How to download raw data from a Google Spreadsheet in Python



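I want the data shown in a Google Spreadsheet, but there is no download option available. I tried using the BeautifulSoup4 library but couldn't figure it out.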

Here is the data: https://docs.google.com/spreadsheets/d/e/2PACX-1vSc_2y5N0I67wDU38DjDh35IZSIS30rQf7_NYZhtYYGU1jJYT6_kDx4YpF-qw0LSlGsBYP8pqM_a1Pd/pubhtml#

You can use google-api-python-client.

The Google Sheets API documentation includes a Python Quickstart.

It boils down to something like this:

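from googleapiclient.discovery import build

SAMPLE_SPREADSHEET_ID = '<your spreadsheet id>'
SAMPLE_RANGE_NAME = '<your desired range>'
# creds: the credentials object obtained as described in the Quickstart
service = build('sheets', 'v4', credentials=creds)
sheet = service.spreadsheets()
result = sheet.values().get(spreadsheetId=SAMPLE_SPREADSHEET_ID,
                            range=SAMPLE_RANGE_NAME).execute()
values = result.get('values', [])  # list of rows, each a list of cell values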

But be sure to read the full Quickstart to get the whole picture. (The sample code is taken from there.)
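Once the API call returns, `values` is a plain list of rows. As a rough sketch of what you might do next, the helper below turns such a list into dictionaries keyed by the header row. The function name `rows_to_records` and the sample rows are made up for illustration, and it assumes the first row of your range is the header. As far as I know, the API can leave trailing empty cells out of a row, so short rows are padded.

```python
def rows_to_records(values):
    """Turn [[header...], [row...], ...] into a list of dicts keyed by header."""
    if not values:
        return []
    header, *rows = values
    records = []
    for row in rows:
        # Pad short rows with empty strings so every header gets a value.
        # Cells beyond the header width are dropped by zip().
        padded = list(row) + [''] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


# Made-up sample in the same shape as result.get('values', [])
sample_values = [
    ['State', 'Confirmed', 'Deaths'],
    ['A', '10', '1'],
    ['B', '5'],              # trailing empty cell missing
]
records = rows_to_records(sample_values)
```

If you already use pandas, passing `records` to `pd.DataFrame` should give you a table with one column per header.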

The Beautiful Soup approach you were trying will work like this.

import urllib.request

import pandas as pd
from bs4 import BeautifulSoup

read_url = urllib.request.urlopen('your_sheet_url').read()   # fetch the published page
data = BeautifulSoup(read_url, "html.parser")
table = data.table                                           # first <table> on the page
columns = ['State', '', 'Confirmed', 'Recovered', 'Deaths', 'Active', 'Last_Updated_Time']
output_rows = []
for table_row in table.find_all('tr'):                       # iterate through rows
    output_row = [cell.text for cell in table_row.find_all('td')]  # text of each cell
    if len(output_row) == len(columns):                      # skip rows that don't fit the columns
        output_rows.append(output_row)
df = pd.DataFrame(output_rows, columns=columns)              # build the final DataFrame
df.to_excel("Output.xlsx")                                   # save the DataFrame as an Excel file
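Not part of the answer above, but since the sheet in the question is a "published to the web" link, it may be simpler to skip the HTML parsing altogether. To my knowledge, published sheets also offer a CSV form of the same link, with `/pub?output=csv` in place of `/pubhtml`. The sketch below only builds that URL. The helper name `csv_export_url` is hypothetical, and the `gid`/`single` parameters for picking one tab are an assumption based on what the Publish dialog generates.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def csv_export_url(pubhtml_url, gid=None):
    """Build the CSV variant of a published-sheet URL (assumed URL scheme)."""
    parts = urlsplit(pubhtml_url)
    if not parts.path.endswith('/pubhtml'):
        raise ValueError('expected a published-sheet URL ending in /pubhtml')
    path = parts.path[:-len('/pubhtml')] + '/pub'
    if gid is None:
        query = {'output': 'csv'}
    else:
        # Select a single tab by its gid (assumption, see above)
        query = {'gid': str(gid), 'single': 'true', 'output': 'csv'}
    # Drop the original query and fragment (e.g. the trailing '#')
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), ''))


# Placeholder ID, not a real sheet
example = csv_export_url('https://docs.google.com/spreadsheets/d/e/SHEET_ID/pubhtml#')
```

If the link behaves as assumed, `pd.read_csv(csv_export_url('your_sheet_url'))` would load the data directly into a DataFrame.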
