r - Importing an online Excel file that has two header rows



I'm trying to download a paper's supplementary file directly into R, but I'm running into problems. First, I tried:

datatable = fread("https://doi.org/10.1371/journal.pone.0242866.s001") 

The first problem is:

Error in fread("https://doi.org/10.1371/journal.pone.0242866.s001") : 
embedded nul in string: 'xf7xc5B1xfcLxf8xu~xad1601xddL5(xb2I`x8bҶ05tLxeePdx97"{24xd9'p@C31Rx9b#x8a34Sx9b2302xa704Ψxcd9E.(rIx91+01xddM~Mi7x94vKxe0x8exd2&tx85&)xc6xd621Uxd6a23x96X17733Fxde'J|x9fYx9937L30ǫxdc17xefexe1xx91xfdx89xd2?lx84q6Txc1x84qD*0137xab32xf0(x8a Q25xf8Xx95A0axecxb3'
In addition: Warning message:
In fread("https://doi.org/10.1371/journal.pone.0242866.s001") :
Detected 1 column names but the data has 3 columns (i.e. invalid file). Added 2 extra default column names at the end.

Then I tried

datatable = data.table(read.csv("https://doi.org/10.1371/journal.pone.0242866.s001"))

However, the result contained only one variable.

So I tried

datatable = data.table(read.csv2("https://doi.org/10.1371/journal.pone.0242866.s001"))

Again, the problem persists, only with a different number of observations.

When I tried read_excel and added a skip argument to see whether I could drop the first row, I got an error about the path.

datatable = data.table(read_excel("https://doi.org/10.1371/journal.pone.0242866.s001"), skip = 1)
Error: `path` does not exist: ‘https://doi.org/10.1371/journal.pone.0242866.s001’

Can anyone help me?

You can use openxlsx together with tail():

library(openxlsx)
library(magrittr)  # provides the %>% pipe
datatable <- openxlsx::read.xlsx(
  'https://doi.org/10.1371/journal.pone.0242866.s001', sheet = 1
) %>%
  tail(-1)  # drop the first data row, which holds the second header
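
To see what tail(-1) actually does with a two-header sheet, here is a minimal offline sketch. The workbook it builds is a hypothetical stand-in (the names `Block1`, `Block2`, `id` and `score` are made up and are not taken from the real supplementary file). It also shows `startRow = 2`, an existing `read.xlsx` argument, as a possible alternative if you would rather keep the second header row as the column names.

```r
library(openxlsx)

# Hypothetical stand-in for the supplementary file:
# row 1 = first header, row 2 = second header, rows 3-4 = data.
demo <- data.frame(
  V1 = c("Block1", "id", "1", "2"),
  V2 = c("Block2", "score", "10", "20"),
  stringsAsFactors = FALSE
)
path <- tempfile(fileext = ".xlsx")
write.xlsx(demo, path, colNames = FALSE)  # write the cells without an extra header row

# Option 1: row 1 becomes the column names; tail(-1) then drops the
# first data row, which is the second header.
first_header <- tail(read.xlsx(path, sheet = 1), -1)

# Option 2: start reading at row 2, so the second header row
# becomes the column names instead.
second_header <- read.xlsx(path, sheet = 1, startRow = 2)

names(first_header)
names(second_header)
```

Which option fits depends on which of the two header rows carries the names you want to keep. Note that with option 1 the columns may come back as character, because the second header text was read as data.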

Try this:

library(openxlsx)
datatable = openxlsx::read.xlsx("https://doi.org/10.1371/journal.pone.0242866.s001", sheet = 1)

I based this on Abdinardo Oliveira's answer to this question.
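
If you would rather stay with read_excel, note that the "`path` does not exist" error in the question comes from readxl expecting a local file rather than a URL. A possible workaround, sketched below, is to download the file to a temporary location first. The helper name `read_remote_excel` is made up for illustration, and the `.xlsx` extension is an assumption about the file type.

```r
library(readxl)

# Hypothetical helper: download a remote workbook to a temp file, then read it.
# Assumes the remote file is an .xlsx workbook.
read_remote_excel <- function(url, skip = 0, sheet = 1) {
  tmp <- tempfile(fileext = ".xlsx")
  on.exit(unlink(tmp), add = TRUE)  # remove the temp file when done
  # mode = "wb" keeps the binary file intact (this matters on Windows)
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  # skip is passed to read_excel itself, not to an outer wrapper
  read_excel(tmp, sheet = sheet, skip = skip)
}
```

It could then be called as `datatable <- data.table(read_remote_excel("https://doi.org/10.1371/journal.pone.0242866.s001", skip = 1))`, so that the first header row is skipped and the second one supplies the column names.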