Download multiple .csv files from URLs



I have several .csv datasets that I want to load and save under different names. Thanks in advance.

casualty_2005 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2005-gla-data-extract-casualty.csv", header=T)
casualty_2006 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2006-gla-data-extract-casualty.csv", header=T)
casualty_2007 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2007-gla-data-extract-casualty.csv", header=T)
casualty_2008 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2008-gla-data-extract-casualty.csv", header=T)
casualty_2009 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2009-gla-data-extract-casualty.csv", header=T)
casualty_2010 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2010-gla-data-extract-casualty.csv", header=T)
casualty_2011 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2011-gla-data-extract-casualty.csv", header=T)
casualty_2012 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2012-gla-data-extract-casualty.csv", header=T)
casualty_2013 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2013-gla-data-extract-casualty.csv", header=T)
casualty_2014 <- read.csv("https://tfl.gov.uk/cdn/static/cms/documents/2014-gla-data-extract-casualty.csv", header=T)

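Similar questions have been asked many times. Create a vector of the names/links, then read all the files in an lapply loop.
Note that header = TRUE is already the default for read.csv, so it does not need to be passed explicitly.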

url_fmt <- "https://tfl.gov.uk/cdn/static/cms/documents/%04d-gla-data-extract-casualty.csv"
url_years <- 2005:2014
url_vec <- sprintf(url_fmt, url_years)
df_list <- lapply(url_vec, read.csv)
names(df_list) <- url_years
head(df_list[[1]])       # first file, top 6 rows
head(df_list[["2005"]])  # same file
head(df_list$`2005`)     # same file
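
If separate objects named like casualty_2005 are really wanted, as in the question, one possible approach is to rename the list elements and copy them into an environment with list2env. This is only a sketch: it uses two tiny stand-in data frames instead of the downloaded files so that it runs offline, and df_list_demo and casualty_env are hypothetical names that do not appear in the code above.

```r
# Stand-in for the list returned by lapply(url_vec, read.csv);
# two tiny data frames are used so the sketch runs without network access.
url_years <- 2005:2006
df_list_demo <- lapply(url_years, function(y) data.frame(year = y, n = 1:3))

# Name each element after the variable name used in the question.
names(df_list_demo) <- paste0("casualty_", url_years)

# Copy the elements into a dedicated environment (hypothetical name).
# Passing envir = globalenv() instead would create them in the workspace.
casualty_env <- new.env()
list2env(df_list_demo, envir = casualty_env)

ls(casualty_env)
head(casualty_env$casualty_2005)
```

Keeping everything in one named list is usually easier to loop over later, so this step is optional.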

Edit

After reading doctshind's answer, I realized that the question asks how to download the files, not how to read them.

The lines that switch to a new directory are optional.

#old_dir <- getwd()
#setwd('~/tmp')
lapply(url_vec, function(x) download.file(x, destfile = basename(x)))
list.files(pattern = '\\.csv$')
# [1] "2005-gla-data-extract-casualty.csv"
# [2] "2006-gla-data-extract-casualty.csv"
# [3] "2007-gla-data-extract-casualty.csv"
# [4] "2008-gla-data-extract-casualty.csv"
# [5] "2009-gla-data-extract-casualty.csv"
# [6] "2010-gla-data-extract-casualty.csv"
# [7] "2011-gla-data-extract-casualty.csv"
# [8] "2012-gla-data-extract-casualty.csv"
# [9] "2013-gla-data-extract-casualty.csv"
#[10] "2014-gla-data-extract-casualty.csv"
#setwd(old_dir)
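
If some of the links might be missing or the loop might be re-run, a slightly more defensive variant could skip files that are already on disk and record failures instead of stopping at the first error. This is a minimal sketch under assumptions: download_all and dest_dir are hypothetical names, and mode = "wb" is my choice to avoid line-ending conversion on Windows, not something taken from the answers here.

```r
# Hypothetical helper: download each URL into dest_dir, skip files that
# already exist, and return TRUE/FALSE per file instead of stopping on error.
download_all <- function(urls, dest_dir = ".") {
  dest <- file.path(dest_dir, basename(urls))
  ok <- vapply(seq_along(urls), function(i) {
    # Nothing to do if the file is already there.
    if (file.exists(dest[i])) return(TRUE)
    tryCatch(
      # download.file() returns 0 (invisibly) on success.
      download.file(urls[i], destfile = dest[i], mode = "wb", quiet = TRUE) == 0,
      error = function(e) FALSE,
      warning = function(w) FALSE
    )
  }, logical(1))
  setNames(ok, basename(urls))
}

# Usage with the vector built earlier (needs network access):
# status <- download_all(url_vec)
# status[!status]   # files that could not be downloaded
```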

Here is the complete working code with explanations:

# List of File URLs
urlist <- list("https://tfl.gov.uk/cdn/static/cms/documents/2005-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2006-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2007-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2008-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2009-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2010-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2011-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2012-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2013-gla-data-extract-casualty.csv","https://tfl.gov.uk/cdn/static/cms/documents/2014-gla-data-extract-casualty.csv")
setwd("~/so")
# Loop over the URLs and save each file under its own name
for (i in seq_along(urlist)) {
  # Take the file name from the last component of the URL
  destfile <- basename(urlist[[i]])
  # Download the current file into the working directory
  download.file(urlist[[i]], destfile)
}
