How to retry a loop iteration based on a time limit or a server error response



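I run the for loop below to fetch some information from a server. The loop iterates over a list of days, DimDate, and sends a GET request to the server for each one. This is the code I currently use. Each day's data is stored in df and then appended to df.master.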

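library(jsonlite)  # fromJSON(); assumed here to be the jsonlite version
library(dplyr)     # bind_rows()

df.master = data.frame()
df = data.frame()
for (i in 1:length(DimDate)){
  message(i)
  # `function` is a reserved word in R, so the endpoint name is kept in `funcion`
  df = fromJSON(paste0(url1, "/", funcion, "/", id, "/", token1, "/", DimDate[i]))
  df.master <- bind_rows(df.master, df)
}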

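My problem is that the server sometimes hangs: it gets stuck at a particular i and then returns an error response, typically 400, 500 or 503. This kills the loop.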

iteration 232
iteration 233
iteration 234
Error in open.connection(con, "rb") : 
Failure when receiving data from the peer

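Can you suggest an improvement to the loop so that, if it stays stuck on a particular i for x minutes or an error is returned, it retries i a few times and then either stops (if there have been too many retries) or moves on to the next i (if the server returns correct data with no error)?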

Edit 1: my progress so far

max_retries <- 3
for (i in 1:length(DimDate)){
  message(paste("current:", i))
  retries <- 0L
  OK <- FALSE
  while(!OK){
    # Return the error object instead of aborting the loop
    df <- tryCatch(expr = { fromJSON(paste0(url1, "/",
                                            funcion, "/",
                                            nit, "/",
                                            token1, "/",
                                            DimDate[i])) },
                   error = function(e) { e })
    if(inherits(df, "error")){
      if(retries < max_retries){
        Sys.sleep(1)
        retries <- retries + 1L
        message(paste("   Retry:", retries))
      } else break  # leaves only the while loop; the for loop continues
    } else {OK <- TRUE}
  }
  if(OK) {
    df.master <- bind_rows(df.master, df)
    cat("Dim 'df.master':", dim(df.master), "\n")
  }
}

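As a test, I passed a broken function name in the GET request so that the server would return an error. The code tries 3 times on the current iteration and then, instead of stopping, skips to the next iteration: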

current: 1
Retry: 1
Retry: 2
Retry: 3
current: 2
Retry: 1
Retry: 2
Retry: 3
current: 3
Retry: 1
Retry: 2
Retry: 3
current: 4
Retry: 1
Retry: 2
>

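The question seems to be asking for a loop that calls tryCatch and Sys.sleep up to a maximum number of times whenever there is an error.
Maybe something like the following.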

df.master <- NULL
max_retries <- 2
for (i in seq_along(DimDate)){
  message(paste("Current:", i))
  retries <- 0L
  OK <- FALSE
  while(!OK){
    # On failure tryCatch returns the condition object rather than stopping
    df <- tryCatch(fromJSON(paste0(url1, "/", funcion, "/", id, "/", token1, "/", DimDate[i])),
                   error = function(e) e)
    if(inherits(df, "error")){
      message(df$message)
      if(retries < max_retries){
        Sys.sleep(1)
        retries <- retries + 1L
        message(paste("   Retry:", retries))
      } else break  # give up on this i and go on to the next one
    } else OK <- TRUE
  }
  if(OK) {
    df.master <- bind_rows(df.master, df)
    cat("Dim 'df.master':", dim(df.master), "\n")
  }
}
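
The loop above retries on errors and then skips to the next i, but it does not cover a request that hangs, nor does it stop the whole run. A minimal sketch of one way to handle both is given below, using base R's setTimeLimit. The helper name retry_call and its time_limit argument are hypothetical, not part of the original code. Note that such a limit is only checked at points where R can be interrupted, so it may not cut short a request that is blocked inside compiled code; treat that as something to verify against your own server.

```r
# Hypothetical helper: call f() under an optional elapsed-time limit and
# retry on error (including a time-limit error) up to max_retries times.
retry_call <- function(f, max_retries = 3, wait = 1, time_limit = Inf) {
  attempt <- function() {
    # Limit this attempt only; the limit is cleared again when attempt() exits
    setTimeLimit(elapsed = time_limit, transient = TRUE)
    on.exit(setTimeLimit(elapsed = Inf, transient = FALSE))
    tryCatch(f(), error = function(e) e)
  }
  res <- attempt()
  retries <- 0L
  while (inherits(res, "error") && retries < max_retries) {
    Sys.sleep(wait)
    retries <- retries + 1L
    res <- attempt()
  }
  res  # the value of f(), or the last error object if every attempt failed
}

# Stand-in for the server call: fails twice, then returns a data frame
calls <- 0L
flaky <- function() {
  calls <<- calls + 1L
  if (calls < 3L) stop("simulated server error")
  data.frame(x = 1)
}
res <- retry_call(flaky, max_retries = 3, wait = 0)
```

Inside the for loop, the call could be wrapped as `res <- retry_call(function() fromJSON(current_URL), time_limit = 60)`, where current_URL stands for the pasted URL. Testing `inherits(res, "error")` afterwards then decides the outcome: call `stop()` to abort the whole run, or `next` to skip to the following date.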

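If, instead of a loop, the code is rewritten as a function that returns both the data frames that could be read with GET and the errors, then the requests that gave errors can be retried later. The function below has an extra argument, wait, the time in seconds to wait before retrying, with a default of 1.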

retry_fromJSON <- function(DDate, URL, max_retries = 3, wait = 1, verbose = TRUE){
  out <- lapply(seq_along(DDate), function(i){
    if(verbose){
      message(paste("Current:", i))
    }
    current_URL <- paste(URL, DDate[i], sep = "/")
    retries <- 0L
    OK <- FALSE
    while(!OK){
      # On failure df holds the error object instead of a data frame
      df <- tryCatch(fromJSON(current_URL), error = function(e) e)
      if(inherits(df, "error")){
        if(verbose){
          message(df$message)
        }
        if(retries < max_retries){
          Sys.sleep(time = wait)
          retries <- retries + 1L
          if(verbose){
            message(paste("   Retry:", retries))
          }
        } else break
      } else {OK <- TRUE}
    }
    if(OK) {
      if(verbose){
        # Collapse the dimensions so that a single message is printed
        msg <- paste("Read df with dim:", paste(dim(df), collapse = " x "))
        message(msg)
      }
    }
    df
  })
  # Separate the successful reads from the error objects
  err <- vapply(out, inherits, logical(1), what = "error")
  df.master <- do.call(rbind.data.frame, out[!err])
  errors <- sapply(out[err], "[[", "message")
  list(df.master = df.master, which.err = which(err), errors = errors)
}
url2 <- paste(url1, funcion, id, token1, sep = "/")
result <- retry_fromJSON(DimDate, url2)
result$df.master    # the data.frame
result$errors       # the error messages
result$which.err    # DimDate indices that threw error

A second pass can now be made.

NewDimDate <- DimDate[result$which.err]
result <- retry_fromJSON(NewDimDate, url2)
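
If one extra pass is not enough, the same idea can be repeated until nothing fails or a pass limit is reached, keeping the rows from every pass. The sketch below is only an illustration: multi_pass and its fetch argument are hypothetical names, and fetch is assumed to return a list with df.master and which.err, as retry_fromJSON does. A stand-in fetcher replaces the real server here so the code runs on its own.

```r
# Hypothetical helper: re-run `fetch` on the dates that failed, up to
# max_passes times, and combine the rows read in every pass.
multi_pass <- function(dates, fetch, max_passes = 3) {
  collected <- list()
  pending <- dates
  for (pass in seq_len(max_passes)) {
    if (length(pending) == 0L) break
    res <- fetch(pending)
    collected[[length(collected) + 1L]] <- res$df.master
    pending <- pending[res$which.err]  # only the failures go to the next pass
  }
  list(df.master = do.call(rbind, collected), failed = pending)
}

# Stand-in fetcher: the second date fails on the first pass only
n_calls <- 0L
fake_fetch <- function(d) {
  n_calls <<- n_calls + 1L
  bad <- if (n_calls == 1L) which(d == "2020-01-02") else integer(0)
  good <- setdiff(seq_along(d), bad)
  list(df.master = data.frame(date = d[good], stringsAsFactors = FALSE),
       which.err = bad)
}
out <- multi_pass(c("2020-01-01", "2020-01-02", "2020-01-03"), fake_fetch)
```

With the real function, the call would be along the lines of `multi_pass(DimDate, function(d) retry_fromJSON(d, url2))`. Whatever is left in `failed` after the last pass still needs attention.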
