Append data.frame objects from one environment to the corresponding data.frame objects in GlobalEnv (or another environment) in R



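I have several existing data.frame objects that need to be updated from the Internet. Because the updates have the same names as those existing objects, I load the updates as data.frame objects into a separate environment.

The idea is then to append the updates to the existing data.frame objects. But I don't know how to do this iteratively (i.e. in a loop?) with rbind from one environment to GlobalEnv (or to another environment, for that matter).

Also, although I have not included them here, there will be several other data.frame objects (with other names) in GlobalEnv (or in whichever environment the data are loaded into).

The code below should be reproducible (with comments and links to sources):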

library(quantmod)
# Load ticker data from 2020-01-01 till 2021-02-02
tickers <- c("NKLA", "MPNGF", "RMO", "JD", "MSFT")
getSymbols.yahoo(tickers, auto.assign = TRUE, env = globalenv(),
                 from = "2020-01-01", to = "2021-02-02")
# Close all Internet connections as a precaution
# https://stackoverflow.com/a/52758758/2950721
closeAllConnections()
# Find xts objects
xtsObjects <- names(which(unlist(eapply(.GlobalEnv, is.xts))))
# Convert xts to data.frame
# https://stackoverflow.com/a/69246047/2950721
for (i in seq_along(xtsObjects)) {
  assign(xtsObjects[i], fortify.zoo(get(xtsObjects[i])))
}

# Redo the previous process but in separate environment for updated
# values of the same tickers (comments and sources are not repeated)
symbolUpdates.env <- new.env()
getSymbols.yahoo(tickers, auto.assign = TRUE, env = symbolUpdates.env,
                 from = "2021-02-03")
closeAllConnections()
symbolUpdatesXtsObjects <- names(which(unlist(eapply(symbolUpdates.env,
                                                     is.xts))))
for (i in seq_along(symbolUpdatesXtsObjects)) {
  assign(envir = symbolUpdates.env, symbolUpdatesXtsObjects[i],
         fortify.zoo(get(symbolUpdatesXtsObjects[i],
                         envir = symbolUpdates.env)))
}
# Find data.frame objects both in GlobalEnv and
# symbolUpdates.env
globalEnvDataframeObjects <- names(which(unlist(eapply(.GlobalEnv,
                                                       is.data.frame))))
symbolUpdatesDataframeObjects <- names(which(unlist(eapply(symbolUpdates.env,
                                                           is.data.frame))))

# This rbind definitely does not work!!!
for (i in seq_along(globalEnvDataframeObjects)) {
  rbind(envir = .GlobalEnv, globalEnvDataframeObjects[i], envir =
          symbolUpdates.env, symbolUpdatesDataframeObjects[i])
}

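My questions:

  • Preferably without any packages beyond base R, what code can iteratively append the symbolUpdatesDataframeObjects to the corresponding globalEnvDataframeObjects?
  • Would the code be the same if the globalEnvDataframeObjects were in another environment (i.e. not .GlobalEnv, but a "sub-environment" such as symbolUpdates.env)?
    • If not, what would change?
  • Is there a better or smarter approach than the one I am trying to use?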

Thanks in advance.


System:

  • R version: 4.1.1 (2021-08-10)
  • RStudio version: 1.4.1717
  • OS: macOS Catalina 10.15.7 and macOS Big Sur 11.6

You may need intersect here:

interObj <- intersect(globalEnvDataframeObjects, symbolUpdatesDataframeObjects)
nrow(get(interObj[1]))
# [1] 273
for (i in seq_along(interObj)) {
  # Look up the same name in both environments so the pairs always match
  assign(interObj[i],
         rbind(get(interObj[i], envir = .GlobalEnv),
               get(interObj[i], envir = symbolUpdates.env)),
         envir = .GlobalEnv)
}
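
Regarding the second question, the same loop can be wrapped in a small function so that the target environment becomes an argument instead of being hard-coded as .GlobalEnv. The following is only a minimal sketch on toy data (no Yahoo download); `append_updates`, `target_env`, `update_env`, `old_env` and `new_env` are hypothetical names introduced for illustration, and it assumes that same-named data frames have identical columns.

```r
# Hypothetical helper (not part of the original answer): append every
# data.frame in update_env to the same-named data.frame in target_env.
append_updates <- function(target_env, update_env) {
  # Names of all data.frame objects in an environment
  df_names <- function(env) {
    flags <- unlist(eapply(env, is.data.frame))
    names(flags)[flags]
  }
  common <- intersect(df_names(target_env), df_names(update_env))
  for (nm in common) {
    combined <- rbind(get(nm, envir = target_env, inherits = FALSE),
                      get(nm, envir = update_env, inherits = FALSE))
    assign(nm, combined, envir = target_env)
  }
  invisible(common)
}

# Toy data standing in for the downloaded tickers
old_env <- new.env()
new_env <- new.env()
assign("MSFT", data.frame(Index = as.Date("2021-02-01") + 0:1,
                          Close = c(1, 2)), envir = old_env)
assign("other", data.frame(x = 1), envir = old_env)  # has no update
assign("MSFT", data.frame(Index = as.Date("2021-02-03"),
                          Close = 3), envir = new_env)

append_updates(old_env, new_env)
nrow(get("MSFT", envir = old_env))
```

Passing `.GlobalEnv` as `target_env` would reproduce the loop above; objects without a counterpart in the update environment, such as `other` here, are left untouched.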

If you need to store the data.frames in multiple environments, use the following:

# Install packages if they are not already installed: necessary_packages => vector
necessary_packages <- c("quantmod")
# Create a vector containing the names of any packages needing installation:
# new_packages => vector
new_packages <- necessary_packages[!(necessary_packages %in%
  installed.packages()[, "Package"])]
# If the vector has more than 0 values, install the new packages
# (and their associated dependencies):
if(length(new_packages) > 0){
install.packages(
new_packages, 
dependencies = TRUE
)
}
# Initialise the packages in the session: list of boolean => stdout (console)
lapply(
necessary_packages, 
require, 
character.only = TRUE
)
# Load ticker data from 2020-01-01 till 2021-02-02
tickers <- c(
"NKLA", 
"MPNGF", 
"RMO", 
"JD", 
"MSFT"
)
# Create a new environment: environment => symbolUpdates.env
symbolUpdates.env <- new.env()
# Create a vector of from dates: from_dates => Date Vector
from_dates <- as.Date(
c(
"2020-01-01", 
"2021-02-03"
)
)
# Create a vector of to dates:
to_dates <- as.Date(
c(
"2021-02-02", 
format(
Sys.Date(),
"%Y-%m-%d"
)
)
)
# Create a list of environments: env_vec => list of environments
env_vec <- c(
.GlobalEnv, 
symbolUpdates.env
)
# Function to retrieve ticker as a data.frame:
# retrieve_ticker_df => function()
retrieve_ticker_df <- function(ticker_vec, from_date, to_date){

# Create a list of size length(tickers):
# df_list => empty list
df_list <- vector(
"list", 
length(ticker_vec)
)

# Store each ticker's response as a data.frame in the list:
# df_list => list of data.frames
df_list <- setNames(
lapply(
seq_along(ticker_vec),
function(i){
# Retrieve the data.frame: tmp => data.frame
tmp <- getSymbols.yahoo(
ticker_vec[i],
auto.assign = FALSE, 
from = from_date,
to = to_date,
return.class = 'data.frame'
)

# Close all Internet connections as a precaution
# https://stackoverflow.com/a/52758758/2950721
closeAllConnections()

# Create a data.frame and revert index to sequential
# integers: data.frame => env
data.frame(
cbind(
date = as.Date(
row.names(
tmp
)
),
tmp
),
row.names = NULL
)
}
),
ticker_vec
)
# Explicitly define returned object: list of data.frames => env
return(df_list)
}
# Store all the data.frames in a list of data.frames, 
# store each list of data.frames in a list: 
# ticker_df_list_list => list of list of data.frames
ticker_df_list_list <- lapply(
seq_along(env_vec),
function(i){
retrieve_ticker_df(
tickers, 
from_dates[i], 
to_dates[i]
)
}
)
# Push each of the lists to the appropriate environment: 
# data.frames => env
lapply(
seq_along(ticker_df_list_list),
function(i){
list2env(
ticker_df_list_list[[i]],
envir = env_vec[[i]]
)
}
)
# Initialise an empty list to create some memory
# bound_df_list => empty list
bound_df_list <- vector(
"list", 
length(tickers)
)
# Allocate some memory by initialising an
# empty list: ir_list => list
ir_list <- vector(
"list",
length(env_vec) * length(tickers)
)
# Unlist the env_vec, and retrieve the ticker
# data.frames: ir_list => list of data.frames
ir_list <- unlist(
lapply(
env_vec,
function(x){
mget(
tickers, 
envir = x
)
}
),
recursive = FALSE
)
# Split-apply-combine based on the 
# data.frame names: bound_df_list => list of data.frames
bound_df_list <- lapply(
split(
ir_list,
names(ir_list)
),
function(x){
do.call(
rbind, 
x
)
}
)
# Clear up the intermediate objects:
rm(ticker_df_list_list, ir_list, env_vec); gc()
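
The last steps of the code above (collect same-named data frames from several environments with mget, then split by name and rbind) may be easier to follow on toy data. This is a reduced sketch that needs no download; `env_a`, `env_b`, `wanted`, `pieces` and `bound` are hypothetical names used only for this illustration.

```r
# Two environments holding same-named toy data.frames
env_a <- new.env()
env_b <- new.env()
list2env(list(JD = data.frame(x = 1:2), MSFT = data.frame(x = 3:4)),
         envir = env_a)
list2env(list(JD = data.frame(x = 5L), MSFT = data.frame(x = 6L)),
         envir = env_b)

wanted <- c("JD", "MSFT")

# Fetch the named objects from each environment and flatten one level,
# giving a list whose names repeat once per environment
pieces <- unlist(
  lapply(list(env_a, env_b), function(e) mget(wanted, envir = e)),
  recursive = FALSE
)

# Group the pieces by name and row-bind each group
bound <- lapply(
  split(pieces, names(pieces)),
  function(x) do.call(rbind, unname(x))
)

bound$JD
```

Because the environments are processed in order, the rows from `env_a` come before those from `env_b` within each combined data frame.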

If using multiple environments is not mandatory:

# Install packages if they are not already installed: necessary_packages => vector
necessary_packages <- c("quantmod")
# Create a vector containing the names of any packages needing installation:
# new_packages => vector
new_packages <- necessary_packages[!(necessary_packages %in%
  installed.packages()[, "Package"])]
# If the vector has more than 0 values, install the new packages
# (and their associated dependencies):
if(length(new_packages) > 0){
install.packages(
new_packages, 
dependencies = TRUE
)
}
# Initialise the packages in the session: list of boolean => stdout (console)
lapply(
necessary_packages, 
require, 
character.only = TRUE
)
# Load ticker data from 2020-01-01 till 2021-02-02
tickers <- c(
"NKLA", 
"MPNGF", 
"RMO", 
"JD", 
"MSFT"
)
# Create a new environment: environment => symbolUpdates.env
symbolUpdates.env <- new.env()
# Create a vector of from dates: from_dates => Date Vector
from_dates <- as.Date(
c(
"2020-01-01", 
"2021-02-03"
)
)
# Create a vector of to dates:
to_dates <- as.Date(
c(
"2021-02-02", 
format(
Sys.Date(),
"%Y-%m-%d"
)
)
)
# Function to retrieve ticker as a data.frame:
# retrieve_ticker_df => function()
retrieve_ticker_df <- function(ticker_vec, from_date, to_date){
# Create a list of size length(tickers):
# df_list => empty list
df_list <- vector(
"list", 
length(ticker_vec)
)

# Store each ticker's response as a data.frame in the list:
# df_list => list of data.frames
df_list <- setNames(
lapply(
seq_along(ticker_vec),
function(i){
# Retrieve the data.frame: tmp => data.frame
tmp <- getSymbols.yahoo(
ticker_vec[i],
auto.assign = FALSE, 
from = from_date,
to = to_date,
return.class = 'data.frame'
)

# Close all Internet connections as a precaution
# https://stackoverflow.com/a/52758758/2950721
closeAllConnections()

# Create a data.frame and revert index to sequential
# integers: data.frame => env
data.frame(
cbind(
date = as.Date(
row.names(
tmp
)
),
tmp
),
row.names = NULL
)
}
),
ticker_vec
)
# Explicitly define returned object: list of data.frames => env
return(df_list)
}
# Store all the data.frames in a list of data.frames, 
# store each list of data.frames in a list: 
# ticker_df_list_list => list of list of data.frames
ticker_df_list_list <- lapply(
seq_along(from_dates),
function(i){
retrieve_ticker_df(
tickers, 
from_dates[i], 
to_dates[i]
)
}
)
# Initialise an empty list to create some memory:
# ir_list => empty list
ir_list <- vector(
"list",
length(tickers) * length(from_dates)
)
# Populate the list with each of the named data.frames: 
# ir_list => list of data.frames
ir_list <- unlist(
ticker_df_list_list, 
recursive = FALSE
)
# Initialise an empty list to create some memory
# bound_df_list => empty list
bound_df_list <- vector(
"list", 
length(tickers)
)
# Split-apply-combine: bound_df_list => list of data.frames
bound_df_list <- lapply(
split(
ir_list,
names(ir_list)
),
function(x){
do.call(
rbind, 
x
)
}
)
# Clear up the intermediate objects:
rm(ticker_df_list_list, ir_list); gc()
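
As a possible answer to the question of whether there is a more sensible approach: keeping the data frames in named lists, rather than as loose objects in environments, lets Map pair old and new data by name. The sketch below uses toy data and hypothetical names (`old_list`, `new_list`, `common`, `combined`); it also drops repeated dates, on the assumption that the old and new date ranges might overlap.

```r
# Toy stand-ins for the existing data and the updates
old_list <- list(
  JD   = data.frame(date = as.Date("2021-02-01") + 0:1, close = c(10, 11)),
  MSFT = data.frame(date = as.Date("2021-02-01") + 0:1, close = c(20, 21))
)
new_list <- list(
  MSFT = data.frame(date = as.Date("2021-02-02") + 0:1, close = c(21, 22)),
  JD   = data.frame(date = as.Date("2021-02-03"), close = 12)
)

# Only combine tickers present in both lists; indexing by name makes
# the order of the two lists irrelevant
common <- intersect(names(old_list), names(new_list))

combined <- Map(
  function(old, new) {
    out <- rbind(old, new)
    out <- out[!duplicated(out$date), , drop = FALSE]  # drop overlapping days
    out <- out[order(out$date), , drop = FALSE]        # keep chronological order
    rownames(out) <- NULL
    out
  },
  old_list[common],
  new_list[common]
)

combined$MSFT
```

If the objects must end up in an environment afterwards, `list2env(combined, envir = some_env)` would push them there, where `some_env` is whichever environment is wanted.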
