I want to apply the cleanfunction below to every data frame in my environment.
cleanfunction <- function(dataframe) {
dataframe <- as.data.frame(dataframe)
## get mode of all vars
var_mode <- sapply(dataframe, mode)
## produce error if complex or raw is found
if (any(var_mode %in% c("complex", "raw"))) stop("complex or raw not allowed!")
## get class of all vars
var_class <- sapply(dataframe, class)
## produce error if an "AsIs" object has "logical" or "character" mode
if (any(var_mode[var_class == "AsIs"] %in% c("logical", "character"))) {
stop("matrix variables with 'AsIs' class must be 'numeric'")
}
## identify columns that need to be coerced to factors
ind1 <- which(var_mode %in% c("logical", "character"))
## coerce logical / character to factor with `as.factor`
dataframe[ind1] <- lapply(dataframe[ind1], as.factor)
return(dataframe)
}
library(data.table)
set.seed(10238)
DT = data.table(
A = rep(1:3, each = 5L),
B = rep(1:5, 3L),
C = sample(15L),
D = sample(15L)
)
DT_II <- copy(DT)
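## keep only the names of data frames (a bare ls() would also return "cleanfunction")
dfs <- Filter(function(nm) is.data.frame(get(nm)), ls())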
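Now I want to apply this function to every data frame in the environment. I have tried about ten different approaches, but I cannot get the syntax right.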
for (i in seq_along(dfs)) {
get(dfs[i])[ , lapply(.SD, cleanfunction)]
}
Edit: I found this solution, but it does not store the results.
eapply(globalenv(), function(x) if (is.data.frame(x)) cleanfunction(x))
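How can I store the result back in each object?
Your get(dfs[i]) returns a reference to the data.table, but you then lapply over each column of that frame, whereas I infer from the function argument dataframe that the function expects a whole frame. You could start with this: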
for (i in seq_along(dfs)) {
get(dfs[i])[ , cleanfunction(.SD)]
}
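Be aware, though, that this operation returns a new frame; it does not use the canonical data.table mechanisms to update the data in place. I suggest you change your function so that it always coerces to data.table and works by reference.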
cleanfunction <- function(dataframe) {
setDT(dataframe)
## get mode of all vars
var_mode <- sapply(dataframe, mode)
## produce error if complex or raw is found
if (any(var_mode %in% c("complex", "raw"))) stop("complex or raw not allowed!")
## get class of all vars
var_class <- sapply(dataframe, class)
## produce error if an "AsIs" object has "logical" or "character" mode
if (any(var_mode[var_class == "AsIs"] %in% c("logical", "character"))) {
stop("matrix variables with 'AsIs' class must be 'numeric'")
}
## identify columns that need to be coerced to factors
ind1 <- which(var_mode %in% c("logical", "character"))
## coerce logical / character to factor with `as.factor`
if (length(ind1)) dataframe[, c(ind1) := lapply(.SD, as.factor), .SDcols = ind1]
return(dataframe)
}
Since your current data would not trigger any changes, I'll add a character column to one of the tables:
DT[,quux:="A"]
head(DT)
# A B C D quux
# <int> <int> <int> <int> <char>
# 1: 1 1 12 15 A
# 2: 1 2 4 6 A
# 3: 1 3 5 7 A
# 4: 1 4 9 1 A
# 5: 1 5 6 14 A
# 6: 2 1 15 13 A
for (i in seq_along(dfs)) cleanfunction(get(dfs[i]))
head(DT)
# A B C D quux
# <int> <int> <int> <int> <fctr>
# 1: 1 1 12 15 A
# 2: 1 2 4 6 A
# 3: 1 3 5 7 A
# 4: 1 4 9 1 A
# 5: 1 5 6 14 A
# 6: 2 1 15 13 A
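Note that the for loop relies entirely on updates by reference; the return value of cleanfunction is ignored here.
This approach works only because of data.table's reference semantics; if you use data.frame or tbl_df instead, you would probably need to wrap the call to cleanfunction(.) in assign(dfs[i], cleanfunction(..)).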
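For plain data.frame objects, the assign() wrapping mentioned above could look roughly like the sketch below. It is only an illustration under a few assumptions: clean_df is a hypothetical, trimmed-down base-R stand-in for cleanfunction (the complex/raw and AsIs checks are left out), and a separate environment env with made-up objects is used so the example does not touch the global environment.

```r
# Hypothetical, simplified stand-in for cleanfunction (base R only).
clean_df <- function(dataframe) {
  dataframe <- as.data.frame(dataframe)
  var_mode <- sapply(dataframe, mode)
  ind1 <- which(var_mode %in% c("logical", "character"))
  # coerce logical / character columns to factor, if there are any
  if (length(ind1)) dataframe[ind1] <- lapply(dataframe[ind1], as.factor)
  dataframe
}

# Made-up objects in a scratch environment (an assumption for this sketch).
env <- new.env()
assign("df_a", data.frame(x = 1:3, y = c("a", "b", "c"),
                          stringsAsFactors = FALSE), envir = env)
assign("df_b", data.frame(flag = c(TRUE, FALSE)), envir = env)
assign("not_a_df", 42, envir = env)

# A data.frame is copied on modification, so the cleaned copy
# has to be written back under the same name.
for (nm in ls(env)) {
  obj <- get(nm, envir = env)
  if (is.data.frame(obj)) assign(nm, clean_df(obj), envir = env)
}

str(env$df_a)
```

To run the same loop against the global environment, replace env with globalenv().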
Does this work for you?
# store all data frames from the environment in a list
dfs <- Filter(function(x) is(x, "data.frame"), mget(ls()))
# then apply your function
lapply(dfs, cleanfunction)
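As written, the lapply() call only returns the cleaned frames; with a function that returns a modified copy, nothing is stored back. One possible way to store them is to keep the result as a named list and write it back with list2env(). This is a sketch under assumptions: clean_df is a hypothetical, simplified stand-in for cleanfunction, and env is a scratch environment with made-up objects.

```r
# Hypothetical, simplified stand-in for cleanfunction (base R only).
clean_df <- function(dataframe) {
  dataframe <- as.data.frame(dataframe)
  var_mode <- sapply(dataframe, mode)
  ind1 <- which(var_mode %in% c("logical", "character"))
  if (length(ind1)) dataframe[ind1] <- lapply(dataframe[ind1], as.factor)
  dataframe
}

# Made-up objects in a scratch environment (an assumption for this sketch).
env <- new.env()
assign("df_a", data.frame(id = 1:2, label = c("u", "v"),
                          stringsAsFactors = FALSE), envir = env)
assign("some_number", 3.14, envir = env)

# Collect the data frames as a named list, clean them, then write
# each element back to the environment under its original name.
dfs_list <- Filter(is.data.frame, mget(ls(env), envir = env))
cleaned <- lapply(dfs_list, clean_df)
list2env(cleaned, envir = env)

str(env$df_a)
```

Because mget() returns a named list and Filter() and lapply() preserve those names, list2env() overwrites each original object with its cleaned version and leaves non-data-frame objects alone.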