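Suppose I have a reference data frame named a. How can I automatically add any variables that exist in a but are missing from the other data.frames b and d?
Note: My goal is to wrap this in a function, so that it works for any number of data.frames and any number of variables, based on a single reference data frame.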
a <- data.frame(x = 2:3, y = 4:5, z = c(T, F)) ## reference data.frame
b <- data.frame(x = 6:7) ## Add y and z here
d <- data.frame(x = 7:8) ## Add y and z here
Assuming all the data frames involved have the same number of rows, you can simply do:
toadd<-setdiff(colnames(a),colnames(b))
b[toadd]<-a[toadd]
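Wrap the above in a function:
f <- function(refdf, ...) {
  # Collect the data.frames passed after the reference
  listdf <- list(...)
  res <- lapply(listdf, function(x) {
    # Columns present in the reference but missing from x
    toadd <- setdiff(names(refdf), names(x))
    # Copy those columns (values included) from the reference
    x[toadd] <- refdf[toadd]
    x
  })
  # Return the reference followed by the completed data.frames
  c(list(refdf), res)
}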
Then try, for example:
f(a,b)
f(a,b,d)
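The function above copies the reference values and relies on every data frame having the same number of rows as the reference. If only the columns are wanted, not the values, a possible variant is sketched below. It fills each missing column with NA of the same type as the reference column, so the row counts no longer need to match. The name add_missing_cols and the data frame b3 are hypothetical and used only for illustration.

```r
# Hypothetical helper (not part of the answer above): add the columns of
# `refdf` that are missing from each data.frame, filled with typed NA.
add_missing_cols <- function(refdf, ...) {
  lapply(list(...), function(x) {
    toadd <- setdiff(names(refdf), names(x))
    for (nm in toadd) {
      # Indexing with NA_integer_ yields a single NA that keeps the column's type
      x[[nm]] <- rep(refdf[[nm]][NA_integer_], nrow(x))
    }
    x
  })
}

a  <- data.frame(x = 2:3, y = 4:5, z = c(TRUE, FALSE))  # reference data.frame
b3 <- data.frame(x = 6:8)                               # three rows, unlike a

res <- add_missing_cols(a, b3)
str(res[[1]])
```

Here y stays an integer column and z a logical column, both entirely NA.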
# Using a reference data.frame, perform a right join in order
# to append the required vectors to the provided data.frames:
add_empty_vecs <- function(refdf, ...){
  # Capture the names of the other data.frames: df_names => list of symbols
  # (setNames() coerces them to a character vector)
  df_names <- as.list(substitute(list(...)))[-1L]
  # Right join the reference data.frame to each of the
  # provided data.frames and return them: => named list
  setNames(lapply(list(...), function(y){
      merge(refdf, y, by = intersect(names(refdf), names(y)), all.y = TRUE)
    }
  ), c(df_names))
}
# Apply the function to df b only:
add_empty_vecs(a, b)
# Apply function to both df b & df d:
add_empty_vecs(a, b, d)
# Apply function to all b, d, e:
add_empty_vecs(a, b, d, e)
Data:
a <- data.frame(x = 2:3, y = 4:5, z = c(T, F)) ## reference data.frame
b <- data.frame(x = 6:7) ## Add y and z here
d <- data.frame(x = 7:8) ## Add y and z here
e <- data.frame(x = 9:10)
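One point about the merge-based approach is worth checking on a small case. A right join fills the added columns with NA only for rows whose key is absent from the reference. Rows whose key does match pick up the reference values instead. The sketch below uses a hypothetical data frame b2, which shares one x value with a.

```r
a  <- data.frame(x = 2:3, y = 4:5, z = c(TRUE, FALSE))  # reference data.frame
b2 <- data.frame(x = c(3L, 6L))                         # x = 3 also occurs in a

# Same call that add_empty_vecs() makes for each data.frame
out <- merge(a, b2, by = intersect(names(a), names(b2)), all.y = TRUE)
out
```

The row with x = 3 receives y = 5 and z = FALSE from a, while the row with x = 6 gets NA in both columns.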