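I have a large dataset in R in which one column (target) holds, in each row, the name of another column, as shown below. For each row, I want to look up the column named in target, copy the value from that column, and write it back into the same row of the "target" column.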
target col1 col2 col3
col1 green one dog
col1 pink two cat
col3 blue three spider
col2 black four pig
col3 purple five elephant
col2 yellow six lion
This is the result I want:
target col1 col2 col3
green green one dog
pink pink two cat
spider blue three spider
four black four pig
elephant purple five elephant
six yellow six lion
I tried the code below, but after the first few hundred rows it copies the wrong values.
df$target[df$target=="[col1]"] <- df$col1
df$target[df$target=="[col2]"] <- df$col2
df$target[df$target=="[col3]"] <- df$col3
df$target[df$target=="[col4]"] <- df$col4
Any suggestions? Thanks in advance.
You can use sapply:
df$test <- sapply(1:nrow(df),
function(x) df[x, names(df) == df[x,1]])
Output:
# target col1 col2 col3 test
# 1 col1 green one dog green
# 2 col1 pink two cat pink
# 3 col3 blue three spider spider
# 4 col2 black four pig four
# 5 col3 purple five elephant elephant
# 6 col2 yellow six lion six
Note that I created a test column to check the result, but you can overwrite target directly.
Data:
df <- read.table(text = "target col1 col2 col3
col1 green one dog
col1 pink two cat
col3 blue three spider
col2 black four pig
col3 purple five elephant
col2 yellow six lion", header = TRUE)
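As a side note on why the attempt in the question probably misbehaves: in `df$target[mask] <- df$col1` the left side is a subset but the right side is the whole column, so R fills the selected rows with the first values of col1 (and warns that the lengths do not match) rather than the values from the matching rows. A minimal sketch of a fix, assuming the columns are character rather than factor, applies the same mask to both sides. The names `key`, `nm`, and `rows` are made up for illustration.

```r
# Sample data from the question; keep strings as character, not factor
df <- read.table(text = "target col1 col2 col3
col1 green one dog
col1 pink two cat
col3 blue three spider
col2 black four pig
col3 purple five elephant
col2 yellow six lion", header = TRUE, stringsAsFactors = FALSE)

# Keep the original column names, since target is overwritten in the loop
key <- df$target

for (nm in unique(key)) {
  rows <- key == nm                  # rows whose target names column nm
  df$target[rows] <- df[[nm]][rows]  # same mask on both sides, so lengths match
}

df$target
# [1] "green"    "pink"     "spider"   "four"     "elephant" "six"
```

The loop runs once per distinct column name rather than once per row, so it should stay reasonably fast on a large data frame.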
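Base vectorized approach: if you pass a two-column matrix to "[", you can extract the values at the given row and column positions.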
dat <- read.table(text="target col1 col2 col3
col1 green one dog
col1 pink two cat
col3 blue three spider
col2 black four pig
col3 purple five elephant
col2 yellow six lion", head=TRUE)
# extract the column number from the first column and add one to offset it
# then use that as the column number in the two-column index matrix
dat[ matrix( c( 1:nrow(dat), as.numeric(gsub("col", "", dat[[1]]))+1 ), ncol=2) ]
[1] "green" "pink" "spider" "four" "elephant" "six"
Then assign it to the "target" column.
> dat[, 1] <- dat[ matrix( c( 1:nrow(dat), as.numeric(gsub("col", "", dat[[1]]))+1 ), ncol=2) ]
> dat
target col1 col2 col3
1 green green one dog
2 pink pink two cat
3 spider blue three spider
4 four black four pig
5 elephant purple five elephant
6 six yellow six lion
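The gsub() step above relies on the columns being named col1, col2, ... and sitting in that order right after target. If that does not hold in the real data, a variant of the same matrix-indexing idea is to look up each column position by name with match(). This is a sketch under the assumption that every value in target is an actual column name. The names `col_idx` and `idx` are illustrative only.

```r
dat <- read.table(text = "target col1 col2 col3
col1 green one dog
col1 pink two cat
col3 blue three spider
col2 black four pig
col3 purple five elephant
col2 yellow six lion", header = TRUE, stringsAsFactors = FALSE)

# Position of the column named in each row's target, looked up by name
col_idx <- match(dat$target, names(dat))

# Two-column index matrix: one (row number, column number) pair per row
idx <- cbind(seq_len(nrow(dat)), col_idx)

# Matrix indexing pulls one cell per row; overwrite target with the result
dat$target <- dat[idx]

dat$target
# [1] "green"    "pink"     "spider"   "four"     "elephant" "six"
```

It may be worth checking `anyNA(col_idx)` first, since a target value that is not a column name has no match. Also note that indexing a data frame with a matrix goes through a matrix conversion, so with mixed column types the extracted values come back as character.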