My R code:
data <- read.csv('filename.csv')
typeof(data)
[1] "list"
str(data)
'data.frame': 9 obs. of 10 variables:
$Name: Factor w/9 levels "Name 1", "Name 2",....
$Column2: chr "","Text1","","Text2"
$Column3: chr "Text2","Text3","","Text1"
$Column4: chr "","","","Text1"
#and so on
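Requirement:
In $Column2, $Column3, $Column4, ... I want to add a prefix (Here this is) and a suffix (completed.) to every value that is not blank. So, given the data above, the second row of Column2, which currently holds "Text1", should become "Here this is Text1 completed." Similarly, in Column3 the 1st, 2nd and 4th cells need the prefix and suffix added.
I do not want to use a loop unless it is absolutely necessary.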
My attempts:
I tried a few things, such as interaction, mget and append, among others, but none of them seemed to work.
I would vectorize it as follows:
indx <- which(data[, -1] != "", arr.ind = TRUE) # Find all non-empty incidences
data[, -1][indx] <- paste("Here this is", data[, -1][indx], "completed.")
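To see what the index matrix looks like, here is a minimal sketch on a small made-up data frame (toy and its contents are hypothetical, chosen only for illustration). which(..., arr.ind = TRUE) returns one row/column pair per non-empty cell, and that two-column matrix is then used to index the data frame directly.

```r
# Hypothetical toy data; stringsAsFactors = FALSE keeps the text columns as character
toy <- data.frame(
  Name    = c("Name 1", "Name 2", "Name 3"),
  Column2 = c("", "Text1", "Text2"),
  Column3 = c("Text1", "", "Text2"),
  stringsAsFactors = FALSE
)

# One (row, col) pair per non-empty cell; columns are counted within toy[, -1]
indx <- which(toy[, -1] != "", arr.ind = TRUE)
print(indx)

# Matrix indexing touches only those cells; blank cells are left as they are
toy[, -1][indx] <- paste("Here this is", toy[, -1][indx], "completed.")
print(toy)
```

Note that this relies on the selected columns being character; factor columns would need as.character first.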
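This will work for columns 2 through 4:
apply(data[, 2:4], 2, function(x) ifelse(x != "", paste("Here this is", x, "completed."), x))
This assumes the prefix and suffix are the same for every column. It returns a matrix, but converting that back to a data frame is easy. Hope it helps.
Edit: I just realized that your data is a list, so you need lapply or sapply. Something like:
sapply(data, function(x) ifelse(x != "", paste("Here this is", x, "completed."), x))[, 2:4]
This also returns a matrix.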
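Because apply and sapply both return a matrix, one way to stay in a data frame is to assign the result of lapply back into the selected columns. A minimal sketch, assuming the target columns are character (the toy data and the names cols and add_affix are made up for illustration):

```r
toy <- data.frame(
  Name    = c("Name 1", "Name 2"),
  Column2 = c("", "Text1"),
  Column3 = c("Text2", ""),
  stringsAsFactors = FALSE
)

cols <- c("Column2", "Column3")  # hypothetical: the columns to modify

# Hypothetical helper: wrap non-empty strings, return empty ones unchanged
add_affix <- function(x) ifelse(x != "", paste("Here this is", x, "completed."), x)

# lapply returns a list with one element per column; assigning it into
# toy[cols] replaces those columns and leaves the object a data frame
toy[cols] <- lapply(toy[cols], add_affix)
str(toy)
```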
Here is an option using set from data.table, which assigns in place without copying:
library(data.table)
setDT(data)
for (j in 2:ncol(data)) {
  i <- which(data[[j]] != "")  # rows with non-empty text in column j
  set(data, i = i, j = j,
      value = paste("Here there is", data[[j]][i], "completed."))
}
data
#     Name                        Column2                        Column3
#1: Name 1                                Here there is Text1 completed.
#2: Name 2 Here there is Text1 completed.
#3: Name 3 Here there is Text2 completed. Here there is Text2 completed.
#4: Name 4                                Here there is Text3 completed.
Data:
data <- structure(list(Name = structure(1:4, .Label = c("Name 1", "Name 2",
"Name 3", "Name 4"), class = "factor"), Column2 = c("", "Text1",
"Text2", ""), Column3 = c("Text1", "", "Text2", "Text3")), .Names = c("Name",
"Column2", "Column3"), row.names = c(NA, -4L), class = "data.frame")
Using the lapply function (which is also a loop):
# dummy data
df1 <- mtcars[1:5, 1:3]
# add blanks
df1[2,2] <- ""
df1
#                    mpg cyl disp
# Mazda RX4         21.0   6  160
# Mazda RX4 Wag     21.0      160
# Datsun 710        22.8   4  108
# Hornet 4 Drive    21.4   6  258
# Hornet Sportabout 18.7   8  360
# add prefix and suffix
res <- cbind(df1[, 1, drop = FALSE],
             data.frame(
               lapply(df1[, -1], function(i)
                 ifelse(i == "", i, paste("Here this is", i, "completed.")))))
res
#                    mpg                       cyl                        disp
# Mazda RX4         21.0 Here this is 6 completed. Here this is 160 completed.
# Mazda RX4 Wag     21.0                           Here this is 160 completed.
# Datsun 710        22.8 Here this is 4 completed. Here this is 108 completed.
# Hornet 4 Drive    21.4 Here this is 6 completed. Here this is 258 completed.
# Hornet Sportabout 18.7 Here this is 8 completed. Here this is 360 completed.
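If the columns may contain NA as well as empty strings, ifelse passes the NA through unchanged, which is usually what you want but is worth making explicit. Below is a sketch of a small helper that uses logical subsetting instead of ifelse and skips both NA and "" cells. The names add_affix, prefix and suffix, and the example data, are hypothetical and not taken from the answers above.

```r
# Hypothetical helper: wrap only cells that are neither NA nor empty
add_affix <- function(x, prefix = "Here this is", suffix = "completed.") {
  x <- as.character(x)            # also copes with factor or numeric columns
  keep <- !is.na(x) & nzchar(x)   # TRUE where there is real text
  x[keep] <- paste(prefix, x[keep], suffix)
  x
}

# Made-up data with both blanks and missing values
df1 <- data.frame(
  id = 1:4,
  a  = c("Text1", "", NA, "Text2"),
  b  = c("", "Text3", "Text4", NA),
  stringsAsFactors = FALSE
)

# Apply to every column except the first and keep the data frame structure
df1[-1] <- lapply(df1[-1], add_affix)
df1
```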
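# Prefix/suffix the non-blank cells of Column2 only, then rebuild the data frame
Column2 <- ifelse(data$Column2 == "", data$Column2, sprintf("Here this is %s completed.", data$Column2))
DF <- data.frame(Name = data$Name, Column2 = Column2, Column3 = data$Column3)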