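I have a list of 90 names that I want to split into subsets using a loop. I am selecting names based on a pattern, but I'm not sure how to create the object names inside a loop. I tried the assign() function earlier, but it created values (in backticks) rather than objects. Thanks!
The list has 90 names, and each sample name is repeated 5 times, so there are 18 samples in total with 5 files each. I want to create one object per sample that holds the names belonging to that sample, i.e. a list of 5 items. So I want to write a loop instead of copy-pasting sample.1 = sample.names.dilutions[grep("Sample 1_", sample.names.dilutions)] 18 times. I hope this makes sense?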
#list
>sample.names.dilutions
> length(sample.names.dilutions)
[1] 90
#names in list
> sample.names.dilutions[1:20]
[1] "New AS Plate 21_AS Plate_Sample 1_100.fcs" "New AS Plate 21_AS Plate_Sample 1_25.fcs"
[3] "New AS Plate 21_AS Plate_Sample 1_250.fcs" "New AS Plate 21_AS Plate_Sample 1_50.fcs"
[5] "New AS Plate 21_AS Plate_Sample 1_500.fcs" "New AS Plate 21_AS Plate_Sample 10_100.fcs"
[7] "New AS Plate 21_AS Plate_Sample 10_25.fcs" "New AS Plate 21_AS Plate_Sample 10_250.fcs"
[9] "New AS Plate 21_AS Plate_Sample 10_50.fcs" "New AS Plate 21_AS Plate_Sample 10_500.fcs"
[11] "New AS Plate 21_AS Plate_Sample 11_100.fcs" "New AS Plate 21_AS Plate_Sample 11_25.fcs"
[13] "New AS Plate 21_AS Plate_Sample 11_250.fcs" "New AS Plate 21_AS Plate_Sample 11_50.fcs"
[15] "New AS Plate 21_AS Plate_Sample 11_500.fcs" "New AS Plate 21_AS Plate_Sample 12_100.fcs"
[17] "New AS Plate 21_AS Plate_Sample 12_25.fcs" "New AS Plate 21_AS Plate_Sample 12_250.fcs"
[19] "New AS Plate 21_AS Plate_Sample 12_50.fcs" "New AS Plate 21_AS Plate_Sample 12_500.fcs"
#function i want to create with loop
> sample.1 = sample.names.dilutions[grep("Sample 1_", sample.names.dilutions)]
> length(sample.1)
[1] 5
> sample.1
[1] "New AS Plate 21_AS Plate_Sample 1_100.fcs" "New AS Plate 21_AS Plate_Sample 1_25.fcs"
[3] "New AS Plate 21_AS Plate_Sample 1_250.fcs" "New AS Plate 21_AS Plate_Sample 1_50.fcs"
[5] "New AS Plate 21_AS Plate_Sample 1_500.fcs"
> #i have 18 different samples and want to assign value and subset according to sample name
> for(i in 1:18) {
+ print(sample.names[i], quote=FALSE) = sample.names.dilutions[grep(paste0("Sample ",i,"_"), sample.names.dilutions)]}
Error in print(sample.names[i], FALSE) <- sample.names.dilutions[grep(paste0("Sample ", :
could not find function "print<-"
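I think I understand now; thanks for clarifying your question in the comments. If I've missed anything, or if you have any questions, let me know.
Terminology, quickly
I believe you're interested in splitting a vector of strings into several shorter vectors of strings, based on a pattern in each element. A list is just a vector of vectors.
g is a vector with 20 string elements (see the data code block below).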
is.vector(g)
#> [1] TRUE
And this is a list that contains just one vector.
str(list(g))
#> List of 1
#> $ : chr [1:20] "New AS Plate 21_AS Plate_Sample 12_50.fcs" "New AS Plate 21_AS Plate_Sample 1_100.fcs" "New AS Plate 21_AS Plate_Sample 1_25.fcs" "New AS Plate 21_AS Plate_Sample 1_250.fcs" ...
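To make the vector/list distinction a bit more concrete, here is a minimal sketch. The vectors v and w and their contents are made up for illustration; they are not part of your data.

```r
# Hypothetical example data, not taken from the question
v <- c("Sample 1_100.fcs", "Sample 1_25.fcs")
w <- c("Sample 2_100.fcs", "Sample 2_25.fcs", "Sample 2_250.fcs")

# A list can hold vectors of different lengths as separate, named elements
l <- list(first = v, second = w)

is.list(v)      # FALSE: an atomic character vector is not a list
is.list(l)      # TRUE
lengths(l)      # number of strings stored in each element
l[["second"]]   # pull one element back out as a plain character vector
```

Double brackets ([[ ]]) return the element itself, while single brackets would return a shorter list.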
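Now on to the problem…
In your question, you specifically asked about using assign(). While assign() is convenient, it is generally not recommended [1]. But sometimes you gotta do what you gotta do, and there's no shame in that. Here is how you would use it manually, one group at a time (as you showed in your question).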
# Using assign() one group at a time
h <- g[grep("Sample 1_", g)]
assign(x = "sample_1_group", value = h)
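As a quick check of what assign() actually did, the sketch below uses a shortened, illustrative copy of g (three strings instead of twenty) and looks the new object up again with exists() and get(), both base R functions.

```r
# Shortened, illustrative copy of the data
g <- c("New AS Plate 21_AS Plate_Sample 1_100.fcs",
       "New AS Plate 21_AS Plate_Sample 1_25.fcs",
       "New AS Plate 21_AS Plate_Sample 10_100.fcs")

h <- g[grep("Sample 1_", g)]
assign(x = "sample_1_group", value = h)

# The object now exists under the name that was passed as a string
exists("sample_1_group")
sample_1_group

# get() is the counterpart of assign(): it looks an object up by its name
identical(get("sample_1_group"), h)
```

Note that "Sample 1_" does not match the "Sample 10_" file, so the new object holds only the two Sample 1 strings.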
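Using assign() in a for-loop is pretty straightforward (and seems logical).
The first step in writing a for-loop is deciding what to loop over; in other words, what changes in each iteration of the loop. In your case, we're looking for the number that defines each group. The vector of these numbers can be defined manually or programmatically.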
# Define groups manually
ids <- c(12,1,10,11)
ids
#> [1] 12 1 10 11
# Pattern match groups
all_ids <- gsub(pattern = ".*Sample (\\d+).*", replacement = "\\1", x = g)
all_ids
#> [1] "12" "1" "1" "1" "1" "1" "10" "10" "10" "10" "10" "11" "11" "11" "11"
#> [16] "11" "12" "12" "12" "12"
ids <- unique(all_ids)
ids
#> [1] "12" "1" "10" "11"
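The ids above come out in order of first appearance and as strings. If you would rather loop over them in numeric order, one possible variation (my own addition, not something your question requires) is to convert them to integers before sorting. The sketch uses a shortened, illustrative copy of g.

```r
# Shortened, illustrative copy of the data
g <- c("New AS Plate 21_AS Plate_Sample 12_50.fcs",
       "New AS Plate 21_AS Plate_Sample 1_100.fcs",
       "New AS Plate 21_AS Plate_Sample 1_25.fcs",
       "New AS Plate 21_AS Plate_Sample 10_100.fcs",
       "New AS Plate 21_AS Plate_Sample 11_100.fcs")

# Keep only the digits that follow "Sample " (note the doubled backslash in R strings)
all_ids <- sub(".*Sample ([0-9]+)_.*", "\\1", g)

# Convert to integer so that sorting is numeric (1, 10, 11, 12) rather than alphabetical
ids_sorted <- sort(unique(as.integer(all_ids)))
ids_sorted
```

paste0() accepts integers as well as strings, so ids_sorted can be used in the loop exactly like ids.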
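Once you know what to loop over, you can define the structure of the loop and the functions inside it. paste0() is the workhorse here. The loop below goes through the ids (one id at a time), finds the matching strings in g, and writes them to the environment as a vector. Because we're using assign(), we expect a new vector to appear in the environment after each iteration of the loop.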
# For-loop with assign
for(i in ids){
a <- paste0("Sample ", i, "_")
h <- g[grep(a, g)]
h_name <- paste0("sample_", i, "_group")
assign(x = h_name, value = h)
}
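One detail in the loop above that is easy to miss is the trailing underscore in the pattern. The sketch below, on a small made-up subset of file names, compares the pattern with and without it. fixed = TRUE is my addition; it just tells grep() to treat the pattern as a literal string.

```r
# Small illustrative subset; "Sample 2" is invented for contrast
g <- c("New AS Plate 21_AS Plate_Sample 1_100.fcs",
       "New AS Plate 21_AS Plate_Sample 10_100.fcs",
       "New AS Plate 21_AS Plate_Sample 11_100.fcs",
       "New AS Plate 21_AS Plate_Sample 2_100.fcs")

i <- "1"

# Without the underscore, "Sample 1" also matches Sample 10 and Sample 11
loose <- g[grep(paste0("Sample ", i), g, fixed = TRUE)]

# With the underscore, only Sample 1 itself matches
strict <- g[grep(paste0("Sample ", i, "_"), g, fixed = TRUE)]

length(loose)   # 3
length(strict)  # 1
```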
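This technically works, but it's not the best approach. You may find it is actually more convenient to store the results of a for-loop in a list (a vector of vectors). It's quick to program, you don't end up with a bunch of new objects crowding your workspace, and all the scary stuff in the link above (not really that scary) stops being a concern. You can do it like this: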
# Save the results of a for-loop in a list!
# First, make a blank list to hold the results
results <- list()
for(i in ids){
a <- paste0("Sample ", i, "_")
h <- g[grep(a, g)]
h_name <- paste0("sample_", i, "_group")
results[[h_name]] <- h
}
results
#> $sample_12_group
#> [1] "New AS Plate 21_AS Plate_Sample 12_50.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 12_100.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 12_25.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 12_250.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 12_500.fcs"
#>
#> $sample_1_group
#> [1] "New AS Plate 21_AS Plate_Sample 1_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 1_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 1_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 1_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 1_500.fcs"
#>
#> $sample_10_group
#> [1] "New AS Plate 21_AS Plate_Sample 10_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 10_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 10_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 10_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 10_500.fcs"
#>
#> $sample_11_group
#> [1] "New AS Plate 21_AS Plate_Sample 11_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 11_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 11_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 11_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 11_500.fcs"
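Once the groups live in a list, you can reach each one by name and run the same step on all of them. A minimal sketch, using a hand-built two-group stand-in for results with two files per group instead of five:

```r
# Hand-built, shortened stand-in for the results list produced by the loop
results <- list(
  sample_1_group  = c("New AS Plate 21_AS Plate_Sample 1_100.fcs",
                      "New AS Plate 21_AS Plate_Sample 1_25.fcs"),
  sample_10_group = c("New AS Plate 21_AS Plate_Sample 10_100.fcs",
                      "New AS Plate 21_AS Plate_Sample 10_25.fcs")
)

names(results)                 # the group names
lengths(results)               # how many files each group holds
results[["sample_10_group"]]   # one group as a character vector

# Loop over the groups by name, e.g. to report or process each one
for (nm in names(results)) {
  cat(nm, ":", length(results[[nm]]), "files\n")
}
```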
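Extra credit
For-loops are great: it's easy to see what is going on inside them, it's easy to do a lot of data processing in them, and they usually run reasonably fast. But sometimes it's all about speed. R is vectorized ([honestly, I'm not quite sure what that means][2], other than "it can do multiple calculations at the same time"), but for-loops don't take much advantage of that. The apply() family of vectorized functions does, and these functions are usually easy to drop in wherever you might otherwise use a for-loop. Here is how you would do it with your data: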
# Vectorized
lapply(ids, function(i) g[grep(paste0("Sample ", i, "_"), g)])
#> [[1]]
#> [1] "New AS Plate 21_AS Plate_Sample 12_50.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 12_100.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 12_25.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 12_250.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 12_500.fcs"
#>
#> [[2]]
#> [1] "New AS Plate 21_AS Plate_Sample 1_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 1_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 1_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 1_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 1_500.fcs"
#>
#> [[3]]
#> [1] "New AS Plate 21_AS Plate_Sample 10_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 10_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 10_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 10_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 10_500.fcs"
#>
#> [[4]]
#> [1] "New AS Plate 21_AS Plate_Sample 11_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 11_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 11_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 11_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 11_500.fcs"
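The lapply() result above is unnamed. Two possible follow-ups, sketched on a shortened, illustrative copy of g: naming the elements afterwards, or skipping the explicit ids altogether with base R's split(), which groups a vector by a key. The split() version is an alternative I'm suggesting, not something from the original question.

```r
# Shortened, illustrative copy of the data
g <- c("New AS Plate 21_AS Plate_Sample 12_50.fcs",
       "New AS Plate 21_AS Plate_Sample 1_100.fcs",
       "New AS Plate 21_AS Plate_Sample 1_25.fcs",
       "New AS Plate 21_AS Plate_Sample 10_100.fcs",
       "New AS Plate 21_AS Plate_Sample 12_100.fcs")

# One sample number per file name
all_ids <- sub(".*Sample ([0-9]+)_.*", "\\1", g)
ids <- unique(all_ids)

# Option 1: lapply() as above, then attach names
by_lapply <- lapply(ids, function(i) g[grep(paste0("Sample ", i, "_"), g, fixed = TRUE)])
names(by_lapply) <- paste0("sample_", ids, "_group")

# Option 2: split() groups g by the sample number in a single call
by_split <- split(g, all_ids)

by_lapply[["sample_12_group"]]
by_split[["12"]]
```

Both options keep the files in their original order within each group, so the two results hold the same vectors under different names.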
Created on 2021-10-14 by the reprex package (v2.0.1)
Data:
g <- c("New AS Plate 21_AS Plate_Sample 12_50.fcs",
"New AS Plate 21_AS Plate_Sample 1_100.fcs",
"New AS Plate 21_AS Plate_Sample 1_25.fcs",
"New AS Plate 21_AS Plate_Sample 1_250.fcs",
"New AS Plate 21_AS Plate_Sample 1_50.fcs",
"New AS Plate 21_AS Plate_Sample 1_500.fcs",
"New AS Plate 21_AS Plate_Sample 10_100.fcs",
"New AS Plate 21_AS Plate_Sample 10_25.fcs",
"New AS Plate 21_AS Plate_Sample 10_250.fcs",
"New AS Plate 21_AS Plate_Sample 10_50.fcs",
"New AS Plate 21_AS Plate_Sample 10_500.fcs",
"New AS Plate 21_AS Plate_Sample 11_100.fcs",
"New AS Plate 21_AS Plate_Sample 11_25.fcs",
"New AS Plate 21_AS Plate_Sample 11_250.fcs",
"New AS Plate 21_AS Plate_Sample 11_50.fcs",
"New AS Plate 21_AS Plate_Sample 11_500.fcs",
"New AS Plate 21_AS Plate_Sample 12_100.fcs",
"New AS Plate 21_AS Plate_Sample 12_25.fcs",
"New AS Plate 21_AS Plate_Sample 12_250.fcs",
"New AS Plate 21_AS Plate_Sample 12_500.fcs")
[1]: Why is using assign bad?
[2]: How do I know a function or an operation in R is vectorized?