Comparing pairs stored as separate rows in R

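I have a column named file. It is a character variable that describes the left and right wings of insects (L.dw.png and R.dw.png), along with some other attributes.

I want to check whether a file entry is missing for either the left or the right wing. Every odd row represents a left wing and every even row a right wing.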

wings <- read.table("https://zenodo.org/record/6950928/files/AT-raw-coordinates.csv", header = TRUE, sep = ";")

The first six entries are shown below:

file  sample country  x1  y1  x2  y2  x3  y3  x4  y4
1 AT-0001-031-003678-L.dw.png AT-0001      AT 219 191 238 190 292 270 287 216
2 AT-0001-031-003678-R.dw.png AT-0001      AT 213 190 234 189 289 268 281 211
3 AT-0001-031-003679-L.dw.png AT-0001      AT 218 182 235 181 284 262 286 210
4 AT-0001-031-003679-R.dw.png AT-0001      AT 214 185 234 183 283 264 285 211
5 AT-0001-031-003680-L.dw.png AT-0001      AT 207 181 225 178 276 261 273 206
6 AT-0001-031-003680-R.dw.png AT-0001      AT 203 181 222 180 271 261 267 206

I could not write a working script myself: the snippets I tried after searching online did not answer my question.

I would be grateful for any pointers.

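Here is a way to filter out the ids that have only one row, i.e. only an L or only an R entry. I added a row to the data to demonstrate this case.

Edit: it first extracts the numeric identifier (= code) from the file name string (it is basically a substring), then extracts the wing (so L or R), so that groups can be created later.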

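library(dplyr)

df %>%
  # characters 13-18 hold the specimen code, character 20 the wing side
  mutate(code = substr(file, 13, 18),
         wing = substr(file, 20, 20)) %>%
  group_split(code) %>%               # one data frame per code
  purrr::keep(~ nrow(.) == 1) %>%     # keep codes that occur only once
  bind_rows()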

Output:

file                        sample  country    x1    y1    x2    y2    x3    y3    x4    y4 code   wing 
<chr>                       <chr>   <chr>   <int> <int> <int> <int> <int> <int> <int> <int> <chr>  <chr>
1 AT-0001-031-003681-R.dw.png AT-0001 AT        203   181   222   180   271   261   267   206 003681 R

Data:

df <- read.table(text = "                          file  sample country  x1  y1  x2  y2  x3  y3  x4  y4
1 AT-0001-031-003678-L.dw.png AT-0001      AT 219 191 238 190 292 270 287 216
2 AT-0001-031-003678-R.dw.png AT-0001      AT 213 190 234 189 289 268 281 211
3 AT-0001-031-003679-L.dw.png AT-0001      AT 218 182 235 181 284 262 286 210
4 AT-0001-031-003679-R.dw.png AT-0001      AT 214 185 234 183 283 264 285 211
5 AT-0001-031-003680-L.dw.png AT-0001      AT 207 181 225 178 276 261 273 206
6 AT-0001-031-003680-R.dw.png AT-0001      AT 203 181 222 180 271 261 267 206
7 AT-0001-031-003681-R.dw.png AT-0001      AT 203 181 222 180 271 261 267 206", h = TRUE)
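
The same idea can be sketched in base R without relying on fixed character positions. This is only an illustration: the vector `files` is a hypothetical stand-in for `df$file`, and the regular expression assumes every file name ends in `-L.dw.png` or `-R.dw.png`, as in the sample data.

```r
# Toy file names mirroring the sample data (the last one has no left-wing partner)
files <- c(
  "AT-0001-031-003678-L.dw.png", "AT-0001-031-003678-R.dw.png",
  "AT-0001-031-003679-L.dw.png", "AT-0001-031-003679-R.dw.png",
  "AT-0001-031-003680-L.dw.png", "AT-0001-031-003680-R.dw.png",
  "AT-0001-031-003681-R.dw.png"
)

# Capture everything before the "-L"/"-R" suffix as the pair id,
# and the single letter as the wing side
pattern <- "^(.*)-([LR])\\.dw\\.png$"
id   <- sub(pattern, "\\1", files)
wing <- sub(pattern, "\\2", files)

# Count how many rows share each id; an id seen only once lacks its partner
n_per_id <- ave(seq_along(id), id, FUN = length)
unpaired <- files[n_per_id == 1]
unpaired
```

Because the id is taken from the pattern rather than from positions 13 to 18, this variant should keep working if the prefix length changes, as long as the suffix convention holds.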

Another solution, using base R:

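wings <- read.table("https://zenodo.org/record/6950928/files/AT-raw-coordinates.csv",
                    header = TRUE, sep = ";")
# wing side: "L" if the file name contains an L, otherwise "R"
wings$wing_side <- ifelse(grepl("L", wings$file), "L", "R")
# pair identifier: everything before the wing letter
wings$file_name <- substr(wings$file, 1, 19)
# one row per pair, with the number of L and R files in separate columns
counts <- as.data.frame.matrix(table(wings$file_name, wings$wing_side))
# look up files by pair identifier, not by row position in the count table
incomplete <- rownames(counts)[counts$L != counts$R]
wings$file[wings$file_name %in% incomplete]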

Output:

character(0) # only complete pairs in data.frame

Deliberately removing two arbitrary rows:

wings <- read.table("https://zenodo.org/record/6950928/files/AT-raw-coordinates.csv",
                    header = TRUE, sep = ";")
wings <- wings[-c(1, 5), ] # remove two arbitrary rows
wings$wing_side <- ifelse(grepl("L", wings$file), "L", "R")
wings$file_name <- substr(wings$file, 1, 19)
counts <- as.data.frame.matrix(table(wings$file_name, wings$wing_side))
# look up files by pair identifier, not by row position in the count table
incomplete <- rownames(counts)[counts$L != counts$R]
wings$file[wings$file_name %in% incomplete]

Output:

"AT-0001-031-003678-R.dw.png" "AT-0001-031-003680-R.dw.png" # files whose partner is missing

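To also report which side is missing for each incomplete pair, a set difference on the pair ids is one option. This is a minimal sketch: `files` is a hypothetical stand-in for `wings$file` after two left-wing rows were dropped, and the fixed positions assume the file name layout shown in the question.

```r
# Toy file names: the left wings of 003678 and 003680 are missing
files <- c(
  "AT-0001-031-003678-R.dw.png",
  "AT-0001-031-003679-L.dw.png", "AT-0001-031-003679-R.dw.png",
  "AT-0001-031-003680-R.dw.png"
)

id   <- substr(files, 1, 18)   # pair identifier without the trailing "-L"/"-R"
side <- substr(files, 20, 20)  # wing side letter

# ids that have a right wing but no left wing, and vice versa
missing_left  <- setdiff(id[side == "R"], id[side == "L"])
missing_right <- setdiff(id[side == "L"], id[side == "R"])

missing_left
missing_right
```

Here `missing_left` lists the specimens lacking an L file and `missing_right` those lacking an R file, so the two directions are kept apart instead of being merged into one list of incomplete files.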