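I'm new to R, and especially to the tidyverse. I'm trying to write a script that we can use to rewrite a taxonomy list. We already have one that uses a lot of for loops and if statements, and I'd like to simplify it with the tidyverse, but I'm stuck on how to do that.
I have a table that looks like this (heavily simplified):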
library(tidyverse)

taxon_list <- tibble(name = c("cockroach", "cockroach2", "grasshopper", "spider", "lobster", "insect", "crustacea", "arachnid"),
                     Id = c(445, 448, 446, 778, 543, 200, 400, 300),
                     parent_ID = c(200, 200, 200, 300, 400, 200, 400, 300),
                     rank = c("genus", "genus", "genus", "genus", "genus", "order", "order", "order")
)
+-------------+-----+-----------+-------+
| name        | Id  | parent_ID | rank  |
+=============+=====+===========+=======+
| cockroach   | 445 | 200       | genus |
| cockroach2  | 448 | 200       | genus |
| grasshopper | 446 | 200       | genus |
| spider      | 778 | 300       | genus |
| lobster     | 543 | 400       | genus |
| insect      | 200 | 200       | order |
| crustacea   | 400 | 400       | order |
| arachnid    | 300 | 300       | order |
+-------------+-----+-----------+-------+
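Now I want to rearrange it so that I get a new column holding the order that matches each row's parent_ID (that is, find the row whose Id equals this row's parent_ID and write that row's name in the order column). The final result should look like this: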
+-------------+-----------+-----+-----------+
| name        | order     | Id  | parent_ID |
+=============+===========+=====+===========+
| cockroach   | insect    | 445 | 200       |
| cockroach2  | insect    | 448 | 200       |
| grasshopper | insect    | 446 | 200       |
| spider      | arachnid  | 778 | 300       |
| lobster     | crustacea | 543 | 400       |
+-------------+-----------+-----+-----------+
I tried combining mutate with an ifelse statement, but that only puts NA in the whole order column.
The tibble is named taxon_list:
taxon_list %>%
  # Compares parent_ID and Id within the same row only, so genus rows get NA
  mutate(order = ifelse(parent_ID == Id, name, NA))
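I know this can't work, because it doesn't search the whole data set for the matching row (which is what I did before with all the for loops). Could someone point me in the right direction?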
One approach is to filter each rank type into 2 separate data frames, subset the columns you need with select, and then merge the 2.
library(tidyverse)

df <- tibble(name = c("cockroach", "cockroach2", "grasshopper", "spider", "lobster", "insect", "crustacea", "arachnid"),
             Id = c(445, 448, 446, 778, 543, 200, 400, 300),
             parent_ID = c(200, 200, 200, 300, 400, 200, 400, 300),
             rank = c("genus", "genus", "genus", "genus", "genus", "order", "order", "order"))

# Order rows: keep the name (renamed to `order`) and the key to join on.
# In this data each order row has parent_ID equal to its own Id.
df_order <- df %>%
  filter(rank == "order") %>%
  select(order = name, parent_ID)

# Genus rows: attach the matching order name via the shared parent_ID.
df_genus <- df %>%
  filter(rank == "genus") %>%
  select(name, Id, parent_ID) %>%
  merge(df_order, by = "parent_ID")

df_genus
Result:
  parent_ID        name  Id     order
1       200   cockroach 445    insect
2       200  cockroach2 448    insect
3       200 grasshopper 446    insect
4       300      spider 778  arachnid
5       400     lobster 543 crustacea
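If you would rather stay within dplyr, the same idea can be sketched with left_join instead of merge. This is only a sketch under a few assumptions: the names orders, order_Id and result are made up for illustration, and it matches each genus's parent_ID against the order's own Id, which should also hold up if an order row's parent_ID ever points somewhere other than itself.

```r
library(dplyr)

taxon_list <- tibble(
  name      = c("cockroach", "cockroach2", "grasshopper", "spider",
                "lobster", "insect", "crustacea", "arachnid"),
  Id        = c(445, 448, 446, 778, 543, 200, 400, 300),
  parent_ID = c(200, 200, 200, 300, 400, 200, 400, 300),
  rank      = c("genus", "genus", "genus", "genus", "genus",
                "order", "order", "order")
)

# Lookup table (hypothetical name): one row per order, keyed by the order's own Id.
orders <- taxon_list %>%
  filter(rank == "order") %>%
  select(order = name, order_Id = Id)

# Attach the order name to each genus by matching parent_ID to the order's Id,
# then put the columns in the layout asked for in the question.
result <- taxon_list %>%
  filter(rank == "genus") %>%
  left_join(orders, by = c("parent_ID" = "order_Id")) %>%
  select(name, order, Id, parent_ID)

print(result)
```

Unlike merge, left_join keeps the genus rows in their original order and returns a tibble; a genus whose parent_ID has no matching order would get NA in the order column rather than being dropped.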