I have a data frame in which one column reports the components of a meal, for example:
----------------------------------
| ID | Component |
----------------------------------
| 1 | Vegetables |
| 2 | Pasta |
| 3 | Pasta, Vegetables |
| 4 | Pulses, Vegetables |
| 5 | Meat, Pasta, Vegetables|
| 6 | Meat, Vegetables |
| 7 | Pulses |
| 8 | Meat |
----------------------------------
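I would like to add an extra column that gives each participant a score. If their meal contains pasta, they should get 1; if not, they should get 0. So participants 2, 3 and 5 would score 1, and everyone else 0.
Is there code that lets me apply this to the term "pasta"?
Any help would be much appreciated! Thanks.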
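We can use grepl to match the substring "Pasta". It returns a logical vector, which can be converted to binary (0/1) with as.integer or with a unary +: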
df1$meal_score <- +(grepl('Pasta', df1$Component))
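Since the question asks about the term "pasta" in lower case, a case-insensitive variant may be worth sketching. This is only a minimal illustration: it assumes the data frame is called df1 as above, and it rebuilds the example data inline purely so the snippet runs on its own. ignore.case is a standard argument of base R's grepl.

```r
# Example data mirroring the table in the question (rebuilt here for self-containment)
df1 <- data.frame(
  ID = 1:8,
  Component = c("Vegetables", "Pasta", "Pasta, Vegetables", "Pulses, Vegetables",
                "Meat, Pasta, Vegetables", "Meat, Vegetables", "Pulses", "Meat"),
  stringsAsFactors = FALSE
)

# ignore.case = TRUE lets the lower-case pattern match "Pasta", "pasta", "PASTA", ...
# as.integer() turns the logical result into 0/1
df1$meal_score <- as.integer(grepl("pasta", df1$Component, ignore.case = TRUE))

df1$meal_score
#> [1] 0 1 1 0 1 0 0 0
```

Without ignore.case = TRUE, the pattern "pasta" would match nothing in this data, because grepl is case-sensitive by default.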
A simple solution:
library(tidyverse)
df1 %>%
mutate(score = +str_detect(Component, "Pasta"))
#> ID Component score
#> 1 1 Vegetables 0
#> 2 2 Pasta 1
#> 3 3 Pasta, Vegetables 1
#> 4 4 Pulses, Vegetables 0
#> 5 5 Meat, Pasta, Vegetables 1
#> 6 6 Meat, Vegetables 0
#> 7 7 Pulses 0
#> 8 8 Meat 0
Data:
txt <- "ID|Component
1|Vegetables
2|Pasta
3|Pasta, Vegetables
4|Pulses, Vegetables
5|Meat, Pasta, Vegetables
6|Meat, Vegetables
7|Pulses
8|Meat"
df1 <- read.table(text = txt, sep = "|", stringsAsFactors = FALSE, header = TRUE)
You can use:
library(dplyr)
df |> mutate(score = as.numeric(grepl("Pasta", Component, fixed = TRUE)))
Output:
ID Component score
1 1 Vegetables 0
2 2 Pasta 1
3 3 Pasta, Vegetables 1
4 4 Pulses, Vegetables 0
5 5 Meat, Pasta, Vegetables 1
6 6 Meat, Vegetables 0
7 7 Pulses 0
8 8 Meat 0
df <- structure(list(ID = 1:8, Component = c("Vegetables", "Pasta",
"Pasta, Vegetables", "Pulses, Vegetables", "Meat, Pasta, Vegetables",
"Meat, Vegetables", "Pulses", "Meat")), class = "data.frame", row.names = c(NA,
-8L))
You can also use the str_detect function together with the case_when function:
library(stringr)
library(dplyr)
df <- data.frame(
ID = 1:8,
Component = c("Vegetables",
"Pasta",
"Pasta, Vegetables",
"Pulses, Vegetables",
"Meat, Pasta, Vegetables",
"Meat, Vegetables",
"Pulses",
"Meat")) %>%
mutate(
score = case_when(
str_detect(Component, "Pasta") ~ 1,
TRUE ~ 0
)
)
> df
ID Component score
1 1 Vegetables 0
2 2 Pasta 1
3 3 Pasta, Vegetables 1
4 4 Pulses, Vegetables 0
5 5 Meat, Pasta, Vegetables 1
6 6 Meat, Vegetables 0
7 7 Pulses 0
8 8 Meat 0