R: Convert a string into a tidy tibble


> dd <- tibble("Temper: 36.6℃  Pulse:76 bpm RR: 16bpm BP:148/58 mmHg")
> dd
# A tibble: 1 x 1
`"Temper: 36.6℃  Pulse:76 bpm RR: 16bpm BP:148/58 mmHg"`
<chr>                                                    
1 Temper: 36.6℃  Pulse:76 bpm RR: 16bpm BP:148/58 mmHg    
> ddtarget <- tibble(Temper=36.6,Pulse=76,RR=16,SBP=148,DBP=58)
> ddtarget
# A tibble: 1 x 5
Temper Pulse    RR   SBP   DBP
<dbl> <dbl> <dbl> <dbl> <dbl>
1   36.6    76    16   148    58

I have dd and want to get ddtarget. How can I build it using map or some other neat approach?

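We can first rename the column (its name is awkward because it is the quoted string itself), then split it into separate rows at whitespace that is followed by an uppercase letter, split each row into two columns at the colon, and finally use pivot_wider to get the data in wide format.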

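library(dplyr)
library(tidyr)
dd %>%
  # give the single column a usable name
  rename(col = `"Temper: 36.6℃  Pulse:76 bpm RR: 16bpm BP:148/58 mmHg"`) %>%
  # one row per measurement: split at whitespace followed by a capital letter
  separate_rows(col, sep = "\\s+(?=[A-Z])") %>%
  # split each "name:value" pair at the colon
  separate(col, into = c('name', 'value'), sep = ':') %>%
  # names_from = name and values_from = value are the defaults
  pivot_wider()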
# A tibble: 1 x 4
#  Temper   Pulse  RR       BP         
#  <chr>    <chr>  <chr>    <chr>      
#1 " 36.6℃" 76 bpm " 16bpm" 148/58 mmHg
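
The result above is still character and keeps BP as a single column, so it does not yet match ddtarget. A minimal sketch of the remaining steps follows, assuming the column is created with the name col directly (so no rename is needed) and that stripping every character other than digits and the decimal point is acceptable for these units. The name result is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

# Assumption: name the column `col` up front instead of renaming it later
dd <- tibble(col = "Temper: 36.6℃  Pulse:76 bpm RR: 16bpm BP:148/58 mmHg")

result <- dd %>%
  separate_rows(col, sep = "\\s+(?=[A-Z])") %>%
  separate(col, into = c("name", "value"), sep = ":") %>%
  pivot_wider() %>%
  # split blood pressure into systolic and diastolic parts
  separate(BP, into = c("SBP", "DBP"), sep = "/") %>%
  # drop units and whitespace, then convert every column to numeric
  mutate(across(everything(), ~ as.numeric(gsub("[^0-9.]", "", .x))))

result
```

This should give one row with the numeric columns Temper, Pulse, RR, SBP and DBP. The across() helper needs dplyr 1.0.0 or later.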

Here is a messy solution:

# String to convert to tibble:
library(tidyverse)
dd <- tibble("Temper: 36.6℃  Pulse:76 bpm RR: 16bpm BP:148/58 mmHg")
# Store a vector of strings to become variables:
dd_vars <-
  grep(":", unlist(lapply(strsplit(as.character(dd), "\\d+"),
                          function(w) {
                            # keep only the last word of each piece
                            x <- gsub(".* ", "", trimws(w, "both"))
                            # drop single-character leftovers such as "." and "/"
                            as.character(na.omit(ifelse(nchar(x) == 1, NA, x)))
                          })),
       value = TRUE)
# Store a vector of the strings to become values:
dd_values <- iconv(gsub("[A-Za-z]", "", grep("\\d+", unlist(
  lapply(strsplit(as.character(dd), ":"),
         function(x) {
           # keep only the first word of each piece
           gsub(" .*", "", trimws(x, "both"))
         })
),
value = TRUE)), 'utf-8', 'ascii', sub = '')
# Convert to a tibble with appropriate vectors:
tib <-
  as_tibble(data.frame(lapply(within(setNames(
    data.frame(t(
      data.frame(vars = dd_vars,
                 values = as.character(dd_values))
    ),
    stringsAsFactors = FALSE),
    gsub(":", "", dd_vars)
  )[-1, ],
  {
    # split blood pressure into systolic and diastolic parts
    SBP <- unlist(strsplit(BP, "/"))[1]
    DBP <- unlist(strsplit(BP, "/"))[2]
    rm(BP)
  }), as.numeric)))
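
For comparison, a much shorter sketch can pull the numbers out with a single regular expression. It is not the approach used above, and it assumes the five values always appear in the fixed order Temper, Pulse, RR, SBP, DBP and that no other numbers occur in the string. The names x, nums and ddtarget2 are hypothetical.

```r
library(tibble)

x <- "Temper: 36.6℃  Pulse:76 bpm RR: 16bpm BP:148/58 mmHg"

# extract every integer or decimal number, in order of appearance
nums <- as.numeric(regmatches(x, gregexpr("[0-9]+\\.?[0-9]*", x))[[1]])

# Assumption: the values arrive in exactly this order
ddtarget2 <- as_tibble(as.list(
  setNames(nums, c("Temper", "Pulse", "RR", "SBP", "DBP"))
))

ddtarget2
```

This trades robustness for brevity: a missing or reordered field would silently mislabel the values, whereas the name-based approaches above keep each value tied to its label.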
