R: converting a character data.frame column to numeric



I am working with a data frame that contains information like this.


df <- read.table("headen.bed", header = FALSE, sep = "\t", stringsAsFactors = FALSE, quote = "")
C1  C2    C3 
33  12249  0,300,3900,400,4500,400,4200
83  9213   0,49,66,75,158,160,170,183,218
146 680    0,3,13,129,274,278,383,481,482,496

I want to add C1 to every element of C3, so the result would look like this:

C1  C2    C3 
33  12249  33,333,3933,433,4533,433,4233
83  9213   83,132,149,158,241,243,253,266,301
146 680    146,149,159,275,420,424,529,627,628,642

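But somehow C3 shows up as class character. I have tried different ways of converting it to numeric: as.numeric, type.convert, and character to factor and then to numeric. None of them worked. Can anyone suggest the best way to do this?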

You can try,

mapply(function(x, y) paste(x + as.numeric(y), collapse = ','), df$C1, strsplit(df$C3, ','))
[1] "33,333,3933,433,4533,433,4233"  "83,132,149,158,241,243,253,266,301"  "146,149,159,275,420,424,529,627,628,642"
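To see why this works, here is a minimal sketch of the same steps on a single row. The names c1, c3, parts and shifted are only illustrative and not part of the original answer. Calling as.numeric() on the whole string gives NA because the string contains commas, so it has to be split first.

```r
c1 <- 33
c3 <- "0,300,3900,400,4500,400,4200"

# The whole string is not a valid number, so this yields NA
# (R also emits a coercion warning, suppressed here)
suppressWarnings(as.numeric(c3))

# Split on commas; strsplit() returns a list with one element per input string
parts <- strsplit(c3, ",")[[1]]

# Convert the pieces to numeric and add C1 to each of them
shifted <- c1 + as.numeric(parts)

# Collapse back into a single comma-separated string
result <- paste(shifted, collapse = ",")
result
```

mapply() simply repeats these steps for every row, pairing each value of C1 with the matching split of C3.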

DATA

df <- data.frame(C1 = c(33, 83, 146), 
C2 = c(1, 2, 3), 
C3 = c('0,300,3900,400,4500,400,4200', '0,49,66,75,158,160,170,183,218', '0,3,13,129,274,278,383,481,482,496'), 
stringsAsFactors = FALSE)

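EDIT: To convert C3 to numeric, you have to split it into multiple columns. There are many ways to do this. I like the splitstackshape approach (the output below assumes the result of the mapply call above has been assigned back to df$C3), i.e.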

library(splitstackshape)
df1 <- cSplit(df, 'C3', sep = ',')
#C1 C2 C3_01 C3_02 C3_03 C3_04 C3_05 C3_06 C3_07 C3_08 C3_09 C3_10
#1:  33  1    33   333  3933   433  4533   433  4233    NA    NA    NA
#2:  83  2    83   132   149   158   241   243   253   266   301    NA
#3: 146  3   146   149   159   275   420   424   529   627   628   642
str(df1)
Classes ‘data.table’ and 'data.frame':  3 obs. of  12 variables:
$ C1   : num  33 83 146
$ C2   : num  1 2 3
$ C3_01: int  33 83 146
$ C3_02: int  333 132 149
$ C3_03: int  3933 149 159
$ C3_04: int  433 158 275
$ C3_05: int  4533 241 420
$ C3_06: int  433 243 424
$ C3_07: int  4233 253 529
$ C3_08: int  NA 266 627
$ C3_09: int  NA 301 628
$ C3_10: int  NA NA 642
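If adding a package is not an option, a base R sketch along the following lines keeps the shifted values as numeric vectors in a list column instead of spreading them over wide columns. The column name C3_num is an assumption made for illustration, and this is an alternative to the cSplit approach rather than what the answer above does.

```r
df <- data.frame(C1 = c(33, 83, 146),
                 C2 = c(1, 2, 3),
                 C3 = c('0,300,3900,400,4500,400,4200',
                        '0,49,66,75,158,160,170,183,218',
                        '0,3,13,129,274,278,383,481,482,496'),
                 stringsAsFactors = FALSE)

# One numeric vector per row: split C3 on commas, convert, then add C1
# (C3_num is a hypothetical column name)
df$C3_num <- Map(function(x, y) x + as.numeric(y),
                 df$C1, strsplit(df$C3, ','))

# Each element is a plain numeric vector, so arithmetic works directly
df$C3_num[[2]]

# Rows may hold different numbers of values; no NA padding is needed
sapply(df$C3_num, length)
```

A list column avoids the NA padding that the wide layout needs when rows have different lengths, at the cost of being less convenient to print or export.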
