r-将一行的值与一个值进行比较,然后返回较小的值



我有一个包含多行和多列的dataframe,我想检查每一行的行值是否高于该行最后一列的值。我的数据如下:

date      ABI.BR      NOVN.S       ROG.S      NESN.S   NOVOb.CO    HRMS.PA      TTEF.PA     OREP.PA    SASY.PA    LVMH.PA     MAX 
1 31.12.2001  7342987743 21072681660 19737660401 57324018814 3199830778 1226900000 105895000000 13740400000 6.4880e+09 1.2229e+10 163033294916
2 31.12.2002  6280730128 17748921146 20316428627 61501808862 3350848339 1242300000  92108991318 14288000000 7.4480e+09 1.2693e+10 144604417185
3 31.12.2003  6324404232 19755193920 20002601887 56367998444 3510991763 1230000000  93961037997 14029100000 8.0480e+09 1.1962e+10 122995279989
4 31.12.2004  7850834622 20947725570 19134043533 54889985325 3905732023 1331400000  87346032957 13641300000 1.5733e+10 1.2481e+10 123515812436
5 31.12.2005 12233279545 26451714210 22833829836 58587603997 4527870421 1427400000 122854000000 14532500000 2.8513e+10 1.3910e+10 171566971202
6 31.12.2006 12660668341 26602920050 26148301223 61238063838 5198638861 1514900000 126235000000 15790100000 2.9489e+10 1.5306e+10 173975809100

我现在的目标是检查列2:10的每一行中的值是否高于列11(即MAX(中的相应值。如果一行中某列的值高于MAX列的值,我希望用MAX列的值覆盖该列的实际值。

例如,如果我有以下数据:

a b c MAX
2 3 4 3
4 5 6 6
6 7 9 8

我想在校正后的数据中有以下输出

a b c MAX
2 3 **3** 3
4 5 6 6
6 7 **8** 8

非常感谢你的帮助!

使用您的简短示例和库dplyr,您可以执行:

Reprex

  • 数据
df <- structure(list(a = c(2L, 4L, 6L), b = c(3L, 5L, 7L), c = c(4L, 
6L, 9L), MAX = c(3L, 6L, 8L)), class = "data.frame", row.names = c(NA, 
-3L))
  • 代码
library(dplyr)
df %>% transmute(across(.cols = 1:3, ~ ifelse(.x > MAX, MAX, .x)))
  • 输出
#>   a b c
#> 1 2 3 3
#> 2 4 5 6
#> 3 6 7 8

创建于2022-03-31由reprex包(v2.0.1(


编辑:使用您的真实数据并遵循@langtang的评论保留第一列和最后一列,您可以执行以下操作:

Reprex

  • 数据
structure(list(date = c("31.12.2001", "31.12.2002", "31.12.2003", 
"31.12.2004", "31.12.2005", "31.12.2006"), ABI.BR = c(7342987743, 
6280730128, 6324404232, 7850834622, 12233279545, 12660668341), 
NOVN.S = c(21072681660, 17748921146, 19755193920, 20947725570, 
26451714210, 26602920050), ROG.S = c(19737660401, 20316428627, 
20002601887, 19134043533, 22833829836, 26148301223), NESN.S = c(57324018814, 
61501808862, 56367998444, 54889985325, 58587603997, 61238063838
), NOVOb.CO = c(3199830778, 3350848339, 3510991763, 3905732023, 
4527870421, 5198638861), HRMS.PA = c(1226900000L, 1242300000L, 
1230000000L, 1331400000L, 1427400000L, 1514900000L), TTEF.PA = c(1.05895e+11, 
92108991318, 93961037997, 87346032957, 1.22854e+11, 1.26235e+11
), OREP.PA = c(13740400000, 1.4288e+10, 14029100000, 13641300000, 
14532500000, 15790100000), SASY.PA = c(6.488e+09, 7.448e+09, 
8.048e+09, 1.5733e+10, 2.8513e+10, 2.9489e+10), LVMH.PA = c(1.2229e+10, 
1.2693e+10, 1.1962e+10, 1.2481e+10, 1.391e+10, 1.5306e+10
), MAX = c(163033294916, 144604417185, 122995279989, 123515812436, 
171566971202, 173975809100)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))
  • 代码
library(dplyr)
df %>% mutate(across(.cols = 2:10, ~ ifelse(.x > MAX, MAX, .x)))
  • 输出
#>         date      ABI.BR      NOVN.S       ROG.S      NESN.S   NOVOb.CO
#> 1 31.12.2001  7342987743 21072681660 19737660401 57324018814 3199830778
#> 2 31.12.2002  6280730128 17748921146 20316428627 61501808862 3350848339
#> 3 31.12.2003  6324404232 19755193920 20002601887 56367998444 3510991763
#> 4 31.12.2004  7850834622 20947725570 19134043533 54889985325 3905732023
#> 5 31.12.2005 12233279545 26451714210 22833829836 58587603997 4527870421
#> 6 31.12.2006 12660668341 26602920050 26148301223 61238063838 5198638861
#>      HRMS.PA      TTEF.PA     OREP.PA    SASY.PA    LVMH.PA          MAX
#> 1 1226900000 105895000000 13740400000 6.4880e+09 1.2229e+10 163033294916
#> 2 1242300000  92108991318 14288000000 7.4480e+09 1.2693e+10 144604417185
#> 3 1230000000  93961037997 14029100000 8.0480e+09 1.1962e+10 122995279989
#> 4 1331400000  87346032957 13641300000 1.5733e+10 1.2481e+10 123515812436
#> 5 1427400000 122854000000 14532500000 2.8513e+10 1.3910e+10 171566971202
#> 6 1514900000 126235000000 15790100000 2.9489e+10 1.5306e+10 173975809100

创建于2022-03-31由reprex包(v2.0.1(

cols = c('a', 'b', 'c')
df[cols] = lapply(df[cols], (x) ifelse(x > df$MAX, df$MAX, x))
#   a b c MAX
# 1 2 3 3   3
# 2 4 5 6   6
# 3 6 7 8   8

相关内容

最新更新