aggregate()是否可以在R中用于查找按id分组的观测值列表中的最高值,以及该观测值的其他值


感谢您花时间研究我的问题。

我使用的是一个数据集,每个分组ID有多个观察结果(长格式数据(。我一直在使用aggregate((来选择每个ID的最高值,因为这样可以将最高值附加到每个ID只有一个实例的宽格式数据帧中。

我就是这样做的:

example.df <- data.frame (id  = c(rep(1:3, each = 3)), 
sex = c("M", "", "", "F", "", "", "M", "", ""), 
value = c(1:9), 
date = c("01-01-2021", "01-01-2020", "01-01-2019", "01-01-2018", "01-01-2017", "01-01-2016", "01-01-2015", "01-01-2014", "01-01-2013"))
highest_value <- aggregate(value ~ id, data = example.df, max)
example_subset <- (select(example.df, "id", "sex"))
example_subset <- (subset(example_subset, sex!=""))
example_subset <- merge(example_subset, highest_value, by.x = "id", by.y = "id", all.x = T)

这将创建一个具有ID、性别和该ID的最高值的数据帧。但是,我也想将与该最高值对应的日期附加到数据帧中。我不知道如何使用聚合来实现这一点,因为聚合只允许我选择多个sum_var,然后所有sum_var都应用fun=max。

我想要的结果:

example_subset$date <- c("01-01-2019", "01-01-2016", "01-01-2013")
example_subset

aggregate((对此可用吗?还是我在处理这个问题时使用了错误的工具?

编辑——我的示例数据似乎不太匹配。以下是我正在处理的数据的dput:

structure(list(id = c("20001", "20002", "20002", "20004", "20004"
), echo_lvef = c(30, 50, 50, 60, 55), echo_date = c("2019-11-21", 
"2013-06-14", "2013-06-14", "2012-02-07", "2017-09-14")), na.action = structure(c(`2` = 2L, 
`5` = 5L, `6` = 6L, `7` = 7L, `8` = 8L, `9` = 9L, `10` = 10L, 
`14` = 14L, `15` = 15L, `16` = 16L, `17` = 17L, `18` = 18L, `20` = 20L, 
`21` = 21L, `25` = 25L, `26` = 26L, `27` = 27L, `28` = 28L, `29` = 29L, 
`30` = 30L, `31` = 31L, `36` = 36L, `40` = 40L, `42` = 42L, `43` = 43L, 
`44` = 44L, `45` = 45L, `46` = 46L, `47` = 47L, `50` = 50L, `51` = 51L, 
`52` = 52L, `53` = 53L, `54` = 54L, `57` = 57L, `58` = 58L, `59` = 59L, 
`60` = 60L, `64` = 64L, `66` = 66L, `69` = 69L, `70` = 70L, `71` = 71L, 
`72` = 72L, `73` = 73L, `76` = 76L, `78` = 78L, `79` = 79L, `80` = 80L, 
`82` = 82L, `84` = 84L, `85` = 85L, `87` = 87L, `88` = 88L, `89` = 89L, 
`90` = 90L, `91` = 91L, `94` = 94L, `95` = 95L, `100` = 100L, 
`101` = 101L, `102` = 102L, `103` = 103L, `104` = 104L, `105` = 105L, 
`110` = 110L, `116` = 116L, `117` = 117L, `122` = 122L, `123` = 123L, 
`124` = 124L, `125` = 125L, `127` = 127L, `128` = 128L, `129` = 129L, 
`130` = 130L, `131` = 131L, `134` = 134L, `135` = 135L, `136` = 136L, 
`144` = 144L, `145` = 145L, `146` = 146L, `147` = 147L, `148` = 148L, 
`149` = 149L, `150` = 150L, `151` = 151L, `152` = 152L, `155` = 155L, 
`157` = 157L, `158` = 158L, `159` = 159L, `160` = 160L, `161` = 161L, 
`162` = 162L, `163` = 163L, `164` = 164L, `168` = 168L, `169` = 169L, 
`170` = 170L, `180` = 180L, `184` = 184L, `185` = 185L, `186` = 186L, 
`187` = 187L, `188` = 188L, `189` = 189L, `190` = 190L, `191` = 191L, 
`192` = 192L, `193` = 193L, `194` = 194L, `199` = 199L, `201` = 201L, 
`204` = 204L, `212` = 212L, `216` = 216L, `217` = 217L, `224` = 224L, 
`225` = 225L, `227` = 227L, `228` = 228L, `229` = 229L, `239` = 239L, 
`243` = 243L, `244` = 244L, `245` = 245L, `252` = 252L, `253` = 253L, 
`254` = 254L, `255` = 255L, `256` = 256L, `257` = 257L, `258` = 258L, 
`269` = 269L, `270` = 270L, `271` = 271L, `272` = 272L, `273` = 273L, 
`274` = 274L, `280` = 280L, `281` = 281L, `282` = 282L, `286` = 286L, 
`287` = 287L, `290` = 290L, `291` = 291L, `292` = 292L, `296` = 296L, 
`297` = 297L, `298` = 298L, `302` = 302L, `303` = 303L, `304` = 304L, 
`306` = 306L, `307` = 307L, `308` = 308L, `319` = 319L, `320` = 320L, 
`331` = 331L, `333` = 333L, `334` = 334L, `335` = 335L, `338` = 338L, 
`350` = 350L, `353` = 353L, `358` = 358L, `359` = 359L, `367` = 367L, 
`369` = 369L, `370` = 370L, `371` = 371L, `372` = 372L, `377` = 377L, 
`379` = 379L, `380` = 380L, `381` = 381L, `387` = 387L, `395` = 395L, 
`396` = 396L, `397` = 397L, `399` = 399L, `400` = 400L, `404` = 404L, 
`407` = 407L, `408` = 408L, `412` = 412L, `413` = 413L, `414` = 414L, 
`415` = 415L, `420` = 420L, `421` = 421L, `428` = 428L, `429` = 429L, 
`430` = 430L, `431` = 431L, `437` = 437L, `444` = 444L, `445` = 445L, 
`446` = 446L, `447` = 447L, `450` = 450L, `451` = 451L, `452` = 452L, 
`453` = 453L, `459` = 459L, `460` = 460L, `461` = 461L, `466` = 466L, 
`473` = 473L, `475` = 475L, `485` = 485L, `486` = 486L, `487` = 487L, 
`488` = 488L, `492` = 492L, `493` = 493L, `500` = 500L, `501` = 501L, 
`502` = 502L, `503` = 503L, `504` = 504L, `505` = 505L, `506` = 506L, 
`507` = 507L, `509` = 509L, `510` = 510L, `511` = 511L, `512` = 512L, 
`513` = 513L, `514` = 514L, `516` = 516L, `517` = 517L, `518` = 518L, 
`519` = 519L, `529` = 529L, `530` = 530L, `531` = 531L, `532` = 532L, 
`533` = 533L, `534` = 534L, `549` = 549L, `550` = 550L, `551` = 551L, 
`552` = 552L, `553` = 553L, `554` = 554L, `556` = 556L, `558` = 558L, 
`559` = 559L, `563` = 563L, `576` = 576L, `577` = 577L, `578` = 578L, 
`579` = 579L, `580` = 580L, `581` = 581L, `582` = 582L, `588` = 588L, 
`596` = 596L, `600` = 600L, `601` = 601L, `605` = 605L, `606` = 606L, 
`614` = 614L, `621` = 621L, `631` = 631L, `637` = 637L, `644` = 644L, 
`647` = 647L, `648` = 648L, `660` = 660L, `661` = 661L, `662` = 662L, 
`663` = 663L, `671` = 671L, `672` = 672L, `673` = 673L, `674` = 674L, 
`676` = 676L, `677` = 677L, `678` = 678L, `688` = 688L, `689` = 689L, 
`690` = 690L, `699` = 699L, `700` = 700L, `701` = 701L, `703` = 703L, 
`707` = 707L, `708` = 708L, `710` = 710L, `712` = 712L, `713` = 713L, 
`714` = 714L, `715` = 715L, `716` = 716L, `717` = 717L, `718` = 718L, 
`719` = 719L, `720` = 720L, `721` = 721L, `722` = 722L, `731` = 731L, 
`732` = 732L, `733` = 733L, `734` = 734L, `742` = 742L, `743` = 743L, 
`745` = 745L, `752` = 752L, `756` = 756L, `757` = 757L, `758` = 758L, 
`759` = 759L, `760` = 760L, `761` = 761L, `764` = 764L, `765` = 765L, 
`766` = 766L, `767` = 767L, `775` = 775L, `776` = 776L, `781` = 781L, 
`782` = 782L, `783` = 783L, `789` = 789L, `790` = 790L, `791` = 791L, 
`794` = 794L, `796` = 796L, `797` = 797L, `814` = 814L, `815` = 815L, 
`816` = 816L, `817` = 817L, `818` = 818L, `819` = 819L, `820` = 820L, 
`821` = 821L, `822` = 822L, `838` = 838L, `839` = 839L, `868` = 868L, 
`869` = 869L, `870` = 870L, `871` = 871L, `872` = 872L, `873` = 873L, 
`874` = 874L, `875` = 875L, `878` = 878L, `879` = 879L, `880` = 880L, 
`887` = 887L, `888` = 888L, `889` = 889L, `895` = 895L, `896` = 896L, 
`897` = 897L, `898` = 898L, `899` = 899L, `901` = 901L, `907` = 907L, 
`912` = 912L, `913` = 913L, `914` = 914L, `918` = 918L, `919` = 919L, 
`920` = 920L, `921` = 921L, `922` = 922L, `923` = 923L, `928` = 928L, 
`929` = 929L, `930` = 930L, `931` = 931L, `932` = 932L, `933` = 933L, 
`934` = 934L, `935` = 935L, `936` = 936L, `937` = 937L, `940` = 940L, 
`941` = 941L, `951` = 951L, `959` = 959L, `960` = 960L, `971` = 971L, 
`973` = 973L, `974` = 974L, `981` = 981L, `988` = 988L, `1005` = 1005L, 
`1013` = 1013L, `1014` = 1014L, `1015` = 1015L, `1019` = 1019L, 
`1020` = 1020L, `1021` = 1021L, `1022` = 1022L, `1023` = 1023L, 
`1024` = 1024L, `1025` = 1025L, `1026` = 1026L, `1027` = 1027L, 
`1028` = 1028L, `1034` = 1034L, `1035` = 1035L, `1036` = 1036L, 
`1037` = 1037L, `1038` = 1038L, `1039` = 1039L, `1040` = 1040L, 
`1041` = 1041L, `1043` = 1043L, `1044` = 1044L, `1045` = 1045L, 
`1046` = 1046L, `1047` = 1047L, `1064` = 1064L, `1065` = 1065L, 
`1066` = 1066L, `1067` = 1067L, `1068` = 1068L, `1075` = 1075L, 
`1083` = 1083L, `1084` = 1084L, `1085` = 1085L, `1094` = 1094L, 
`1095` = 1095L, `1105` = 1105L, `1106` = 1106L, `1107` = 1107L, 
`1117` = 1117L, `1118` = 1118L, `1119` = 1119L, `1120` = 1120L, 
`1121` = 1121L, `1123` = 1123L, `1133` = 1133L, `1134` = 1134L, 
`1155` = 1155L, `1162` = 1162L, `1163` = 1163L, `1164` = 1164L, 
`1165` = 1165L, `1166` = 1166L, `1167` = 1167L, `1168` = 1168L, 
`1172` = 1172L, `1173` = 1173L, `1174` = 1174L, `1181` = 1181L, 
`1182` = 1182L, `1188` = 1188L, `1189` = 1189L, `1190` = 1190L, 
`1195` = 1195L, `1196` = 1196L, `1197` = 1197L, `1198` = 1198L, 
`1199` = 1199L, `1209` = 1209L, `1211` = 1211L, `1212` = 1212L, 
`1213` = 1213L, `1214` = 1214L, `1215` = 1215L, `1216` = 1216L, 
`1217` = 1217L, `1218` = 1218L, `1226` = 1226L, `1240` = 1240L, 
`1241` = 1241L, `1247` = 1247L, `1248` = 1248L, `1249` = 1249L, 
`1250` = 1250L, `1251` = 1251L, `1252` = 1252L, `1261` = 1261L, 
`1268` = 1268L, `1273` = 1273L, `1274` = 1274L, `1275` = 1275L, 
`1276` = 1276L, `1277` = 1277L, `1278` = 1278L, `1279` = 1279L, 
`1280` = 1280L, `1281` = 1281L, `1282` = 1282L, `1283` = 1283L, 
`1286` = 1286L, `1287` = 1287L, `1288` = 1288L, `1289` = 1289L, 
`1290` = 1290L, `1294` = 1294L, `1295` = 1295L, `1296` = 1296L, 
`1297` = 1297L, `1298` = 1298L, `1301` = 1301L, `1302` = 1302L, 
`1310` = 1310L, `1318` = 1318L, `1324` = 1324L, `1325` = 1325L, 
`1329` = 1329L, `1330` = 1330L, `1331` = 1331L, `1332` = 1332L, 
`1333` = 1333L, `1335` = 1335L, `1338` = 1338L, `1349` = 1349L, 
`1350` = 1350L, `1353` = 1353L, `1354` = 1354L, `1355` = 1355L, 
`1356` = 1356L, `1357` = 1357L, `1360` = 1360L, `1362` = 1362L, 
`1363` = 1363L, `1368` = 1368L, `1369` = 1369L, `1376` = 1376L, 
`1377` = 1377L, `1383` = 1383L, `1387` = 1387L, `1390` = 1390L, 
`1399` = 1399L, `1400` = 1400L, `1401` = 1401L, `1402` = 1402L, 
`1405` = 1405L, `1406` = 1406L, `1407` = 1407L, `1408` = 1408L, 
`1410` = 1410L, `1417` = 1417L, `1422` = 1422L, `1423` = 1423L, 
`1434` = 1434L, `1435` = 1435L, `1436` = 1436L, `1437` = 1437L, 
`1438` = 1438L, `1439` = 1439L, `1440` = 1440L, `1441` = 1441L, 
`1442` = 1442L, `1443` = 1443L, `1444` = 1444L, `1445` = 1445L, 
`1448` = 1448L, `1449` = 1449L, `1452` = 1452L, `1453` = 1453L, 
`1454` = 1454L, `1455` = 1455L, `1456` = 1456L, `1459` = 1459L, 
`1460` = 1460L, `1461` = 1461L, `1462` = 1462L, `1467` = 1467L, 
`1470` = 1470L, `1471` = 1471L, `1472` = 1472L, `1473` = 1473L, 
`1484` = 1484L, `1489` = 1489L, `1499` = 1499L, `1500` = 1500L, 
`1501` = 1501L, `1502` = 1502L, `1503` = 1503L, `1504` = 1504L, 
`1505` = 1505L, `1506` = 1506L, `1509` = 1509L, `1510` = 1510L, 
`1511` = 1511L, `1512` = 1512L, `1513` = 1513L, `1516` = 1516L, 
`1517` = 1517L, `1518` = 1518L, `1525` = 1525L, `1527` = 1527L, 
`1528` = 1528L, `1529` = 1529L, `1532` = 1532L, `1533` = 1533L, 
`1534` = 1534L, `1536` = 1536L, `1537` = 1537L, `1540` = 1540L, 
`1541` = 1541L, `1546` = 1546L, `1551` = 1551L, `1553` = 1553L, 
`1554` = 1554L, `1557` = 1557L, `1560` = 1560L, `1561` = 1561L, 
`1568` = 1568L, `1569` = 1569L, `1571` = 1571L, `1572` = 1572L, 
`1581` = 1581L, `1582` = 1582L, `1583` = 1583L, `1591` = 1591L, 
`1592` = 1592L, `1593` = 1593L, `1594` = 1594L, `1595` = 1595L, 
`1596` = 1596L, `1597` = 1597L, `1598` = 1598L, `1599` = 1599L, 
`1601` = 1601L, `1604` = 1604L, `1609` = 1609L, `1612` = 1612L, 
`1614` = 1614L, `1615` = 1615L, `1616` = 1616L, `1617` = 1617L, 
`1619` = 1619L, `1620` = 1620L, `1622` = 1622L, `1623` = 1623L, 
`1624` = 1624L, `1626` = 1626L, `1630` = 1630L, `1631` = 1631L, 
`1632` = 1632L, `1636` = 1636L, `1637` = 1637L, `1641` = 1641L, 
`1648` = 1648L, `1649` = 1649L, `1651` = 1651L, `1653` = 1653L, 
`1654` = 1654L, `1655` = 1655L, `1661` = 1661L, `1662` = 1662L, 
`1663` = 1663L, `1664` = 1664L, `1666` = 1666L, `1667` = 1667L, 
`1668` = 1668L, `1670` = 1670L, `1672` = 1672L, `1674` = 1674L, 
`1675` = 1675L, `1676` = 1676L, `1679` = 1679L, `1680` = 1680L, 
`1681` = 1681L, `1692` = 1692L, `1693` = 1693L, `1700` = 1700L, 
`1701` = 1701L, `1706` = 1706L, `1707` = 1707L, `1708` = 1708L, 
`1709` = 1709L, `1715` = 1715L, `1719` = 1719L, `1722` = 1722L, 
`1724` = 1724L, `1725` = 1725L, `1734` = 1734L, `1735` = 1735L, 
`1737` = 1737L, `1739` = 1739L, `1740` = 1740L, `1745` = 1745L, 
`1747` = 1747L, `1754` = 1754L, `1755` = 1755L, `1757` = 1757L, 
`1760` = 1760L, `1761` = 1761L, `1762` = 1762L, `1764` = 1764L, 
`1765` = 1765L, `1766` = 1766L, `1769` = 1769L, `1770` = 1770L, 
`1771` = 1771L, `1772` = 1772L, `1778` = 1778L, `1779` = 1779L, 
`1780` = 1780L, `1783` = 1783L, `1784` = 1784L, `1788` = 1788L, 
`1790` = 1790L, `1791` = 1791L, `1798` = 1798L, `1799` = 1799L, 
`1800` = 1800L, `1801` = 1801L, `1802` = 1802L, `1803` = 1803L, 
`1804` = 1804L, `1805` = 1805L, `1807` = 1807L, `1808` = 1808L, 
`1811` = 1811L, `1814` = 1814L, `1815` = 1815L, `1820` = 1820L, 
`1823` = 1823L, `1826` = 1826L, `1827` = 1827L, `1830` = 1830L, 
`1832` = 1832L, `1840` = 1840L, `1841` = 1841L, `1844` = 1844L, 
`1850` = 1850L, `1851` = 1851L, `1861` = 1861L, `1862` = 1862L, 
`1863` = 1863L, `1864` = 1864L, `1870` = 1870L, `1871` = 1871L, 
`1877` = 1877L, `1878` = 1878L, `1879` = 1879L, `1886` = 1886L, 
`1889` = 1889L, `1891` = 1891L, `1892` = 1892L, `1893` = 1893L, 
`1894` = 1894L, `1895` = 1895L, `1896` = 1896L, `1897` = 1897L, 
`1898` = 1898L, `1899` = 1899L, `1900` = 1900L, `1901` = 1901L, 
`1902` = 1902L, `1903` = 1903L, `1904` = 1904L, `1905` = 1905L, 
`1906` = 1906L, `1907` = 1907L, `1908` = 1908L, `1909` = 1909L, 
`1910` = 1910L, `1911` = 1911L, `1912` = 1912L, `1913` = 1913L, 
`1914` = 1914L, `1915` = 1915L, `1916` = 1916L, `1917` = 1917L, 
`1918` = 1918L, `1919` = 1919L, `1920` = 1920L, `1921` = 1921L, 
`1922` = 1922L, `1923` = 1923L, `1924` = 1924L, `1925` = 1925L, 
`1926` = 1926L, `1927` = 1927L, `1928` = 1928L, `1929` = 1929L, 
`1930` = 1930L, `1931` = 1931L, `1932` = 1932L, `1933` = 1933L, 
`1934` = 1934L, `1935` = 1935L, `1936` = 1936L, `1937` = 1937L, 
`1938` = 1938L, `1939` = 1939L, `1940` = 1940L, `1941` = 1941L, 
`1942` = 1942L, `1943` = 1943L, `1944` = 1944L, `1945` = 1945L, 
`1946` = 1946L, `1947` = 1947L, `1948` = 1948L, `1949` = 1949L, 
`1950` = 1950L, `1951` = 1951L, `1953` = 1953L, `1956` = 1956L, 
`1957` = 1957L, `1958` = 1958L, `1959` = 1959L, `1960` = 1960L, 
`1961` = 1961L, `1962` = 1962L, `1963` = 1963L, `1964` = 1964L, 
`1965` = 1965L), class = "omit"), row.names = c(1L, 3L, 4L, 11L, 
12L), class = "data.frame")

dplyr中,您可以执行此

example.df <- data.frame (id  = c(rep(1:3, each = 3)), 
sex = c("M", "", "", "F", "", "", "M", "", ""), 
value = c(1:9), 
date = c("01-01-2021", "01-01-2020", "01-01-2019", "01-01-2018", "01-01-2017", "01-01-2016", "01-01-2015", "01-01-2014", "01-01-2013"))
library(dplyr, warn.conflicts = F)

example.df %>%
group_by(id) %>%
summarise(date = date[value == max(value)],
value = max(value), 
sex  = first(sex))
#> # A tibble: 3 x 4
#>      id date       value sex  
#>   <int> <chr>      <int> <chr>
#> 1     1 01-01-2019     3 M    
#> 2     2 01-01-2016     6 F    
#> 3     3 01-01-2013     9 M

创建于2021-06-21由reprex包(v2.0.0(

以下是我使用split()purrr::map()的方法:

library(dplyr)
split(example.df, ~ id) %>%
purrr::map(~ .x %>%
mutate(sex = max(sex)) %>%
arrange(desc(value)) %>%
slice_head()) %>%
bind_rows()

退货:

id sex value       date
1  1   M     3 01-01-2019
2  2   F     6 01-01-2016
3  3   M     9 01-01-2013

使用的数据:

example.df <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), sex = c("M", "", "", "F", "", "", "M", "", ""), value = 1:9, date = c("01-01-2021", "01-01-2020", "01-01-2019", "01-01-2018", "01-01-2017", "01-01-2016", "01-01-2015", "01-01-2014", "01-01-2013")), class = "data.frame", row.names = c(NA, -9L))

sex:列中填写NA后,可以使用merge

example.df$sex <- with(example.df, ave(sex, id, FUN=function(x) x[1]))
merge(example.df, highest_value)
#  id value sex       date
#1  1     3   M 01-01-2019
#2  2     6   F 01-01-2016
#3  3     9   M 01-01-2013

或者您可以使用split而不是aggregate:

example.df[sapply(split(seq_len(nrow(example.df)), example.df$id), 
function(i) i[example.df$value[i] == max(example.df$value[i])]),]
#  id sex value       date
#3  1   M     3 01-01-2019
#6  2   F     6 01-01-2016
#9  3   M     9 01-01-2013

或者如果你只想要一个匹配:

example.df[sapply(split(seq_len(nrow(example.df)), example.df$id), 
function(i) i[which.max(example.df$value[i])]),]

也许不是最优雅的解决方案,但对于您的示例,它运行良好:

highest_value <- aggregate(. ~ id, data = example.df, max)
highest_value$date <- example.df[highest_value$value,"date"]
id sex value       date
1  1   M     3 01-01-2019
2  2   F     6 01-01-2016
3  3   M     9 01-01-2013

我们也可以使用slice_max作为@AnilGoyal解决方案的替代方案

library(dplyr)
example.df %>% 
group_by(id) %>% 
mutate(firstsex = first(sex)) %>%
slice_max(n = 1, order_by = value) %>%
ungroup %>% 
mutate(sex = firstsex, firstsex = NULL)
# A tibble: 3 x 4
id sex   value date      
<int> <chr> <int> <chr>     
1     1 M         3 01-01-2019
2     2 F         6 01-01-2016
3     3 M         9 01-01-2013

最新更新