I have a data frame in R, created as follows:
set.seed(123)
A <- as.data.frame(matrix(rnorm(20 * 5, mean = 0, sd = 1), 20, 5))
This gives:
> A
V1 V2 V3 V4 V5
1 -0.56047565 -1.06782371 -0.69470698 0.37963948 0.005764186
2 -0.23017749 -0.21797491 -0.20791728 -0.50232345 0.385280401
3 1.55870831 -1.02600445 -1.26539635 -0.33320738 -0.370660032
4 0.07050839 -0.72889123 2.16895597 -1.01857538 0.644376549
5 0.12928774 -0.62503927 1.20796200 -1.07179123 -0.220486562
6 1.71506499 -1.68669331 -1.12310858 0.30352864 0.331781964
7 0.46091621 0.83778704 -0.40288484 0.44820978 1.096839013
8 -1.26506123 0.15337312 -0.46665535 0.05300423 0.435181491
9 -0.68685285 -1.13813694 0.77996512 0.92226747 -0.325931586
10 -0.44566197 1.25381492 -0.08336907 2.05008469 1.148807618
11 1.22408180 0.42646422 0.25331851 -0.49103117 0.993503856
12 0.35981383 -0.29507148 -0.02854676 -2.30916888 0.548396960
13 0.40077145 0.89512566 -0.04287046 1.00573852 0.238731735
14 0.11068272 0.87813349 1.36860228 -0.70920076 -0.627906076
15 -0.55584113 0.82158108 -0.22577099 -0.68800862 1.360652449
16 1.78691314 0.68864025 1.51647060 1.02557137 -0.600259587
17 0.49785048 0.55391765 -1.54875280 -0.28477301 2.187332993
18 -1.96661716 -0.06191171 0.58461375 -1.22071771 1.532610626
19 0.70135590 -0.30596266 0.12385424 0.18130348 -0.235700359
20 -0.47279141 -0.38047100 0.21594157 -0.13889136 -1.026420900
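For each row, I want to find the position of the highest value, and then report the percentage of rows in which the highest value falls in each column, i.e.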
V1 V2 V3 V4 V5
2% 25% 40% 30% 3%
How can I compute this in R?
Use max.col and table:
max.col(A)
# [1] 4 5 1 3 3 1 5 5 4 4 1 5 4 3 5 1 5 5 1 3
table(max.col(A))
# 1 3 4 5
# 5 4 4 7
table(names(A)[max.col(A)])/nrow(A)
# V1 V3 V4 V5
# 0.25 0.20 0.20 0.35
This does not match your expected output, but I suspect that is because you were only illustrating what the result should look like.
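One detail that may matter on other data: max.col breaks ties at random by default (ties.method = "random"), so results can differ between runs if a row has two equal maxima. The sample data here is continuous, so ties are unlikely, but a minimal sketch that pins the behavior, assuming the leftmost column should win on a tie, could look like this (idx and shares are names introduced only for illustration):

```r
set.seed(123)
A <- as.data.frame(matrix(rnorm(20 * 5, mean = 0, sd = 1), 20, 5))

# ties.method = "first" picks the leftmost column when a row has tied maxima;
# the default "random" may vary from run to run in that case.
idx <- max.col(A, ties.method = "first")

# Share of rows in which each column holds the row maximum.
shares <- table(names(A)[idx]) / nrow(A)
shares
```

Columns that never hold a row maximum (V2 here) are still dropped from this table.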
Similar to r2evans's solution, but with zeros filled in for columns that never hold the maximum:
max.col(A) |>
factor(levels = seq_along(A), labels = names(A)) |>
table() |>
prop.table()
# V1 V2 V3 V4 V5
# 0.25 0.00 0.20 0.20 0.35
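If the goal is the percentage layout shown in the question, the proportions can be turned into labelled strings. This is only a sketch built on the pipeline above: props and pct are illustrative names, rounding to whole percentages is an assumption, and the native pipe |> requires R 4.1 or later.

```r
set.seed(123)
A <- as.data.frame(matrix(rnorm(20 * 5, mean = 0, sd = 1), 20, 5))

# Proportion of rows whose maximum falls in each column, zeros included.
props <- max.col(A) |>
  factor(levels = seq_along(A), labels = names(A)) |>
  table() |>
  prop.table()

# Convert to whole-number percentage strings, keeping the column names.
pct <- setNames(paste0(round(100 * as.vector(props)), "%"), names(props))
print(pct, quote = FALSE)
# V1 V2 V3 V4 V5
# 25% 0% 20% 20% 35%
```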