I have the following dataset, Unifreq[2:6], which looks like this:
> Unifreq[2:6]
and you for that with
343668 171744 165788 153540 103160
When I index the data like this:
I looked at the solution given here:
https://stackoverflow.com/questions/23167827/using-reshape-from-wide-to-long-in-r
Then I tried doing it this way:
data.frame(frequency = Unifreq[1:20])
I wasn't sure how to finish it, but I have made some progress and now get this:
> data.frame(frequency = Unifreq[1:20])
frequency
the 646772
and 343668
you 171744
for 165788
that 153540
with 103160
this 89900
was 88608
have 83172
are 77528
but 72908
not 64128
your 54936
all 54684
from 52880
just 52052
out 47504
they 47044
like 46660
will 46572
The suggestion to use stack was a good one; the result now looks like this:
> df1 <- stack(Unifreq[1:20], index=F)
> names(df1) <- c("Frequency", "Word")
> head(df1, 10)
Frequency Word
1 646772 the
2 343668 and
3 171744 you
4 165788 for
5 153540 that
6 103160 with
7 89900 this
8 88608 was
9 83172 have
10 77528 are
However, I would like to exclude the index, so that the output looks like this:
Word Frequency
and 343668
you 171744
...
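I tried the link you provided, but it didn't seem to help me. I'm fairly new to this and don't understand how to shape the data into two separate columns and display it as a table.
How can I reshape this data in R?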
This can be done with stack() from base R:
out <- stack(Unifreq)[2:1]
names(out) <- c("Word", "Frequency")
# Word Frequency
#1 and 343668
#2 you 171744
#3 for 165788
#4 that 153540
#5 with 103160
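The row numbers in the output above are the data frame's row names. They are always stored, but they do not have to be printed. A minimal sketch of one way to hide them, assuming Unifreq is the one-row data frame given in the Data section:

```r
# Assumed input: the one-row data frame from the Data section
Unifreq <- structure(list(and = 343668L, you = 171744L, `for` = 165788L,
    that = 153540L, with = 103160L), class = "data.frame",
    row.names = c(NA, -1L))

# stack() returns the columns `values` and `ind`; [2:1] swaps their order
out <- stack(Unifreq)[2:1]
names(out) <- c("Word", "Frequency")

# Print the table without the row-name column
print(out, row.names = FALSE)
```

The data frame itself is unchanged; only the printed display omits the index.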
Data
Unifreq <- structure(list(and = 343668L, you = 171744L, `for` = 165788L,
that = 153540L, with = 103160L), class = "data.frame", row.names = c(NA,
-1L))
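With this data, the same two-column table can also be built without stack(), by pairing the column names with the flattened values. This is only an alternative sketch, not part of the answer above, and the name out2 is made up for illustration:

```r
# Assumed input: the one-row data frame from the Data section
Unifreq <- structure(list(and = 343668L, you = 171744L, `for` = 165788L,
    that = 153540L, with = 103160L), class = "data.frame",
    row.names = c(NA, -1L))

# Column names become the Word column; the single row of counts is
# flattened into a plain integer vector for the Frequency column
out2 <- data.frame(Word = names(Unifreq),
                   Frequency = unlist(Unifreq, use.names = FALSE))

print(out2, row.names = FALSE)
```

This relies on Unifreq having exactly one row, so that each column contributes a single value.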