How to reshape a dataset into a table and add headers to it in R

Question

I have the following dataset, Unifreq[2:6], which looks like this:

> Unifreq[2:6]
   and    you    for   that   with 
343668 171744 165788 153540 103160

This is what I get when I index the data like that.

I looked at the solution here:

https://stackoverflow.com/questions/23167827/using-reshape-from-wide-to-long-in-r

Then I tried doing it this way:

data.frame(frequency = Unifreq[1:20])

I did not know how to finish it, but I made some progress and now get this:

> data.frame(frequency = Unifreq[1:20])
     frequency
the     646772
and     343668
you     171744
for     165788
that    153540
with    103160
this     89900
was      88608
have     83172
are      77528
but      72908
not      64128
your     54936
all      54684
from     52880
just     52052
out      47504
they     47044
like     46660
will     46572
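
In the output above the words are stored as row names rather than as a column of their own. As a minimal sketch, assuming Unifreq is a named integer vector (the printed output suggests this, but it is not confirmed), the two columns can be built explicitly. The names freq and word_df are hypothetical and used only for illustration.

```r
# Hypothetical stand-in for the asker's Unifreq: a small named integer vector
freq <- c(the = 646772L, and = 343668L, you = 171744L)

# Build both columns explicitly.
# row.names = NULL keeps the words out of the row names (rows become 1..n).
word_df <- data.frame(
  Word = names(freq),
  Frequency = unname(freq),
  row.names = NULL,
  stringsAsFactors = FALSE
)
word_df
```

Here Word is an ordinary character column, so it can be renamed, sorted, or filtered like any other column.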

The suggestion to use stack worked well, and the result now looks like this:

> df1 <- stack(Unifreq[1:20], index=F)
> names(df1) <- c("Frequency", "Word")
> head(df1, 10)
   Frequency Word
1     646772  the
2     343668  and
3     171744  you
4     165788  for
5     153540 that
6     103160 with
7      89900 this
8      88608  was
9      83172 have
10     77528  are

However, I would like to leave out the index (the row numbers), so that it looks like this:

Word   Frequency
and     343668
you      171744
...
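
The row numbers are part of how R prints a data frame, not a column in the data. A minimal sketch of hiding them at print time, using a small hypothetical data frame word_df in place of the asker's df1:

```r
# Hypothetical two-row data frame standing in for the reshaped result
word_df <- data.frame(
  Word = c("and", "you"),
  Frequency = c(343668L, 171744L),
  stringsAsFactors = FALSE
)

# row.names = FALSE suppresses the leading index column in the printed output only;
# the data frame itself is unchanged
print(word_df, row.names = FALSE)
```

This changes only the display, so the row numbers reappear whenever the data frame is printed without that argument.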

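I tried the link you provided, but it does not seem to help me. I am fairly new to this and do not understand how to shape the data into two separate columns and display it as a table.

How would I reshape this data in R?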

Tags: r

Answer


This can be done with stack from base R:

# stack() returns the columns as (values, ind); [2:1] swaps them to (ind, values)
out <- stack(Unifreq)[2:1]
names(out) <- c("Word", "Frequency")
out
#  Word Frequency
#1  and    343668
#2  you    171744
#3  for    165788
#4 that    153540
#5 with    103160

Data

Unifreq <- structure(list(and = 343668L, you = 171744L, `for` = 165788L, 
    that = 153540L, with = 103160L), class = "data.frame", row.names = c(NA, 
-1L))
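
The sample data above is a one-row data frame, while the printed output in the question looks more like a named vector. As a sketch under that assumption, stack also accepts a named vector and gives the same two-column shape. The names freq_vec and out_vec are hypothetical.

```r
# Assumption: the asker's Unifreq is a named integer vector, not a data frame
freq_vec <- c(and = 343668L, you = 171744L, `for` = 165788L)

# stack() returns the columns as (values, ind); [2:1] swaps them
out_vec <- stack(freq_vec)[2:1]
names(out_vec) <- c("Word", "Frequency")

# stack() returns the names as a factor; convert if plain strings are preferred
out_vec$Word <- as.character(out_vec$Word)
out_vec
```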
