Can I use the unlist function in a dataframe?

Question

I was working with a list containing the words from a text and the tags classifying them. I was supposed to restore an old letter, and to do this I needed to extract only the words into a vector, so instead of using sapply, I did this:

words <- unlist(data.frame(letter)[1,], use.names = FALSE)

It appeared to work, but the auxiliary professor said that doing this was a problem, since you can only use unlist on lists. I fixed it, but in the end the results were the same.

PS: I know that using sapply is more efficient, I just didn't remember the function. I'm just curious to know whether you can use unlist on other objects.

Tags: r, list, function, dataframe

Answer


As @Gregor said, data.frames are lists. Consider the following example:

df <- data.frame(Col1 = LETTERS[1:5], Col2 = 1:5, stringsAsFactors = FALSE)
is.list(df)
#[1] TRUE
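
To make the "a data.frame is a list" point a little more concrete, here is a small sketch using only base R. It rebuilds the same df so it runs on its own, and shows that the list is the underlying storage while "data.frame" is just the class attribute on top of it, with one list element per column:

```r
# Rebuild the example data frame so this snippet is self-contained
df <- data.frame(Col1 = LETTERS[1:5], Col2 = 1:5, stringsAsFactors = FALSE)

typeof(df)   # "list": the underlying storage type
class(df)    # "data.frame": the class attribute layered on top
length(df)   # 2: one list element per column
names(df)    # "Col1" "Col2": the element names are the column names
```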

As a result, you can use lapply on a data.frame to perform column-wise operations:

lapply(df,paste0, collapse = "")
#$Col1
#[1] "ABCDE"
#$Col2
#[1] "12345"
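
Since the question mentions sapply, here is a hedged side-by-side sketch. It works on a data.frame for the same reason lapply does, but tries to simplify the result; in this case that gives a named character vector instead of a list. The variable name collapsed is only illustrative:

```r
# Same example data frame, rebuilt so the snippet runs on its own
df <- data.frame(Col1 = LETTERS[1:5], Col2 = 1:5, stringsAsFactors = FALSE)

# sapply applies the function to each column, then simplifies the result
collapsed <- sapply(df, paste0, collapse = "")

is.list(collapsed)        # FALSE: the result was simplified to a vector
collapsed[["Col1"]]       # "ABCDE"
collapsed[["Col2"]]       # "12345"
```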

However, you have to be careful when subsetting a data.frame because, depending on the method you use, you may not get a list back.

df["Col2"]
#  Col2
#1    1
#2    2
#3    3
#4    4
#5    5

is.list(df["Col2"])
#[1] TRUE

df[,"Col2"]
#[1] 1 2 3 4 5

is.list(df[,"Col2"])
#[1] FALSE

is.list(df[["Col2"]])
#[1] FALSE

is.list(df$Col2)
#[1] FALSE

is.list(subset(df,select = Col2))
#[1] TRUE
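
If you want matrix-style column indexing but still need a list back, the drop argument of `[` controls this. A minimal sketch, assuming the same df as above (one_col is just an illustrative name):

```r
df <- data.frame(Col1 = LETTERS[1:5], Col2 = 1:5, stringsAsFactors = FALSE)

# Matrix-style indexing of a single column drops to a plain vector by default
is.list(df[, "Col2"])                  # FALSE

# drop = FALSE keeps the data.frame (and therefore list) structure
one_col <- df[, "Col2", drop = FALSE]
is.list(one_col)                       # TRUE
is.data.frame(one_col)                 # TRUE
```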

However, as far as I know, subsetting an entire row of a data.frame with more than one column always returns a list.

df[1,]
#  Col1 Col2
#1    A    1

is.list(df[1,])
#[1] TRUE

is.list(subset(df,1:5 == 1))
#[1] TRUE
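
One caveat worth checking on your own data: when only a single column is left, `[` drops to a vector by default, even for a row subset. The sketch below uses a hypothetical one-column data frame called single, which is not part of the original example:

```r
# Hypothetical one-column data frame, used only for illustration
single <- data.frame(Col1 = LETTERS[1:5], stringsAsFactors = FALSE)

# With one column, the default drop behaviour returns a plain vector
is.list(single[1, ])                   # FALSE

# Asking explicitly for drop = FALSE keeps a one-row data.frame
is.list(single[1, , drop = FALSE])     # TRUE
```

So if the number of columns is not known in advance, passing drop = FALSE is the safer way to guarantee a list.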

We can use the dput function to see a text representation of the underlying structure of a single row:

dput(df[1,])
#structure(list(Col1 = "A", Col2 = 1L), row.names = 1L, class = "data.frame")

As we can see, even a single row is clearly a list. So we can reasonably use unlist on that row, just as we would on any list that is not also a data.frame:

unlist(df[1,], use.names = FALSE)
#[1] "A" "1"

unlist(list(Col1 = "A", Col2 = 1L), use.names = FALSE)
#[1] "A" "1"
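
Two further behaviours are relevant to the original question, sketched here with base R only. First, unlist has to return a single atomic vector, so mixed column types are coerced to a common type (which is why the 1 above comes back as "1"). Second, calling unlist on something that is not a list is not an error; it just hands the object back unchanged. The name row_vec is only illustrative:

```r
df <- data.frame(Col1 = LETTERS[1:5], Col2 = 1:5, stringsAsFactors = FALSE)

# Mixed column types are coerced to a common type (character here)
row_vec <- unlist(df[1, ], use.names = FALSE)
class(row_vec)                 # "character"

# On an atomic vector, unlist() returns its argument as-is
identical(unlist(1:3), 1:3)    # TRUE
```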
