r - Trying to convert nested sublist into a data frame
问题描述
I have a list that is a few levels deep:
[[12]]
[[12]][[1]]
estimateName insuredName priceList
"KING" "IDIF8X_MAY18"
laborEff claimNumber policyNumber
"Restoration/Service/Remodel" "000000-01" "0000000"
typeOfLoss roofDamage deprMat
"Hail" "0" "1"
deprNonMat deprOandP deprTaxes
"1" "1" "1"
onsite recipientsXNAddress carrierId
"1" "CO" "00000"
estimateType
"Mixed"
[[12]][[2]]
type lineNum
"I" "12"
cat sel
"SFG" "GUTA"
act actPrefix
"&" "R&R"
"1"
[[13]]
[[13]][[1]]
eName iName priceList
"KING" "MAY18"
laborEff claimNumber policyNumber
"Restoration/Service/Remodel" "00000-01" "000000000"
typeOfLoss roofDamage deprMat
"Hail" "0" "1"
deprNonMat deprOandP deprTaxes
"1" "1" "1"
onsite recipientsXNAddress carrierId
"1" "MRP.BRIGHTON.CO" "2570112"
estimateType
"Mixed"
[[13]][[2]]
type lineNum
"I" "13"
cat sel
"FEN" "VNLS6"
act actPrefix
"&" "R&R"
This is just a sample. I just was to create a data frame that will convert the attributes as the header and all the rows are the values for the. I tried to unlist and a few other functions but wasn't giving me the correct format.
So a desired output would be two rows of values with the column names.
list(list(structure(c("DARIAN_KING_&_CASSI1", "DARIAN KING & CASSIDY R KING",
"IDIF8X_MAY18", "Restoration/Service/Remodel", "037262569-01",
"H3726819012070", "Hail", "0", "1", "1", "1", "1", "1", "MRP.BRIGHTON.CO",
"2570112", "Mixed"), .Names = c("estimateName", "insuredName",
"priceList", "laborEff", "claimNumber", "policyNumber", "typeOfLoss",
"roofDamage", "deprMat", "deprNonMat", "deprOandP", "deprTaxes",
"onsite", "recipientsXNAddress", "carrierId", "estimateType")),
structure(c("I", "12", "SFG", "GUTA", "&", "R&R", "Gutter / downspout - aluminum - up to 5\"",
"12", "12", "LF", "Dwelling", "0.32", "4.23", "54.6", "36.33",
"2", "1", "9", "25", "18.27", "1.49", "37.28", "18.81", "56.09",
"1", "29.76", "10.08", "6.36", "13.32", "0.58", "24.84",
"1"), .Names = c("type", "lineNum", "cat", "sel", "act",
"actPrefix", "desc", "calc", "qty", "unit", "coverageName",
"remove", "replace", "total", "acv", "deprType", "recoverable",
"age", "lifeExp", "depr", "tax", "acvTotal", "deprTotal",
"rcvTotal", "isPartOfInitSettle", "laborTotal", "laborBase",
"laborBurden", "laborMarkup", "laborHours", "material", "containsBSCDontApply"
))), list(structure(c("DARIAN_KING_&_CASSI1", "DARIAN KING & CASSIDY R KING",
"IDIF8X_MAY18", "Restoration/Service/Remodel", "037262569-01",
"H3726819012070", "Hail", "0", "1", "1", "1", "1", "1", "MRP.BRIGHTON.CO",
"2570112", "Mixed"), .Names = c("estimateName", "insuredName",
"priceList", "laborEff", "claimNumber", "policyNumber", "typeOfLoss",
"roofDamage", "deprMat", "deprNonMat", "deprOandP", "deprTaxes",
"onsite", "recipientsXNAddress", "carrierId", "estimateType")),
structure(c("I", "13", "FEN", "VNLS6", "&", "R&R", "Vinyl (PVC) fence, 5'- 6' high - full slat",
"8*3", "24", "LF", "Other Structures", "4.01", "27.71", "761.28",
"721.38", "2", "9", "150", "39.9", "29.61", "749.21", "41.68",
"790.89", "1", "267.84", "141.36", "80.64", "45.84", "9.1",
"493.44", "1"), .Names = c("type", "lineNum", "cat", "sel",
"act", "actPrefix", "desc", "calc", "qty", "unit", "coverageName",
"remove", "replace", "total", "acv", "deprType", "age", "lifeExp",
"depr", "tax", "acvTotal", "deprTotal", "rcvTotal", "isPartOfInitSettle",
"laborTotal", "laborBase", "laborBurden", "laborMarkup",
"laborHours", "material", "containsBSCDontApply"))))
解决方案
Up front:
dplyr::bind_rows(lapply(datlst, function(dl) as.data.frame(as.list(unlist(dl)), stringsAsFactors=FALSE)))
# estimateName insuredName priceList laborEff claimNumber policyNumber typeOfLoss roofDamage deprMat deprNonMat deprOandP deprTaxes onsite
# 1 DARIAN_KING_&_CASSI1 DARIAN KING & CASSIDY R KING IDIF8X_MAY18 Restoration/Service/Remodel 037262569-01 H3726819012070 Hail 0 1 1 1 1 1
# 2 DARIAN_KING_&_CASSI1 DARIAN KING & CASSIDY R KING IDIF8X_MAY18 Restoration/Service/Remodel 037262569-01 H3726819012070 Hail 0 1 1 1 1 1
# recipientsXNAddress carrierId estimateType type lineNum cat sel act actPrefix desc calc qty unit coverageName remove replace total acv deprType
# 1 MRP.BRIGHTON.CO 2570112 Mixed I 12 SFG GUTA & R&R Gutter / downspout - aluminum - up to 5" 12 12 LF Dwelling 0.32 4.23 54.6 36.33 2
# 2 MRP.BRIGHTON.CO 2570112 Mixed I 13 FEN VNLS6 & R&R Vinyl (PVC) fence, 5'- 6' high - full slat 8*3 24 LF Other Structures 4.01 27.71 761.28 721.38 2
# recoverable age lifeExp depr tax acvTotal deprTotal rcvTotal isPartOfInitSettle laborTotal laborBase laborBurden laborMarkup laborHours material containsBSCDontApply
# 1 1 9 25 18.27 1.49 37.28 18.81 56.09 1 29.76 10.08 6.36 13.32 0.58 24.84 1
# 2 <NA> 9 150 39.9 29.61 749.21 41.68 790.89 1 267.84 141.36 80.64 45.84 9.1 493.44 1
Let's decompose this. Within each top-level are two very-different lists, so I'm guessing that they need to be on the same row. We can combine them simply with unlist(...)
. (I'll use str
and truncation where needed to aid in presentation.)
str(unlist(datlst[[1]]))
# Named chr [1:48] "DARIAN_KING_&_CASSI1" "DARIAN KING & CASSIDY R KING" "IDIF8X_MAY18" "Restoration/Service/Remodel" "037262569-01" "H3726819012070" "Hail" "0" "1" "1" "1" "1" "1" ...
# - attr(*, "names")= chr [1:48] "estimateName" "insuredName" "priceList" "laborEff" ...
Knowing that we can convert a named-list
into a data.frame
easily, let's convert this named-vector into a named-list:
str(as.list(unlist(datlst[[1]])))
# List of 48
# $ estimateName : chr "DARIAN_KING_&_CASSI1"
# $ insuredName : chr "DARIAN KING & CASSIDY R KING"
# $ priceList : chr "IDIF8X_MAY18"
# ...snip...
# $ laborHours : chr "0.58"
# $ material : chr "24.84"
# $ containsBSCDontApply: chr "1"
So we can apply that to each element of datlst
with lapply
. From there, we need to rbind
them all. In base R this is usually done with do.call(rbind, lapply(...))
, but I noticed that there is an element that is in one and not in the other (recoverable
), so this highlights a constraint of rbind
: the column names must be the same and in the same order. However, two (non-base) tools can work around this:
dplyr::bind_rows(...)
data.table::rbindlist(..., fill=TRUE)
I demonstrated the first, but the second works just as well ... use whichever one you may already have installed.
推荐阅读
- android - 无法将 Ionic 1 应用程序发布到具有新的 64 位支持要求的 Google Play 商店
- .net-core - 内部的 HttpResponseMessage EnsureSuccessStatusCode() 方法
- sip - Kamailio - 调度程序通过 http 确定可用性?
- angular - 角度从数据 Json 检测一天结束时间
- composer-php - 使用 Composer 和 Drupal 应用补丁时出现问题
- flutter - 如何在单个文本中包含多种字体
- mongodb - 使用鉴别器更新 mongodb 文档
- javascript - 使用 react-select 时重置值
- sed - 当想要匹配多行时,如何使用 sed 更改文本
- vue.js - 选项后的多个 v-checkbox 删除多个选择不起作用