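r - How to parse columns/values from data with multiple delimiters in R
Problem description
I received a strange file from O365 whose values appear to be delimited by : and , and wrapped in quotes. I want to split them into separate columns and values.
An example is below: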
CreationDate UserID AuditData
2020-05-04 User1 {"Id":"4ccd2","RecordType":20,"CreationTime":"2020-05-04T10:24:44"}
2020-04-14 User2 {"Id":"4def5","RecordType":18,"CreationTime":"2020-04-14T10:24:44"}
2020-03-29 User3 {"Id":"4zxc2","RecordType":07,"CreationTime":"2020-03-29T10:24:44"}
Goal: split the AuditData column into: 1) Id and its value 2) RecordType and its value 3) CreationTime and its value
and so on.
I have been trying a few things with separate(), but so far without success. Thanks!
Solution
Here is a tidyverse approach using separate.
# Your data
df <- read.csv(text = 'CreationDate UserID AuditData
2020-05-04 User1 {"Id":"4ccd2","RecordType":20,"CreationTime":"2020-05-04T10:24:44"}
2020-04-14 User2 {"Id":"4def5","RecordType":18,"CreationTime":"2020-04-14T10:24:44"}
2020-03-29 User3 {"Id":"4zxc2","RecordType":07,"CreationTime":"2020-03-29T10:24:44"}',
               sep = " ")
library(tidyverse)
df %>%
  # Remove the curly braces using gsub
  mutate_at(vars(AuditData), function(x) gsub("\\{|\\}", "", x)) %>%
  # Separate on the colon or comma (note that this also splits the time values)
  separate(col = AuditData,
           # Define the new column names
           into = c("Id", "Idvalue", "RecordType", "RecordTypevalue",
                    "CreationTime", "temp", "time1", "time2"),
           # Use : or , as separators
           sep = "\\:|\\,") %>%
  # Use paste to reconstruct the time values
  mutate(CreationTimevalue = paste(temp, time1, time2, sep = ":")) %>%
  # Drop the helper columns: temp, time1 and time2
  select(-c(temp, time1, time2))
#   CreationDate UserID Id Idvalue RecordType RecordTypevalue CreationTime   CreationTimevalue
# 1   2020-05-04  User1 Id   4ccd2 RecordType              20 CreationTime 2020-05-04T10:24:44
# 2   2020-04-14  User2 Id   4def5 RecordType              18 CreationTime 2020-04-14T10:24:44
# 3   2020-03-29  User3 Id   4zxc2 RecordType              07 CreationTime 2020-03-29T10:24:44
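Because splitting on every colon also cuts the timestamp apart, another option is to pull each value out by its key with a regular expression. The following is only a minimal base R sketch, not part of the original answer. It assumes AuditData is held as raw text with its quotes intact, that values contain no commas, quotes or closing braces, and `extract_value` is a hypothetical helper name introduced here for illustration.

```r
# Sample rows from the question, with AuditData kept as raw text
df <- data.frame(
  CreationDate = c("2020-05-04", "2020-04-14", "2020-03-29"),
  UserID = c("User1", "User2", "User3"),
  AuditData = c(
    '{"Id":"4ccd2","RecordType":20,"CreationTime":"2020-05-04T10:24:44"}',
    '{"Id":"4def5","RecordType":18,"CreationTime":"2020-04-14T10:24:44"}',
    '{"Id":"4zxc2","RecordType":07,"CreationTime":"2020-03-29T10:24:44"}'
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the value stored under `key` in each string.
# The value is everything after "key": up to the next quote, comma or brace,
# so colons inside a timestamp are left untouched.
extract_value <- function(x, key) {
  pattern <- paste0('.*"', key, '":"?([^",}]*)"?.*')
  found <- grepl(pattern, x, perl = TRUE)
  # Rows that lack the key get NA instead of the unchanged input
  ifelse(found, sub(pattern, "\\1", x, perl = TRUE), NA_character_)
}

# Add one column per key, then drop the raw text column
keys <- c("Id", "RecordType", "CreationTime")
for (k in keys) {
  df[[k]] <- extract_value(df$AuditData, k)
}
df$AuditData <- NULL
```

All extracted values are character strings, so a leading zero such as 07 is preserved. If numeric record types are wanted, as.integer(df$RecordType) converts them, at the cost of dropping that leading zero.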