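python - How to read a complex .txt file and convert it to JSON
Question
I need to extract information from a txt file generated by our MRT scanner and convert it into a JSON file with a specific structure. The text file is not a data file but an actual text file containing information about the scan session. Can anyone point me in the right direction?
Below is an example of part of the file's contents. I would prefer to do this in R, but MATLAB or Python would also work.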
Example:
Image filter = "system default";
Uniformity correction = "no";
Geometry correction = "default";
IF_info_seperator = 0;
Total scan duration = "09:05.0";
Rel. SNR = 0.752056241;
Act. TR (ms) = "5000";
Act. TE (ms) = "74";
ACQ matrix M x P = "96 x 94";
ACQ voxel MPS (mm) = "2.50 / 2.55 / 2.50";
REC voxel MPS (mm) = "2.50 / 2.50 / 2.50";
Scan percentage (%) = 97.9166641;
Answer
In R, use readLines, gsub, and friends.
## read raw lines; warn=FALSE suppresses the "incomplete final line" warning
txt <- readLines("mri.txt", warn=FALSE)
## replace the first `=` on each line with `§` and split there
## (assumes `§` never occurs in the file)
spl <- strsplit(sub("=", "§", txt), "§")
## pad every list element to length 2 (fills with NA where a line has no `=`)
spl <- lapply(spl, `length<-`, 2)
## trim leading/trailing whitespace; build a two-column matrix
tmp <- t(sapply(spl, trimws))
## strip `;` and `"` from the values
tmp[,2] <- gsub(";|\"", "", tmp[,2])
tmp
# [,1] [,2]
# [1,] "Image filter" "system default"
# [2,] "Uniformity correction" "no"
# [3,] "Geometry correction" "default"
# [4,] "IF_info_seperator" "0"
# [5,] "Total scan duration" "09:05.0"
# [6,] "Rel. SNR" "0.752056241"
# [7,] "Act. TR (ms)" "5000"
# [8,] "Act. TE (ms)" "74"
# [9,] "ACQ matrix M x P" "96 x 94"
# [10,] "ACQ voxel MPS (mm)" "2.50 / 2.55 / 2.50"
# [11,] "REC voxel MPS (mm)" "2.50 / 2.50 / 2.50"
# [12,] "Scan percentage (%)" "97.9166641"
Finally, convert the matrix with jsonlite::toJSON.
library(jsonlite)
toJSON(tmp)
[["Image filter","system default"],["Uniformity correction","no"],["Geometry correction","default"],["IF_info_seperator","0"],["Total scan duration","09:05.0"],["Rel. SNR","0.752056241"],["Act. TR (ms)","5000"],["Act. TE (ms)","74"],["ACQ matrix M x P","96 x 94"],["ACQ voxel MPS (mm)","2.50 / 2.55 / 2.50"],["REC voxel MPS (mm)","2.50 / 2.50 / 2.50"],["Scan percentage (%)","97.9166641"]]
Of course, you may need to fine-tune this to get the JSON structure you want.
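Since the question also allows Python, here is a minimal sketch of the same split-at-the-first-`=` idea, producing a JSON object keyed by field name instead of an array of pairs. It assumes every relevant line has the form `key = value;`, keeps values that are quoted in the file as strings, and converts unquoted values to numbers where possible. The helper name `parse_scan_info` is hypothetical, and the sample lines are copied from the question.

```python
import json


def parse_scan_info(lines):
    """Parse 'key = value;' lines into a dict (hypothetical helper)."""
    info = {}
    for line in lines:
        # Split at the first '=' only, so values containing '=' stay intact.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines without '=' (blank lines, free text)
        key = key.strip()
        value = value.strip().rstrip(";").strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            # Quoted in the file: keep as a string.
            info[key] = value[1:-1]
        else:
            # Unquoted: try int, then float, else keep the raw text.
            try:
                info[key] = int(value)
            except ValueError:
                try:
                    info[key] = float(value)
                except ValueError:
                    info[key] = value
    return info


sample = [
    'Image filter = "system default";',
    'IF_info_seperator = 0;',
    'Rel. SNR = 0.752056241;',
    'Act. TR (ms) = "5000";',
]
result = parse_scan_info(sample)
print(json.dumps(result, indent=2))
```

To run it on a real file, pass an open file object instead of the list, for example `with open("mri.txt", encoding="utf-8") as f: result = parse_scan_info(f)`; the file name and encoding here are assumptions. If two lines share the same key, the later one overwrites the earlier one in this sketch.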