Splitting a string on multiple commas

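Problem description

How can the text below be split into records? It consists of comma-separated values, but some of the values themselves contain commas. We do know that every group starts with the pattern GO:XX.

GO:0048193, BP, Golgi vesicle transport, GO:0030198, BP, extracellular matrix organization, GO:0006903, BP, vesicle targeting, GO:0043062, BP, extracellular structure organization, GO:0048199, BP, vesicle targeting, to, from or within Golgi, GO:0031012, CC, extracellular matrix, GO:0062023, CC, collagen-containing extracellular matrix, GO:0005581, CC, collagen trimer, GO:0044420, CC, extracellular matrix component, GO:0030020, MF, extracellular matrix structural constituent conferring tensile strength, GO:0005201, MF, extracellular matrix structural constituent

I tried the following regex pattern, but it fails on values that contain more than one comma (as in GO:0048199):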

let myRegexp = /(GO:[0-9]+), (BP|MF|CC), ([^,]+)/g;
let raw = "GO:0048193, BP, Golgi vesicle transport, GO:0030198, BP, extracellular matrix organization, GO:0006903, BP, vesicle targeting, GO:0043062, BP, extracellular structure organization, GO:0048199, BP, vesicle targeting, to, from or within Golgi, GO:0031012, CC, extracellular matrix, GO:0062023, CC, collagen-containing extracellular matrix, GO:0005581, CC, collagen trimer, GO:0044420, CC, extracellular matrix component, GO:0030020, MF, extracellular matrix structural constituent conferring tensile strength, GO:0005201, MF, extracellular matrix structural constituent"
// [^,]+ stops at the first comma, so the GO:0048199 name is cut off after "vesicle targeting"
let match = myRegexp.exec(raw);
while (match != null) {
  console.log(match[0].trim());
  match = myRegexp.exec(raw);
}

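Perhaps I could split the data on the pattern GO:[0-9]+, but then I could not capture the GO ID. Capturing all the data would take two steps, which makes for ugly code. Is there a better solution?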

Tags: javascript, regex

Solution


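You can use a lookahead. The pattern matches GO: followed by digits, then lazily consumes text until the next ", GO:" or the end of the string, so commas inside a value no longer end the match: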

GO:\d+.*?(?=,\s+GO:|$)

The pattern can be tested on regex101.com.


In JavaScript, this could look like:

let myRegexp = /GO:\d+.*?(?=,\s+GO:|$)/g;
let raw = "GO:0048193, BP, Golgi vesicle transport, GO:0030198, BP, extracellular matrix organization, GO:0006903, BP, vesicle targeting, GO:0043062, BP, extracellular structure organization, GO:0048199, BP, vesicle targeting, to, from or within Golgi, GO:0031012, CC, extracellular matrix, GO:0062023, CC, collagen-containing extracellular matrix, GO:0005581, CC, collagen trimer, GO:0044420, CC, extracellular matrix component, GO:0030020, MF, extracellular matrix structural constituent conferring tensile strength, GO:0005201, MF, extracellular matrix structural constituent"
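// With the g flag, each exec() call resumes after the previous match
let match = myRegexp.exec(raw);
while (match != null) {
  // match[0] is one complete record, e.g. "GO:0048199, BP, vesicle targeting, to, from or within Golgi"
  console.log(match[0].trim());
  match = myRegexp.exec(raw);
}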
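
The question also asks for the GO ID and the other fields to be captured in the same pass. One way to do that, sketched here as an assumption rather than as part of the original answer, is to combine the lookahead with named capture groups. The helper name parseGoTerms and the field names id, namespace and name are hypothetical, and the sketch assumes a runtime that supports String.prototype.matchAll and named groups.

```javascript
// Hypothetical helper (not from the original answer): parse a flat,
// comma-separated list of GO records into objects in a single pass.
function parseGoTerms(raw) {
  // id:        "GO:" followed by digits
  // namespace: BP, MF or CC (stray whitespace around the commas is tolerated)
  // name:      lazily consumed up to the next ", GO:<digit>" or the end of input
  const pattern =
    /(?<id>GO:\d+)\s*,\s*(?<namespace>BP|MF|CC)\s*,\s*(?<name>.*?)(?=\s*,\s*GO:\d|\s*$)/g;
  return Array.from(raw.matchAll(pattern), (m) => ({
    id: m.groups.id,
    namespace: m.groups.namespace,
    name: m.groups.name,
  }));
}

// Shortened sample taken from the data in the question.
const sample =
  "GO:0006903, BP, vesicle targeting, " +
  "GO:0048199, BP, vesicle targeting, to, from or within Golgi, " +
  "GO:0031012, CC, extracellular matrix";

const terms = parseGoTerms(sample);
for (const term of terms) {
  console.log(`${term.id} | ${term.namespace} | ${term.name}`);
}
// GO:0006903 | BP | vesicle targeting
// GO:0048199 | BP | vesicle targeting, to, from or within Golgi
// GO:0031012 | CC | extracellular matrix
```

Requiring a digit after "GO:" in the lookahead is a deliberate choice in this sketch: it lowers the chance that a name which merely contains the text "GO:" is mistaken for the start of the next record.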

