How do I split a long string on a regular expression pattern and keep the delimiter?

Problem description

I have a long string of addresses, each structured like this:

123 Main Street St. Louisville OH 43071,432

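I want to split the address string at the state, ZIP code, and house number (in the example above, that would be: OH 43071,432). Although I have a regular expression that identifies these elements in each string (/\d+,\d+/), splitting on it removes the delimiter.

I have seen other Stack Overflow threads that address similar problems, but none of those solutions work. For example, if I wrap the regular expression in a capturing group, as in (/(\d+,\d+)/), it returns the ZIP code and house number as a separate array element: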

[ '123 Main Street St. Louisville OH ',
  '43071,432',

Likewise, adding ?! or ?= to the regular expression has no effect.

How can I split the address string so that the output looks like this:
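
The behavior described so far can be reproduced on a shortened input. This is only an illustrative sketch: `sample` is a hypothetical two-address excerpt of the full list, and the variable names are not from the original post.

```javascript
// Hypothetical two-address excerpt of the full address string.
const sample =
  "123 Main Street St. Louisville OH 43071,432 Main Long Road St. Louisville OH 43071,786";

// Without a capturing group, split() discards whatever the pattern matched.
const plain = sample.split(/\d+,\d+/);

// With a capturing group, the matched text is kept, but as its own element.
const captured = sample.split(/(\d+,\d+)/);

console.log(plain);
console.log(captured);
```

In both cases the ZIP code and the next house number end up removed or isolated, because a single `split` cannot assign the same digits to two neighboring pieces.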

[ '123 Main Street St. Louisville OH 43071,432',
   Main Long Road St. Louisville OH 43071,786

The list of addresses I have is:

let addr =
  "123 Main Street St. Louisville OH 43071,432 Main Long Road St. Louisville OH 43071,786 High Street Pollocksville NY 56432,54 Holy Grail Street Niagara Town ZP 32908,3200 Main Rd. Bern AE 56210,1 Gordon St. Atlanta RE 13000,10 Pussy Cat Rd. Chicago EX 34342,10 Gordon St. Atlanta RE 13000,58 Gordon Road Atlanta RE 13000,22 Tokyo Av. Tedmondville SW 43098,674 Paris bd. Abbeville AA 45521,10 Surta Alley Goodtown GG 30654,45 Holy Grail Al. Niagara Town ZP 32908,320 Main Al. Bern AE 56210,14 Gordon Park Atlanta RE 13000,100 Pussy Cat Rd. Chicago EX 34342,2 Gordon St. Atlanta RE 13000,5 Gordon Road Atlanta RE 13000,2200 Tokyo Av. Tedmondville SW 43098,67 Paris St. Abbeville AA 45521,11 Surta Avenue Goodtown GG 30654,45 Holy Grail Al. Niagara Town ZP 32918,320 Main Al. Bern AE 56215,14 Gordon Park Atlanta RE 13200,100 Pussy Cat Rd. Chicago EX 34345,2 Gordon St. Atlanta RE 13222,5 Gordon Road Atlanta RE 13001,2200";

Tags: javascript, regex

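Solution


Because your matches overlap, you can't use split. Instead, call .exec repeatedly with a pattern that contains a capturing group, and extract the group from each match. Match a comma or the start of the string, then capture, inside a lookahead, the address string followed by a comma and digits: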

const addr = "123 Main Street St. Louisville OH 43071,432 Main Long Road St. Louisville OH 43071,786 High Street Pollocksville NY 56432,54 Holy Grail Street Niagara Town ZP 32908,3200 Main Rd. Bern AE 56210,1 Gordon St. Atlanta RE 13000,10 Pussy Cat Rd. Chicago EX 34342,10 Gordon St. Atlanta RE 13000,58 Gordon Road Atlanta RE 13000,22 Tokyo Av. Tedmondville SW 43098,674 Paris bd. Abbeville AA 45521,10 Surta Alley Goodtown GG 30654,45 Holy Grail Al. Niagara Town ZP 32908,320 Main Al. Bern AE 56210,14 Gordon Park Atlanta RE 13000,100 Pussy Cat Rd. Chicago EX 34342,2 Gordon St. Atlanta RE 13000,5 Gordon Road Atlanta RE 13000,2200 Tokyo Av. Tedmondville SW 43098,67 Paris St. Abbeville AA 45521,11 Surta Avenue Goodtown GG 30654,45 Holy Grail Al. Niagara Town ZP 32918,320 Main Al. Bern AE 56215,14 Gordon Park Atlanta RE 13200,100 Pussy Cat Rd. Chicago EX 34345,2 Gordon St. Atlanta RE 13222,5 Gordon Road Atlanta RE 13001,2200";
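const matches = [];
// (?:^|,) anchors at the start of the string or at a comma.
// The lookahead captures "<address>,<digits>" without consuming it,
// so the digits after the comma can also begin the next match.
// The trailing . consumes one character so the zero-width match at ^ advances.
const pattern = /(?:^|,)(?=([^,]+,\d+))./g;
let match;
while ((match = pattern.exec(addr)) !== null) {
  matches.push(match[1]);
}
console.log(matches);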
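
As a possible variation, the same pattern can be used with `String.prototype.matchAll` instead of an explicit `exec` loop. The sketch below assumes an environment that supports `matchAll` (ES2020 or later); the helper name `extractAddresses` and the shortened `sample` string are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: returns every overlapping "<address>,<digits>" piece.
function extractAddresses(text) {
  // Same pattern as in the answer; matchAll requires the g flag.
  const pattern = /(?:^|,)(?=([^,]+,\d+))./g;
  // Each match object exposes the lookahead's capture as m[1].
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Hypothetical three-address excerpt of the full address string.
const sample =
  "123 Main Street St. Louisville OH 43071,432 Main Long Road St. Louisville OH 43071,786 High Street Pollocksville NY 56432,54";

const result = extractAddresses(sample);
console.log(result);
```

The house number after each comma appears twice in the result: once at the end of one entry and once at the start of the next. That overlap is what a single split cannot produce.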

