javascript - Trying to find the indices of all regex matches, but some are missed

Problem description

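I want to find the index of every vowel that comes after the first "e" in a string.

You can't get the index of a capture group directly from RegExp.exec(sInput), but you can get the length of a capture group that contains everything preceding the group you actually care about. So the regex I'm using for this is /(.*?e.*?)(a|e|i|o|u)(.*)/.

The setup is basically this: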

let re = /(.*?e.*?)(a|e|i|o|u)(.*)/g;
let sInput = "lorem ipsum";

let tMatches = [];
let tMatchIndices = [];
let iPrevIndex = 0;

let result;
while ((result = re.exec(sInput)) !== null) {
    /*  result[0]: full match
        result[1]: match for 1st capture group (.*?e.*?)
        result[2]: match for 2nd capture group (a|e|i|o|u)
        result[3]: match for 3rd capture group (.*)
    */
    let index = result[1].length + iPrevIndex;
    let sMatch = result[2];
    tMatchIndices.push(index);
    tMatches[index] = sMatch;
    iPrevIndex = index + sMatch.length;
    re.lastIndex = iPrevIndex;
}

// Iterate over the collected indices; tMatches is sparse and keyed by index
for (let i = 0; i < tMatchIndices.length; i++) {
    let index = tMatchIndices[i];
    console.log(tMatches[index] + " at index " + index);
}

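The problem is that for the input string "lorem ipsum" I need the indices of both "i" and "u"... and it only gives me the index of "i".

I know why it does this: advancing the search index past the first match cuts off the "e" that should trigger the next match. What I'm stuck on is how to fix it. I can't simply not advance the search index, or it will never get past the first match.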
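To make the failure concrete, here is a minimal sketch of what happens around the first match. It reuses the regex and input from the setup above; the names first and second are only for illustration.

```javascript
const re = /(.*?e.*?)(a|e|i|o|u)(.*)/g;
const sInput = "lorem ipsum";

// First pass: group 1 is "lorem " (6 characters), group 2 is "i"
const first = re.exec(sInput);
console.log(first[1].length, first[2]);

// Advance past the matched vowel, as the loop above does
re.lastIndex = first[1].length + first[2].length;

// The remaining text is "psum", which contains no "e",
// so the pattern can no longer match and exec returns null
const second = re.exec(sInput);
console.log(sInput.slice(7), second);
```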

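I considered simply removing each match from the search string as I go, but then every character after it shifts to the left, so the indices I collect would no longer be accurate for the original, untruncated string.

What should I do?

Tags: javascript, regex, string, indexing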

Solution

You can do this with a positive lookbehind:

'lorem ipsum'.replace(/(?<=e.*)[aiueo]/g, function(m, offset) {
  console.log(m + ' ==> ' + offset)
});

Output:

i ==> 6
u ==> 9

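Explanation:

(?<=e.*) - positive lookbehind: an "e" must appear somewhere before the current position
[aiueo] - matches a vowel
the g flag repeats the match across the whole string
in the replace callback, the second argument is the offset of the match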
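
If the indices are needed as data rather than logged, the same idea can be sketched with String.prototype.matchAll, which avoids calling replace only for its side effects. Both function names below are hypothetical. The second function is an alternative for engines without lookbehind support: it assumes it is acceptable to locate the first "e" with indexOf and start a plain vowel search just after it.

```javascript
// Hypothetical helper: [vowel, index] pairs for every vowel after the first "e".
function vowelsAfterFirstE(input) {
  // The lookbehind requires an "e" earlier on the same line
  // ("." does not match line terminators without the s flag).
  const re = /(?<=e.*)[aeiou]/g;
  return Array.from(input.matchAll(re), (m) => [m[0], m.index]);
}

// Hypothetical alternative without lookbehind.
function vowelsAfterFirstENoLookbehind(input) {
  const firstE = input.indexOf("e");
  if (firstE === -1) return []; // no "e", so nothing qualifies
  const re = /[aeiou]/g;
  re.lastIndex = firstE + 1; // start scanning right after the first "e"
  const found = [];
  let m;
  while ((m = re.exec(input)) !== null) {
    found.push([m[0], m.index]);
  }
  return found;
}

console.log(vowelsAfterFirstE("lorem ipsum"));             // [ [ 'i', 6 ], [ 'u', 9 ] ]
console.log(vowelsAfterFirstENoLookbehind("lorem ipsum")); // [ [ 'i', 6 ], [ 'u', 9 ] ]
```

The two versions differ on multi-line input: the lookbehind only sees an "e" on the same line, while the indexOf version treats the first "e" anywhere in the string as the starting point. Both are case-sensitive as written.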
