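javascript - How to use .replace to remove part of a string (if that part exists in an array)?
Problem description
I want to remove several words from a string (this will happen inside a for loop):
Most of the words I need to remove are listed below (this is the regular expression I tried):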
\b([[:<:]][0-9a-zA-z][[:>:]]|^'|about|after|all|also|[an]|and|another|any|are|[as]|at|[be]|because|been|before|being|\bbetween|both|but|by|came|can|come|could|did|do|each|for|from|get|got|had|[has]|have|he|her|here|him|himself|his|how|if|in|into|is|it|like|make|many|me|might|more|most|much|must|my|never|now|of|on|only|or|other|our|out|over|said|same|see|should|since|some|still|such|take|than|that|the|their|them|then|there|these|they|this|those|through|to|too|under|up|very|was|way|we|well|were|what|where|which|while|who|with|would|you|your)
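As you can see, I need to remove a-z, A-Z, 0-9, and several words.
As an example, I have this phrase:
"This is Stackoverflow's data and its into many sites"
My expected result is:
"This is Stackoverflow's data and its many sites"
What I tried is this: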
let wordsHidden=["[about]","[after]","[all]","[also]","[an]","[and]","[another]","[any]","[are]","[as]","[at]","[be]","[because]","[been]","[before]","[being]","[between]","[both]","[but]","[by]","[came]","[can]","[come]","[could]","[did]","[do]","[each]","[for]","[from]","[get]","[got]","[had]","[has]","[have]","[he]","[her]","[here]","[him]","[himself]","[his]","[how]","[if]","[in]","[into]","[is]","[it]","[like]","[make]","[many]","[me]","[might]","[more]","[most]","[much]","[must]","[my]","[never]","[now]","[of]","[on]","[only]","[or]","[other]","[our]","[out]","[over]","[said]","[same]","[see]","[should]","[since]","[some]","[still]","[such]","[take]","[than]","[that]","[the]","[their]","[them]","[then]","[there]","[these]","[they]","[this]","[those]","[through]","[to]","[too]","[under]","[up]","[very]","[was]","[way]","[we]","[well]","[were]","[what]","[where]","[which]","[while]","[who]","[with]","[would]","[you]","[your]"];
let test = wordsHidden.join("|");
let regexorg = "/\b([[:<:]][0-9a-zA-z][[:>:]]|^'|"+test+")";
var regex = new RegExp("/"+wordsHidden.join("|")+"/", 'g');
let string = "DLs between data";
console.log(string.replace(regex,''));
Is there a way to treat each element of the array as a whole word and return the whole processed string?
Solution
I'm not sure what you're trying to do with the start of your regex, but I have figured out a way to delete specific words (each surrounded by non-word characters) from a string.
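As a side note on why the attempt in the question removes single letters rather than whole words: square brackets in a regex form a character class, so "[the]" matches any one of t, h, or e. The slashes are also not needed when a pattern is passed to the RegExp constructor as a string, and "\b" inside an ordinary string literal is a backspace character rather than a word boundary. The sketch below uses a shortened, hypothetical list to show the effect; it is an illustration, not the asker's exact data.

```javascript
// Hypothetical, shortened version of the bracketed list from the question.
const bracketed = ["[an]", "[be]", "[the]"];

// Same construction as in the question: the "/" characters become part of
// the pattern, and each "[...]" is a character class, not a word.
const attempted = new RegExp("/" + bracketed.join("|") + "/", "g");

// Only the middle alternative "[be]" can match here, so every "b" and "e"
// is deleted one character at a time.
const attemptedResult = "between data".replace(attempted, "");
console.log(attemptedResult); // "twn data"

// A lone character class behaves the same way: t, h, and e are removed.
const classResult = "between".replace(/[the]/g, "");
console.log(classResult); // "bwn"
```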
If you match JUST the exact strings, you will be left with extra spaces, so my approach is to match a non-word character on either side of each word, and to keep matching each following word that is in the list. If we DON'T chain words like this, we won't catch adjacent words: each word would try to match the non-word characters around itself, those matches would collide, and the second of two adjacent words would be missed.
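The collision is easier to see in isolation. The following sketch compares an unchained pattern with the chained one on a two-word list; the names sampleWords, unchained, chained, and sampleText are made up for this illustration.

```javascript
// Hypothetical two-word list, just to show the effect of chaining.
const sampleWords = ["is", "the"];
const sampleText = "This is the end";

// Unchained: one non-word character, one listed word, one non-word character.
const unchained = new RegExp("\\W(?:" + sampleWords.join("|") + ")\\W", "g");

// Chained: one non-word character, then one or more "word + non-word" pairs.
const chained = new RegExp("\\W((" + sampleWords.join("\\W)|(") + "\\W))+", "g");

// " is " consumes the space that "the" would need in front of it,
// so the unchained pattern never matches "the".
const unchainedMatches = sampleText.match(unchained);
console.log(unchainedMatches); // [ " is " ]

// The chained pattern takes both adjacent words in a single match.
const chainedMatches = sampleText.match(chained);
console.log(chainedMatches); // [ " is the " ]
```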
const wordsHidden = ["about","after","all","also","an","and","another","any","are","as","at","be","because","been","before","being","between","both","but","by","came","can","come","could","did","do","each","for","from","get","got","had","has","have","he","her","here","him","himself","his","how","if","in","into","is","it","like","make","many","me","might","more","most","much","must","my","never","now","of","on","only","or","other","our","out","over","said","same","see","should","since","some","still","such","take","than","that","the","their","them","then","there","these","they","this","those","through","to","too","under","up","very","was","way","we","well","were","what","where","which","while","who","with","would","you","your"];
// one non-word character, then one or more listed words,
// each followed by a non-word character
const rexString = "\\W((" + wordsHidden.join("\\W)|(") + "\\W))+";
console.log(rexString);
const regex = new RegExp(rexString, 'g');
const string = "This is the Stackoverflow's Data and its into many your your you your about you sites";
// collect every match, remembering where each one ends in the original string
const matches = [];
let match = regex.exec(string);
while (match != null) {
  match.lastIndex = regex.lastIndex;
  matches.push(match);
  match = regex.exec(string);
}
let cutString = string;
// iterate through matches backwards from end of string to start,
// so we don't shift our indexes as we delete parts of the string
for (let i = matches.length - 1; i >= 0; i--) {
  match = matches[i];
  const beforeMatch = cutString.slice(0, match.lastIndex - match[0].length);
  const afterMatch = cutString.slice(match.lastIndex - 1); // leave the trailing "space", might be some other character
  console.log(beforeMatch); console.log(match[0]); console.log(afterMatch);
  cutString = beforeMatch + afterMatch;
}
console.log(cutString);
This goes from
"This is the Stackoverflow's Data and its into many your your you your about you sites" to
"This Stackoverflow's Data its sites"
with all the matching words stripped (is, the, and, into, many, your, you, about)
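Since the question asks specifically about .replace, here is an alternative sketch that is not the approach above: it matches each listed word between \b word boundaries, deletes it with a single .replace call, and then collapses the leftover whitespace. The helpers escapeRegExp and removeWords and the shortened stopWords list are hypothetical names introduced for this example. Unlike the approach above, it also removes listed words at the very start or end of the string, and because \b treats an apostrophe as a boundary, a word like "you're" would lose its "you".

```javascript
// Hypothetical helper: escape regex metacharacters so that list entries
// are matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: delete every listed word when it appears as a whole
// word, then collapse runs of whitespace and trim the ends.
function removeWords(input, words) {
  if (words.length === 0) return input;
  const pattern = new RegExp(
    "\\b(?:" + words.map(escapeRegExp).join("|") + ")\\b",
    "g" // case-sensitive, like the code above; "gi" would ignore case
  );
  return input.replace(pattern, "").replace(/\s+/g, " ").trim();
}

// Shortened list containing only the words that occur in the sample string.
const stopWords = ["is", "the", "and", "into", "many", "your", "you", "about"];
const cleaned = removeWords(
  "This is the Stackoverflow's Data and its into many your your you your about you sites",
  stopWords
);
console.log(cleaned); // "This Stackoverflow's Data its sites"
```

Collapsing whitespace afterwards changes any intentional double spaces or line breaks in the input, so this only suits text where single spaces are acceptable.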