Regex: match only when not between <tag> and </tag>



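We use a regular expression replace to search a text for terms and wrap each hit in <dfn>. This works like a charm until we have a term made up of several words that gets wrapped, followed by a term that consists of just one of those words.

Here is an example of two such terms:
"Human Design System", "Design".

Our code first finds "Human Design System" and wraps it in <dfn> tags, then finds "Design" inside it and wraps that in <dfn> tags as well.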

The result becomes:

<dfn>Human <dfn>Design</dfn> System</dfn>

whereas the result we want is:

<dfn>Human Design System</dfn>
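
The nesting can be reproduced with a minimal sketch that wraps each term in its own pass. The function name wrapSequentially is hypothetical, and the boundary checks of the real code are left out for brevity:

```javascript
// Hypothetical helper: wraps each term in a separate pass, as described above.
function wrapSequentially(text, terms) {
    return terms.reduce(function (result, term) {
        // Each pass sees the <dfn> tags added by earlier passes as plain text.
        return result.replace(new RegExp(term, 'g'), '<dfn>$&</dfn>');
    }, text);
}

var nested = wrapSequentially('Human Design System', ['Human Design System', 'Design']);
console.log(nested); // <dfn>Human <dfn>Design</dfn> System</dfn>
```

The second pass has no idea that "Design" already sits inside a wrapped term, which is exactly the problem.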

So what we need is a way to check whether a term is already wrapped in <dfn></dfn>, and simply skip the replacement in those cases.

This is the code we use now:

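// Definition of variables. Please note that ~open~ is replaced by <dfn> and ~close~ by </dfn> later.
// ESCAPERS is the character class of delimiters that may surround a term
// (backslashes are doubled and the inner quote escaped for the string literal).
var TPL_TAG_OPEN = '~open~',
    TPL_TAG_CLOSE = '~close~',
    ESCAPERS = '[\\s!:.;,%"\'\\(\\)\\{\\}]';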
//This is the RegExp that prepares the content
//term is the term that we are looking for and line is the text we are searching in
var re = new RegExp("^("+term+")(" + ESCAPERS + ")", modifier);
line = line.replace(re, TPL_TAG_OPEN + "$1" + TPL_TAG_CLOSE + "$2");
re = new RegExp("(" + ESCAPERS + ")("+term+")$", modifier);
line = line.replace(re, "$1" + TPL_TAG_OPEN + "$2" + TPL_TAG_CLOSE);
re = new RegExp("(" + ESCAPERS + ")("+term+")(" + ESCAPERS + ")", modifier);
line = line.replace(re, "$1" + TPL_TAG_OPEN +"$2" + TPL_TAG_CLOSE + "$3");

Input:

<dfn>Human Design System</dfn> Human Design Design Human Testar test Human Design Test 
Human Test Design Test Test Design <dfn>Human Design System</dfn> Test Human Design

Current result:

<dfn>Human <dfn>Design</dfn> System</dfn> Human <dfn>Design</dfn> <dfn>Design</dfn> 
Human Testar test Human <dfn>Design</dfn> Test Human Test <dfn>Design</dfn> Test Test
<dfn>Design</dfn> <dfn>Human <dfn>Design</dfn> System</dfn> Test Human <dfn>Design</dfn>

Desired result:

<dfn>Human Design System</dfn> Human <dfn>Design</dfn> <dfn>Design</dfn> 
Human Testar test Human <dfn>Design</dfn> Test Human Test <dfn>Design</dfn> 
Test Test <dfn>Design</dfn> <dfn>Human Design System</dfn> Test Human <dfn>Design</dfn>

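Note:

We have managed to detect whether a term is already wrapped in a tag, but only with the RegExp .test function, and that check makes the code bail out instead of continuing through the rest of the text. Here is that code: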

var pattern = RegExp("^("+TPL_TAG_OPEN+").*((?!"+TPL_TAG_CLOSE+").).*("+term+")*$");
if (pattern.test(line))
     return false;

Final solution:

var ESCAPERS = '[\\s!:.;,%"\'\\(\\)\\{\\}]';
var terms = ['Design','Human Design System','This and That...'];
terms = terms.join('|');
re = new RegExp("(" + ESCAPERS + "|^)(" + terms + ")(" + ESCAPERS + "|$)",'gi');
nodes.contents().filter()
     .each(function(){
          $(this).replaceWith(this.nodeValue.replace(re, '$1<dfn class="thesaurus">$2</dfn>$3'));
     });
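
One thing to watch with the joined pattern: each term is used as regex source, so the dots in 'This and That...' match any character, and terms that share a prefix depend on the order of the alternatives. Below is a minimal sketch of one way to handle both, assuming literal matching is what is wanted. The names escapeRegExp and buildTermPattern are hypothetical helpers, not part of the code above:

```javascript
// Hypothetical helper: escape regex metacharacters so a term is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build one alternation with the longest term first.
function buildTermPattern(terms, escapers) {
    var source = terms
        .slice() // do not reorder the caller's array
        .sort(function (a, b) { return b.length - a.length; })
        .map(escapeRegExp)
        .join('|');
    return new RegExp('(' + escapers + '|^)(' + source + ')(' + escapers + '|$)', 'gi');
}

var escapers = '[\\s!:.;,%"\'(){}]';
var pattern = buildTermPattern(['Design', 'Human Design System', 'This and That...'], escapers);
var wrapped = 'Human Design System and Design'
    .replace(pattern, '$1<dfn class="thesaurus">$2</dfn>$3');
console.log(wrapped);
```

Sorting by length is only one possible ordering rule; it matters when one term is a prefix of another, since the regex engine takes the first alternative that matches at a given position.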

Simply do it all in one pass:

var s = 'Human Design System Human Design Design Human Testar test ' +
        'Human Design Test Human Test Design Test Test Design Human ' +
        'Design System Test Human Design';
// Alternative matches are tried in sequence.
var t = s.replace(/Human Design System|Design/g, '<dfn>$&</dfn>');
console.log(t);

Or, to do it incrementally:

var s = 'Human Design System Human Design Design Human Testar test ' +
        'Human Design Test Human Test Design Test Test Design Human ' +
        'Design System Test Human Design';
var adddfn = function(s, term){
    // Split the text into (unwrapped text)(existing <dfn> span or end of string) pairs.
    return s.replace(/(.*?)(<dfn>.*?<\/dfn>|$)/g, function(all, one, two){
        return one.replace(RegExp(term, 'g'), '<dfn>$&</dfn>') + two;
    });
};
var terms = ['Human Design System', 'Design'];
var t = terms.reduce(function(result, term){
    return adddfn(result, term);
}, s);
console.log(t);

I would just match the tags that already exist and pass them through unchanged:

var str = "<dfn>Human Design System</dfn> Human Design Design Human Testar test Human Design Test Human Test Design Test Test Design <dfn>Human Design System</dfn> Test Human Design";
str = str.replace(/(<dfn>.+?<\/dfn>)|(Human Design System|Design)/g, function(_, $1, $2) {
    return $1 || "<dfn>" + $2 + "</dfn>";
});
alert(str)
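
The same idea can be generalized to a list of terms. This is only a sketch: wrapTerms is a hypothetical name, and it assumes the terms contain no regex metacharacters and the existing tags are plain <dfn> without attributes:

```javascript
// Hypothetical helper: wrap terms, leaving existing <dfn>...</dfn> spans untouched.
// Assumes the terms contain no regex metacharacters and the tags carry no attributes.
function wrapTerms(text, terms) {
    var alternatives = terms
        .slice()
        .sort(function (a, b) { return b.length - a.length; }) // longest first
        .join('|');
    var re = new RegExp('(<dfn>.+?</dfn>)|(' + alternatives + ')', 'g');
    return text.replace(re, function (match, existing, term) {
        // Group 1 matched: an already wrapped span, so return it unchanged.
        return existing || '<dfn>' + term + '</dfn>';
    });
}

var once = wrapTerms('<dfn>Human Design System</dfn> Human Design',
                     ['Design', 'Human Design System']);
console.log(once); // <dfn>Human Design System</dfn> Human <dfn>Design</dfn>
console.log(wrapTerms(once, ['Design', 'Human Design System']) === once); // true
```

Because wrapped spans are consumed by the first alternative, running the function a second time leaves the text as it is.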

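Here is yet another way, although it cannot be done with a single regular expression.

First find every Design that sits inside a dfn, using this pattern: (<dfn>(?:(?!</?dfn>).)* )design( (?:(?!</?dfn>).)*</dfn>)

Replace it with $1&tmp;$2, so the Design inside the tag becomes a placeholder.

Then find the remaining matches of Design and replace them with <dfn>$&</dfn>

Now match the &tmp; placeholder inside the dfn with (<dfn>(?:(?!</?dfn>).)* )&tmp;( (?:(?!</?dfn>).)*</dfn>)

and change it back with $1Design$2

Now the problem is solved.

Feel free to use the steps above if they suit you.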
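
The three steps above could be sketched in JavaScript as follows. The function name wrapDesign is hypothetical, and the sketch assumes that "&tmp;" never occurs in the text and that each <dfn> span holds the term at most once, with a space on either side, as the patterns require:

```javascript
// Hypothetical helper following the three steps above for the single term "Design".
// Assumes "&tmp;" never occurs in the input and each <dfn> span contains the term
// at most once, surrounded by spaces.
function wrapDesign(text) {
    var inside = /(<dfn>(?:(?!<\/?dfn>).)* )Design( (?:(?!<\/?dfn>).)*<\/dfn>)/g;
    var restore = /(<dfn>(?:(?!<\/?dfn>).)* )&tmp;( (?:(?!<\/?dfn>).)*<\/dfn>)/g;
    return text
        .replace(inside, '$1&tmp;$2')         // step 1: hide the term inside existing tags
        .replace(/Design/g, '<dfn>$&</dfn>')  // step 2: wrap the remaining occurrences
        .replace(restore, '$1Design$2');      // step 3: put the hidden term back
}

var fixed = wrapDesign('<dfn>Human Design System</dfn> Human Design');
console.log(fixed); // <dfn>Human Design System</dfn> Human <dfn>Design</dfn>
```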
