Using DOMDocument and preg_replace_callback to set tags in HTML

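I am trying to replace words from my terminology dictionary with an (HTML) anchor so that each one gets a tooltip. The replacement part works, but I can't get the result back into the DOMDocument object.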

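I wrote a recursive function that walks the DOM: it visits every child node, searches for words from my dictionary, and replaces them with an anchor.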

I had been doing this with an ordinary preg_match on the HTML, but that just runs into problems.

The recursive function:

$terms = array(
   'example'=>'explanation about example'
);
function iterate_html($doc, $original_doc = null)
    {
    global $terms;
        if(is_null($original_doc)) {
            self::iterate_html($doc, $doc);
        }
        foreach($doc->childNodes as $childnode)
        {
            $children = $childnode->childNodes;
            if($children) {
                self::iterate_html($childnode);
            } else {
                $regexes = '~\b' . implode('\b|\b',array_keys($terms)) . '\b~i';
                $new_nodevalue = preg_replace_callback($regexes, function($matches) {
                    $doc = new DOMDocument();
                    $anchor = $doc->createElement('a', $matches[0]);
                    $anchor->setAttribute('class', 'text-info');
                    $anchor->setAttribute('data-toggle', 'tooltip');
                    $anchor->setAttribute('data-original-title', $terms[strtolower($matches[0])]);
                    return $doc->saveXML($anchor);
                }, $childnode->nodeValue);

                $dom = new DOMDocument();
                $template = $dom->createDocumentFragment();
                $template->appendXML($new_nodevalue);
                $original_doc->importNode($template->childNodes, true);
                $childnode->parentNode->replaceChild($template, $childnode);
            }
        }
    }
echo iterate_html('this is just some example text.');

I expect the result to be:

this is just some <a class="text-info" data-toggle="tooltip" data-original-title="explanation about example">example</a> text

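I think building a recursive function to walk the DOM is unnecessary when you can use an XPath query. Also, I'm not sure preg_replace_callback is well suited to this case; I prefer preg_split. Here is an example: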

$html = 'this is just some example text.';
$terms = array(
   'example'=>'explanation about example'
);
// sort by reverse order of key size
// (to be sure that the longest string always wins instead of the first in the pattern)
uksort($terms, function ($a, $b) {
    $diff = mb_strlen($b) - mb_strlen($a);
    return ($diff) ? $diff : strcmp($a, $b);
});
// build the pattern inside a capture group (so that the delimiters appear in the results with the PREG_SPLIT_DELIM_CAPTURE option)
$pattern = '~\b(' . implode('|', array_map(function($i) { return preg_quote($i, '~'); }, array_keys($terms))) . ')\b~i';
// prevent any HTML parsing errors from being displayed
$libxmlInternalErrors = libxml_use_internal_errors(true);
// determine whether the HTML string already has a root html element; if not, add a fake root
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$fakeRootElement = false;
if ( $dom->documentElement->nodeName !== 'html' ) {
    $dom->loadHTML("<div>$html</div>", LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
    $fakeRootElement = true;
}
libxml_use_internal_errors($libxmlInternalErrors);
// find all text nodes (not already included in a link or between other unwanted tags)
$xp = new DOMXPath($dom);
$textNodes = $xp->query('//text()[not(ancestor::a)][not(ancestor::style)][not(ancestor::script)]');
// replacement
foreach ($textNodes as $textNode) {
    $parts = preg_split($pattern, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
    $fragment = $dom->createDocumentFragment();
    foreach ($parts as $k=>$part) {
        if ($k&1) {
            $anchor = $dom->createElement('a', $part);
            $anchor->setAttribute('class', 'text-info');
            $anchor->setAttribute('data-toggle', 'tooltip');
            $anchor->setAttribute('data-original-title', $terms[strtolower($part)]);
            $fragment->appendChild($anchor);
        } else {
            $fragment->appendChild($dom->createTextNode($part));
        }
    }
    $textNode->parentNode->replaceChild($fragment, $textNode);
}

// building of the result string
$result = '';
if ( $fakeRootElement ) {
    foreach ($dom->documentElement->childNodes as $childNode) {
        $result .= $dom->saveHTML($childNode);
    }
} else {
    $result = $dom->saveHTML();
}
echo $result;
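
The replacement loop above relies on how preg_split behaves with PREG_SPLIT_DELIM_CAPTURE: because the whole alternation sits in a single capture group, matched terms land at the odd indexes of the result and the surrounding text at the even ones, which is what the `$k&1` test checks. A minimal sketch of that behaviour, assuming a made-up sample string and a single-term pattern (both are illustrations, not part of the code above):

```php
<?php
// Hypothetical sample input; the pattern mirrors the one built above, reduced to one term.
$pattern = '~\b(example)\b~i';
$parts = preg_split($pattern, 'An Example, then another example.', -1, PREG_SPLIT_DELIM_CAPTURE);

foreach ($parts as $k => $part) {
    // Odd indexes hold the captured delimiter (a matched term), even indexes the text in between.
    echo $k, ' ', ($k & 1) ? 'term' : 'text', ': ', var_export($part, true), PHP_EOL;
}
```

Note that a text node starting or ending with a term still keeps this odd/even layout, since preg_split then emits an empty string at the corresponding end.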

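Feel free to put this into one or more functions/methods, but keep in mind that this kind of editing has a non-negligible cost: it should run each time the HTML is edited (and not each time the HTML is displayed).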
