What I want to do is let users post code when they need to, so that it is shown as text rather than rendered. For example:
<span>
<div id="hkhsdfhu"></div>
</span>
<h1>Hello</h1>
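should become:
&lt;span&gt;
&lt;div id="hkhsdfhu"&gt;&lt;/div&gt;
&lt;/span&gt;
&lt;h1&gt;Hello&lt;/h1&gt;
but only when it is wrapped in <code></code> tags. Right now I use the following function to allow only certain HTML tags and escape all others: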
function allowedHtml($str) {
    $allowed_tags = array("b", "strong", "i", "em");
    // Escape every angle bracket first.
    $sans_tags = str_replace(array("<", ">"), array("&lt;", "&gt;"), $str);
    // Then turn the escaped form of each allowed tag back into a real tag.
    $regex = sprintf("~&lt;(/)?(%s)&gt;~", implode("|", $allowed_tags));
    $with_allowed = preg_replace($regex, '<$1$2>', $sans_tags);
    return $with_allowed;
}
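However, if a user wraps their code in <code></code> tags and it contains any of the tags allowed by the function above, those tags are rendered instead of escaped. How can I make everything inside <code></code> tags get escaped (or just have < and > converted to &lt; and &gt;)? I know about htmlentities(), but I don't want to apply it to the whole post, only to the content inside <code></code> tags.
Thanks in advance!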
Just use a single preg_replace() call with the e modifier to run htmlentities() on everything inside the <code> tags.
Edit
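function allowedHtml($str) {
    // Escape the whole post, then restore only the allowed tags.
    $str = htmlentities($str, ENT_QUOTES, "UTF-8");
    $allowed_tags = array("b", "strong", "i", "em", "code");
    foreach ($allowed_tags as $tag) {
        $str = preg_replace("#&lt;" . $tag . "&gt;(.*?)&lt;/" . $tag . "&gt;#i", "<" . $tag . ">$1</" . $tag . ">", $str);
    }
    return $str;
}
$reply = allowedHtml($_POST['reply']);
// Escape the contents of <code> again (the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0).
$reply = preg_replace("#<code>(.+?)</code>#e", "'<code>'.htmlentities('$1', ENT_QUOTES, 'UTF-8').'</code>'", $reply);
// Undo the double encoding of entities that were already escaped.
$reply = str_replace("&amp;", "&", $reply);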
Rewrite your allowedHtml() function as shown and add a str_replace() at the end.
It has been tested and should now work perfectly :)
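Since the e modifier is gone as of PHP 7.0, the same idea can be written with preg_replace_callback(). The following is only a sketch under a few assumptions: the function names allowedHtmlSketch and escapeCodeBlocks are made up for illustration, and the whitelist step is a compact copy of the approach above. Because the post has already been escaped once, the callback only converts the remaining < and > inside <code>, so no double encoding needs to be undone afterwards.

```php
<?php
// Escape everything, then restore only the whitelisted tags
// (same idea as allowedHtml() above; the name is hypothetical).
function allowedHtmlSketch(string $str): string
{
    $str = htmlentities($str, ENT_QUOTES, 'UTF-8');
    $allowed = array('b', 'strong', 'i', 'em', 'code');
    foreach ($allowed as $tag) {
        $str = preg_replace(
            '#&lt;' . $tag . '&gt;(.*?)&lt;/' . $tag . '&gt;#is',
            '<' . $tag . '>$1</' . $tag . '>',
            $str
        );
    }
    return $str;
}

// Hypothetical helper: re-escape any restored tags that sit inside <code>...</code>.
function escapeCodeBlocks(string $html): string
{
    return preg_replace_callback(
        '#<code>(.*?)</code>#is',
        function (array $m): string {
            // Everything else is already escaped, so only < and > need converting.
            $inner = str_replace(array('<', '>'), array('&lt;', '&gt;'), $m[1]);
            return '<code>' . $inner . '</code>';
        },
        $html
    );
}

$reply = escapeCodeBlocks(allowedHtmlSketch('<b>bold</b> <code><b>x</b> & <h1>y</h1></code>'));
echo $reply, "\n";
// prints: <b>bold</b> <code>&lt;b&gt;x&lt;/b&gt; &amp; &lt;h1&gt;y&lt;/h1&gt;</code>
```

The <b> outside <code> stays a real tag, while the one inside is shown as text.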
Update: a new solution
function convertHtml($reply, $revert = false) {
    // Markdown-like markers and the tags they map to (same index in both arrays).
    $specials = array("**", "*", "_", "-");
    $tags = array("b", "i", "u", "s");
    foreach ($tags as $key => $tag) {
        $open = "<" . $tag . ">";
        $close = "</" . $tag . ">";
        if ($revert == true) {
            // Turn tags back into their markers (used inside code blocks).
            $special = $specials[$key];
            $reply = preg_replace("#" . $open . "(.+?)" . $close . "#i", $special . "$1" . $special, $reply);
        }
        else {
            // Escape * so it is matched literally in the pattern.
            $special = str_replace("*", "\\*", $specials[$key]);
            $reply = preg_replace("#" . $special . "(.+?)" . $special . "#i", $open . "$1" . $close, $reply);
        }
    }
    return $reply;
}
$reply = htmlentities($reply, ENT_QUOTES, "UTF-8");
$reply = convertHtml($reply);
// Lines that start with 4 whitespace characters become code blocks.
$reply = preg_replace("#[^\S\r\n]{4}(.+?)(?!.+)#i", "<pre><code>$1</code></pre>", $reply);
// Merge adjacent code blocks into one.
$reply = preg_replace("#</code></pre>(\s*)<pre><code>#i", "$1", $reply);
$reply = nl2br($reply);
// Inside code blocks, drop the <br /> tags and revert the markers.
// Note: the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0.
$reply = preg_replace("#<pre><code>(.*?)</code></pre>#se", "'<pre><code>'.convertHtml(str_replace('<br />', '', '$1'), true).'</code></pre>'", $reply);
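We discussed another solution, and the code above implements it. It works like Stack Overflow's HTML conversion: text wrapped in ** becomes bold, * becomes italic, _ becomes underlined, and - becomes strikethrough. In addition, every line that starts with 4 or more spaces is output as code.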
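For PHP 7 and later, where the e modifier is unavailable, the indented-line rule described above could be sketched roughly as follows. This is not the code above rewritten line for line: the function name renderPost is hypothetical, only ** and * are handled, and the post is split into prose and code chunks first so the markers never need to be reverted inside code.

```php
<?php
// Hypothetical helper: escape a post, turn runs of lines indented by
// 4 spaces into <pre><code> blocks, and apply ** / * markers elsewhere.
function renderPost(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Split into alternating chunks; odd indexes hold the indented blocks
    // because the capture group is returned with PREG_SPLIT_DELIM_CAPTURE.
    $parts = preg_split('/((?:^ {4}.*(?:\n|$))+)/m', $escaped, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Code chunk: strip the 4-space prefix, leave the markers untouched.
            $code = preg_replace('/^ {4}/m', '', rtrim($part, "\n"));
            $out .= '<pre><code>' . $code . "</code></pre>\n";
        } else {
            // Prose chunk: bold first, then italic.
            $part = preg_replace('/\*\*(.+?)\*\*/', '<b>$1</b>', $part);
            $part = preg_replace('/\*(.+?)\*/', '<i>$1</i>', $part);
            $out .= $part;
        }
    }
    return $out;
}

echo renderPost("**hi** there\n    <b>**raw**</b>\ndone"), "\n";
// prints:
// <b>hi</b> there
// <pre><code>&lt;b&gt;**raw**&lt;/b&gt;</code></pre>
// done
```

Splitting before converting the markers keeps the code chunks out of the marker step entirely, which is why no revert pass is needed in this sketch.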
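I think you would be better off working with the DOM directly rather than using regular expressions to parse out the allowed tags. For example, to walk the DOM and escape the contents of a <code> tag, you could do something like this: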
$doc = new DOMDocument();
$doc->loadHTML($postHtml);
$codeNode = $doc->getElementsByTagName('code')->item(0);
$escapedCode = htmlspecialchars($codeNode->nodeValue);
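Here is a way to do it with preg_replace(). Just make sure you call this before you call your allowedHtml function, so that the tags inside <code> have already been replaced.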
<?php
$post = <<<EOD
I am a person writing a post
How can I write this code?
Example:
<code>
<span>
<div id="hkhsdfhu"></div>
</span>
<h1>Hello</h1>
</code>
Pls help me...
EOD;
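// Escape everything between <code> and </code>.
// Note: the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0.
$post = preg_replace('/<code>(.*?)<\/code>/ise',
    "'<code>' . htmlspecialchars('$1') . '</code>'",
    $post);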
var_dump($post);
Result:
string(201) "I am a person writing a post
How can I write this code?
Example:
<code>
&lt;span&gt;
&lt;div id=&quot;hkhsdfhu&quot;&gt;&lt;/div&gt;
&lt;/span&gt;
&lt;h1&gt;Hello&lt;/h1&gt;
</code>
Pls help me..."
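On PHP 7.0 and later the e modifier no longer exists, so a version of the same approach using preg_replace_callback() may be needed. A minimal sketch, assuming a helper name (escapeCodeContents) that is not part of the original answer:

```php
<?php
// Hypothetical helper: escape the contents of every <code>...</code> block
// and leave the rest of the post untouched.
function escapeCodeContents(string $post): string
{
    return preg_replace_callback(
        '/<code>(.*?)<\/code>/is',
        function (array $m): string {
            // $m[1] is the raw text between the tags.
            return '<code>' . htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8') . '</code>';
        },
        $post
    );
}

$post = "Example:\n<code>\n<h1 class=\"x\">Hello</h1>\n</code>\n<em>Pls help me...</em>";
echo escapeCodeContents($post), "\n";
// prints:
// Example:
// <code>
// &lt;h1 class=&quot;x&quot;&gt;Hello&lt;/h1&gt;
// </code>
// <em>Pls help me...</em>
```

A callback receives the matched text as a plain string, so quotes in the code are passed through to htmlspecialchars() as they are, without the backslash escaping that the e modifier applied to backreferences.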
Here's one.
$str = preg_replace_callback('/(?<=<code>)(.*?)(?=<\/code>)/si', 'escape_code', $str);
function escape_code($matches) {
    // declare the tags in this array
    $tags = array('b', 'strong', 'i', 'em');
    $allowed = implode('|', $tags);
    // Escape everything between <code> and </code>.
    $match = htmlentities($matches[0], ENT_NOQUOTES, 'UTF-8');
    // Turn the escaped form of the listed tags back into real tags.
    return preg_replace('~&lt;(/)?(' . $allowed . ')(\s*/)?&gt;~i', '<$1$2$3>', $match);
}