HTML Purifier is stripping tables. Does anyone know why?



To sanitize the input from a WYSIWYG editor, I just downloaded HTML Purifier, but it seems to be stripping out tables.

If I feed it this text:

<font face="Times New Roman" size="3">
</font><p style="margin: 0in 0in 0pt; line-height: 150%; mso-outline-level: 3;"><span style='color: black; line-height: 150%; font-family: "Arial","sans-serif"; font-size: 12pt; mso-ascii-theme-font: minor-bidi; mso-hansi-theme-font: minor-bidi; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi;'>Recruitment methods</span></p><font face="Times New Roman" size="3">
</font><table style="border: currentColor; border-image: none; border-collapse: collapse; mso-border-alt: solid windowtext .5pt; mso-yfti-tbllook: 1184; mso-padding-alt: 0in 5.4pt 0in 5.4pt;" border="1" cellspacing="0" cellpadding="0"><font face="Times New Roman" size="3">
</font><tbody><tr style="mso-yfti-irow: 0; mso-yfti-firstrow: yes;"><font face="Times New Roman" size="3">
</font><td width="37" style="padding: 0in 5.4pt; border: 1pt solid windowtext; border-image: none; width: 27.95pt; background-color: transparent; mso-border-alt: solid windowtext .5pt;"><font face="Times New Roman" size="3">
</font><p align="center" style="margin: 0in 0in 0pt; text-align: center; line-height: normal;"><span style='font-family: "Arial","sans-serif"; mso-ascii-theme-font: minor-bidi; mso-hansi-theme-font: minor-bidi; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi;'><font size="3">No.</font></span></p><font face="Times New Roman" size="3">
</font></td><font face="Times New Roman" size="3">
</font><td width="180" style="border-width: 1pt 1pt 1pt 0px; border-style: solid solid solid none; border-color: windowtext windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; border-image: none; width: 134.95pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;">&nbsp;</td><font face="Times New Roman" size="3">
</font><td width="210" style="border-width: 1pt 1pt 1pt 0px; border-style: solid solid solid none; border-color: windowtext windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; border-image: none; width: 157.5pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;">&nbsp;</td><font face="Times New Roman" size="3">
</font><td width="211" style="border-width: 1pt 1pt 1pt 0px; border-style: solid solid solid none; border-color: windowtext windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; border-image: none; width: 2.2in; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;">&nbsp;</td><font face="Times New Roman" size="3">
</font></tr><font face="Times New Roman" size="3">
</font><tr style="mso-yfti-irow: 1;"><font face="Times New Roman" size="3">
</font><td width="37" style="border-width: 0px 1pt 1pt; border-style: none solid solid; border-color: rgb(0, 0, 0) windowtext windowtext; padding: 0in 5.4pt; border-image: none; width: 27.95pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;"><font face="Times New Roman" size="3">
</font><p align="center" style="margin: 0in 0in 0pt; text-align: center; line-height: normal;"><span style='font-family: "Arial","sans-serif"; mso-ascii-theme-font: minor-bidi; mso-hansi-theme-font: minor-bidi; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi;'><font size="3">1</font></span></p><font face="Times New Roman" size="3">
</font></td><font face="Times New Roman" size="3">
</font><td width="180" style="border-width: 0px 1pt 1pt 0px; border-style: none solid solid none; border-color: rgb(0, 0, 0) windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; width: 134.95pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;">&nbsp;</td><font face="Times New Roman" size="3">
</font><td width="210" style="border-width: 0px 1pt 1pt 0px; border-style: none solid solid none; border-color: rgb(0, 0, 0) windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; width: 157.5pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;">&nbsp;</td><font face="Times New Roman" size="3">
</font><td width="211" style="border-width: 0px 1pt 1pt 0px; border-style: none solid solid none; border-color: rgb(0, 0, 0) windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; width: 2.2in; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;">&nbsp;</td><font face="Times New Roman" size="3">
</font></tr><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font></tbody></table><font face="Times New Roman" size="3">
</font><p align="center" style="margin: 0in 0in 10pt; text-align: center;"><span style='line-height: 115%; font-family: "Arial","sans-serif"; font-size: 12pt; mso-ascii-theme-font: minor-bidi; mso-hansi-theme-font: minor-bidi; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi;'>&nbsp;</span></p><font face="Times New Roman" size="3">
</font><p style="margin: 0in 0in 10pt;"><font face="Times New Roman" size="3">
</font><br>

I get this output:

<font face="Times New Roman" size="3">
</font><p style="margin:0in 0in 0pt;line-height:150%;"><span style="color:#000000;line-height:150%;font-family:Arial, 'sans-serif';font-size:12pt;">Recruitment methods</span></p><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><p align="center" style="margin:0in 0in 0pt;text-align:center;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3">No.</font></span></p><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><p align="center" style="margin:0in 0in 0pt;text-align:center;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3">Method</font></span></p><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><p align="center" style="margin:0in 0in 0pt;text-align:center;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3">Strengths</font></span></p><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><p align="center" style="margin:0in 0in 0pt;text-align:center;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3">Weaknesses</font></span></p><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><p align="center" style="margin:0in 0in 0pt;text-align:center;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3">1</font></span></p><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><p style="margin:0in 0in 0pt;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3">Internal recruitment</font></span></p><font face="Times New Roman" size="3">
</font><p style="margin:0in 0in 0pt;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3">Promotion</font></span></p><font face="Times New Roman" size="3">
</font><p style="margin:0in 0in 0pt;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3">Lateral transfer</font></span></p><font face="Times New Roman" size="3">
</font><p style="margin:0in 0in 0pt;line-height:normal;"><span style="font-family:Arial, 'sans-serif';"><font size="3"> </font></span></p><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3"> etc...

My setup is as follows:

require_once 'purify/library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', 'p,span[style|class],a[href|title],abbr[title],acronym[title],b,strong,blockquote[cite],code,em,i,iframe[src|width|height],img[alt|title|class|src|height|width],h1,h2,h3,h3,ol,ul,li,table[class|style],tr,td,hr');
$purifier = new HTMLPurifier($config);

I only added the HTML.Allowed line to try to explicitly allow tables, but that did not work. Does anyone know why it is stripping out the tables?

Thanks

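This is a bit odd. Initially I thought there might be a <font> tag (an inline element) wrapped around a block-level element, forcing it to be stripped, with the errors cascading from there. But after running the code through a basic (dumb) HTML formatter, it looks like the <font> tags are all fairly self-contained.

Turning on error collection tells us what is going on, though. The problem seems to be that, even though the <font> elements are self-contained, HTML Purifier closes the <table> tag as soon as it hits the first <font>, rather than removing the <font> (as one might expect):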

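Notice Line 7, Column 8: <table> started on line 6 auto-closed by <font>

Error Line 10, Column 8: <tr> removed

Notice Line 11, Column 12: <tr> started on line 10 auto-closed by <font>

Notice Line 11, Column 12: <tbody> started on line 9 auto-closed by <font>

Warning Line 31, Column 8: Unnecessary <tr> tag removed

Error Line 34, Column 8: <tr> removed

Notice Line 35, Column 12: <tr> started on line 34 auto-closed by <font>

Warning Line 55, Column 8: Unnecessary <tr> tag removed

Warning Line 59, Column 8: Unnecessary <tbody> tag removed

Warning Line 60, Column 4: Unnecessary <table> tag removed

Notice End of Document: <p> tag started on line 66 closed by end of document

Warning End of Document: <div> node reorganized to enforce its content model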
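If you would rather produce a report like this locally than through the demo, something along these lines should work. This is only a sketch: it assumes the same install path as in the question, `$dirtyHtml` is a made-up placeholder for your editor output, and `Core.CollectErrors` is documented as experimental, so I would only use it for debugging.

```php
<?php
// Same include path as in the question; adjust to your install.
require_once 'purify/library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// Ask HTML Purifier to record what it changes while cleaning (experimental).
$config->set('Core.CollectErrors', true);

$purifier = new HTMLPurifier($config);

// Placeholder input: a <font> sitting directly inside a <table>.
$dirtyHtml = '<table><font face="Times New Roman" size="3"></font>'
           . '<tr><td>x</td></tr></table>';
$cleanHtml = $purifier->purify($dirtyHtml);

// After purify() has run, the collector is available from the context.
$errors = $purifier->context->get('ErrorCollector');
echo $errors->getHTMLFormatted($config);
```

The formatted report lists each message with its severity and position, which is how you can see which tag caused the table to be closed.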

The messages above are what the demo outputs if you select CollectErrors and insert the following HTML:

<font face="Times New Roman" size="3">
</font>
<p style="margin: 0in 0in 0pt; line-height: 150%; mso-outline-level: 3;"><span style='color: black; line-height: 150%; font-family: "Arial","sans-serif"; font-size: 12pt; mso-ascii-theme-font: minor-bidi; mso-hansi-theme-font: minor-bidi; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi;'>Recruitment methods</span></p>
<font face="Times New Roman" size="3">
</font>
<table style="border: currentColor; border-image: none; border-collapse: collapse; mso-border-alt: solid windowtext .5pt; mso-yfti-tbllook: 1184; mso-padding-alt: 0in 5.4pt 0in 5.4pt;" border="1" cellspacing="0" cellpadding="0">
<font face="Times New Roman" size="3">
</font>
<tbody>
<tr style="mso-yfti-irow: 0; mso-yfti-firstrow: yes;">
<font face="Times New Roman" size="3">
</font>
<td width="37" style="padding: 0in 5.4pt; border: 1pt solid windowtext; border-image: none; width: 27.95pt; background-color: transparent; mso-border-alt: solid windowtext .5pt;">
<font face="Times New Roman" size="3">
</font>
<p align="center" style="margin: 0in 0in 0pt; text-align: center; line-height: normal;"><span style='font-family: "Arial","sans-serif"; mso-ascii-theme-font: minor-bidi; mso-hansi-theme-font: minor-bidi; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi;'><font size="3">No.</font></span></p>
<font face="Times New Roman" size="3">
</font>
</td>
<font face="Times New Roman" size="3">
</font>
<td width="180" style="border-width: 1pt 1pt 1pt 0px; border-style: solid solid solid none; border-color: windowtext windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; border-image: none; width: 134.95pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;">&nbsp;</td>
<font face="Times New Roman" size="3">
</font>
<td width="210" style="border-width: 1pt 1pt 1pt 0px; border-style: solid solid solid none; border-color: windowtext windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; border-image: none; width: 157.5pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;">&nbsp;</td>
<font face="Times New Roman" size="3">
</font>
<td width="211" style="border-width: 1pt 1pt 1pt 0px; border-style: solid solid solid none; border-color: windowtext windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; border-image: none; width: 2.2in; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt;">&nbsp;</td>
<font face="Times New Roman" size="3">
</font>
</tr>
<font face="Times New Roman" size="3">
</font>
<tr style="mso-yfti-irow: 1;">
<font face="Times New Roman" size="3">
</font>
<td width="37" style="border-width: 0px 1pt 1pt; border-style: none solid solid; border-color: rgb(0, 0, 0) windowtext windowtext; padding: 0in 5.4pt; border-image: none; width: 27.95pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;">
<font face="Times New Roman" size="3">
</font>
<p align="center" style="margin: 0in 0in 0pt; text-align: center; line-height: normal;"><span style='font-family: "Arial","sans-serif"; mso-ascii-theme-font: minor-bidi; mso-hansi-theme-font: minor-bidi; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi;'><font size="3">1</font></span></p>
<font face="Times New Roman" size="3">
</font>
</td>
<font face="Times New Roman" size="3">
</font>
<td width="180" style="border-width: 0px 1pt 1pt 0px; border-style: none solid solid none; border-color: rgb(0, 0, 0) windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; width: 134.95pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;">&nbsp;</td>
<font face="Times New Roman" size="3">
</font>
<td width="210" style="border-width: 0px 1pt 1pt 0px; border-style: none solid solid none; border-color: rgb(0, 0, 0) windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; width: 157.5pt; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;">&nbsp;</td>
<font face="Times New Roman" size="3">
</font>
<td width="211" style="border-width: 0px 1pt 1pt 0px; border-style: none solid solid none; border-color: rgb(0, 0, 0) windowtext windowtext rgb(0, 0, 0); padding: 0in 5.4pt; width: 2.2in; background-color: transparent; mso-border-alt: solid windowtext .5pt; mso-border-left-alt: solid windowtext .5pt; mso-border-top-alt: solid windowtext .5pt;">&nbsp;</td>
<font face="Times New Roman" size="3">
</font>
</tr>
<font face="Times New Roman" size="3">
</font><font face="Times New Roman" size="3">
</font>
</tbody>
</table>
<font face="Times New Roman" size="3">
</font>
<p align="center" style="margin: 0in 0in 10pt; text-align: center;"><span style='line-height: 115%; font-family: "Arial","sans-serif"; font-size: 12pt; mso-ascii-theme-font: minor-bidi; mso-hansi-theme-font: minor-bidi; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi;'>&nbsp;</span></p>
<font face="Times New Roman" size="3">
</font>
<p style="margin: 0in 0in 10pt;"><font face="Times New Roman" size="3">
</font><br>
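Since the trouble starts at the whitespace-only <font> elements sitting between the table tags, one possible workaround (my own suggestion, not something HTML Purifier provides) is to drop those empty wrappers before purifying. The sketch below uses a hypothetical helper, `stripEmptyFontTags`, and only touches <font> elements that contain nothing but whitespace. Whether the table then survives still depends on your HTML.Allowed list.

```php
<?php
/**
 * Remove <font> elements whose content is only whitespace.
 * Hypothetical helper for illustration; not part of HTML Purifier.
 */
function stripEmptyFontTags(string $html): string
{
    // Opening <font ...>, optional whitespace, then the closing </font>.
    $result = preg_replace('#<font\b[^>]*>\s*</font>#i', '', $html);

    // preg_replace() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$dirty = "<table><font face=\"Times New Roman\" size=\"3\">\n</font>"
       . "<tbody><tr><td><font size=\"3\">No.</font></td></tr></tbody></table>";

// The empty wrapper is removed; the <font> around "No." is left alone.
echo stripEmptyFontTags($dirty);
```

You would then pass the result to `$purifier->purify()` as usual. A regex is good enough here only because the wrappers are empty; it is not a general way to rewrite HTML.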

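There is another thread on the HTML Purifier forum that might make this easier to understand. The symptoms are described as follows:

When I try to purify this code: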

<table>
<tr>
<td>
<li>fffff</li>
</td>
</tr>
</table>

I get:

<table>
<tr>
<td>
</td>
</tr>
</table>
fffff

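And then the (my, heh) response:

I think what is happening is that HTML Purifier detects that <li> cannot be opened at that position, and instead of stripping the <li> first, it auto-closes the other open tags at that point, which (initially) results in: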

<table>
<tr>
<td>
</td>
</tr>
</table>
<li>fffff</li>
</td>
</tr>
</table>

Then the extraneous end tags are removed...

<table>
<tr>
<td>
</td>
</tr>
</table>
<li>fffff</li>

Then the <li> is stripped, which gives what was observed:

<table>
<tr>
<td>
</td>
</tr>
</table>
fffff

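You could try switching the Lexer to DirectLex to see whether that changes the behavior, but I doubt it; you may be stuck with this behavior. Give it a try, though.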
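For reference, switching the lexer is a one-line config change. A minimal sketch, assuming the same install path as in the question and reusing the <li> snippet from the forum thread as input:

```php
<?php
// Same include path as in the question; adjust to your install.
require_once 'purify/library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// Force the pure-PHP DirectLex lexer instead of whatever is auto-selected.
$config->set('Core.LexerImpl', 'DirectLex');

$purifier = new HTMLPurifier($config);

// Input taken from the forum thread quoted above.
echo $purifier->purify('<table><tr><td><li>fffff</li></td></tr></table>');
```

Compare the output with and without the `Core.LexerImpl` line to see whether the lexer makes any difference for your input.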
