OpenEdge: How to remove HTML tags from a string



I tried this:

REPLACE(string, "<*>", "").

But it doesn't seem to work.

>REPLACE doesn't work that way. It does no wildcard matching; the search string is taken literally.
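A small sketch of what that means in practice. The sample text is made up for illustration: REPLACE removes only the exact three characters `<*>` and leaves real tags untouched.

```abl
DEFINE VARIABLE cText AS CHARACTER NO-UNDO.

/* Hypothetical sample text containing both a literal <*> and a real tag. */
cText = "literal <*> and <b>bold</b>".

/* Only the literal <*> is removed; <b> and </b> are still there afterwards. */
MESSAGE REPLACE(cText, "<*>", "") VIEW-AS ALERT-BOX.
```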

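I've put together a simple approach below. It will fail in many cases (malformed HTML and so on), but maybe you can start from here and work your way forward.

What I do is look for < and > in the text and replace everything from the opening bracket to the closing bracket with pipes (|). You can pick any character, preferably one that doesn't occur in the text. Once that's done, all pipes are removed.

Again, this is a quick and dirty solution and not safe for production...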

PROCEDURE cleanHtml:
    DEFINE INPUT  PARAMETER pcString  AS CHARACTER   NO-UNDO.
    DEFINE OUTPUT PARAMETER pcCleaned AS CHARACTER   NO-UNDO.
    DEFINE VARIABLE iHtmlTagBegins AS INTEGER     NO-UNDO.
    DEFINE VARIABLE iHtmlTagEnds   AS INTEGER     NO-UNDO.
    DEFINE VARIABLE lHtmlTagActive AS LOGICAL     NO-UNDO.
    DEFINE VARIABLE i AS INTEGER     NO-UNDO.
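    /* Walk the string one character at a time, tracking whether we are inside a tag. */
    DO i = 1 TO LENGTH(pcString):
        IF lHtmlTagActive = FALSE AND SUBSTRING(pcString, i, 1) = "<" THEN DO:
            iHtmlTagBegins = i.
            lHtmlTagActive = TRUE.
        END.
        IF lHtmlTagActive AND SUBSTRING(pcString, i, 1) = ">" THEN DO:
            iHtmlTagEnds = i.
            lHtmlTagActive = FALSE.
            /* Overwrite the whole tag, brackets included, with the same number of pipes
               so the string keeps its length and i does not skip the next character. */
            SUBSTRING(pcString, iHtmlTagBegins, iHtmlTagEnds - iHtmlTagBegins + 1) = FILL("|", iHtmlTagEnds - iHtmlTagBegins + 1).
        END.
    END.
    /* Remove the placeholders (this also removes any pipes the original text contained). */
    pcCleaned = REPLACE(pcString, "|", "").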
END PROCEDURE.
DEFINE VARIABLE c AS CHARACTER   NO-UNDO.
RUN cleanHtml("This is a <b>text</b> with a <i>little</i> bit of <strong>html</strong> in it!", OUTPUT c).
MESSAGE c VIEW-AS ALERT-BOX.
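If pipes (or whatever placeholder you pick) can occur in the real text, one possible variation is to skip the placeholder and build the result directly from the pieces between tags. This is only a sketch under the same assumptions as above (every < starts a tag, every > ends one); the function name stripTags and its variables are made up for this example.

```abl
FUNCTION stripTags RETURNS CHARACTER (INPUT pcString AS CHARACTER):
    DEFINE VARIABLE cResult AS CHARACTER NO-UNDO.
    DEFINE VARIABLE iPos    AS INTEGER   NO-UNDO INITIAL 1.
    DEFINE VARIABLE iOpen   AS INTEGER   NO-UNDO.
    DEFINE VARIABLE iClose  AS INTEGER   NO-UNDO.

    DO WHILE iPos <= LENGTH(pcString):
        /* Find the next tag at or after the current position. */
        iOpen = INDEX(pcString, "<", iPos).
        IF iOpen = 0 THEN LEAVE.
        iClose = INDEX(pcString, ">", iOpen).
        /* An unclosed < is left in the output as-is. */
        IF iClose = 0 THEN LEAVE.
        /* Keep the text in front of the tag, then continue after the tag. */
        IF iOpen > iPos THEN
            cResult = cResult + SUBSTRING(pcString, iPos, iOpen - iPos).
        iPos = iClose + 1.
    END.

    /* Append whatever is left after the last complete tag. */
    IF iPos <= LENGTH(pcString) THEN
        cResult = cResult + SUBSTRING(pcString, iPos).

    RETURN cResult.
END FUNCTION.

MESSAGE stripTags("Some <b><i>adjacent</i></b> tags and a | pipe.") VIEW-AS ALERT-BOX.
```

Like the procedure above, this knows nothing about comments, scripts, attribute values containing >, or HTML entities, so it is no more production-safe than the original.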
