Regex to match Telegram usernames and delete the whole line in PHP



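I want to match Telegram usernames in a message text and delete the whole line. I tried this pattern, but the problem is that it also matches email addresses: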

.*(@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*

The pattern should match all of these lines:

Hi @username how are you?

Hi @username. How are you?

😉@username.

and it should not match emails like this:

Hi, send an email to something@domain.com
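To make the problem concrete, here is a minimal PHP sketch that runs the pattern from the question against two of the sample lines. The variable names and array labels are illustrative only, and the lookahead is assumed to use `\s`.

```php
<?php
// The pattern from the question (assumed to use \s inside the lookahead).
$pattern = '/.*(@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*/';

// Sample lines; the keys are illustrative labels only.
$lines = [
    'mention' => 'Hi @username how are you?',
    'email'   => 'Hi, send an email to something@domain.com',
];

$results = [];
foreach ($lines as $label => $line) {
    // preg_match returns 1 on a match and 0 otherwise.
    $results[$label] = preg_match($pattern, $line);
}

// Both lines match, so the email line would be deleted as well.
var_dump($results);
```

Nothing in the pattern constrains what comes before `@`, so `@domain.com` is accepted just like `@username`.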

Use

.*\B@(?=\w{5,32}\b)[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)*.*


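The \B before @ means that @ must be preceded by a non-word character or by the start of the string. In an email address such as something@domain.com a word character precedes @, so \B fails there.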
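As a rough illustration of how this could be applied in PHP, the sketch below passes the pattern to `preg_replace`. The helper name `stripUsernameLines` and the sample text are hypothetical. No `u` modifier is used, so `\w` stays ASCII-only.

```php
<?php
// Hypothetical helper: blanks every line that mentions a Telegram username.
function stripUsernameLines(string $text): string
{
    // \B before @ rejects "something@domain.com", where a word character precedes @.
    $pattern = '/.*\B@(?=\w{5,32}\b)[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)*.*/';
    // Without the s modifier, . stops at newlines, so each match covers one line.
    return preg_replace($pattern, '', $text);
}

$text = "Hi @username how are you?\n"
      . "Hi, send an email to something@domain.com\n"
      . "😉@username.";

$cleaned = stripUsernameLines($text);
echo $cleaned;
```

The matched lines are emptied but their line breaks remain, so the result still contains the newline characters that followed them.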

Explanation

NODE                     EXPLANATION
--------------------------------------------------------------------------------
.*                       any character except \n (0 or more times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
\B                       the boundary between two word chars (\w)
                         or two non-word chars (\W)
--------------------------------------------------------------------------------
@                        '@'
--------------------------------------------------------------------------------
(?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
\w{5,32}                 word characters (a-z, A-Z, 0-9, _)
                         (between 5 and 32 times (matching the
                         most amount possible))
--------------------------------------------------------------------------------
\b                       the boundary between a word char (\w)
                         and something that is not a word char
--------------------------------------------------------------------------------
)                        end of look-ahead
--------------------------------------------------------------------------------
[a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to 'Z',
                         '0' to '9' (1 or more times (matching the
                         most amount possible))
--------------------------------------------------------------------------------
(?:                      group, but do not capture (0 or more times
                         (matching the most amount possible)):
--------------------------------------------------------------------------------
_                        '_'
--------------------------------------------------------------------------------
[a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to
                         'Z', '0' to '9' (1 or more times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
)*                       end of grouping
--------------------------------------------------------------------------------
.*                       any character except \n (0 or more times
                         (matching the most amount possible))

.*[\W](@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*

I added the non-word character class [\W] before the @ symbol. You can check the result here: https://regex101.com/r/yFGegO/1
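One thing that may be worth checking with this variant: `[\W]` has to consume a character, so a line that begins with the mention has nothing for it to match. The sketch below is only an illustration. The `$leading` sample is a hypothetical input that is not among the lines in the question.

```php
<?php
// Pattern from this answer: [\W] must consume one non-word character before @.
$withClass = '/.*[\W](@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*/';

$email   = 'Hi, send an email to something@domain.com';
$mention = 'Hi @username how are you?';
$leading = '@username how are you?'; // hypothetical: the mention starts the line

$emailMatch   = preg_match($withClass, $email);   // 0: a word character precedes @
$mentionMatch = preg_match($withClass, $mention); // 1: a space precedes @
$leadingMatch = preg_match($withClass, $leading); // 0: nothing precedes @

var_dump($emailMatch, $mentionMatch, $leadingMatch);
```

A zero-width assertion such as `\B` does not have this limitation, because it also succeeds at the start of the string when the next character is `@`.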

Nothing new under the sun, but basically the other patterns can be simplified to:

.*?\B@\w{5}.*


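or eventually:

.*?\B@\w{5,64}\b.*

if you want to be more precise, but is that really needed?

Note: if you also want to remove the newline sequence, add \R? at the end of the pattern.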
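The effect of the trailing `\R?` can be sketched as follows. This assumes the simplified pattern `.*?\B@\w{5}.*` with its backslashes in place, and the sample text and variable names are illustrative.

```php
<?php
$text = "first line\nHi @username how are you?\nlast line";

// Without \R? the matched line is emptied but its line break stays behind.
$withoutR = preg_replace('/.*?\B@\w{5}.*/', '', $text);

// With \R? the line break after the matched line is consumed as well.
$withR = preg_replace('/.*?\B@\w{5}.*\R?/', '', $text);

var_dump($withoutR, $withR);
```

Here `$withoutR` keeps an empty line between the two remaining lines, while `$withR` joins them directly.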
