How to validate Unicode email addresses



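In October 2009, the Internet Corporation for Assigned Names and Numbers (ICANN) approved the creation of country-code top-level domains (ccTLDs) that use the IDNA standard for native-language scripts.

Until now we have only validated a-zA-Z. Now I want to validate Unicode email addresses, for example a Chinese address such as 我@在.中国, or addresses in other languages. How can I validate them with a regular expression?

Here is a validation regex I wrote for maximum Unicode support and reasonable overall adherence to the RFC standards.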

JS:

/^(?!\.)((?!.*\.{2})[a-zA-Z0-9\u0080-\u00FF\u0100-\u017F\u0180-\u024F\u0250-\u02AF\u0300-\u036F\u0370-\u03FF\u0400-\u04FF\u0500-\u052F\u0530-\u058F\u0590-\u05FF\u0600-\u06FF\u0700-\u074F\u0750-\u077F\u0780-\u07BF\u07C0-\u07FF\u0900-\u097F\u0980-\u09FF\u0A00-\u0A7F\u0A80-\u0AFF\u0B00-\u0B7F\u0B80-\u0BFF\u0C00-\u0C7F\u0C80-\u0CFF\u0D00-\u0D7F\u0D80-\u0DFF\u0E00-\u0E7F\u0E80-\u0EFF\u0F00-\u0FFF\u1000-\u109F\u10A0-\u10FF\u1100-\u11FF\u1200-\u137F\u1380-\u139F\u13A0-\u13FF\u1400-\u167F\u1680-\u169F\u16A0-\u16FF\u1700-\u171F\u1720-\u173F\u1740-\u175F\u1760-\u177F\u1780-\u17FF\u1800-\u18AF\u1900-\u194F\u1950-\u197F\u1980-\u19DF\u19E0-\u19FF\u1A00-\u1A1F\u1B00-\u1B7F\u1D00-\u1D7F\u1D80-\u1DBF\u1DC0-\u1DFF\u1E00-\u1EFF\u1F00-\u1FFF\u20D0-\u20FF\u2100-\u214F\u2C00-\u2C5F\u2C60-\u2C7F\u2C80-\u2CFF\u2D00-\u2D2F\u2D30-\u2D7F\u2D80-\u2DDF\u2F00-\u2FDF\u2FF0-\u2FFF\u3040-\u309F\u30A0-\u30FF\u3100-\u312F\u3130-\u318F\u3190-\u319F\u31C0-\u31EF\u31F0-\u31FF\u3200-\u32FF\u3300-\u33FF\u3400-\u4DBF\u4DC0-\u4DFF\u4E00-\u9FFF\uA000-\uA48F\uA490-\uA4CF\uA700-\uA71F\uA800-\uA82F\uA840-\uA87F\uAC00-\uD7AF\uF900-\uFAFF.!#$%&'*+\-/=?^_`{|}~\-\d]+)@(?!\.)([a-zA-Z0-9\u0080-\u00FF\u0100-\u017F\u0180-\u024F\u0250-\u02AF\u0300-\u036F\u0370-\u03FF\u0400-\u04FF\u0500-\u052F\u0530-\u058F\u0590-\u05FF\u0600-\u06FF\u0700-\u074F\u0750-\u077F\u0780-\u07BF\u07C0-\u07FF\u0900-\u097F\u0980-\u09FF\u0A00-\u0A7F\u0A80-\u0AFF\u0B00-\u0B7F\u0B80-\u0BFF\u0C00-\u0C7F\u0C80-\u0CFF\u0D00-\u0D7F\u0D80-\u0DFF\u0E00-\u0E7F\u0E80-\u0EFF\u0F00-\u0FFF\u1000-\u109F\u10A0-\u10FF\u1100-\u11FF\u1200-\u137F\u1380-\u139F\u13A0-\u13FF\u1400-\u167F\u1680-\u169F\u16A0-\u16FF\u1700-\u171F\u1720-\u173F\u1740-\u175F\u1760-\u177F\u1780-\u17FF\u1800-\u18AF\u1900-\u194F\u1950-\u197F\u1980-\u19DF\u19E0-\u19FF\u1A00-\u1A1F\u1B00-\u1B7F\u1D00-\u1D7F\u1D80-\u1DBF\u1DC0-\u1DFF\u1E00-\u1EFF\u1F00-\u1FFF\u20D0-\u20FF\u2100-\u214F\u2C00-\u2C5F\u2C60-\u2C7F\u2C80-\u2CFF\u2D00-\u2D2F\u2D30-\u2D7F\u2D80-\u2DDF\u2F00-\u2FDF\u2FF0-\u2FFF\u3040-\u309F\u30A0-\u30FF\u3100-\u312F\u3130-\u318F\u3190-\u319F\u31C0-\u31EF\u31F0-\u31FF\u3200-\u32FF\u3300-\u33FF\u3400-\u4DBF\u4DC0-\u4DFF\u4E00-\u9FFF\uA000-\uA48F\uA490-\uA4CF\uA700-\uA71F\uA800-\uA82F\uA840-\uA87F\uAC00-\uD7AF\uF900-\uFAFF\-.\d]+)((\.([a-zA-Z\u0080-\u00FF\u0100-\u017F\u0180-\u024F\u0250-\u02AF\u0300-\u036F\u0370-\u03FF\u0400-\u04FF\u0500-\u052F\u0530-\u058F\u0590-\u05FF\u0600-\u06FF\u0700-\u074F\u0750-\u077F\u0780-\u07BF\u07C0-\u07FF\u0900-\u097F\u0980-\u09FF\u0A00-\u0A7F\u0A80-\u0AFF\u0B00-\u0B7F\u0B80-\u0BFF\u0C00-\u0C7F\u0C80-\u0CFF\u0D00-\u0D7F\u0D80-\u0DFF\u0E00-\u0E7F\u0E80-\u0EFF\u0F00-\u0FFF\u1000-\u109F\u10A0-\u10FF\u1100-\u11FF\u1200-\u137F\u1380-\u139F\u13A0-\u13FF\u1400-\u167F\u1680-\u169F\u16A0-\u16FF\u1700-\u171F\u1720-\u173F\u1740-\u175F\u1760-\u177F\u1780-\u17FF\u1800-\u18AF\u1900-\u194F\u1950-\u197F\u1980-\u19DF\u19E0-\u19FF\u1A00-\u1A1F\u1B00-\u1B7F\u1D00-\u1D7F\u1D80-\u1DBF\u1DC0-\u1DFF\u1E00-\u1EFF\u1F00-\u1FFF\u20D0-\u20FF\u2100-\u214F\u2C00-\u2C5F\u2C60-\u2C7F\u2C80-\u2CFF\u2D00-\u2D2F\u2D30-\u2D7F\u2D80-\u2DDF\u2F00-\u2FDF\u2FF0-\u2FFF\u3040-\u309F\u30A0-\u30FF\u3100-\u312F\u3130-\u318F\u3190-\u319F\u31C0-\u31EF\u31F0-\u31FF\u3200-\u32FF\u3300-\u33FF\u3400-\u4DBF\u4DC0-\u4DFF\u4E00-\u9FFF\uA000-\uA48F\uA490-\uA4CF\uA700-\uA71F\uA800-\uA82F\uA840-\uA87F\uAC00-\uD7AF\uF900-\uFAFF]){2,63})+)$/i

PHP:

/^(?!\.)((?!.*\.{2})[a-zA-Z0-9\x{0080}-\x{00FF}\x{0100}-\x{017F}\x{0180}-\x{024F}\x{0250}-\x{02AF}\x{0300}-\x{036F}\x{0370}-\x{03FF}\x{0400}-\x{04FF}\x{0500}-\x{052F}\x{0530}-\x{058F}\x{0590}-\x{05FF}\x{0600}-\x{06FF}\x{0700}-\x{074F}\x{0750}-\x{077F}\x{0780}-\x{07BF}\x{07C0}-\x{07FF}\x{0900}-\x{097F}\x{0980}-\x{09FF}\x{0A00}-\x{0A7F}\x{0A80}-\x{0AFF}\x{0B00}-\x{0B7F}\x{0B80}-\x{0BFF}\x{0C00}-\x{0C7F}\x{0C80}-\x{0CFF}\x{0D00}-\x{0D7F}\x{0D80}-\x{0DFF}\x{0E00}-\x{0E7F}\x{0E80}-\x{0EFF}\x{0F00}-\x{0FFF}\x{1000}-\x{109F}\x{10A0}-\x{10FF}\x{1100}-\x{11FF}\x{1200}-\x{137F}\x{1380}-\x{139F}\x{13A0}-\x{13FF}\x{1400}-\x{167F}\x{1680}-\x{169F}\x{16A0}-\x{16FF}\x{1700}-\x{171F}\x{1720}-\x{173F}\x{1740}-\x{175F}\x{1760}-\x{177F}\x{1780}-\x{17FF}\x{1800}-\x{18AF}\x{1900}-\x{194F}\x{1950}-\x{197F}\x{1980}-\x{19DF}\x{19E0}-\x{19FF}\x{1A00}-\x{1A1F}\x{1B00}-\x{1B7F}\x{1D00}-\x{1D7F}\x{1D80}-\x{1DBF}\x{1DC0}-\x{1DFF}\x{1E00}-\x{1EFF}\x{1F00}-\x{1FFF}\x{20D0}-\x{20FF}\x{2100}-\x{214F}\x{2C00}-\x{2C5F}\x{2C60}-\x{2C7F}\x{2C80}-\x{2CFF}\x{2D00}-\x{2D2F}\x{2D30}-\x{2D7F}\x{2D80}-\x{2DDF}\x{2F00}-\x{2FDF}\x{2FF0}-\x{2FFF}\x{3040}-\x{309F}\x{30A0}-\x{30FF}\x{3100}-\x{312F}\x{3130}-\x{318F}\x{3190}-\x{319F}\x{31C0}-\x{31EF}\x{31F0}-\x{31FF}\x{3200}-\x{32FF}\x{3300}-\x{33FF}\x{3400}-\x{4DBF}\x{4DC0}-\x{4DFF}\x{4E00}-\x{9FFF}\x{A000}-\x{A48F}\x{A490}-\x{A4CF}\x{A700}-\x{A71F}\x{A800}-\x{A82F}\x{A840}-\x{A87F}\x{AC00}-\x{D7AF}\x{F900}-\x{FAFF}.!#$%&'*+\-\/=?^_`{|}~\-\d]+)@(?!\.)([a-zA-Z0-9\x{0080}-\x{00FF}\x{0100}-\x{017F}\x{0180}-\x{024F}\x{0250}-\x{02AF}\x{0300}-\x{036F}\x{0370}-\x{03FF}\x{0400}-\x{04FF}\x{0500}-\x{052F}\x{0530}-\x{058F}\x{0590}-\x{05FF}\x{0600}-\x{06FF}\x{0700}-\x{074F}\x{0750}-\x{077F}\x{0780}-\x{07BF}\x{07C0}-\x{07FF}\x{0900}-\x{097F}\x{0980}-\x{09FF}\x{0A00}-\x{0A7F}\x{0A80}-\x{0AFF}\x{0B00}-\x{0B7F}\x{0B80}-\x{0BFF}\x{0C00}-\x{0C7F}\x{0C80}-\x{0CFF}\x{0D00}-\x{0D7F}\x{0D80}-\x{0DFF}\x{0E00}-\x{0E7F}\x{0E80}-\x{0EFF}\x{0F00}-\x{0FFF}\x{1000}-\x{109F}\x{10A0}-\x{10FF}\x{1100}-\x{11FF}\x{1200}-\x{137F}\x{1380}-\x{139F}\x{13A0}-\x{13FF}\x{1400}-\x{167F}\x{1680}-\x{169F}\x{16A0}-\x{16FF}\x{1700}-\x{171F}\x{1720}-\x{173F}\x{1740}-\x{175F}\x{1760}-\x{177F}\x{1780}-\x{17FF}\x{1800}-\x{18AF}\x{1900}-\x{194F}\x{1950}-\x{197F}\x{1980}-\x{19DF}\x{19E0}-\x{19FF}\x{1A00}-\x{1A1F}\x{1B00}-\x{1B7F}\x{1D00}-\x{1D7F}\x{1D80}-\x{1DBF}\x{1DC0}-\x{1DFF}\x{1E00}-\x{1EFF}\x{1F00}-\x{1FFF}\x{20D0}-\x{20FF}\x{2100}-\x{214F}\x{2C00}-\x{2C5F}\x{2C60}-\x{2C7F}\x{2C80}-\x{2CFF}\x{2D00}-\x{2D2F}\x{2D30}-\x{2D7F}\x{2D80}-\x{2DDF}\x{2F00}-\x{2FDF}\x{2FF0}-\x{2FFF}\x{3040}-\x{309F}\x{30A0}-\x{30FF}\x{3100}-\x{312F}\x{3130}-\x{318F}\x{3190}-\x{319F}\x{31C0}-\x{31EF}\x{31F0}-\x{31FF}\x{3200}-\x{32FF}\x{3300}-\x{33FF}\x{3400}-\x{4DBF}\x{4DC0}-\x{4DFF}\x{4E00}-\x{9FFF}\x{A000}-\x{A48F}\x{A490}-\x{A4CF}\x{A700}-\x{A71F}\x{A800}-\x{A82F}\x{A840}-\x{A87F}\x{AC00}-\x{D7AF}\x{F900}-\x{FAFF}\-.\d]+)((\.([a-zA-Z\x{0080}-\x{00FF}\x{0100}-\x{017F}\x{0180}-\x{024F}\x{0250}-\x{02AF}\x{0300}-\x{036F}\x{0370}-\x{03FF}\x{0400}-\x{04FF}\x{0500}-\x{052F}\x{0530}-\x{058F}\x{0590}-\x{05FF}\x{0600}-\x{06FF}\x{0700}-\x{074F}\x{0750}-\x{077F}\x{0780}-\x{07BF}\x{07C0}-\x{07FF}\x{0900}-\x{097F}\x{0980}-\x{09FF}\x{0A00}-\x{0A7F}\x{0A80}-\x{0AFF}\x{0B00}-\x{0B7F}\x{0B80}-\x{0BFF}\x{0C00}-\x{0C7F}\x{0C80}-\x{0CFF}\x{0D00}-\x{0D7F}\x{0D80}-\x{0DFF}\x{0E00}-\x{0E7F}\x{0E80}-\x{0EFF}\x{0F00}-\x{0FFF}\x{1000}-\x{109F}\x{10A0}-\x{10FF}\x{1100}-\x{11FF}\x{1200}-\x{137F}\x{1380}-\x{139F}\x{13A0}-\x{13FF}\x{1400}-\x{167F}\x{1680}-\x{169F}\x{16A0}-\x{16FF}\x{1700}-\x{171F}\x{1720}-\x{173F}\x{1740}-\x{175F}\x{1760}-\x{177F}\x{1780}-\x{17FF}\x{1800}-\x{18AF}\x{1900}-\x{194F}\x{1950}-\x{197F}\x{1980}-\x{19DF}\x{19E0}-\x{19FF}\x{1A00}-\x{1A1F}\x{1B00}-\x{1B7F}\x{1D00}-\x{1D7F}\x{1D80}-\x{1DBF}\x{1DC0}-\x{1DFF}\x{1E00}-\x{1EFF}\x{1F00}-\x{1FFF}\x{20D0}-\x{20FF}\x{2100}-\x{214F}\x{2C00}-\x{2C5F}\x{2C60}-\x{2C7F}\x{2C80}-\x{2CFF}\x{2D00}-\x{2D2F}\x{2D30}-\x{2D7F}\x{2D80}-\x{2DDF}\x{2F00}-\x{2FDF}\x{2FF0}-\x{2FFF}\x{3040}-\x{309F}\x{30A0}-\x{30FF}\x{3100}-\x{312F}\x{3130}-\x{318F}\x{3190}-\x{319F}\x{31C0}-\x{31EF}\x{31F0}-\x{31FF}\x{3200}-\x{32FF}\x{3300}-\x{33FF}\x{3400}-\x{4DBF}\x{4DC0}-\x{4DFF}\x{4E00}-\x{9FFF}\x{A000}-\x{A48F}\x{A490}-\x{A4CF}\x{A700}-\x{A71F}\x{A800}-\x{A82F}\x{A840}-\x{A87F}\x{AC00}-\x{D7AF}\x{F900}-\x{FAFF}]){2,63})+)$/u

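Beyond the RFC rules, the Unicode portion above is restricted to a set of character blocks. This is so that only actual letters and numbers pass validation, while non-Latin punctuation and miscellaneous Unicode characters are rejected.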
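As a more compact way to express "letters and digits in any script", engines that support ES2018 Unicode property escapes can use `\p{L}`, `\p{M}` and `\p{N}` instead of listing blocks by hand. The sketch below is not the regex above: it swaps the explicit ranges for property escapes, which admit more characters than the hand-picked blocks, and the name `isPlausibleEmail` is hypothetical.

```javascript
// Sketch only: Unicode property escapes (ES2018, require the `u` flag)
// stand in for the hand-written ranges. \p{L} = letters,
// \p{M} = combining marks, \p{N} = numbers.
// Local part: no leading, trailing, or consecutive dots.
const LOCAL = /^(?!\.)(?!.*\.\.)(?!.*\.$)[\p{L}\p{M}\p{N}.!#$%&'*+\/=?^_`{|}~-]+$/u;
// Domain: letters, digits, hyphens and dots, ending in a 2-63 letter label.
const DOMAIN = /^(?!\.)(?!.*\.\.)[\p{L}\p{M}\p{N}.-]+\.[\p{L}\p{M}]{2,63}$/u;

function isPlausibleEmail(address) {
  // Split on the last "@" so the local part may not be empty.
  const at = address.lastIndexOf('@');
  if (at < 1) return false;
  return LOCAL.test(address.slice(0, at)) && DOMAIN.test(address.slice(at + 1));
}

console.log(isPlausibleEmail('我@在.中国'));
console.log(isPlausibleEmail('.user@example.com'));
```

Like the regex above, this only checks the shape of the address; quoted local parts, comments and IP-address domains are not handled.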

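The main things that currently do not validate properly are IP addresses in place of domain names, comments inside the "local" part, apostrophes, and forward slashes. I have never seen anyone use the latter two, so I did not bother bloating the regex to support them.

You can find a live demo here: http://jsfiddle.net/aossikine/qCLVH/3/

Here is a breakdown of the characters the RFC standards allow, and whether this regex supports them: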

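  • a-zA-Z0-9
  • !#$%&'*+-/=?^_`{|}~
  • (),:;<>@[\] (must be between quotation marks)
    • Not yet implemented
  • . (a period must not be the first or last character and must not appear consecutively)
  • \ (must be preceded by a backslash)
    • Not yet implemented
  • " (must be preceded by a backslash)
    • Not yet implemented
  • Anything enclosed in parentheses at the start or end of the local part is stripped (comments)
    • Not yet implemented
  • The domain name may contain only letters, numbers, and hyphens (hyphens may be consecutive)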
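The quoted-string cases in the list above are not covered by the regex. As a rough illustration of what supporting them could look like, the sketch below checks a quoted local part on its own, following the quoted-string grammar of RFC 5321 as I read it (printable ASCII only, with `\` and `"` allowed only as backslash-escaped pairs). The function name `isQuotedLocalPart` is hypothetical, and non-ASCII characters inside quotes are deliberately left out.

```javascript
// Sketch only: matches a quoted local part such as "john..doe".
// Inside the quotes, any printable ASCII character except `"` and `\`
// may appear directly; `\` followed by a printable ASCII character
// is an escaped pair.
const QUOTED_LOCAL = /^"(?:[\x20\x21\x23-\x5B\x5D-\x7E]|\\[\x20-\x7E])*"$/;

function isQuotedLocalPart(localPart) {
  return QUOTED_LOCAL.test(localPart);
}

console.log(isQuotedLocalPart('"(),:;<>@[]"')); // special characters inside quotes
console.log(isQuotedLocalPart('"a"b"'));        // unescaped quote in the middle
```

A validator could try this check first and fall back to the unquoted rules when the local part does not start with a quotation mark.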

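For details on the RFC standards, see http://en.wikipedia.org/wiki/E-mail_address#Syntax. Most of the logic detailed there is supported. The JSFiddle link above includes some additional documentation and further links to handy sites.

IDNA domain names cannot be validated correctly with a regular expression. Validation essentially involves the ToASCII operation defined in RFC 3490. For each domain label, the following steps must be performed: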

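  • NAMEPREP processing as defined in RFC 3491, which requires:
    • Mapping code points using the tables in RFC 3454 (STRINGPREP).
    • Normalizing to Unicode normalization form NFKC.
    • Checking for prohibited code points using the tables in RFC 3454.
    • Checking bidirectional characters as defined in RFC 3454.
  • Checking the ASCII characters.
  • Encoding with Punycode.
  • Checking that the result is no longer than 63 characters.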

You should use an IDNA library for your language of choice.
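In Node.js, for instance, the built-in `url.domainToASCII` function can play that role. Note that it follows the WHATWG URL rules rather than the exact RFC 3490 procedure listed above, so treat the sketch below as an approximation. The helper name `toAsciiDomain` and the explicit label-length check are assumptions added for illustration.

```javascript
const { domainToASCII } = require('node:url');

// Hypothetical helper: returns the ASCII (Punycode) form of a domain,
// or null when Node rejects it or when a label is empty or too long.
function toAsciiDomain(domain) {
  // domainToASCII returns an empty string for an invalid domain.
  const ascii = domainToASCII(domain);
  if (ascii === '') return null;
  // Each label must be 1 to 63 characters after encoding.
  const labels = ascii.split('.');
  if (labels.some((label) => label.length === 0 || label.length > 63)) {
    return null;
  }
  return ascii;
}

console.log(toAsciiDomain('中国'));        // xn--fiqs8s
console.log(toAsciiDomain('español.com')); // xn--espaol-zwa.com
```

The domain part of an address can be passed through such a helper after splitting on the last "@", leaving only the local part to be checked by a regex.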

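Edit: the answer above refers to the old IDNA standard, which has been superseded by RFC 5890 ff. The newer standard is inclusion-based, but validating domain names with a regular expression is still too complex.

Try this: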

\u0900-\u097F\u0980-\u09FF\u0A00-\u0A7F\u0A80-\u0AFF\u0B00-\u0B7F\u0B80-\u0BFF\u0C00-\u0C7F\u0C80-\u0CFF\u0D00-\u0D7F\u0D80-\u0DFF\u0E00-\u0E7F\u0E80-\u0EFF\u0F00-\u0FFF\u1000-\u109F\u1700-\u171F\u1720-\u173F\u1740-\u175F\u1760-\u177F\u1780-\u17FF\u1900-\u194F\u1950-\u197F\u1980-\u19DF\u19E0-\u19FF\u1A00-\u1A1F\u1A20-\u1AAF\u1B00-\u1B7F\u1B80-\u1BBF\u1BC0-\u1BFF\u1C00-\u1C4F\u1CC0-\u1CCF\uA800-\uA82F\uA840-\uA87F\uA880-\uA8DF\uA8E0-\uA8FF\uA930-\uA95F\uA980-\uA9DF\uA9E0-\uA9FF\uAA00-\uAA5F\uAA60-\uAA7F\uAA80-\uAADF\uAAE0-\uAAFF\uABC0-\uABFF\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF\uFB50-\uFDFF\uFE70-\uFEFF\u4e00-\u9faf\u0D80-\u0DFF\u0E80-\u0EFF]/)!=null}e.exports=a}),null);
