How to improve a regex so it matches the email format in a Google Apps Script email scraper



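I am trying to make an email scraper that reads your email and puts the transactions into a Google Sheet to make budgeting easier.

The emails have the following format: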

This is an Alert to help you manage your credit card account ending in 0000.
As you requested, we are notifying you of any charges over the amount of ($USD) 0.01, as specified in your Alert settings. A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.
Do not reply to this Alert.
If you have questions, please call the number on the back of your credit card, or send a secure message from your Inbox on www.bank.com.
To see all of the Alerts available to you, or to manage your Alert settings, please log on to www.bank.com.

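I only want the price (44.44), the company (UBER * EATS), the date (Apr 34, 2073) and the time (2:27 PM ET).

I have this as my regex:

/A charge of\s\W+\w+\W+\s(.+?(?=at))\w+\s(.+?(?=has))\w+\s\w+\s\w+\s\w+\s(.+?(?=at))\w+\s(.+?(?=ET))/g

However, although it matches on regex101, it does not work in my script.

Any ideas on how to get it to match in Google Apps Script so that I can scrape the emails? Everything else works.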

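With your shown samples, could you please try the following. It was written for the PCRE flavor and creates 3 capturing groups from which you can take the values as needed.

^(?:As you requested.*\$USD\)\s+)(\d+\.\d+)\s+[\w]+\s+([^ ]*).*?authorized on(.*)\.$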


Explanation: a detailed breakdown of the regex above.

^(?:                           ## Match from the start of the line and open a non-capturing group.
As you requested.*\$USD\)\s+   ## Match "As you requested" through the last "$USD)" on the line, then whitespace.
)                              ## Close the non-capturing group.
(\d+\.\d+)                     ## 1st capturing group: digits, a literal dot, digits (the amount).
\s+[\w]+\s+                    ## Match whitespace, word characters ("at"), whitespace.
([^ ]*)                        ## 2nd capturing group: everything up to the next space (UBER lands here).
.*?authorized on               ## Match everything up to "authorized on".
(.*)\.$                        ## 3rd capturing group: everything up to the final dot of the line (date and time).
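
To try this pattern from JavaScript rather than PCRE, something like the following sketch may help. It assumes the `m` flag is added so that `^` and `$` anchor at line breaks inside a multi-line body; the variable names are illustrative and not part of the answer above.

```javascript
// Part of the alert email from the question, as a multi-line string.
const alertBody = [
  "This is an Alert to help you manage your credit card account ending in 0000.",
  "As you requested, we are notifying you of any charges over the amount of ($USD) 0.01, as specified in your Alert settings. A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.",
  "Do not reply to this Alert.",
].join("\n");

// Assumption: the `m` flag is added so ^ and $ match at the start/end of each line.
const lineRegex = /^(?:As you requested.*\$USD\)\s+)(\d+\.\d+)\s+[\w]+\s+([^ ]*).*?authorized on(.*)\.$/m;

const lineMatch = alertBody.match(lineRegex);
const amountText = lineMatch ? lineMatch[1] : null;         // the amount
const firstCompanyWord = lineMatch ? lineMatch[2] : null;   // only the first word of the company
const dateAndTime = lineMatch ? lineMatch[3].trim() : null; // group 3 starts with a space, so trim it

console.log(amountText, firstCompanyWord, dateAndTime);
```

Note that group 2 stops at the first space, so only the first word of the company name is captured, and the date and time arrive together in group 3.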

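Your regex works fine for me. The only issue I see is that you are using the global flag, so you will not get the matching groups back. If you remove it, it works fine. See the MDN documentation for String.prototype.match().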
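
A small sketch of the difference, assuming the pattern from the question with its backslashes in place. With the `g` flag, `match()` returns only the whole matches; without it, the capture groups come back as well. The variable names are illustrative.

```javascript
const sentence = "A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.";

// The pattern from the question, with the global flag.
const withGlobal = /A charge of\s\W+\w+\W+\s(.+?(?=at))\w+\s(.+?(?=has))\w+\s\w+\s\w+\s\w+\s(.+?(?=at))\w+\s(.+?(?=ET))/g;
// Same pattern without any flags.
const withoutGlobal = new RegExp(withGlobal.source);

const globalResult = sentence.match(withGlobal);    // array of whole matches, no groups
const singleResult = sentence.match(withoutGlobal); // whole match followed by the 4 groups

// The lookaheads leave a trailing space inside each group, so trim them.
const fields = singleResult.slice(1).map((s) => s.trim());

console.log(globalResult.length); // number of whole matches
console.log(fields);              // price, company, date, time
```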

You can try it like this with named groups.

const string = `A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.`;
const regEx = /^A charge of\s\((?<currency>.+)\)\s(?<amount>\d+\.?\d+) at (?<company>.+) has been authorized on (?<date>.+) at (?<time>.+)\.$/;
console.log(string.match(regEx).groups)

Before using named capture groups, check browser support on Can I use.

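A few notes about the pattern you tried

  • You can omit the lookahead assertions (?= in the capture groups and match the text as part of the pattern instead
  • The last assertion (?=ET) means that ET is not part of the group
  • You might consider making the date part more specific (or at least validate afterwards that it is a real date), because as written you would accept a date like Apr 34, 2073, and the customer might never receive the order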
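
The "validate afterwards" idea from the last note could look roughly like this. It is only a sketch: `isValidAlertDate` and `MONTHS` are hypothetical names, and it assumes the bank always writes dates as a three-letter English month, a day, a comma and a four-digit year.

```javascript
// Assumption: the alert uses three-letter English month abbreviations.
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Hypothetical helper: true only for real calendar dates such as "Apr 30, 2073".
function isValidAlertDate(text) {
  const parts = /^([A-Z][a-z]{2}) (\d{1,2}), (\d{4})$/.exec(text.trim());
  if (!parts) return false;

  const monthIndex = MONTHS.indexOf(parts[1]);
  if (monthIndex === -1) return false;

  const day = Number(parts[2]);
  const year = Number(parts[3]);

  // Date silently rolls out-of-range days into a neighbouring month,
  // so build the date and check that every component survived unchanged.
  const d = new Date(year, monthIndex, day);
  return d.getFullYear() === year && d.getMonth() === monthIndex && d.getDate() === day;
}

console.log(isValidAlertDate("Apr 34, 2073")); // the sample date from the question
console.log(isValidAlertDate("Apr 30, 2073"));
```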

You could simplify the pattern to

\bA\s+charge\s+of\s+\D*\b(\d+(?:\.\d+)?)\s+at\s+(\S.*?)\s+has\s+been\s+authorized\s+on\s+(\S.*?)\s+at\s+([^.]+)\.

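The pattern matches:

  • \bA\s+charge\s+of\s+\D*\b Match A charge of followed by any characters except digits, with word boundaries on both ends to prevent partial matches
  • (\d+(?:\.\d+)?) Capture group 1, match 1+ digits with an optional decimal part
  • \s+at\s+ Match at between whitespace characters
  • (\S.*?) Capture group 2, match a non-whitespace character followed by as few characters as possible
  • \s+has\s+been\s+authorized\s+on\s+ Match has been authorized on
  • (\S.*?) Capture group 3, match a non-whitespace character followed by as few characters as possible
  • \s+at\s+ Match at between whitespace characters
  • ([^.]+) Capture group 4, match 1+ characters other than .
  • \. Match a literal .

If there can be more than one match, you can use the /g flag and loop over the results to read the groups of each match.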


const regex = /\bA charge of \D*\b(\d+(?:\.\d+)?) at (\S.*?) has been authorized on (\S.*?) at ([^.]+)\./g;
const str = `This is an Alert to help you manage your credit card account ending in 0000.
As you requested, we are notifying you of any charges over the amount of ($USD) 0.01, as specified in your Alert settings. A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.
Do not reply to this Alert.
If you have questions, please call the number on the back of your credit card, or send a secure message from your Inbox on www.bank.com.
To see all of the Alerts available to you, or to manage your Alert settings, please log on to www.bank.com.`;
let m; // current match; declared explicitly to avoid an implicit global
while ((m = regex.exec(str)) !== null) {
  // Index 0 is the whole match; print only the capture groups.
  m.forEach((match, i) => {
    if (i > 0) console.log(match);
  });
}
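
For the Google Apps Script side of the question, the loop above could be wrapped roughly as follows. This is a sketch under assumptions, not a tested scraper: `extractCharges` and `scrapeAlertsToSheet` are hypothetical names, the Gmail search query is a placeholder, and writing to the active sheet assumes the script is bound to a spreadsheet. It uses the `\s+` variant of the pattern so that extra spaces between words do not break the match.

```javascript
// Pure helper: turns one email body into rows of [amount, company, date, time].
function extractCharges(body) {
  const chargeRegex = /\bA\s+charge\s+of\s+\D*\b(\d+(?:\.\d+)?)\s+at\s+(\S.*?)\s+has\s+been\s+authorized\s+on\s+(\S.*?)\s+at\s+([^.]+)\./g;
  const rows = [];
  let m;
  while ((m = chargeRegex.exec(body)) !== null) {
    rows.push([Number(m[1]), m[2], m[3], m[4]]);
  }
  return rows;
}

// Apps Script entry point; it only runs inside Google Apps Script,
// where GmailApp and SpreadsheetApp are provided as globals.
function scrapeAlertsToSheet() {
  const sheet = SpreadsheetApp.getActiveSheet();               // assumption: container-bound script
  const threads = GmailApp.search('subject:Alert newer_than:7d'); // placeholder query
  threads.forEach((thread) => {
    thread.getMessages().forEach((message) => {
      extractCharges(message.getPlainBody()).forEach((row) => sheet.appendRow(row));
    });
  });
}
```

Keeping the regex work in `extractCharges` means it can be checked outside Apps Script with a sample body before wiring it to a mailbox.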
