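How can we write a regular expression (regex) that identifies quantities with units, such as "54.20 grams"?

Here are some sample test inputs.
The test inputs are ASCII-encoded strings.

Test case inputs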

arrhar = Array(100)
arrhar[1] = "Low Carb Orzo Low Carb Rice, High Protein, Great Low Carb Bread Company, Low Carb Pasta Rice, 7 g per pack"
arrhar[2] = "Helios Certified Organic Greek Orzo Pasta, 500gr"
arrhar[3] = "Barilla Orzo Pasta 15.73 oz."
arrhar[4] = "Pasta Granoro Il Primo Orzo 6 ounces per bag"
arrhar[5] = "Authentic Italian Orzo -- 6 OUNCE per bag"
arrhar[6] = "ORZO PASA 4 U! 1 BAGGY IZ 4.39-GRM"
arrhar.trim()

Test case outputs

out[1] = "7 g"
out[2] = "500gr"
out[3] = "15.73 oz"
out[4] = "6 ounces"
out[5] = "6 OUNCE"
out[6] = "4.39-GRM"

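English description of the regular expression

Suppose we describe the string-matching pattern as a bulleted list:
bullet (1) describes the leftmost part of the string
bullet (2) describes the second substring from the left
bullet (3) describes the third part of the string
and so on…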

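Numeric quantity
1. Zero or more digits (0, 1, 2, …, 9)
2. Zero or one decimal point or comma
3. Zero or more digits (0, 1, 2, …, 9)
Optional separator
1. Zero or more of any character except those in the classes [A-Z], [a-z], and \d
Units
1. Grams
  1. Any case-insensitive variant of "GRAMS", such as:
     a. "g"
     b. "GRMS"
     c. "gs"
     d. "Gms"
     e. and so on
2. Ounces
  1. Any case-insensitive substring of "OUNCEZ"
  2. Any case-insensitive substring of "OUNCES"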

Regex fragments

A suitable regex for the left part of the numeric quantity (the integer part) could be any of:

\d*

\d{0,}

[0-9]{0,}

[0123456789]*
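One way to confirm that these four fragments are interchangeable is to run each of them against the same strings. The sketch below uses Python's `re` module, which is my choice since the question does not name a language, and the sample strings are made up for illustration.

```python
import re

# The four spellings of "zero or more digits" listed above.
fragments = [r"\d*", r"\d{0,}", r"[0-9]{0,}", r"[0123456789]*"]

# Hypothetical sample strings, not taken from the question.
samples = ["500gr", "15.73 oz", "grams", ""]


def leading_digits(fragment, text):
    # re.ASCII restricts \d to 0-9, in line with the ASCII-only inputs.
    # Each fragment can match the empty string, so re.match never fails here.
    return re.match(fragment, text, flags=re.ASCII).group(0)


results = {s: [leading_digits(f, s) for f in fragments] for s in samples}
for sample, found in results.items():
    print(repr(sample), found)
```

Every fragment returns the same leading run of digits, including an empty match when the string does not start with a digit. That empty match is worth keeping in mind: a pattern built only from optional pieces succeeds everywhere.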

A regex for zero or one decimal point is [.,]?

A decimal number is then \d*[.,]?\d*

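There may or may not be a separator between the number and the unit:

56.1gr
56.1 gr
56.1-grams

A suitable regex for the separator could be [^a-zA-Z0-9]*

Suppose we write a regex for the number and the separator but not for the unit (e.g. "ounces"). We might have:

\d*[.,]?\d*[^a-zA-Z0-9]*?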

I would expect the above to match "4.91...." or "4.91 ".
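That expectation can be checked directly. The sketch below assumes the fragment is meant to read `\d*[.,]?\d*[^a-zA-Z0-9]*?` (digits written as `\d`, with the second digit run repeatable), and it uses Python's `re` module, which is my choice rather than something the question specifies.

```python
import re

# Assumed reading of the number-plus-separator fragment.
NUMBER_AND_SEPARATOR = re.compile(r"\d*[.,]?\d*[^a-zA-Z0-9]*?", re.ASCII)

checks = {}
for text in ["4.91....", "4.91 ", "56.1-"]:
    # fullmatch forces the pattern to cover the entire string.
    whole = NUMBER_AND_SEPARATOR.fullmatch(text) is not None
    # match only anchors at the start, so the lazy tail may stay empty.
    prefix = NUMBER_AND_SEPARATOR.match(text).group(0)
    checks[text] = (whole, prefix)
    print(repr(text), whole, repr(prefix))
```

The pattern can cover each whole string, but when nothing follows it the lazy `*?` consumes no separator characters at all, so a plain prefix match stops right after the digits. The separator only gets picked up once a unit pattern comes after it.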

A regex for "GRAMS" might be: [Gg]?[Rr]?[Aa]?[Mm]?[Ss]?

A regex that captures something like "4.1-grm" would then look like this:

\d*[.,]?\d*[^a-zA-Z0-9]*?[Gg]?[Rr]?[Aa]?[Mm]?[Ss]?

How can we capture both grams and ounces?

Because ? makes every part of [Gg]?[Rr]?[Aa]?[Mm]?[Ss]? optional, the pattern can also match RM or the empty string.
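A short check makes the problem concrete. This is a minimal sketch in Python (my choice of language), and the test strings are chosen only to illustrate the point.

```python
import re

# The all-optional unit pattern from the question.
ALL_OPTIONAL = re.compile(r"[Gg]?[Rr]?[Aa]?[Mm]?[Ss]?")

# Which strings does the pattern accept in full?
accepted = {
    text: ALL_OPTIONAL.fullmatch(text) is not None
    for text in ["grams", "RM", "", "oz"]
}
print(accepted)

# Anchored only at the start, the pattern still "succeeds" on "oz",
# but it does so by matching zero characters.
empty_hit = ALL_OPTIONAL.match("oz").group(0)
print(repr(empty_hit))
```

Since the pattern accepts the empty string, it succeeds at every position of every input, which is why the unit needs at least one required character.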

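You can use case-insensitive matching and alternation with | to list the possible alternatives, which makes the pattern more specific.

\b\d+(?:[.,]\d+)?\s*(?:gr?|oz|ounces?|-grm|grams?)\b

\b A word boundary
\d+ Match 1+ digits
(?:[.,]\d+)? Optionally match . or , and 1+ digits
\s* Match 0+ whitespace characters
(?:gr?|oz|ounces?|-grm|grams?) Match one of the alternatives
\b A word boundary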
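As a rough check, the pattern can be run against the six test inputs from the question. The sketch below uses Python's `re` module with `re.IGNORECASE` standing in for the case-insensitive flag; the language and the variable names are my assumptions, not part of the answer.

```python
import re

UNIT_QUANTITY = re.compile(
    r"\b\d+(?:[.,]\d+)?\s*(?:gr?|oz|ounces?|-grm|grams?)\b",
    re.IGNORECASE,
)

# The six test inputs from the question.
inputs = [
    "Low Carb Orzo Low Carb Rice, High Protein, Great Low Carb Bread Company, Low Carb Pasta Rice, 7 g per pack",
    "Helios Certified Organic Greek Orzo Pasta, 500gr",
    "Barilla Orzo Pasta 15.73 oz.",
    "Pasta Granoro Il Primo Orzo 6 ounces per bag",
    "Authentic Italian Orzo -- 6 OUNCE per bag",
    "ORZO PASA 4 U! 1 BAGGY IZ 4.39-GRM",
]

# search() returns the leftmost match in each string.
matches = [UNIT_QUANTITY.search(text).group(0) for text in inputs]
for found in matches:
    print(found)
```

The stray "4 U" and "1 BAGGY" in the last input are skipped because the number must be followed by one of the listed units and then a word boundary.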


Another option is to nest non-capturing groups so that selected parts are optional but must still appear in a fixed order:

\b\d+(?:[.,]\d+)?\s*-?(?:g(?:r(?:a?ms?)?)?|oz|ounces?)\b
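To see which unit spellings the nested version accepts, each candidate can be appended to a number and matched in full. This is a sketch in Python (my choice of language); the list of spellings is hypothetical and only meant to probe the pattern.

```python
import re

NESTED = re.compile(
    r"\b\d+(?:[.,]\d+)?\s*-?(?:g(?:r(?:a?ms?)?)?|oz|ounces?)\b",
    re.IGNORECASE,
)

# Hypothetical spellings used to probe the unit part of the pattern.
spellings = [
    "g", "gr", "grm", "grms", "gram", "grams",
    "gs", "gms", "rm",
    "oz", "ounce", "ounces",
]

# A spelling is accepted if "5 <spelling>" matches from start to end.
accepted = [unit for unit in spellings if NESTED.fullmatch("5 " + unit)]
print(accepted)
```

Spellings such as "gs", "gms", and "rm" are rejected because the nesting requires "g", then "r", before any later letters, so this version is stricter than the question's list of abbreviations.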


Use

/\d[.,\d]*\W*(?:gr?a?m?s?|ou?n?c?e?[zs]?)/i


Explanation

--------------------------------------------------------------------------------
\d                       digits (0-9)
--------------------------------------------------------------------------------
[.,\d]*                  any character of: '.', ',', digits (0-9)
                         (0 or more times (matching the most amount
                         possible))
--------------------------------------------------------------------------------
\W*                      non-word characters (all but a-z, A-Z, 0-
                         9, _) (0 or more times (matching the most
                         amount possible))
--------------------------------------------------------------------------------
(?:                      group, but do not capture:
--------------------------------------------------------------------------------
  g                        'g'
--------------------------------------------------------------------------------
  r?                       'r' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  a?                       'a' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  m?                       'm' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  s?                       's' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  o                        'o'
--------------------------------------------------------------------------------
  u?                       'u' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  n?                       'n' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  c?                       'c' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  e?                       'e' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  [zs]?                    any character of: 'z', 's' (optional
                           (matching the most amount possible))
--------------------------------------------------------------------------------
)                        end of grouping
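The looser pattern can be checked against the question's inputs in the same way. The sketch below is in Python (my choice of language), with `re.IGNORECASE` standing in for the `/i` flag; the final string, "Pack 6 of 10", is a hypothetical input added to show a trade-off.

```python
import re

LOOSE = re.compile(r"\d[.,\d]*\W*(?:gr?a?m?s?|ou?n?c?e?[zs]?)", re.IGNORECASE)

# The six test inputs from the question.
inputs = [
    "Low Carb Orzo Low Carb Rice, High Protein, Great Low Carb Bread Company, Low Carb Pasta Rice, 7 g per pack",
    "Helios Certified Organic Greek Orzo Pasta, 500gr",
    "Barilla Orzo Pasta 15.73 oz.",
    "Pasta Granoro Il Primo Orzo 6 ounces per bag",
    "Authentic Italian Orzo -- 6 OUNCE per bag",
    "ORZO PASA 4 U! 1 BAGGY IZ 4.39-GRM",
]

matches = [LOOSE.search(text).group(0) for text in inputs]
for found in matches:
    print(found)

# Hypothetical input: only the leading 'g' or 'o' is required and there is
# no trailing word boundary, so the start of an unrelated word can match.
false_positive = LOOSE.search("Pack 6 of 10").group(0)
print(repr(false_positive))
```

All six expected outputs are produced, but the pattern also reports "6 o" for the made-up string, so it suits inputs where a number followed by "g" or "o" is known to be a unit.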
