PHP regex to match SKUs (several patterns) in a string



The record names contain a mix of several SKU types, and the SKUs may include symbols, digits, and so on.

Examples:

Name of product 67304-4200-52-21
67304-4200-52 Name of product
67304-4200 Name of product
38927/6437 Name of product
BKK1MBM06-02 Name of product
BKK1MBM06 Name of product

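I need preg_match (PHP) to match only the SKU part, with any symbols in any combination.

So I wrote this pattern:

/\d+\/\d+|\d+-?\d+-?\d+-?\d+|\bbkk.*\b/i

It works, but not for the [BKK*] SKUs.

Is there a way to combine all these SKU types in one pattern?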

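The pattern \d+-?\d+-?\d+-?\d+ requires at least 4 digits, because all of the hyphens are optional. In the example data, however, the numeric SKUs always contain at least one hyphen and consist of 2, 3 or 4 parts.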
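The four-digit minimum can be checked by running the digit part on its own. This is only a sketch: the helper name `matchDigitPart` and the short test strings are made up for the demonstration and are not part of the question.

```php
<?php
// Hypothetical helper: returns the text matched by the digit alternative
// from the question, or null when nothing matches.
function matchDigitPart(string $subject): ?string
{
    // Every hyphen is optional, so the only hard requirement is four
    // consecutive \d+ runs, i.e. at least four digits.
    $pattern = '/\d+-?\d+-?\d+-?\d+/';

    // preg_match returns 1 on a match and 0 otherwise.
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

var_dump(matchDigitPart('1234'));       // four digits, no hyphen at all
var_dump(matchDigitPart('12-3'));       // only three digits
var_dump(matchDigitPart('67304-4200')); // a SKU from the sample data
```

The first call matches even though the input contains no hyphen, and the second fails because three digits cannot fill four `\d+` runs.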

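You can repeat the part that consists of digits and a hyphen 1 or more times. Instead of .*\b you can use \S*\b, which matches optional non-whitespace characters and backtracks to the last word boundary.

Note that in PHP you don't have to escape the / if you use a delimiter other than /.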
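As a small illustration of the delimiter note, the two calls below match the same text. The choice of `~` as the alternative delimiter and the variable names are assumptions made for this sketch; any other valid PCRE delimiter would do.

```php
<?php
$subject = '38927/6437 Name of product';

// With / as the delimiter, the literal slash must be escaped.
preg_match('/\d+\/\d+/', $subject, $withSlash);

// With ~ as the delimiter, the slash can be written as-is.
preg_match('~\d+/\d+~', $subject, $withTilde);

// Both arrays hold the same full match in index 0.
var_dump($withSlash[0] === $withTilde[0]);
```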

Using a case-insensitive match:

\b(?:\d+(?:-\d+)+|bkk\S*|\d+\/\d+)\b

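  • \b A word boundary to prevent a partial word match
  • (?: Non-capturing group for the alternatives
    • \d+(?:-\d+)+ Match 1+ digits, then repeat 1 or more times a - followed by 1+ digits (or use {1,3} instead of +)
    • | Or
    • bkk\S* Match bkk and optional non-whitespace characters
    • | Or
    • \d+\/\d+ Match 1+ digits, a / and 1+ digits
  • ) Close the non-capturing group
  • \b A word boundary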

See a regex101 demo.
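The pattern above could be applied to the sample names along these lines. This is a sketch rather than a definitive implementation: the function name `extractSku` and the `~` delimiter are assumptions, and only the six example strings from the question are exercised.

```php
<?php
// Hypothetical helper: returns the first SKU found in a product name,
// or null when the name contains none of the three SKU shapes.
function extractSku(string $name): ?string
{
    // ~ is used as the delimiter so the slash needs no escaping;
    // the i flag lets "bkk" match "BKK".
    $pattern = '~\b(?:\d+(?:-\d+)+|bkk\S*|\d+/\d+)\b~i';

    return preg_match($pattern, $name, $m) === 1 ? $m[0] : null;
}

$names = [
    'Name of product 67304-4200-52-21',
    '67304-4200-52 Name of product',
    '67304-4200 Name of product',
    '38927/6437 Name of product',
    'BKK1MBM06-02 Name of product',
    'BKK1MBM06 Name of product',
];

foreach ($names as $name) {
    // Print each name next to the SKU extracted from it.
    printf("%-35s => %s\n", $name, extractSku($name) ?? '(no SKU)');
}
```

For these six inputs the function returns the SKU part only, regardless of whether it appears before or after the product name.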

Use

\d+(?:\d+(?:-?\d+){3}|\/\d+)|\b[bB][kK][kK][A-Za-z0-9-]*

See the regex proof.

REGEX101 EXPLANATION

1st Alternative \d+(?:\d+(?:-?\d+){3}|\/\d+)
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
Non-capturing group (?:\d+(?:-?\d+){3}|\/\d+)
1st Alternative \d+(?:-?\d+){3}
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
Non-capturing group (?:-?\d+){3}
{3} matches the previous token exactly 3 times
- matches the character - with index 45 (decimal; 2D hex or 55 octal) literally (case sensitive)
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Alternative \/\d+
\/ matches the character / with index 47 (decimal; 2F hex or 57 octal) literally (case sensitive)
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Alternative \b[bB][kK][kK][A-Za-z0-9-]*
\b assert position at a word boundary: (^\w|\w$|\W\w|\w\W)
Match a single character present in the list below [bB]
bB matches a single character in the list bB (case sensitive)
Match a single character present in the list below [kK]
kK matches a single character in the list kK (case sensitive)
Match a single character present in the list below [kK]
kK matches a single character in the list kK (case sensitive)
Match a single character present in the list below [A-Za-z0-9-]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
- matches the character - with index 45 (decimal; 2D hex or 55 octal) literally (case sensitive)
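A possible way to try this second pattern in PHP is sketched below. The helper name `matchSkuAlt` and the `~` delimiter are assumptions, and the last input is a made-up string that is not in the sample data.

```php
<?php
// Hypothetical helper: applies the second answer's pattern and returns
// the matched SKU, or null when there is no match.
function matchSkuAlt(string $name): ?string
{
    // No i flag is needed: the character classes [bB][kK][kK] already
    // accept both cases. ~ is the delimiter, so \/ could also be a plain /.
    $pattern = '~\d+(?:\d+(?:-?\d+){3}|\/\d+)|\b[bB][kK][kK][A-Za-z0-9-]*~';

    return preg_match($pattern, $name, $m) === 1 ? $m[0] : null;
}

var_dump(matchSkuAlt('Name of product 67304-4200-52-21'));
var_dump(matchSkuAlt('38927/6437 Name of product'));
var_dump(matchSkuAlt('BKK1MBM06-02 Name of product'));

// The hyphenated branch chains five \d+ runs with optional hyphens,
// so a short made-up SKU with only four digits is not matched.
var_dump(matchSkuAlt('12-34 Name of product'));
```

One design point worth noting: because the hyphens are optional here as well, the hyphenated branch is driven by the number of digits (at least five) rather than by the number of hyphen-separated parts.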
