Oracle REGEXP_SUBSTR不匹配点字符



我试图从字符串中提取信息,如:

FOO-BAR-AUDIT-DATABASE.NUPKG
FOO.BAR.DATABASE-2.0.0.NUPKG

to info like:

'FOO.BAR.DATABASE' '2.0.0'
|             |
module_name     version

目前,当module_name部分包含.字符时,我无法正确解析。

下面的示例显示了我如何提取信息。regexp的第一组是不能正常工作的'(.*?),其余组处理不同版本信息的情况。

select case module_name when expected then 'pass' else 'fail' end as test, y.* from(
select lower(regexp_substr(t.pck, g.regex, 1, 1, '', 1)) as module_name, 
t.expected,
to_number(regexp_substr(t.pck, g.regex, 1, 1, '', 3)) as major,
to_number(regexp_substr(t.pck, g.regex, 1, 1, '', 5)) as minor,
to_number(regexp_substr(t.pck, g.regex, 1, 1, '', 7)) as patch,
(t.pck) as package_name
from   (select 'FUNKY_LOG_DATABASE-1.0.0.NUPKG' as pck, 'funky_log_database' as expected from dual
union select 'FOO.BAR.DATABASE-2.0.0.NUPKG', 'foo.bar.database' from dual 
union select 'FOO-BAR-AUDIT-DATABASE.NUPKG', 'foo-bar-audit-database' from dual
union select 'funk-database-1.nupkg', 'funk-database' from dual
union select 'funk-database-1.2.nupkg', 'funk-database' from dual
union select 'baz-database-1.0.1.nupkg', 'baz-database' from dual) t
cross  join (select '(.*?)(-(d+)(.(d+))?(.(d+))?)?(..*)' as regex from dual) g
)y;

上面的查询结果如下(Oracle 19c):

<表类>测试module_name预期主要小补丁package_nametbody><<tr>通过foo-bar-audit-databasefoo-bar-audit-databasefoo-bar-audit-database。NUPKG失败foofoo.bar.databaseFOO.BAR.DATABASE-2.0.0.NUPKG通过funky_log_databasefunky_log_database100FUNKY_LOG_DATABASE-1.0.0.NUPKG通过baz-databasebaz-database101baz-database-1.0.1.nupkg通过funk-databasefunk-database12funk-database-1.2.nupkg通过funk-databasefunk-database1恐慌-数据库- 1. - nupkg

您可以从末尾匹配以获得版本,然后提取版本前的子字符串以获得模块名称:

select case module_name when expected then 'pass' else 'fail' end as test,
y.*
from   (
select lower(
substr(
t.pck,
1,
REGEXP_INSTR(t.pck, g.regex) - 1
)
) as module_name, 
t.expected,
to_number(regexp_substr(t.pck, g.regex, 1, 1, '', 2)) as major,
to_number(regexp_substr(t.pck, g.regex, 1, 1, '', 3)) as minor,
to_number(regexp_substr(t.pck, g.regex, 1, 1, '', 4)) as patch,
t.pck as package_name
from   (
select 'FUNKY_LOG_DATABASE-1.0.0.NUPKG' as pck, 'funky_log_database' as expected from dual
union select 'FOO.BAR.DATABASE-2.0.0.NUPKG', 'foo.bar.database' from dual 
union select 'FOO-BAR-AUDIT-DATABASE.NUPKG', 'foo-bar-audit-database' from dual
union select 'funk-database-1.nupkg', 'funk-database' from dual
union select 'funk-database-1.2.nupkg', 'funk-database' from dual
union select 'baz-database-1.0.1.nupkg', 'baz-database' from dual
) t
cross  join (
select '(-(d+).?(d+)?.?(d+)?)?.[^.]+$' as regex from dual
) g
)y;

输出:

<表类>测试MODULE_NAME预期主要小补丁PACKAGE_NAMEtbody><<tr>通过foo-bar-audit-databasefoo-bar-audit-databasefoo-bar-audit-database。NUPKG通过foo.bar.databasefoo.bar.database200FOO.BAR.DATABASE-2.0.0.NUPKG通过funky_log_databasefunky_log_database100FUNKY_LOG_DATABASE-1.0.0.NUPKG通过baz-databasebaz-database101baz-database-1.0.1.nupkg通过funk-databasefunk-database12funk-database-1.2.nupkg通过funk-databasefunk-database1恐慌-数据库- 1. - nupkg

这样行吗?它不是复杂的,但是-返回您想要的数据(至少,我认为是这样)。

  • 行#1 - 8 -样本数据
  • tempCTE:删除扩展(.nupkg),为了简单
  • 最后查询:
    • line #18是module name;如果它包含数字,则获取substring直到第一个数字。否则,删除整个PCT值
    • 行#20 - 22返回version:如果没有数字,返回NULL。否则,返回从第一个数字开始的子字符串

SQL> with
2  test as
3   (select 'FUNKY_LOG_DATABASE-1.0.0.NUPKG' as pck, 'funky_log_database' as expected from dual
4          union select 'FOO.BAR.DATABASE-2.0.0.NUPKG', 'foo.bar.database' from dual
5          union select 'FOO-BAR-AUDIT-DATABASE.NUPKG', 'foo-bar-audit-database' from dual
6          union select 'funk-database-1.nupkg', 'funk-database' from dual
7          union select 'funk-database-1.2.nupkg', 'funk-database' from dual
8          union select 'baz-database-1.0.1.nupkg', 'baz-database' from dual),
9  temp as
10    -- remove extension
11    (select pck pck_old, expected,
12       replace(lower(pck), '.nupkg', '') pck
13     from test
14    )
15  select pck_old,
16         expected,
17    --
18    nvl(substr(pck, 1, regexp_instr(pck, 'd') - 2), pck) module_name,
19    --
20    case when regexp_instr(pck, 'd') = 0 then null
21         else substr(pck, regexp_instr(pck, 'd'))
22    end version
23  from temp;
PCK_OLD                        EXPECTED               MODULE_NAME             VERSION
------------------------------ ---------------------- ----------------------- --------
FOO-BAR-AUDIT-DATABASE.NUPKG   foo-bar-audit-database foo-bar-audit-database
FOO.BAR.DATABASE-2.0.0.NUPKG   foo.bar.database       foo.bar.database        2.0.0
FUNKY_LOG_DATABASE-1.0.0.NUPKG funky_log_database     funky_log_database      1.0.0
baz-database-1.0.1.nupkg       baz-database           baz-database            1.0.1
funk-database-1.2.nupkg        funk-database          funk-database           1.2
funk-database-1.nupkg          funk-database          funk-database           1
6 rows selected.
SQL>

最新更新