Oracle PL/SQL-识别并提取两个字符串之间的匹配单词



我对Oracle功能是很新的,所以对我的天真表示歉意。

我正在寻找一个将COL_A中的字符串与COL_B中的字符串进行比较的函数,然后将字符串中的任何匹配单词输出到COL_C

例如。

  • COL_A = ‘Microsoft Office’COL_B = ‘Windows Microsoft’,因此COL_C中的预期结果是‘Microsoft’
  • COL_A = ‘Microsoft Office’COL_B = ‘Microsoft Office’,因此COL_C的预期结果将是‘Microsoft Office’
  • COL_A = ‘Microsoft Office’COL_B = ‘Microsoft Windows',因此COL_C中的预期结果将为‘Microsoft’
  • COL_A = ‘Microsoft Office’COL_B = ‘Outlook’,因此预期结果COL_C将是NULL

我确实找到了一个几乎满足需求的函数(在两个字符串中计数顺序匹配单词),但是,此函数输出了匹配单词的计数,并且仅对匹配进行分类,而单词订单也匹配(出于我的目的,该顺序是无关紧要的,我理想地喜欢显示的匹配单词)。

CREATE OR REPLACE FUNCTION STR_WORD_MATCH(
  P_STR1 IN VARCHAR2,
  P_STR2 IN VARCHAR2 )
 RETURN NUMBER
IS
 L_STR1 VARCHAR2(4000) := P_STR1;
 L_STR2 VARCHAR2(4000) := P_STR2;
 L_RES NUMBER DEFAULT 0;
 L_DEL_POS1 NUMBER;
 L_DEL_POS2 NUMBER;
 L_WORD1 VARCHAR2(1000);
 L_WORD2 VARCHAR2(1000);
BEGIN
 LOOP
  L_DEL_POS1 := INSTR(L_STR1,' ');
  L_DEL_POS2 := INSTR(L_STR2,' ');
  CASE L_DEL_POS1
  WHEN 0 THEN
   L_WORD1 := L_STR1;
   L_STR1 := '';
  ELSE
   L_WORD1 := SUBSTR(L_STR1,1,L_DEL_POS1 - 1);
  END CASE;
  CASE L_DEL_POS2
  WHEN 0 THEN
   L_WORD2 := L_STR2;
   L_STR2 := '';
  ELSE
   L_WORD2 := SUBSTR(L_STR2,1,L_DEL_POS2 - 1);
  END CASE;
  EXIT
 WHEN (L_WORD1 <> L_WORD2) OR ((L_WORD1 IS NULL) OR (L_WORD2 IS NULL));
  L_RES := L_RES + 1;
  L_STR1 := SUBSTR(L_STR1,L_DEL_POS1 + 1);
  L_STR2 := SUBSTR(L_STR2,L_DEL_POS2 + 1);
 END LOOP;
RETURN L_RES;
END;

一如既往,任何帮助将不胜感激。

您可以在一个查询中执行此操作,但是为了简化语法,我创建了简短的函数来拆分单词。接下来,我使用了此功能和listagg()

select rn, max(c1) c1, max(c2) c2, 
       listagg(t2.column_value, ' ') within group (order by rn, c1, c2) common
  from (select rownum rn, c1, c2 from t) t 
  cross join table(split(c1)) t1 
   left join table(split(c2)) t2 on t2.column_value = t1.column_value
  group by rn

功能:

create or replace function split(i_str in varchar2) 
  return sys.odcivarchar2list pipelined is
begin
  for i in 1..length(' '||regexp_replace(i_str, '[^ ]+')) loop
    pipe row (regexp_substr(i_str, '[^ ]+', 1, i));
  end loop;
end;

示例:

with t (c1, c2) as (
    select 'Microsoft Office', 'Windows Microsoft' from dual union all
    select 'Microsoft Office', 'Microsoft Office'  from dual union all
    select 'Microsoft Office', 'Microsoft Windows' from dual union all
    select 'Microsoft Office', 'Outlook' from dual )
select rn, max(c1) c1, max(c2) c2, 
       listagg(t2.column_value, ' ') within group (order by rn, c1, c2) common
  from (select rownum rn, c1, c2 from t) t 
  cross join table(split(c1)) t1 
   left join table(split(c2)) t2 on t2.column_value = t1.column_value
  group by rn

结果:

        RN C1               C2                COMMON
---------- ---------------- ----------------- -----------------
         1 Microsoft Office Windows Microsoft Microsoft
         2 Microsoft Office Microsoft Office  Microsoft Office
         3 Microsoft Office Microsoft Windows Microsoft
         4 Microsoft Office Outlook

最新更新