Parsing Hebrew text with regex in Java



I am trying to parse Hebrew text with a regular expression, but without success. Can anyone here help?

    String hebrewSearhString  = "חן";
    //String regexHebrewPattern = "([\u0591-\u05F4\\s]+)"; // Tried this too, with the same lack of success
    String regexHebrewPattern = "([\\p{InHebrew}]+)";
    Pattern patternHebrew = Pattern.compile(regexHebrewPattern, Pattern.UNICODE_CASE);
    Matcher matcherHebrew = pattern.matcher(hebrewSearhString);
    if(matcherHebrew.matches()) {
        System.out.println("Whole -"+ matcherHebrew.group(0));
        //System.out.println("Group 1 -"+ matcherHebrew.group(1));
        //System.out.println("Group 2 -"+ matcherHebrew.group(2));
    }

Result: the `if` condition never evaluates to true.

Thanks.

This line,

    Matcher matcherHebrew = pattern.matcher(hebrewSearhString);

should be

    Matcher matcherHebrew = patternHebrew.matcher(hebrewSearhString);

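With that change I get the output,

    Whole -חן

because, with the matcher created from `patternHebrew`, `matcherHebrew.matches()` does return `true`.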
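For reference, here is a minimal self-contained sketch of the corrected code. The class name `HebrewRegexDemo` and the helper methods `isAllHebrew` and `containsHebrew` are made up for illustration and are not from the question. It also shows a point that is easy to trip over: `matches()` succeeds only when the entire input matches the pattern, while `find()` looks for a match anywhere in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HebrewRegexDemo {
    // \p{InHebrew} matches any character in the Unicode Hebrew block (U+0590 to U+05FF).
    // The backslash is doubled because the pattern is written as a Java string literal.
    private static final Pattern HEBREW = Pattern.compile("\\p{InHebrew}+");

    // Hypothetical helper: true only when the whole input is Hebrew-block characters.
    public static boolean isAllHebrew(String text) {
        return HEBREW.matcher(text).matches();
    }

    // Hypothetical helper: true when the input contains at least one Hebrew-block character.
    public static boolean containsHebrew(String text) {
        return HEBREW.matcher(text).find();
    }

    public static void main(String[] args) {
        // The search string from the question, written with Unicode escapes so the
        // source compiles the same way regardless of the file encoding.
        String hebrewSearchString = "\u05D7\u05DF";

        // The matcher must come from the Hebrew pattern itself,
        // not from some other Pattern variable.
        Matcher matcherHebrew = HEBREW.matcher(hebrewSearchString);
        if (matcherHebrew.matches()) {
            System.out.println("Whole -" + matcherHebrew.group(0));
        }

        // Mixed input: matches() fails because of the Latin letters and the space,
        // while find() still locates the Hebrew run.
        String mixed = "abc " + hebrewSearchString;
        System.out.println(isAllHebrew(mixed));
        System.out.println(containsHebrew(mixed));
    }
}
```

If the goal is to pull Hebrew words out of longer mixed text rather than to validate a whole string, `find()` in a loop is probably the better fit than `matches()`.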
