Print a sentence only once when it contains words from an array



I have some Java code that retrieves sentences based on an array of words. The text string is:

String text = "This is a sample text. Simple yet elegant. Everyone dies. I don't care. This text is nice.";

I also have an array of words:

String[] words = new String[] {"text", "care", "nice"};

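Now I need to get the sentences that contain any of the words in the array. So each sentence in the output should contain the word "text", "care", or "nice". The resulting output should look like this: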

This is a sample text. //contains the word "text"
I don't care. //contains the word "care"
This text is nice. //contains the word "text" and "nice" 

The code I have is:

    public static void main(String[] args) {
        String text = "This is a sample text. Simple yet elegant. Everyone dies. I don't care. This text is nice.";
        String[] words = new String[] {"text", "care", "nice"};
        // split() takes a regular expression, so the dot has to be escaped
        String[] parts = text.split("\\.");
        for (String w : words) {
            for (String sentence : parts) {
                if (sentence.contains(w)) {
                    System.out.println(sentence + " //contains: " + w);
                }
            }
        }
    }

However, if a sentence contains two words from the array, it is printed twice. For example:

This text is nice //contains: text
This text is nice //contains: nice

How can I print each sentence only once? Thanks.

A Java 8 solution:

    for (String sentence : parts) {
        List<String> wordsInCurrentSentence = new LinkedList<String>();
        for (String w : words) {
            if (sentence.contains(w)) {
                wordsInCurrentSentence.add(w);
            }
        }
        if (!wordsInCurrentSentence.isEmpty()) {
            String result = wordsInCurrentSentence.stream().collect(Collectors.joining(","));
            System.out.println(sentence.trim() + " //contains: " + result);
        }
    }
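For reference, here is a self-contained sketch of the same idea that can be compiled and run on its own. The class name `SentenceMatcher` and the method `matchingLines` are made up for this example, and it replaces the inner loop with a stream filter. Like the snippet above, it relies on `contains`, so it matches substrings rather than whole words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, named only for this example.
class SentenceMatcher {

    // Returns one output line per sentence that contains at least one word.
    static List<String> matchingLines(String text, String[] words) {
        List<String> lines = new ArrayList<>();
        for (String part : text.split("\\.")) {
            String sentence = part.trim();
            // Collect every word that occurs in this sentence (substring match).
            String hits = Arrays.stream(words)
                    .filter(sentence::contains)
                    .collect(Collectors.joining(","));
            if (!hits.isEmpty()) {
                lines.add(sentence + " //contains: " + hits);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "This is a sample text. Simple yet elegant. Everyone dies. I don't care. This text is nice.";
        String[] words = {"text", "care", "nice"};
        matchingLines(text, words).forEach(System.out::println);
    }
}
```

Each sentence is visited exactly once, so a sentence with several hits still produces a single line that lists all of its matched words.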

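I think it is better to swap the loops so that the loop over the sentences is the outer one. That way you can check which of your words are hit in each sentence and add them to a local list. Like this: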

for(String sentence: parts){
    List<String> hitList = new ArrayList<String>();
    for(String w: words){
        if(sentence.contains(w)){
            hitList.add(w);
        }
    }
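    // The conditional needs parentheses: "+" binds tighter than "!=", and the
    // list is never null, so test for emptiness instead.
    System.out.println(sentence + " //contains: " + (hitList.isEmpty() ? "No match" : hitList));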
}

That way you can handle the case you pointed out, where "This text is nice." contains both the word "text" and the word "nice".

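Swap the loops and add a break. Others have already suggested better ways to do this, but a minor change to your code should be enough: swap the loops and break out of the inner loop on the first hit. Note that this reports only the first matching word for each sentence.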

for (String sentence : parts) {
    for (String w : words) {
        if (sentence.contains(w)) {
            System.out.println(sentence + " //contains: " + w);
            break;
        }
    }
}
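If only the sentences are needed and not the words that matched, the loop with `break` can also be written with `anyMatch`, which stops at the first hit in the same way. This is a minimal sketch; the class name `AnyWordFilter` and the method `sentencesWithAnyWord` are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, named only for this example.
class AnyWordFilter {

    // Keeps each sentence that contains at least one of the words.
    static List<String> sentencesWithAnyWord(String text, String[] words) {
        return Arrays.stream(text.split("\\."))
                .map(String::trim)
                // anyMatch short-circuits on the first word found, like "break".
                .filter(s -> Arrays.stream(words).anyMatch(s::contains))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "This is a sample text. Simple yet elegant. Everyone dies. I don't care. This text is nice.";
        String[] words = {"text", "care", "nice"};
        sentencesWithAnyWord(text, words).forEach(System.out::println);
    }
}
```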

I would use a regular expression, like this:

String regex = ".*?(" + String.join("|", words) + ").*?";//either of one word in the sentence
for (String sentence: parts) {
    if(sentence.matches(regex)) {
        //...
    }
}
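One thing to keep in mind with both `contains` and the regex above is that they match substrings, so "text" would also be found inside "context". If whole-word matching is wanted, a possible variation is to add `\b` word boundaries and use `find()` instead of `matches()`. The sketch below assumes that is the desired behavior; the names `WholeWordFilter`, `wholeWordPattern`, and `sentencesWithWholeWord` are made up for the example. `Pattern.quote` is used so that words containing regex metacharacters are treated literally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, named only for this example.
class WholeWordFilter {

    // Builds a pattern such as \b(?:\Qtext\E|\Qcare\E|\Qnice\E)\b
    static Pattern wholeWordPattern(String[] words) {
        String alternation = Arrays.stream(words)
                .map(Pattern::quote) // treat each word as a literal
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(?:" + alternation + ")\\b");
    }

    // Keeps each sentence in which at least one word occurs as a whole word.
    static List<String> sentencesWithWholeWord(String text, String[] words) {
        Pattern pattern = wholeWordPattern(words);
        List<String> result = new ArrayList<>();
        for (String part : text.split("\\.")) {
            String sentence = part.trim();
            // find() looks for the pattern anywhere in the sentence.
            if (pattern.matcher(sentence).find()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Some context here. This text is nice.";
        String[] words = {"text", "care", "nice"};
        sentencesWithWholeWord(text, words).forEach(System.out::println);
    }
}
```

Compiling the pattern once outside the loop also avoids rebuilding the regex for every sentence, which `String.matches` would do on each call.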
