Reading an ArrayList from a file. Print words that occur only once

New to coding and Java, please be kind :)

I'm working on a project for school, and I'm trying to iterate over an ArrayList that I read in from a text file.

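I read the file into an ArrayList using a Scanner, then sort the ArrayList with Collections.sort(), hoping that I can check each element against the next one. If an element is the same as the next element, ignore it and move on, but if the element has no duplicate in the ArrayList, add it to a new ArrayList.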

So, when reading a text file containing the following words:

this this is a a sentence sentence that does not not make sense a sentence not sentence not really really why not this a sentence not sentence a this really why

the new ArrayList should be

is that does make sense

because those words occur only once.

public static void main (String[] args) throws FileNotFoundException {
    Scanner fileIn = new Scanner(new File("words.txt"));
    ArrayList<String> uniqueArrList = new ArrayList<String>();
    ArrayList<String> tempArrList   = new ArrayList<String>();
    while (fileIn.hasNext()) {
        tempArrList.add(fileIn.next());
        Collections.sort(tempArrList);
    }
    for (String s : tempArrList) {
        if(!uniqueArrList.contains(s))
            uniqueArrList.add(s);
        else if (uniqueArrList.contains(s))
            uniqueArrList.remove(s);
        Collections.sort(uniqueArrList);
        System.out.println(uniqueArrList);
    }
}

This is what I have so far, but I always end up with [a, does, is, make, really, sense, that]

I hope someone can tell me what I'm doing wrong :)

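Your algorithm is incorrect because it keeps adding items to uniqueArrList and removing them again. As a result, it finds the words that occur an odd number of times, and it does not depend on the list being sorted at all.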
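To see why the add/remove logic ends up with odd-count words, here is a minimal sketch that isolates that logic. The class name ToggleDemo and the method toggle are made up for illustration, and the input is a small hand-picked list rather than the question's file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ToggleDemo {
    // Mirrors the add/remove logic from the question: every occurrence of a
    // word flips whether that word is currently in the result list.
    static List<String> toggle(List<String> words) {
        List<String> result = new ArrayList<>();
        for (String w : words) {
            if (!result.contains(w)) {
                result.add(w);      // 1st, 3rd, 5th... occurrence: word goes in
            } else {
                result.remove(w);   // 2nd, 4th, 6th... occurrence: word goes out
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "a" occurs 3 times (odd), "b" twice (even), "c" once.
        System.out.println(toggle(Arrays.asList("a", "a", "a", "b", "b", "c")));
        // prints [a, c]
    }
}
```

"a" survives even though it is repeated, which is the same effect that lets "a" and "really" slip into the question's output.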

You can sort the list once (move the sort out of the loop) and then use a very simple strategy:

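Walk the list using an integer index
Check the word at the current index against the word at the next index
If the words are different, print the current word and advance the index by one
If the words are the same, move forward through the list until you see a different word, and use that word's position as the next value of the loop index.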

Here is a sample implementation:

Scanner fileIn = new Scanner(new File("words.txt"));
List<String> list = new ArrayList<>();
while (fileIn.hasNext()) {
    list.add(fileIn.next());
}           
Collections.sort(list);
int pos = 0;
while (pos != list.size()) {
    int next = pos+1;
    while (next != list.size() && list.get(pos).equals(list.get(next))) {
        next++;
    }
    if (next == pos+1) {
        System.out.println(list.get(pos));
    }
    pos = next;
}


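One option here is to maintain a hash map of word counts as you parse the file. Then, at the end, iterate over the map to get the words that occur only once: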

Scanner fileIn = new Scanner(new File("words.txt"));
Map<String, Integer> map = new HashMap<>();
ArrayList<String> uniqueArrList = new ArrayList<String>();
while (fileIn.hasNext()) {
    String word = fileIn.next();
    Integer cnt = map.get(word);
    map.put(word, cnt == null ? 1 : cnt.intValue() + 1);
}
// now iterate over all words in the map, adding unique words to a separate list
for (Map.Entry<String, Integer> entry : map.entrySet()) {
    if (entry.getValue() == 1) {
        uniqueArrList.add(entry.getKey());
    }
}
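As a self-contained variation on the counting idea, the sketch below uses Map.merge (Java 8+) to do the increment in one call, and a LinkedHashMap so the result keeps the order in which words first appear. The class name WordCounts and the method onlyOnce are made up for illustration, and the words come from a string instead of words.txt so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WordCounts {
    // Returns the words that occur exactly once, in order of first appearance.
    static List<String> onlyOnce(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : words) {
            // Insert 1 for a new word, otherwise add 1 to the stored count.
            counts.merge(w, 1, Integer::sum);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() == 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "this this is a a sentence sentence that does not not make sense "
                + "a sentence not sentence not really really why not this a sentence "
                + "not sentence a this really why";
        System.out.println(onlyOnce(Arrays.asList(text.split("\\s+"))));
        // prints [is, that, does, make, sense]
    }
}
```

With a plain HashMap the same words come out, but in no guaranteed order.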

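Your current approach is close; you should sort once, after adding all the words. Then you need to keep an index into the List so that you can test whether equal elements are adjacent. Like this,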

List<String> uniqueArrList = new ArrayList<>();
List<String> tempArrList = new ArrayList<>();
while (fileIn.hasNext()) {
    tempArrList.add(fileIn.next());
}
Collections.sort(tempArrList);
for (int i = 0; i < tempArrList.size(); i++) {
    String s = tempArrList.get(i);
    if (i + 1 < tempArrList.size() && s.equals(tempArrList.get(i + 1))) {
        // skip all equal and adjacent values
        while (i + 1 < tempArrList.size() && s.equals(tempArrList.get(i + 1))) {
            i++;
        }
    } else {
        // no equal neighbor follows (this also covers the last element)
        uniqueArrList.add(s);
    }
}
System.out.println(uniqueArrList);

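The simplest way is to use a Set or HashSet, because you forgot to control duplicate elements. However, if you have to use lists, there is no need to sort the elements. Just iterate over the words twice and you are done: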

List<String> uniqueWords = new ArrayList<>();
for (int i = 0; i < words.size(); i++) {
    boolean hasDuplicate = false;
    // compare words.get(i) against every other position in the list
    for (int j = 0; j < words.size(); j++) {
        if (i != j && words.get(i).equals(words.get(j))) {
            hasDuplicate = true;
        }
    }
    if (!hasDuplicate) {
        uniqueWords.add(words.get(i));
    }
}
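The inner loop above can also be handed to the standard library: Collections.frequency counts how many elements of a collection equal a given object. The sketch below is one way to write that, still quadratic in the number of words; the class name FrequencyDemo and the method onlyOnce are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FrequencyDemo {
    // Keeps each word whose total count in the list is exactly one.
    static List<String> onlyOnce(List<String> words) {
        List<String> result = new ArrayList<>();
        for (String w : words) {
            // frequency() scans the whole list, so this is O(n^2) overall,
            // the same cost as the explicit nested loop.
            if (Collections.frequency(words, w) == 1) {
                result.add(w);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(onlyOnce(Arrays.asList("not", "is", "not", "that", "a", "a")));
        // prints [is, that]
    }
}
```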

There is a logic error in this call:

else if (uniqueArrList.contains(s))
      uniqueArrList.remove(s);

Using a single list:

            Scanner fileIn = new Scanner(new File("words.txt"));
            ArrayList<String> tempArrList = new ArrayList<String>();
            while (fileIn.hasNext()) {
                tempArrList.add(fileIn.next());
            }
            Collections.sort(tempArrList);
            System.out.println(tempArrList);
            if (tempArrList.size() > 1) {
                for (int i = tempArrList.size() - 1; i >= 0; i--) {
                    String item = tempArrList.remove(i);
                    if (tempArrList.removeAll(Collections.singleton(item))) {
                        if (i > tempArrList.size()) {
                            i = tempArrList.size();
                        }
                    } else {
                        tempArrList.add(item);
                    }
                }
            }
            System.out.println(tempArrList);

Hope this helps! If it does, please leave feedback.

Just for completeness, this problem is a good fit for Java 8 Streams, using the distinct() intermediate operation:

public static void main (String[] args) throws FileNotFoundException {    
    final Scanner fileIn = new Scanner(new File("words.txt"));
    final List<String> tempArrList = new ArrayList<String>();
    while (fileIn.hasNext()) {
        tempArrList.add(fileIn.next());
    }
    final List<String> uniqueArrList = tempArrList.stream().distinct().collect(Collectors.toList());
    System.out.println(uniqueArrList);
}

This code prints (for the given input):

[this, is, a, sentence, that, does, not, make, sense, really, why]

If we want all the words sorted, just add sorted() to the stream pipeline:

tempArrList.stream().sorted().distinct().collect(Collectors.toList());

We get a sorted (and pretty) output:

[a, does, is, make, not, really, sense, sentence, that, this, why]
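Note that distinct() keeps one copy of every word, so the lists above contain repeated words such as "this" and "not" as well. If the goal is only the words that occur exactly once, one possible stream-based sketch is to group the words by themselves, count each group, and keep the groups of size one. The class name OnceOnlyStream and the method onlyOnce are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class OnceOnlyStream {
    // Returns the words that occur exactly once, in order of first appearance.
    static List<String> onlyOnce(List<String> words) {
        // Count occurrences per word; LinkedHashMap preserves encounter order.
        Map<String, Long> counts = words.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(), LinkedHashMap::new, Collectors.counting()));
        return counts.entrySet().stream()
                .filter(e -> e.getValue() == 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "this this is a a sentence sentence that does not not make sense "
                + "a sentence not sentence not really really why not this a sentence "
                + "not sentence a this really why";
        System.out.println(onlyOnce(Arrays.asList(text.split("\\s+"))));
        // prints [is, that, does, make, sense]
    }
}
```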
