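Filtering out words with repeated characters in Java

I am trying to write a program with a method public static void method(List<String> words), where the parameter words is a sorted list of words from the text file words.txt that contains only words in which each letter appears exactly once. For example, the word "feel" would not be included in this list because "e" appears more than once. The word list is not passed as a parameter anywhere else in the program, so method only serves to store and remember the word list for later use. The method may also perform any sorting.

My idea was to create a method that reads the text file and to use its result as the argument to method. method would filter out all words that contain a letter more than once and sort the new list.

When I run the program, I get the error "java.util.ConcurrentModificationException: null (in java.util.LinkedList$ListItr)" at the line for (String word : words). Also, does the line public static List list; correctly save and store the list for later use?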

import java.util.*;
import java.io.*;
class ABC
{
    public static List<String> list = new LinkedList<String>();
    public static List readFile()
    {
        String content = new String();
        File file = new File("words.txt");
        LinkedList<String> words = new LinkedList<String>();
        try
        {
            Scanner sc = new Scanner(new FileInputStream(file));
            while (sc.hasNextLine())
            {
                content = sc.nextLine();
                words.add(content);
            }
        }
        catch (FileNotFoundException fnf)
        {
            fnf.printStackTrace();
        }
        catch (Exception e)
        {
            e.printStackTrace();
            System.out.println("\nProgram terminated safely");
        }
        for (String word : words)
        {
            if (letters(word) == false)
            {
                list.add(word);
            }
        }
        Collections.sort(list);
        return list;
    }
    public static boolean letters(String word)
    {
        for (int i = 0; i < word.length() - 1; i++)
        {
            if (word.contains(String.valueOf(word.charAt(i))) == true)
            {
                return true;
            }
        }
        return false;
    }
    public static void main(String args[])
    {
        System.out.println(readFile());
    }
}

The error occurs because you are modifying the list while you are iterating over it. That is generally not a good idea.
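To make the cause concrete, here is a minimal sketch that reproduces the exception. The class name ModifyWhileIterating and the "shorter than four characters" condition are only stand-ins for illustration; they are not taken from the question's code.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;

class ModifyWhileIterating {
    // Tries to remove short words inside a for-each loop.
    // Returns true if the list's fail-fast iterator detected the change.
    static boolean removeInsideForEach(List<String> words) {
        try {
            for (String word : words) {
                if (word.length() < 4) { // hypothetical filter condition
                    words.remove(word);  // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = new LinkedList<>(Arrays.asList("bad", "hello", "bye"));
        System.out.println("Exception thrown: " + removeInsideForEach(words));
    }
}
```

The for-each loop uses an iterator internally, and that iterator checks on each step whether the list was changed by anything other than the iterator itself.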

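Since you are building a new list anyway, you do not actually need to change the list you are iterating over. I suggest restructuring the code so that the logic deciding whether a letter appears more than once lives in its own method. That keeps the complexity of each method manageable and lets you test them separately.

So, create a new method that tests whether any letter appears more than once: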

static boolean doesAnyLetterAppearMoreThanOnce(String word) {
...
}

Then you can use it in your existing method:

for (String word : words) {
    if (!doesAnyLetterAppearMoreThanOnce(word)) {
        list.add(word);
    }
}
Collections.sort(list);
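The body of doesAnyLetterAppearMoreThanOnce is left open above. One possible way to fill it in is sketched below; the indexOf-based check, the class name RepeatedLetterCheck and the helper keepWordsWithUniqueLetters are my own assumptions, not part of the original answer. Note that the letters method in the question calls word.contains on a character taken from the word itself, which is always true, so the search has to start to the right of the current position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RepeatedLetterCheck {
    // True if some character occurs at two different positions in the word.
    static boolean doesAnyLetterAppearMoreThanOnce(String word) {
        for (int i = 0; i < word.length(); i++) {
            // Search only to the right of position i, so a character never matches itself.
            if (word.indexOf(word.charAt(i), i + 1) >= 0) {
                return true;
            }
        }
        return false;
    }

    // Builds a new sorted list instead of removing from the list being iterated.
    static List<String> keepWordsWithUniqueLetters(List<String> words) {
        List<String> result = new ArrayList<>();
        for (String word : words) {
            if (!doesAnyLetterAppearMoreThanOnce(word)) {
                result.add(word);
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("feel", "computer", "bad", "hello", "bye");
        System.out.println(keepWordsWithUniqueLetters(input)); // prints [bad, bye, computer]
    }
}
```

This comparison is case-sensitive, so "A" and "a" count as different letters; lowercase the word first if that is not what you want.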

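Use an iterator. Try it this way.

Iterator<String> it = words.iterator();
while (it.hasNext()) {
    String word = it.next();
    boolean repeated = false;
    // Compare every pair of positions to look for a repeated letter.
    for (int j = 0; j < word.length() && !repeated; j++)
    {
        for (int k = j + 1; k < word.length(); k++)
        {
            if (word.charAt(j) == word.charAt(k))
            {
                repeated = true;
                break;
            }
        }
    }
    if (repeated)
    {
        // Iterator.remove() takes no argument and may be called only once per next().
        it.remove();
    }
    else
    {
        list.add(word);
    }
}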

However, I would do it differently.

String[] data =
{ "hello", "bad", "bye", "computer", "feel", "glee" };
outer: for (String word : data) {
    for (int i = 0; i < word.length() - 1; i++) {
        if (word.charAt(i) == word.charAt(i + 1)) {
            System.out.println("dropping '" + word + "'");
            continue outer;
        }
    }
    System.out.println("Keeping '" + word + "'");
    list.add(word);
}

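Note: you used feel as your example, so it is not clear whether you want to check for the same letter anywhere in the word or only for identical adjacent letters.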
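The two readings give different results for some words. The following sketch contrasts them; the class name AdjacentVersusAnywhere, both method names and the sample word "level" are hypothetical and chosen only for illustration.

```java
class AdjacentVersusAnywhere {
    // True only if two neighbouring characters are equal, as in "feel".
    static boolean hasAdjacentRepeat(String word) {
        for (int i = 0; i + 1 < word.length(); i++) {
            if (word.charAt(i) == word.charAt(i + 1)) {
                return true;
            }
        }
        return false;
    }

    // True if any character occurs twice, regardless of position.
    static boolean hasRepeatAnywhere(String word) {
        // Fewer distinct characters than total characters means at least one repeat.
        return word.chars().distinct().count() < word.length();
    }

    public static void main(String[] args) {
        for (String word : new String[] { "feel", "level", "bad" }) {
            System.out.println(word + ": adjacent=" + hasAdjacentRepeat(word)
                    + ", anywhere=" + hasRepeatAnywhere(word));
        }
    }
}
```

For "level" the adjacent check finds nothing, while the anywhere check reports a repeat, so only the second reading matches the requirement that each letter appears once.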

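Your program has several problems:

public static List list;
Whenever you see a collection without generics (such as List), that is a code smell. It should be
public static List<String> list;
Also consider changing public to private.

In the readFile() method, you shadow the class variable "list" with a local variable "list", so the class variable stays uninitialized. Assign to the class variable rather than declaring a new local one:
list = new LinkedList<String>();

Use try-with-resources for the Scanner:
try (Scanner sc = new Scanner(new FileInputStream(file))) {
Then you do not need to close it manually afterwards.
You cannot modify a list while you are iterating over it. Either use an iterator and its remove method, or create a new list and append the good words to it instead of removing the bad words from the original list.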

public static List<String> readFile() {
    File file = new File("words.txt");
    List<String> list = new ArrayList<>();
    try (Scanner scanner = new Scanner(file)) {
        while (scanner.hasNextLine()) {
            String word = scanner.nextLine();
            if (noDuplicates(word)) {
                list.add(word);
            }
        }
        Collections.sort(list);
    } catch (FileNotFoundException e) {
        System.out.println("File not found");
    }
    return list;
}
private static boolean noDuplicates(String word) {
    Set<Character> distinctChars = new HashSet<>();
    for (char c : word.toCharArray()) {
        if (!distinctChars.add(c)) {
            return false;
        }
    }
    return true;
}

I suggest a shorter approach:

public static void method(List<String> words) {
    words.removeIf(word -> {
        Set<Integer> hs = new HashSet<>();
        return word.chars().anyMatch(c -> {
            if (hs.contains(c)) return true;
            else hs.add(c);
            return false;
        });
    });
    System.out.println(words);
}

The word list now contains only words in which each letter appears exactly once.
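As a usage sketch of the removeIf approach: the list passed in has to support removal, so the example copies the sample words into an ArrayList first. The class name RemoveIfUsage, the method name removeWordsWithRepeatedLetters and the sample data are assumptions for illustration, and the lambda is condensed by relying on Set.add returning false for an element that is already present.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RemoveIfUsage {
    // Removes, in place, every word in which some character occurs more than once.
    static void removeWordsWithRepeatedLetters(List<String> words) {
        words.removeIf(word -> {
            Set<Integer> seen = new HashSet<>();
            // Set.add returns false when the character was already seen.
            return word.chars().anyMatch(c -> !seen.add(c));
        });
    }

    public static void main(String[] args) {
        // The list returned by Arrays.asList is fixed-size and does not support removal,
        // so copy it into an ArrayList before filtering.
        List<String> words = new ArrayList<>(
                Arrays.asList("bad", "bye", "computer", "feel", "glee", "hello"));
        removeWordsWithRepeatedLetters(words);
        System.out.println(words); // prints [bad, bye, computer]
    }
}
```

Because removeIf keeps the remaining elements in their original order, an input that is already sorted stays sorted.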
