TreeMap containsKey method returns false although the key is already in the Map

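I am trying to write a program that counts all the words in a text file. Every word that matches one of the patterns is put into a TreeMap.

The text file is passed in as args[0].

For example, the text file contains the following text: The Project Gutenberg EBook of The Complete Works of William Shakespeare

The condition that checks whether the TreeMap already contains the word returns false for the second occurrence of the word The, but returns true for the second occurrence of the word of.

I don't understand why...
Here is my code: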

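import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordCount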
{
    public static void main(String[] args)
    {
        // Charset charset = Charset.forName("UTF-8");
        // Locale locale = new Locale("en", "US");
        Path p0 = Paths.get(args[0]);
        Path p1 = Paths.get(args[1]);
        Path p2 = Paths.get(args[2]);
        Pattern pattern1 = Pattern.compile("[a-zA-Z]");
        Matcher matcher;
        Pattern pattern2 = Pattern.compile("'.");
        Map<String, Integer> alphabetical = new TreeMap<String, Integer>();
        try (BufferedReader reader = Files.newBufferedReader(p0))
        {
            String line = null;
            while ((line = reader.readLine()) != null)
            {
                // System.out.println(line);
                for (String word : line.split("\\s"))
                {
                    boolean found = false;
                    matcher = pattern1.matcher(word);
                    while (matcher.find())
                    {
                        found = true;
                    }
                    if (found)
                    {
                        boolean check = alphabetical.containsKey(word.toLowerCase());
                        if (!alphabetical.containsKey(word.toLowerCase()))
                            alphabetical.put(word.toLowerCase(), 1);
                        else
                            alphabetical.put(word.toLowerCase(), alphabetical.get(word.toLowerCase()).intValue() + 1);
                    }
                    else
                    {
                        matcher = pattern2.matcher(word);
                        while (matcher.find())
                        {
                            found = true;
                        }
                        if (found)
                        {
                            // Look up and store the same lower-cased key in every call.
                            if (!alphabetical.containsKey(word.substring(1, word.length()).toLowerCase()))
                                alphabetical.put(word.substring(1, word.length()).toLowerCase(), 1);
                            else
                                alphabetical.put(word.substring(1, word.length()).toLowerCase(), alphabetical.get(word.substring(1, word.length()).toLowerCase()).intValue() + 1);
                        }
                    }
                }
            }
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}

I have tested your code and it works fine. I think you need to check the file encoding.

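It is almost certainly saved as "UTF-8" with a byte order mark (BOM). Java does not strip the BOM when decoding UTF-8, so the first word of the file is read as the BOM character U+FEFF followed by The, and its lower-cased form is a different key from "the". That is why the second The is not found while the second of is. Save the file as "UTF-8 without BOM" and you will be fine!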
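To illustrate the effect, here is a minimal sketch that feeds the sample line from the question, prefixed with U+FEFF, through the same split-and-lower-case steps. The class name BomKeyDemo and the helper count are made up for this illustration, and Map.merge stands in for the containsKey/put logic of the original code.

```java
import java.util.Map;
import java.util.TreeMap;

public class BomKeyDemo
{
    // Hypothetical helper: counts tokens using the lower-cased token as the key,
    // as the question's code does for words that contain a letter.
    static Map<String, Integer> count(String line)
    {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String word : line.split("\\s"))
        {
            counts.merge(word.toLowerCase(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args)
    {
        // "\uFEFF" is the BOM as it appears in the first line after UTF-8 decoding.
        String line = "\uFEFFThe Project Gutenberg EBook of The Complete Works of William Shakespeare";
        Map<String, Integer> counts = count(line);

        // The first "The" is stored under a key that still carries the BOM,
        // so it is kept apart from the plain "the" added by the second occurrence.
        System.out.println("count for \"the\": " + counts.get("the"));
        System.out.println("count for BOM + \"the\": " + counts.get("\uFEFFthe"));
        System.out.println("count for \"of\": " + counts.get("of"));
    }
}
```

The two occurrences of The end up under two different keys with a count of 1 each, while of is counted twice as expected.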

Edit: if you cannot change the encoding, you can handle the BOM manually. See this link: http://www.rgagnon.com/javadetails/java-handle-utf8-file-with-bom.html
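One possible way to handle it manually is to drop a leading U+FEFF from the first line before splitting it into words. The sketch below is not taken from the linked page; the class name BomStripper and the method stripBom are hypothetical, and a StringReader stands in for the reader returned by Files.newBufferedReader(p0).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class BomStripper
{
    // The byte order mark as a single char after UTF-8 decoding.
    static final char BOM = '\uFEFF';

    // Hypothetical helper: returns the line without a leading BOM, if there is one.
    static String stripBom(String line)
    {
        if (line != null && !line.isEmpty() && line.charAt(0) == BOM)
        {
            return line.substring(1);
        }
        return line;
    }

    public static void main(String[] args) throws IOException
    {
        // Assumption: in the real program this reader would come from Files.newBufferedReader(p0).
        try (BufferedReader reader = new BufferedReader(new StringReader("\uFEFFThe Project of The")))
        {
            // Only the first line of a file can start with the BOM.
            String firstLine = stripBom(reader.readLine());
            for (String word : firstLine.split("\\s"))
            {
                System.out.println(word.toLowerCase());
            }
        }
    }
}
```

In the word-count loop, this would mean calling stripBom once on the first line read, and leaving the rest of the logic unchanged.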

Regards
