Printing a matrix table from a HashMap

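I'm writing code with Hadoop as an assignment that prints a matrix table of values read from a text file. Basically, it has to count how many times a character appears after a certain other character.

I already have the values: my Hadoop code puts them into a HashMap<char1, HashMap<char2, value>> and passes them to a function called FrequencieTable: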

 public void FrequencieTable(HashMap<Character, HashMap<Character,Integer>> charFrequentie){
    String theRowList = "";
    String theColumnList = "";
    for(Character row : charFrequentie.keySet()){
        theRowList += "\n" + row;
        for(Character column : charFrequentie.get(row).keySet()) {
            theColumnList += column.charValue() + "\t";
            theRowList += "\t" +  charFrequentie.get(row).get(column);
        }
        
        System.out.println("\t" + theColumnList +  "\n");
        System.out.println(theRowList);
    }
}

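Only this gives the wrong output: each character, such as "H", should appear just once as a row and once as a column, and where there is no data it should show 0.

Basically it gives output like this: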

    u   s   a   o   e   m   g   d   t   n   j   t   a   n   g   t   e   m   a   t   e   e   a   o   e   u   s   g   l   j   k   

g   1   1   2   2
d   1   1
e   1   2   2   1   1
a   2   2   3
n   2   2
o   2   1

While it should look like this (without duplicates):

    u   s   a   o   e   m   g   d   t   n   j   
g   0   0   0   0   0   0   0   1   1   2   2
d   1   1   0   0   1   0   3   0   0   0   0

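Does anyone know what we should do? We're completely clueless at the moment.

Thank you

I don't know what your data is supposed to represent, but the basic problem seems to be your inner loop: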

for(Character column: charFrequentie.get(row).keySet()){
    theColumnList += column.charValue() + "\t";
    theRowList += "\t" +  charFrequentie.get(row).get(column);
}

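Here, if the same character exists in the maps of multiple rows, you add that character to the column list once per row. For example, if you have frequencies for a->c and b->c, c will appear in the column list twice.

Besides that, you need to iterate over all the characters in the same order for every row. You currently only use the values each row happens to have, and since you don't know which column they belong to, you can't fill the other columns with 0.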

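To fix this you have to loop twice (otherwise you might miss trailing zeros in the top rows):

once to collect all the columns
once to actually print the rows

The order of the columns either has to be provided/defined, or it depends on the order in the maps.

Example: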

//needs java.util.Set and java.util.LinkedHashSet
//Step 1: collect all the columns that have values
Set<Character> columns = new LinkedHashSet<>();
for(Character row : charFrequentie.keySet()){
  //gets the mapped characters for the row and adds them in the order they are returned, ignoring any duplicates
  columns.addAll(charFrequentie.get(row).keySet());
}
//Step 2: print
//print the columns as the first line
for( Character col : columns ) {
  System.out.print("\t" + col);
}
System.out.println();
//here you iterate over the rows since you'll print line by line
for(Character row : charFrequentie.keySet()){
  System.out.print(row);
  //go over the columns in the same order for each row
  for( Character col : columns ) {
    //get the frequency for the column in that row, which might be null
    Integer frequency = charFrequentie.get(row).get(col);
    //if there is no value for the column in the current row just print 0
    if( frequency == null ) {
      System.out.print("\t0");
    } else {
      //there is a frequency value so just print it
      System.out.print("\t" + frequency);
    }
  }
  System.out.println();
}

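Two notes on the columns and their order:

Since you only provide hash maps, you can't rely on the order of the columns. The LinkedHashSet order will be the same as the order returned by the maps, but that still isn't defined, because hash maps generally don't define an order. If you want a specific order, you have to sort the columns (and then use a sorted set) or provide them manually.
If you only collect columns that have entries, you won't get any column that contains only zeros, like o and m in your example, which only have 0 values. In that case you have to provide those columns manually to get columns that have no frequency data.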
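
As a rough sketch of the two notes above: a TreeSet gives the columns a defined (sorted) order, and characters passed in by hand are included even if no row has data for them. The class name ColumnOrder, the method collectColumns and the extraColumns parameter are made up for this illustration and are not part of the original code.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class ColumnOrder {
    // Hypothetical helper: gathers every column character that occurs in any row,
    // plus the manually supplied ones, in natural (sorted) character order.
    static SortedSet<Character> collectColumns(
            Map<Character, ? extends Map<Character, Integer>> frequencies,
            Collection<Character> extraColumns) {
        SortedSet<Character> columns = new TreeSet<>();
        for (Map<Character, Integer> row : frequencies.values()) {
            columns.addAll(row.keySet());
        }
        // columns without any frequency data have to be provided by the caller
        columns.addAll(extraColumns);
        return columns;
    }

    public static void main(String[] args) {
        Map<Character, Map<Character, Integer>> frequencies = new HashMap<>();
        frequencies.computeIfAbsent('g', k -> new HashMap<>()).put('t', 1);
        frequencies.computeIfAbsent('d', k -> new HashMap<>()).put('u', 1);
        // 'o' and 'm' have no data in any row, so they are supplied by hand
        System.out.println(collectColumns(frequencies, List.of('o', 'm')));
        // prints [m, o, t, u]
    }
}
```

The returned set can be used in place of the LinkedHashSet from the example, so both the header line and every row are printed in the same sorted order.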

Edit:

To make the example clearer, assume the following input data (format: row, column, frequency):

a,a,1
a,b,5
a,c,3
b,a,5
b,e,7

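This would result in a column set with the values a, b, c and e, in any order (since you are using hash maps).

The output might then look like this (the order can differ because of the hash maps; I'm just using a random order):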

  b e a c
b 0 7 5 0  
a 5 0 1 3
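
To try this with the sample data, something along the following lines might work. It is only a sketch: the class FrequencyTableDemo and its parse and render methods are hypothetical names, and it uses TreeMap/TreeSet instead of hash maps so that the order is sorted rather than random.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class FrequencyTableDemo {
    // Parses "row,column,frequency" lines into a nested map (sorted, so the output is stable).
    static Map<Character, Map<Character, Integer>> parse(String... lines) {
        Map<Character, Map<Character, Integer>> table = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            char row = parts[0].charAt(0);
            char col = parts[1].charAt(0);
            int frequency = Integer.parseInt(parts[2].trim());
            table.computeIfAbsent(row, k -> new TreeMap<>()).put(col, frequency);
        }
        return table;
    }

    // Renders the table as tab-separated text, one line per row.
    static String render(Map<Character, Map<Character, Integer>> table) {
        // first pass: collect every column that occurs in any row
        Set<Character> columns = new TreeSet<>();
        for (Map<Character, Integer> row : table.values()) {
            columns.addAll(row.keySet());
        }
        StringBuilder out = new StringBuilder();
        // header line with the column characters
        for (Character col : columns) {
            out.append('\t').append(col);
        }
        out.append('\n');
        // second pass: one line per row, 0 where the row has no entry for a column
        for (Map.Entry<Character, Map<Character, Integer>> row : table.entrySet()) {
            out.append(row.getKey());
            for (Character col : columns) {
                out.append('\t').append(row.getValue().getOrDefault(col, 0));
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(parse("a,a,1", "a,b,5", "a,c,3", "b,a,5", "b,e,7")));
    }
}
```

With the sample data this prints the columns a, b, c, e as the header, then row a with 1 5 3 0 and row b with 5 0 0 7, all separated by tabs.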
