Multidimensional Venn diagram

I have this data:

String[] a = {"a", "b", "c", "d"};
String[] b = {"c", "d"};
String[] c = {"b", "c"};

Now I need a graphical representation of every intersection of these lists. Mostly this will result in a Venn diagram like this one: http://manuals.bioinformatics.ucr.edu/_/rsrc/1353282523430/home/R_BioCondManual/r-bioc-temp/venn1.png?height=291&width=400

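In my implementation these lists will mostly contain more than 1000 entries, and I will have 10+ lists, so a good representation would create sets of strings and intersect them. In my very simple case this would result in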

set_a = {"c"};      // in all three lists
set_b = {"b", "d"}; // in two of three lists
set_c = {"a"};      // in one of three lists

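Another requirement is that the size of each intersection should be proportional to the number of occurrences in the lists. So the size of [...] should be [...].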
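One possible reading of this proportionality requirement is that the drawn area of a region scales with the number of lists its elements occur in. The sketch below is only an illustration under that assumption: the class `RegionSizing`, the method `radiusFor`, and the `unitArea` parameter are hypothetical names, and drawing each region as a circle is an assumption as well.

```java
final class RegionSizing {
    // Assumption: a region is drawn as a circle whose AREA equals
    // occurrences * unitArea, so area grows linearly with the count.
    static double radiusFor(int occurrences, double unitArea) {
        return Math.sqrt(occurrences * unitArea / Math.PI);
    }

    public static void main(String[] args) {
        // Radii for elements found in 1, 2 and 3 lists (hypothetical unit area of 100)
        for (int k = 1; k <= 3; k++) {
            System.out.printf("in %d list(s): radius %.3f%n", k, radiusFor(k, 100.0));
        }
    }
}
```

Scaling the radius instead of the area would exaggerate the larger regions, since the area grows with the square of the radius.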

Is there a library that meets this requirement?

I think this program does the transformation you want:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    public class VennCounts {
        public static void main(String[] args) {
            // The input
            String[][] a = {
                {"a", "b", "c", "d"},
                {"c", "d"},
                {"b", "c"}
            };
            System.out.println("Input: " + Arrays.deepToString(a));
            // Convert each input array to a Set, so a duplicate inside one list counts once.
            // A List of Sets (not a Set of Sets) keeps two identical lists as separate entries.
            List<Set<String>> input = new ArrayList<>();
            for (String[] s : a) {
                input.add(new HashSet<>(Arrays.asList(s)));
            }
            // The map counts in how many lists each element appears
            Map<String, Integer> count = new HashMap<>();
            for (Set<String> s : input) {
                for (String i : s) {
                    count.merge(i, 1, Integer::sum);
                }
            }
            // Create the output structure: output.get(k) holds the elements found in exactly k lists
            List<Set<String>> output = new ArrayList<>();
            for (int i = 0; i <= a.length; i++) {
                output.add(new HashSet<>());
            }
            // Fill the output structure according to the map
            for (Map.Entry<String, Integer> e : count.entrySet()) {
                output.get(e.getValue()).add(e.getKey());
            }
            // And print the output, highest count first
            for (int i = output.size() - 1; i > 0; i--) {
                System.out.println("Set_" + i + " = " + output.get(i));
            }
        }
    }

Output (the order of elements within a set may vary, since a HashSet is unordered):

Input: [[a, b, c, d], [c, d], [b, c]]
Set_3 = [c]
Set_2 = [d, b]
Set_1 = [a]
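The program above only counts how many lists contain each element. A Venn diagram also needs to know which lists those are, since "b" (in the first and third list) and "d" (in the first and second) fall into different regions. A minimal sketch of that grouping follows, assuming a `BitSet` of list indices is an acceptable region key; the class `VennRegions` and the method `regions` are hypothetical names, and nothing here does any drawing.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class VennRegions {
    // Groups every element by the exact set of list indices that contain it.
    static Map<BitSet, Set<String>> regions(String[][] lists) {
        // Step 1: for each element, record the indices of the lists it occurs in
        Map<String, BitSet> membership = new HashMap<>();
        for (int i = 0; i < lists.length; i++) {
            for (String item : lists[i]) {
                membership.computeIfAbsent(item, k -> new BitSet()).set(i);
            }
        }
        // Step 2: elements with the same index set belong to the same region
        Map<BitSet, Set<String>> regions = new HashMap<>();
        for (Map.Entry<String, BitSet> e : membership.entrySet()) {
            regions.computeIfAbsent(e.getValue(), k -> new TreeSet<>()).add(e.getKey());
        }
        return regions;
    }

    public static void main(String[] args) {
        String[][] a = {
            {"a", "b", "c", "d"},
            {"c", "d"},
            {"b", "c"}
        };
        // Each key lists the zero-based indices of the lists sharing those elements
        for (Map.Entry<BitSet, Set<String>> e : regions(a).entrySet()) {
            System.out.println("lists " + e.getKey() + " -> " + e.getValue());
        }
    }
}
```

For the sample input this yields four regions: "a" only in list 0, "b" in lists 0 and 2, "d" in lists 0 and 1, and "c" in all three. The number of lists per region is `key.cardinality()`, so the count-based grouping can be derived from this map as well.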
