Sorting and merging a Java ArrayList, like Python tuples



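I come from a Python background and am currently porting my Python program to Java. I need advice on the best way to approach this problem.

Initially, I created a list of tuples in Python: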

loft = [('india',1),('accepts',1),('narendra',1), ('modi',1),('manmohan',1),('singh',1),('sonia gandhi',1),('rajkot',1),('sharma',1),('raja',1),('india',2),('manmohan',2),('singh',2),('nepal',2),('prime minister',2),('meeting',2),('economy',2),('manmohan',3),('narendra',3),('modi',3),('gupta',3),('rajkot',3),('patel',3),('singh',3),('rajiv',3),('aajtak',3),('manmohan',4),('nepal',4),('bahadur',4),('king',4),('meeting',4),('economy',4),('wife',4),('plane',4)]

(where india, accepts and so on are keywords, and the numbers are ids fetched from the database). Now, applying:

di = {}
for x, y in loft:
    di.setdefault(x, []).append(y)
newdi = {}

my list becomes a dictionary:

di = {'manmohan': [1, 2, 3, 4], 'sonia gandhi': [1], 'raja': [1], 'india': [1, 2], 'narendra': [1, 3], 'patel': [3], 'sharma': [1], 'nepal': [2, 4], 'gupta': [3], 'singh': [1, 2, 3], 'meeting': [2, 4], 'economy': [2, 4], 'rajkot': [1, 3], 'prime minister': [2], 'plane': [4], 'bahadur': [4], 'king': [4], 'wife': [4], 'accepts': [1], 'modi': [1, 3], 'aajtak': [3], 'rajiv': [3]}

The Java part:

    public void step1() throws SQLException{
      Connection con= new Clustering().connect();
      Statement st = con.createStatement();
      Statement st1 = con.createStatement();
      ResultSet rs = st.executeQuery("select uid from url where artorcat=1");
      ArrayList<Tuples> allkeyword = new ArrayList<Tuples>();
      long starttime = System.currentTimeMillis();
      while (rs.next()) {
        int id = rs.getInt("uid");
        String query = "select tags.tagname from tags left join tag_url_relation on tags.tid=tag_url_relation.tid where tag_url_relation.uid="+id;
        ResultSet rs1 = st1.executeQuery(query);
        while (rs1.next()){
          String tag = rs1.getString(1);
          //Creating an object t of type Tuples
          //and pass values to constructor
          Tuples t = new Tuples(id,tag);
          //adding the above tuple to arraylist allkeyword
          allkeyword.add(t);
        }//job done, now lets test by iterating
      }
      Iterator<Tuples> it = allkeyword.iterator();
      while(it.hasNext()){
        Tuples t = it.next();
        System.out.println(t.getId());
        System.out.println(t.getKeyword());
      }
      long endtime = System.currentTimeMillis();
      long totaltime = endtime-starttime;
      System.out.println("Total time:" + totaltime);
    }

And here is the Tuples class:
/**
 * 
 * 
 * Tuple class is created to create a multiple data type tuple. We are using this tuples object to retrieve keyword and 
 * id in step1 in Clustering.java.
 * @author akshayy
 *
 */

public class Tuples {
    int i;
    String s;

    public Tuples(int i, String s) {
        this.i= i;
        this.s=s;
    }

    public int getId(){
        return this.i;
    }
    public String getKeyword(){
        return this.s;      
    }

}

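So far so good. I have created an ArrayList of Tuples objects, each holding a keyword and an id. What should the next step be to find which ids each keyword appears in? For example, "manmohan" is found in ids 1, 2, 3 and 4, and so on: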

di = {'manmohan': [1, 2, 3, 4], 'sonia gandhi': [1], 'raja': [1], 'india': [1, 2], 'narendra': [1, 3], 'patel': [3], 'sharma': [1], 'nepal': [2, 4], 'gupta': [3], 'singh': [1, 2, 3], 'meeting': [2, 4], 'economy': [2, 4], 'rajkot': [1, 3], 'prime minister': [2], 'plane': [4], 'bahadur': [4], 'king': [4], 'wife': [4], 'accepts': [1], 'modi': [1, 3], 'aajtak': [3], 'rajiv': [3]}

Please tell me what my next step should be to find matching items in the ArrayList and group them as shown above. Or do I need something completely different?

Take a look at the java.util.Map interface. You are effectively creating a

Map<String,List<Integer>> 

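With the plain Collections classes you can use methods such as contains and Collections.sort (if performance is a concern, you might consider using your own sorting algorithm).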
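As a small sketch of the contains and Collections.sort calls mentioned above: the class name SortIdsExample, the helper sortedCopy and the sample ids are made up for illustration and are not part of the original program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example class; the ids below are sample values only.
class SortIdsExample {
    // Returns a sorted copy so the caller's list is left untouched.
    static List<Integer> sortedCopy(List<Integer> ids) {
        List<Integer> copy = new ArrayList<Integer>(ids);
        Collections.sort(copy); // natural (ascending) order for Integer
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(3, 1, 4, 2);
        List<Integer> sorted = sortedCopy(ids);
        System.out.println(sorted);             // prints [1, 2, 3, 4]
        System.out.println(sorted.contains(3)); // prints true
    }
}
```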

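For new Java developers, iterating over a Map is not that straightforward, but you can iterate over the keySet, call get on the Map at each step, and then call contains on the value (a List in this case).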

Integer bar = 1; // placeholder: whatever id you are evaluating
Map<String, List<Integer>> fooMap = new HashMap<String, List<Integer>>();
// ... build your map ...
for (String key : fooMap.keySet()) {
    if (fooMap.get(key).contains(bar)) {
        // ... logic when found ...
    }
}
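An alternative to the keySet-plus-get loop is to iterate over entrySet, which hands you the key and the value together. The following is only a sketch: the class FindKeywordsById, the method keywordsFor and the sample map contents are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class; the map contents are sample data.
class FindKeywordsById {
    // Collects every keyword whose id list contains the given id.
    static List<String> keywordsFor(Map<String, List<Integer>> map, Integer id) {
        List<String> found = new ArrayList<String>();
        for (Map.Entry<String, List<Integer>> entry : map.entrySet()) {
            if (entry.getValue().contains(id)) {
                found.add(entry.getKey());
            }
        }
        Collections.sort(found); // HashMap has no defined order, so sort for stable output
        return found;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> fooMap = new HashMap<String, List<Integer>>();
        fooMap.put("india", Arrays.asList(1, 2));
        fooMap.put("nepal", Arrays.asList(2, 4));
        fooMap.put("king", Arrays.asList(4));
        System.out.println(keywordsFor(fooMap, 2)); // prints [india, nepal]
    }
}
```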

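You need to create a Map whose values are a List or a Set. Depending on your needs, you can keep the Tuples class or use String and Integer on their own.

Here is an example: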

// construct a map with string key (tag) and list of integers (ids) as the value
Map<String, List<Integer>> keywords = new HashMap<String, List<Integer>>();
while (rs.next()) {
    int id = rs.getInt("uid");
    String query = "select tags.tagname from tags left join tag_url_relation on tags.tid=tag_url_relation.tid where tag_url_relation.uid="+id;
    ResultSet rs1 = st1.executeQuery(query);
    while (rs1.next()){
        String tag = rs1.getString(1);
        // construct the List for this keyword
        if (!keywords.containsKey(tag)) {
            keywords.put(tag, new ArrayList<Integer>());
        } 
        keywords.get(tag).add(id);
    }
}

keywords will then be a data structure similar to the one in your Python implementation:

List<Integer> manmohanList = keywords.get("manmohan"); // will get you a list containing the numbers 1,2,3,4
for (Integer id: manmohanList) {
    System.out.println(id); // prints 1,2,3,4
}
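If the ArrayList of Tuples from the question has already been built, the same grouping can be done in memory. This is a rough sketch that assumes Java 8 or later (for Map.computeIfAbsent, which plays the role of Python's setdefault); the class GroupTuples, its method group and the trimmed nested copy of Tuples are hypothetical, and TreeMap is used only so the keys come out sorted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class; assumes Java 8+.
class GroupTuples {
    // Trimmed copy of the question's Tuples class so the sketch is self-contained.
    static class Tuples {
        final int i;
        final String s;
        Tuples(int i, String s) { this.i = i; this.s = s; }
        int getId() { return i; }
        String getKeyword() { return s; }
    }

    // Groups ids by keyword; TreeMap keeps the keywords in alphabetical order.
    static Map<String, List<Integer>> group(List<Tuples> all) {
        Map<String, List<Integer>> byKeyword = new TreeMap<String, List<Integer>>();
        for (Tuples t : all) {
            // Create the list on first sight of a keyword, then append the id.
            byKeyword.computeIfAbsent(t.getKeyword(), k -> new ArrayList<Integer>())
                     .add(t.getId());
        }
        return byKeyword;
    }

    public static void main(String[] args) {
        List<Tuples> allkeyword = new ArrayList<Tuples>();
        allkeyword.add(new Tuples(1, "india"));
        allkeyword.add(new Tuples(1, "manmohan"));
        allkeyword.add(new Tuples(2, "india"));
        allkeyword.add(new Tuples(2, "manmohan"));
        allkeyword.add(new Tuples(2, "nepal"));
        allkeyword.add(new Tuples(3, "manmohan"));
        allkeyword.add(new Tuples(4, "manmohan"));
        allkeyword.add(new Tuples(4, "nepal"));
        // prints {india=[1, 2], manmohan=[1, 2, 3, 4], nepal=[2, 4]}
        System.out.println(group(allkeyword));
    }
}
```

The ids in each list appear in insertion order; if the tuples arrive unordered, each list could be passed through Collections.sort afterwards.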

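Instead of defining a class for the tuples, you could declare a HashMap that stores each keyword together with its list of ids, like a dictionary. For example: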

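Map<String, ArrayList<Integer>> dictionary = new HashMap<String, ArrayList<Integer>>();
// Before adding an id for a keyword, check whether the map already contains that keyword.
while (rs1.next()) {
   // Your
   // old
   // code
   if (dictionary.containsKey(tag)) {
       // get() returns the stored list itself, so no put() is needed afterwards
       dictionary.get(tag).add(id);
   } else {
       // first occurrence of this keyword: start a new list of ids
       ArrayList<Integer> idList = new ArrayList<Integer>();
       idList.add(id);
       dictionary.put(tag, idList);
   }
}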

I have not tested this, so it may contain typos, but you should get the idea.
