Correct way to group a stream into a POJO



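I have a list of summary rows, a few rows per entity. Some scalar properties of the entity are repeated in every row, and each row adds two columns that are unique to it: GroupName and GroupCount.

Basically this is the output of a SQL join: the entity data is repeated, and each row carries a distinct group name together with its count.

I want to stream the list and collect it into entity DTOs that hold the entity's properties plus a Map with the consolidated group statistics.

I tried an implementation using Collectors.groupingBy, but it still doesn't look right.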

@Data
@AllArgsConstructor
public static class DepartmentSummaryRow {
    private int id;
    private String name;
    private String groupName;
    private int groupMembersCount;
}

@Data
@AllArgsConstructor
public static class Department {
    private int id;
    private String name;
    @EqualsAndHashCode.Exclude
    private final Map<String, Integer> groupCounts = new HashMap<>();
}

public static void main(String[] args) {
    grouping();
}

private static void grouping() {
    Gson g = new GsonBuilder().setPrettyPrinting().disableHtmlEscaping().create();

    // Test data
    List<DepartmentSummaryRow> summaries = new ArrayList<>();
    for (int i = 1; i <= 50; i++) {
        summaries.add(new DepartmentSummaryRow(i, "name_a" + i, "g1", 3));
        summaries.add(new DepartmentSummaryRow(i, "name_b" + i, "g2", 9));
    }

    // Just group the summary rows
    Map<Department, List<DepartmentSummaryRow>> departmentsToSummaries = summaries
        .stream()
        .collect(
            Collectors.groupingBy(
                summary -> new Department(summary.id, summary.name),
                LinkedHashMap::new,
                Collectors.toList()
            )
        );

    // Merge the info into the departments
    departmentsToSummaries.forEach((entity, summaryRows) -> {
        entity.groupCounts.putAll(
            summaryRows.stream().collect(
                Collectors.groupingBy(
                    DepartmentSummaryRow::getGroupName,
                    Collectors.summingInt(DepartmentSummaryRow::getGroupMembersCount)
                )
            )
        );
    });

    System.out.println(g.toJson(departmentsToSummaries.keySet()));
}

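I'm looking for ideas for a better way to group a stream into a custom POJO than the implementation above. Any suggestion would be helpful. Thanks!

(Note: this code itself has a bug. For some reason the first grouping, by my POJO, does not group at all. That is strange, because the POJO has a proper hashCode and equals provided by Lombok.)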

Edit: here is what the input looks like:

[
{ "id": 1, "name": "name_a1", "groupName": "g1", "groupMembersCount": 3 }, 
{ "id": 1, "name": "name_b1", "groupName": "g2", "groupMembersCount": 9 }, 
{ "id": 2, "name": "name_a2", "groupName": "g1", "groupMembersCount": 3 }, 
...
]

The expected result is:

[ 
{ "id": 1, "name": "name_a1", "groupCounts": { "g1": 3, "g2": 9 } }, 
{ "id": 2, "name": "name_a2", "groupCounts": { "g1": 3, "g2": 9 } },
...
]

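The main issue is that the expected result can only be obtained if the grouping is done by summary.id alone, because the summary.name values differ between rows of the same department. The name of the first matching DepartmentSummaryRow is then the one that ends up in the resulting Department.

So a small fix that excludes name from equals/hashCode in Department should do the trick: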

@Data
@AllArgsConstructor
public static class Department {
    private int id;
    @EqualsAndHashCode.Exclude
    private String name;
    @EqualsAndHashCode.Exclude
    private final Map<String, Integer> groupCounts = new HashMap<>();
}
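
To illustrate why this fix works, here is a minimal sketch without Lombok. The class names IdOnlyEqualityDemo and Dept and the helper countPerKey are hypothetical stand-ins, and equals/hashCode are written by hand to mimic what the annotations above should generate (an assumption, not Lombok's actual output). With equality based on id only, the two rows of each department collapse onto one key, and the key instance that is kept is the first one encountered. If name were part of the equality, the same input would produce six distinct keys instead of three.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class IdOnlyEqualityDemo {

    // Hand-written stand-in for Department: equality looks at id only.
    static final class Dept {
        final int id;
        final String name;

        Dept(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Dept && ((Dept) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    // Groups the keys by themselves and counts how many fall into each group.
    static Map<Dept, Long> countPerKey(List<Dept> keys) {
        return keys.stream().collect(Collectors.groupingBy(
                d -> d, LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Dept> keys = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            keys.add(new Dept(i, "name_a" + i));
            keys.add(new Dept(i, "name_b" + i));
        }
        Map<Dept, Long> counts = countPerKey(keys);
        // Three keys, each counted twice; the retained key carries the "name_a" name.
        counts.forEach((d, n) -> System.out.println(d.id + " " + d.name + " " + n));
    }
}
```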

However, it may be better to use Collectors.toMap with a merge function and a Supplier<Map> to get a similar result without using Department as the map key:

List<Department> result = new ArrayList<>(
    summaries
        .stream() // Stream<DepartmentSummaryRow>
        .collect(Collectors.toMap(
            DepartmentSummaryRow::getId, // int id as key
            SOGroup::create,             // value: Department
            SOGroup::merge,              // merge departments by id
            LinkedHashMap::new           // keep insertion order
        ))
        .values()
);
result.forEach(System.out::println);

A couple of utility methods need to be implemented (in the class referenced as SOGroup in the snippet above):

static Department create(DepartmentSummaryRow row) {
    Department dept = new Department(row.getId(), row.getName());
    dept.getGroupCounts().put(row.getGroupName(), row.getGroupMembersCount());
    return dept;
}

static Department merge(Department dept1, Department dept2) {
    dept2.getGroupCounts().forEach(
        (k, v) -> dept1.getGroupCounts().merge(k, v, Integer::sum)
    );
    return dept1;
}
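
For readers who want to try the toMap approach without Lombok or Gson, the pieces can be put together roughly as follows. This is only a sketch under assumptions: ToMapMergeSketch, Row, Dept and collect are hypothetical names, the plain classes stand in for the annotated DTOs, and groupCounts uses a LinkedHashMap purely to keep the printed order stable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapMergeSketch {

    // Stand-in for DepartmentSummaryRow.
    static final class Row {
        final int id;
        final String name;
        final String groupName;
        final int groupMembersCount;

        Row(int id, String name, String groupName, int groupMembersCount) {
            this.id = id;
            this.name = name;
            this.groupName = groupName;
            this.groupMembersCount = groupMembersCount;
        }

        int getId() {
            return id;
        }
    }

    // Stand-in for Department.
    static final class Dept {
        final int id;
        final String name;
        final Map<String, Integer> groupCounts = new LinkedHashMap<>();

        Dept(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public String toString() {
            return id + " " + name + " " + groupCounts;
        }
    }

    // Turns a single row into a department holding that row's group count.
    static Dept create(Row row) {
        Dept dept = new Dept(row.id, row.name);
        dept.groupCounts.put(row.groupName, row.groupMembersCount);
        return dept;
    }

    // Folds the counts of the second department into the first and keeps the first.
    static Dept merge(Dept first, Dept second) {
        second.groupCounts.forEach((k, v) -> first.groupCounts.merge(k, v, Integer::sum));
        return first;
    }

    // Collects rows into one department per id, in order of first appearance.
    static List<Dept> collect(List<Row> rows) {
        return new ArrayList<>(rows.stream()
                .collect(Collectors.toMap(
                        Row::getId,
                        ToMapMergeSketch::create,
                        ToMapMergeSketch::merge,
                        LinkedHashMap::new))
                .values());
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        for (int i = 1; i <= 2; i++) {
            rows.add(new Row(i, "name_a" + i, "g1", 3));
            rows.add(new Row(i, "name_b" + i, "g2", 9));
        }
        collect(rows).forEach(System.out::println);
    }
}
```

Run as written, main prints "1 name_a1 {g1=3, g2=9}" followed by "2 name_a2 {g1=3, g2=9}". Because merge returns its first argument, the name of the first row seen for an id is the one that is kept.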

Output (shown as JSON):

[
{"id":1,"name":"name_a1","groupCounts":{"g1":3,"g2":9}},
{"id":2,"name":"name_a2","groupCounts":{"g1":3,"g2":9}},
...
{"id":49,"name":"name_a49","groupCounts":{"g1":3,"g2":9}},
{"id":50,"name":"name_a50","groupCounts":{"g1":3,"g2":9}}
]