Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/30210547/
Grouping by object value, counting and then setting group key by maximum object attribute
Asked by Jernej Jerin
I have managed to write a solution using the Java 8 Streams API that first groups a list of Route objects by value and then counts the number of objects in each group. It returns a mapping Route -> Long. Here is the code:
Map<Route, Long> routesCounted = routes.stream()
        .collect(Collectors.groupingBy(gr -> gr, Collectors.counting()));
And the Route class:
public class Route implements Comparable<Route> {
    private long lastUpdated;
    private Cell startCell;
    private Cell endCell;
    private int dropOffSize;

    public Route(Cell startCell, Cell endCell, long lastUpdated) {
        this.startCell = startCell;
        this.endCell = endCell;
        this.lastUpdated = lastUpdated;
    }

    public long getLastUpdated() {
        return this.lastUpdated;
    }

    public void setLastUpdated(long lastUpdated) {
        this.lastUpdated = lastUpdated;
    }

    public Cell getStartCell() {
        return startCell;
    }

    public void setStartCell(Cell startCell) {
        this.startCell = startCell;
    }

    public Cell getEndCell() {
        return endCell;
    }

    public void setEndCell(Cell endCell) {
        this.endCell = endCell;
    }

    public int getDropOffSize() {
        return this.dropOffSize;
    }

    public void setDropOffSize(int dropOffSize) {
        this.dropOffSize = dropOffSize;
    }

    /**
     * Compute hash code by using Apache Commons Lang HashCodeBuilder.
     */
    @Override
    public int hashCode() {
        return new HashCodeBuilder(43, 59)
                .append(this.startCell)
                .append(this.endCell)
                .toHashCode();
    }

    /**
     * Compute equals by using Apache Commons Lang EqualsBuilder.
     */
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Route))
            return false;
        if (obj == this)
            return true;
        Route route = (Route) obj;
        return new EqualsBuilder()
                .append(this.startCell, route.startCell)
                .append(this.endCell, route.endCell)
                .isEquals();
    }

    @Override
    public int compareTo(Route route) {
        if (this.dropOffSize < route.dropOffSize)
            return -1;
        else if (this.dropOffSize > route.dropOffSize)
            return 1;
        else {
            // if contains drop off timestamps, order by last timestamp in drop off
            // the highest timestamp has precedence
            if (this.lastUpdated < route.lastUpdated)
                return -1;
            else if (this.lastUpdated > route.lastUpdated)
                return 1;
            else
                return 0;
        }
    }
}
What I would additionally like to achieve is that the key for each group is the Route with the largest lastUpdated value. I have already looked at this solution, but I do not know how to combine counting and grouping by value with selecting the Route with the maximum lastUpdated value. Here is example data showing what I want to achieve:
EXAMPLE:
List<Route> routes = new ArrayList<>();
routes.add(new Route(new Cell(1, 2), new Cell(2, 1), 1200L));
routes.add(new Route(new Cell(3, 2), new Cell(2, 5), 1800L));
routes.add(new Route(new Cell(1, 2), new Cell(2, 1), 1700L));
SHOULD BE CONVERTED TO:
Map<Route, Long> routesCounted = new HashMap<>();
routesCounted.put(new Route(new Cell(1, 2), new Cell(2, 1), 1700L), 2L);
routesCounted.put(new Route(new Cell(3, 2), new Cell(2, 5), 1800L), 1L);
Notice that the key of the mapping that counted 2 Routes is the one with the largest lastUpdated value.
Accepted answer by Misha
Here's one approach. First group into lists and then process the lists into the values you actually want:
import static java.util.Comparator.comparingLong;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toMap;

Map<Route,Integer> routeCounts = routes.stream()
        .collect(groupingBy(x -> x))
        .values().stream()
        .collect(toMap(
            lst -> lst.stream().max(comparingLong(Route::getLastUpdated)).get(),
            List::size
        ));
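As a rough, self-contained illustration of the two-step approach above, the sketch below groups first and then picks the most recently updated route in each group. The Route class here is a hypothetical, simplified stand-in (string endpoints instead of Cell objects), and the names GroupThenPickLatest and countByLatest are made up for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class GroupThenPickLatest {

    // Hypothetical, simplified stand-in for the question's Route:
    // equality depends only on start and end, never on lastUpdated.
    static final class Route {
        final String start;
        final String end;
        final long lastUpdated;

        Route(String start, String end, long lastUpdated) {
            this.start = start;
            this.end = end;
            this.lastUpdated = lastUpdated;
        }

        long getLastUpdated() {
            return lastUpdated;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Route)) return false;
            Route r = (Route) o;
            return start.equals(r.start) && end.equals(r.end);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    // Step 1 groups equal routes into lists; step 2 turns each list into
    // (most recently updated route, list size).
    static Map<Route, Integer> countByLatest(List<Route> routes) {
        return routes.stream()
                .collect(Collectors.groupingBy(r -> r))
                .values().stream()
                .collect(Collectors.toMap(
                        group -> group.stream()
                                .max(Comparator.comparingLong(Route::getLastUpdated))
                                .get(), // safe: groupingBy never produces an empty list
                        List::size));
    }

    public static void main(String[] args) {
        List<Route> routes = Arrays.asList(
                new Route("A", "B", 1200L),
                new Route("C", "D", 1800L),
                new Route("A", "B", 1700L));
        countByLatest(routes).forEach((route, count) ->
                System.out.println(route.start + "->" + route.end
                        + " @" + route.lastUpdated + " x" + count));
    }
}
```

Note that the counts come back as Integer because List::size returns an int; mapping to a long would be needed if Map<Route, Long> is required.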
Answer by Tagir Valeev
You can define an abstract "library" method which combines two collectors into one:
static <T, A1, A2, R1, R2, R> Collector<T, ?, R> pairing(Collector<T, A1, R1> c1,
        Collector<T, A2, R2> c2, BiFunction<R1, R2, R> finisher) {
    EnumSet<Characteristics> c = EnumSet.noneOf(Characteristics.class);
    c.addAll(c1.characteristics());
    c.retainAll(c2.characteristics());
    c.remove(Characteristics.IDENTITY_FINISH);
    return Collector.of(() -> new Object[] {c1.supplier().get(), c2.supplier().get()},
            (acc, v) -> {
                c1.accumulator().accept((A1)acc[0], v);
                c2.accumulator().accept((A2)acc[1], v);
            },
            (acc1, acc2) -> {
                acc1[0] = c1.combiner().apply((A1)acc1[0], (A1)acc2[0]);
                acc1[1] = c2.combiner().apply((A2)acc1[1], (A2)acc2[1]);
                return acc1;
            },
            acc -> {
                R1 r1 = c1.finisher().apply((A1)acc[0]);
                R2 r2 = c2.finisher().apply((A2)acc[1]);
                return finisher.apply(r1, r2);
            }, c.toArray(new Characteristics[c.size()]));
}
After that the actual operation may look like this:
Map<Route, Long> result = routes.stream()
        .collect(Collectors.groupingBy(Function.identity(),
            pairing(Collectors.maxBy(Comparator.comparingLong(Route::getLastUpdated)),
                    Collectors.counting(),
                    (route, count) -> new AbstractMap.SimpleEntry<>(route.get(), count))
            ))
        .values().stream().collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
Update: such a collector is available in my StreamEx library: MoreCollectors.pairing(). A similar collector is also implemented in the jOOL library, so you can use Tuple.collectors instead of pairing.
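For readers on Java 12 or later, the standard library's Collectors.teeing combines two collectors in much the same way as the pairing method above. The following is only a sketch under that assumption, not part of the original answer; the Route class is a hypothetical simplified stand-in, and TeeingRouteCount and countByLatest are names invented for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

class TeeingRouteCount {

    // Hypothetical, simplified stand-in for the question's Route.
    static final class Route {
        final String start;
        final String end;
        final long lastUpdated;

        Route(String start, String end, long lastUpdated) {
            this.start = start;
            this.end = end;
            this.lastUpdated = lastUpdated;
        }

        long getLastUpdated() {
            return lastUpdated;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Route)) return false;
            Route r = (Route) o;
            return start.equals(r.start) && end.equals(r.end);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    // Requires Java 12+: teeing feeds every element of a group to both
    // downstream collectors (maxBy and counting) and merges their results.
    static Map<Route, Long> countByLatest(List<Route> routes) {
        return routes.stream()
                .collect(Collectors.groupingBy(Function.identity(),
                        Collectors.teeing(
                                Collectors.maxBy(Comparator.comparingLong(Route::getLastUpdated)),
                                Collectors.counting(),
                                (latest, count) ->
                                        new SimpleEntry<Route, Long>(latest.get(), count))))
                .values().stream()
                .collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
    }

    public static void main(String[] args) {
        List<Route> routes = Arrays.asList(
                new Route("A", "B", 1200L),
                new Route("C", "D", 1800L),
                new Route("A", "B", 1700L));
        countByLatest(routes).forEach((route, count) ->
                System.out.println(route.start + "->" + route.end
                        + " @" + route.lastUpdated + " x" + count));
    }
}
```

The second stream over values() is still needed because groupingBy fixes the map key before the downstream collector has seen the whole group.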
Answer by Stuart Marks
In principle it seems like this ought to be doable in one pass. The usual wrinkle is that this requires an ad-hoc tuple or pair, in this case with a Route and a count. Since Java lacks these, we end up using an Object array of length 2 (as shown in Tagir Valeev's answer), or AbstractMap.SimpleImmutableEntry, or a hypothetical Pair<A,B> class.
The alternative is to write a little value class that holds a Route and a count. Of course there's some pain in doing this, but in this case I think it pays off because it provides a place to put the combining logic. That in turn simplifies the stream operation.
Here's the value class containing a Route and a count:
class RouteCount {
    final Route route;
    final long count;

    private RouteCount(Route r, long c) {
        this.route = r;
        count = c;
    }

    public static RouteCount fromRoute(Route r) {
        return new RouteCount(r, 1L);
    }

    public static RouteCount combine(RouteCount rc1, RouteCount rc2) {
        Route recent;
        if (rc1.route.getLastUpdated() > rc2.route.getLastUpdated()) {
            recent = rc1.route;
        } else {
            recent = rc2.route;
        }
        return new RouteCount(recent, rc1.count + rc2.count);
    }
}
Pretty straightforward, but notice the combine method. It combines two RouteCount values by choosing the Route that's been updated more recently and using the sum of the counts. Now that we have this value class, we can write a one-pass stream to get the result we want:
Map<Route, RouteCount> counted = routes.stream()
        .collect(groupingBy(route -> route,
                collectingAndThen(
                    mapping(RouteCount::fromRoute, reducing(RouteCount::combine)),
                    Optional::get)));
Like other answers, this groups the routes into equivalence classes based on the starting and ending cell. The actual Route instance used as the key isn't significant; it's just a representative of its class. The value will be a single RouteCount that contains the Route instance that has been updated most recently, along with the count of equivalent Route instances.
The way this works is that each Route instance that has the same start and end cells is fed into the downstream collector of groupingBy. This mapping collector maps the Route instance into a RouteCount instance, then passes it to a reducing collector that reduces the instances using the combining logic described above. The and-then portion of collectingAndThen extracts the value from the Optional<RouteCount> that the reducing collector produces.
(Normally a bare get is dangerous, but we don't get to this collector at all unless there's at least one value available. So get is safe in this case.)
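To make the one-pass pipeline described above concrete, here is a compact, self-contained sketch. The Route class is a hypothetical simplified stand-in, and rekeyByLatest is an extra helper assumed for this example (it is not in the answer) that reshapes the result into the Map<Route, Long> the question asks for.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;

class OnePassRouteCount {

    // Hypothetical, simplified stand-in for the question's Route.
    static final class Route {
        final String start;
        final String end;
        final long lastUpdated;

        Route(String start, String end, long lastUpdated) {
            this.start = start;
            this.end = end;
            this.lastUpdated = lastUpdated;
        }

        long getLastUpdated() {
            return lastUpdated;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Route)) return false;
            Route r = (Route) o;
            return start.equals(r.start) && end.equals(r.end);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    // Value class holding the most recently updated route and a running count.
    static final class RouteCount {
        final Route route;
        final long count;

        private RouteCount(Route route, long count) {
            this.route = route;
            this.count = count;
        }

        static RouteCount fromRoute(Route r) {
            return new RouteCount(r, 1L);
        }

        static RouteCount combine(RouteCount a, RouteCount b) {
            Route recent = a.route.getLastUpdated() > b.route.getLastUpdated() ? a.route : b.route;
            return new RouteCount(recent, a.count + b.count);
        }
    }

    // One pass: each route becomes a RouteCount of 1, and equal routes are reduced together.
    static Map<Route, RouteCount> countOnePass(List<Route> routes) {
        return routes.stream()
                .collect(Collectors.groupingBy(route -> route,
                        Collectors.collectingAndThen(
                                Collectors.mapping(RouteCount::fromRoute,
                                        Collectors.reducing(RouteCount::combine)),
                                Optional::get)));
    }

    // Assumed helper: re-key each entry by its most recently updated route.
    static Map<Route, Long> rekeyByLatest(Map<Route, RouteCount> counted) {
        return counted.values().stream()
                .collect(Collectors.toMap(rc -> rc.route, rc -> rc.count));
    }

    public static void main(String[] args) {
        List<Route> routes = Arrays.asList(
                new Route("A", "B", 1200L),
                new Route("C", "D", 1800L),
                new Route("A", "B", 1700L));
        rekeyByLatest(countOnePass(routes)).forEach((route, count) ->
                System.out.println(route.start + "->" + route.end
                        + " @" + route.lastUpdated + " x" + count));
    }
}
```

If the map key does not need to be the latest route, the RouteCount values alone already carry everything, and the re-keying step can be skipped.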
Answer by Mati
I changed equals and hashCode to depend only on the start cell and end cell.
@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    Cell cell = (Cell) o;
    if (a != cell.a) return false;
    if (b != cell.b) return false;
    return true;
}

@Override
public int hashCode() {
    int result = a;
    result = 31 * result + b;
    return result;
}
My solution looks like this:
Map<Route, Long> routesCounted = routes.stream()
        .sorted((r1,r2)-> (int)(r2.lastUpdated - r1.lastUpdated))
        .collect(Collectors.groupingBy(gr -> gr, Collectors.counting()));
Of course, the cast to int should be replaced with something more appropriate.
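One possible replacement for the cast is a comparator built with Comparator.comparingLong, which avoids truncating large differences between timestamps. The sketch below assumes a hypothetical simplified Route with a getLastUpdated() accessor. It also relies on the grouping map keeping the first key it encounters for each group, which is how HashMap behaves in a sequential stream but is not something the groupingBy documentation promises, so treat it as an assumption worth verifying.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class SortedGroupingSketch {

    // Hypothetical, simplified stand-in for the question's Route.
    static final class Route {
        final String start;
        final String end;
        final long lastUpdated;

        Route(String start, String end, long lastUpdated) {
            this.start = start;
            this.end = end;
            this.lastUpdated = lastUpdated;
        }

        long getLastUpdated() {
            return lastUpdated;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Route)) return false;
            Route r = (Route) o;
            return start.equals(r.start) && end.equals(r.end);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end);
        }
    }

    // Sort newest first so the first route seen in each group is the latest one.
    // comparingLong compares the long values directly, with no subtraction or cast.
    static Map<Route, Long> countSortedFirst(List<Route> routes) {
        return routes.stream()
                .sorted(Comparator.comparingLong(Route::getLastUpdated).reversed())
                .collect(Collectors.groupingBy(r -> r, Collectors.counting()));
    }

    public static void main(String[] args) {
        long far = 1L + (1L << 32); // a difference this large becomes 0 when cast to int
        List<Route> routes = Arrays.asList(
                new Route("A", "B", 1L),
                new Route("A", "B", far),
                new Route("C", "D", 5L));
        countSortedFirst(routes).forEach((route, count) ->
                System.out.println(route.start + "->" + route.end
                        + " @" + route.lastUpdated + " x" + count));
    }
}
```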