How to dynamically do filtering in Java 8?

Question

Asked by Frank

I know that in Java 8 I can do filtering like this:

List<User> olderUsers = users.stream().filter(u -> u.age > 30).collect(Collectors.toList());

But what if I have a collection and half a dozen filtering criteria, and I want to test combinations of those criteria?

For example, I have a collection of objects and the following criteria:

<1> Size
<2> Weight
<3> Length
<4> Top 50% by a certain order
<5> Top 20% by another certain ratio
<6> True or false by yet another criterion

And I want to test combinations of the above criteria, something like:

<1> -> <2> -> <3> -> <4> -> <5>
<1> -> <2> -> <3> -> <5> -> <4>
<1> -> <2> -> <5> -> <4> -> <3>
...
<1> -> <5> -> <3> -> <4> -> <2>
<3> -> <2> -> <1> -> <4> -> <5>
...
<5> -> <4> -> <3> -> <2> -> <1>

If each testing order may give me different results, how do I write a loop that automatically filters through all the combinations?

What I can think of is to use another method that generates the testing order, like the following:

int[][] getTestOrder(int criteriaCount)
{
 ...
}

So if the criteriaCount is 2, it will return: {{1,2},{2,1}}
If the criteriaCount is 3, it will return: {{1,2,3},{1,3,2},{2,1,3},{2,3,1},{3,1,2},{3,2,1}}
...
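
The body of getTestOrder is left elided above. As a minimal sketch of one way to fill it in, assuming plain recursive backtracking and the lexicographic output order shown in the examples (the class name TestOrders and the helper permute are hypothetical, not from the question):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TestOrders {
    // Returns every ordering of the criterion numbers 1..criteriaCount,
    // in lexicographic order, e.g. 2 -> {{1,2},{2,1}}.
    static int[][] getTestOrder(int criteriaCount) {
        List<int[]> out = new ArrayList<>();
        permute(new int[criteriaCount], new boolean[criteriaCount + 1], 0, out);
        return out.toArray(new int[0][]);
    }

    // Fills position pos with each number not used so far, then recurses.
    private static void permute(int[] current, boolean[] used, int pos, List<int[]> out) {
        if (pos == current.length) {
            out.add(current.clone());   // copy, because current is reused
            return;
        }
        for (int n = 1; n <= current.length; n++) {
            if (!used[n]) {
                used[n] = true;
                current[pos] = n;
                permute(current, used, pos + 1, out);
                used[n] = false;        // backtrack
            }
        }
    }

    public static void main(String[] args) {
        // prints [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
        System.out.println(Arrays.deepToString(getTestOrder(3)));
    }
}
```

The number of orderings grows as n!, so six criteria already produce 720 test orders.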

But then how do I implement this most efficiently, using the concise filtering expressions that come with Java 8?

Answer 1

Accepted answer by Stuart Marks

Interesting problem. There are several things going on here. No doubt this could be solved in less than half a page of Haskell or Lisp, but this is Java, so here we go....

One issue is that we have a variable number of filters, whereas most of the examples that have been shown illustrate fixed pipelines.

Another issue is that some of the OP's "filters" are context sensitive, such as "top 50% by a certain order". This can't be done with a simple filter(predicate) construct on a stream.

The key is to realize that, while lambdas allow functions to be passed as arguments (to good effect), they can also be stored in data structures, and computations can be performed on them. The most common computation is to take multiple functions and compose them.

Assume that the values being operated on are instances of Widget, which is a POJO that has some obvious getters:

class Widget {
    String name() { ... }
    int length() { ... }
    double weight() { ... }

    // constructors, fields, toString(), etc.
}

Let's start off with the first issue and figure out how to operate with a variable number of simple predicates. We can create a list of predicates like this:

List<Predicate<Widget>> allPredicates = Arrays.asList(
    w -> w.length() >= 10,
    w -> w.weight() > 40.0,
    w -> w.name().compareTo("c") > 0);

Given this list, we can permute them (probably not useful, since they're order independent) or select any subset we want. Let's say we just want to apply all of them. How do we apply a variable number of predicates to a stream? There is a Predicate.and() method that takes two predicates and combines them using a logical and, returning a single predicate. So we could take the first predicate and write a loop that combines it with the successive predicates to build up a single predicate that's a composite and of them all:

Predicate<Widget> compositePredicate = allPredicates.get(0);
for (int i = 1; i < allPredicates.size(); i++) {
    compositePredicate = compositePredicate.and(allPredicates.get(i));
}

This works, but it fails if the list is empty, and since we're doing functional programming now, mutating a variable in a loop is déclassé. But lo! This is a reduction! We can reduce all the predicates over the and operator to get a single composite predicate, like this:

Predicate<Widget> compositePredicate =
    allPredicates.stream()
                 .reduce(w -> true, Predicate::and);

(Credit: I learned this technique from @venkat_s. If you ever get a chance, go see him speak at a conference. He's good.)

Note the use of w -> true as the identity value of the reduction. (This could also be used as the initial value of compositePredicate for the loop, which would fix the zero-length list case.)

Now that we have our composite predicate, we can write out a short pipeline that simply applies the composite predicate to the widgets:

widgetList.stream()
          .filter(compositePredicate)
          .forEach(System.out::println);
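
The fragments above can be assembled into one runnable sketch. The Widget class body, the sample data, and the helper names allOf, namesMatching, samplePredicates and sampleWidgets are assumptions made for illustration; only the predicates and the reduce call come from the text.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateReduceDemo {
    // Hypothetical stand-in for the Widget POJO described above.
    static class Widget {
        private final String name;
        private final int length;
        private final double weight;
        Widget(String name, int length, double weight) {
            this.name = name; this.length = length; this.weight = weight;
        }
        String name() { return name; }
        int length() { return length; }
        double weight() { return weight; }
    }

    // Folds any number of predicates into one with logical AND;
    // an empty list yields a predicate that accepts everything.
    static Predicate<Widget> allOf(List<Predicate<Widget>> predicates) {
        return predicates.stream().reduce(w -> true, Predicate::and);
    }

    static List<String> namesMatching(List<Widget> widgets, List<Predicate<Widget>> predicates) {
        return widgets.stream()
                      .filter(allOf(predicates))
                      .map(Widget::name)
                      .collect(Collectors.toList());
    }

    static List<Predicate<Widget>> samplePredicates() {
        return Arrays.asList(
            w -> w.length() >= 10,
            w -> w.weight() > 40.0,
            w -> w.name().compareTo("c") > 0);
    }

    static List<Widget> sampleWidgets() {
        return Arrays.asList(
            new Widget("a", 5, 50.0),    // too short
            new Widget("d", 12, 45.0),
            new Widget("e", 15, 30.0),   // too light
            new Widget("f", 20, 60.0));
    }

    public static void main(String[] args) {
        System.out.println(namesMatching(sampleWidgets(), samplePredicates()));        // prints [d, f]
        System.out.println(namesMatching(sampleWidgets(), Collections.emptyList()));   // prints [a, d, e, f]
    }
}
```

The second call shows the empty-list case: with w -> true as the identity, no predicates means nothing is filtered out.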

Context Sensitive Filters

Now let's consider what I referred to as a "context sensitive" filter, which is represented by the example like "top 50% in a certain order", say the top 50% of widgets by weight. "Context sensitive" isn't the best term for this but it's what I've got at the moment, and it is somewhat descriptive in that it's relative to the number of elements in the stream up to this point.

How would we implement something like this using streams? Unless somebody comes up with something really clever, I think we have to collect the elements somewhere first (say, in a list) before we can emit the first element to the output. It's kind of like sorted() in a pipeline, which can't tell which is the first element to output until it has read every single input element and has sorted them.

The straightforward approach to finding the top 50% of widgets by weight, using streams, would look something like this:

List<Widget> temp =
    list.stream()
        .sorted(comparing(Widget::weight).reversed())
        .collect(toList());
temp.stream()
    .limit((long)(temp.size() * 0.5))
    .forEach(System.out::println);

This isn't complicated, but it's a bit cumbersome as we have to collect the elements into a list and assign it to a variable, in order to use the list's size in the 50% computation.

This is limiting, though, in that it's a "static" representation of this kind of filtering. How would we chain this into a stream with a variable number of elements (other filters or criteria) like we did with the predicates?

An important observation is that this code does its actual work in between the consumption of a stream and the emitting of a stream. It happens to have a collector in the middle, but if you chain a stream to its front and chain stuff off its back end, nobody is the wiser. In fact, the standard stream pipeline operations like map and filter each take a stream as input and emit a stream as output. So we can write a function kind of like this ourselves:

Stream<Widget> top50PercentByWeight(Stream<Widget> stream) {
    List<Widget> temp =
        stream.sorted(comparing(Widget::weight).reversed())
              .collect(toList());
    return temp.stream()
               .limit((long)(temp.size() * 0.5));
}

A similar example might be to find the shortest three widgets:

Stream<Widget> shortestThree(Stream<Widget> stream) {
    return stream.sorted(comparing(Widget::length))
                 .limit(3);
}

Now we can write something that combines these stateful filters with ordinary stream operations:

shortestThree(
    top50PercentByWeight(
        widgetList.stream()
                  .filter(w -> w.length() >= 10)))
.forEach(System.out::println);
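
A self-contained sketch of the two stream-to-stream functions and the nested call, with everything made static so it runs from main. The Widget body, the sample data, and the run and sampleWidgets helpers are assumptions for illustration; comparingDouble and comparingInt are used here in place of the statically imported comparing to avoid boxing.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StatefulFilterDemo {
    // Hypothetical stand-in for the Widget POJO described above.
    static class Widget {
        private final String name;
        private final int length;
        private final double weight;
        Widget(String name, int length, double weight) {
            this.name = name; this.length = length; this.weight = weight;
        }
        String name() { return name; }
        int length() { return length; }
        double weight() { return weight; }
    }

    // Must see every element before it knows how many make up 50%.
    static Stream<Widget> top50PercentByWeight(Stream<Widget> stream) {
        List<Widget> temp =
            stream.sorted(Comparator.comparingDouble(Widget::weight).reversed())
                  .collect(Collectors.toList());
        return temp.stream().limit((long) (temp.size() * 0.5));
    }

    static Stream<Widget> shortestThree(Stream<Widget> stream) {
        return stream.sorted(Comparator.comparingInt(Widget::length)).limit(3);
    }

    static List<Widget> sampleWidgets() {
        return Arrays.asList(
            new Widget("a", 8, 10.0),  new Widget("b", 12, 50.0), new Widget("c", 30, 20.0),
            new Widget("d", 15, 70.0), new Widget("e", 11, 65.0), new Widget("f", 25, 90.0),
            new Widget("g", 18, 40.0), new Widget("h", 14, 80.0), new Widget("i", 22, 30.0));
    }

    // Reads inside-out: filter first, then top 50% by weight, then shortest three.
    static List<String> run(List<Widget> widgets) {
        return shortestThree(
                   top50PercentByWeight(
                       widgets.stream().filter(w -> w.length() >= 10)))
               .map(Widget::name)
               .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run(sampleWidgets()));   // prints [e, h, d]
    }
}
```

With this data, eight widgets pass the length filter, the four heaviest are f, h, d and e, and the three shortest of those are e, h and d.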

This works, but is kind of lousy because it reads "inside-out" and backwards. The stream source is widgetList, which is streamed and filtered through an ordinary predicate. Now, going backwards, the top 50% filter is applied, then the shortest-three filter is applied, and finally the stream operation forEach is applied at the end. This works but is quite confusing to read. And it's still static. What we really want is to have a way to put these new filters inside a data structure that we can manipulate, for example, to run all the permutations, as in the original question.

A key insight at this point is that these new kinds of filters are really just functions, and we have functional interface types in Java which let us represent functions as objects, to manipulate them, store them in data structures, compose them, etc. The functional interface type that takes an argument of some type and returns a value of the same type is UnaryOperator. The argument and return type in this case is Stream<Widget>. If we were to take method references such as this::shortestThree or this::top50PercentByWeight, the types of the resulting objects would be

UnaryOperator<Stream<Widget>>

If we were to put these into a list, the type of that list would be

List<UnaryOperator<Stream<Widget>>>
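
To make that type concrete, here is a small sketch that stores method references in such a list and applies them in order. It uses Stream<Integer> rather than Stream<Widget> to stay short, and the names smallestThree, evensOnly and applyInOrder are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamOperatorListDemo {
    static Stream<Integer> smallestThree(Stream<Integer> s) {
        return s.sorted().limit(3);
    }

    static Stream<Integer> evensOnly(Stream<Integer> s) {
        return s.filter(n -> n % 2 == 0);
    }

    // Feeds the stream through each operator in list order.
    static List<Integer> applyInOrder(List<UnaryOperator<Stream<Integer>>> ops, List<Integer> input) {
        Stream<Integer> s = input.stream();
        for (UnaryOperator<Stream<Integer>> op : ops) {
            s = op.apply(s);
        }
        return s.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(9, 4, 7, 2, 6, 1);

        List<UnaryOperator<Stream<Integer>>> limitThenFilter = Arrays.asList(
            StreamOperatorListDemo::smallestThree,
            StreamOperatorListDemo::evensOnly);
        List<UnaryOperator<Stream<Integer>>> filterThenLimit = Arrays.asList(
            StreamOperatorListDemo::evensOnly,
            StreamOperatorListDemo::smallestThree);

        System.out.println(applyInOrder(limitThenFilter, input));   // prints [2, 4]
        System.out.println(applyInOrder(filterThenLimit, input));   // prints [2, 4, 6]
    }
}
```

The two orderings give different results, which is why the order of these stream-to-stream functions matters while the order of plain predicates does not.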

Ugh! Three levels of nested generics is too much for me. (But Aleksey Shipilev did once show me some code that used four levels of nested generics.) The solution for too many generics is to define our own type. Let's call one of our new things a Criterion. It turns out that there's little value to be gained by making our new functional interface type be related to UnaryOperator, so our definition can simply be:

@FunctionalInterface
public interface Criterion {
    Stream<Widget> apply(Stream<Widget> s);
}

Now we can create a list of criteria like this:

List<Criterion> criteria = Arrays.asList(
    this::shortestThree,
    this::lengthGreaterThan20
);

(We'll figure out how to use this list below.) This is a step forward, since we can now manipulate the list dynamically, but it's still somewhat limiting. First, it can't be combined with ordinary predicates. Second, there's a lot of hard-coded values here, such as the shortest three: how about two or four? How about a different criterion than length? What we really want is a function that creates these Criterion objects for us. This is easy with lambdas.

This creates a criterion that selects the top N widgets, given a comparator:

Criterion topN(Comparator<Widget> cmp, long n) {
    return stream -> stream.sorted(cmp).limit(n);
}

This creates a criterion that selects the top fraction of widgets, given a comparator; pct is a fraction between 0 and 1 (for example, 0.50 for the top 50%):

Criterion topPercent(Comparator<Widget> cmp, double pct) {
    return stream -> {
        List<Widget> temp =
            stream.sorted(cmp).collect(toList());
        return temp.stream()
                   .limit((long)(temp.size() * pct));
    };
}

And this creates a criterion from an ordinary predicate:

Criterion fromPredicate(Predicate<Widget> pred) {
    return stream -> stream.filter(pred);
}

Now we have a very flexible way of creating criteria and putting them into a list, where they can be subsetted or permuted or whatever:

List<Criterion> criteria = Arrays.asList(
    fromPredicate(w -> w.length() > 10),                    // longer than 10
    topN(comparing(Widget::length).reversed(), 4L),         // longest 4 (reversed: topN keeps the first n in sort order)
    topPercent(comparing(Widget::weight).reversed(), 0.50)  // heaviest 50%
);

Once we have a list of Criterion objects, we need to figure out a way to apply all of them. Once again, we can use our friend reduce to combine all of them into a single Criterion object:

Criterion allCriteria =
    criteria.stream()
            .reduce(c -> c, (c1, c2) -> (s -> c2.apply(c1.apply(s))));

The identity function c -> c is clear, but the second arg is a bit tricky. Given a stream s, we first apply Criterion c1, then Criterion c2. This is wrapped in a lambda that takes two Criterion objects c1 and c2 and returns a lambda that applies the composition of c1 and c2 to a stream and returns the resulting stream.

Now that we've composed all the criteria, we can apply it to a stream of widgets like so:

allCriteria.apply(widgetList.stream())
           .forEach(System.out::println);
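
Putting the Criterion pieces together gives a sketch like the following. The factories and the reduce come from the text; the Widget body, the sample data, the static nesting, and the compose, apply, sampleCriteria and sampleWidgets helpers are assumptions for illustration. The topN call uses a reversed comparator so that it really selects the longest four.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CriterionDemo {
    // Hypothetical stand-in for the Widget POJO described above.
    static class Widget {
        private final String name;
        private final int length;
        private final double weight;
        Widget(String name, int length, double weight) {
            this.name = name; this.length = length; this.weight = weight;
        }
        String name() { return name; }
        int length() { return length; }
        double weight() { return weight; }
    }

    @FunctionalInterface
    interface Criterion {
        Stream<Widget> apply(Stream<Widget> s);
    }

    static Criterion topN(Comparator<Widget> cmp, long n) {
        return stream -> stream.sorted(cmp).limit(n);
    }

    static Criterion topPercent(Comparator<Widget> cmp, double pct) {
        return stream -> {
            List<Widget> temp = stream.sorted(cmp).collect(Collectors.toList());
            return temp.stream().limit((long) (temp.size() * pct));
        };
    }

    static Criterion fromPredicate(Predicate<Widget> pred) {
        return stream -> stream.filter(pred);
    }

    // Chains the criteria left to right; an empty list leaves the stream unchanged.
    static Criterion compose(List<Criterion> criteria) {
        return criteria.stream()
                       .reduce(s -> s, (c1, c2) -> s -> c2.apply(c1.apply(s)));
    }

    static List<String> apply(List<Criterion> criteria, List<Widget> widgets) {
        return compose(criteria).apply(widgets.stream())
                                .map(Widget::name)
                                .collect(Collectors.toList());
    }

    static List<Criterion> sampleCriteria() {
        return Arrays.asList(
            fromPredicate(w -> w.length() > 10),                                // longer than 10
            topN(Comparator.comparingInt(Widget::length).reversed(), 4L),       // longest 4
            topPercent(Comparator.comparingDouble(Widget::weight).reversed(), 0.50)); // heaviest 50%
    }

    static List<Widget> sampleWidgets() {
        return Arrays.asList(
            new Widget("a", 8, 10.0),  new Widget("b", 12, 50.0), new Widget("c", 30, 20.0),
            new Widget("d", 15, 70.0), new Widget("e", 11, 65.0), new Widget("f", 25, 90.0),
            new Widget("g", 18, 40.0), new Widget("h", 14, 80.0), new Widget("i", 22, 30.0));
    }

    public static void main(String[] args) {
        System.out.println(apply(sampleCriteria(), sampleWidgets()));   // prints [f, g]
    }
}
```

With this data the longest four widgets above length 10 are c, f, i and g, and the heaviest half of those is f and g.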

This is still a bit inside-out, but it's fairly well controlled. Most importantly, it addresses the original question, which is how to combine criteria dynamically. Once the Criterion objects are in a data structure, they can be selected, subsetted, permuted, or whatever as necessary, and they can all be combined in a single criterion and applied to a stream using the above techniques.
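
For the original question's "try every order" requirement, one possible sketch is to permute the list of criteria and apply each ordering in turn. To stay short it works on Stream<Integer> instead of Stream<Widget>; the three sample criteria, the names permutations, runAllOrders and sampleCriteria, and the use of a Map keyed by the order are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CriteriaPermutationDemo {
    // Same shape as Criterion above, but over Integer to keep the sketch short.
    @FunctionalInterface
    interface Criterion {
        Stream<Integer> apply(Stream<Integer> s);
    }

    // Returns all orderings of the given items (n! lists for n items).
    static <T> List<List<T>> permutations(List<T> items) {
        List<List<T>> result = new ArrayList<>();
        if (items.isEmpty()) {
            result.add(new ArrayList<>());
            return result;
        }
        for (int i = 0; i < items.size(); i++) {
            List<T> rest = new ArrayList<>(items);
            T head = rest.remove(i);              // remove(int index)
            for (List<T> tail : permutations(rest)) {
                List<T> perm = new ArrayList<>();
                perm.add(head);
                perm.addAll(tail);
                result.add(perm);
            }
        }
        return result;
    }

    static Criterion compose(List<Criterion> criteria) {
        return criteria.stream()
                       .reduce(s -> s, (c1, c2) -> s -> c2.apply(c1.apply(s)));
    }

    // Applies every ordering of the named criteria to a fresh stream of the input.
    static Map<String, List<Integer>> runAllOrders(Map<String, Criterion> named, List<Integer> input) {
        Map<String, List<Integer>> results = new LinkedHashMap<>();
        for (List<String> order : permutations(new ArrayList<>(named.keySet()))) {
            List<Criterion> criteria = order.stream().map(named::get).collect(Collectors.toList());
            results.put(String.join(" -> ", order),
                        compose(criteria).apply(input.stream()).collect(Collectors.toList()));
        }
        return results;
    }

    static Map<String, Criterion> sampleCriteria() {
        Map<String, Criterion> named = new LinkedHashMap<>();
        named.put("even", s -> s.filter(n -> n % 2 == 0));
        named.put("smallest3", s -> s.sorted().limit(3));
        named.put("largestHalf", s -> {
            List<Integer> temp = s.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            return temp.stream().limit(temp.size() / 2);
        });
        return named;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        runAllOrders(sampleCriteria(), input)
            .forEach((order, result) -> System.out.println(order + " : " + result));
    }
}
```

A new stream is created from the input for every ordering, because a stream can be consumed only once.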

The functional programming gurus are probably saying "He just reinvented ... !" which is probably true. I'm sure this has probably been invented somewhere already, but it's new to Java, because prior to lambda, it just wasn't feasible to write Java code that uses these techniques.

Update 2014-04-07

I've cleaned up and posted the complete sample code in a gist.

Answer 2

Answered by Raul Guiu

We could add a counter with a map so we know how many elements we have after the filters. I created a helper class that has a method that counts and returns the same object passed:

class DoNothingButCount<T> {
    AtomicInteger i;
    public DoNothingButCount() {
        i = new AtomicInteger(0);
    }
    public T pass(T p) {
        i.incrementAndGet();
        return p;
    }
}

public void runDemo() {
    List<Person>persons = create(100);
    DoNothingButCount<Person> counter = new DoNothingButCount<>();

    persons.stream().filter(u -> u.size > 12).filter(u -> u.weitght > 12).
            map((p) -> counter.pass(p)).
            sorted((p1, p2) -> p1.age - p2.age).
            collect(Collectors.toList()).stream().
            limit((int) (counter.i.intValue() * 0.5)).
            sorted((p1, p2) -> p2.length - p1.length).
            limit((int) (counter.i.intValue() * 0.5 * 0.2)).forEach((p) -> System.out.println(p));
}

I had to convert the stream to a list and back to a stream in the middle, because otherwise the limit would use the initial count. It's all a bit "hackish", but it's all I could think of.
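
The reason the intermediate list is needed can be seen in a small sketch: the argument to limit is evaluated while the pipeline is being built, so without a terminal collect in between, the counter has not counted anything yet. The class and method names here are hypothetical, and plain integers stand in for Person objects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class LimitEvaluationDemo {
    // No intermediate collect: counter.get() is read before any element flows.
    static List<Integer> withoutCollect(List<Integer> input) {
        AtomicInteger counter = new AtomicInteger(0);
        return input.stream()
                    .filter(n -> n > 2)
                    .map(n -> { counter.incrementAndGet(); return n; })
                    .limit((long) (counter.get() * 0.5))    // counter is still 0 here
                    .collect(Collectors.toList());
    }

    // Intermediate collect: the first pipeline runs to completion,
    // so the counter holds the filtered element count when limit is built.
    static List<Integer> withCollect(List<Integer> input) {
        AtomicInteger counter = new AtomicInteger(0);
        return input.stream()
                    .filter(n -> n > 2)
                    .map(n -> { counter.incrementAndGet(); return n; })
                    .collect(Collectors.toList())
                    .stream()
                    .limit((long) (counter.get() * 0.5))    // counter is 4 for the sample input
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(withoutCollect(input));   // prints []
        System.out.println(withCollect(input));      // prints [3, 4]
    }
}
```

Once the elements are in a list anyway, the list's own size() gives the same number without a separate counter.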

I could do it a bit differently using a Function for my mapped class:

class DoNothingButCount<T > implements Function<T, T> {
    AtomicInteger i;
    public DoNothingButCount() {
        i = new AtomicInteger(0);
    }
    public T apply(T p) {
        i.incrementAndGet();
        return p;
    }
}

The only thing that will change in the stream is:

            map((p) -> counter.pass(p)).

will become:

            map(counter).

My complete test class, including the two examples:

import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Demo2 {
    Random r = new Random();
    class Person {
        public int size, weitght,length, age;
        public Person(int s, int w, int l, int a){
            this.size = s;
            this.weitght = w;
            this.length = l;
            this.age = a;
        }
        public String toString() {
            return "P: "+this.size+", "+this.weitght+", "+this.length+", "+this.age+".";
        }
    }

    public List<Person>create(int size) {
        List<Person>persons = new ArrayList<>();
        while(persons.size()<size) {
            persons.add(new Person(r.nextInt(10)+10, r.nextInt(10)+10, r.nextInt(10)+10,r.nextInt(20)+14));
        }
        return persons;
    }

    class DoNothingButCount<T> {
        AtomicInteger i;
        public DoNothingButCount() {
            i = new AtomicInteger(0);
        }
        public T pass(T p) {
            i.incrementAndGet();
            return p;
        }
    }

    class PDoNothingButCount<T > implements Function<T, T> {
        AtomicInteger i;
        public PDoNothingButCount() {
            i = new AtomicInteger(0);
        }
        public T apply(T p) {
            i.incrementAndGet();
            return p;
        }
    }

    public void runDemo() {
        List<Person>persons = create(100);
        PDoNothingButCount<Person> counter = new PDoNothingButCount<>();

        persons.stream().filter(u -> u.size > 12).filter(u -> u.weitght > 12).
                map(counter).
                sorted((p1, p2) -> p1.age - p2.age).
                collect(Collectors.toList()).stream().
                limit((int) (counter.i.intValue() * 0.5)).
                sorted((p1, p2) -> p2.length - p1.length).
                limit((int) (counter.i.intValue() * 0.5 * 0.2)).forEach((p) -> System.out.println(p));
    }

    public void runDemo2() {
        List<Person>persons = create(100);
        DoNothingButCount<Person> counter = new DoNothingButCount<>();

        persons.stream().filter(u -> u.size > 12).filter(u -> u.weitght > 12).
                map((p) -> counter.pass(p)).
                sorted((p1, p2) -> p1.age - p2.age).
                collect(Collectors.toList()).stream().
                limit((int) (counter.i.intValue() * 0.5)).
                sorted((p1, p2) -> p2.length - p1.length).
                limit((int) (counter.i.intValue() * 0.5 * 0.2)).forEach((p) -> System.out.println(p));
    }

    public static void main(String str[]) {
        Demo2 demo = new Demo2();
        System.out.println("Demo 2:");
        demo.runDemo2();
        System.out.println("Demo 1:");
        demo.runDemo();

    }
}
