How do you query object collections in Java (Criteria/SQL-like)?
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/93417/
Asked by stian
Suppose you have a collection of a few hundred in-memory objects and you need to query this List to return objects matching some SQL or Criteria like query. For example, you might have a List of Car objects and you want to return all cars made during the 1960s, with a license plate that starts with AZ, ordered by the name of the car model.
I know about JoSQL. Has anyone used it, or does anyone have experience with other or homegrown solutions?
Accepted answer by Eric Weilnau
I have used Apache Commons JXPath in a production application. It allows you to apply XPath expressions to graphs of objects in Java.
Answer by Bill the Lizard
I would use a Comparator that takes a range of years and license plate pattern as input parameters. Then just iterate through your collection and copy the objects that match. You'd likely end up making a whole package of custom Comparators with this approach.
Answer by Steve Moyer
If you need a single concrete match, you can have the class implement Comparator, then create a standalone object with all the hashed fields included and use it to return the index of the match. When you want to find more than one (potentially) object in the collection, you'll have to turn to a library like JoSQL (which has worked well in the trivial cases I've used it for).
In general, I tend to embed Derby into even my small applications, use Hibernate annotations to define my model classes and let Hibernate deal with caching schemes to keep everything fast.
Answer by Yuval
The Comparator option is not bad, especially if you use anonymous classes (so as not to create redundant classes in the project), but eventually, when you look at the flow of comparisons, it's pretty much just like looping over the entire collection yourself, specifying exactly the conditions for matching items:
for (Car car : cars) {
    // keep cars built in the 1960s whose license plate starts with "AZ"
    if (1959 < car.getYear() && 1970 > car.getYear() &&
            car.getLicense().startsWith("AZ")) {
        result.add(car);
    }
}
Then there's the sorting... that might be a pain in the backside, but luckily there's the Collections class and its sort methods, one of which receives a Comparator...
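The loop and the sort described above can be combined into one small program. This is only a minimal sketch: the Car class, its fields (model, year, license) and the class name LoopAndSortExample are assumptions made for illustration, not something given in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class LoopAndSortExample {

    // Hypothetical Car class; the field names are assumptions for this sketch.
    static class Car {
        private final String model;
        private final int year;
        private final String license;

        Car(String model, int year, String license) {
            this.model = model;
            this.year = year;
            this.license = license;
        }

        String getModel() { return model; }
        int getYear() { return year; }
        String getLicense() { return license; }
    }

    // Returns cars built in the 1960s whose plate starts with "AZ",
    // ordered by model name. The input list is not modified.
    static List<Car> query(List<Car> cars) {
        List<Car> result = new ArrayList<Car>();
        for (Car car : cars) {
            if (car.getYear() >= 1960 && car.getYear() <= 1969
                    && car.getLicense().startsWith("AZ")) {
                result.add(car);
            }
        }
        // Collections.sort with an anonymous Comparator, as the answer suggests.
        Collections.sort(result, new Comparator<Car>() {
            @Override
            public int compare(Car a, Car b) {
                return a.getModel().compareTo(b.getModel());
            }
        });
        return result;
    }

    public static void main(String[] args) {
        List<Car> cars = Arrays.asList(
                new Car("Mustang", 1965, "AZ-1234"),
                new Car("Beetle", 1962, "AZ-9876"),
                new Car("Civic", 1975, "AZ-5555"),
                new Car("Mini", 1961, "CA-0001"));
        for (Car car : query(cars)) {
            System.out.println(car.getModel()); // prints Beetle, then Mustang
        }
    }
}
```

Copying the matches into a new list before sorting keeps the original collection untouched, which matters if other code depends on its order.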
Answer by joev
Continuing the Comparator theme, you may also want to take a look at the Google Collections API. In particular, they have an interface called Predicate, which serves a similar role to Comparator, in that it is a simple interface that can be used by a filtering method, like Sets.filter. They include a whole bunch of composite predicate implementations, to do ANDs, ORs, etc.
Depending on the size of your data set, it may make more sense to use this approach than a SQL or external relational database approach.
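To illustrate the predicate idea without pulling in a library, here is a rough sketch that uses java.util.function.Predicate from the JDK (Java 8 and later) in place of the Google Collections interface. The filter helper, the Car class and the two predicate constants are hypothetical names introduced for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class PredicateFilterExample {

    // Hypothetical Car class; the field names are assumptions for this sketch.
    static class Car {
        final String model;
        final int year;
        final String license;

        Car(String model, int year, String license) {
            this.model = model;
            this.year = year;
            this.license = license;
        }
    }

    // Two simple predicates that can be reused and combined.
    static final Predicate<Car> MADE_IN_SIXTIES =
            car -> car.year >= 1960 && car.year <= 1969;
    static final Predicate<Car> PLATE_STARTS_WITH_AZ =
            car -> car.license.startsWith("AZ");

    // Generic filter: copies every element accepted by the predicate.
    static <T> List<T> filter(List<T> items, Predicate<? super T> predicate) {
        List<T> result = new ArrayList<>();
        for (T item : items) {
            if (predicate.test(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Car> cars = Arrays.asList(
                new Car("Mustang", 1965, "AZ-1234"),
                new Car("Civic", 1975, "AZ-5555"),
                new Car("Mini", 1961, "CA-0001"),
                new Car("Golf", 1980, "TX-7777"));

        // AND: both conditions must hold.
        List<Car> both = filter(cars, MADE_IN_SIXTIES.and(PLATE_STARTS_WITH_AZ));
        // OR: either condition is enough.
        List<Car> either = filter(cars, MADE_IN_SIXTIES.or(PLATE_STARTS_WITH_AZ));

        System.out.println(both.size());   // 1
        System.out.println(either.size()); // 3
    }
}
```

The and, or and negate methods on Predicate play the role of the composite predicate implementations mentioned above.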
Answer by npgall
Filtering is one way to do this, as discussed in other answers.
Filtering is not scalable though. On the surface, time complexity would appear to be O(n) (i.e. already not scalable if the number of objects in the collection grows), but because one or more tests need to be applied to each object depending on the query, time complexity is more accurately O(n t), where t is the number of tests to apply to each object.
So performance will degrade as additional objects are added to the collection, and/or as the number of tests in the query increases.
There is another way to do this, using indexing and set theory.
One approach is to build indexes on the fields within the objects stored in your collection, which you will subsequently test in your query.
Say you have a collection of Car objects and every Car object has a field color. Say your query is the equivalent of "SELECT * FROM cars WHERE Car.color = 'blue'". You could build an index on Car.color, which would basically look like this:
'blue' -> {Car{name=blue_car_1, color='blue'}, Car{name=blue_car_2, color='blue'}}
'red' -> {Car{name=red_car_1, color='red'}, Car{name=red_car_2, color='red'}}
Then given a query WHERE Car.color = 'blue', the set of blue cars could be retrieved in O(1) time complexity. If there were additional tests in your query, you could then test each car in that candidate set to check if it matched the remaining tests in your query. Since the candidate set is likely to be significantly smaller than the entire collection, time complexity is less than O(n) (in the engineering sense, see comments below). Performance does not degrade as much when additional objects are added to the collection. But this is still not perfect, read on.
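A field index of this kind can be sketched with a plain HashMap. This is an illustration of the idea only, not CQEngine's API; the Car class, the year field used for the "remaining test", and the method names buildColorIndex and query are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ColorIndexExample {

    // Hypothetical Car class; the year field is added only to show a second test.
    static class Car {
        final String name;
        final String color;
        final int year;

        Car(String name, String color, int year) {
            this.name = name;
            this.color = color;
            this.year = year;
        }
    }

    // Builds the index once: color -> all cars with that color.
    static Map<String, List<Car>> buildColorIndex(List<Car> cars) {
        Map<String, List<Car>> index = new HashMap<>();
        for (Car car : cars) {
            index.computeIfAbsent(car.color, key -> new ArrayList<>()).add(car);
        }
        return index;
    }

    // Equivalent of: WHERE color = ? AND year >= ?
    // The hash lookup yields the candidate set; only candidates get the year test.
    static List<Car> query(Map<String, List<Car>> index, String color, int minYear) {
        List<Car> candidates = index.getOrDefault(color, Collections.<Car>emptyList());
        List<Car> result = new ArrayList<>();
        for (Car car : candidates) {
            if (car.year >= minYear) {
                result.add(car);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Car> cars = Arrays.asList(
                new Car("blue_car_1", "blue", 1962),
                new Car("blue_car_2", "blue", 1975),
                new Car("red_car_1", "red", 1980),
                new Car("red_car_2", "red", 1990));
        Map<String, List<Car>> byColor = buildColorIndex(cars);
        for (Car car : query(byColor, "blue", 1970)) {
            System.out.println(car.name); // prints blue_car_2
        }
    }
}
```

The trade-off is that the index must be kept in sync whenever cars are added, removed, or repainted; this sketch builds it once and does not handle updates.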
Another approach, is what I would refer to as a standing query index. To explain: with conventional iteration and filtering, the collection is iterated and every object is tested to see if it matches the query. So filtering is like running a query over a collection. A standing query index would be the other way around, where the collection is instead run over the query, but only once for each object in the collection, even though the collection could be queried any number of times.
A standing query index would be like registering a query with some sort of intelligent collection, such that as objects are added to and removed from the collection, the collection would automatically test each object against all of the standing queries which have been registered with it. If an object matches a standing query then the collection could add/remove it to/from a set dedicated to storing objects matching that query. Subsequently, objects matching any of the registered queries could be retrieved in O(1) time complexity.
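The standing query idea might look roughly like the following. This is a simplified sketch written for this page, not how CQEngine implements it; StandingQueryCollection and its methods registerQuery, add, remove and retrieve are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

class StandingQueryExample {

    // A collection that keeps one result set per registered query up to date.
    static class StandingQueryCollection<T> {
        private final Set<T> all = new HashSet<>();
        private final Map<String, Predicate<T>> queries = new HashMap<>();
        private final Map<String, Set<T>> matches = new HashMap<>();

        // Registers a query and tests the objects already stored against it once.
        void registerQuery(String name, Predicate<T> query) {
            Set<T> matching = new HashSet<>();
            for (T item : all) {
                if (query.test(item)) {
                    matching.add(item);
                }
            }
            queries.put(name, query);
            matches.put(name, matching);
        }

        // Each object is tested against every standing query once, on insertion.
        void add(T item) {
            if (!all.add(item)) {
                return; // already present
            }
            for (Map.Entry<String, Predicate<T>> entry : queries.entrySet()) {
                if (entry.getValue().test(item)) {
                    matches.get(entry.getKey()).add(item);
                }
            }
        }

        // Removing an object also removes it from every result set.
        void remove(T item) {
            if (!all.remove(item)) {
                return;
            }
            for (Set<T> matching : matches.values()) {
                matching.remove(item);
            }
        }

        // Retrieval is a single map lookup; no iteration over the collection.
        Set<T> retrieve(String name) {
            Set<T> matching = matches.get(name);
            return matching == null
                    ? Collections.<T>emptySet()
                    : Collections.unmodifiableSet(matching);
        }
    }

    public static void main(String[] args) {
        // Model years stand in for full Car objects to keep the sketch short.
        StandingQueryCollection<Integer> years = new StandingQueryCollection<>();
        years.registerQuery("sixties", year -> year >= 1960 && year <= 1969);
        years.add(1965);
        years.add(1975);
        System.out.println(years.retrieve("sixties")); // [1965]
    }
}
```

The cost moves from query time to write time: every add pays for one test per registered query, which suits collections that are read far more often than they are modified.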
The information above is taken from CQEngine (Collection Query Engine). This basically is a NoSQL query engine for retrieving objects from Java collections using SQL-like queries, without the overhead of iterating through the collection. It is built around the ideas above, plus some more. Disclaimer: I am the author. It's open source and in maven central. If you find it helpful please upvote this answer!
Answer by Federico Piazza
Yes, I know it's an old post, but new technologies appear every day and the answer changes over time.
I think this is a good problem to solve with LambdaJ. You can find it here: http://code.google.com/p/lambdaj/
Here is an example:
LOOK FOR ACTIVE CUSTOMERS // (Iterable version)
List<Customer> activeCustomers = new ArrayList<Customer>();
for (Customer customer : customers) {
    if (customer.isActive()) {
        activeCustomers.add(customer);
    }
}
LambdaJ version
List<Customer> activeCustomers = select(customers,
having(on(Customer.class).isActive()));
Of course, this kind of beauty has a performance cost (a little... about 2 times slower on average), but can you find more readable code?
It has many features; another example is sorting:
Sort Iterative
List<Person> sortedByAgePersons = new ArrayList<Person>(persons);
Collections.sort(sortedByAgePersons, new Comparator<Person>() {
public int compare(Person p1, Person p2) {
return Integer.valueOf(p1.getAge()).compareTo(p2.getAge());
}
});
Sort with lambda
List<Person> sortedByAgePersons = sort(persons, on(Person.class).getAge());
Update: since Java 8 you can use lambda expressions out of the box, like:
List<Customer> activeCustomers = customers.stream()
.filter(Customer::isActive)
.collect(Collectors.toList());
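The same stream style also covers the query from the original question (cars from the 1960s, plate starting with AZ, ordered by model name). The following is a sketch under assumptions: the Car class and its getters are hypothetical, since the question does not show them.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class StreamQueryExample {

    // Hypothetical Car class; the getter names are assumptions for this sketch.
    static class Car {
        private final String model;
        private final int year;
        private final String license;

        Car(String model, int year, String license) {
            this.model = model;
            this.year = year;
            this.license = license;
        }

        String getModel() { return model; }
        int getYear() { return year; }
        String getLicense() { return license; }
    }

    // WHERE year BETWEEN 1960 AND 1969 AND license LIKE 'AZ%' ORDER BY model
    static List<Car> query(List<Car> cars) {
        return cars.stream()
                .filter(car -> car.getYear() >= 1960 && car.getYear() <= 1969)
                .filter(car -> car.getLicense().startsWith("AZ"))
                .sorted(Comparator.comparing(Car::getModel))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Car> cars = Arrays.asList(
                new Car("Mustang", 1965, "AZ-1234"),
                new Car("Beetle", 1962, "AZ-9876"),
                new Car("Civic", 1975, "AZ-5555"),
                new Car("Mini", 1961, "CA-0001"));
        for (Car car : query(cars)) {
            System.out.println(car.getModel()); // prints Beetle, then Mustang
        }
    }
}
```

This is still a full scan of the list, so it shares the O(n) cost of the hand-written loop; it only changes how the query is expressed.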