java 使用 Hibernate 根据唯一键查找或插入

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/5022812/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-30 09:05:44  来源:igfitidea点击:

Find or insert based on unique key with Hibernate

javahibernateunique-constraintupsertunique-key

提问by Mike Deck

I'm trying to write a method that will return a Hibernate object based on a unique but non-primary key. If the entity already exists in the database I want to return it, but if it doesn't I want to create a new instance and save it before returning.

我正在尝试编写一个方法,该方法将返回基于唯一但非主键的 Hibernate 对象。如果实体已经存在于数据库中,我想返回它,但如果它不存在,我想创建一个新实例并在返回之前保存它。

UPDATE:Let me clarify that the application I'm writing this for is basically a batch processor of input files. The system needs to read a file line by line and insert records into the db. The file format is basically a denormalized view of several tables in our schema so what I have to do is parse out the parent record either insert it into the db so I can get a new synthetic key, or if it already exists select it. Then I can add additional associated records in other tables that have foreign keys back to that record.

更新:让我澄清一下,我正在为其编写的应用程序基本上是输入文件的批处理器。系统需要逐行读取文件并将记录插入到数据库中。文件格式基本上是我们模式中几个表的非规范化视图,所以我要做的是解析父记录,或者将其插入到数据库中,以便我可以获得一个新的合成键,或者如果它已经存在,请选择它。然后我可以在具有外键的其他表中添加其他关联记录到该记录。

The reason this gets tricky is that each file needs to be either totally imported or not imported at all, i.e. all inserts and updates done for a given file should be a part of one transaction. This is easy enough if there's only one process that's doing all the imports, but I'd like to break this up across multiple servers if possible. Because of these constraints I need to be able to stay inside one transaction, but handle the exceptions where a record already exists.

这变得棘手的原因是每个文件都需要完全导入或根本不导入,即为给定文件所做的所有插入和更新都应该是一个事务的一部分。如果只有一个进程执行所有导入,这很容易,但如果可能,我想将其拆分到多台服务器上。由于这些限制,我需要能够留在一个事务中,但处理记录已经存在的异常。

The mapped class for the parent records looks like this:

父记录的映射类如下所示:

@Entity
public class Foo {
    @Id
    @GeneratedValue(strategy = IDENTITY)
    private int id;
    @Column(unique = true)
    private String name;
    ...
}

My initial attempt at writting this method is as follows:

我最初尝试编写此方法如下:

public Foo findOrCreate(String name) {
    Foo foo = new Foo();
    foo.setName(name);
    try {
        session.save(foo)
    } catch(ConstraintViolationException e) {
        foo = session.createCriteria(Foo.class).add(eq("name", name)).uniqueResult();
    }
    return foo;
}

The problem is when the name I'm looking for exists, an org.hibernate.AssertionFailure exception is thrown by the call to uniqueResult(). The full stack trace is below:

问题是当我要查找的名称存在时,调用 uniqueResult() 会引发 org.hibernate.AssertionFailure 异常。完整的堆栈跟踪如下:

org.hibernate.AssertionFailure: null id in com.searchdex.linktracer.domain.LinkingPage entry (don't flush the Session after an exception occurs)
    at org.hibernate.event.def.DefaultFlushEntityEventListener.checkId(DefaultFlushEntityEventListener.java:82) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.event.def.DefaultFlushEntityEventListener.getValues(DefaultFlushEntityEventListener.java:190) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.event.def.DefaultFlushEntityEventListener.onFlushEntity(DefaultFlushEntityEventListener.java:147) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.event.def.AbstractFlushingEventListener.flushEntities(AbstractFlushingEventListener.java:219) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:99) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.event.def.DefaultAutoFlushEventListener.onAutoFlush(DefaultAutoFlushEventListener.java:58) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.impl.SessionImpl.autoFlushIfRequired(SessionImpl.java:1185) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1709) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.impl.CriteriaImpl.list(CriteriaImpl.java:347) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
    at org.hibernate.impl.CriteriaImpl.uniqueResult(CriteriaImpl.java:369) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]

Does anyone know what is causing this exception to be thrown? Does hibernate support a better way of accomplishing this?

有谁知道是什么导致抛出这个异常?Hibernate 是否支持更好的方法来实现这一点?

Let me also preemptively explain why I'm inserting first and then selecting if and when that fails. This needs to work in a distributed environment so I can't synchronize across the check to see if the record already exists and the insert. The easiest way to do this is to let the database handle this synchronization by checking for the constraint violation on every insert.

让我也先发制人地解释一下为什么我要先插入,然后选择是否失败以及何时失败。这需要在分布式环境中工作,因此我无法跨检查同步以查看记录是否已存在并插入。执行此操作的最简单方法是通过检查每次插入时是否违反约束,让数据库处理此同步。

采纳答案by Lawrence McAlpin

I had a similar batch processing requirement, with processes running on multiple JVMs. The approach I took for this was as follows. It is very much like jtahlborn's suggestion. However, as vbence pointed out, if you use a NESTED transaction, when you get the constraint violation exception, your session is invalidated. Instead, I use REQUIRES_NEW, which suspends the current transaction and creates a new, independent transaction. If the new transaction rolls back it will not affect the original transaction.

我有类似的批处理要求,进程在多个 JVM 上运行。我为此采取的方法如下。这非常像 jtahlborn 的建议。但是,正如 vbence 所指出的,如果您使用 NESTED 事务,当您收到约束冲突异常时,您的会话将失效。相反,我使用 REQUIRES_NEW,它会暂停当前事务并创建一个新的独立事务。如果新事务回滚,则不会影响原始事务。

I am using Spring's TransactionTemplate but I'm sure you could easily translate it if you do not want a dependency on Spring.

我正在使用 Spring 的 TransactionTemplate,但我相信如果您不想依赖 Spring,您可以轻松地翻译它。

public T findOrCreate(final T t) throws InvalidRecordException {
   // 1) look for the record
   T found = findUnique(t);
   if (found != null)
     return found;
   // 2) if not found, start a new, independent transaction
   TransactionTemplate tt = new TransactionTemplate((PlatformTransactionManager)
                                            transactionManager);
   tt.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRES_NEW);
   try {
     found = (T)tt.execute(new TransactionCallback<T>() {
        try {
            // 3) store the record in this new transaction
            return store(t);
        } catch (ConstraintViolationException e) {
            // another thread or process created this already, possibly
            // between 1) and 2)
            status.setRollbackOnly();
            return null;
        }
     });
     // 4) if we failed to create the record in the second transaction, found will
     // still be null; however, this would happy only if another process
     // created the record. let's see what they made for us!
     if (found == null)
        found = findUnique(t);
   } catch (...) {
     // handle exceptions
   }
   return found;
}

回答by vbence

Two solution come to mind:

想到了两个解决方案:

That's what TABLE LOCKS are for

这就是 TABLE LOCKS 的用途

Hibernate does not support table locks, but this is the situation when they come handy. Fortunately you can use native SQL thru Session.createSQLQuery(). For example (on MySQL):

Hibernate 不支持表锁,但这是它们派上用场时的情况。幸运的是,您可以通过Session.createSQLQuery(). 例如(在 MySQL 上):

// no access to the table for any other clients
session.createSQLQuery("LOCK TABLES foo WRITE").executeUpdate();

// safe zone
Foo foo = session.createCriteria(Foo.class).add(eq("name", name)).uniqueResult();
if (foo == null) {
    foo = new Foo();
    foo.setName(name)
    session.save(foo);
}

// releasing locks
session.createSQLQuery("UNLOCK TABLES").executeUpdate();

This way when a session (client connection) gets the lock, all the other connections are blocked until the operation ends and the locks are released. Read operations are also blocked for other connections, so needless to say use this only in case of atomic operations.

这样,当会话(客户端连接)获得锁时,所有其他连接都会被阻塞,直到操作结束并释放锁。其他连接的读取操作也会被阻止,因此不用说,仅在原子操作的情况下使用它。

What about Hibernate's locks?

Hibernate 的锁呢?

Hibernate uses row level locking. We can not use it directly, because we can not lock non-existent rows. But we can create a dummytable with a single record, map it to the ORM, then use SELECT ... FOR UPDATEstyle locks on that object to synchronize our clients. Basically we only need to be sure that no other clients (running the same software, with the same conventions) will do any conflicting operations while we are working.

Hibernate 使用行级锁定。我们不能直接使用它,因为我们不能锁定不存在的行。但是我们可以创建一个包含单个记录的虚拟表,将它映射到 ORM,然后SELECT ... FOR UPDATE在该对象上使用样式锁来同步我们的客户端。基本上我们只需要确保在我们工作时没有其他客户端(运行相同的软件,具有相同的约定)会执行任何冲突的操作。

// begin transaction
Transaction transaction = session.beginTransaction();

// blocks until any other client holds the lock
session.load("dummy", 1, LockOptions.UPGRADE);

// virtual safe zone
Foo foo = session.createCriteria(Foo.class).add(eq("name", name)).uniqueResult();
if (foo == null) {
    foo = new Foo();
    foo.setName(name)
    session.save(foo);
}

// ends transaction (releasing locks)
transaction.commit();

Your database has to know the SELECT ... FOR UPDATEsyntax (Hibernate is goig to use it), and of course this only works if all your clients has the same convention (they need to lock the same dummy entity).

您的数据库必须知道SELECT ... FOR UPDATE语法(Hibernate 是 goig 才能使用它),当然这只有在您的所有客户端都具有相同约定时才有效(他们需要锁定相同的虚拟实体)。

回答by Vlad Mihalcea

This is a great question, so I decided to write an article to explain it in more detail.

这是一个很好的问题,所以我决定写一篇文章来更详细地解释它

As I explained in this free chapter of my book, High-Performance Java Persistence, you need to use UPSERT or MERGE to achieve this goal.

正如我在我的书High-Performance Java Persistence 的这个免费章节中所解释的,您需要使用 UPSERT 或 MERGE 来实现这个目标。

However, Hibernate does not offer support for this construct, so you need to use jOOQinstead.

但是,Hibernate 不提供对这种构造的支持,因此您需要改用jOOQ

private PostDetailsRecord upsertPostDetails(
        DSLContext sql, Long id, String owner, Timestamp timestamp) {
    sql
    .insertInto(POST_DETAILS)
    .columns(POST_DETAILS.ID, POST_DETAILS.CREATED_BY, POST_DETAILS.CREATED_ON)
    .values(id, owner, timestamp)
    .onDuplicateKeyIgnore()
    .execute();

    return sql.selectFrom(POST_DETAILS)
    .where(field(POST_DETAILS.ID).eq(id))
    .fetchOne();
}

Calling this method on PostgreSQL:

在 PostgreSQL 上调用这个方法:

PostDetailsRecord postDetailsRecord = upsertPostDetails(
    sql, 
    1L, 
    "Alice",
    Timestamp.from(LocalDateTime.now().toInstant(ZoneOffset.UTC))
);

Yields the following SQL statements:

产生以下 SQL 语句:

INSERT INTO "post_details" ("id", "created_by", "created_on") 
VALUES (1, 'Alice',  CAST('2016-08-11 12:56:01.831' AS timestamp))
ON CONFLICT  DO NOTHING;

SELECT "public"."post_details"."id",
       "public"."post_details"."created_by",
       "public"."post_details"."created_on",
       "public"."post_details"."updated_by",
       "public"."post_details"."updated_on"
FROM "public"."post_details"
WHERE "public"."post_details"."id" = 1

On Oracle and SQL Server, jOOQ will use MERGEwhile on MySQL it will use ON DUPLICATE KEY.

在 Oracle 和 SQL Server 上,jOOQ 将使用,MERGE而在 MySQL 上它将使用ON DUPLICATE KEY.

The concurrency mechanism is ensured by the row-level locking mechanism emplyed when inserting, updating or deletinga record,w hich you can view in the following diagram:

并发机制由插入、更新或删除记录时采用的行级锁定机制来保证,如下图所示:

enter image description here

在此处输入图片说明

Code avilable on GitHub.

代码可在GitHub 上使用

回答by jtahlborn

a couple people have mentioned different parts of the overall strategy. assuming that you generally expect to find an existing object more often than you create a new object:

有几个人提到了整体战略的不同部分。假设您通常希望比创建新对象更频繁地找到现有对象:

  • search for existing object by name. if found, return
  • start nested (separate) transaction
    • try to insert new object
    • commit nested transaction
  • catch any failure from nested transaction, if anything but constraint violation, re-throw
  • otherwise search for existing object by name and return it
  • 按名称搜索现有对象。如果找到,返回
  • 启动嵌套(单独)事务
    • 尝试插入新对象
    • 提交嵌套事务
  • 从嵌套事务中捕获任何失败,如果没有违反约束,则重新抛出
  • 否则按名称搜索现有对象并返回它

just to clarify, as pointed out in another answer, the "nested" transaction is actually a separate transaction (many databases don't even support true, nested transactions).

只是为了澄清,正如在另一个答案中指出的那样,“嵌套”事务实际上是一个单独的事务(许多数据库甚至不支持真正的嵌套事务)。

回答by Nicolas Bousquet

The solution is in fact really simple. First perform a select using your name value. If a result is found, return that. If not, create a new one. In case the creation fail (with an exception), this is because another client added this very same value between your select and your insert statement. This is then logical that you have an exception. Catch it, rollback your transaction and run the same code again. Because the row already exist, the select statement will find it and you'll return your object.

解决方案实际上非常简单。首先使用您的名称值执行选择。如果找到结果,则返回该结果。如果没有,请创建一个新的。如果创建失败(有例外),这是因为另一个客户端在您的 select 和 insert 语句之间添加了这个非常相同的值。那么你有一个例外是合乎逻辑的。抓住它,回滚您的事务并再次运行相同的代码。因为该行已经存在,所以 select 语句会找到它并且您将返回您的对象。

You can see here explanation of strategies for optimistic and pessimistic locking with hibernate here : http://docs.jboss.org/hibernate/core/3.3/reference/en/html/transactions.html

您可以在此处查看使用 hibernate 进行乐观和悲观锁定策略的解释:http: //docs.jboss.org/hibernate/core/3.3/reference/en/html/transactions.html

回答by Mike Deck

The Hibernate documentation on transactions and exceptionsstates that all HibernateExceptions are unrecoverable and that the current transaction must be rolled back as soon as one is encountered. This explains why the code above does not work. Ultimately you should never catch a HibernateException without exiting the transaction and closing the session.

关于事务和异常Hibernate 文档指出所有 HibernateExceptions 都是不可恢复的,并且一旦遇到当前事务就必须回滚。这解释了为什么上面的代码不起作用。最终,您不应该在不退出事务和关闭会话的情况下捕获 HibernateException。

The only real way to accomplish this it would seem would be to manage the closing of the old session and reopening of a new one within the method itself. Implementing a findOrCreate method which can participate in an existing transaction and is safe within a distributed environment would seem to be impossible using Hibernate based on what I have found.

实现这一点的唯一真正方法似乎是在方法本身内管理旧会话的关闭和新会话的重新打开。根据我的发现,使用 Hibernate 实现可以参与现有事务并且在分布式环境中安全的 findOrCreate 方法似乎是不可能的。

回答by Mike Baranczak

Well, here's one way to do it - but it's not appropriate for all situations.

好吧,这是一种方法 - 但它并不适合所有情况。

  • In Foo, remove the "unique = true" attribute on name. Add a timestamp that gets updated on every insert.
  • In findOrCreate(), don't bother checking if the entity with the given name already exists - just insert a new one every time.
  • When looking up Foo instances by name, there may be 0 or more with a given name, so you just select the newest one.
  • 在 Foo 中,删除 上的“unique = true”属性name。添加在每次插入时更新的时间戳。
  • 在 中findOrCreate(),不要费心检查具有给定名称的实体是否已经存在——每次只需插入一个新实体。
  • 在通过 查找 Foo 实例时name,可能有 0 个或多个具有给定名称的实例,因此您只需选择最新的一个。

The nice thing about this method is that it doesn't require any locking, so everything should run pretty fast. The downside is that your database will be littered with obsolete records, so you may have to do something somewhere else to deal with them. Also, if other tables refer to Foo by its id, then this will screw up those relations.

这种方法的好处是它不需要任何锁定,所以一切都应该运行得非常快。缺点是您的数据库将充斥着过时的记录,因此您可能不得不在其他地方做一些事情来处理它们。此外,如果其他表通过 Foo 引用 Foo id,那么这将破坏这些关系。

回答by Iogui

Maybe you should change your strategy: First find the user with the name and only if the user thoes not exist, create it.

也许你应该改变你的策略:首先找到有名字的用户,只有当用户不存在时,才创建它。

回答by ThomasRS

I would try the following strategy:

我会尝试以下策略:

A. Start a main transaction (at time 1)
B. Start a sub-transaction (at time 2)

一个。开始一个主事务(在时间 1)
B。开始一个子事务(在时间 2)

Now, any object created after time 1 will not be visible in the main transaction. So when you do

现在,在时间 1 之后创建的任何对象在主事务中都将不可见。所以当你做

C. Create new race-condition object, commit sub-transaction
D. Handle conflict by starting a new sub-transaction (at time 3) and getting the object from a query (the sub-transaction from point B is now out-of-scope).

Ç。创建新的竞争条件对象,提交子事务
D。通过启动一个新的子事务(在时间 3)并从查询中获取对象(来自 B 点的子事务现在超出范围)来处理冲突。

only return the object primary key and then use EntityManager.getReference(..) to obtain the object you will be using in the main transaction. Alternatively, start the main transaction after D; it is not totally clear to me in how many race conditions you will have within your main transaction, but the above should allow for n times B-C-D in a 'large' transaction.

只返回对象主键,然后使用 EntityManager.getReference(..) 获取您将在主事务中使用的对象。或者,在D之后启动主事务;我并不完全清楚您在主交易中将有多少竞争条件,但以上应该允许在“大”交易中进行 n 倍的 BCD。

Note that you might want to do multi-threading (one thread per CPU) and then you can probably reduce this issue considerably by using a shared static cache for these kind of conflicts - and point 2 can be kept 'optimistic', i.e. not doing a .find(..) first.

请注意,您可能想要进行多线程处理(每个 CPU 一个线程),然后您可以通过对此类冲突使用共享静态缓存来显着减少此问题 - 第 2 点可以保持“乐观”,即不做首先是 .find(..) 。

Edit: For a new transaction, you need an EJB interface method call annotated with transaction type REQUIRES_NEW.

编辑:对于新事务,您需要使用事务类型REQUIRES_NEW注释的 EJB 接口方法调用。

Edit: Double check that the getReference(..) works as I think it does.

编辑:仔细检查 getReference(..) 是否像我认为的那样工作。