Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3788048/


Safely clearing Hibernate session in the middle of large transaction

Tags: java, hibernate, spring, orm

Asked by mindas

I am using Spring+Hibernate for an operation which requires creating and updating literally hundreds of thousands of items. Something like this:

{
   ...
   Foo foo = fooDAO.get(...);
   for (int i=0; i<500000; i++) {
      Bar bar = barDAO.load(i);
      if (bar.needsModification() && foo.foo()) {
         bar.setWhatever("new whatever");
         barDAO.update(bar);
         // commit here
         Baz baz = new Baz();
         bazDAO.create(baz);
         // if (i % 100 == 0), clear
      }
   }
}

To protect myself against losing changes in the middle, I commit the changes immediately after barDAO.update(bar):

HibernateTransactionManager transactionManager = ...; // injected by Spring
DefaultTransactionDefinition def = new DefaultTransactionDefinition();
def.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRED);
TransactionStatus transactionStatus = transactionManager.getTransaction(def);
transactionManager.commit(transactionStatus);

At this point I should mention that the entire process runs in a transaction wrapped in org.springframework.orm.hibernate3.support.ExtendedOpenSessionInViewFilter (yes, this is a webapp).

This all works fine with one exception: after a few thousand updates/commits, the entire process gets really slow, most likely because memory is bloated by the ever-increasing number of objects kept by Spring/Hibernate.

In a Hibernate-only environment this would be easily solved by calling org.hibernate.Session#clear().

Now, the questions:

  • When is it a good time to clear()? Does it have a big performance cost?
  • Why aren't objects like bar or baz released/GCd automatically? What's the point of keeping them in the session after the commit (in the next iteration of the loop they're not reachable anyway)? I haven't taken a memory dump to prove this, but my gut feeling is that they are still there until the whole process exits. If the answer to this is "Hibernate cache", then why isn't the cache flushed when available memory runs low?
  • Is it safe/recommended to call org.hibernate.Session#clear() directly (bearing in mind the entire Spring context, things like lazy loading, etc.)? Are there any usable Spring wrappers/counterparts for achieving the same?
  • If the answer to the above question is yes, what will happen with object foo, assuming clear() is called inside the loop? What if foo.foo() is a lazy-load method?

Thank you for the answers.

Accepted answer by Pascal Thivent

When is it a good time to clear()? Does it have big performance cost?

At regular intervals, ideally the same as the JDBC batch size, after having flushed the changes. The documentation describes common idioms in the chapter about Batch processing:

13.1. Batch inserts

When making new objects persistent flush() and then clear() the session regularly in order to control the size of the first-level cache.

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();
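
A minimal sketch of how the same idiom might be wrapped for a loop like the one in the question. UnitOfWork is a hypothetical stand-in for the flush() and clear() methods of org.hibernate.Session, and BatchBoundary, RecordingWork and the batch size of 20 are illustrative assumptions, not part of Hibernate or Spring. Counting processed items from one avoids a flush on the very first iteration, and finish() covers a last, partial batch.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchBoundaryDemo {

    /** Hypothetical stand-in for the flush()/clear() pair of org.hibernate.Session. */
    interface UnitOfWork {
        void flush();
        void clear();
    }

    /** Flushes and then clears the unit of work once every batchSize processed items. */
    static final class BatchBoundary {
        private final UnitOfWork work;
        private final int batchSize;
        private int processed;

        BatchBoundary(UnitOfWork work, int batchSize) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.work = work;
            this.batchSize = batchSize;
        }

        /** Call once per processed item; returns true when a flush+clear happened. */
        boolean itemProcessed() {
            processed++;
            if (processed % batchSize != 0) {
                return false;
            }
            work.flush(); // push pending changes first so nothing is lost
            work.clear(); // then drop the tracked entities
            return true;
        }

        /** Call after the loop to flush and clear a last, partial batch. */
        void finish() {
            if (processed % batchSize != 0) {
                work.flush();
                work.clear();
            }
        }
    }

    /** Recording fake that only remembers the order of calls. */
    static final class RecordingWork implements UnitOfWork {
        final List<String> calls = new ArrayList<>();
        public void flush() { calls.add("flush"); }
        public void clear() { calls.add("clear"); }
    }

    public static void main(String[] args) {
        RecordingWork work = new RecordingWork();
        BatchBoundary boundary = new BatchBoundary(work, 20);
        for (int i = 0; i < 50; i++) {
            // ... load, modify and save entities here ...
            boundary.itemProcessed();
        }
        boundary.finish();
        System.out.println(work.calls);
    }
}
```

In a real application the UnitOfWork calls would presumably delegate to the current Hibernate session; how that session is obtained depends on the Spring setup and is left out here.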

And this shouldn't have a performance cost, au contraire:

  • it keeps the number of objects to track for dirtiness low (so flushing should be fast),
  • it should allow memory to be reclaimed.

Why aren't objects like bar or baz released/GCd automatically? What's the point of keeping them in the session after the commit (in the next loop of iteration they're not reachable anyway)?

You need to clear() the session explicitly if you don't want to keep entities tracked. That's all; that's how it works (one might want to commit a transaction without "losing" the entities).

But from what I can see, bar and baz instances should become candidates for GC after the clear. It would be interesting to analyze a memory dump to see exactly what is happening.
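
To illustrate why a commit alone does not release anything, here is a toy model of a session's first-level cache. It is only a sketch under the assumption that the session keeps strong references to every entity it tracks; ToyPersistenceContext and all of its methods are hypothetical and do not mirror Hibernate's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

public class ToyPersistenceContext {

    // id -> entity instance; these are strong references, so the GC cannot reclaim the entities
    private final Map<Long, Object> tracked = new HashMap<>();

    /** Starts tracking an entity, as loading or saving it would in this toy model. */
    void track(long id, Object entity) {
        tracked.put(id, entity);
    }

    /** Committing ends the transaction but leaves every tracked entity in place. */
    void commit() {
        // intentionally does not touch the map
    }

    /** Dropping the references is what makes the entities unreachable from the context. */
    void clear() {
        tracked.clear();
    }

    boolean isTracked(long id, Object entity) {
        return tracked.get(id) == entity;
    }

    int trackedCount() {
        return tracked.size();
    }

    public static void main(String[] args) {
        ToyPersistenceContext context = new ToyPersistenceContext();
        for (long id = 0; id < 1000; id++) {
            context.track(id, new Object());
            context.commit(); // one commit per item, as in the question
        }
        System.out.println("tracked after 1000 commits: " + context.trackedCount());
        context.clear();
        System.out.println("tracked after clear: " + context.trackedCount());
    }
}
```

In this model the loop variable going out of scope changes nothing, because the map still references each instance until clear() is called.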

is it safe/recommended to call org.hibernate.Session#clear() directly

As long as you flush() the pending changes so that you don't lose them (unless that is what you want), I don't see any problem with that (your current code will lose a create every 100 iterations, but maybe it's just pseudocode).

If the answer to the above question is yes, what will happen with object foo, assuming clear() is called inside the loop? What if foo.foo() is a lazy-load method?

Calling clear() evicts all loaded instances from the Session, making them detached entities. If a subsequent invocation requires an entity to be "attached", it will fail.
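
The effect on a lazily loaded value such as foo.foo() can be sketched with a toy model. ToySession and LazyValue are hypothetical names, and the IllegalStateException merely stands in for whatever Hibernate would raise (typically a LazyInitializationException); the sketch assumes that a value initialized before clear() stays usable, while one that is still uninitialized can no longer be loaded.

```java
import java.util.function.Supplier;

public class ToyLazyLoadDemo {

    /** Toy session that only remembers how often it has been cleared. */
    static final class ToySession {
        private int generation;

        void clear() {
            generation++;
        }

        int generation() {
            return generation;
        }
    }

    /** Toy lazy value bound to the session state in which it was created. */
    static final class LazyValue<T> {
        private final ToySession session;
        private final int boundGeneration;
        private final Supplier<T> loader;
        private T value;
        private boolean initialized;

        LazyValue(ToySession session, Supplier<T> loader) {
            this.session = session;
            this.boundGeneration = session.generation();
            this.loader = loader;
        }

        T get() {
            if (!initialized) {
                // loading needs the owner to still be attached to its session
                if (session.generation() != boundGeneration) {
                    throw new IllegalStateException("owner is detached; cannot lazy-load");
                }
                value = loader.get();
                initialized = true;
            }
            return value;
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        LazyValue<String> touchedEarly = new LazyValue<>(session, () -> "loaded");
        LazyValue<String> untouched = new LazyValue<>(session, () -> "loaded");

        touchedEarly.get(); // initialized while still attached
        session.clear();    // both owners are now detached

        System.out.println(touchedEarly.get()); // already initialized, still readable
        try {
            untouched.get();
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Under these assumptions, two options for the loop in the question would be to touch everything foo needs before the first clear(), or to load foo again after each clear().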

Answer by smdb21

I just wanted to point out that, after clearing the session, if you want to continue to use some objects that were in the session, you will have to call Session.refresh(obj) on them in order to continue.

Otherwise you will get the following error:

org.hibernate.NonUniqueObjectException
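
One plausible way to run into this exception is that, after the clear, the same row is loaded again and the stale, detached instance is then handed back to the session, so two different objects claim the same identifier. The toy model below sketches that rule only; ToySession, load() and reattach() are hypothetical, and the IllegalStateException stands in for NonUniqueObjectException.

```java
import java.util.HashMap;
import java.util.Map;

public class ToyIdentityMapDemo {

    /** Toy session that allows only one tracked instance per identifier. */
    static final class ToySession {
        private final Map<Long, Object> byId = new HashMap<>();

        /** Returns the tracked instance for the id, "loading" a new one if none is tracked. */
        Object load(long id) {
            return byId.computeIfAbsent(id, key -> new Object());
        }

        /** Re-attaches a detached instance; rejects it if another instance already owns the id. */
        void reattach(long id, Object entity) {
            Object existing = byId.get(id);
            if (existing != null && existing != entity) {
                throw new IllegalStateException(
                        "a different object with id " + id + " is already tracked");
            }
            byId.put(id, entity);
        }

        void clear() {
            byId.clear();
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        Object stale = session.load(7L);
        session.clear();                 // stale is now detached
        Object fresh = session.load(7L); // a second instance for the same id

        System.out.println("same instance: " + (stale == fresh));
        try {
            session.reattach(7L, stale);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```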