PostgreSQL Spring + Hibernate: Query Plan Cache Memory usage

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/31557076/

Date: 2020-10-21 01:58:53  Source: igfitidea

Spring + Hibernate: Query Plan Cache Memory usage

springhibernatepostgresqlspring-boot

Asked by LastElb

I'm programming an application with the latest version of Spring Boot. I recently ran into problems with a growing heap that cannot be garbage collected. An analysis of the heap with Eclipse MAT showed that, within one hour of running the application, the heap grew to 630 MB, with Hibernate's SessionFactoryImpl using more than 75% of the whole heap.

(Screenshot: Eclipse MAT heap analysis, showing Hibernate's SessionFactoryImpl dominating the retained heap)

I was looking for possible sources around the query plan cache, but the only thing I found was this, and it did not pan out. The properties were set like this:

spring.jpa.properties.hibernate.query.plan_cache_max_soft_references=1024
spring.jpa.properties.hibernate.query.plan_cache_max_strong_references=64

The database queries are all generated by Spring Data's query derivation magic, using repository interfaces as in this documentation. About 20 different queries are generated with this technique; no other native SQL or HQL is used. Sample:

@Transactional
public interface TrendingTopicRepository extends JpaRepository<TrendingTopic, Integer> {
    List<TrendingTopic> findByNameAndSource(String name, String source);
    List<TrendingTopic> findByDateBetween(Date dateStart, Date dateEnd);
    Long countByDateBetweenAndName(Date dateStart, Date dateEnd, String name);
}

or

List<SomeObject> findByNameAndUrlIn(String name, Collection<String> urls);

as an example of IN usage.
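
To make the growth mechanism concrete: Hibernate expands such a derived IN query into one positional parameter per collection element, so every distinct collection size yields a distinct query string and therefore a distinct plan cache entry. A simplified simulation of that keying behavior (entity and parameter names are made up for illustration):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PlanCacheGrowth {

    // Mimics how an IN clause is expanded to one placeholder per element.
    static String expand(int paramCount) {
        String placeholders = IntStream.range(0, paramCount)
                .mapToObj(i -> ":url" + i)
                .collect(Collectors.joining(", "));
        return "select o from SomeObject o where o.name = :name and o.url in (" + placeholders + ")";
    }

    public static void main(String[] args) {
        Set<String> planCacheKeys = new HashSet<>();
        for (int size = 1; size <= 500; size++) {
            planCacheKeys.add(expand(size)); // each size is a new, never-reused key
        }
        System.out.println(planCacheKeys.size()); // 500 distinct entries
    }
}
```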

The question is: why does the query plan cache keep growing (it does not stop; it ends in a full heap), and how can this be prevented? Has anyone encountered a similar problem?

Versions:

  • Spring Boot 1.2.5
  • Hibernate 4.3.10

Answered by Neeme Praks

I've hit this issue as well. It basically boils down to having a variable number of values in your IN clause and Hibernate trying to cache those query plans.

There are two great blog posts on this topic. The first:

Using Hibernate 4.2 and MySQL in a project with an in-clause query such as: select t from Thing t where t.id in (?)

Hibernate caches these parsed HQL queries. Specifically, the Hibernate SessionFactoryImpl has a QueryPlanCache with a queryPlanCache and a parameterMetadataCache. But this proved to be a problem when the number of parameters for the in-clause is large and varies.

These caches grow for every distinct query, so a query with 6000 parameters is not the same as one with 6001.

The in-clause query is expanded to the number of parameters in the collection. Metadata is included in the query plan for each parameter in the query, including a generated name like x10_, x11_ , etc.

Imagine 4000 different variations in the number of in-clause parameter counts, each of these with an average of 4000 parameters. The query metadata for each parameter quickly adds up in memory, filling up the heap, since it can't be garbage collected.

This continues until all the different variations in the query parameter count are cached, or the JVM runs out of heap memory and starts throwing java.lang.OutOfMemoryError: Java heap space.

Avoiding in-clauses is an option, as well as using a fixed collection size for the parameter (or at least a smaller size).

For configuring the query plan cache max size, see the property hibernate.query.plan_cache_max_size, defaulting to 2048 (easily too large for queries with many parameters).

And the second (also referenced from the first):

Hibernate internally uses a cache that maps HQL statements (as strings) to query plans. The cache consists of a bounded map limited by default to 2048 elements (configurable). All HQL queries are loaded through this cache. In case of a miss, the entry is automatically added to the cache. This makes it very susceptible to thrashing - a scenario in which we constantly put new entries into the cache without ever reusing them, thus preventing the cache from bringing any performance gains (it even adds some cache management overhead). To make things worse, it is hard to detect this situation by chance - you have to explicitly profile the cache in order to notice that you have a problem there. I will say a few words on how this could be done later on.

So the cache thrashing results from new queries being generated at high rates. This can be caused by a multitude of issues. The two most common that I have seen are: bugs in Hibernate which cause parameters to be rendered in the JPQL statement instead of being passed as parameters, and the use of an "in" clause.

Due to some obscure bugs in hibernate, there are situations when parameters are not handled correctly and are rendered into the JPQL query (as an example check out HHH-6280). If you have a query that is affected by such defects and it is executed at high rates, it will thrash your query plan cache because each JPQL query generated is almost unique (containing IDs of your entities for example).

The second issue lies in the way that Hibernate processes queries with an "in" clause (e.g. give me all person entities whose company id field is one of 1, 2, 10, 18). For each distinct number of parameters in the "in" clause, Hibernate will produce a different query - e.g. select x from Person x where x.company.id in (:id0_) for 1 parameter, select x from Person x where x.company.id in (:id0_, :id1_) for 2 parameters, and so on. All these queries are considered different as far as the query plan cache is concerned, resulting again in cache thrashing. You could probably work around this issue by writing a utility class that produces only certain numbers of parameters - e.g. 1, 10, 100, 200, 500, 1000. If you, for example, pass 22 parameters, it will return a list of 100 elements with the 22 parameters included in it and the remaining 78 parameters set to an impossible value (e.g. -1 for IDs used for foreign keys). I agree that this is an ugly hack, but it could get the job done. As a result you will only have at most 6 unique queries in your cache and thus reduce thrashing.
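
The parameter-count bucketing described above could be sketched roughly like this (a hypothetical helper, not the author's actual code; it pads by repeating the last value instead of a -1 sentinel, which is equally safe since duplicates in an IN list don't change the result):

```java
import java.util.ArrayList;
import java.util.List;

public class InClausePadding {

    // The only IN-clause sizes (and thus cached query plans) we allow.
    private static final int[] BUCKETS = {1, 10, 100, 200, 500, 1000};

    // Pads the id list up to the next bucket size by repeating its last element.
    static List<Long> padToBucket(List<Long> ids) {
        if (ids.isEmpty() || ids.size() > BUCKETS[BUCKETS.length - 1]) {
            return ids; // nothing to pad, or larger than the biggest bucket
        }
        int target = ids.size();
        for (int bucket : BUCKETS) {
            if (bucket >= ids.size()) {
                target = bucket;
                break;
            }
        }
        List<Long> padded = new ArrayList<>(ids);
        Long filler = ids.get(ids.size() - 1);
        while (padded.size() < target) {
            padded.add(filler);
        }
        return padded;
    }
}
```

Calling the repository with the padded list then produces at most six distinct IN-clause sizes, regardless of how many ids the caller passes.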

So how do you find out that you have the issue? You could write some additional code and expose metrics with the number of entries in the cache, e.g. over JMX, tune logging and analyze the logs, etc. If you do not want to (or cannot) modify the application, you could just dump the heap and run this OQL query against it (e.g. using Eclipse MAT): SELECT l.query.toString() FROM INSTANCEOF org.hibernate.engine.query.spi.QueryPlanCache$HQLQueryPlanKey l. It will output all queries currently located in any query plan cache on your heap. It should be pretty easy to spot whether you are affected by any of the aforementioned problems.

As far as the performance impact goes, it is hard to say as it depends on too many factors. I have seen a very trivial query cause 10-20 ms of overhead spent in creating a new HQL query plan. In general, if there is a cache somewhere, there must be a good reason for that - a miss is probably expensive, so you should try to avoid misses as much as possible. Last but not least, your database will have to handle large amounts of unique SQL statements too - causing it to parse them and maybe create different execution plans for every one of them.

Answered by Georgi Staykov

I had the exact same problem using Spring Boot 1.5.7 with Spring Data (Hibernate) and the following config solved the problem (memory leak):

spring:
  jpa:
    properties:
      hibernate:
        query:
          plan_cache_max_size: 64
          plan_parameter_metadata_max_size: 32
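
For reference, the same two settings in application.properties form (using the spring.jpa.properties pass-through shown earlier in the question):

```properties
spring.jpa.properties.hibernate.query.plan_cache_max_size=64
spring.jpa.properties.hibernate.query.plan_parameter_metadata_max_size=32
```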

Answered by woo2333

Starting with Hibernate 5.2.12, you can specify a hibernate configuration property to change how literals are to be bound to the underlying JDBC prepared statements by using the following:

hibernate.criteria.literal_handling_mode=BIND

From the Java documentation, this configuration property has three settings:

  1. AUTO (default)
  2. BIND - Increases the likelihood of jdbc statement caching using bind parameters.
  3. INLINE - Inlines the values rather than using parameters (be careful of SQL injection).
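
The difference BIND makes for statement reuse can be sketched without Hibernate at all - schematic SQL strings only, not actual Hibernate output:

```java
import java.util.HashSet;
import java.util.Set;

public class LiteralHandlingSketch {

    // INLINE-style: the literal value is baked into the SQL text.
    static String inline(long id) {
        return "select p.name from person p where p.id = " + id;
    }

    // BIND-style: one parameterized statement; the value travels separately.
    static String bind() {
        return "select p.name from person p where p.id = ?";
    }

    public static void main(String[] args) {
        Set<String> statementTexts = new HashSet<>();
        for (long id = 1; id <= 1000; id++) {
            statementTexts.add(inline(id));
        }
        System.out.println(statementTexts.size()); // 1000 distinct texts to cache

        statementTexts.clear();
        for (long id = 1; id <= 1000; id++) {
            statementTexts.add(bind());
        }
        System.out.println(statementTexts.size()); // 1: a single reusable statement
    }
}
```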

Answered by Jeroen Borgers

We also had a QueryPlanCache with growing heap usage. We had IN-queries which we rewrote, and additionally we have queries which use custom types. Turned out that the Hibernate class CustomType didn't properly implement equals and hashCode thereby creating a new key for every query instance. This is now solved in Hibernate 5.3. See https://hibernate.atlassian.net/browse/HHH-12463. You still need to properly implement equals/hashCode in your userTypes to make it work properly.

Answered by arpit sharma

I had a similar issue. The problem is that you are creating the query and not using a PreparedStatement. What happens here is that for each query with different parameters, an execution plan is created and cached. If you use a prepared statement, you should see a major improvement in the memory being used.
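
A minimal sketch of what that looks like in plain JDBC (hypothetical table and column names; connection setup omitted):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;

public class PreparedStatementExample {

    // One fixed statement text, reused for every (name, dateStart, dateEnd) combination.
    static final String SQL =
            "select id from trending_topic where name = ? and date between ? and ?";

    static long countTopics(Connection connection, String name,
                            Timestamp dateStart, Timestamp dateEnd) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(SQL)) {
            ps.setString(1, name);
            ps.setTimestamp(2, dateStart);
            ps.setTimestamp(3, dateEnd);
            try (ResultSet rs = ps.executeQuery()) {
                long count = 0;
                while (rs.next()) {
                    count++;
                }
                return count;
            }
        }
    }
}
```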

Answered by Alex

I had the same problem with many (>10000) parameters in IN-queries. The number of my parameters is always different and unpredictable, and my QueryPlanCache was growing too fast.

For database systems supporting execution plan caching, there's a better chance of hitting the cache if the number of possible IN clause parameters lowers.

Fortunately, Hibernate version 5.3.0 and higher has a solution: padding of the parameters in the IN clause.

Hibernate can expand the bind parameters to powers of two: 4, 8, 16, 32, 64. This way, an IN clause with 5, 6, or 7 bind parameters will use the 8-parameter IN clause, therefore reusing its execution plan.
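
A simplified model of the padded size as a next-power-of-two computation (a sketch of the behavior described above, not Hibernate's actual implementation):

```java
public class ParameterPadding {

    // Rounds a bind-parameter count up to the next power of two (minimum 1).
    static int paddedSize(int paramCount) {
        if (paramCount < 1) {
            throw new IllegalArgumentException("need at least one bind parameter");
        }
        int padded = 1;
        while (padded < paramCount) {
            padded *= 2;
        }
        return padded; // e.g. 5, 6, and 7 all map to 8, sharing one cached plan
    }
}
```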

If you want to activate this feature, you need to set hibernate.query.in_clause_parameter_padding=true.

For more information see this article, atlassian.

Answered by Guilherme

I had a big issue with this queryPlanCache, so I wrote a Hibernate cache monitor to see the queries in the queryPlanCache. I run it in the QA environment as a Spring scheduled task every 5 minutes. With it I found which IN queries I had to change to solve my cache problem. One detail: I am using Hibernate 4.2.18 and I don't know whether it will be useful with other versions.

import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.hibernate.ejb.HibernateEntityManagerFactory;
import org.hibernate.internal.SessionFactoryImpl;
import org.hibernate.internal.util.collections.BoundedConcurrentHashMap;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.dao.GenericDAO;

public class CacheMonitor {

private final Logger logger  = LoggerFactory.getLogger(getClass());

@PersistenceContext(unitName = "MyPU")
private void setEntityManager(EntityManager entityManager) {
    HibernateEntityManagerFactory hemf = (HibernateEntityManagerFactory) entityManager.getEntityManagerFactory();
    sessionFactory = (SessionFactoryImpl) hemf.getSessionFactory();
    fillQueryMaps();
}

private SessionFactoryImpl sessionFactory;
private BoundedConcurrentHashMap queryPlanCache;
private BoundedConcurrentHashMap parameterMetadataCache;

/*
 * I tried to use a Map and compareToIgnoreCase for ordering,
 * but remember that doing so itself leaks memory: it makes
 * the heap explode even faster than it already did.
 */

public void log() {
    if (!logger.isDebugEnabled()) {
        return;
    }

    if (queryPlanCache != null) {
        long cacheSize = queryPlanCache.size();
        logger.debug(String.format("QueryPlanCache size is :%s ", Long.toString(cacheSize)));

        for (Object key : queryPlanCache.keySet()) {
            int filterKeysSize = 0;
            // QueryPlanCache.HQLQueryPlanKey (Inner Class)
            Object queryValue = getValueByField(key, "query", false);
            if (queryValue == null) {
                // NativeSQLQuerySpecification
                queryValue = getValueByField(key, "queryString");
                filterKeysSize = ((Set) getValueByField(key, "querySpaces")).size();
                if (queryValue != null) {
                    writeLog(queryValue, filterKeysSize, false);
                }
            } else {
                filterKeysSize = ((Set) getValueByField(key, "filterKeys")).size();
                writeLog(queryValue, filterKeysSize, true);
            }
        }
    }

    if (parameterMetadataCache != null) {
        long cacheSize = parameterMetadataCache.size();
        logger.debug(String.format("ParameterMetadataCache size is :%s ", Long.toString(cacheSize)));
        for (Object key : parameterMetadataCache.keySet()) {
            logger.debug("Query:{}", key);
        }
    }
}

private void writeLog(Object query, Integer size, boolean b) {
    if (query == null || query.toString().trim().isEmpty()) {
        return;
    }
    StringBuilder builder = new StringBuilder();
    builder.append(b ? "JPQL " : "NATIVE ");
    builder.append("filterKeysSize").append(":").append(size);
    builder.append("\n").append(query).append("\n");
    logger.debug(builder.toString());
}

private void fillQueryMaps() {
    Field queryPlanCacheSessionField = null;
    Field queryPlanCacheField = null;
    Field parameterMetadataCacheField = null;
    try {
        queryPlanCacheSessionField = searchField(sessionFactory.getClass(), "queryPlanCache");
        queryPlanCacheSessionField.setAccessible(true);
        queryPlanCacheField = searchField(queryPlanCacheSessionField.get(sessionFactory).getClass(), "queryPlanCache");
        queryPlanCacheField.setAccessible(true);
        parameterMetadataCacheField = searchField(queryPlanCacheSessionField.get(sessionFactory).getClass(), "parameterMetadataCache");
        parameterMetadataCacheField.setAccessible(true);
        queryPlanCache = (BoundedConcurrentHashMap) queryPlanCacheField.get(queryPlanCacheSessionField.get(sessionFactory));
        parameterMetadataCache = (BoundedConcurrentHashMap) parameterMetadataCacheField.get(queryPlanCacheSessionField.get(sessionFactory));
    } catch (Exception e) {
        logger.error("Failed fillQueryMaps", e);
    } finally {
        // guard against an NPE when searchField failed before a field was assigned
        if (queryPlanCacheSessionField != null) {
            queryPlanCacheSessionField.setAccessible(false);
        }
        if (queryPlanCacheField != null) {
            queryPlanCacheField.setAccessible(false);
        }
        if (parameterMetadataCacheField != null) {
            parameterMetadataCacheField.setAccessible(false);
        }
    }
}

private <T> T getValueByField(Object toBeSearched, String fieldName) {
    return getValueByField(toBeSearched, fieldName, true);
}

@SuppressWarnings("unchecked")
private <T> T getValueByField(Object toBeSearched, String fieldName, boolean logErro) {
    Boolean accessible = null;
    Field f = null;
    try {
        f = searchField(toBeSearched.getClass(), fieldName, logErro);
        accessible = f.isAccessible();
        f.setAccessible(true);
        return (T) f.get(toBeSearched);
    } catch (Exception e) {
        if (logErro) {
            logger.error("Field: {} error trying to get for: {}", fieldName, toBeSearched.getClass().getName());
        }
        return null;
    } finally {
        if (accessible != null) {
            f.setAccessible(accessible);
        }
    }
}

private Field searchField(Class<?> type, String fieldName) {
    return searchField(type, fieldName, true);
}

private Field searchField(Class<?> type, String fieldName, boolean log) {
    // walk up the class hierarchy looking for the declared field
    for (Class<?> c = type; c != null; c = c.getSuperclass()) {
        for (Field f : c.getDeclaredFields()) {
            if (fieldName.equals(f.getName())) {
                return f;
            }
        }
    }
    if (log) {
        logger.warn("Field: {} not found for type: {}", fieldName, type.getName());
    }
    return null;
}
}

Answered by Puja Kedia

We faced this issue too: the query plan cache grew too fast, and the old gen heap grew along with it since the GC was unable to collect it. The culprit was a JPA query taking more than 200000 ids in the IN clause. To optimise the query, we used joins instead of fetching ids from one table and passing those into the other table's select query.
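
Schematically, the rewrite looks like this in JPQL (entity and field names are made up for illustration):

```sql
-- Before: fetch ids first, then pass a huge collection into an IN clause
-- select c.id from Child c where c.parent.id = :parentId
-- select o from Order o where o.child.id in (:childIds)

-- After: one query with a join, no IN clause at all
select o from Order o join o.child c where c.parent.id = :parentId
```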