oracle 读取 CLOB 列很慢

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/473874/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-18 17:43:09  来源:igfitidea点击:

Reading CLOB column is slow

databaseoracleclob

提问by Carl

Hopefully someone can shed a little light on an issue that I'm currently having with an Oracle DB - I'm sure it's something simple!!

希望有人可以对我目前在使用 Oracle DB 时遇到的问题有所了解 - 我相信这很简单!!

I've managed to recreate this in a sample, so here is the DB structure:

我已经设法在一个示例中重新创建了它,所以这里是数据库结构:

CREATE TABLE MyTable(
    ixMyTable NUMBER,
    clobData CLOB
)
/

CREATE OR REPLACE PACKAGE PKGTEST
AS
    PROCEDURE DoSomething(
        cur_OUT OUT SYS_REFCURSOR
        );
END PKGTEST;
/

CREATE OR REPLACE PACKAGE BODY PKGTEST
AS

PROCEDURE DoSomething(
    cur_OUT OUT SYS_REFCURSOR
)
AS
BEGIN
    OPEN cur_OUT FOR 
        SELECT ixMyTable, clobData
        FROM MyTable;
END;

END PKGTEST;
/

GRANT EXECUTE ON PKGTEST TO TEST_ROLE
/

BEGIN
    FOR i IN 1 .. 7000 LOOP
        insert into mytable values (i, TO_CLOB('123456'));
    END LOOP;
END;
/

Extra info:

额外信息:

Schema owner is TEST_SCHEMA

架构所有者是 TEST_SCHEMA

User is CARL

用户是卡尔

CARL has the TEST_ROLE role

CARL 具有 TEST_ROLE 角色

Given the database setup as above, I have a C# test app that uses the standard System.Data.OracleClient.OracleCommand etc. in order to execute PKGTEST.DoSomething and throw the results into a datagrid (DevExpress).

鉴于上述数据库设置,我有一个 C# 测试应用程序,它使用标准 System.Data.OracleClient.OracleCommand 等来执行 PKGTEST.DoSomething 并将结果放入数据网格 (DevExpress)。

Pretty sure that the grid is irrelevant here, as we experience the same issue through c++ using the open source OTL (not my department, fortunately).

非常确定网格在这里无关紧要,因为我们通过使用开源 OTL 的 C++ 遇到了同样的问题(幸运的是,不是我的部门)。

OK, to the problem....

好的,问题来了......

Time from beginning until the grid is populated is ~35-40s, ouch.

从开始到网格填充的时间约为 35-40 秒,哎哟。

However, if I do the following:

但是,如果我执行以下操作:

GRANT SELECT ON MyTable TO TEST_ROLE
/

and then perform the query again, it takes ~5-6s.

然后再次执行查询,大约需要 5-6 秒。

It seems to me that this has something to do with privileges etc., but I'm not quite sure why it still actually works both ways??

在我看来,这与特权等有关,但我不太确定为什么它实际上仍然可以双向工作?

Just to throw something else into the pot, if I change the procedure to

只是把别的东西扔进锅里,如果我把程序改成

SELECT ixMyTable, TO_CLOB(TO_NCLOB(clobData))
FROM MyTable;

Then the time is ~5-6s, with or without the extra SELECT privilege.

然后时间是 ~5-6s,有或没有额外的 SELECT 权限。

Any pointers or straight out solutions would be much appreciated!

任何指针或直接的解决方案将不胜感激!

Edit:

编辑:

OS is Vista x86 Business

操作系统是 Vista x86 商业版

Oracle Server is 10.2.0.1

Oracle 服务器是 10.2.0.1

Oracle Client is 10.2.0.3

Oracle 客户端是 10.2.0.3

Edit:

编辑:

As suggested, I've tried changing from the MS OracleClient to the ODP.NET and this does speed up as required.

按照建议,我已经尝试从 MS OracleClient 更改为 ODP.NET,这确实根据需要加快了速度。

Unfortunately, the C# app that was being affected was just an internal app that is used for viewing tables / running SPROCS etc.

不幸的是,受影响的 C# 应用程序只是一个用于查看表/运行 SPROCS 等的内部应用程序。

Our main deliverable is the C++ app that uses OTL (http://otl.sourceforge.net/otl3_intro.htm) for database access. This is not really something that can be changed at this time, and so I would really like to understand the reasons for the difference, without having to throw gratuitous GRANT SELECTs around willy-nilly.

我们的主要交付物是使用 OTL ( http://otl.sourceforge.net/otl3_intro.htm) 进行数据库访问的 C++ 应用程序。目前这并不是真正可以改变的东西,所以我真的很想了解差异的原因,而不必随意随意地抛出无端的 GRANT SELECT。

If lack of SELECT privilege caused a complete failure then I could probably live with this, but lack of SELECT appears to be causing some slower route for accessing the CLOB data.

如果缺少 SELECT 权限导致完全失败,那么我可能会接受这一点,但缺少 SELECT 似乎导致访问 CLOB 数据的路径较慢。

I've marked up the 3 answers - thanks for those - but I could really do with a reason, so I'll add a bounty to this.

我已经标记了 3 个答案 - 谢谢你 - 但我真的可以做一个理由,所以我会为此添加一个赏金。

P.S. We really wanted to go with OCCI at the outset for our C++, but as Oracle are always supporting a version of the IDE before current we couldn't get it to play nicely with our Visual Studio 2008.

PS 我们从一开始就真的想为 C++ 使用 OCCI,但由于 Oracle 始终支持当前版本之前的 IDE,因此我们无法让它与我们的 Visual Studio 2008 完美配合。

回答by Thomas Jones-Low

Are you sure that you are reading the blob from disk each time, and not reading if from disk cache the second and following?

您确定每次都从磁盘读取 blob,而不是从磁盘缓存中读取第二个和以下内容吗?

I've seen this problem with performance testing, particularly on Oracle, where the first run of a test is terrible. Then with one minor (and seeming insignificant change) the performance suddenly improves significantly. But really what has happened is that the data you are querying has been loaded into the cache and can be access at 10x or 20x the rate (memory vs. disk).

我已经在性能测试中看到了这个问题,特别是在 Oracle 上,第一次运行测试很糟糕。然后通过一个微小的(看似微不足道的变化),性能突然显着提高。但真正发生的情况是,您查询的数据已加载到缓存中,并且可以以 10 倍或 20 倍的速率(内存与磁盘)进行访问。

The proper way to do this test is to bounce the database between query runs. Load a copy of Oracle XE on your machine if the DBA won't let you bounce the test-production server for this test.

进行此测试的正确方法是在查询运行之间反弹数据库。如果 DBA 不允许您为此测试退回测试生产服务器,请在您的机器上加载 Oracle XE 的副本。

Edit: Or better: drop and re-create the table every time. You may be doing this, but didn't mention it.

编辑:或者更好:每次删除并重新创建表。你可能正在这样做,但没有提到它。

回答by tuinstoel

I would try odp.net http://www.oracle.com/technology/tech/windows/odpnet/index.htmlinstead of System.Data.OracleClient.

我会尝试 odp.net http://www.oracle.com/technology/tech/windows/odpnet/index.html而不是 System.Data.OracleClient。

回答by Thomas Jones-Low

Following up on the suggestion above, can you try a different ODBC driver, or client software. The fact that the TO_CLOB(TO_NCLOB())works so faster, where the straight one doesn't seems to indicate that that is where the problem exists.

按照上面的建议,您可以尝试不同的 ODBC 驱动程序或客户端软件。事实上,它的TO_CLOB(TO_NCLOB())工作速度如此之快,而直的似乎并不表明这就是问题所在。

First I would take the two queries and run them through SQLDeveloper and get an explain plan. This should give you a baseline for execution on the Oracle side. Adding the conversions should make no difference on the execution path. Run the queries alternately and time both to see how fast they are. If there is no difference, I'd suggest that the client software is your problem.

首先,我将获取这两个查询并通过 SQLDeveloper 运行它们并获得解释计划。这应该为您提供在 Oracle 端执行的基线。添加转换应该不会对执行路径产生影响。交替运行查询并同时计时以查看它们的速度。如果没有区别,我建议客户端软件是您的问题。

If my assumption is correct, this would also explain the GRANT SELECTchange. The client software is doing some sort of expensive conversion on the CLOB. The grant and/or explicit conversion allows the client to avoid this. I don't know why.

如果我的假设是正确的,这也可以解释这种GRANT SELECT变化。客户端软件正在对 CLOB 进行某种昂贵的转换。授权和/或显式转换允许客户端避免这种情况。我不知道为什么。

回答by Thomas Jones-Low

I honestly think the problem here is that the OTL driver and the OracleClient are having problems dealing with the CLOB coming from the SYS_REFCURSOR. SYS_REFCURSORs are weak, meaning the cursor itself can return any type of data, meaning the drivers need to continually go back to database and query the metadata for the cursor (ie. get the types of data in the cursor), then load the CLOB stream, then load the data. Obviously if the user doing the query has access to the tables being queried, the driver can do a better job of retrieving the metadata and returning the correct data.

老实说,我认为这里的问题是 OTL 驱动程序和 OracleClient 在处理来自 SYS_REFCURSOR 的 CLOB 时遇到问题。SYS_REFCURSOR 很弱,这意味着游标本身可以返回任何类型的数据,这意味着驱动程序需要不断返回数据库并查询游标的元数据(即获取游标中的数据类型),然后加载 CLOB 流,然后加载数据。显然,如果执行查询的用户有权访问被查询的表,驱动程序可以更好地检索元数据并返回正确的数据。

Does changing the Stored Procedure to return a strong ref_cursor work?

更改存储过程以返回强 ref_cursor 是否有效?

CREATE OR REPLACE PACKAGE PKGTEST
    AS
    TYPE C_DoSomething IS REF CURSOR RETURN MyTable%ROWTYPE;
    PROCEDURE DoSomething(
        cur_OUT OUT c_DoSomething
     );
END PKGTEST;

Can you change the Stored Procedure to do the TO_CLOB(), since that also seems to work?

您可以更改存储过程来执行TO_CLOB(),因为这似乎也有效吗?

回答by Thomas Jones-Low

What is the NLS_CHARACTERSET and NLS_NCHAR_CHARACTERSET for your database. Run

您的数据库的 NLS_CHARACTERSET 和 NLS_NCHAR_CHARACTERSET 是什么。跑

select * from nls_database_parameters;

to get the results. And what is the NLS_LANG setting for your client software?

得到结果。您的客户端软件的 NLS_LANG 设置是什么?

This may give some more insight to the problem, and will help answer the question of how expensive the TO_CLOB(TO_NCLOB()) call is.

这可能会更深入地了解这个问题,并有助于回答 TO_CLOB(TO_NCLOB()) 调用的开销有多大的问题。

回答by Egwor

These aren't all necessarily directly linked, but you probably ought to check each one out in case it is related. I suspect that this is caching related: once you've done your first query you then apply select and then it goes fast. In the middle you have to bounce the server to get rid of the caches if you want to performance test it correctly. If you do this test and it suddenly performs better, then try pinning the table into the cache. See below around clob storage inline, since that will probably be related.

这些不一定都是直接链接的,但您可能应该检查每一个,以防万一。我怀疑这与缓存相关:一旦你完成了你的第一个查询,你就会应用 select ,然后它就会变得很快。在中间,如果你想正确地测试它的性能,你必须反弹服务器以摆脱缓存。如果您执行此测试并且它突然表现得更好,则尝试将表固定到缓存中。请参阅以下有关内联 Clob 存储的内容,因为这可能是相关的。

I had issues around clob performance in Oracle 10g a year or so back. We got around most of them once we got our awesome dba to help. It took about 2 months to get the performance up to an adequate speed.

大约一年前,我在 Oracle 10g 中遇到了 Clob 性能方面的问题。一旦我们得到了很棒的 dba 的帮助,我们就绕过了他们中的大多数。大约需要 2 个月的时间才能使性能达到足够的速度。

Which version of Oracle are you using? In Oracle 10g (early versions) there were massive issues with clob performance. In fact it was actually faster in some cases to just use two tables, and a varchar column (concatenate the varchars together and you have your clob). We upgraded to a later version, and it was much better

您使用的是哪个版本的 Oracle?在 Oracle 10g(早期版本)中,clob 性能存在大量问题。事实上,在某些情况下,仅使用两个表和一个 varchar 列实际上更快(将 varchar 连接在一起,您就拥有了 clob)。我们升级到了更高版本,好多了

Also where is your data being stored? There's also an option to store the clob in the table itself. Depending how big your data is you might find this helps performance. If you've got your data stored on a SAN then it's worth also looking at the cache size on the SAN and also the block sizes. Oracle + SAN can be a bit funny when the caching sizes are incorrect.

此外,您的数据存储在哪里?还有一个选项可以将 Clob 存储在表本身中。根据您的数据有多大,您可能会发现这有助于提高性能。如果您的数据存储在 SAN 上,那么还值得查看 SAN 上的缓存大小以及块大小。当缓存大小不正确时,Oracle + SAN 可能有点滑稽。

Another workaround: If you're finding persistence is slow or even access is slow and you're not CPU bound, zip the data and store it in a blob. We saw a big performance benefit here too.

另一个解决方法:如果您发现持久性很慢,甚至访问速度很慢并且您不受 CPU 限制,请压缩数据并将其存储在 blob 中。我们在这里也看到了巨大的性能优势。

If you're seeing performance issues (memory related?) in whatever is processing the clobs, we found that we would recreate the objects as new strings. The drivers were pre-creating 32K sized Strings even if the data was smaller.

如果您在处理 clobs 的任何过程中看到性能问题(内存相关?),我们发现我们会将对象重新创建为新字符串。即使数据更小,驱动程序也会预先创建 32K 大小的字符串。

I did wonder whether the system tables might be fragmented? Are there lots of tables/schemas? What about synonyms?

我确实想知道系统表是否可能碎片化?有很多表/模式吗?同义词呢?

Also, when you store clobs, don't they get stored into one massive file in Oracle? If I remember correctly you have to be careful about fragmentation; the storage doesn't get released for reuse.

另外,当您存储 clob 时,它们不会被存储到 Oracle 中的一个大文件中吗?如果我没记错的话,您必须小心碎片化;存储不会被释放以供重用。

Perhaps you could put a .NET web service in front of your db? That might be one option if you can't solve the performance issues.

也许你可以在你的数据库前面放一个 .NET web 服务?如果您无法解决性能问题,那可能是一种选择。