Original URL: http://stackoverflow.com/questions/1248753/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
SQL Server linked server performance
Asked by George2
I am using SQL Server 2008 Enterprise, with a linked server that points to another SQL Server 2008 Enterprise instance on a different server. I write T-SQL that manipulates objects (e.g. tables) on both server instances.
My question is, for linked server, is there a big performance issue? If yes, what is the key performance bottleneck and best practice we should follow?
Thanks in advance, George
Answered by Sam Saffron
My question is, for linked server, is there a big performance issue? If yes, what is the key performance bottleneck and best practice we should follow?
Compared to what? For what queries? Of course, it all depends on what you are doing. For some queries the performance hit will be negligible; for others, massive.
There are a bunch of concerns you should keep in mind:
- If you join 2 tables from DB1 to 2 tables from DB2 and the tables are big, things can get ugly. At the end of the day, the query has to execute somewhere: the DB has to pull all the results into the main DB and maintain transactional integrity there. This can be really expensive.
- If you start running distributed transactions, things can get ugly, fast.
- When joining across servers, your indexes on the remote server can be rendered useless. The data all has to move somewhere for the joins.
- Linked server links can go down at unexpected times and lead to hard-to-diagnose bugs.
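On the last point, one way to make a dropped link easier to diagnose is to test the connection before running a batch that depends on it. The following is only a sketch: `REMOTESRV` is a hypothetical linked server name, and whether a pre-check like this suits your workload is an assumption on my part, not something from the original answer.

```sql
-- REMOTESRV is a hypothetical linked server name; substitute your own.
BEGIN TRY
    -- sp_testlinkedserver raises an error if the connection cannot be made.
    EXEC sys.sp_testlinkedserver N'REMOTESRV';
    PRINT 'Linked server is reachable.';
END TRY
BEGIN CATCH
    -- Fail early with a clear message instead of partway through a larger batch.
    PRINT 'Linked server check failed: ' + ERROR_MESSAGE();
END CATCH;
```

The check only tells you the link was up at that moment; it can still go down while the real query runs.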
In the past I have found situations where it was a few orders of magnitude faster to copy the remote data to the local server and index it before joining against it.
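A minimal sketch of that "copy locally, index, then join" approach might look like the following. All object names (`REMOTESRV`, `SalesDb`, `dbo.Orders`, `dbo.Customers`, and the columns) are hypothetical, and the date filter is just an illustration of pulling fewer rows.

```sql
-- Hypothetical names: linked server REMOTESRV, remote table SalesDb.dbo.Orders,
-- local table dbo.Customers.

-- 1. Copy only the rows and columns you need into a local temp table.
SELECT o.OrderId, o.CustomerId, o.OrderDate, o.Total
INTO #RemoteOrders
FROM [REMOTESRV].SalesDb.dbo.Orders AS o
WHERE o.OrderDate >= '20090101';

-- 2. Index the local copy on the join column.
CREATE CLUSTERED INDEX IX_RemoteOrders_CustomerId
    ON #RemoteOrders (CustomerId);

-- 3. Join locally, so the optimizer works with local statistics and indexes.
SELECT c.CustomerName, r.OrderId, r.Total
FROM dbo.Customers AS c
JOIN #RemoteOrders AS r
    ON r.CustomerId = c.CustomerId;

DROP TABLE #RemoteOrders;
```

Whether this beats a direct cross-server join depends on the data volumes and the query, so it is worth comparing both on your own workload.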
Answered by Mitch Wheat
It depends on what you are doing.
If you are running queries that join between tables in the two server instances, and transferring large amounts of data, then you have a bottleneck that you need to be aware of.
If the servers are on their own subnet with a 1 Gb (gigabit) link, then you should not have to worry a great deal. I would be concerned if the two servers were connected by a shared, slow link.
Answered by Eric
First, you're going to take a bit of a hit transferring the results across the wire twice (linked server to SQL Server, then SQL Server to your machine). Second, it has to resolve the name and log in, which isn't much of a hit, but it's a hit nonetheless.
Anyway, I've found the only major bottleneck is jumping servers, since it has to transmit the information twice.
Answered by Raspin
I use linked servers frequently to synchronise data between environments, mainly because I found it to be the easiest solution to code and manage.
One tip I found, which may not be an option for others, is to run any procedures on the server that has the most data or is doing the most updating/inserting. For example, I have a procedure that compares two tables and inserts/updates from A to B. If I ran this on server A it would take many times longer than running the procedure on B. If you don't have a choice about where to run your code, and you are stuck on, say, server A, then this advice may not help.
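If the procedure lives on the remote server, you can still kick it off from the local one so that the work happens next to the data. This is only a sketch under assumptions: `REMOTESRV`, `SalesDb`, and `dbo.usp_SyncOrders` are hypothetical names, and it assumes you are allowed to enable the `rpc out` option on the linked server.

```sql
-- Hypothetical names: linked server REMOTESRV, database SalesDb,
-- procedure dbo.usp_SyncOrders.

-- One-time setup: remote procedure calls need 'rpc out' enabled on the link.
EXEC sp_serveroption
    @server   = N'REMOTESRV',
    @optname  = N'rpc out',
    @optvalue = N'true';

-- The procedure body executes on the remote server, next to the data it touches.
EXEC [REMOTESRV].SalesDb.dbo.usp_SyncOrders;
```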
Another tip is to reduce the data returned to the minimum necessary. Whereas you might normally have data returned almost instantly on a local server, if a linked server is some distance away then the latency can be very painful. Be stricter than normal in accessing only those columns you need.
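One way to keep the returned data small is a pass-through query with `OPENQUERY`, which sends the query text to the linked server so that the column list and the filter are applied there. The names below (`REMOTESRV`, `SalesDb.dbo.Orders`, the columns) are hypothetical and only illustrate the shape.

```sql
-- Hypothetical names: linked server REMOTESRV, remote table SalesDb.dbo.Orders.
-- The inner query runs on the remote server, so only the two listed columns
-- of the matching rows cross the network.
SELECT q.OrderId, q.Total
FROM OPENQUERY(
    [REMOTESRV],
    'SELECT OrderId, Total
     FROM SalesDb.dbo.Orders
     WHERE OrderDate >= ''20090101'''
) AS q;
```

Note that the query text must be a literal string (single quotes inside it are doubled), so `OPENQUERY` does not accept variables for its arguments.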
Answered by mattmc3
I've found that if you're doing outer joins (left/right), the performance degrades fast. It's sometimes faster to select the data from the remote server into a temp table and index it rather than joining across the network. Mostly, the best strategy is to write the query in the way that makes sense and then tune it only if performance is a real problem.
Answered by Ivan
@George2,
Sam Saffron is correct in this case. When a join is performed locally, SQL Server uses indexes to perform the join and then does lookups for the columns not included in the index definition.
With a linked server, the whole table often needs to be transferred from the remote server first, and only then is the join performed. This is the bottleneck. If you can pre-filter all remote tables before joining them to local tables, it will considerably improve performance (e.g. SELECT INTO #temp tables with a good filter to reduce the number of rows). If you then need to perform multiple operations against that table, you are better off creating an index on it right away.