SQL 多语句表值函数与内联表值函数
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/2554333/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Multi-statement Table Valued Function vs Inline Table Valued Function
提问by AndrewC
A few examples to show, just incase:
举几个例子,以防万一:
Inline Table Valued
内联表值
CREATE FUNCTION MyNS.GetUnshippedOrders()
RETURNS TABLE
AS
RETURN SELECT a.SaleId, a.CustomerID, b.Qty
FROM Sales.Sales a INNER JOIN Sales.SaleDetail b
ON a.SaleId = b.SaleId
INNER JOIN Production.Product c ON b.ProductID = c.ProductID
WHERE a.ShipDate IS NULL
GO
Multi Statement Table Valued
多语句表值
CREATE FUNCTION MyNS.GetLastShipped(@CustomerID INT)
RETURNS @CustomerOrder TABLE
(SaleOrderID INT NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME NOT NULL,
OrderQty INT NOT NULL)
AS
BEGIN
DECLARE @MaxDate DATETIME
SELECT @MaxDate = MAX(OrderDate)
FROM Sales.SalesOrderHeader
WHERE CustomerID = @CustomerID
INSERT @CustomerOrder
SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
FROM Sales.SalesOrderHeader a INNER JOIN Sales.SalesOrderHeader b
ON a.SalesOrderID = b.SalesOrderID
INNER JOIN Production.Product c ON b.ProductID = c.ProductID
WHERE a.OrderDate = @MaxDate
AND a.CustomerID = @CustomerID
RETURN
END
GO
Is there an advantage to using one type (in-line or multi statement) over the other? Is there certain scenarios when one is better than the other or are the differences purely syntactical? I realise the two example queries are doing different things but is there a reason I would write them in that way?
使用一种类型(内联或多语句)比另一种有优势吗?是否存在某种情况下一种优于另一种的情况,或者差异纯粹是句法上的?我意识到这两个示例查询正在做不同的事情,但是我有理由以这种方式编写它们吗?
Reading about them and the advantages/differences haven't really been explained.
阅读有关它们以及优点/差异的信息还没有真正得到解释。
采纳答案by Thomas
In researching Matt's comment, I have revised my original statement. He is correct, there will be a difference in performance between an inline table valued function (ITVF) and a multi-statement table valued function (MSTVF) even if they both simply execute a SELECT statement. SQL Server will treat an ITVF somewhat like a VIEW
in that it will calculate an execution plan using the latest statistics on the tables in question. A MSTVF is equivalent to stuffing the entire contents of your SELECT statement into a table variable and then joining to that. Thus, the compiler cannot use any table statistics on the tables in the MSTVF. So, all things being equal, (which they rarely are), the ITVF will perform better than the MSTVF. In my tests, the performance difference in completion time was negligible however from a statistics standpoint, it was noticeable.
在研究马特的评论时,我修改了我的原始声明。他是对的,内联表值函数 (ITVF) 和多语句表值函数 (MSTVF) 之间的性能会有差异,即使它们都只是执行 SELECT 语句。SQL Server 会将 ITVF 视为有点像VIEW
因为它将使用有关表的最新统计信息计算执行计划。MSTVF 相当于将 SELECT 语句的全部内容填充到表变量中,然后加入该变量。因此,编译器不能对 MSTVF 中的表使用任何表统计信息。因此,在所有条件相同的情况下(它们很少如此),ITVF 的性能将优于 MSTVF。在我的测试中,完成时间的性能差异可以忽略不计,但从统计的角度来看,这是显而易见的。
In your case, the two functions are not functionally equivalent. The MSTV function does an extra query each time it is called and, most importantly, filters on the customer id. In a large query, the optimizer would not be able to take advantage of other types of joins as it would need to call the function for each customerId passed. However, if you re-wrote your MSTV function like so:
在您的情况下,这两个函数在功能上并不等效。MSTV 函数在每次调用时都会执行一次额外的查询,最重要的是,对客户 ID 进行过滤。在大型查询中,优化器将无法利用其他类型的连接,因为它需要为每个传递的 customerId 调用该函数。但是,如果您像这样重写 MSTV 函数:
CREATE FUNCTION MyNS.GetLastShipped()
RETURNS @CustomerOrder TABLE
(
SaleOrderID INT NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME NOT NULL,
OrderQty INT NOT NULL
)
AS
BEGIN
INSERT @CustomerOrder
SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
FROM Sales.SalesOrderHeader a
INNER JOIN Sales.SalesOrderHeader b
ON a.SalesOrderID = b.SalesOrderID
INNER JOIN Production.Product c
ON b.ProductID = c.ProductID
WHERE a.OrderDate = (
Select Max(SH1.OrderDate)
FROM Sales.SalesOrderHeader As SH1
WHERE SH1.CustomerID = A.CustomerId
)
RETURN
END
GO
In a query, the optimizer would be able to call that function once and build a better execution plan but it still would not be better than an equivalent, non-parameterized ITVS or a VIEW
.
在查询中,优化器将能够调用该函数一次并构建更好的执行计划,但它仍然不会比等效的非参数化 ITVS 或VIEW
.
ITVFs should be preferred over a MSTVFs when feasible because the datatypes, nullability and collation from the columns in the table whereas you declare those properties in a multi-statement table valued function and, importantly, you will get better execution plans from the ITVF. In my experience, I have not found many circumstances where an ITVF was a better option than a VIEW but mileage may vary.
在可行的情况下,ITVF 应该优于 MSTVF,因为表中列的数据类型、可空性和排序规则,而您在多语句表值函数中声明这些属性,重要的是,您将从 ITVF 获得更好的执行计划。根据我的经验,我没有发现很多情况下 ITVF 比 VIEW 更好,但里程可能会有所不同。
Thanks to Matt.
感谢马特。
Addition
添加
Since I saw this come up recently, here is an excellent analysis done by Wayne Sheffield comparing the performance difference between Inline Table Valued functions and Multi-Statement functions.
由于我最近看到了这个问题,这里是 Wayne Sheffield 进行的一项出色的分析,比较了内联表值函数和多语句函数之间的性能差异。
回答by Paul McLoughlin
Internally, SQL Server treats an inline table valued function much like it would a view and treats a multi-statement table valued function similar to how it would a stored procedure.
在内部,SQL Server 像对待视图一样对待内联表值函数,并像对待存储过程一样对待多语句表值函数。
When an inline table-valued function is used as part of an outer query, the query processor expands the UDF definition and generates an execution plan that accesses the underlying objects, using the indexes on these objects.
当内联表值函数用作外部查询的一部分时,查询处理器扩展 UDF 定义并生成访问底层对象的执行计划,使用这些对象上的索引。
For a multi-statement table valued function, an execution plan is created for the function itself and stored in the execution plan cache (once the function has been executed the first time). If multi-statement table valued functions are used as part of larger queries then the optimiser does not know what the function returns, and so makes some standard assumptions - in effect it assumes that the function will return a single row, and that the returns of the function will be accessed by using a table scan against a table with a single row.
对于多语句表值函数,会为函数本身创建一个执行计划并将其存储在执行计划缓存中(一旦函数第一次执行)。如果将多语句表值函数用作较大查询的一部分,则优化器不知道该函数返回什么,因此会做出一些标准假设 - 实际上它假设该函数将返回单行,并且将通过对具有单行的表使用表扫描来访问该函数。
Where multi-statement table valued functions can perform poorly is when they return a large number of rows and are joined against in outer queries. The performance issues are primarily down to the fact that the optimiser will produce a plan assuming that a single row is returned, which will not necessarily be the most appropriate plan.
当多语句表值函数返回大量行并在外部查询中连接时,它们可能表现不佳。性能问题主要归结于这样一个事实,即优化器将生成一个假设返回单行的计划,这不一定是最合适的计划。
As a general rule of thumb we have found that where possible inline table valued functions should be used in preference to multi-statement ones (when the UDF will be used as part of an outer query) due to these potential performance issues.
作为一般经验法则,我们发现,由于这些潜在的性能问题,应尽可能优先使用内联表值函数而不是多语句函数(当 UDF 将用作外部查询的一部分时)。
回答by Craig Beere
There is another difference. An inline table-valued function can be inserted into, updated, and deleted from - just like a view. Similar restrictions apply - can't update functions using aggregates, can't update calculated columns, and so on.
还有一个区别。内联表值函数可以插入、更新和删除 - 就像视图一样。类似的限制适用 - 不能使用聚合更新函数,不能更新计算列,等等。
回答by Ray
Your examples, I think, answer the question very well. The first function can be done as a single select, and is a good reason to use the inline style. The second could probably be done as a single statement (using a sub-query to get the max date), but some coders may find it easier to read or more natural to do it in multiple statements as you have done. Some functions just plain can't get done in one statement, and so require the multi-statement version.
我认为你的例子很好地回答了这个问题。第一个函数可以作为单个选择完成,这是使用内联样式的一个很好的理由。第二个可能可以作为单个语句完成(使用子查询来获取最大日期),但是一些编码人员可能会发现像您所做的那样在多个语句中执行它更容易阅读或更自然。有些功能无法在一个语句中完成,因此需要多语句版本。
I suggest using the simplest (inline) whenever possible, and using multi-statements when necessary (obviously) or when personal preference/readability makes it wirth the extra typing.
我建议尽可能使用最简单的(内联)语句,并在必要时(显然)或当个人偏好/可读性使其需要额外的输入时使用多语句。
回答by hmfarimani
look at Comparing Inline and Multi-Statement Table-Valued Functionsyou can find good descriptions and performance benchmarks
查看Comparing Inline and Multi-Statement Table-Valued Functions你可以找到很好的描述和性能基准
回答by William Egge
I have not tested this, but a multi statement function caches the result set. There may be cases where there is too much going on for the optimizer to inline the function. For example suppose you have a function that returns a result from different databases depending on what you pass as a "Company Number". Normally, you could create a view with a union all then filter by company number but I found that sometimes sql server pulls back the entire union and is not smart enough to call the one select. A table function can have logic to choose the source.
我没有测试过这个,但是一个多语句函数缓存了结果集。在某些情况下,优化器内联函数可能会发生太多事情。例如,假设您有一个函数根据您作为“公司编号”传递的内容从不同的数据库返回结果。通常,您可以创建一个带有联合的视图,然后按公司编号过滤,但我发现有时 sql server 会撤回整个联合并且不够聪明,无法调用一个选择。表函数可以具有选择源的逻辑。
回答by William Egge
Another case to use a multi line function would be to circumvent sql server from pushing down the where clause.
使用多行函数的另一种情况是避免 sql server 下推 where 子句。
For example, I have a table with a table names and some table names are formatted like C05_2019 and C12_2018 and and all tables formatted that way have the same schema. I wanted to merge all that data into one table and parse out 05 and 12 to a CompNo column and 2018,2019 into a year column. However, there are other tables like ACA_StupidTable which I cannot extract CompNo and CompYr and would get a conversion error if I tried. So, my query was in two part, an inner query that returned only tables formatted like 'C_______' then the outer query did a sub-string and int conversion. ie Cast(Substring(2, 2) as int) as CompNo. All looks good except that sql server decided to put my Cast function before the results were filtered and so I get a mind scrambling conversion error. A multi statement table function may prevent that from happening, since it is basically a "new" table.
例如,我有一个带有表名的表,一些表名的格式类似于 C05_2019 和 C12_2018,并且所有以这种方式格式化的表都具有相同的架构。我想将所有数据合并到一个表中,并将 05 和 12 解析为 CompNo 列,将 2018,2019 解析为年份列。但是,还有其他表,如 ACA_StupidTable,我无法提取 CompNo 和 CompYr,如果我尝试会得到转换错误。因此,我的查询分为两部分,一个内部查询仅返回格式为“C_______”的表,然后外部查询执行子字符串和 int 转换。即 Cast(Substring(2, 2) as int) 作为 CompNo。一切看起来都不错,只是 sql server 决定在过滤结果之前放置我的 Cast 函数,所以我得到了一个令人费解的转换错误。多语句表函数可以防止这种情况发生,
回答by LinchenPal
Maybe in a very condensed way. ITVF ( inline TVF) : more if u are DB person, is kind of parameterized view, take a single SELECT st
也许以一种非常浓缩的方式。ITVF(内联 TVF):更多如果你是 DB 人,是一种参数化视图,采取单个 SELECT st
MTVF ( Multi-statement TVF): Developer, creates and load a table variable.
MTVF(多语句 TVF):开发人员,创建并加载表变量。
回答by KM.
if you are going to do a query you can join in your Inline Table Valued function like:
如果您要进行查询,您可以加入您的内联表值函数,例如:
SELECT
a.*,b.*
FROM AAAA a
INNER JOIN MyNS.GetUnshippedOrders() b ON a.z=b.z
it will incur little overhead and run fine.
它会产生很少的开销并且运行良好。
if you try to use your the Multi Statement Table Valued in a similar query, you will have performance issues:
如果您尝试在类似查询中使用多语句表值,则会遇到性能问题:
SELECT
x.a,x.b,x.c,(SELECT OrderQty FROM MyNS.GetLastShipped(x.CustomerID)) AS Qty
FROM xxxx x
because you will execute the function 1 time for each row returned, as the result set gets large, it will run slower and slower.
因为您将针对返回的每一行执行该函数 1 次,随着结果集变大,它的运行速度会越来越慢。