How to calculate 90th Percentile, SD, Mean for data in SQL
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/33765734/
Asked by Agga
Hi, I have a table, Facility, which holds a score for each day (multiple scores can be reported on the same day, and all of them are valid).
I need to calculate the 90th percentile, standard deviation (SD), and mean of score by month.
Facility:
Id  Month  Date  score
1   Jan    1     5
1   Jan    1     5
1   Jan    2     3
1   Jan    3     4
1   Jan    4     4
1   Jan    5     4
1   Feb    1     5
1   Feb    1     5
1   Feb    2     3
1   Feb    3     4
1   Feb    4     4
1   Feb    5     4
Is there any way?
Thanks for your help.
Answered by Code Different
You can use the new suite of analytic functions introduced in SQL Server 2012:
SELECT DISTINCT
    [Month],
    -- If Score is an INT column, AVG would return a truncated INT, so convert first
    Mean   = AVG(CONVERT(float, Score)) OVER (PARTITION BY [Month]),
    StdDev = STDEV(Score) OVER (PARTITION BY [Month]),
    P90    = PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY Score) OVER (PARTITION BY [Month])
FROM Facility
There are 2 percentile functions: PERCENTILE_CONT for a continuous distribution (it interpolates between adjacent values) and PERCENTILE_DISC for a discrete distribution (it returns a value that actually occurs in the data). Pick the one that suits your needs.
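To make the difference between the two functions concrete, here is a minimal sketch on a hypothetical five-value sample. The inline values and the column name v are made up for illustration and are not part of the original question; the sample is chosen so that the two functions disagree.

```sql
-- Hypothetical sample: 1, 2, 3, 4, 10 (not from the Facility table).
-- OVER () with no PARTITION BY treats all rows as a single group.
SELECT DISTINCT
    -- Interpolates between the 4th and 5th values: 7.6
    PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY v) OVER () AS P90_Cont,
    -- Returns the first value whose cumulative distribution is >= 0.9: 10
    PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY v) OVER () AS P90_Disc
FROM (VALUES (1), (2), (3), (4), (10)) AS t(v);
```

PERCENTILE_CONT computes the fractional row position 1 + 0.9 * (5 - 1) = 4.6 and interpolates between the 4th value (4) and the 5th value (10), giving 4 + 0.6 * (10 - 4) = 7.6. PERCENTILE_DISC returns 10, the smallest value that at least 90% of the rows are less than or equal to. On the Facility sample data both functions return 5, because the two highest scores in each month are equal.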
Answered by LDMJoe
Here's the setup...
CREATE TABLE Facility (Id INT NOT NULL, Month nvarchar(3) NOT NULL, Date INT NOT NULL, score INT NOT NULL)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 5, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 5, 4)
Now, Standard Deviation and Mean are straightforward enough - there are built-in aggregate functions for them...
SELECT
    [Month],
    AVG(CONVERT(real, score)) AS [Mean],
    STDEV(score) AS [Standard Deviation]
FROM
    Facility
GROUP BY
    [Month]
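Two details of the query above are easy to miss: the CONVERT matters because AVG over an INT column returns an INT, and STDEV is the sample standard deviation (it divides by n - 1), while STDEVP is the population version (it divides by n). A small sketch against the same Facility table shows them side by side; the column aliases are my own naming.

```sql
SELECT
    [Month],
    AVG(score)                 AS IntMean,       -- INT in, INT out: the fraction is truncated
    AVG(CONVERT(float, score)) AS FloatMean,     -- converted first, so the fraction is kept
    STDEV(score)               AS SampleSD,      -- divides by n - 1
    STDEVP(score)              AS PopulationSD   -- divides by n
FROM
    Facility
GROUP BY
    [Month];
```

With the sample rows from the setup (scores 5, 5, 3, 4, 4, 4 in each month), IntMean is 4 while FloatMean is 25 / 6, about 4.17. SampleSD comes out at about 0.75 and PopulationSD at about 0.69. Which standard deviation is appropriate depends on whether the rows are a sample or the whole population.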
For your 90th percentile, you'll need to write your own function...
CREATE FUNCTION NintythPercentile(@Month nvarchar(3)) RETURNS INT AS
BEGIN
    DECLARE @ReturnValue INT
    -- The lowest score among the top 10 percent of rows is the 90th percentile
    SELECT
        @ReturnValue = MIN(DerivedTopTenPercent.score)
    FROM
    (
        -- TOP ... PERCENT rounds a fractional row count up to the next integer
        SELECT TOP 10 PERCENT
            score
        FROM
            Facility
        WHERE
            [Month] = @Month
        ORDER BY
            score DESC
    ) DerivedTopTenPercent
    RETURN @ReturnValue
END
With that function in place, your final query will look like this...
SELECT
    [Month],
    AVG(CONVERT(real, score)) AS [Mean],
    STDEV(score) AS [Standard Deviation],
    dbo.NintythPercentile([Month]) AS [90th Percentile]
FROM
    Facility
GROUP BY
    [Month]
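The scalar function runs a separate query for each month. As a possible alternative, the same nearest-rank idea can be sketched as a single set-based query using ROW_NUMBER and a windowed COUNT. This is my own variation rather than part of the original answer; the CTE name Ranked and the columns RankDesc and RowsInMonth are made-up names.

```sql
WITH Ranked AS (
    SELECT
        [Month],
        score,
        -- Position of each row when the month's scores are sorted highest first
        ROW_NUMBER() OVER (PARTITION BY [Month] ORDER BY score DESC) AS RankDesc,
        -- Total number of rows in the month
        COUNT(*) OVER (PARTITION BY [Month]) AS RowsInMonth
    FROM
        Facility
)
SELECT
    [Month],
    score AS [90th Percentile]
FROM
    Ranked
WHERE
    -- Same cutoff as TOP 10 PERCENT: 10% of the rows, rounded up
    RankDesc = CEILING(0.1 * RowsInMonth);
```

For the sample data each month has 6 rows, so the cutoff is CEILING(0.6) = 1 and the query returns the highest score, 5, for both months, which matches what the function returns. Like the function, this is a nearest-rank percentile with no interpolation.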