如何在 SQL 中计算数据的第 90 个百分位、SD、平均值

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/33765734/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 04:11:52  来源:igfitidea点击:

How to calculate 90th Percentile, SD, Mean for data in SQL

sqlsql-serversql-server-2012

提问by Agga

Hi I have a table facility. Which holds a score for each day (Multiple scores can be reported each day and both would be valid)

嗨,我有一个餐桌设施。每天都有一个分数(每天可以报告多个分数,两者都有效)

I need to calculate the 90th percentile, SD, and Mean for score by month.

我需要按月计算得分的第 90 个百分位数、SD 和平均值。

Facility:

设施:

Id   Month Date  score
1    Jan     1    5
1    Jan     1    5
1    Jan     2    3
1    Jan     3    4
1    Jan     4    4
1    Jan     5    4
1    Feb     1    5
1    Feb     1    5
1    Feb     2    3
1    Feb     3    4
1    Feb     4    4
1    Feb     5    4

Is there any way?

有什么办法吗?

Thanks for your help.

谢谢你的帮助。

回答by Code Different

You can use the new suite of analytic functions introduced in SQL Server 2012:

您可以使用 SQL Server 2012 中引入的新分析函数套件:

SELECT DISTINCT
            [Month],
            Mean   = AVG(Score) OVER (PARTITION BY [Month]),
            StdDev = STDEV(Score) OVER (PARTITION BY [Month]),
            P90    = PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY Score) OVER (PARTITION BY [Month])
FROM        my_table

There are 2 percentile functions: PERCENTILE_CONTfor continuous distribution and PERCENTILE_DISCfor discrete distribution. Picks one that suits your needs.

有 2 个百分位函数:PERCENTILE_CONT用于连续分布和PERCENTILE_DISC用于离散分布。选择一款适合您的需求。

回答by LDMJoe

Here's the setup...

这是设置...

CREATE TABLE Facility (Id INT NOT NULL, Month nvarchar(3) NOT NULL, Date INT NOT NULL, score INT NOT NULL)

INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 5, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 5, 4)

Now, Standard Deviation and Mean are straight forward enough - there are built in aggregate functions for them...

现在,标准偏差和平均值已经足够直接了——它们有内置的聚合函数......

SELECT
    [Month],
    AVG(CONVERT(real, score)) AS [Mean],
    STDEV(score) AS [Standard Deviation]
FROM
    Facility
GROUP BY
    [Month]

For your 90th percentile, you'll need to invent a function...

对于第 90 个百分位数,您需要发明一个函数……

CREATE FUNCTION NintythPercentile(@Month nvarchar(3)) RETURNS INT AS
BEGIN
    DECLARE @ReturnValue INT

    SELECT 
        @ReturnValue = MIN(DerivedTopTenPercent.score) --AS [90th Percentile]
    FROM
        (
        SELECT TOP 10 PERCENT
            score
        FROM
            Facility
        WHERE
            [Month] = @Month
        ORDER BY
            score DESC
        ) DerivedTopTenPercent

    RETURN @ReturnValue
END

With that function in place, your final query will look like this...

有了该功能,您的最终查询将如下所示...

SELECT
    [Month],
    AVG(CONVERT(real, score)) AS [Mean],
    STDEV(score) AS [Standard Deviation],
    dbo.NintythPercentile([Month]) AS [90th Percentile]
FROM
    Facility
GROUP BY
    [Month]