SQL 想了解更多关于 NTILE()

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/14355324/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 13:09:50  来源:igfitidea点击:

Want to learn more on NTILE()

sqlsql-serversql-server-2008tsql

提问by 10e5x

I was reading on RANKING function for ms sql. I understand the others function except NTILE(). Lets say if i have this data:

我正在阅读 ms sql 的 RANKING 函数。我理解除 NTILE() 之外的其他函数。假设我有这些数据:

   StudentID     MARKS  
      S1           75  
      S2           83
      S3           91
      S4           83
      S5           93  

So if i do a NTILE(2) OVER(ORDER BY MARKS desc)what will be the result and why?
And what if it is a NTILE(3)? Simple explaination anyone?

那么,如果我做一个NTILE(2) OVER(ORDER BY MARKS desc)结果会是什么,为什么?
如果它是一个NTILE(3)? 简单的解释有人吗?

回答by SQLMenace

Think of it as buckets, NTILE(2) will make 2 buckets, half the rows will have the value 1 and the other half the value 2

将其视为存储桶,NTILE(2) 将创建 2 个存储桶,其中一半的行的值为 1,另一半的值为 2

example

例子

create table  #temp(StudentID char(2),    Marks  int) 
insert #temp  values('S1',75 ) 
insert #temp  values('S2',83)
insert #temp  values('S3',91)
insert #temp  values('S4',83)
insert #temp  values('S5',93 ) 


select NTILE(2) over(order by Marks),*
from #temp
order by Marks

Here is the output, since you have an uneven number of rows, bucket 1 will have 1 row more

这是输出,因为您的行数是奇数,所以存储桶 1 将多 1 行

1   S1  75
1   S2  83
1   S4  83
2   S3  91
2   S5  93

If you add one more row

如果再添加一行

insert #temp  values('S6',92 ) 

Now both buckets have 3 rows

现在两个桶都有 3 行

1   S1  75
1   S2  83
1   S4  83
2   S3  91
2   S6  92
2   S5  93

In reality I have never used NTILE in production code but I can see the use where you need to split the results into n number of buckets

实际上,我从未在生产代码中使用过 NTILE,但我可以看到需要将结果拆分为 n 个桶的用途

回答by Aditi

It will arrange the data in descending order of marks and then split it into 2 groups.

它将按照标记的降序排列数据,然后将其分成 2 组。

If the data cannot be split into equal groups, then the first few groups will have more rows than the latter groups.

如果数据不能分成相等的组,那么前几组的行数将比后几组多。

So NTILE(2) will give you

所以 NTILE(2) 会给你

StudentID       MARKS       NTILE  
      S5           93           1 
      S3           91           1 
      S2           83           1
      S4           83           2
      S1           75           2 

Similarly NTILE(3) will give you

同样 NTILE(3) 会给你

StudentID       MARKS       NTILE  
      S5           93           1 
      S3           91           1 
      S2           83           2
      S4           83           2
      S1           75           3  

回答by nick

I use NTILE quite frequently to split email lists into buckets for 10/10/80 testing. For example, we are testing a subject line of an email, and want to send one of two options to 10% each of the list, with the one that performs better being sent to the remaining 80%.

我经常使用 NTILE 将电子邮件列表分成多个桶以进行 10/10/80 测试。例如,我们正在测试一封电子邮件的主题行,并希望将两个选项之一发送到每个列表的 10%,将性能更好的发送到剩余的 80%。

SELECT [field list],(NTILE(10) OVER(order by newid()))-1 AS Segment FROM [data]

SELECT [字段列表],(NTILE(10) OVER(order by newid()))-1 AS Segment FROM [data]

The "order by newid()" ensures a random order. The "[NTILE ...]-1" syntax is a direct result of some of the other tools we use doing text parsing instead of integer math, so it was easier to have the results run from 0-9 rather than 1-10. The segment field will be populated with a value from 0-9 which I can use to separate out 10% of the records quite easily, and more than once for campaigns with multiple efforts to them.

“按 newid() 排序”可确保随机顺序。“[NTILE ...]-1”语法是我们用来进行文本解析而不是整数数学的一些其他工具的直接结果,因此让结果从 0-9 而不是 1-10 运行更容易. 细分字段将填充一个 0-9 之间的值,我可以使用它很容易地分离出 10% 的记录,并且不止一次用于对它们进行多次努力的活动。

If you need a query with replicable results, you would need to use something deterministic in the "order by" clause, or add a column with a GUID to use for the order by clause.

如果您需要具有可复制结果的查询,则需要在“order by”子句中使用确定性的内容,或者添加带有 GUID 的列以用于 order by 子句。

The PARTITION BY clause would be used to create groups of buckets based on state, or profession, or some other predetermined grouping, i.e. NTILE(10) OVER (PARTITION BY State ORDER BY newid()) or some such. I believe the ORDER BY clause is required - the PARTITION BY is optional.

PARTITION BY 子句将用于根据状态、职业或其他一些预先确定的分组来创建桶组,即 NTILE(10) OVER (PARTITION BY State ORDER BY newid()) 等。我相信 ORDER BY 子句是必需的 - PARTITION BY 是可选的。

回答by sam

In Ntile function first it count the number of rows and divide it by the paramenter passed in ntile and then make a equal group of rows according to the quotient and rank them and then remaining rows will distributed by each group from the top in a shifting manner and will not take it from the least rows eg if group1 has 4 rows then it will take 5th row in its group not the last row.

在 Ntile 函数中,它首先计算行数并除以 ntile 中传入的参数,然后根据商将行分成相等的一组并对其进行排序,然后剩余的行将由每组从顶部以移位的方式分配并且不会从最少的行中获取它,例如,如果 group1 有 4 行,那么它将占用其组中的第 5 行而不是最后一行。

Thanks

谢谢

回答by Sagar Byali

Ntile without using partition clause, just divide the dataset based on the number in the ntile(number) such that : if no of rows are 7, example: 1,1,1,2,3,4,5 ntile(3) will give 3,2,2. How did i get 3,2,2?. Firstly assume 7 as 6 (one less to make it even), 6/3 gives 2,2,2 , then add +! to first partition. If the no.of rows are even then no problem. just divide the dataset

Ntile 不使用 partition 子句,只需根据 ntile(number) 中的数字划分数据集,例如:如果行数为 7,则示例:1,1,1,2,3,4,5 ntile(3) 将给 3,2,2。我是怎么得到 3,2,2 的?首先假设 7 为 6(少一个以使其相等),6/3 给出 2,2,2 ,然后添加 +!到第一个分区。如果行数是偶数,那就没问题了。只需划分数据集

Ntile using partition clause, just divide the dataset based on the values in dataset such that : if no of rows are 7,Example row values are: 1,1,1,2,3,4,5 then: ntile(3) partitioned by value will give: 1,2,3,1,1,1,1. How did i get this?. Firstly break the dataset based on values: here, 1,1,1 is one partiton, next all values form a different partition. Then start assigning ntile rank to each partition. Here, 1,1,1 will become 1,2,3 then continue with the next partition, you can pull the rank only till the number specified in ntile() function

Ntile 使用 partition 子句,只需根据数据集中的值划分数据集,例如:如果行数为 7,示例行值为:1,1,1,2,3,4,5 则:ntile(3) partitioned按值将给出:1,2,3,1,1,1,1。我是怎么得到这个的?。首先根据值打破数据集:这里,1,1,1 是一个分区,接下来所有值形成一个不同的分区。然后开始为每个分区分配 ntile rank。在这里,1,1,1 将变成 1,2,3 然后继续下一个分区,您只能将排名拉到 ntile() 函数中指定的数字