Original source: http://stackoverflow.com/questions/23228244/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me):
Stack Overflow
How do you find the IQR in Numpy?
Asked by Nick T
Is there a baked-in Numpy/Scipy function to find the interquartile range? I can do it pretty easily myself, but mean() exists, which is basically sum/len...
def IQR(dist):
    return np.percentile(dist, 75) - np.percentile(dist, 25)
Accepted answer by Jaime
np.percentile takes multiple percentile arguments, and you are slightly better off doing:
q75, q25 = np.percentile(x, [75, 25])
iqr = q75 - q25
or
iqr = np.subtract(*np.percentile(x, [75, 25]))
than making two calls to percentile:
In [8]: x = np.random.rand(int(1e6))
In [9]: %timeit q75, q25 = np.percentile(x, [75, 25]); iqr = q75 - q25
10 loops, best of 3: 24.2 ms per loop
In [10]: %timeit iqr = np.subtract(*np.percentile(x, [75, 25]))
10 loops, best of 3: 24.2 ms per loop
In [11]: %timeit iqr = np.percentile(x, 75) - np.percentile(x, 25)
10 loops, best of 3: 33.7 ms per loop
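The single-call approach can be wrapped in a small helper. This is only a sketch: the function name `iqr` and its `axis` argument are illustrative choices and not part of NumPy, and it relies on np.percentile returning the requested percentiles stacked along the first axis.

```python
import numpy as np

def iqr(values, axis=None):
    """Interquartile range from a single np.percentile call.

    `iqr` and `axis` are illustrative names, not a NumPy API.
    """
    # Requesting both percentiles at once returns them stacked along axis 0,
    # so they can be unpacked directly.
    q75, q25 = np.percentile(values, [75, 25], axis=axis)
    return q75 - q25

data = np.arange(101)             # 0, 1, ..., 100
print(iqr(data))                  # 75 - 25 = 50.0

table = np.stack([data, 2 * data])
print(iqr(table, axis=1))         # one IQR per row: 50 and 100
```

Passing `axis` through means the same helper should work for row-wise or column-wise IQRs on 2-D data.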
Answer by Mad Physicist
There is now an iqr function in scipy.stats. It is available as of scipy 0.18.0. My original intent was to add it to numpy, but it was considered too domain-specific.
You may be better off just using Jaime's answer, since the scipy code is just an over-complicated version of the same.
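For reference, a minimal sketch of calling scipy.stats.iqr with its default settings and comparing it against the np.percentile approach. It assumes scipy 0.18.0 or later is installed; the random data and the seed are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

generator = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
x = generator.random(1000)

# scipy's built-in interquartile range, default settings
iqr_scipy = stats.iqr(x)

# the same quantity from a single np.percentile call
q75, q25 = np.percentile(x, [75, 25])
iqr_numpy = q75 - q25

print(np.isclose(iqr_scipy, iqr_numpy))
```

With default settings the two values should agree, which fits the remark that the scipy function does the same thing with more options.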
Answer by Ham
Ignore this if Jaime's answer works for your case. But if not, according to this answer, to find the exact values of the 1st and 3rd quartiles, you should consider doing something like:
samples = sorted([28, 12, 8, 27, 16, 31, 14, 13, 19, 1, 1, 22, 13])

def find_median(sorted_list):
    """Return the median of a sorted list and the index or indices it was taken from."""
    list_size = len(sorted_list)
    if list_size % 2 == 0:
        # Even length: average the two middle elements.
        indices = [list_size // 2 - 1, list_size // 2]  # -1 because indexing starts at 0
        median = (sorted_list[indices[0]] + sorted_list[indices[1]]) / 2
    else:
        # Odd length: the single middle element is the median.
        indices = [list_size // 2]
        median = sorted_list[indices[0]]
    return median, indices

median, median_indices = find_median(samples)
# Q1 is the median of the values below the median position(s), Q3 the median of
# the values above; for even-length input this also leaves out both middle values.
Q1, Q1_indices = find_median(samples[:median_indices[0]])
Q3, Q3_indices = find_median(samples[median_indices[-1] + 1:])
IQR = Q3 - Q1
quartiles = [Q1, median, Q3]
Code taken from the referenced answer.
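To see why the choice of method matters, here is a rough comparison on the same sample list between np.percentile with its default linear interpolation and a median-of-halves calculation. The helper `median_of_halves_quartiles` is a hypothetical name introduced for this sketch; it assumes even-length data is split evenly in two, which is one common convention and not necessarily what every textbook or the code above does.

```python
import numpy as np

samples = [28, 12, 8, 27, 16, 31, 14, 13, 19, 1, 1, 22, 13]

# NumPy's default: linear interpolation between order statistics.
q25_np, q75_np = np.percentile(samples, [25, 75])
iqr_np = q75_np - q25_np

def median_of_halves_quartiles(values):
    """Hypothetical helper: Q1/Q3 as medians of the lower and upper halves.

    For odd-length input the middle element is left out of both halves;
    for even-length input the data is split evenly (an assumption).
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return float(np.median(lower)), float(np.median(upper))

q1, q3 = median_of_halves_quartiles(samples)

print(q25_np, q75_np, iqr_np)   # 12.0 22.0 10.0
print(q1, q3, q3 - q1)          # 10.0 24.5 14.5
```

On this 13-element sample the two conventions give IQRs of 10.0 and 14.5, so it is worth checking which definition a given course or report expects.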

