Python Numpy argsort - what is it doing?

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/17901218/

Time: 2020-08-19 09:25:57  Source: igfitidea

Numpy argsort - what is it doing?

python numpy

Asked by user1276273

Why is numpy giving this result:

import numpy

x = numpy.array([1.48,1.41,0.0,0.1])
print(x.argsort())

>[2 3 1 0]

when I'd expect it to do this:

[3 2 0 1]

Clearly my understanding of the function is lacking.

Accepted answer by falsetru

According to the documentation

Returns the indices that would sort an array.

  • 2 is the index of 0.0.
  • 3 is the index of 0.1.
  • 1 is the index of 1.41.
  • 0 is the index of 1.48.
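
The list above can be checked directly. This is a minimal sketch, assuming NumPy is imported as np; the names order and sorted_x are only illustrative:

```python
import numpy as np

x = np.array([1.48, 1.41, 0.0, 0.1])

# argsort returns, for each position in the sorted output,
# the index in x where that value currently lives.
order = x.argsort()

# Indexing x with those indices yields the sorted array.
sorted_x = x[order]

print(order)     # indices into x, smallest value first
print(sorted_x)  # x rearranged in ascending order
```

Reading order left to right gives the positions of the smallest, second smallest, and so on, which is why it starts with 2 (the position of 0.0).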

Answer by unutbu

[2, 3, 1, 0] indicates that the smallest element is at index 2, the next smallest at index 3, then index 1, then index 0.

There are a number of ways to get the result you are looking for:

import numpy as np
import scipy.stats as stats

def using_indexed_assignment(x):
    "https://stackoverflow.com/a/5284703/190597 (Sven Marnach)"
    result = np.empty(len(x), dtype=int)
    temp = x.argsort()
    result[temp] = np.arange(len(x))
    return result

def using_rankdata(x):
    return stats.rankdata(x)-1

def using_argsort_twice(x):
    "https://stackoverflow.com/a/6266510/190597 (k.rooijers)"
    return np.argsort(np.argsort(x))

def using_digitize(x):
    unique_vals, index = np.unique(x, return_inverse=True)
    return np.digitize(x, bins=unique_vals) - 1


For example,

In [72]: x = np.array([1.48,1.41,0.0,0.1])

In [73]: using_indexed_assignment(x)
Out[73]: array([3, 2, 0, 1])
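
To see why the indexed assignment produces ranks, it may help to view argsort as a permutation and the assignment as its inverse. The following is a sketch of that idea with illustrative variable names, not a replacement for the functions above:

```python
import numpy as np

x = np.array([1.48, 1.41, 0.0, 0.1])

# order[i] is the index in x of the i-th smallest value.
order = np.argsort(x)

# Invert the permutation: the element at index order[i] has rank i.
ranks = np.empty_like(order)
ranks[order] = np.arange(len(x))

# Applying argsort a second time computes the same inverse permutation.
ranks_twice = np.argsort(order)

print(ranks)
print(ranks_twice)
```

Both approaches give the rank of each element in its original position; the indexed assignment avoids the second sort.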


This checks that they all produce the same result:

x = np.random.random(10**5)
expected = using_indexed_assignment(x)
for func in (using_argsort_twice, using_digitize, using_rankdata):
    assert np.allclose(expected, func(x))

These IPython %timeit benchmarks suggest that for large arrays using_indexed_assignment is the fastest:

In [50]: x = np.random.random(10**5)
In [66]: %timeit using_indexed_assignment(x)
100 loops, best of 3: 9.32 ms per loop

In [70]: %timeit using_rankdata(x)
100 loops, best of 3: 10.6 ms per loop

In [56]: %timeit using_argsort_twice(x)
100 loops, best of 3: 16.2 ms per loop

In [59]: %timeit using_digitize(x)
10 loops, best of 3: 27 ms per loop

For small arrays, using_argsort_twice may be faster:

In [78]: x = np.random.random(10**2)

In [81]: %timeit using_argsort_twice(x)
100000 loops, best of 3: 3.45 μs per loop

In [79]: %timeit using_indexed_assignment(x)
100000 loops, best of 3: 4.78 μs per loop

In [80]: %timeit using_rankdata(x)
100000 loops, best of 3: 19 μs per loop

In [82]: %timeit using_digitize(x)
10000 loops, best of 3: 26.2 μs per loop


Note also that stats.rankdata gives you more control over how to handle elements of equal value.
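
As a rough illustration of that control, the sketch below passes a few values of rankdata's method parameter on an array with a tie. The sample data is made up for this example, and it assumes SciPy is installed:

```python
import numpy as np
from scipy import stats

x = np.array([10, 20, 20, 30])  # illustrative data with one tie

# rankdata returns 1-based ranks; "average" is the default method.
average = stats.rankdata(x)
minimum = stats.rankdata(x, method="min")
dense = stats.rankdata(x, method="dense")
ordinal = stats.rankdata(x, method="ordinal")

print(average)  # tied values share the mean of their ranks
print(minimum)  # tied values share the lowest rank
print(dense)    # like "min", but with no gaps after ties
print(ordinal)  # every value gets a distinct rank
```

Because the ranks start at 1, using_rankdata subtracts 1 to match the 0-based results of the argsort-based functions.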

Answer by BrenBarn

As the documentation says, argsort:

Returns the indices that would sort an array.

That means the first element of the argsort is the index of the element that should be sorted first, the second element is the index of the element that should be second, etc.

What you seem to want is the rank order of the values, which is what is provided by scipy.stats.rankdata. Note that you need to think about what should happen if there are ties in the ranks.

Answer by Rodrigo Saraguro

First, the array is sorted. Then an array is generated that holds the original index of each sorted element.

Answer by Multihunter

Just want to directly contrast the OP's original understanding against the actual implementation with code.

numpy.argsort is defined such that for 1D arrays:

x[x.argsort()] == numpy.sort(x) # this will be an array of True's

The OP originally thought that it was defined such that for 1D arrays:

x == numpy.sort(x)[x.argsort()] # this will not be True

Note: This code doesn't work in the general case (only works for 1D), this answer is purely for illustration purposes.
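
The two identities can be checked in code. The sketch below also shows one way to extend the first identity beyond 1D using np.take_along_axis (available in NumPy 1.15 and later); the 2D array a is made-up sample data:

```python
import numpy as np

x = np.array([1.48, 1.41, 0.0, 0.1])
order = x.argsort()

# The actual definition: indexing with argsort gives the sorted array.
definition_holds = np.array_equal(x[order], np.sort(x))

# The OP's expectation: indexing the sorted array should give x back.
# That does not hold for argsort itself ...
expectation_holds = np.array_equal(np.sort(x)[order], x)

# ... but it does hold for the ranks, i.e. argsort applied twice.
ranks = order.argsort()
ranks_hold = np.array_equal(np.sort(x)[ranks], x)

# For N-D arrays, plain indexing a[idx] no longer sorts;
# take_along_axis applies the indices along the chosen axis.
a = np.array([[3, 1, 2],
              [0, 5, 4]])
idx = np.argsort(a, axis=1)
rows_sorted = np.take_along_axis(a, idx, axis=1)
nd_holds = np.array_equal(rows_sorted, np.sort(a, axis=1))

print(definition_holds, expectation_holds, ranks_hold, nd_holds)
```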

Answer by JMpony

input:
import numpy as np
x = np.array([1.48,1.41,0.0,0.1])
x.argsort().argsort()

output:
array([3, 2, 0, 1])

Answer by vivek

np.argsort returns the indices that would sort the array, using the algorithm given by 'kind' (which specifies the type of sorting algorithm). By contrast, np.argmax, when used with a list, returns the index of the largest element in the list, while np.sort returns the sorted array or list itself.
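
A short side-by-side comparison of the three functions, using the array from the question. This is only a sketch; kind="stable" is one of the accepted values of the kind parameter and is chosen here purely for illustration:

```python
import numpy as np

x = np.array([1.48, 1.41, 0.0, 0.1])

indices = np.argsort(x, kind="stable")  # positions, in ascending order of value
largest_at = np.argmax(x)               # position of the single largest value
values = np.sort(x)                     # the values themselves, sorted

# The last entry of argsort points at the maximum, so it matches argmax
# here (x has no tied maximum).
same_position = indices[-1] == largest_at

print(indices, largest_at, values, same_position)
```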

Answer by Yogesh

numpy.argsort(a, axis=-1, kind='quicksort', order=None)

Returns the indices that would sort an array

Perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices, of the same shape as a, that index the data along the given axis in sorted order.
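
To make the axis argument concrete, here is a small sketch on a 2D array. The array a is made-up sample data, not part of the original answer:

```python
import numpy as np

a = np.array([[3, 1, 2],
              [0, 5, 4]])

# axis=-1 (the default) sorts within each row.
by_row = np.argsort(a, axis=-1)

# axis=0 sorts within each column.
by_col = np.argsort(a, axis=0)

print(by_row)        # one set of column indices per row
print(by_col)        # one set of row indices per column
print(by_row.shape)  # same shape as a
```

In both cases the result has the same shape as a, and each entry is an index along the chosen axis.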

Consider an example in Python with the following list of values:

listExample  = [0 , 2, 2456,  2000, 5000, 0, 1]

Now we use the argsort function:

import numpy as np
list(np.argsort(listExample))

The output will be:

[0, 5, 6, 1, 3, 2, 4]

This is the list of indices of the values in listExample. If you map these indices to their respective values, you get the following result:

[0, 0, 1, 2, 2000, 2456, 5000]

(I find this function very useful in many places. For example, if you want the list/array in sorted order but don't want to use the list.sort() function, i.e. you don't want to change the order of the actual values in the list, you can use this function.)
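
One way this might look in practice is sketched below: the indices from argsort reorder both the values and a second, parallel list while leaving the original untouched. The labels list is hypothetical and added only for this example, and kind="stable" is used so that equal values keep their original relative order:

```python
import numpy as np

listExample = [0, 2, 2456, 2000, 5000, 0, 1]
labels = ["a", "b", "c", "d", "e", "f", "g"]  # hypothetical parallel data

# Stable sort: the two zeros stay in their original relative order.
order = np.argsort(listExample, kind="stable")

# Build sorted copies; listExample itself is not modified.
sorted_values = [listExample[i] for i in order]
sorted_labels = [labels[i] for i in order]

print(sorted_values)
print(sorted_labels)
print(listExample)  # still in its original order
```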

For more details, refer to this link: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.argsort.html

Answer by nucsit026

It returns the indices of the elements in sorted order. For the given array [1.48,1.41,0.0,0.1], that means: 0.0 is the first element, at index [2]. 0.1 is the second element, at index [3]. 1.41 is the third element, at index [1]. 1.48 is the fourth element, at index [0]. Output:

[2,3,1,0]