Python: How to convert one-hot encodings into integers?
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/42497340/
How to convert one-hot encodings into integers?
Asked by Hyman
I have a numpy array data set with shape (100,10). Each row is a one-hot encoding. I want to convert it into an nd-array with shape (100,), turning each row vector into an integer that denotes the index of its nonzero entry. Is there a quick way of doing this using numpy or tensorflow?
Accepted answer by JawguyChooser
As pointed out by Franck Dernoncourt, since a one-hot encoding has a single 1 and the rest are zeros, you can use argmax for this particular example. In general, if you want to find a value in a numpy array, you'll probably want to consult numpy.where. Also see this Stack Overflow question:
Is there a NumPy function to return the first index of something in an array?
Since a one-hot vector is a vector with all 0s and a single 1, you can do something like this:
>>> import numpy as np
>>> a = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1]])
>>> [np.where(r==1)[0][0] for r in a]
[1, 0, 3]
This just builds a list holding, for each row, the index of the 1. The [0][0] indexing just discards the structure (a tuple containing an array) returned by np.where, which is more than you asked for.
For any particular row, you just index into a. For example, in the zeroth row the 1 is found at index 1.
>>> np.where(a[0]==1)[0][0]
1
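The row-by-row list comprehension above can also be written without a Python loop. The following is a minimal sketch, assuming every row contains exactly one 1 (the names rows and cols are illustrative). Calling np.where on the whole 2-D array returns one array of row indices and one of column indices, scanned in row-major order, so the column indices are the integers being asked for.

```python
import numpy as np

a = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]])

# np.where on a 2-D boolean array returns a tuple (row_indices, col_indices),
# ordered row by row.
rows, cols = np.where(a == 1)

# With exactly one 1 per row, rows is 0..n-1 and cols holds each row's label.
print(cols)  # [1 0 3]
```

If a row could contain no 1 or several 1s, cols would no longer line up one-to-one with the rows, so this shortcut only applies to valid one-hot input.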
Answer by Franck Dernoncourt
You can use numpy.argmax or tf.argmax. Example:
import numpy as np
a = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1]])
print('np.argmax(a, axis=1): {0}'.format(np.argmax(a, axis=1)))
Output:
np.argmax(a, axis=1): [1 0 3]
You may also want to look at sklearn.preprocessing.LabelBinarizer.inverse_transform.
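One caveat worth keeping in mind with argmax: it does not check that a row is actually one-hot, so an all-zero row silently yields 0. The sketch below wraps argmax with a sanity check. The helper name onehot_to_int_checked is hypothetical and not part of numpy; it assumes the input is a 2-D array of 0s and 1s.

```python
import numpy as np

def onehot_to_int_checked(one_hots):
    """Hypothetical helper: argmax with a sanity check on the input."""
    one_hots = np.asarray(one_hots)
    # A valid one-hot row contains only 0s and 1s and sums to exactly 1.
    is_binary = np.isin(one_hots, (0, 1)).all()
    if not is_binary or not (one_hots.sum(axis=1) == 1).all():
        raise ValueError("every row must contain exactly one 1 and zeros elsewhere")
    return np.argmax(one_hots, axis=1)

print(np.argmax(np.array([[0, 0, 0]]), axis=1))       # [0], even though no 1 is present
print(onehot_to_int_checked([[0, 1, 0], [0, 0, 1]]))  # [1 2]
```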
Answer by user9114146
Simply use np.argmax(x, axis=1).
Example:
import numpy as np
array = np.array([[0, 1, 0, 0], [0, 0, 0, 1]])
print(np.argmax(array, axis=1))
> [1 3]
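To tie this back to the shapes in the question, here is a small round-trip sketch: it builds a (100, 10) one-hot array from random integer labels and checks that argmax recovers them with shape (100,). The variable names and the use of np.eye to build the one-hot rows are illustrative choices, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=100)      # integer classes, shape (100,)

# Row k of the identity matrix is the one-hot vector for class k.
one_hots = np.eye(10, dtype=int)[labels]    # shape (100, 10)

recovered = np.argmax(one_hots, axis=1)     # shape (100,)
print(one_hots.shape, recovered.shape)      # (100, 10) (100,)
print(np.array_equal(recovered, labels))    # True
```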
Answer by Martin Thoma
While I strongly suggest using numpy for speed, mpu.ml.one_hot2indices(one_hots) shows how to do it without numpy. Simply pip install mpu --user --upgrade.
Then you can do:
>>> from mpu.ml import one_hot2indices
>>> one_hot2indices([[1, 0], [1, 0], [0, 1]])
[0, 0, 1]
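For readers who would rather not install a package, the same result can be sketched in plain Python. This is an illustration only and not necessarily how mpu implements it; one_hot_rows_to_indices is a hypothetical name, and the sketch assumes each row is a list containing exactly one 1.

```python
def one_hot_rows_to_indices(one_hots):
    """Hypothetical helper: index of the first 1 in each row (plain Python)."""
    # list.index raises ValueError if a row contains no 1.
    return [row.index(1) for row in one_hots]

print(one_hot_rows_to_indices([[1, 0], [1, 0], [0, 1]]))  # [0, 0, 1]
```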
Answer by Iván Sánchez
def int_to_onehot(n, n_classes):
    v = [0] * n_classes
    v[n] = 1
    return v

def onehot_to_int(v):
    return v.index(1)

>>> v = int_to_onehot(2, 5)
>>> v
[0, 0, 1, 0, 0]
>>> i = onehot_to_int(v)
>>> i
2
Answer by Emre Tatbak
You can use this simple code:
a = [[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]]
j = 0  # counts the zeros seen before the 1
for i in a[0]:
    if i == 1:
        print(j)
    else:
        j += 1
Output:
5
Answer by Pando MM
What I do in these cases is something like this. The idea is to use the one-hot array as a boolean mask that selects entries from a 0, 1, 2, 3, ... array.
# Define stuff
import numpy as np
one_hots = np.zeros([100,10])
for k in range(100):
    one_hots[k,:] = np.random.permutation([1,0,0,0,0,0,0,0,0,0])
# Finally, the trick
ramp = np.tile(np.arange(0,10),[100,1])
integers = ramp[one_hots==1].ravel()
I prefer this trick because I feel np.argmax and other suggested solutions may be slower than indexing (although indexing may consume more memory).
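Whether mask indexing actually beats argmax is easy to measure rather than guess. The sketch below times both with the standard-library timeit module on data of the question's shape. The function names by_mask and by_argmax are illustrative, the ramp array is built once outside the timed code, and results will vary by machine and numpy version, so no timings are claimed here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
one_hots = np.eye(10)[rng.integers(0, 10, size=100)]   # shape (100, 10)
ramp = np.tile(np.arange(10), [100, 1])

def by_mask():
    # Boolean-mask indexing, as in the trick above.
    return ramp[one_hots == 1]

def by_argmax():
    return np.argmax(one_hots, axis=1)

# Both approaches agree on valid one-hot input.
assert np.array_equal(by_mask(), by_argmax())

for fn in (by_mask, by_argmax):
    seconds = timeit.timeit(fn, number=10_000)
    print(f"{fn.__name__}: {seconds:.4f} s for 10,000 calls")
```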