Python: How do I convert a one-hot encoding to integers?

Question

Asked by Hyman

I have a numpy array data set with shape (100, 10). Each row is a one-hot encoding. I want to convert it into an ndarray with shape (100,), where each row vector becomes an integer giving the index of its nonzero entry. Is there a quick way of doing this using numpy or tensorflow?

Answer 1

Accepted answer by JawguyChooser

As pointed out by Franck Dernoncourt, since a one-hot encoding has a single 1 and the rest are zeros, you can use argmax for this particular example. In general, if you want to find a value in a numpy array, you'll probably want to consult numpy.where. See also this Stack Exchange question:

Is there a NumPy function to return the first index of something in an array?

Since a one-hot vector is a vector with all 0s and a single 1, you can do something like this:

>>> import numpy as np
>>> a = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1]])
>>> [np.where(r==1)[0][0] for r in a]
[1, 0, 3]

This just builds a list of the index of the 1 in each row. The [0][0] indexing just strips away the structure (a tuple containing an array) returned by np.where, which is more than you asked for.

For any particular row, you just index into a. For example, in the zeroth row the 1 is found at index 1.

>>> np.where(a[0]==1)[0][0]
1
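
The per-row loop above can also be written as a single call. This is a minimal sketch rather than part of the original answer, and it assumes every row contains exactly one 1: np.where applied to the whole 2-D comparison returns one array of row indices and one of column indices, and the column indices are the integers we want.

```python
import numpy as np

a = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]])

# For a 2-D condition, np.where returns (row_indices, column_indices),
# listed in row-major order, so cols lines up with the rows of a.
rows, cols = np.where(a == 1)
print(cols)  # [1 0 3]
```

If a row had no 1 or more than one, cols would no longer have one entry per row, so this form is only safe for valid one-hot input.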

Answer 2

Answered by Franck Dernoncourt

You can use numpy.argmax or tf.argmax. Example:

import numpy as np
a = np.array([[0,1,0,0],[1,0,0,0],[0,0,0,1]])
print('np.argmax(a, axis=1): {0}'.format(np.argmax(a, axis=1)))

Output:

np.argmax(a, axis=1): [1 0 3]

You may also want to look at sklearn.preprocessing.LabelBinarizer.inverse_transform.
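
One caveat with argmax is that it returns the index of the first maximum, so a row of all zeros silently maps to 0. The sketch below is one possible way to guard against that, not something from the original answer; the helper name onehot_to_int_checked is hypothetical, and the (100, 10) test array is built from made-up labels to mirror the shapes in the question.

```python
import numpy as np

def onehot_to_int_checked(one_hots):
    """Return the column index of the 1 in each row, rejecting invalid rows."""
    one_hots = np.asarray(one_hots)
    only_zeros_and_ones = np.isin(one_hots, (0, 1)).all()
    exactly_one_hot_per_row = (one_hots.sum(axis=1) == 1).all()
    if not (only_zeros_and_ones and exactly_one_hot_per_row):
        raise ValueError("every row must contain a single 1 and otherwise 0s")
    return one_hots.argmax(axis=1)

# Build a (100, 10) one-hot array from known labels, then recover them.
labels = np.arange(100) % 10
one_hots = np.eye(10, dtype=int)[labels]
recovered = onehot_to_int_checked(one_hots)
print(recovered.shape)  # (100,)
```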

Answer 3

Answered by user9114146

Simply use np.argmax(x, axis=1)

Example:

import numpy as np
array = np.array([[0, 1, 0, 0], [0, 0, 0, 1]])
print(np.argmax(array, axis=1))
# Output: [1 3]

Answer 4

Answered by Martin Thoma

While I strongly suggest using numpy for speed, mpu.ml.one_hot2indices(one_hots) shows how to do it without numpy. Simply run pip install mpu --user --upgrade.

Then you can do:

>>> from mpu.ml import one_hot2indices
>>> one_hot2indices([[1, 0], [1, 0], [0, 1]])
[0, 0, 1]

Answer 5

Answered by Iván Sánchez

def int_to_onehot(n, n_classes):
    v = [0] * n_classes
    v[n] = 1
    return v

def onehot_to_int(v):
    return v.index(1)


>>> v = int_to_onehot(2, 5)
>>> v
[0, 0, 1, 0, 0]


>>> i = onehot_to_int(v)
>>> i
2

Answer 6

Answered by Emre Tatbak

You can use this simple code:

a = [[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]]
# Walk the first row and print the position of the 1
for j, value in enumerate(a[0]):
    if value == 1:
        print(j)
        break

5

Answer 7

Answered by Pando MM

What I do in these cases is something like this. The idea is to use the one-hot vector as a mask that selects one entry from a 0, 1, 2, 3, ... array (ramp in the code below).

# Define stuff
import numpy as np
one_hots = np.zeros([100,10])
for k in range(100):
    one_hots[k,:] = np.random.permutation([1,0,0,0,0,0,0,0,0,0])

# Finally, the trick
ramp = np.tile(np.arange(0,10),[100,1])
integers = ramp[one_hots==1].ravel()

I prefer this trick because I feel np.argmax and the other suggested solutions may be slower than indexing (although indexing may consume more memory).
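
Whether indexing really beats np.argmax is easy to measure rather than guess. The following is only a rough sketch of such a check, not a rigorous benchmark: the function names via_argmax and via_ramp, the random labels, and the repeat count are all choices made for illustration, and the timings will vary with machine and array size.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=100)
one_hots = np.eye(10)[labels]  # shape (100, 10), one 1 per row

def via_argmax():
    return np.argmax(one_hots, axis=1)

def via_ramp():
    # Same idea as the answer: mask a tiled 0..9 ramp with the one-hot array.
    ramp = np.tile(np.arange(10), (100, 1))
    return ramp[one_hots == 1]

# Both approaches should recover the labels used to build one_hots.
assert np.array_equal(via_argmax(), labels)
assert np.array_equal(via_ramp(), labels)

repeats = 2000
for fn in (via_argmax, via_ramp):
    seconds = timeit.timeit(fn, number=repeats)
    print(f"{fn.__name__}: {seconds:.4f} s for {repeats} calls")
```

Building ramp once outside the timed function would change the comparison, so it is worth deciding up front whether that setup cost should count.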
