Python "too many indices for array"

Question

Asked by Farhan Javed

I am reading a file in Python using pandas and then storing it in a NumPy array. The file has 11303402 rows x 10 columns. I need to split the data for cross-validation, so I sliced it into an array of examples with 11303402 rows x 9 columns and an array of labels with 11303402 rows x 1 column. The following is the code:

import numpy as np
import pandas as pd

tdata = pd.read_csv('train.csv')
tdata.columns = ['Arrival_Time', 'Creation_Time', 'x', 'y', 'z', 'User', 'Model', 'Device', 'sensor', 'gt']

User_Data = np.array(tdata)
features = User_Data[:, 0:9]
labels = User_Data[:, 9:10]

The error comes in the following code:

classes=np.unique(labels)
idx=labels==classes[0]
Yt=labels[idx]
Xt=features[idx,:]

On the line:

Xt=features[idx,:]

it says 'too many indices for array'

The shapes of all 3 data sets are:

print(np.shape(tdata))     # (11303402, 10)
print(np.shape(features))  # (11303402, 9)
print(np.shape(labels))    # (11303402, 1)

If anyone knows the problem, please help.

Answer 1

Answered by Keith Prussing

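The problem is that idx has shape (11303402, 1), because the logical comparison returns an array of the same shape as labels. A 2-D boolean mask counts as two indices, so it already uses up both dimensions of features, and the trailing : becomes a third index into a 2-D array. The quick workaround is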

Xt=features[idx[:,0],:]
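
The shape issue can be reproduced on a small array. The following is a minimal sketch, assuming a made-up 6 x 10 integer array in place of train.csv; the names user_data, labels_1d, idx_1d and Xt_alt are illustrative and do not come from the original post. It applies the workaround above and also shows an alternative that may be simpler: selecting the label column with an integer index so the mask is 1-D from the start.

```python
import numpy as np

# Small stand-in for the real data: 6 rows x 10 columns (values are made up).
rng = np.random.default_rng(0)
user_data = rng.integers(0, 3, size=(6, 10))

features = user_data[:, 0:9]   # shape (6, 9)
labels = user_data[:, 9:10]    # shape (6, 1): the slice 9:10 keeps the column axis

classes = np.unique(labels)
idx = labels == classes[0]     # boolean mask with shape (6, 1), same as labels

# features[idx, :] would raise IndexError here: the 2-D mask counts as
# two indices, so ":" becomes a third index into a 2-D array.

# Fix 1 (the workaround from the answer): reduce the mask to 1-D first.
Xt = features[idx[:, 0], :]
Yt = labels[idx]               # mask and array have the same shape, so this works

# Fix 2 (an alternative): take the label column with an integer index,
# which drops the column axis and gives a 1-D array and a 1-D mask.
labels_1d = user_data[:, 9]        # shape (6,)
idx_1d = labels_1d == classes[0]   # shape (6,)
Xt_alt = features[idx_1d, :]
```

Both fixes select the same rows. The second one avoids carrying a (n, 1) label array around, which is also the shape many cross-validation utilities expect to be flattened anyway.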
