Python IndexError: too many indices

Question

Asked by ZeeeeeV

I am trying to use an algorithm in scikit-learn to predict an output from a number of inputs. I keep getting the error 'too many indices', but cannot figure out why.

Training CSV file:

    1.1     0.2    0.1    0      0.12    0.1
    1.4     0.2    0.1    0.1    0.14    0.1
    0.1     0.1    0.1    0      0.26    0.1
    24.5    0.1    0      0.1    0.14    0.1
    0.1     0.1    0.1    0      0.25    0.1

Code:

    from numpy import genfromtxt
    from sklearn.svm import LinearSVC

    fileCSVTraining = genfromtxt('TrainingData.csv', delimiter=',', dtype=None)

    # Define first 6 rows of data as the features
    t = fileCSVTraining[:, 6:]

    # Define which column to put prediction in
    r = fileCSVTraining[:, 0-6:]

    # Create and train classifier
    x, y = r, t
    clf = LinearSVC()
    clf = clf.fit(x, y)

    # New data to predict
    X_new = [1.0, 2.1, 3.0, 2.4, 2.1]
    b = clf.predict(X_new)

Error:

    t = fileCSVTraining[:, 6:]
    IndexError: too many indices

Answer 1

Accepted answer by Warren Weckesser

Based on the comments, I think you want:

    fileCSVTraining = genfromtxt('TrainingData.csv')

Then, to get the "first 6 rows", you would use

    t = fileCSVTraining[:6, :]

(I'm assuming your actual data file is longer than you've shown. Your example has only 5 rows.)

I suspect your use of array indexing to get r is also incorrect.
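To see why the original call fails and the plain genfromtxt call works, here is a minimal sketch. It assumes the file is whitespace-delimited, as the sample in the question suggests, and it reads from an in-memory io.StringIO instead of TrainingData.csv (the SAMPLE name is made up for illustration).

```python
import io
import numpy as np

# Hypothetical stand-in for TrainingData.csv: the five rows from the question.
SAMPLE = """\
1.1  0.2 0.1 0   0.12 0.1
1.4  0.2 0.1 0.1 0.14 0.1
0.1  0.1 0.1 0   0.26 0.1
24.5 0.1 0   0.1 0.14 0.1
0.1  0.1 0.1 0   0.25 0.1
"""

# With delimiter=',' there are no commas to split on, so each line is read
# as one text field and the result has a single dimension.
bad = np.genfromtxt(io.StringIO(SAMPLE), delimiter=',', dtype=None,
                    encoding='utf-8')
print(bad.ndim)
try:
    bad[:, 6:]  # two indices on a one-dimensional array
except IndexError as exc:
    print("IndexError:", exc)

# The default delimiter splits on whitespace and yields a 2-D float array.
good = np.genfromtxt(io.StringIO(SAMPLE))
print(good.shape)  # (5, 6)

# Row slice: at most the first 6 rows (the sample only has 5), all columns.
first_rows = good[:6, :]
```

In a slice such as [:6, :], the part before the comma selects rows and the part after it selects columns, which is why [:, 6:] on a six-column array would select no columns at all even once the array is 2-D.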

Answer 2

Answered by ogrisel

Please print your x and y variables; you will likely see why the data is invalid and can fix it accordingly.

Also, the last lines:

    X_new = [1.0, 2.1, 3.0, 2.4, 2.1]
    b = clf.predict(X_new)

should be:

    X_new = [[1.0, 2.1, 3.0, 2.4, 2.1]]
    b = clf.predict(X_new)

as predict expects a collection of samples (a 2D array of shape (n_new_samples, n_features)), not a single sample.
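The shape difference can be checked with NumPy alone, without training a model. This is only a sketch; reshape(1, -1) is one common way to turn a single sample into a one-row 2D array, not something the answer itself prescribes.

```python
import numpy as np

# A flat list is one-dimensional: five values, no notion of "samples".
single = np.array([1.0, 2.1, 3.0, 2.4, 2.1])
print(single.shape)  # (5,)

# Wrapping it in another list gives one sample with five features.
batch = np.array([[1.0, 2.1, 3.0, 2.4, 2.1]])
print(batch.shape)   # (1, 5)

# Equivalent conversion when the sample already exists as a 1-D array.
as_row = single.reshape(1, -1)
print(as_row.shape)  # (1, 5)
```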

Answer 3

Answered by ZeeeeeV

The array indexing used to get r and t was incorrect. Using:

    t = fileCSVTraining[:, 1-0:]

got me the required training data, leaving out the prediction column.
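Note that 1-0 is ordinary arithmetic and evaluates to 1, so that slice is the same as [:, 1:]. Here is a minimal sketch of the resulting split, assuming (as the answer implies) that column 0 holds the value to predict and the remaining columns are the inputs. The array is the sample data from the question, and the names features and target are illustrative.

```python
import numpy as np

# Sample rows from the question, typed in directly for illustration.
data = np.array([
    [1.1,  0.2, 0.1, 0.0, 0.12, 0.1],
    [1.4,  0.2, 0.1, 0.1, 0.14, 0.1],
    [0.1,  0.1, 0.1, 0.0, 0.26, 0.1],
    [24.5, 0.1, 0.0, 0.1, 0.14, 0.1],
    [0.1,  0.1, 0.1, 0.0, 0.25, 0.1],
])

# Every row, columns 1 onward: identical to data[:, 1-0:].
features = data[:, 1:]
# Every row, column 0 only: a 1-D array with one value per row.
target = data[:, 0]

print(features.shape)  # (5, 5)
print(target.shape)    # (5,)
```

With five feature columns, a new sample such as [[1.0, 2.1, 3.0, 2.4, 2.1]] has the matching width.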

Answer 4

Answered by Samsair

It is also important to specify dtype=float. With dtype=None, genfromtxt infers a type for each column separately, so a column that contains only integers gets an integer type while the others are float. Mixed column types force a 1-D (structured) array instead of a 2-D array, and two-index slicing, as shown in the question, does not work on a 1-D array.
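A small sketch of this effect, using made-up two-column data (one column of integers, one of floats) in an in-memory buffer rather than the asker's file:

```python
import io
import numpy as np

# Hypothetical data: first column is all integers, second is all floats.
TEXT = "1 0.5\n2 0.7\n3 0.9\n"

# dtype=None infers a type per column; int and float columns differ,
# so the result is a 1-D structured array with one named field per column.
inferred = np.genfromtxt(io.StringIO(TEXT), dtype=None)
print(inferred.shape, inferred.dtype.names)

# dtype=float converts every column to float and keeps the array 2-D.
as_float = np.genfromtxt(io.StringIO(TEXT), dtype=float)
print(as_float.shape)  # (3, 2)

try:
    inferred[:, 1:]  # two indices on a 1-D array
except IndexError as exc:
    print("IndexError:", exc)

second_col = as_float[:, 1:]  # works on the 2-D array
```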
