Applying an sklearn function to a pandas DataFrame gives ValueError("Unknown label type: %r" % y)

Question

Asked by Alex

The following code gives an error message:

    >>> import pandas as pd
    >>> from sklearn import preprocessing, svm
    >>> df = pd.DataFrame({"a": [0,1,2], "b":[0,1,2], "c": [0,1,2]})
    >>> clf = svm.SVC()
    >>> df = df.apply(lambda x: preprocessing.scale(x))
    >>> clf.fit(df[["a", "b"]], df["c"])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Users\Alexander\Anaconda\lib\site-packages\sklearn\svm\base.py", line 151, in fit
        y = self._validate_targets(y)
      File "C:\Users\Alexander\Anaconda\lib\site-packages\sklearn\svm\base.py", line 515, in _validate_targets
        check_classification_targets(y)
      File "C:\Users\Alexander\Anaconda\lib\site-packages\sklearn\utils\multiclass.py", line 173, in check_classification_targets
        raise ValueError("Unknown label type: %r" % y)
    ValueError: Unknown label type: 0   -1.224745
    1    0.000000
    2    1.224745
    Name: c, dtype: float64

The dtype of the pandas DataFrame is not object, so passing it to the sklearn SVM should be fine, but for some reason the classifier does not recognize the classification labels. What is causing this issue?

Answer 1

Answered by maxymoo

The issue is that after your scaling step, the labels are float-valued, which is not a valid label type for a classifier; if you convert them to int or str, it should work:

    In [32]: clf.fit(df[["a", "b"]], df["c"].astype(int))
    Out[32]: 
    SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
      decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
      max_iter=-1, probability=False, random_state=None, shrinking=True,
      tol=0.001, verbose=False)
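
Note that casting the scaled column with astype(int) truncates the values (for example -1.224745 becomes -1), so the classifier is trained on altered labels. An alternative, sketched below rather than taken from the original answer, is to scale only the feature columns and leave the label column untouched. The names `features`, `scaled_all`, `X`, and `y` are illustrative, and the use of `type_of_target` to inspect how sklearn classifies the labels is my own addition.

```python
import pandas as pd
from sklearn import preprocessing, svm
from sklearn.utils.multiclass import type_of_target

df = pd.DataFrame({"a": [0, 1, 2], "b": [0, 1, 2], "c": [0, 1, 2]})

# Scaling every column (as in the question) turns the integer labels
# in "c" into non-integer floats, which sklearn treats as a regression target.
scaled_all = df.apply(lambda x: preprocessing.scale(x))
print(type_of_target(df["c"]))          # how sklearn sees the original labels
print(type_of_target(scaled_all["c"]))  # how sklearn sees the scaled labels

# Scale only the feature columns; keep the label column as it was.
features = ["a", "b"]  # illustrative name for the feature column list
X = pd.DataFrame(
    preprocessing.scale(df[features]),
    columns=features,
    index=df.index,
)
y = df["c"]

clf = svm.SVC()
clf.fit(X, y)
print(clf.classes_)
```

With this approach the classes the model learns are the original label values, so predictions come back in the same units as the column you started with.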
