Python Arff Loader: AttributeError: 'dict' object has no attribute 'data'

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/28966434/

Time: 2020-08-19 03:57:02  Source: igfitidea

Tags: python, attributes, runtime-error, arff

Asked by Erdnase

I am trying to load a .arff file into a numpy array using the liac-arff library (https://github.com/renatopp/liac-arff).

This is my code.

import arff, numpy as np
dataset = arff.load(open('mydataset.arff', 'rb'))
data = np.array(dataset.data)

When I execute it, I get this error:

ArffLoader.py", line 8, in <module>
data = np.array(dataset.data)
AttributeError: 'dict' object has no attribute 'data'

I have seen similar threads, such as "Smartsheet Data Tracker: AttributeError: 'dict' object has no attribute 'append'". I am new to Python and am not able to resolve this issue. How can I fix this?

Accepted answer by TheBlackCat

Short version

dataset is a dict. For a dict, you access the values using the Python indexing notation, dataset[key], where key could be a string, integer, float, tuple, or any other immutable data type (it is a bit more complicated than that, more below if you are interested).

In your case, the key is in the form of a string. To access it, you need to give the string you want as an index, like so:

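import arff
import numpy as np

# Text mode assumes Python 3; the question's 'rb' dates from Python 2.
with open('mydataset.arff', 'r') as f:
    dataset = arff.load(f)  # returns a dict
data = np.array(dataset['data'])  # index with the 'data' key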

(you also shouldn't put the imports on the same line, although this is just a readability issue)
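
The difference between the failing and the working access can be reproduced without an .arff file. The sketch below uses a hand-built dict as a stand-in for the object returned by arff.load; its keys and values are assumptions chosen for illustration, not output captured from liac-arff.

```python
# Hypothetical stand-in for the dict returned by arff.load();
# the keys and values here are illustrative assumptions.
dataset = {
    'relation': 'example',
    'attributes': [('width', 'NUMERIC'), ('height', 'NUMERIC')],
    'data': [[5.0, 3.25], [4.5, 3.75]],
}

# Attribute access fails because dict objects have no 'data' attribute.
try:
    dataset.data
except AttributeError as exc:
    error_message = str(exc)

# Key access works: the string 'data' is the key.
rows = dataset['data']

# Listing the keys shows what can be indexed.
available_keys = sorted(dataset.keys())
```

When in doubt about what a loaded object contains, inspecting its keys this way is usually quicker than guessing at attribute names.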

More detailed explanation

dataset is a dict, which in some languages is called a map or hashtable. In a dict, you access values in a similar way to how you index into a list or array, except that the "index" can be any data type that is "hashable" (meaning it has a hash, which is, ideally, a unique identifier for each possible value). This "index" is called a "key". In practice, at least for built-in types and most major packages, only immutable data types are hashable, but there is no actual rule that requires this to be the case.
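
The point about hashable keys can be checked directly. This is a minimal sketch using only built-in types plus one hypothetical class (Point) invented for illustration; it shows which keys a dict accepts and that hashability is not formally tied to immutability.

```python
# Immutable built-ins such as str, int, float and tuple are hashable.
lookup = {
    'name': 'string key',
    42: 'integer key',
    2.5: 'float key',
    (1, 2): 'tuple key',
}
tuple_value = lookup[(1, 2)]

# A list is mutable and not hashable, so it is rejected as a key.
try:
    lookup[[1, 2]] = 'list key'
    list_key_accepted = True
except TypeError:
    list_key_accepted = False

# Instances of a user-defined class are hashable by default,
# even though they are mutable.
class Point:  # hypothetical class, for illustration only
    def __init__(self, x):
        self.x = x

p = Point(1)
lookup[p] = 'custom object key'
p.x = 99  # the default hash is identity-based, so mutation does not affect it
custom_value = lookup[p]
```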

Do you come from MATLAB? If so, then you are probably trying to use MATLAB's struct access technique. You could think of a dict as a much faster, more flexible struct, but the syntax for accessing values is different.
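
For readers used to struct-style dot access, the two syntaxes can be compared side by side. The sketch below is an illustration only: the dataset dict is a hypothetical stand-in, and wrapping it in the standard library's types.SimpleNamespace is a convenience shown here, not something liac-arff does.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a loaded dataset.
dataset = {'relation': 'example', 'data': [[5.0, 3.25], [4.5, 3.75]]}

# Dict syntax: index with the key.
rows_by_key = dataset['data']

# Struct-like syntax: wrap the dict so each string key becomes an attribute.
record = SimpleNamespace(**dataset)
rows_by_attribute = record.data
```

This only works when every key is a string that is also a valid Python identifier, which is one reason plain key indexing is the more general habit.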

Answer by Thirumal Alagu

It's easy to load ARFF data into Python using scipy.

from scipy.io import arff
import pandas as pd

# loadarff returns a (records, metadata) tuple
data = arff.loadarff('dataset.arff')
# the first element holds the records
df = pd.DataFrame(data[0])
df.head()