Python Arff Loader: AttributeError: 'dict' object has no attribute 'data'
Original source: http://stackoverflow.com/questions/28966434/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Arff Loader : AttributeError: 'dict' object has no attribute 'data'
Asked by Erdnase
I am trying to load a .arff file into a numpy array using the liac-arff library (https://github.com/renatopp/liac-arff).
This is my code:
import arff, numpy as np
dataset = arff.load(open('mydataset.arff', 'rb'))
data = np.array(dataset.data)
When I run it, I get this error:
ArffLoader.py", line 8, in <module>
data = np.array(dataset.data)
AttributeError: 'dict' object has no attribute 'data'
I have seen similar threads, such as "Smartsheet Data Tracker: AttributeError: 'dict' object has no attribute 'append'". I am new to Python and have not been able to resolve this issue. How can I fix it?
Accepted answer by TheBlackCat
Short version
dataset is a dict. For a dict, you access the values using Python's indexing notation, dataset[key], where key can be a string, integer, float, tuple, or any other immutable data type (it is a bit more complicated than that; see the more detailed explanation further down if you are interested).
In your case, the key is a string. To access the value, you need to give the string you want as an index, like so:
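import arff
import numpy as np

# Text mode: on Python 3, liac-arff parses str lines (the original 'rb' dates from Python 2)
with open('mydataset.arff', 'r') as f:
    dataset = arff.load(f)
# arff.load returns a dict, so the rows are looked up with the 'data' key
data = np.array(dataset['data'])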
(You also shouldn't put the imports on the same line, although this is just a readability issue.)
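The difference between attribute access and key access can be reproduced without any .arff file. The sketch below uses a hand-written dict as a stand-in for what arff.load returns; the relation name, attribute names, and row values are made up for illustration.

```python
# Hypothetical stand-in for the dict returned by arff.load();
# the keys mirror liac-arff's layout, the values are invented.
dataset = {
    "relation": "weather",
    "attributes": [("temperature", "REAL"), ("humidity", "REAL")],
    "data": [[85.0, 85.0], [80.0, 90.0]],
}

# Attribute access fails, exactly as in the question.
try:
    dataset.data
except AttributeError as exc:
    error_message = str(exc)

# Key access works: the string 'data' is the index.
rows = dataset["data"]

# Listing the keys shows what can be indexed.
available_keys = sorted(dataset.keys())
```

Printing dataset.keys() on the real object is a quick way to check which keys are available before indexing.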
More detailed explanation
dataset is a dict, which in some languages is called a map or hashtable. In a dict, you access values much as you index a list or array, except that the "index" can be any data type that is "hashable" (that is, one for which a hash value can be computed that is, ideally, a unique identifier for each possible value). This "index" is called a "key". In practice, at least for built-in types and most major packages, only immutable data types are hashable, but there is no actual rule that requires this to be the case.
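As a rough illustration of which objects can serve as keys, the sketch below tries a few built-in types and one user-defined class. The names lookup and Sample are hypothetical and exist only for this example.

```python
# Immutable built-in types are hashable, so they work as keys.
lookup = {
    "data": 1,        # string key
    42: 2,            # integer key
    3.5: 3,           # float key
    ("x", "y"): 4,    # tuple key
}

# A list is mutable and not hashable, so it is rejected as a key.
try:
    lookup[["x", "y"]] = 5
except TypeError as exc:
    list_key_error = str(exc)


class Sample:
    """Hypothetical mutable class; instances hash by identity by default."""

    def __init__(self, name):
        self.name = name


# Nothing forces hashable objects to be immutable: this instance
# is still found after one of its attributes changes.
sample = Sample("a")
by_object = {sample: "first"}
sample.name = "b"
still_found = by_object[sample]
```

The Sample case shows why "immutable means hashable" is a convention of the built-in types and not a rule of the language.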
Do you come from MATLAB? If so, then you are probably trying to use MATLAB's struct access technique. You could think of a dict as a much faster, more flexible struct, but the syntax for accessing values is different.
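For readers who do prefer struct-style dotted access, one possible approach (not part of the original answer) is to wrap the dict in the standard library's types.SimpleNamespace. The sketch below again uses an invented stand-in dict, and it assumes every key is a valid Python identifier.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a loaded dataset dict.
dataset = {"relation": "weather", "data": [[85.0, 85.0], [80.0, 90.0]]}

# Normal dict access uses the key as an index.
rows_by_key = dataset["data"]

# Wrapping the dict gives MATLAB-like dotted access to the same values.
as_struct = SimpleNamespace(**dataset)
rows_by_attribute = as_struct.data
```

Plain indexing with dataset['data'] remains the more common idiom and works for keys that are not valid identifiers.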
Answer by Thirumal Alagu
It's easy to load ARFF data into Python using scipy.
from scipy.io import arff
import pandas as pd

# loadarff returns a (data, meta) tuple; data is a NumPy record array
data, meta = arff.loadarff('dataset.arff')
df = pd.DataFrame(data)
df.head()