Pandas DataFrame to Numpy Array ValueError
Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/31791476/
Asked by Adam_G
I am trying to convert a single column of a dataframe to a numpy array. Converting the entire dataframe has no issues.
df
   viz  a1_count  a1_mean     a1_std
0    0         3        2   0.816497
1    1         0      NaN        NaN
2    0         2       51  50.000000
Both of these functions work fine:
X = df.as_matrix()
X = df.as_matrix(columns=df.columns[1:])
However, when I try:
y = df.as_matrix(columns=df.columns[0])
I get:
TypeError: Index(...) must be called with a collection of some kind, 'viz' was passed
Accepted answer by EdChum
The problem here is that you're passing just a single element, which in this case is the string name of that column. If you convert it to a list with a single element, it works:
In [97]:
y = df.as_matrix(columns=[df.columns[0]])
y
Out[97]:
array([[0],
       [1],
       [0]], dtype=int64)
Here is what you're passing:
In [101]:
df.columns[0]
Out[101]:
'viz'
So it's equivalent to this:
y = df.as_matrix(columns='viz')
which results in the same error.
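The error message can be reproduced without as_matrix at all. The sketch below assumes that as_matrix handed its columns argument to the pd.Index constructor internally, which is an inference from the wording of the message rather than something stated in the thread. The small frame is a stand-in for the one in the question.

```python
import pandas as pd

# Stand-in for the question's frame; only the column labels matter here.
df = pd.DataFrame({"viz": [0, 1, 0], "a1_count": [3, 0, 2]})

label = df.columns[0]   # a single label: the string 'viz'
print(repr(label))

# Building an Index from a bare scalar is rejected with a TypeError.
try:
    pd.Index(label)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)

# Wrapping the label in a list gives the constructor a collection.
wrapped = pd.Index([label])
print(list(wrapped))
```

A string is technically iterable, but pandas treats it as a scalar label, which is why the one-element list is needed.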
The docs show the expected parameters:
DataFrame.as_matrix(columns=None)
Convert the frame to its Numpy-array representation.
Parameters: columns : list, optional, default None. If None, return all columns; otherwise, return the specified columns.
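Note for readers on current pandas: as_matrix was deprecated in pandas 0.23 and removed in 1.0, so the calls in this thread only run on older versions. A rough equivalent, offered as a sketch rather than as part of the original answer, is to select the columns first and then call to_numpy. The frame below is rebuilt from the values printed in the question.

```python
import numpy as np
import pandas as pd

# Frame rebuilt from the values printed in the question.
df = pd.DataFrame({
    "viz": [0, 1, 0],
    "a1_count": [3, 0, 2],
    "a1_mean": [2, np.nan, 51],
    "a1_std": [0.816497, np.nan, 50.0],
})

X = df.to_numpy()                        # whole frame, like df.as_matrix()
X_rest = df[df.columns[1:]].to_numpy()   # every column except the first
y = df[[df.columns[0]]].to_numpy()       # list of one label -> 2-D column
y_flat = df[df.columns[0]].to_numpy()    # bare label -> 1-D array

print(X.shape, X_rest.shape, y.shape, y_flat.shape)
```

Selecting with a one-element list keeps the result two-dimensional, matching the array shown in the accepted answer; selecting with the bare label gives a flat array instead.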
Answer by DeepSpace
as_matrix expects a list for the columns keyword, and df.columns[0] isn't a list. Try
df.as_matrix(columns=[df.columns[0]]) instead.
Answer by Reed Richards
Using the Index tolist function works as well. Note that df.columns[0] is a single label rather than an Index, so slice the columns to get an Index first:
df.as_matrix(columns=df.columns[0:1].tolist())
When selecting multiple columns, for example the first ten, the command
df.as_matrix(columns=[df.columns[0:10]])
does not work, as df.columns[0:10] returns an Index, so the list holds a single Index object rather than the column labels. However, using
df.as_matrix(columns=df.columns[0:10].tolist())
works well.
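The difference between wrapping a slice in a list and calling tolist on it can be checked directly. This is a minimal sketch on a made-up three-column frame; it uses to_numpy for the final conversion because as_matrix is no longer available in pandas 1.0 and later.

```python
import pandas as pd

# Hypothetical small frame; the thread's example used ten columns.
df = pd.DataFrame({
    "viz": [0, 1, 0],
    "a1_count": [3, 0, 2],
    "a1_std": [0.8, 0.1, 50.0],
})

first_two = df.columns[0:2]      # slicing an Index yields another Index
print(type(first_two).__name__)

labels = first_two.tolist()      # plain list of labels
nested = [first_two]             # list holding one Index object
print(labels, len(nested))

arr = df[labels].to_numpy()      # select by the label list, then convert
print(arr.shape)
```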

