Original source: http://stackoverflow.com/questions/43497472/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
How to convert a DataFrame to a 1D array?
Asked by raj
First of all, apologies. I am very new to pandas, scikit-learn, and Python, so I am sure I am doing something silly. Let me give a little background.
I am trying to run KNeighborsClassifier from scikit-learn (Python). The following is my strategy:
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Reading the training set
data = pd.read_csv(r'Path_TO_File\Train_Set.csv', sep=',')  # reading CSV file (raw string keeps the backslash literal)
X = data[['Attribute 1', 'Attribute 2']]
y = data['Target_Column']  # a single column with many rows
neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(X, y)
Next I try to read the test data:
test = pd.read_csv(r'PATH_TO_FILE\Test.csv', sep=',')
t = test[['Attribute 1', 'Attribute 2']]
pred = neigh.predict(t)
actual = test['Target_Column']
Next I try to check the accuracy with the following call, which throws an error:
accuracy=neigh.score(actual,pred)
ERROR: ValueError: could not convert string to float: N
I checked both actual and pred; they have the following data types and contents:
actual
Out[161]:
Target_Column
0 Y
1 N
:
[614 rows x 1 columns]
pred
Out[162]:
array(['Y', 'N', .....'N'], dtype=object)
N.B.: pred has 614 values.
I tried to convert the "actual" variable to a 1D array so that I might be able to execute the function; however, I was not successful.
I think I need to do the following two things, but I was not able to do so (even after googling):
1) Convert actual into a 1D array.
2) Transpose the 1D array, since pred has 614 columns.
Please let me know how to correct the function.
Thanks in advance!
Raj
Answered by raj
Thanks, Vivek and Thornhale.
Indeed, I was doing two things wrong.
- As you pointed out, I should have been using 1, 0 instead of Y, N.
- I was passing the wrong parameters to the score function. It should be accuracy = neigh.score(t, actual), where t is the test feature set and actual is the test label information.
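A minimal sketch of the corrected call, using a few made-up rows in place of the CSV files (the data values, the label_map name, and the features list are assumptions for illustration, not from the original post). It encodes Y/N as 1/0 and passes the test features and true labels to score:

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Tiny made-up stand-ins for Train_Set.csv and Test.csv (illustrative only).
train = pd.DataFrame({
    "Attribute 1": [0.0, 0.1, 0.2, 1.0, 1.1, 1.2],
    "Attribute 2": [0.0, 0.2, 0.1, 1.0, 1.2, 1.1],
    "Target_Column": ["N", "N", "N", "Y", "Y", "Y"],
})
test = pd.DataFrame({
    "Attribute 1": [0.05, 1.05],
    "Attribute 2": [0.05, 1.05],
    "Target_Column": ["N", "Y"],
})

# Encode the labels as 1/0, as described in the first bullet above.
label_map = {"Y": 1, "N": 0}  # hypothetical helper name
y = train["Target_Column"].map(label_map)
actual = test["Target_Column"].map(label_map)

features = ["Attribute 1", "Attribute 2"]
neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(train[features], y)

t = test[features]
pred = neigh.predict(t)

# score() takes the test features and the true labels (not the predictions);
# it predicts internally and returns the mean accuracy.
accuracy = neigh.score(t, actual)

# The same number computed by hand from pred and actual.
manual = (pred == actual.to_numpy()).mean()
print(accuracy, manual)
```

Passing (actual, pred) instead makes score treat the labels as the feature matrix, which is where the string-to-float conversion error comes from.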
Answered by Thornhale
You could convert your Series, which is what you get when you do "test[COLUMN_NAME]", into an array like so:
actual = np.array(test['Target_Column'])
To then reshape a NumPy array, you would employ this command:
actual.reshape(1, 614) # <- Could be the other way around as well.
Your main issue, though, is that your Series needs to be boolean (as in 0, 1).
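As a rough illustration of the conversion and reshape steps, here is a small sketch on a made-up three-row frame (the data and variable names are assumptions, not from the original post). It shows the difference between selecting with single and double brackets, and how each can be turned into a 1D array:

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the test data (illustrative only).
test = pd.DataFrame({"Target_Column": ["Y", "N", "Y"]})

as_series = test["Target_Column"]    # single brackets: a Series, already 1D
as_frame = test[["Target_Column"]]   # double brackets: a DataFrame of shape (3, 1)

arr_from_series = np.array(as_series)          # shape (3,), same idea as above
arr_from_frame = as_frame.to_numpy().ravel()   # flatten (3, 1) down to (3,)

# reshape returns a new array and leaves the original untouched;
# -1 lets NumPy work out the length instead of hard-coding it.
row = arr_from_series.reshape(1, -1)           # shape (1, 3)

# A 1D array has no row/column orientation, so transposing it changes nothing.
same = arr_from_series.T

print(arr_from_series.shape, arr_from_frame.shape, row.shape, same.shape)
```

Because a 1D array of 614 values has shape (614,), no transpose should be needed to line it up with pred.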