pandas: How to convert a DataFrame to a 1D array?

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/43497472/

Date: 2020-09-14 03:25:19  Source: igfitidea

How to convert a DataFrame to a 1D array?

python pandas dataframe scikit-learn

Asked by raj

First of all, apologies: I am very new to pandas, scikit-learn, and Python, so I am sure I am doing something silly. Let me give a little background.

I am trying to run KNeighborsClassifier from scikit-learn (Python). My approach is as follows:

import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Read the training set (raw string so the backslash is not treated as an escape)
data = pd.read_csv(r'Path_TO_File\Train_Set.csv', sep=',')
X = data[['Attribute 1', 'Attribute 2']]  # feature columns, 2D
y = data['Target_Column']  # single brackets select one column with many rows
neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(X, y)

Next, I read the test data:

test = pd.read_csv(r'PATH_TO_FILE\Test.csv', sep=',')  # raw string keeps the backslash literal
t = test[['Attribute 1', 'Attribute 2']]  # test features
pred = neigh.predict(t)  # predicted labels
actual = test['Target_Column']  # true labels

Next, I try to check the accuracy with the following call, which throws an error:

accuracy=neigh.score(actual,pred)

ERROR: ValueError: could not convert string to float: N


I checked both actual and pred; their data types and contents are as follows:

actual
Out[161]: 
    Target_Column
0             Y
1             N
:

[614 rows x 1 columns]

pred
Out[162]: 
array(['Y', 'N', .....'N'], dtype=object)

N.B.: pred has 614 values.

I tried to convert the "actual" variable to a 1D array, hoping that would let me execute the function, but I was not successful.

I think I need to do the following two things, but I was not able to work out how (even after googling):

1) Convert actual into a 1D array. 2) Transpose the 1D array, since pred has 614 columns.
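
One way to approach the first point can be sketched on a small made-up frame (the column name follows the question; the three rows are hypothetical). A single-column DataFrame has shape (n, 1), and flattening its values gives a 1D array of shape (n,). A 1D NumPy array has no separate row or column orientation, so the transpose in the second point should not be needed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the single-column frame shown in the question.
actual_df = pd.DataFrame({"Target_Column": ["Y", "N", "Y"]})

# Option 1: take the 2D values and flatten them.
flat = actual_df.to_numpy().ravel()

# Option 2: select the column as a Series, then convert it.
from_series = actual_df["Target_Column"].to_numpy()

print(actual_df.shape)  # 2D: one column
print(flat.shape)       # 1D
print(flat.T.shape)     # transposing a 1D array leaves its shape unchanged
```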

Please let me know how to correct the function call.

Thanks in advance! Raj

Answered by raj

Thanks, Vivek and Thornhale.

Indeed, I was doing two things wrong.

  1. As you pointed out, I should have been using 1, 0 instead of Y, N.
  2. I was passing the wrong parameters to the score function. It should be accuracy=neigh.score(t, actual), where t is the test feature set and actual is the test label information.
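
The corrected call can be sketched as follows. The tiny data set is made up for illustration; only the column names and the score(t, actual) call come from the thread. score takes the test features first and the true labels second, predicts internally, and returns the mean accuracy, so it should agree with comparing pred against actual directly. The labels are left as 'Y'/'N' strings in this sketch, which the classifier appears to accept (the pred array in the question already contains 'Y' and 'N').

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# Made-up training and test data; column names follow the question.
train = pd.DataFrame({
    "Attribute 1": [0.0, 0.1, 0.2, 1.0, 1.1, 1.2],
    "Attribute 2": [0.0, 0.2, 0.1, 1.0, 1.2, 1.1],
    "Target_Column": ["N", "N", "N", "Y", "Y", "Y"],
})
test = pd.DataFrame({
    "Attribute 1": [0.05, 1.05, 0.15],
    "Attribute 2": [0.1, 1.1, 0.05],
    "Target_Column": ["N", "Y", "N"],
})

features = ["Attribute 1", "Attribute 2"]
neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(train[features], train["Target_Column"])

t = test[features]               # test features, 2D
actual = test["Target_Column"]   # true labels, 1D
pred = neigh.predict(t)

# score(X, y): features first, true labels second.
accuracy = neigh.score(t, actual)

# Equivalent check against the predictions already computed.
same = accuracy_score(actual, pred)
print(accuracy, same)
```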

Answered by Thornhale

You can convert the Series, which is what you get when you do "test[COLUMN_NAME]", into an array like so:

import numpy as np
actual = np.array(test['Target_Column'])

To then reshape a NumPy array, you would use this command:

actual = actual.reshape(1, 614) # <- Could be the other way around as well; 614 matches the row count in the question.
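
To make the "other way around" remark concrete, here is a small sketch on a made-up five-element array (the array in the question has 614 entries). Passing -1 lets NumPy infer that dimension, and reshape returns a new array rather than changing the original in place.

```python
import numpy as np

# Hypothetical short label array standing in for the test labels.
actual = np.array(["Y", "N", "N", "Y", "N"])

row = actual.reshape(1, -1)     # one row, n columns
column = actual.reshape(-1, 1)  # n rows, one column
flat = column.ravel()           # back to 1D

print(row.shape, column.shape, flat.shape)
```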

Your main issue, though, is that your Series needs to be boolean (as in 0, 1).
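
If numeric labels are wanted, one possible way to encode the column is Series.map with an explicit dictionary. This is a sketch on made-up data, and the variable names are illustrative. Note that scikit-learn classifiers can generally work with string class labels as well, so this step may be optional.

```python
import pandas as pd

# Hypothetical label column using the Y/N values from the question.
actual = pd.Series(["Y", "N", "N", "Y"], name="Target_Column")

# Map each label to an integer; values missing from the dict would become NaN.
encoded = actual.map({"Y": 1, "N": 0})

# Convert the encoded Series to a 1D NumPy array.
as_array = encoded.to_numpy()
print(as_array)
```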