Python AttributeError: 'numpy.ndarray' object has no attribute 'columns'

Question

Asked by Husterwgm

I'm trying to write a function that removes features that are highly correlated with each other. However, I am getting the error "AttributeError: 'numpy.ndarray' object has no attribute 'columns'" ...

I just want to use pandas to read the number of columns. What can I do next?

import pandas as pd
import numpy as np

def remove_features_identical(DataFrame,data_source):
    n=len(DataFrame.columns)
    print('dealing with %d features of %s data......... \n' % (n, data_source))
    remove_ind = []
    R = np.corrcoef(DataFrame.T)
    for i in range(n-1):
        for j in range(i+1,n):
            if R[i,j]==1:
                remove_ind.append(j)    

    DataFrame.drop(remove_ind, axis=1, inplace=True)
    print('deleting %d columns with correlation factor >0.99' % len(remove_ind))
    return DataFrame

if __name__ == "__main__":
    # load data and initialize y and x from train set and test set
    df_train = pd.read_csv('train.csv')
    df_test = pd.read_csv('test.csv')
    y_train=df_train['TARGET'].values
    X_train =df_train.drop(['ID','TARGET'], axis=1).values
    y_test=[]
    X_test = df_test.drop(['ID'], axis=1).values

    # delete identical features in raw data
    X_train = remove_features_identical(X_train,'train set')
    X_test = remove_features_identical(X_test,'test set')

Answer 1

Answered by hpaulj

Check the Pandas documentation, but I think

X_train =df_train.drop(['ID','TARGET'], axis=1).values

.values returns a numpy array, not a Pandas dataframe. An array does not have a columns attribute.
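
The difference is easy to check directly. This is only a small sketch: the toy frame and its column names `a` and `b` are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the column names are placeholders.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

arr = df.values  # plain numpy.ndarray, the column labels are gone

print(type(df).__name__, hasattr(df, 'columns'))    # DataFrame True
print(type(arr).__name__, hasattr(arr, 'columns'))  # ndarray False

# An array still knows how many columns it has, via its shape.
print(len(df.columns), arr.shape[1])                # 2 2
```

So if only the column count is needed, `arr.shape[1]` works on the array, while `len(df.columns)` needs the dataframe.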

remove_features_identical - if you pass this an array, make sure you use only array features inside it, not dataframe features. Otherwise, make sure you pass it a dataframe. And don't use variable names like DataFrame, which are easily confused with the pandas.DataFrame class.
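
One possible way to follow the second option (pass a dataframe) is sketched below. It is not the asker's final code: the name `remove_identical_features`, the parameter name `frame`, the use of `np.isclose` in place of `== 1`, and the toy data are all assumptions made for illustration. It also maps the integer positions found in the correlation matrix back to column labels before dropping, since `drop` works on labels.

```python
import numpy as np
import pandas as pd


def remove_identical_features(frame, data_source):
    """Return a copy of `frame` without columns that are perfectly
    correlated with an earlier column."""
    n = len(frame.columns)
    print('dealing with %d features of %s data' % (n, data_source))

    # rowvar=False tells corrcoef that each column is one variable.
    # Constant columns yield NaN here and are simply never matched.
    corr = np.corrcoef(frame.values, rowvar=False)

    to_drop = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            # isclose tolerates floating-point rounding around 1.0
            if np.isclose(corr[i, j], 1.0):
                to_drop.add(j)

    # Translate integer positions into column labels for drop().
    labels = [frame.columns[j] for j in sorted(to_drop)]
    print('deleting %d columns with correlation factor 1' % len(labels))
    return frame.drop(columns=labels)


# Hypothetical toy data standing in for train.csv.
df_train = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'x1': [1.0, 2.0, 3.0, 4.0],
    'x2': [2.0, 4.0, 6.0, 8.0],  # exact multiple of x1
    'x3': [1.0, 0.0, 2.0, 1.0],
    'TARGET': [0, 1, 0, 1],
})

# Keep the dataframe (no .values) until the columns have been removed.
features = remove_identical_features(
    df_train.drop(['ID', 'TARGET'], axis=1), 'train set')
X_train = features.values  # convert to an array only at the end
print(list(features.columns))  # ['x1', 'x3']
```

The key change relative to the question is the order of operations: the function receives the dataframe, and `.values` is applied only after the redundant columns are gone.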
