Axis error when dropping specific columns Pandas

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/54296214/

Time: 2020-09-14 06:16:51  Source: igfitidea

python, pandas, dataframe

Asked by rmahesh

Based on some analysis, I have identified the specific columns I want to use as predictors for my model. I have captured those column numbers and stored them in a list. I have roughly 80 columns and want to loop through them and drop the columns that are not in this list. X_train is the DataFrame I want to do this on. Here is my code:

cols_selected = [24, 4, 7, 50, 2, 60, 46, 53, 48, 61]
cols_drop = []

for x in range(len(X_train.columns)):
    if x in cols_selected:
        pass
    else:
        X_train.drop([x])

When I run this, I get the following error, which points at the line X_train.drop([x]):

KeyError: '[3] not found in axis'

I am sure I am missing something very simple. I tried adding the inplace=True or axis=1 arguments, and every variant gave the same error message (only the value inside the [] changed).

Any help would be great!

Edit: Here is the change that got this working:

cols_selected = [24, 4, 7, 50, 2, 60, 46, 53, 48, 61]
cols_drop = []

for x in range(len(X_train.columns)):
    if x in cols_selected:
        pass
    else:
        cols_drop.append(x)

# Map the collected positions to column labels, then drop them in one call.
X_train = X_train.drop(X_train.columns[cols_drop], axis=1)
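
The loop above can probably be avoided altogether. The following is a minimal sketch, not the asker's actual data: the 80-column X_train and its column names f0 to f79 are made up for illustration. It shows that dropping the unselected positions gives the same result as keeping the selected positions with iloc.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for X_train: 3 rows, 80 columns named f0..f79.
X_train = pd.DataFrame(
    np.arange(3 * 80).reshape(3, 80),
    columns=[f"f{i}" for i in range(80)],
)
cols_selected = [24, 4, 7, 50, 2, 60, 46, 53, 48, 61]

# Positions to drop: every column position that is not selected.
selected = set(cols_selected)
cols_drop = [i for i in range(X_train.shape[1]) if i not in selected]

# drop() works on labels, so translate positions to labels first.
dropped = X_train.drop(columns=X_train.columns[cols_drop])

# Alternative: keep the selected positions directly.
# sorted() preserves the original left-to-right column order.
kept = X_train.iloc[:, sorted(cols_selected)]

print(dropped.equals(kept))
```

If the order given in cols_selected matters more than the original column order, X_train.iloc[:, cols_selected] without sorted() would keep that order instead.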

Answered by Karn Kumar

Going by the question title, I am assuming the following:

Example DataFrame:

>>> df
   A  B   C   D
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11

Dropping specific columns B and C:

>>> df.drop(['B', 'C'], axis=1)
# To modify df itself instead of getting a new DataFrame, pass inplace=True: df.drop(['B', 'C'], axis=1, inplace=True)
   A   D
0  0   3
1  4   7
2  8  11

If you want to drop them by column number (dropping by position), first look up the labels through df.columns:

>>> df.drop(df.columns[[1, 2]], axis=1)
   A   D
0  0   3
1  4   7
2  8  11

Or, using the columns keyword instead of axis:

>>> df.drop(columns=['B', 'C'])
   A   D
0  0   3
1  4   7
2  8  11
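
One detail worth spelling out is that drop returns a new DataFrame and leaves the original untouched unless the result is assigned back or inplace=True is passed. The sketch below rebuilds the example frame from this answer; the variable names result, unchanged, other, and ret are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 4, 8], "B": [1, 5, 9], "C": [2, 6, 10], "D": [3, 7, 11]})

# drop() returns a new DataFrame; df itself still has all four columns.
result = df.drop(columns=["B", "C"])
unchanged = list(df.columns)

# Option 1: assign the result back.
df = df.drop(columns=["B", "C"])

# Option 2: modify in place. Note that the call then returns None.
other = pd.DataFrame({"A": [0, 4, 8], "B": [1, 5, 9], "C": [2, 6, 10], "D": [3, 7, 11]})
ret = other.drop(columns=["B", "C"], inplace=True)

print(unchanged, list(df.columns), ret)
```

This is also why a bare X_train.drop(...) inside a loop has no lasting effect even when it does not raise.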

Answered by matthiasdenu

In addition to @pygo's point that df.drop takes a keyword argument to designate the axis, you can select the columns you want to keep instead of dropping the rest (this assumes cols_selected contains column labels rather than positions):

X_train = X_train[[col for col in X_train.columns if col in cols_selected]] 

Here is an example:

>>> import numpy as np
>>> import pandas as pd
>>> cols_selected = ['a', 'c', 'e']
>>> X_train = pd.DataFrame(np.random.randint(low=0, high=10, size=(20, 5)), columns=['a', 'b', 'c', 'd', 'e'])
>>> X_train
    a  b  c  d  e
0   4  0  3  5  9
1   8  8  6  7  2
2   1  0  2  0  2
3   3  8  0  5  9
4   5  9  7  8  0
5   1  9  3  5  9 ...
>>> X_train = X_train[[col for col in X_train.columns if col in cols_selected]]
>>> X_train
    a  c  e
0   4  3  9
1   8  6  2
2   1  2  2
3   3  0  9
4   5  7  0
5   1  3  9 ...
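
The comprehension above compares column labels with cols_selected. In the question, cols_selected holds column positions, so a label comparison would match nothing when the columns have string names. A possible adaptation, sketched here with a made-up five-column frame and hypothetical positions, is to test the position from enumerate instead:

```python
import numpy as np
import pandas as pd

# Hypothetical data: string column labels, integer positions to keep.
X_train = pd.DataFrame(np.arange(20).reshape(4, 5), columns=["a", "b", "c", "d", "e"])
cols_selected = [4, 0, 2]  # positions, not labels

# Comparing labels against positions finds no match here.
by_label = [col for col in X_train.columns if col in cols_selected]

# Comparing the position of each column works regardless of the labels.
by_position = [col for i, col in enumerate(X_train.columns) if i in cols_selected]
X_train = X_train[by_position]

print(by_label, by_position)
```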

Answered by 0x51ba

According to the documentation of drop:

Remove rows or columns by specifying label names and corresponding axis, or by specifying directly index or column names

You cannot drop columns simply by passing their positional index; drop expects the column names (labels). Also, the axis parameter has to be set to 1 or 'columns'. Replace X_train.drop([x]) with X_train = X_train.drop(X_train.columns[x], axis='columns') to make that line work. Note that reassigning X_train inside the loop shifts the positions of the remaining columns, so it is safer to collect the labels first and drop them in one call, as in the question's edit.
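
To illustrate why the original call fails, here is a small sketch on a made-up 3x5 frame (the data and the names rows_dropped, raised, and fixed are assumptions for the example). Without an axis argument, drop searches the row labels, so an integer that happens to be a row label removes a row, and one that is not raises the KeyError from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row labels 0, 1, 2 and string column labels.
df = pd.DataFrame(np.arange(15).reshape(3, 5), columns=list("abcde"))

# No axis given: drop() looks at ROW labels. Label 1 exists, so a row is removed.
rows_dropped = df.drop([1])

# Row label 3 does not exist, which raises a KeyError like the one in the question.
try:
    df.drop([3])
    raised = False
except KeyError:
    raised = True

# Translating position 3 to its label ('d') and naming the axis drops the column.
fixed = df.drop(df.columns[3], axis="columns")

print(rows_dropped.shape, raised, list(fixed.columns))
```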