Original question: http://stackoverflow.com/questions/45333530/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas - drop columns
Asked by theprowler
I'm aware that dropping a dataframe's columns should be as easy as:
df.drop(df.columns[1], axis=1)
to drop by index
or df.dropna(axis=1, how='any')
to drop columns based on whether they contain NaNs.
But neither of those works on my dataframe and I'm not sure if that's because of a format issue or data type issue or a misuse or misunderstanding of these commands.
Here is my dataframe:
fish_frame after append new_column:
                        0       1       2      3                       4  \
 2                GBE COD     NaN     NaN    600                     NaN
 3                GBW COD     NaN  11,189    NaN                     NaN
 4                GOM COD     NaN       0    NaN  Package Deal - ,753.69
 5                POLLOCK     NaN     NaN  1,103                     NaN
 6                  WHAKE     NaN     NaN     12                     NaN
 7            GBE HADDOCK     NaN  10,730    NaN                     NaN
 8            GBW HADDOCK     NaN  64,147    NaN                     NaN
 9            GOM HADDOCK     NaN       0    NaN                     NaN
10                REDFISH     NaN     NaN      0                     NaN
11         WITCH FLOUNDER     NaN     370    NaN                     NaN
12                 PLAICE     NaN     NaN    622                     NaN
13     GB WINTER FLOUNDER  54,315     NaN    NaN                     NaN
14    GOM WINTER FLOUNDER     653     NaN    NaN                     NaN
15  SNEMA WINTER FLOUNDER  14,601     NaN    NaN                     NaN
16          GB YELLOWTAIL     NaN   1,663    NaN                     NaN
17       SNEMA YELLOWTAIL     NaN   1,370    NaN                     NaN
18       CCGOM YELLOWTAIL   1,812     NaN    NaN                     NaN

       6     package_deal_column  Package_Price  new_column
 2   NaN  Package Deal - ,753.69           None         600
 3   NaN  Package Deal - ,753.69           None     11,1890
 4  None  Package Deal - ,753.69           None           0
 5   NaN  Package Deal - ,753.69           None       1,103
 6   NaN  Package Deal - ,753.69           None          12
 7   NaN  Package Deal - ,753.69           None     10,7300
 8   NaN  Package Deal - ,753.69           None     64,1470
 9   NaN  Package Deal - ,753.69           None           0
10   NaN  Package Deal - ,753.69           None           0
11   NaN  Package Deal - ,753.69           None        3700
12   NaN  Package Deal - ,753.69           None         622
13  None  Package Deal - ,753.69           None    54,31500
14  None  Package Deal - ,753.69           None       65300
15  None  Package Deal - ,753.69           None    14,60100
16   NaN  Package Deal - ,753.69           None      1,6630
17   NaN  Package Deal - ,753.69           None      1,3700
18  None  Package Deal - ,753.69           None     1,81200
And then I have the following lines of code:
fish_frame.drop(fish_frame.columns[1], axis=1)
fish_frame.drop(fish_frame.columns[2], axis=1)
fish_frame.drop(fish_frame.columns[3], axis=1)
fish_frame.drop(fish_frame.columns[4:5], axis=1)
#del fish_frame[4:5] #doesn't work, "TypeError: slice(4, 5, None) is an invalid key"
del fish_frame['Package_Price']
fish_frame.dropna(axis=1, how='any')
And then I print out the dataframe again and it comes out as:
NEW fish_frame:
                        0       1       2      3                       4  \
 2                GBE COD     NaN     NaN    600                     NaN
 3                GBW COD     NaN  11,189    NaN                     NaN
 4                GOM COD     NaN       0    NaN  Package Deal - ,753.69
 5                POLLOCK     NaN     NaN  1,103                     NaN
 6                  WHAKE     NaN     NaN     12                     NaN
 7            GBE HADDOCK     NaN  10,730    NaN                     NaN
 8            GBW HADDOCK     NaN  64,147    NaN                     NaN
 9            GOM HADDOCK     NaN       0    NaN                     NaN
10                REDFISH     NaN     NaN      0                     NaN
11         WITCH FLOUNDER     NaN     370    NaN                     NaN
12                 PLAICE     NaN     NaN    622                     NaN
13     GB WINTER FLOUNDER  54,315     NaN    NaN                     NaN
14    GOM WINTER FLOUNDER     653     NaN    NaN                     NaN
15  SNEMA WINTER FLOUNDER  14,601     NaN    NaN                     NaN
16          GB YELLOWTAIL     NaN   1,663    NaN                     NaN
17       SNEMA YELLOWTAIL     NaN   1,370    NaN                     NaN
18       CCGOM YELLOWTAIL   1,812     NaN    NaN                     NaN

       6     package_deal_column  new_column
 2   NaN  Package Deal - ,753.69         600
 3   NaN  Package Deal - ,753.69     11,1890
 4  None  Package Deal - ,753.69           0
 5   NaN  Package Deal - ,753.69       1,103
 6   NaN  Package Deal - ,753.69          12
 7   NaN  Package Deal - ,753.69     10,7300
 8   NaN  Package Deal - ,753.69     64,1470
 9   NaN  Package Deal - ,753.69           0
10   NaN  Package Deal - ,753.69           0
11   NaN  Package Deal - ,753.69        3700
12   NaN  Package Deal - ,753.69         622
13  None  Package Deal - ,753.69    54,31500
14  None  Package Deal - ,753.69       65300
15  None  Package Deal - ,753.69    14,60100
16   NaN  Package Deal - ,753.69      1,6630
17   NaN  Package Deal - ,753.69      1,3700
18  None  Package Deal - ,753.69     1,81200
Neither the NaN drop nor the index drop works. Only the specific drop[column name] command works, but I can't do that for every iteration of this script.
I'm very confused and I hope this isn't a very dumb mistake I'm making.
Also, I don't fully understand this information myself, but printing fish_frame.info() produces:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17 entries, 2 to 18
Data columns (total 8 columns):
0 17 non-null object
1 4 non-null object
2 8 non-null object
3 5 non-null object
4 1 non-null object
6 0 non-null object
package_deal_column 17 non-null object
new_column 17 non-null object
dtypes: object(8)
memory usage: 586.0+ bytes
Any help solving this would be appreciated, thanks.
Answered by A.Kot
If there is no error (and I don't see one in your output), you've simply forgotten to use the inplace parameter:
df.drop(df.columns[1], axis=1, inplace=True)
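To spell out why this matters: drop and dropna return a new DataFrame and leave the original untouched unless the result is assigned back or inplace=True is passed. The following is only a minimal sketch on a small made-up frame (the data and the name frame are illustrative assumptions, not the asker's actual data). It also drops several positions in one call, which avoids positions shifting between successive drops.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame: integer labels plus a named column.
frame = pd.DataFrame({0: ["GBE COD", "POLLOCK"],
                      1: [np.nan, np.nan],
                      2: ["600", "1,103"],
                      "new_column": ["600", "1,103"]})

# Without assignment the returned frame is discarded; `frame` keeps all columns.
frame.drop(frame.columns[1], axis=1)
columns_after_discarded_drop = list(frame.columns)

# Assigning the result back makes the change stick. Selecting several
# positions at once means no position shifts between separate drop calls.
frame = frame.drop(frame.columns[[1, 2]], axis=1)
columns_after_assigned_drop = list(frame.columns)

print(columns_after_discarded_drop)  # [0, 1, 2, 'new_column']
print(columns_after_assigned_drop)   # [0, 'new_column']
```

Reassigning the result is equivalent here to passing inplace=True; which one to use is mostly a matter of style.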
Answered by MaxU
Here are some alternatives:
Setup:
df = pd.DataFrame(np.random.rand(3,5), columns=list('abcde'))
In [57]: cols_to_drop = ['b', 'd']
In [63]: df
Out[63]:
a b c d e
0 0.758670 0.734007 0.027711 0.614674 0.955711
1 0.833110 0.242010 0.922831 0.165401 0.546079
2 0.414916 0.949050 0.608527 0.018036 0.230343
Option 1:
df = df[df.columns.drop(cols_to_drop)]
Option 2:
df = df[df.columns.difference(cols_to_drop)]
Option 3:
df = df.loc[:, ~df.columns.isin(cols_to_drop)]
All return:
a c e
0 0.758670 0.027711 0.955711
1 0.833110 0.922831 0.546079
2 0.414916 0.608527 0.230343
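One caveat worth checking: the three options give identical results above only because the column labels are already in alphabetical order. As far as I know, Index.difference sorts the resulting labels by default, while the other two options keep the original column order. A small sketch with deliberately unsorted labels (the frame below is made up for illustration):

```python
import numpy as np
import pandas as pd

# Column labels deliberately out of alphabetical order.
df = pd.DataFrame(np.arange(15).reshape(3, 5), columns=list("ebcda"))
cols_to_drop = ["b", "d"]

opt1 = df[df.columns.drop(cols_to_drop)]          # keeps the original order
opt2 = df[df.columns.difference(cols_to_drop)]    # labels come back sorted
opt3 = df.loc[:, ~df.columns.isin(cols_to_drop)]  # keeps the original order

print(list(opt1.columns))  # ['e', 'c', 'a']
print(list(opt2.columns))  # ['a', 'c', 'e']
print(list(opt3.columns))  # ['e', 'c', 'a']
```

If column order matters, Option 1 or Option 3 is the safer choice.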
Answered by Loochie
If you are trying to drop the columns that contain NaN, the following code will suffice. I tried it myself and it worked.
df = df.dropna(axis = 1)
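Note that dropna defaults to how='any', which removes every column containing at least one missing value. In the question's frame that would remove columns 1 through 4 as well as column 6. If the goal is to remove only columns that are entirely empty (column 6 there, with 0 non-null entries), how='all' is probably what is wanted. A minimal sketch on made-up data (the column names are illustrative assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one full column, one partly missing, one entirely missing.
df = pd.DataFrame({"name": ["GBE COD", "POLLOCK", "WHAKE"],
                   "partial": ["600", np.nan, "12"],
                   "empty": [np.nan, None, np.nan]})

# how='any' (the default) drops a column if it has at least one missing value.
any_dropped = df.dropna(axis=1, how="any")

# how='all' drops a column only if every value in it is missing.
all_dropped = df.dropna(axis=1, how="all")

print(list(any_dropped.columns))  # ['name']
print(list(all_dropped.columns))  # ['name', 'partial']
```

Both NaN and None count as missing here, which matches the mix of NaN and None shown in the question's column 6.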