Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/45333530/

Time: 2020-09-14 04:06:22  Source: igfitidea

Pandas - drop columns

python pandas dataframe

Asked by theprowler

I'm aware that dropping a dataframe's columns should be as easy as:

df.drop(df.columns[1], axis=1) to drop by index

or df.dropna(axis=1, how='any') to drop based on whether it contains NaNs.

But neither of those works on my dataframe and I'm not sure if that's because of a format issue or data type issue or a misuse or misunderstanding of these commands.

Here is my dataframe:

fish_frame after append new_column:                         0       1       2      3                          4  \
2                 GBE COD     NaN     NaN    600                        NaN   
3                 GBW COD     NaN  11,189    NaN                        NaN   
4                 GOM COD     NaN       0    NaN  Package Deal - ,753.69   
5                 POLLOCK     NaN     NaN  1,103                        NaN   
6                   WHAKE     NaN     NaN     12                        NaN   
7             GBE HADDOCK     NaN  10,730    NaN                        NaN   
8             GBW HADDOCK     NaN  64,147    NaN                        NaN   
9             GOM HADDOCK     NaN       0    NaN                        NaN   
10                REDFISH     NaN     NaN      0                        NaN   
11         WITCH FLOUNDER     NaN     370    NaN                        NaN   
12                 PLAICE     NaN     NaN    622                        NaN   
13     GB WINTER FLOUNDER  54,315     NaN    NaN                        NaN   
14    GOM WINTER FLOUNDER     653     NaN    NaN                        NaN   
15  SNEMA WINTER FLOUNDER  14,601     NaN    NaN                        NaN   
16          GB YELLOWTAIL     NaN   1,663    NaN                        NaN   
17       SNEMA YELLOWTAIL     NaN   1,370    NaN                        NaN   
18       CCGOM YELLOWTAIL   1,812     NaN    NaN                        NaN   

       6        package_deal_column Package_Price new_column  
2    NaN  Package Deal - ,753.69          None        600  
3    NaN  Package Deal - ,753.69          None    11,1890  
4   None  Package Deal - ,753.69          None          0  
5    NaN  Package Deal - ,753.69          None      1,103  
6    NaN  Package Deal - ,753.69          None         12  
7    NaN  Package Deal - ,753.69          None    10,7300  
8    NaN  Package Deal - ,753.69          None    64,1470  
9    NaN  Package Deal - ,753.69          None          0  
10   NaN  Package Deal - ,753.69          None          0  
11   NaN  Package Deal - ,753.69          None       3700  
12   NaN  Package Deal - ,753.69          None        622  
13  None  Package Deal - ,753.69          None   54,31500  
14  None  Package Deal - ,753.69          None      65300  
15  None  Package Deal - ,753.69          None   14,60100  
16   NaN  Package Deal - ,753.69          None     1,6630  
17   NaN  Package Deal - ,753.69          None     1,3700  
18  None  Package Deal - ,753.69          None    1,81200 

And then I have the following lines of code:

fish_frame.drop(fish_frame.columns[1], axis=1)
fish_frame.drop(fish_frame.columns[2], axis=1)
fish_frame.drop(fish_frame.columns[3], axis=1)
fish_frame.drop(fish_frame.columns[4:5], axis=1)
#del fish_frame[4:5]    #doesn't work, "TypeError: slice(4, 5, None) is an invalid key"
del fish_frame['Package_Price']
fish_frame.dropna(axis=1, how='any')

And then I print out the dataframe again and it comes out as:

NEW fish_frame:                         0       1       2      3                          4  \
2                 GBE COD     NaN     NaN    600                        NaN   
3                 GBW COD     NaN  11,189    NaN                        NaN   
4                 GOM COD     NaN       0    NaN  Package Deal - ,753.69   
5                 POLLOCK     NaN     NaN  1,103                        NaN   
6                   WHAKE     NaN     NaN     12                        NaN   
7             GBE HADDOCK     NaN  10,730    NaN                        NaN   
8             GBW HADDOCK     NaN  64,147    NaN                        NaN   
9             GOM HADDOCK     NaN       0    NaN                        NaN   
10                REDFISH     NaN     NaN      0                        NaN   
11         WITCH FLOUNDER     NaN     370    NaN                        NaN   
12                 PLAICE     NaN     NaN    622                        NaN   
13     GB WINTER FLOUNDER  54,315     NaN    NaN                        NaN   
14    GOM WINTER FLOUNDER     653     NaN    NaN                        NaN   
15  SNEMA WINTER FLOUNDER  14,601     NaN    NaN                        NaN   
16          GB YELLOWTAIL     NaN   1,663    NaN                        NaN   
17       SNEMA YELLOWTAIL     NaN   1,370    NaN                        NaN   
18       CCGOM YELLOWTAIL   1,812     NaN    NaN                        NaN   

       6        package_deal_column new_column  
2    NaN  Package Deal - ,753.69        600  
3    NaN  Package Deal - ,753.69    11,1890  
4   None  Package Deal - ,753.69          0  
5    NaN  Package Deal - ,753.69      1,103  
6    NaN  Package Deal - ,753.69         12  
7    NaN  Package Deal - ,753.69    10,7300  
8    NaN  Package Deal - ,753.69    64,1470  
9    NaN  Package Deal - ,753.69          0  
10   NaN  Package Deal - ,753.69          0  
11   NaN  Package Deal - ,753.69       3700  
12   NaN  Package Deal - ,753.69        622  
13  None  Package Deal - ,753.69   54,31500  
14  None  Package Deal - ,753.69      65300  
15  None  Package Deal - ,753.69   14,60100  
16   NaN  Package Deal - ,753.69     1,6630  
17   NaN  Package Deal - ,753.69     1,3700  
18  None  Package Deal - ,753.69    1,81200  

Neither the NaN drop nor the index drop works. Only the specific drop[column name] command works, but I can't do that for every iteration of this script.

I'm very confused and I hope this isn't a very dumb mistake I'm making.

Also, I don't fully understand this information myself, but printing fish_frame.info() produces:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17 entries, 2 to 18
Data columns (total 8 columns):
0                      17 non-null object
1                      4 non-null object
2                      8 non-null object
3                      5 non-null object
4                      1 non-null object
6                      0 non-null object
package_deal_column    17 non-null object
new_column             17 non-null object
dtypes: object(8)
memory usage: 586.0+ bytes

Any help solving this would be appreciated, thanks.

Answered by A.Kot

If there is no error (and I don't see one in your output), you've simply forgotten to use the inplace parameter:

df.drop(df.columns[1], axis=1, inplace=True)
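
To illustrate the point: by default drop() returns a new DataFrame and leaves the original untouched, so a bare call has no visible effect. Besides inplace=True, reassigning the result works too. Here is a minimal sketch on a made-up frame (the column names a to d and the variable names are assumptions for illustration, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical small frame; the asker's fish_frame is not reproduced here
df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("abcd"))

# drop() returns a new DataFrame; without an assignment the result is discarded
df.drop(df.columns[1], axis=1)
unchanged_columns = list(df.columns)  # df still has a, b, c, d

# Reassigning keeps the result (an alternative to inplace=True)
df = df.drop(df.columns[1], axis=1)
after_reassign = list(df.columns)  # b is gone

# Several positions at once: index df.columns with a list of positions
df = df.drop(df.columns[[1, 2]], axis=1)
print(unchanged_columns, after_reassign, list(df.columns))
```

The same applies to dropna(): it also returns a new object unless the result is assigned back or inplace=True is passed.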

Answered by MaxU

Here are some alternatives:

Setup:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3,5), columns=list('abcde'))

In [57]: cols_to_drop = ['b', 'd']

In [63]: df
Out[63]:
          a         b         c         d         e
0  0.758670  0.734007  0.027711  0.614674  0.955711
1  0.833110  0.242010  0.922831  0.165401  0.546079
2  0.414916  0.949050  0.608527  0.018036  0.230343

Option 1:

df = df[df.columns.drop(cols_to_drop)]

Option 2:

df = df[df.columns.difference(cols_to_drop)]

Option 3:

df = df.loc[:, ~df.columns.isin(cols_to_drop)]

All return:

          a         c         e
0  0.758670  0.027711  0.955711
1  0.833110  0.922831  0.546079
2  0.414916  0.608527  0.230343
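
One caveat worth checking: the three options agree above because the columns are already in alphabetical order. As far as I know, Index.difference() sorts its result by default, so Option 2 can reorder the columns. A small sketch with a made-up frame (the column order d, b, a, c and the names opt1 to opt3 are assumptions for illustration):

```python
import pandas as pd

# Hypothetical frame whose columns are deliberately not in alphabetical order
df = pd.DataFrame([[1, 2, 3, 4]], columns=["d", "b", "a", "c"])
cols_to_drop = ["b"]

opt1 = df[df.columns.drop(cols_to_drop)]          # keeps the original order
opt2 = df[df.columns.difference(cols_to_drop)]    # result is sorted by default
opt3 = df.loc[:, ~df.columns.isin(cols_to_drop)]  # keeps the original order

# Passing sort=False to difference() should preserve the original order
opt2_unsorted = df[df.columns.difference(cols_to_drop, sort=False)]

print(list(opt1.columns))
print(list(opt2.columns))
print(list(opt3.columns))
print(list(opt2_unsorted.columns))
```

If column order matters, Option 1 or Option 3 looks like the safer choice.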

Answered by Loochie

If you are trying to drop the columns with NaN the following code will suffice. Well, I tried it myself and it worked.

df = df.dropna(axis = 1)
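
Note that dropna(axis=1) uses how='any' by default, so it removes every column that contains at least one NaN. On a frame like the one in the question, where most columns are only partly filled, how='all' may be closer to what is wanted, since it removes only columns that are entirely empty. A rough sketch on a made-up three-column frame (the column names and values are assumptions, loosely modelled on the question's data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one full column, one partly filled, one entirely empty
df = pd.DataFrame({
    "species": ["GBE COD", "GBW COD", "POLLOCK"],
    "catch": [np.nan, "11,189", np.nan],
    "empty": [np.nan, None, np.nan],
})

# Default how='any': drop columns containing at least one missing value
any_dropped = df.dropna(axis=1)

# how='all': drop only columns in which every value is missing
all_dropped = df.dropna(axis=1, how="all")

print(list(any_dropped.columns))
print(list(all_dropped.columns))
```

Both calls return new frames; df itself keeps all three columns unless the result is assigned back.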