Original question: http://stackoverflow.com/questions/29763620/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to select all columns, except one column in pandas?
Asked by markov zain
I have a DataFrame that looks like this:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(4,4), columns = list('abcd'))
df
a b c d
0 0.418762 0.042369 0.869203 0.972314
1 0.991058 0.510228 0.594784 0.534366
2 0.407472 0.259811 0.396664 0.894202
3 0.726168 0.139531 0.324932 0.906575
How can I get all columns except column b?
Accepted answer by Marius
When the columns are not a MultiIndex, df.columns is just an array of column names, so you can do:
df.loc[:, df.columns != 'b']
a c d
0 0.561196 0.013768 0.772827
1 0.882641 0.615396 0.075381
2 0.368824 0.651378 0.397203
3 0.788730 0.568099 0.869127
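To see why this works, it can help to look at the mask itself. The following is a minimal sketch, assuming a small hand-made frame (the values are made up for illustration and are not from the question):

```python
import pandas as pd

# Hypothetical data, chosen only so the result is easy to check.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})

# Comparing the column Index to a label gives a boolean array,
# one entry per column.
mask = df.columns != "b"
print(mask.tolist())  # [True, False, True, True]

# .loc[rows, columns] accepts that boolean array as a column selector.
result = df.loc[:, mask]
print(list(result.columns))  # ['a', 'c', 'd']
```

The original df is not modified; result is a new DataFrame.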
Answer by Mike
Don't use ix. It's deprecated. The most readable and idiomatic way of doing this is df.drop():
>>> df
a b c d
0 0.175127 0.191051 0.382122 0.869242
1 0.414376 0.300502 0.554819 0.497524
2 0.142878 0.406830 0.314240 0.093132
3 0.337368 0.851783 0.933441 0.949598
>>> df.drop('b', axis=1)
a c d
0 0.175127 0.382122 0.869242
1 0.414376 0.554819 0.497524
2 0.142878 0.314240 0.093132
3 0.337368 0.933441 0.949598
Note that by default, .drop() does not operate in place; despite the ominous name, df is unharmed by this process. If you want to permanently remove b from df, do df.drop('b', axis=1, inplace=True).
df.drop() also accepts a list of labels, e.g. df.drop(['a', 'b'], axis=1) will drop columns a and b.
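As a quick check of the points above, here is a small sketch, assuming a hand-made frame with made-up values. It also uses the columns= keyword, which recent pandas versions (0.21 and later) accept as an alternative spelling of axis=1:

```python
import pandas as pd

# Hypothetical data, chosen only so the result is easy to check.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})

# drop() returns a new DataFrame; df itself is left untouched.
without_b = df.drop("b", axis=1)

# columns= is an equivalent spelling of axis=1.
also_without_b = df.drop(columns="b")

print(list(df.columns))                  # ['a', 'b', 'c', 'd']
print(list(without_b.columns))           # ['a', 'c', 'd']
print(without_b.equals(also_without_b))  # True
```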
Answer by Salvador Dali
Here is another way:
df[[i for i in list(df.columns) if i != '<your column>']]
You just pass all the columns to be shown, except for the one you do not want.
Answer by ayhan
df[df.columns.difference(['b'])]
Out:
a c d
0 0.427809 0.459807 0.333869
1 0.678031 0.668346 0.645951
2 0.996573 0.673730 0.314911
3 0.786942 0.719665 0.330833
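One thing worth checking with this approach is column order: Index.difference returns its result sorted by default, so the columns may come back in a different order from the original frame. A small sketch, assuming a frame whose columns are deliberately not in alphabetical order (the data is made up):

```python
import pandas as pd

# Hypothetical frame with columns in the order d, b, a, c.
df = pd.DataFrame({"d": [1, 2], "b": [3, 4], "a": [5, 6], "c": [7, 8]})

# difference() removes 'b' and sorts what is left.
kept = df.columns.difference(["b"])
print(list(kept))  # ['a', 'c', 'd']

# A boolean mask keeps the original column order instead.
ordered = df.loc[:, df.columns != "b"]
print(list(ordered.columns))  # ['d', 'a', 'c']
```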
Answer by Sudhi
I think the best way is the one mentioned by @Salvador Dali. Not that the others are wrong.
Sometimes you have a data set where you want to select one column into one variable and put the rest of the columns into another, for comparison or computation. In that case, dropping the column from the data set might not help. Of course, there are use cases for dropping as well.
x_cols = [x for x in data.columns if x != 'name of column to be excluded']
You can then use the column names collected in x_cols to select those columns into another variable, such as x_cols1, for other computation.
For example:
x_cols1 = data[x_cols]
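Put together, the split described in this answer might look like the following sketch. The frame and the column name 'target' are assumptions made up for illustration; only x_cols, x_cols1 and data come from the answer:

```python
import pandas as pd

# Hypothetical data set; 'target' is an assumed column name.
data = pd.DataFrame({"x1": [1, 2, 3], "x2": [4, 5, 6], "target": [0, 1, 0]})

# The single column goes into one variable...
y = data["target"]

# ...and every other column goes into another.
x_cols = [x for x in data.columns if x != "target"]
x_cols1 = data[x_cols]

print(list(x_cols1.columns))  # ['x1', 'x2']
print(list(data.columns))     # ['x1', 'x2', 'target']
```

Because nothing is dropped, data still holds all of its columns afterwards.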
Answer by user1718097
Another slight modification to @Salvador Dali's answer allows a list of columns to be excluded:
df[[i for i in list(df.columns) if i not in list_of_columns_to_exclude]]
or
df.loc[:,[i for i in list(df.columns) if i not in list_of_columns_to_exclude]]
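A short sketch of this variant, assuming list_of_columns_to_exclude is an ordinary Python list of column names (the frame and the excluded names below are made up for illustration):

```python
import pandas as pd

# Hypothetical single-row frame, chosen only so the result is easy to check.
df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4]})

# Assumed to be a plain list of labels.
list_of_columns_to_exclude = ["b", "d"]

# Both spellings select the same columns.
kept = df[[i for i in df.columns if i not in list_of_columns_to_exclude]]
kept_loc = df.loc[:, [i for i in df.columns if i not in list_of_columns_to_exclude]]

print(list(kept.columns))     # ['a', 'c']
print(kept.equals(kept_loc))  # True
```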
Answer by Tom
You can use df.columns.isin():
df.loc[:, ~df.columns.isin(['b'])]
When you want to drop multiple columns, it is just as simple:
df.loc[:, ~df.columns.isin(['col1', 'col2'])]
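One practical difference from df.drop() is how names that do not exist are handled. The sketch below, which assumes a made-up frame and a made-up missing label 'zzz', suggests that isin() simply ignores the missing name, while drop() raises a KeyError by default:

```python
import pandas as pd

# Hypothetical data; 'zzz' is deliberately not a column.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# isin() ignores names that are not present.
kept = df.loc[:, ~df.columns.isin(["b", "zzz"])]
print(list(kept.columns))  # ['a', 'c']

# drop() raises KeyError for a missing label by default.
try:
    df.drop(["b", "zzz"], axis=1)
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```

Which behaviour is preferable depends on whether a misspelled column name should fail loudly or pass silently.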
Answer by Grant Shannon
Here is a one-line lambda:
df.loc[:, list(map(lambda x: x not in ['b'], df.columns))]
before:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(4,4), columns = list('abcd'))
df
a b c d
0 0.774951 0.079351 0.118437 0.735799
1 0.615547 0.203062 0.437672 0.912781
2 0.804140 0.708514 0.156943 0.104416
3 0.226051 0.641862 0.739839 0.434230
after:
df.loc[:, list(map(lambda x: x not in ['b'], df.columns))]
a c d
0 0.774951 0.118437 0.735799
1 0.615547 0.437672 0.912781
2 0.804140 0.156943 0.104416
3 0.226051 0.739839 0.434230