Pandas dropping columns by index drops all columns with the same name
Source: http://stackoverflow.com/questions/35797964/ (StackOverflow, provided under the CC BY-SA 4.0 license; attribute it to the original authors)
Asked by Robin Nemeth
Consider the following DataFrame, which has columns with the same name (apparently this does happen; I currently have a dataset like this! :( )
>>> import pandas as pd
>>> df = pd.DataFrame({"a": range(10, 15), "b": range(5, 10)})
>>> df.rename(columns={"b": "a"}, inplace=True)
>>> df
a a
0 10 5
1 11 6
2 12 7
3 13 8
4 14 9
>>> df.columns
Index(['a', 'a'], dtype='object')
I would expect that when dropping by index, only the column at that index would be removed, but apparently this is not the case.
>>> df.drop(df.columns[-1], axis=1)
0
1
2
3
4
Is there a way to get rid of columns with duplicated column names?
EDIT: I chose misleading values for the first column; fixed now.
EDIT2: the expected outcome is
a
0 10
1 11
2 12
3 13
4 14
Answered by EdChum
Actually just do this:
In [183]:
df.ix[:,~df.columns.duplicated()]
Out[183]:
a
0 10
1 11
2 12
3 13
4 14
So this indexes all rows, then selects columns using the mask generated by duplicated, inverted with ~.
The output from duplicated:
In [184]:
df.columns.duplicated()
Out[184]:
array([False, True], dtype=bool)
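To make the difference concrete, here is a minimal sketch (assuming a reasonably recent pandas; the variable names label, dropped, mask and deduped are only illustrative). It suggests why the label-based drop removes both columns, while the inverted duplicated mask keeps the first one:

```python
import pandas as pd

df = pd.DataFrame({"a": range(10, 15), "b": range(5, 10)})
df = df.rename(columns={"b": "a"})  # both columns are now named "a"

# df.columns[-1] is only the label "a", not a position,
# so drop() matches every column carrying that label.
label = df.columns[-1]
dropped = df.drop(columns=label)

# duplicated() flags each repeat of a label after its first occurrence.
mask = df.columns.duplicated()
deduped = df.loc[:, ~mask]

print(label)          # a
print(dropped.shape)  # (5, 0)
print(mask.tolist())  # [False, True]
print(deduped.shape)  # (5, 1)
```

In other words, indexing df.columns hands back a name, and drop then works by name, so the positional information is lost before drop ever sees it.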
UPDATE
As .ix is deprecated (since v0.20.1), you should do any of the following:
df.iloc[:,~df.columns.duplicated()]
or
df.loc[:,~df.columns.duplicated()]
Thanks to @DavideFiocco for alerting me.
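If the goal is to drop one specific column by position, or to keep the last duplicate rather than the first, the same boolean-mask idea can be adapted. The following is only a sketch: drop_column_at is a hypothetical helper, not a pandas function, and the three-column frame is made-up example data.

```python
import numpy as np
import pandas as pd

# Example data (an assumption for illustration): two columns named "a", one named "b".
df = pd.DataFrame([[10, 5, 1], [11, 6, 2]], columns=["a", "a", "b"])

def drop_column_at(frame, position):
    """Return `frame` without the column at integer `position` (hypothetical helper)."""
    n_cols = frame.shape[1]
    # Boolean mask that is False only at the requested position;
    # the modulo lets negative positions such as -1 work as well.
    keep = np.arange(n_cols) != position % n_cols
    return frame.iloc[:, keep]

# Drops only the second "a" column; the first "a" and "b" remain.
without_second = drop_column_at(df, 1)

# keep="last" retains the last occurrence of each duplicated label instead of the first.
last_kept = df.loc[:, ~df.columns.duplicated(keep="last")]

print(without_second)
print(last_kept)
```

Because the mask is built from positions rather than labels, duplicated names do not affect which column is removed.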