How to delete a column from a data frame with pandas?
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/28035839/
Asked by newWithPython
I read my data:
import pandas as pd
df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
print(df)
and get:
id text
0 361.273 text1...
1 374.350 text2...
2 374.350 text3...
How can I delete the id column from the above data frame? I tried the following:
import pandas as pd
df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
print(df.drop('id', axis=1))
But it raises this exception:
ValueError: labels ['id'] not contained in axis
Accepted answer by EdChum
To actually delete the column, del df['id'] or df.drop('id', axis=1) should have worked, provided the name you pass matches the column label exactly.
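As a minimal sketch of the difference between the two approaches (the small frame below is made-up data standing in for the asker's file): drop returns a new frame and leaves the original untouched, while del modifies the frame in place.

```python
import pandas as pd

# Hypothetical data standing in for the asker's file.
df = pd.DataFrame({"id": [361.273, 374.350, 374.350],
                   "text": ["text1", "text2", "text3"]})

# drop returns a new DataFrame; df itself still has both columns.
dropped = df.drop("id", axis=1)
print(dropped.columns.tolist())
print(df.columns.tolist())

# del removes the column from df in place.
del df["id"]
print(df.columns.tolist())
```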
However, if you don't need to delete the column then you can just select the column of interest like so:
In [54]:
df['text']
Out[54]:
0 text1
1 text2
2 textn
Name: text, dtype: object
If you never wanted the column in the first place, you can pass a list of columns to read_csv via the usecols parameter:
In [53]:
import io
temp="""id text
363.327 text1
366.356 text2
37782 textn"""
df = pd.read_csv(io.StringIO(temp), delimiter=r'\s+', usecols=['text'])
df
Out[53]:
text
0 text1
1 text2
2 textn
Regarding your error: it occurs because 'id' is not in your columns, is spelt differently, or has surrounding whitespace. To check this, look at the output of print(df.columns.tolist()), which lists the column names and shows any leading or trailing whitespace.
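A sketch of that check, assuming the cause is a stray space in the header (the in-memory file content here is invented for illustration; the asker's real file may differ):

```python
import io
import pandas as pd

# Hypothetical file content whose header has a trailing space after "id".
raw = "id \ttext\n361.273\ttext1\n374.350\ttext2\n"
df = pd.read_csv(io.StringIO(raw), delimiter="\t")

before = df.columns.tolist()
print(before)  # the stray space is visible inside the quoted label

# Strip whitespace from every column name, then drop as usual.
df.columns = df.columns.str.strip()
df = df.drop("id", axis=1)
print(df.columns.tolist())
```

Printing the list rather than the Index makes each label appear in quotes, so padding is easy to spot.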
Answer by unutbu
df.drop(colname, axis=1) (or del df[colname]) is the correct method to use to delete a column.
If a ValueError is raised (recent pandas versions raise a KeyError instead), it means the column name is not exactly what you think it is.
Check df.columns to see what pandas thinks the names of the columns are.
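One way to surface the mismatch, sketched here with a hypothetical frame whose label has a leading space, is to catch the error and print each label with repr so hidden characters show up:

```python
import pandas as pd

# Hypothetical frame: the first label is " id" (leading space), not "id".
df = pd.DataFrame({" id": [1, 2], "text": ["a", "b"]})

caught = False
try:
    df.drop("id", axis=1)
except (KeyError, ValueError) as exc:
    # Older pandas raised ValueError here; recent versions raise KeyError.
    caught = True
    print(type(exc).__name__, exc)
    # repr shows each label exactly as pandas stores it.
    print([repr(c) for c in df.columns])
```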
Answer by borgr
The best way to delete a column in pandas is to use drop:
df = df.drop('column_name', axis=1)
where 1 is the axis number (0 for rows and 1 for columns).
To delete the column without having to reassign df, you can do:
df.drop('column_name', axis=1, inplace=True)
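A small sketch of what inplace=True changes, using an invented two-column frame: the call mutates df and returns None, so its result should not be assigned back to df.

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({"column_name": [1, 2], "other": [3, 4]})

# With inplace=True the frame is modified and the call returns None.
result = df.drop("column_name", axis=1, inplace=True)
print(result)
print(df.columns.tolist())
```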
Finally, to drop by column number instead of by column label, try this. To delete, e.g., the 1st, 2nd and 4th columns:
df.drop(df.columns[[0, 1, 3]], axis=1) # df.columns is zero-based pd.Index
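To make the positional lookup concrete, here is a sketch on a made-up five-column frame. Indexing df.columns with a list of positions yields the matching labels, and those labels are what drop receives:

```python
import pandas as pd

# Hypothetical frame with five columns.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["a", "b", "c", "d", "e"])

# Positions 0, 1 and 3 are translated into labels first.
to_drop = df.columns[[0, 1, 3]]
print(to_drop.tolist())

remaining = df.drop(to_drop, axis=1)
print(remaining.columns.tolist())
```

Because the drop itself is still by label, a frame with duplicate column names would lose every column sharing a dropped label, not only the one at the given position.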
Exceptions:
If a wrong column number or label is requested, an error will be thrown. To check the number of columns, use df.shape[1] or len(df.columns.values), and to check the column labels, use df.columns.values.
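A sketch of guarding against a bad column number, under the assumption that silently skipping out-of-range positions is acceptable (the frame and the positions list are invented for illustration):

```python
import pandas as pd

# Hypothetical three-column frame.
df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})
n_cols = df.shape[1]

positions = [0, 5]  # hypothetical request; 5 is out of range

index_error = False
try:
    df.columns[positions]
except IndexError:
    # An out-of-range position fails before drop is even called.
    index_error = True

# Keep only positions that exist, then drop by the resulting labels.
valid = [p for p in positions if 0 <= p < n_cols]
trimmed = df.drop(df.columns[valid], axis=1)
print(trimmed.columns.tolist())
```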
This answer was based on @LondonRob's answer and left here to help future visitors of this page.