Python: How to delete a column from a data frame with pandas?

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/28035839/

Time: 2020-08-19 02:39:02  Source: igfitidea

python python-2.7 pandas csv io

Asked by newWithPython

I read my data:

import pandas as pd
df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
print(df)

and get:

          id    text
0    361.273    text1...
1    374.350    text2...
2    374.350    text3...

How can I delete the id column from the above data frame? I tried the following:

import pandas as pd
df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
print(df.drop('id', axis=1))

But it raises this exception:

ValueError: labels ['id'] not contained in axis

Accepted answer by EdChum

To actually delete the column, del df['id'] or df.drop('id', axis=1) should have worked if the passed column name matches exactly.

However, if you don't need to delete the column then you can just select the column of interest like so:

In [54]:

df['text']
Out[54]:
0    text1
1    text2
2    textn
Name: text, dtype: object

If you never wanted it in the first place, you can pass a list of columns to read_csv through the usecols parameter:

In [53]:
import io
temp="""id    text
363.327    text1
366.356    text2
37782    textn"""
df = pd.read_csv(io.StringIO(temp), delimiter=r'\s+', usecols=['text'])
df
Out[53]:
    text
0  text1
1  text2
2  textn

Regarding your error: it occurs because 'id' is not in your columns, either because it is spelt differently or because it has leading or trailing whitespace. To check, look at the output of print(df.columns.tolist()). This prints the column names as a list, which makes any leading or trailing whitespace visible.
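
As a rough illustration of the whitespace case, the sketch below builds a small frame whose header carries stray spaces, inspects the labels, and strips them before dropping. The sample data is made up for this example and is not the asker's file.

```python
import io
import pandas as pd

# Hypothetical data: the first header cell is ' id ' with stray spaces.
raw = " id ,text\n361.273,text1\n374.350,text2\n"
df = pd.read_csv(io.StringIO(raw))

# Printing the list shows each label quoted, so the spaces are visible.
original_labels = df.columns.tolist()
print(original_labels)

# Strip whitespace from every label, then drop by the clean name.
df.columns = df.columns.str.strip()
result = df.drop('id', axis=1)
print(result.columns.tolist())
```

Stripping the labels once after loading is usually simpler than remembering the exact padded name at every later call.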

Answer by unutbu

df.drop(colname, axis=1) (or del df[colname]) is the correct method to use to delete a column.

If a ValueError is raised (recent pandas versions raise a KeyError instead), it means the column name is not exactly what you think it is.

Check df.columns to see what pandas thinks the names of the columns are.

Answer by borgr

The best way to delete a column in pandas is to use drop:

df = df.drop('column_name', axis=1)

where 1 is the axis number (0 for rows and 1 for columns).

To delete the column without having to reassign df, you can do:

df.drop('column_name', axis=1, inplace=True)
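
The difference between the two forms can be sketched as follows. The frame contents are invented for illustration.

```python
import pandas as pd

# Hypothetical frame mirroring the shape of the question's data.
df = pd.DataFrame({'id': [361.273, 374.350], 'text': ['text1', 'text2']})

# By default drop() returns a new frame and leaves df untouched.
dropped = df.drop('id', axis=1)
cols_after_plain_drop = df.columns.tolist()

# With inplace=True, df itself is modified and the call returns None.
returned = df.drop('id', axis=1, inplace=True)
cols_after_inplace = df.columns.tolist()

print(cols_after_plain_drop, cols_after_inplace, returned)
```

Because the in-place call returns None, writing df = df.drop(..., inplace=True) would replace the frame with None, so the two styles should not be mixed.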

Finally, to drop by column number instead of by column label, try this. For example, to delete the 1st, 2nd and 4th columns:

df.drop(df.columns[[0, 1, 3]], axis=1)  # df.columns is zero-based pd.Index 
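
A small self-contained version of the positional drop, using an invented five-column frame, might look like this:

```python
import pandas as pd

# Hypothetical frame with five columns labelled a to e.
df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3], 'd': [4], 'e': [5]})

# Positions 0, 1 and 3 are translated to labels first...
to_drop = df.columns[[0, 1, 3]]

# ...and the labels are then dropped as usual.
remaining = df.drop(to_drop, axis=1)
print(to_drop.tolist(), remaining.columns.tolist())
```

Note that the positions are only used to look up labels, so if the frame has duplicate column names, every column sharing a selected label is dropped.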


Exceptions:

If a wrong column number or label is requested, an error will be thrown. To check the number of columns, use df.shape[1] or len(df.columns.values); to check the column labels, use df.columns.values.
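
One possible way to guard against a wrong label is sketched below. The frame and the misspelt labels are made up, and errors='ignore' is an option of drop that the answers above do not mention.

```python
import pandas as pd

# Hypothetical frame mirroring the shape of the question's data.
df = pd.DataFrame({'id': [361.273, 374.350], 'text': ['text1', 'text2']})

n_cols = df.shape[1]            # number of columns
labels = df.columns.tolist()    # column labels as a plain list

# Dropping a label that does not exist raises an error
# (ValueError in old pandas versions, KeyError in recent ones).
try:
    df.drop('ID', axis=1)       # wrong case, so the label is missing
    caught = None
except (KeyError, ValueError) as exc:
    caught = type(exc).__name__

# errors='ignore' skips missing labels instead of raising.
unchanged = df.drop('ID', axis=1, errors='ignore')
print(n_cols, labels, caught, unchanged.columns.tolist())
```

Silently ignoring a missing label can hide typos, so checking with label in df.columns first may be preferable when the name is expected to exist.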

This answer was based on @LondonRob's answer and is left here to help future visitors of this page.