pandas: drop rows by index from a dataframe

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/47932937/

Time: 2020-09-14 04:56:44  Source: igfitidea

Drop rows by index from dataframe

python pandas dataframe

Asked by octavian

I have an array wrong_indexes_train which contains a list of indexes that I would like to remove from a dataframe:

[0, 63, 151, 469, 1008]

To remove these indexes, I am trying this:


df_train.drop(wrong_indexes_train)

However, the code fails with the error:


ValueError: labels ['OverallQual' 'GrLivArea' 'GarageCars' 'TotalBsmtSF' 'FullBath'
 'YearBuilt'] not contained in axis

Here, ['OverallQual' 'GrLivArea' 'GarageCars' 'TotalBsmtSF' 'FullBath' 'YearBuilt'] are the names of my dataframe's columns.

How could I just make the dataframe drop the entire rows of the indices that I specified?


Answered by Gabriel A

Change it to


df_train.drop(wrong_indexes_train,axis=1)
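
For context, `axis=1` tells `drop` to look the labels up among the column names, while the default `axis=0` looks them up in the row index. A minimal sketch on a made-up frame (the `demo` data below is hypothetical and only borrows two column names from the question):

```python
import pandas as pd

# Hypothetical data; not the asker's actual df_train.
demo = pd.DataFrame({"OverallQual": [7, 6, 8],
                     "GrLivArea": [1710, 1262, 1786]})

# axis=0 (the default): labels are matched against the row index.
rows_dropped = demo.drop([0, 2], axis=0)

# axis=1: labels are matched against the column names.
cols_dropped = demo.drop(["GrLivArea"], axis=1)

print(rows_dropped.index.tolist())    # [1]
print(cols_dropped.columns.tolist())  # ['OverallQual']
```

If wrong_indexes_train really holds row labels such as [0, 63, 151, 469, 1008], the `axis=0` form is the one that removes rows; `axis=1` would instead look for columns with those names.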

Answered by MrDrFenner

I'm not 100% certain what you want without a minimum (not) working example, but you should specify an axis parameter. df.drop returns a modified copy of the DataFrame and leaves the original unchanged. If you want to operate in place, specify inplace=True.

See this for symbolic row names (index):


import pandas as pd

# Symbolic (string) row labels as the index
df = pd.DataFrame({"ones": [1, 3, 5],
                   "tens": [20, 40, 60]},
                  index=['barb', 'mark', 'ethan'])
df.drop(['barb', 'mark'], axis='index')

And this for numeric (default) indices:


df = pd.DataFrame({"ones":[1,3,5],
                   "tens":[20, 40, 60]})
df.drop([0,2], axis='index')
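
The note about the return value can be checked with a small sketch. It reuses the numeric-index frame from this answer; the variable names `trimmed` and `result` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"ones": [1, 3, 5],
                   "tens": [20, 40, 60]})

# drop returns a new DataFrame; df itself is left untouched.
trimmed = df.drop([0, 2], axis='index')
print(len(df), len(trimmed))  # 3 1

# With inplace=True the call returns None and df is modified directly.
result = df.drop([0, 2], axis='index', inplace=True)
print(result, len(df))  # None 1
```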

Answered by Jeff Otieno

Try


df_train = df_train.reset_index()

followed by


df_train.drop(wrong_indexes_train)

My guess is that df_train doesn't have a numerical index right now; rather, one of the columns ['OverallQual' 'GrLivArea' 'GarageCars' 'TotalBsmtSF' 'FullBath' 'YearBuilt'] is serving as the index.
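
A minimal sketch of that guess, on made-up data: the hypothetical frame below uses one of its columns as the index, so integer labels only become valid once reset_index() restores the default integer index. The data and the short wrong_indexes_train list are assumptions for illustration, not the asker's values.

```python
import pandas as pd

# Hypothetical stand-in for df_train, indexed by one of its columns.
df_train = pd.DataFrame({"YearBuilt": [2003, 1976, 2001, 1915],
                         "FullBath": [2, 2, 2, 1]}).set_index("YearBuilt")
wrong_indexes_train = [0, 3]

# Move the old index back into a column and get a default integer index.
df_train = df_train.reset_index()

# The integer labels now exist; keep the result, since drop returns a new frame.
df_train = df_train.drop(wrong_indexes_train)
print(df_train.index.tolist())  # [1, 2]
```

If the existing index should be kept, dropping by position with df_train.drop(df_train.index[wrong_indexes_train]) may be another option, though that is not what this answer proposes.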