pandas: drop rows by index from a DataFrame
Original question: http://stackoverflow.com/questions/47932937/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Drop rows by index from dataframe
Asked by octavian
I have an array wrong_indexes_train which contains a list of indexes that I would like to remove from a dataframe:
[0, 63, 151, 469, 1008]
To remove these indexes, I am trying this:
df_train.drop(wrong_indexes_train)
However, the code fails with the error:
ValueError: labels ['OverallQual' 'GrLivArea' 'GarageCars' 'TotalBsmtSF' 'FullBath'
'YearBuilt'] not contained in axis
Here, ['OverallQual' 'GrLivArea' 'GarageCars' 'TotalBsmtSF' 'FullBath' 'YearBuilt'] are the names of my dataframe's columns.
How could I just make the dataframe drop the entire rows of the indices that I specified?
Answered by Gabriel A
Change it to
df_train.drop(wrong_indexes_train, axis=1)
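For context, here is a minimal sketch of how the axis argument changes what drop removes. The toy frame and its values are made up for illustration; only two column names are borrowed from the question. With axis=0 (the default, also spelled axis='index') the labels are looked up in the row index, while with axis=1 (also spelled axis='columns') they are looked up among the column names.

```python
import pandas as pd

# Toy data (hypothetical values); column names borrowed from the question
df = pd.DataFrame({"OverallQual": [7, 6, 8], "GrLivArea": [1710, 1262, 1786]})

# axis=0: labels are matched against the row index
rows_dropped = df.drop([0, 2], axis=0)

# axis=1: labels are matched against the column names
cols_dropped = df.drop(["GrLivArea"], axis=1)

print(rows_dropped.index.tolist())    # remaining row labels
print(cols_dropped.columns.tolist())  # remaining column names
```

So if the goal is to remove whole rows, the labels passed to drop need to exist in the row index and the axis should be 0; axis=1 only makes sense when the labels are column names.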
Answered by MrDrFenner
Not 100% certain what you want without a minimal (non-)working example, but you should specify an axis parameter. df.drop returns a new DataFrame with the requested labels removed and leaves the original unchanged. If you want to operate in place, specify inplace=True.
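A small sketch of the difference between using the returned DataFrame and dropping in place. The frame is the same toy data used in this answer's examples; the variable names smaller and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"ones": [1, 3, 5], "tens": [20, 40, 60]})

# drop returns a new DataFrame; df itself is untouched
smaller = df.drop([0, 2], axis="index")
print(len(df), len(smaller))

# Either rebind the name to keep the result ...
df = df.drop([0], axis="index")

# ... or modify the object in place (the call then returns None)
result = df.drop([2], axis="index", inplace=True)
print(result, df.index.tolist())
```

Calling df.drop(...) on its own line in a script, without assignment or inplace=True, therefore has no lasting effect on df.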
See this for symbolic row names (index):
import pandas as pd

df = pd.DataFrame({"ones": [1, 3, 5],
                   "tens": [20, 40, 60]},
                  index=['barb', 'mark', 'ethan'])
# Drop the rows labelled 'barb' and 'mark'; returns a new DataFrame
df.drop(['barb', 'mark'], axis='index')
And this for numeric (default) indices:
df = pd.DataFrame({"ones": [1, 3, 5],
                   "tens": [20, 40, 60]})
# Drop the rows at the default integer labels 0 and 2
df.drop([0, 2], axis='index')
Answered by Jeff Otieno
Try
df_train = df_train.reset_index()
followed by
df_train.drop(wrong_indexes_train)
My guess is that df_train doesn't have a numerical index right now; rather, one of the columns ['OverallQual' 'GrLivArea' 'GarageCars' 'TotalBsmtSF' 'FullBath' 'YearBuilt'] is serving as the index.
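To illustrate that guess, here is a sketch under the assumption that one column (OverallQual, chosen arbitrarily) has become the index and that the list holds row positions rather than labels. The data and the names wrong_positions, by_position, reset and by_label are hypothetical. Passing positions straight to drop would fail here because 0 and 2 are not labels in the index (recent pandas versions raise a KeyError for this; the ValueError in the question appears to come from an older release).

```python
import pandas as pd

# Hypothetical frame whose index is not the default 0..n-1
df_train = pd.DataFrame(
    {"GrLivArea": [1710, 1262, 1786, 1717]},
    index=pd.Index([7, 6, 8, 5], name="OverallQual"),
)
wrong_positions = [0, 2]  # row positions, not labels

# Option 1: translate positions into labels, then drop those labels
by_position = df_train.drop(df_train.index[wrong_positions])

# Option 2: restore a default integer index first, as suggested above;
# reset_index() keeps the old index as an ordinary column
reset = df_train.reset_index()
by_label = reset.drop(wrong_positions)

print(by_position)
print(by_label)
```

Option 1 relies on the index labels being unique: drop removes every row that carries a given label, so duplicated labels would remove more rows than intended.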