Python Pandas: DataFrame filter negative values
Original question: http://stackoverflow.com/questions/24214941/
This page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by zer02
I was wondering how I can remove all rows (index entries) that contain negative values in any of their columns. I am using Pandas DataFrames.
Documentation Pandas DataFrame
Format:
Myid- valuecol1- valuecol2- valuecol3-... valuecol30
So my DataFrame is called data.
I know how to do this for 1 column:
data2 = data.index[data['valuecol1'] > 0]
data3 = data.loc[data2]  # .loc replaces .ix, which was removed in pandas 1.0
So I only get the ids where valuecol1 > 0. How can I do some kind of and statement?
valuecol1 && valuecol2 && valuecol3 && ... && valuecol30 > 0?
Accepted answer by gobrewers14
You could loop over the column names:
for col in data.columns.tolist()[1:]:
    data = data.loc[data[col] > 0]
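As a minimal sketch of this loop on a small made-up frame (the column names follow the question's format, and the values are assumptions for illustration). It uses `.loc`, since `.ix` is no longer available in pandas 1.0 and later:

```python
import pandas as pd

# Hypothetical data: an id column followed by value columns.
data = pd.DataFrame({
    "Myid": [1, 2, 3, 4],
    "valuecol1": [0.5, -1.0, 2.0, 3.0],
    "valuecol2": [1.5, 2.0, -0.1, 4.0],
    "valuecol3": [2.5, 1.0, 1.0, 5.0],
})

filtered = data
# Skip the first column (the id) and narrow the frame one column at a time.
for col in data.columns.tolist()[1:]:
    filtered = filtered.loc[filtered[col] > 0]

print(filtered["Myid"].tolist())
```

Each pass keeps only the rows that are positive in the current column, so after the last pass the surviving rows are positive in every value column.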
Answer by Andy Hayden
You can use all to check whether an entire row or column is True:
In [11]: df = pd.DataFrame(np.random.randn(10, 3))
In [12]: df
Out[12]:
0 1 2
0 -1.003735 0.792479 0.787538
1 -2.056750 -1.508980 0.676378
2 1.355528 0.307063 0.369505
3 1.201093 0.994041 -1.169323
4 -0.305359 0.044360 -0.085346
5 -0.684149 -0.482129 -0.598155
6 1.795011 1.231198 -0.465683
7 -0.632216 -0.075575 0.812735
8 -0.479523 -1.900072 -0.966430
9 -1.441645 -1.189408 1.338681
In [13]: (df > 0).all(1)
Out[13]:
0 False
1 False
2 True
3 False
4 False
5 False
6 False
7 False
8 False
9 False
dtype: bool
In [14]: df[(df > 0).all(1)]
Out[14]:
0 1 2
2 1.355528 0.307063 0.369505
If you only want to look at a subset of the columns, e.g. [0, 1]:
In [15]: df[(df[[0, 1]] > 0).all(1)]
Out[15]:
0 1 2
2 1.355528 0.307063 0.369505
3 1.201093 0.994041 -1.169323
6 1.795011 1.231198 -0.465683
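The same idea on a small deterministic frame, as a sketch (the values are made up, and `axis=1` is spelled out as a keyword for readability):

```python
import pandas as pd

df = pd.DataFrame(
    [[1.0, 2.0, 3.0],
     [-1.0, 2.0, 3.0],
     [1.0, 2.0, -3.0]],
)

# True for rows where every column is strictly positive.
all_positive = (df > 0).all(axis=1)
rows_all = df[all_positive]

# Only require columns 0 and 1 to be positive; column 2 may be anything.
rows_subset = df[(df[[0, 1]] > 0).all(axis=1)]

print(rows_all.index.tolist())
print(rows_subset.index.tolist())
```

Note that `> 0` also drops rows containing zeros. If only negative values should be excluded, `>= 0` keeps them.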
Answer by Juan Pueyo
If you want to check the values of an adjacent group of columns, for example from the second to the tenth:
df[(df.iloc[:, 1:10] > 0).all(1)]  # .iloc selects by position; .ix was removed in pandas 1.0
You can also use a range:
df[(df.iloc[:, range(1, 10, 3)] > 0).all(1)]
or your own list of indices:
mylist = [1, 2, 4, 8]
df[(df.iloc[:, mylist] > 0).all(1)]
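A short sketch of the column-selection variants on a frame with named columns (the names and values are assumptions for illustration). `.iloc` selects by position and `.loc` by label; together they cover what `.ix` used to do:

```python
import pandas as pd

df = pd.DataFrame({
    "Myid": [10, 11, 12],
    "valuecol1": [1.0, -2.0, 3.0],
    "valuecol2": [4.0, 5.0, 6.0],
    "valuecol3": [-7.0, 8.0, 9.0],
})

# By position: columns 1 and 2 (iloc slices exclude the end position).
by_position = df[(df.iloc[:, 1:3] > 0).all(axis=1)]

# By label: loc slices include both endpoints.
by_label = df[(df.loc[:, "valuecol1":"valuecol2"] > 0).all(axis=1)]

# By an explicit list of positions.
by_list = df[(df.iloc[:, [1, 3]] > 0).all(axis=1)]

print(by_position["Myid"].tolist())
print(by_list["Myid"].tolist())
```

Selecting by label is usually the safer choice when the column order might change, since positional indices silently point at different columns after a reorder.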
Answer by Raimundo Manterola
To use and statements on a DataFrame, you just have to use a single & character and wrap each condition in parentheses.
For example:
data = data[(data['valuecol1'] > 0) & (data['valuecol2'] > 0) & (data['valuecol3'] > 0)]
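With 30 columns, writing every condition by hand gets tedious. One possible sketch builds the same mask from a list of column names using `functools.reduce` and `operator.and_` (the frame and the `cols` list below are made up for illustration):

```python
from functools import reduce
import operator

import pandas as pd

data = pd.DataFrame({
    "valuecol1": [1, -1, 2],
    "valuecol2": [3, 4, -5],
    "valuecol3": [6, 7, 8],
})

# Written out by hand; each comparison needs its own parentheses
# because & binds more tightly than >.
explicit = data[(data["valuecol1"] > 0) & (data["valuecol2"] > 0) & (data["valuecol3"] > 0)]

# The same mask built programmatically from a list of column names.
cols = ["valuecol1", "valuecol2", "valuecol3"]
mask = reduce(operator.and_, (data[c] > 0 for c in cols))
combined = data[mask]

print(combined.index.tolist())
```

The Python keyword `and` does not work here: it asks for the truth value of a whole Series, which pandas rejects with a ValueError, whereas `&` combines the conditions element by element.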

