Print specific rows from dataframe using a condition
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/44984332/
Asked by T3J45
Please note that I'm doing this inside a function, and I've already referred to a pretty good thread.
Here's the Python function; the parameter passed in is taken from the user:
def recommend(uid):
    ds = pd.read_csv("pred_matrix-full_ubcf.csv")
    records = ds.loc[ds['uid'] == uid]
    for recom in records:
        print recom
Data Format:
uid iid rat
344 1189 5
344 1500 5
344 814 5
736 217 3.3242361285
736 405 3.3238380154
736 866 3.323500531
331 1680 2
331 1665 2
331 36 1.999918585
I can't figure out where I'm going wrong. I'm following this thread and still can't get it to work.
Answered by cs95
To iterate over your rows, use df.iterrows():
In [53]: records = df[df['uid'] == query]
In [54]: for index, row in records.iterrows():
    ...:     print(row['uid'], row['iid'], row['rat'])
    ...:
344.0 1189.0 5.0
344.0 1500.0 5.0
344.0 814.0 5.0
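For context, the loop in the question printed column names because iterating over a DataFrame directly yields its column labels, not its rows. A minimal sketch of a corrected function follows. The in-memory sample stands in for the CSV file, and the extra `ds` parameter is an assumption added so the example is self-contained. It uses `itertuples`, which, unlike `iterrows`, does not upcast each row to a single dtype (the reason the integers above print as `344.0`).

```python
import io
import pandas as pd

# A few rows from the question's data, inlined in place of the CSV file.
SAMPLE = io.StringIO(
    "uid,iid,rat\n"
    "344,1189,5\n"
    "344,1500,5\n"
    "736,217,3.3242361285\n"
)

def recommend(ds, uid):
    """Return (iid, rat) pairs for the given uid (hypothetical signature)."""
    records = ds.loc[ds["uid"] == uid]
    # itertuples yields one namedtuple per row, keeping per-column dtypes
    return [(row.iid, row.rat) for row in records.itertuples(index=False)]

ds = pd.read_csv(SAMPLE)
column_labels = list(ds)   # iterating a DataFrame gives its column labels
recs = recommend(ds, 344)
for iid, rat in recs:
    print(iid, rat)
```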
There are two other possible ways to select your data. You can use boolean indexing:
In [4]: query = 344
In [7]: df[df['uid'] == query]
Out[7]:
uid iid rat
0 344 1189 5.0
1 344 1500 5.0
2 344 814 5.0
You can also use the DataFrame.query function:
In [8]: df.query('uid == %d' %query)
Out[8]:
uid iid rat
0 344 1189 5.0
1 344 1500 5.0
2 344 814 5.0
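The `%d` formatting above assumes an integer value. As an alternative, `DataFrame.query` can reference a local Python variable by prefixing its name with `@`, which avoids building the expression string by hand. A small sketch, using a few rows of the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    "uid": [344, 344, 736, 331],
    "iid": [1189, 1500, 217, 1680],
    "rat": [5.0, 5.0, 3.3242361285, 2.0],
})

query = 344

by_mask = df[df["uid"] == query]        # boolean indexing
by_query = df.query("uid == @query")    # @query refers to the local variable

print(by_query)
```

Both forms select the same rows here; which one reads better is mostly a matter of taste.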
Answered by Kareem Jeiroudi
You could also use the where() method directly on the DataFrame object. You pass the condition to this method as the first argument. See the following example:
dataset.where(dataset['class']==0)
Which would give the following output:
f000001 f000002 f000003 ... f000102 f000103 class
0 0.000000 0.000000 0.000000 ... 0.000000 0.080000 0.0
1 0.000000 0.000000 0.000000 ... 0.000000 0.058824 0.0
2 0.000000 0.000000 0.000000 ... 0.000000 0.095238 0.0
3 0.029867 0.000000 0.012769 ... 0.000000 0.085106 0.0
4 0.000000 0.000000 0.000000 ... 0.000000 0.085106 0.0
5 0.000000 0.000000 0.000000 ... 0.000000 0.085106 0.0
6 0.000000 0.000000 0.000000 ... 0.000000 0.127660 0.0
7 0.000000 0.000000 0.000000 ... 0.000000 0.106383 0.0
8 0.000000 0.000000 0.000000 ... 0.000000 0.127660 0.0
9 0.000000 0.000000 0.000000 ... 0.000000 0.106383 0.0
10 0.000000 0.000000 0.000000 ... 0.000000 0.085106 0.0
11 0.021392 0.000000 0.000000 ... 0.000000 0.042553 0.0
12 -0.063880 -0.124403 -0.102466 ... 0.000000 0.042553 0.0
13 0.000000 0.000000 0.000000 ... 0.000000 0.021277 0.0
14 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.0
15 0.000000 0.000000 -0.060884 ... 0.000000 0.000000 0.0
[18323 rows x 104 columns]
(I trimmed the rest of the output to keep the answer brief.)
An advantage of using this method over plain indexing is that you can additionally replace the values that don't match the condition using the other argument, and apply that replacement to the original DataFrame instead of returning a new one using the inplace argument. Basically, you can reconstruct the rows of your dataframe as desired.
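One detail worth spelling out: `where()` keeps the shape of the DataFrame. Rows that fail the condition are not dropped; their values are replaced with NaN, or with `other` when it is given. A minimal sketch on a made-up three-row frame (the column name `f1` is illustrative, not taken from the answer's dataset):

```python
import pandas as pd

dataset = pd.DataFrame({
    "f1": [0.1, 0.2, 0.3],
    "class": [0, 1, 0],
})

mask = dataset["class"] == 0

masked = dataset.where(mask)             # non-matching row becomes NaN
filled = dataset.where(mask, other=-1)   # non-matching row becomes -1

print(masked)
print(filled)
```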
Additionally, because this function returns a dataframe of the same shape, with the rows that don't match the condition filled with NaN, you can index a specific column of the result, such as:
dataset.where(dataset['class']==0)['f000001']
And this will print the 'f000001' (first feature) column for you, with values present only where the class label is 0.
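If only the matching rows are wanted in the selected column, two options are to chain `dropna()` after `where()`, or to select the column with `.loc` and the same condition. A sketch on a made-up frame (the values are illustrative):

```python
import pandas as pd

dataset = pd.DataFrame({
    "f000001": [0.5, 0.7, 0.9],
    "class": [0, 1, 0],
})

cond = dataset["class"] == 0

full_length = dataset.where(cond)["f000001"]   # same length as the frame
via_where = full_length.dropna()               # NaN entries removed
via_loc = dataset.loc[cond, "f000001"]         # only matching rows selected

print(via_where)
print(via_loc)
```

Be aware that `dropna()` would also discard rows whose original value was already NaN, so the `.loc` form is the safer choice when the data may contain missing values.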