pandas: print specific rows from a DataFrame using a condition

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/44984332/

Date: 2020-09-14 03:57:10  Source: igfitidea

Print specific rows from dataframe using a condition

Tags: python, pandas, dataframe

Asked by T3J45

Be advised: I'm doing this in a function, and I've already referred to a pretty good thread.

Here's the Python function; the parameter passed in is taken from the user:

def recommend(uid):
    ds = pd.read_csv("pred_matrix-full_ubcf.csv")
    records = ds.loc[ds['uid'] == uid]
    for recom in records:
        print recom

Data Format:

uid iid     rat
344 1189    5
344 1500    5
344 814     5
736 217     3.3242361285
736 405     3.3238380154
736 866     3.323500531
331 1680    2
331 1665    2
331 36      1.999918585

Referred: this1, this2

I can't figure out where I'm going wrong. I'm following the this1 thread and still can't get it to work.

Answered by cs95

To iterate over your rows, use df.iterrows():

In [53]: records = df[df['uid'] == query]

In [54]: for index, row in records.iterrows():
    ...:     print(row['uid'], row['iid'], row['rat'])
    ...: 
344.0 1189.0 5.0
344.0 1500.0 5.0
344.0 814.0 5.0
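
For context on why the loop in the question does not print rows: iterating directly over a DataFrame yields its column labels, not its rows. The following is a minimal sketch, assuming the question's data is held in a small in-memory frame; the `df` built here and the list-returning variant of `recommend` are illustrative, not the asker's actual file or function.

```python
import pandas as pd

# Illustrative sample mirroring the data format shown in the question.
df = pd.DataFrame({
    "uid": [344, 344, 344, 736, 736, 736],
    "iid": [1189, 1500, 814, 217, 405, 866],
    "rat": [5, 5, 5, 3.3242361285, 3.3238380154, 3.323500531],
})

records = df.loc[df["uid"] == 344]

# Iterating over a DataFrame walks its column labels, which is what the
# loop in the question ended up printing.
column_labels = [label for label in records]


def recommend(uid, frame=df):
    """Hypothetical variant of the question's function: return matching rows as tuples."""
    matches = frame.loc[frame["uid"] == uid]
    # itertuples yields one namedtuple per row; unlike iterrows it does not
    # upcast the integer columns to float.
    return [(row.uid, row.iid, row.rat) for row in matches.itertuples(index=False)]


print(column_labels)
for rec in recommend(344):
    print(rec)
```

Returning the tuples instead of printing inside the function is a design choice made here only so the result is easy to inspect; printing inside the loop works the same way.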


There are two other possible ways to select your data. You can use boolean indexing:

In [4]: query = 344

In [7]: df[df['uid'] == query]
Out[7]: 
   uid   iid  rat
0  344  1189  5.0
1  344  1500  5.0
2  344   814  5.0
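
The expression inside the brackets is itself a boolean Series with one True/False value per row, and pandas keeps the rows where it is True. A small sketch, assuming sample data shaped like the question's (the second condition on `rat` is purely illustrative):

```python
import pandas as pd

# Illustrative sample; values taken from the data format in the question.
df = pd.DataFrame({
    "uid": [344, 344, 344, 736, 331],
    "iid": [1189, 1500, 814, 217, 1680],
    "rat": [5.0, 5.0, 5.0, 3.3242361285, 2.0],
})

query = 344
mask = df["uid"] == query   # boolean Series aligned with df's index
selected = df[mask]         # keeps only the rows where mask is True

# Conditions combine with & (and) and | (or); each one needs its own parentheses.
high_rated_others = df[(df["uid"] != query) & (df["rat"] > 3)]

print(mask.tolist())
print(selected)
print(high_rated_others)
```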

You can also use the DataFrame.query function:

In [8]: df.query('uid == %d' %query)
Out[8]: 
   uid   iid  rat
0  344  1189  5.0
1  344  1500  5.0
2  344   814  5.0
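
As an alternative to building the expression with string formatting, `DataFrame.query` can reference a local Python variable by prefixing its name with `@`. A minimal sketch, assuming the same kind of illustrative sample data:

```python
import pandas as pd

# Illustrative sample; values taken from the data format in the question.
df = pd.DataFrame({
    "uid": [344, 344, 344, 736, 331],
    "iid": [1189, 1500, 814, 217, 1680],
    "rat": [5.0, 5.0, 5.0, 3.3242361285, 2.0],
})

query = 344

# String formatting, as in the answer above.
by_format = df.query("uid == %d" % query)

# @name looks up a variable from the surrounding Python scope.
by_variable = df.query("uid == @query")

print(by_variable)
```

The `@` form avoids having to pick a format specifier that matches the variable's type.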

Answered by Kareem Jeiroudi

You could also use the where() method on the DataFrame object directly. You provide the condition to this method as the first argument. See the following example:

dataset.where(dataset['class']==0)

This would give the following output:

        f000001   f000002   f000003  ...     f000102   f000103  class
0      0.000000  0.000000  0.000000  ...    0.000000  0.080000    0.0
1      0.000000  0.000000  0.000000  ...    0.000000  0.058824    0.0
2      0.000000  0.000000  0.000000  ...    0.000000  0.095238    0.0
3      0.029867  0.000000  0.012769  ...    0.000000  0.085106    0.0
4      0.000000  0.000000  0.000000  ...    0.000000  0.085106    0.0
5      0.000000  0.000000  0.000000  ...    0.000000  0.085106    0.0
6      0.000000  0.000000  0.000000  ...    0.000000  0.127660    0.0
7      0.000000  0.000000  0.000000  ...    0.000000  0.106383    0.0
8      0.000000  0.000000  0.000000  ...    0.000000  0.127660    0.0
9      0.000000  0.000000  0.000000  ...    0.000000  0.106383    0.0
10     0.000000  0.000000  0.000000  ...    0.000000  0.085106    0.0
11     0.021392  0.000000  0.000000  ...    0.000000  0.042553    0.0
12    -0.063880 -0.124403 -0.102466  ...    0.000000  0.042553    0.0
13     0.000000  0.000000  0.000000  ...    0.000000  0.021277    0.0
14     0.000000  0.000000  0.000000  ...    0.000000  0.000000    0.0
15     0.000000  0.000000 -0.060884  ...    0.000000  0.000000    0.0

[18323 rows x 104 columns]

(I got rid of the rest of the output for brevity of the answer)

A huge advantage of using this method over plain indexing is that you can additionally replace the values that don't match the condition using the other argument, and apply the change to the DataFrame itself rather than to a copy using the inplace argument. Basically, you can reconstruct the rows of your dataframe as desired.
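
To illustrate the `other` argument, here is a minimal sketch on a tiny made-up frame. The column names `f1` and `class` and all values are assumptions for illustration, not the answer's dataset.

```python
import pandas as pd

# Made-up data standing in for the answer's dataset.
dataset = pd.DataFrame({
    "f1": [0.1, 0.2, 0.3, 0.4],
    "class": [0, 1, 0, 1],
})

cond = dataset["class"] == 0

# Default behaviour: rows where the condition is False are filled with NaN.
masked = dataset.where(cond)

# With other: those rows receive the replacement value instead of NaN.
replaced = dataset.where(cond, other=-1)

print(masked)
print(replaced)
```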

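Additionally, note that this function returns a dataframe of the same shape as the original, in which the rows that don't match the condition are filled with NaN (or with other) rather than removed. You can still select a specific column from the result, such as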

dataset.where(dataset['class']==0)['f000001']

This gives you the 'f000001' (first feature) column, with values kept where the class label is 0 and NaN everywhere else.
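
Since `where()` keeps one entry per original row, getting only the matching rows takes an extra step. Two options are `dropna()` on the result or plain boolean indexing with `.loc`. A sketch on a made-up frame (the values are assumptions for illustration; only the column names follow the answer):

```python
import pandas as pd

# Made-up data; only the column names follow the answer's example.
dataset = pd.DataFrame({
    "f000001": [0.0, 0.5, 0.029867, 0.7],
    "class": [0, 1, 0, 1],
})

# where() keeps every row; non-matching rows become NaN.
col_where = dataset.where(dataset["class"] == 0)["f000001"]

# dropna() leaves only the matches. Caveat: it would also drop rows whose
# value was already NaN in the original data.
col_matches = col_where.dropna()

# Boolean indexing with .loc removes non-matching rows directly.
col_loc = dataset.loc[dataset["class"] == 0, "f000001"]

print(len(col_where), len(col_matches), len(col_loc))
```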
