Original question: http://stackoverflow.com/questions/42405572/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to compare two columns of the same dataframe?
Asked by user517696
I have a dataframe like this:
match_id  inn1  bat  bowl  runs1  inn2  runs2  is_score_chased
1         1     KKR  RCB   222    2     82     1
2         1     CSK  KXIP  240    2     207    1
8         1     CSK  MI    208    2     202    1
9         1     DC   RR    214    2     217    1
33        1     KKR  DC    204    2     181    1
Now I want to change the values in the is_score_chased column by comparing the values in runs1 and runs2. If runs1 > runs2, the corresponding value in the row should be 'yes'; otherwise it should be 'no'. I tried the following code:
for i in (high_scores1):
    if(high_scores1['runs1']>=high_scores1['runs2']):
        high_scores1['is_score_chased']='yes'
    else:
        high_scores1['is_score_chased']='no'
But it didn't work. How do I change the values in the column?
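For context, here is a small reproduction of why the loop in the question fails. It is only a sketch: the two-row frame is a made-up subset of the table above, and the names labels, mask and raised are illustrative.

```python
import pandas as pd

# Hypothetical two-row subset of the question's data
high_scores1 = pd.DataFrame({"runs1": [222, 214], "runs2": [82, 217]})

# Iterating a DataFrame yields its column labels, not its rows
labels = list(high_scores1)

# Comparing two columns gives a whole boolean Series, one value per row
mask = high_scores1["runs1"] >= high_scores1["runs2"]

# `if` needs a single True/False, so pandas refuses to guess and raises ValueError
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True
```

So the loop variable is never a row, and the condition is never a single boolean, which is why the column has to be set element-wise some other way.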
Answer by miradulo
You can do this more easily with np.where.
import numpy as np

# 'yes' where the condition holds for that row, 'no' elsewhere
high_scores1['is_score_chased'] = np.where(high_scores1['runs1'] >= high_scores1['runs2'],
                                           'yes', 'no')
Typically, if you find yourself iterating explicitly to set a column, as you were, there is an abstraction like apply or where that will be both faster and more concise.
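A self-contained sketch of the np.where approach, using the rows from the question's table (only the columns needed for the comparison are reproduced here):

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question's table
high_scores1 = pd.DataFrame({
    "match_id": [1, 2, 8, 9, 33],
    "runs1": [222, 240, 208, 214, 204],
    "runs2": [82, 207, 202, 217, 181],
    "is_score_chased": [1, 1, 1, 1, 1],
})

# np.where(condition, value_if_true, value_if_false) is evaluated element-wise
high_scores1["is_score_chased"] = np.where(
    high_scores1["runs1"] >= high_scores1["runs2"], "yes", "no"
)
print(high_scores1)
```

Only match 9 has runs1 below runs2, so it is the only row that ends up as 'no'.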
Answer by gsmafra
This is a good case for using apply.
Here there is an example of using apply on two columns.
You can adapt it to your question with this:
def f(x):
    # x is a single row; the columns in the question are runs1 and runs2
    return 'yes' if x['runs1'] > x['runs2'] else 'no'

df['is_score_chased'] = df.apply(f, axis=1)
However, I would suggest filling your column with booleans, which makes the function simpler:
def f(x):
    return x['runs1'] > x['runs2']
You can also use a lambda to do it in one line:
df['is_score_chased'] = df.apply(lambda x: x['runs1'] > x['runs2'], axis=1)
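If a boolean column is what you want, the comparison can probably be written without apply at all, since comparing two columns already produces a boolean Series. A minimal sketch, assuming a small made-up frame with the question's column names (by_apply is an illustrative name):

```python
import pandas as pd

# Hypothetical three-row subset of the question's data
df = pd.DataFrame({"runs1": [222, 214, 204], "runs2": [82, 217, 181]})

# Row-wise version: the lambda is called once per row
by_apply = df.apply(lambda x: x["runs1"] > x["runs2"], axis=1)

# Vectorized version: one comparison over the whole columns
df["is_score_chased"] = df["runs1"] > df["runs2"]
```

Both give the same values; the vectorized form avoids a Python-level call per row.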
Answer by Robert Honeybul
You need to index into each column as you iterate through the dataframe, like so:
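high_scores1['is_score_chased'] = high_scores1['is_score_chased'].astype(object)  # column will hold strings
for i in high_scores1.index:  # iterate over row labels, not column names
    if high_scores1.loc[i, 'runs1'] >= high_scores1.loc[i, 'runs2']:
        high_scores1.loc[i, 'is_score_chased'] = 'yes'
    else:
        high_scores1.loc[i, 'is_score_chased'] = 'no'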
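If you do prefer an explicit loop, one option is to build the new values first and assign the column once, which avoids setting cells one at a time. This is a sketch under assumptions: the three-row frame is made up, and labels is an illustrative name.

```python
import pandas as pd

# Hypothetical three-row subset of the question's data
high_scores1 = pd.DataFrame({
    "runs1": [222, 214, 204],
    "runs2": [82, 217, 181],
})

# zip pairs up the two columns value by value, so no label lookups are needed
labels = [
    "yes" if r1 >= r2 else "no"
    for r1, r2 in zip(high_scores1["runs1"], high_scores1["runs2"])
]

# Assign the finished list as the whole column in one step
high_scores1["is_score_chased"] = labels
```

This keeps the per-row logic visible while still writing the column in a single assignment.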