Original source: http://stackoverflow.com/questions/13596419/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
how to combine two columns with an if/else in python pandas?
Asked by pocketfullofcheese
I am very new to pandas (less than 2 days in), and I can't seem to figure out the right syntax for combining two columns with an if/else condition.
Actually, I did figure out one way to do it using 'zip'. This is what I want to accomplish, but it seems there might be a more efficient way to do this in pandas.
For completeness' sake, I include some of the pre-processing I do, to make things clear:
import pandas as pd

records_data = pd.read_csv('records.csv')
## pull out a year from the 'source' column using a regex
source_years = records_data['source'].map(extract_year_from_source)
## this is what I want to do more efficiently (if it's possible):
## take the extracted year when there is one, otherwise keep the existing year
records_data['year'] = [s if s else y for (s, y) in zip(source_years, records_data['year'])]
Answered by Jeff
In pandas >= 0.10.0 try
df['year'] = source_years.where(source_years != 0, df['year'])
and see:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#the-where-method-and-masking
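A minimal sketch of Series.where on made-up data may help here: take the extracted year where it is non-zero, otherwise fall back to the existing year column. The sample values are invented, and the convention that 0 means "no year found" is an assumption, since the question does not say what the regex helper returns.

```python
import pandas as pd

# Hypothetical sample data; the real values would come from records.csv.
df = pd.DataFrame({"year": [1990, 1995, 2001]})
# Assumed convention: 0 means "no year was found in the source column".
source_years = pd.Series([2005, 0, 1987])

# Keep the extracted year where it is non-zero, otherwise use df['year'].
combined = source_years.where(source_years != 0, df["year"])
print(combined.tolist())  # [2005, 1995, 1987]
```

If the helper returns None or NaN rather than 0, something like source_years.fillna(df['year']) would probably be the analogous call.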
As noted in the comments, this does use np.where under the hood. The difference is that pandas aligns the series with the output on the index (so, for example, you can do just a partial update).
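To illustrate the alignment point, here is a small sketch comparing Series.where with a plain np.where call. The labels and values are made up for illustration: the two series carry the same index labels in a different order, and 0 is again assumed to mean "no year found".

```python
import numpy as np
import pandas as pd

# Hypothetical data: same labels, but stored in a different order.
year = pd.Series([1990, 1995, 2001], index=["a", "b", "c"])
source_years = pd.Series([0, 1987, 2005], index=["b", "c", "a"])

# Series.where matches `year` to source_years by index label.
aligned = source_years.where(source_years != 0, year)
print(aligned.to_dict())  # {'b': 1995, 'c': 1987, 'a': 2005}

# np.where works purely by position and ignores the labels,
# so row "b" ends up with the year that belongs to row "a".
positional = np.where(source_years != 0, source_years, year)
print(positional.tolist())  # [1990, 1987, 2005]
```

With a default RangeIndex on both sides the two approaches give the same values, so the difference only shows up when the indexes differ.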

