pandas: cross join of two DataFrames
Source: http://stackoverflow.com/questions/34161978/
This page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Vity Lin
I can't find anything about cross joins in the merge/join documentation or anywhere else. I need to process two DataFrames with my own function, myfunc. What I want is the equivalent of:
for itemA in df1.iterrows():
    for itemB in df2.iterrows():
        t["A"] = myfunc(itemA[1]["A"], itemB[1]["A"])
In SQL terms, that would be:
SELECT myfunc(df1.A, df2.A), df1.A, df2.A FROM df1, df2;
But I need a more efficient solution. If apply is the way to go, how would I implement this with it? Thanks.
Accepted answer by leroyJr
For the cross product, see this question.
Essentially, you have to do a normal merge but give every row the same key to join on, so that every row of one frame is joined to every row of the other.
You can then add a column to the new frame by applying your function:
import pandas as pd
new_df = pd.merge(df1, df2, on='key')  # 'key' is a constant column present in both frames
new_df['new_col'] = new_df.apply(lambda row: myfunc(row['A_x'], row['A_y']), axis=1)
axis=1 makes .apply pass each row, rather than each column, to the function. 'A_x' and 'A_y' will be the default column names in the resulting frame if the merged frames share a column, as in your example.
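To make the steps above concrete, here is a minimal end-to-end sketch. The sample frames, the helper column name "key", and the body of myfunc (plain addition) are assumptions chosen for illustration; the question does not say what myfunc actually does.

```python
import pandas as pd


def myfunc(a, b):
    # Hypothetical stand-in for the asker's function
    return a + b


df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [10, 20, 30]})

# Give every row the same key, merge on it, then drop the helper column.
# assign() returns copies, so df1 and df2 themselves are left unchanged.
new_df = pd.merge(df1.assign(key=0), df2.assign(key=0), on="key").drop(columns="key")

# The shared column "A" gets the default suffixes: A_x (left) and A_y (right)
new_df["new_col"] = new_df.apply(lambda row: myfunc(row["A_x"], row["A_y"]), axis=1)

print(new_df)
```

The result has len(df1) * len(df2) rows, one per pair, which is what the nested loop in the question iterates over.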
Answer by A.Kot
Create a common 'key' to cross join the two:
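df1['key'] = 0
df2['key'] = 0
# Join on the constant key only; otherwise a shared column such as 'A' would also be used as a join key
df1.merge(df2, on='key', how='outer')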
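As a possible alternative to the helper key, pandas 1.2 and later accept how="cross" in merge. The sketch below assumes such a version is installed and uses made-up sample frames. It also assumes, purely for illustration, that myfunc can be written with vectorized column operations (addition here), which avoids the per-row cost of apply.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [10, 20, 30]})

# how="cross" requires pandas >= 1.2; no helper key column is needed
crossed = df1.merge(df2, how="cross")

# Assumption: myfunc is expressible on whole columns (addition stands in for it),
# so it can be applied to the Series directly instead of row by row
crossed["new_col"] = crossed["A_x"] + crossed["A_y"]

print(crossed)
```

If myfunc cannot be vectorized, the apply call with axis=1 shown in the accepted answer works on this frame as well.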