Source: http://stackoverflow.com/questions/42725315/
This content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and keep the same license.
Nested merges in pandas with suffixes
Asked by EMiller
I'm trying to merge multiple dataframes in pandas and keep the column labels straight in the resulting dataframe. Here's my test case:
import pandas as pd
df1 = pd.DataFrame(data = [[1,1],[3,1],[5,1]], columns = ['key','val'])
df2 = pd.DataFrame(data = [[1,2],[3,2],[7,2]], columns = ['key','val'])
df3 = pd.DataFrame(data = [[1,3],[2,3],[4,3]], columns = ['key','val'])
df = pd.merge(pd.merge(df1,df2,on='key', suffixes=['_1','_2']),df3,on='key',suffixes=[None,'_3'])
I'm getting this:
df =
   key  val_1  val_2  val
0    1      1      2    3
I'd like to see this:
df =
   key  val_1  val_2  val_3
0    1      1      2      3
The last pair of suffixes that I've specified is [None,'_3'], the logic being that the pair ['_1','_2'] has already created unique column names in the previous merge.
Answered by Vaishali
Suffixes are applied only when the two dataframes being merged have a non-key column with the same name. By the time you merge df3, the left dataframe has the columns val_1 and val_2 while df3 has val, so there is no overlap and the suffixes [None,'_3'] are never used. You can handle that by renaming val to val_3 after the merge, like this:
df = df1.merge(df2, on = 'key', suffixes=['_1','_2']).merge(df3, on = 'key').rename(columns = {'val': 'val_3'})
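The same idea can be generalized to any number of dataframes by renaming the value columns before merging instead of relying on suffixes. The following is a minimal sketch, not part of the original answer: the helper name merge_with_numbered_suffixes and its key parameter are hypothetical, and it assumes every frame shares the same key column. It first shows that suffixes are ignored when no column names overlap, then builds the desired val_1, val_2, val_3 layout.

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame(data=[[1, 1], [3, 1], [5, 1]], columns=["key", "val"])
df2 = pd.DataFrame(data=[[1, 2], [3, 2], [7, 2]], columns=["key", "val"])
df3 = pd.DataFrame(data=[[1, 3], [2, 3], [4, 3]], columns=["key", "val"])

# Suffixes only apply to overlapping column names. After the first merge the
# left frame holds val_1/val_2 and df3 holds val, so the second pair of
# suffixes is silently ignored.
step1 = df1.merge(df2, on="key", suffixes=("_1", "_2"))
no_overlap = step1.merge(df3, on="key", suffixes=("_x", "_3"))
print(list(no_overlap.columns))


def merge_with_numbered_suffixes(frames, key="key"):
    """Hypothetical helper: append _1, _2, ... to every non-key column
    of each frame, then inner-merge all frames on the key column."""
    renamed = [
        frame.rename(columns=lambda c, i=i: c if c == key else f"{c}_{i}")
        for i, frame in enumerate(frames, start=1)
    ]
    return reduce(lambda left, right: left.merge(right, on=key), renamed)


result = merge_with_numbered_suffixes([df1, df2, df3])
print(result)
```

Because the columns are already unique before any merge happens, the result no longer depends on the order of the merges or on which pairs happen to overlap.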