pandas join DataFrame force suffix?

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/21588811/

Date: 2020-09-13 21:39:58  Source: igfitidea

python pandas

Asked by stgtscc

How can I force a suffix on a merge or join? I understand it's possible to provide one if there is a collision, but in my case I'm merging df1 with df2, which doesn't cause any collision, and then merging again with df2, which does use the suffixes. I would prefer every merge to apply a suffix, because it gets confusing when I do different combinations, as you can imagine.
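
To make the behaviour concrete, here is a minimal sketch of the situation described. The frames, the `key` column, and the `_a`/`_b` suffixes are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical frames: df1 and df2 share only the join key "key".
df1 = pd.DataFrame({"key": [1, 2], "x": [10, 20]})
df2 = pd.DataFrame({"key": [1, 2], "y": [30, 40]})

# First merge: no overlapping non-key columns, so the suffixes are ignored.
first = df1.merge(df2, on="key", suffixes=("_a", "_b"))
print(list(first.columns))   # ['key', 'x', 'y']

# Second merge with df2: "y" now collides, so both copies get a suffix.
second = first.merge(df2, on="key", suffixes=("_a", "_b"))
print(list(second.columns))  # ['key', 'x', 'y_a', 'y_b']
```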

Accepted answer by Andy Hayden

You could force a suffix on the actual DataFrame:

In [11]: df_a = pd.DataFrame([[1], [2]], columns=['A'])

In [12]: df_b = pd.DataFrame([[3], [4]], columns=['B'])

In [13]: df_a.join(df_b)
Out[13]: 
   A  B
0  1  3
1  2  4

By appending to its column names:

In [14]: df_a.columns = df_a.columns.map(lambda x: str(x) + '_a')

In [15]: df_a
Out[15]: 
   A_a
0    1
1    2

Now joins won't need the suffix correction, whether they collide or not:

In [16]: df_b.columns = df_b.columns.map(lambda x: str(x) + '_b')

In [17]: df_a.join(df_b)
Out[17]: 
   A_a  B_b
0    1    3
1    2    4

Answer by Renier Botha

As of pandas version 0.24.2, you can add a suffix to the column names of a DataFrame using the add_suffix method.

This makes a one-liner merge command with force-suffix more bearable, for example:

df_merged = df1.merge(df2.add_suffix('_2'))
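
One caveat worth keeping in mind: add_suffix renames every column, including any column you intend to join on. A minimal sketch of one way to handle that, assuming two made-up frames that share a hypothetical key column `id`, is to name the left and right keys explicitly:

```python
import pandas as pd

# Hypothetical frames sharing the join key "id" (not from the original answer).
df1 = pd.DataFrame({"id": [1, 2], "val": [10, 20]})
df2 = pd.DataFrame({"id": [1, 2], "val": [30, 40]})

# add_suffix renames the key too ("id" becomes "id_2"),
# so the two key columns are named explicitly.
df_merged = df1.merge(df2.add_suffix("_2"), left_on="id", right_on="id_2")
print(list(df_merged.columns))  # ['id', 'val', 'id_2', 'val_2']
```

If the frames are aligned on their index instead, passing left_index=True and right_index=True to merge avoids the key question altogether.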

Answer by thebeancounter

Pandas merge gives the new columns a suffix only when there is already a column with the same name. When I need to force a suffix onto the new columns, I create an empty column with the name of the column that I want to join.

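df["colName"] = ""  # create an empty placeholder column that collides with df1["colName"]
# Pass the DataFrame itself, not the string "df1". "key" is a placeholder for your
# actual join column; without on=, colName itself would be treated as a join key.
df.merge(right=df1, on="key", suffixes=("_a", "_b"))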

You can later drop the empty column.

You could do the same for more than one column, or for every column in df.columns.values
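
For reference, a minimal runnable sketch of this placeholder trick. The frame contents and the join column `key` are assumptions made for the example, not part of the answer.

```python
import pandas as pd

# Hypothetical frames: "key" is the join column, "colName" exists only in df1.
df = pd.DataFrame({"key": [1, 2], "x": [10, 20]})
df1 = pd.DataFrame({"key": [1, 2], "colName": [30, 40]})

df["colName"] = ""  # placeholder that collides with df1["colName"]
merged = df.merge(df1, on="key", suffixes=("_a", "_b"))
merged = merged.drop(columns="colName_a")  # drop the placeholder again
print(list(merged.columns))  # ['key', 'x', 'colName_b']
```

Note that after the merge the placeholder is named colName_a, so that is the column to drop.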

Answer by T.C. Proctor

This is what I've been using to pandas.merge two DataFrames and force suffixing:

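import pandas as pd


def merge_force_suffix(left, right, **kwargs):
    # 'on' and 'suffixes' are required keyword arguments.
    on_col = kwargs['on']
    # Remove 'suffixes' so pd.merge does not try to apply them a second time.
    suffix_tuple = kwargs.pop('suffixes')
    # Accept either a single key or a list/tuple of keys.
    on_cols = on_col if isinstance(on_col, (list, tuple)) else [on_col]

    def suffix_col(col, suffix):
        # Leave the join key(s) unchanged; suffix every other column.
        if col not in on_cols:
            return str(col) + suffix
        else:
            return col

    left_suffixed = left.rename(columns=lambda x: suffix_col(x, suffix_tuple[0]))
    right_suffixed = right.rename(columns=lambda x: suffix_col(x, suffix_tuple[1]))
    return pd.merge(left_suffixed, right_suffixed, **kwargs)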
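
A short usage sketch may help here. To keep it runnable on its own, it restates the helper in a condensed form that only handles a single key column; the frames and column names are invented for the example.

```python
import pandas as pd


def merge_force_suffix(left, right, on, suffixes, **kwargs):
    # Condensed variant of the helper above: suffix every column except the key.
    def rename(df, suffix):
        return df.rename(columns=lambda c: c if c == on else str(c) + suffix)
    return pd.merge(rename(left, suffixes[0]), rename(right, suffixes[1]), on=on, **kwargs)


# Hypothetical frames with no colliding non-key columns.
df1 = pd.DataFrame({"key": [1, 2], "x": [10, 20]})
df2 = pd.DataFrame({"key": [1, 2], "y": [30, 40]})

merged = merge_force_suffix(df1, df2, on="key", suffixes=("_a", "_b"), how="inner")
print(list(merged.columns))  # ['key', 'x_a', 'y_b']
```

Both non-key columns are suffixed even though they would not have collided, which is the behaviour the question asks for.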