Note: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/34338374/

Time: 2020-08-19 14:50:01  Source: igfitidea

Pandas: control new column names when merging two dataframes?

Tags: python, pandas

Asked by Richard

I would like to merge two Pandas dataframes and control the names of the resulting columns.

I originally created the dataframes from CSV files. The original CSV files looked like this:

   # presents.csv
   org,name,items,spend...
   12A,Clerkenwell,151,435,...
   12B,Liverpool Street,37,212,...
   ...
   # trees.csv
   org,name,items,spend...
   12A,Clerkenwell,0,0,...
   12B,Liverpool Street,2,92,...
   ...

Now I have two data frames:

from io import StringIO
import pandas as pd
df_presents = pd.read_csv(StringIO(presents_txt))
df_trees = pd.read_csv(StringIO(trees_txt))

I want to merge them together to get a final data frame, joining on the org and name values, and then prefixing all other columns with an appropriate prefix.

org,name,presents_items,presents_spend,trees_items,trees_spend...
12A,Clerkenwell,151,435,0,0,...
12B,Liverpool Street,37,212,2,92,...

I've been reading the documentation on merging and joining. This seems to merge correctly and result in the right number of columns:

ad = pd.merge(df_presents, df_trees,
              on=['org', 'name'],
              how='outer')

But then doing print list(ad.columns.values) shows me the following columns:

['org', u'name', u'spend_x', u'spend_y', u'items_x', u'items_y'...]

How can I rename spend_x to be presents_spend, etc?

Accepted answer by itzy

The suffixes option in the merge function does this. The default is suffixes=('_x', '_y').
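
As a rough sketch of the suffixes option, assuming two small frames shaped like the rows shown in the question (only the org, name, items and spend columns are used here, and the suffix strings "_presents" and "_trees" are my own choice):

```python
from io import StringIO
import pandas as pd

# Sample data limited to the columns and rows visible in the question.
presents_txt = """org,name,items,spend
12A,Clerkenwell,151,435
12B,Liverpool Street,37,212
"""
trees_txt = """org,name,items,spend
12A,Clerkenwell,0,0
12B,Liverpool Street,2,92
"""

df_presents = pd.read_csv(StringIO(presents_txt))
df_trees = pd.read_csv(StringIO(trees_txt))

# suffixes are appended only to overlapping columns that are not join keys.
merged = pd.merge(df_presents, df_trees,
                  on=["org", "name"], how="outer",
                  suffixes=("_presents", "_trees"))
print(list(merged.columns))
```

Note that this gives names like spend_presents rather than presents_spend, since merge only appends suffixes. If you need a real prefix, you would still have to rename afterwards or before merging.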

In general, renaming columns can be done with the rename method.
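
For instance, rename accepts a function for its columns argument, so the default _x/_y names could be turned into prefixed names roughly like this. The frame below is a hand-built stand-in for the merged result, and to_prefixed is a hypothetical helper name, not something from pandas:

```python
import pandas as pd

# Hypothetical stand-in for the merged frame with the default suffixes.
ad = pd.DataFrame({
    "org": ["12A", "12B"],
    "name": ["Clerkenwell", "Liverpool Street"],
    "items_x": [151, 37],
    "spend_x": [435, 212],
    "items_y": [0, 2],
    "spend_y": [0, 92],
})

def to_prefixed(col):
    # "spend_x" becomes "presents_spend", "spend_y" becomes "trees_spend".
    if col.endswith("_x"):
        return "presents_" + col[:-2]
    if col.endswith("_y"):
        return "trees_" + col[:-2]
    return col  # join keys such as org and name stay unchanged

ad = ad.rename(columns=to_prefixed)
print(list(ad.columns))
```

You can also pass a plain dict, e.g. columns={"spend_x": "presents_spend"}, if only a few columns need new names.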

Answer by Nguyen Ngoc Tuan

You can rename all the columns of ad by setting its columns as follows.

ad.columns = ['org', 'name', 'presents_spend', 'trees_spend']
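
One thing to keep in mind with this approach: the list has to contain exactly one name per column, in the frame's current column order, otherwise pandas raises a length-mismatch error. A small sketch, using a hand-built stand-in for the merged frame (the data and the new_names variable are only for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the merged frame with the default suffixes.
ad = pd.DataFrame({
    "org": ["12A", "12B"],
    "name": ["Clerkenwell", "Liverpool Street"],
    "items_x": [151, 37],
    "spend_x": [435, 212],
    "items_y": [0, 2],
    "spend_y": [0, 92],
})

# One new name for every existing column, in the same order.
new_names = ["org", "name",
             "presents_items", "presents_spend",
             "trees_items", "trees_spend"]

# Guard against a silent mix-up: the lengths must match.
assert len(new_names) == len(ad.columns)
ad.columns = new_names
print(list(ad.columns))
```

Because the assignment is purely positional, it is worth printing ad.columns first to check the order before overwriting it.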

Answer by Amirkhm

Another way is to add a prefix to the column names of your dataframe before merging:

ad.columns = 'ad_' + ad.columns.values
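
Applied to the two frames from the question, the same idea might look like the sketch below. The join keys have to keep their original names so the merge can still match on them, so only the other columns get the prefix. The sample data and the prefix_non_keys helper are assumptions for illustration:

```python
import pandas as pd

# Sample frames limited to the columns and rows visible in the question.
df_presents = pd.DataFrame({
    "org": ["12A", "12B"],
    "name": ["Clerkenwell", "Liverpool Street"],
    "items": [151, 37],
    "spend": [435, 212],
})
df_trees = pd.DataFrame({
    "org": ["12A", "12B"],
    "name": ["Clerkenwell", "Liverpool Street"],
    "items": [0, 2],
    "spend": [0, 92],
})

keys = ["org", "name"]

def prefix_non_keys(df, prefix):
    # Prefix every column except the join keys.
    return df.rename(columns=lambda c: c if c in keys else prefix + c)

merged = pd.merge(prefix_non_keys(df_presents, "presents_"),
                  prefix_non_keys(df_trees, "trees_"),
                  on=keys, how="outer")
print(list(merged.columns))
```

Since no non-key column names overlap after prefixing, merge has nothing to disambiguate and the _x/_y suffixes never appear.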