Python Pandas: merge only certain columns

Question

Asked by BubbleGuppies

Is it possible to merge only some columns? I have a DataFrame df1 with columns x, y, z, and a DataFrame df2 with columns x, a, b, c, d, e, f, etc.

I want to merge the two DataFrames on x, but I only want to bring in columns df2.a and df2.b, not the entire DataFrame.

The result would be a DataFrame with columns x, y, z, a, b.

I could merge and then delete the unwanted columns, but it seems like there should be a better method.

Answer 1

Accepted answer by Andy Hayden

You could merge a sub-DataFrame that contains just the join key and the columns you want:

df2[list('xab')]  # df2 but only with columns x, a, and b

df1.merge(df2[list('xab')])
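As a minimal sketch of the approach above, the following uses made-up data shaped like the question's df1 and df2 (the values are assumptions for illustration only). It spells out the column list and passes on='x' explicitly rather than relying on the default of joining on all shared column names.

```python
import pandas as pd

# Hypothetical data matching the column layout described in the question.
df1 = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30], "z": [100, 200, 300]})
df2 = pd.DataFrame({
    "x": [1, 2, 3],
    "a": ["a1", "a2", "a3"],
    "b": ["b1", "b2", "b3"],
    "c": [0.1, 0.2, 0.3],
    "d": [True, False, True],
})

# Select the join key plus the wanted columns, then merge on the key.
# Columns c and d never enter the merge, so nothing has to be dropped afterwards.
merged = df1.merge(df2[["x", "a", "b"]], on="x")

print(merged.columns.tolist())  # ['x', 'y', 'z', 'a', 'b']
```

The join key x has to be part of the selection; leaving it out would give merge nothing to match on.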

Answer 2

Answer by Terrance DeJesus

You can use .iloc to select specific columns (with all rows) and merge only that selection. An example is below:

pandas.merge(dataframe1, dataframe2.iloc[:, 0:5], how='left', on='key')

In this example, you are merging dataframe1 and dataframe2 with a left (outer) join on 'key'. For dataframe2, .iloc lets you specify the rows and columns you want by integer position: the : selects all rows, and 0:5 selects the first 5 columns. The join column 'key' must be among the selected columns. You could use .loc to select by name instead, but if you're dealing with long column names, .iloc may be more convenient.
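To illustrate the difference between positional and label-based selection, here is a small sketch with invented frames (the column names c1, c2, c3 and all values are assumptions). It shows that an .iloc slice excludes its stop position while a .loc label slice includes both endpoints, so the two selections below pick the same columns.

```python
import pandas as pd

# Hypothetical frames; 'key' is the first column of dataframe2.
dataframe1 = pd.DataFrame({"key": [1, 2, 3], "left_val": ["p", "q", "r"]})
dataframe2 = pd.DataFrame({
    "key": [1, 2, 4],
    "c1": [11, 12, 14],
    "c2": [21, 22, 24],
    "c3": [31, 32, 34],
})

# Positional: columns at positions 0, 1, 2 (the stop index 3 is excluded).
by_position = pd.merge(dataframe1, dataframe2.iloc[:, 0:3], how="left", on="key")

# Label-based: .loc slices include both endpoints, so this is key, c1, c2.
by_label = pd.merge(dataframe1, dataframe2.loc[:, "key":"c2"], how="left", on="key")

print(by_position.columns.tolist())  # ['key', 'left_val', 'c1', 'c2']
print(by_position.equals(by_label))  # True
```

Positional selection depends on column order, so it can silently pick different columns if dataframe2 is rearranged upstream; selecting by name avoids that.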

Answer 3

Answer by Arthur D. Howland

You want to use double brackets (a list of column names inside the indexing brackets), so if you are doing a VLOOKUP sort of action:

df = pd.merge(df, df2[['Key_Column', 'Target_Column']], on='Key_Column', how='left')

This will give you everything in the original df, plus the one corresponding column from df2 that you want to join.
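The lookup pattern above can be sketched with made-up data (the Amount and Unused_Column columns and all values are assumptions). The validate='many_to_one' argument is an optional addition not present in the original answer: it asks pandas to raise an error if a key appears more than once in the lookup table, which would otherwise duplicate rows in the result.

```python
import pandas as pd

# Hypothetical main table and lookup table.
df = pd.DataFrame({"Key_Column": ["k1", "k2", "k3"], "Amount": [5, 7, 9]})
df2 = pd.DataFrame({
    "Key_Column": ["k1", "k2"],
    "Target_Column": ["red", "blue"],
    "Unused_Column": [0, 0],
})

result = pd.merge(
    df,
    df2[["Key_Column", "Target_Column"]],  # double brackets: select a list of columns
    on="Key_Column",
    how="left",               # keep every row of df, matched or not
    validate="many_to_one",   # optional guard: keys must be unique in the lookup table
)

print(result)
```

Because this is a left join, the row with key k3 is kept and its Target_Column is NaN, much like a failed VLOOKUP.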

Answer 4

Answer by Marco167

This merges selected columns from two tables.

If table_1 contains columns t1_a, t1_b, t1_c, ..., id, ..., t1_z, and table_2 contains columns t2_a, t2_b, t2_c, ..., id, ..., t2_z, and only t1_a, id, and t2_a are required in the final table, then:

mergedCSV = table_1[['t1_a', 'id']].merge(table_2[['t2_a', 'id']], on='id', how='left')
# save the resulting output file
mergedCSV.to_csv('output.csv', index=False)

Answer 5

Answer by tonneofash

If you want to drop column(s) from the resulting data frame, but those column(s) are required for the join, you can do the following:

df1 = df1.merge(df2[['a', 'b', 'key1']], how='left',
                left_on='key2', right_on='key1').drop(columns='key1')

The .drop(columns='key1') part removes 'key1' from the resulting data frame, even though it was required for the join in the first place. The columns= keyword matters here: a bare .drop('key1') looks for a row label named 'key1' and raises a KeyError.
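A small runnable sketch of this pattern, using invented frames (the column v, the extra column c, and all values are assumptions), shows which columns survive when the key names differ between the two sides:

```python
import pandas as pd

# Hypothetical frames whose join keys have different names.
df1 = pd.DataFrame({"key2": [1, 2, 3], "v": ["x", "y", "z"]})
df2 = pd.DataFrame({
    "key1": [1, 2, 3],
    "a": [10, 20, 30],
    "b": [0.1, 0.2, 0.3],
    "c": ["u", "v", "w"],  # not selected, so it never reaches the result
})

merged = df1.merge(
    df2[["a", "b", "key1"]],
    how="left",
    left_on="key2",
    right_on="key1",
)
# With different key names, both key columns are kept after the merge.
print(merged.columns.tolist())  # ['key2', 'v', 'a', 'b', 'key1']

# Drop the now-redundant right-hand key by column name.
merged = merged.drop(columns="key1")
print(merged.columns.tolist())  # ['key2', 'v', 'a', 'b']
```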
