Pandas - Merge two DataFrames with identical column names

Question

Asked by Slavatron

I have two DataFrames with identical column names and identical IDs in the first column. Apart from the ID column, every cell that holds a value in one DataFrame holds NaN in the other. Here's an example of what they look like:

ID    Cat1    Cat2    Cat3
1     NaN     75      NaN
2     61      NaN     84
3     NaN     NaN     NaN


ID    Cat1    Cat2    Cat3
1     54      NaN     44
2     NaN     38      NaN
3     49      50      53

I want to merge them into one DataFrame while keeping the same column names, so the result would look like this:

ID    Cat1    Cat2    Cat3
1     54      75      44
2     61      38      84
3     49      50      53

I tried:

df3 = pd.merge(df1, df2, on='ID', how='outer')

This gave me a DataFrame with twice as many columns. How can I merge the values from each DataFrame into one?
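
For reference, here is a small sketch that rebuilds the two example frames and shows where the extra columns come from. The names merged and merged_columns are only illustrative. Because every non-key column name appears in both frames, pd.merge keeps both copies and tells them apart with its default _x and _y suffixes.

```python
import numpy as np
import pandas as pd

# The two example frames from the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Cat1": [np.nan, 61, np.nan],
    "Cat2": [75, np.nan, np.nan],
    "Cat3": [np.nan, 84, np.nan],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Cat1": [54, np.nan, 49],
    "Cat2": [np.nan, 38, 50],
    "Cat3": [44, np.nan, 53],
})

# Merging on ID keeps both copies of each CatN column,
# distinguished by the default suffixes "_x" (left) and "_y" (right).
merged = pd.merge(df1, df2, on="ID", how="outer")
merged_columns = list(merged.columns)
print(merged_columns)
```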

Answer 1

Answered by Roger Fan

You probably want df.update. See the documentation.

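# Assumes ID is the index of both frames (e.g. after set_index('ID')); an overlapping ID column would make this raise.
df1.update(df2, errors='raise')  # errors='raise' replaced raise_conflict=True in pandas 0.24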
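
A minimal sketch of the update approach on the example data, assuming a pandas version that accepts the errors keyword. Moving ID into the index is my assumption, not part of the answer: it makes update align rows by ID and keeps the ID values themselves out of the overlap check. Note that update modifies df1 in place and returns None.

```python
import numpy as np
import pandas as pd

# Example frames from the question, with ID moved into the index (an assumption).
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Cat1": [np.nan, 61, np.nan],
    "Cat2": [75, np.nan, np.nan],
    "Cat3": [np.nan, 84, np.nan],
}).set_index("ID")
df2 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Cat1": [54, np.nan, 49],
    "Cat2": [np.nan, 38, 50],
    "Cat3": [44, np.nan, 53],
}).set_index("ID")

# In-place update: non-NaN values of df2 are written into df1.
# errors="raise" raises ValueError if both frames hold a value in the same cell.
df1.update(df2, errors="raise")

# Restore ID as an ordinary column for display.
result = df1.reset_index()
print(result)
```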

Answer 2

Answered by Slavatron

In this case, the combine_first function is appropriate (http://pandas.pydata.org/pandas-docs/version/0.13.1/merging.html).

As the name implies, combine_first takes the first DataFrame and fills it with values from the second wherever the first contains NaN.

So:

df3 = df1.combine_first(df2)

produces a new DataFrame, df3, that is essentially just df1 with values from df2 filled in whenever possible.
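
As a rough illustration on the example data, the sketch below sets ID as the index so that rows are matched by ID rather than by position. The shuffled row order of df2 is my own variation, added only to show that alignment; it is not in the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Cat1": [np.nan, 61, np.nan],
    "Cat2": [75, np.nan, np.nan],
    "Cat3": [np.nan, 84, np.nan],
}).set_index("ID")

# Same data as the question's second frame, but with rows in a different order.
df2 = pd.DataFrame({
    "ID": [3, 1, 2],
    "Cat1": [49, 54, np.nan],
    "Cat2": [50, np.nan, 38],
    "Cat3": [53, 44, np.nan],
}).set_index("ID")

# combine_first returns a new frame: df1's values, with NaNs filled from df2.
df3 = df1.combine_first(df2).reset_index()
print(df3)
```

Unlike update, this leaves df1 itself unchanged.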

Answer 3

Answered by mccandar

You could also just replace the NaN values in df1 with the non-NaN values from df2.

df1[pd.isnull(df1)] = df2[~pd.isnull(df2)]
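
A non-mutating variant of the same idea can be sketched with DataFrame.where instead of the boolean-mask assignment above. This is a substitution of mine, not the answerer's code, and it again assumes ID has been moved into the index so that it is not treated as data.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Cat1": [np.nan, 61, np.nan],
    "Cat2": [75, np.nan, np.nan],
    "Cat3": [np.nan, 84, np.nan],
}).set_index("ID")
df2 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Cat1": [54, np.nan, 49],
    "Cat2": [np.nan, 38, 50],
    "Cat3": [44, np.nan, 53],
}).set_index("ID")

# Keep df1's value where it is not NaN; otherwise take the aligned value from df2.
filled = df1.where(df1.notna(), df2)
print(filled.reset_index())
```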
